1. Masella M, Léonforté F. The multi-scale polarizable pseudo-particle solvent coarse-grained approach: From NaCl salt solutions to polyelectrolyte hydration. J Chem Phys 2024; 160:204902. [PMID: 38780384 DOI: 10.1063/5.0194968] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/29/2023] [Accepted: 04/22/2024] [Indexed: 05/25/2024]
Abstract
We discuss key parameters that affect the reliability of hybrid simulations in the aqueous phase based on an efficient multi-scale coarse-grained polarizable pseudo-particle approach, denoted pppl, to model the solvent water, whereas solutes are modeled using an all-atom polarizable force field. Among those parameters, the extension of the solvent domain (SD) in the solute vicinity (the domain in which each solvent particle corresponds to a single water molecule) and the magnitude of solute/solvent short-range polarization damping effects are shown to be pivotal for modeling aqueous NaCl solutions and the hydration of charged systems, such as the hydrophobic polyelectrolyte polymer that we recently investigated [Masella et al., J. Chem. Phys. 155, 114903 (2021)]. Strong short-range damping is essential to simulate aqueous NaCl solutions at moderate concentration (up to 1.0 M). The SD extension (as well as short-range damping) has a weak effect on the polymer conformation; however, it plays a key role in computing accurate polymer/solvent interaction energies. As the pppl approach is up to two orders of magnitude more computationally efficient than all-atom polarizable force field methods, our results show it to be an efficient alternative route to investigate the equilibrium properties of complex charged molecular systems in extended chemical environments.
Affiliation(s)
- Michel Masella
  - Laboratoire de Biologie Structurale et Radiobiologie, Service de Bioénergétique, Biologie Structurale et Mécanismes, Institut de Biologie et de Technologies de Saclay, CEA Saclay, F-91191 Gif sur Yvette Cedex, France
- Fabien Léonforté
  - L'Oréal Group, Research and Innovation, Aulnay-Sous-Bois, France
2. Arts M, Garcia Satorras V, Huang CW, Zügner D, Federici M, Clementi C, Noé F, Pinsler R, van den Berg R. Two for One: Diffusion Models and Force Fields for Coarse-Grained Molecular Dynamics. J Chem Theory Comput 2023; 19:6151-6159. [PMID: 37688551 DOI: 10.1021/acs.jctc.3c00702] [Citation(s) in RCA: 15] [Impact Index Per Article: 15.0] [Indexed: 09/11/2023]
Abstract
Coarse-grained (CG) molecular dynamics enables the study of biological processes at temporal and spatial scales that would be intractable at an atomistic resolution. However, accurately learning a CG force field remains a challenge. In this work, we leverage connections between score-based generative models, force fields, and molecular dynamics to learn a CG force field without requiring any force inputs during training. Specifically, we train a diffusion generative model on protein structures from molecular dynamics simulations, and we show that its score function approximates a force field that can directly be used to simulate CG molecular dynamics. While having a vastly simplified training setup compared to previous work, we demonstrate that our approach leads to improved performance across several protein simulations for systems up to 56 amino acids, reproducing the CG equilibrium distribution and preserving the dynamics of all-atom simulations such as protein folding events.
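The link between a score function and a force field mentioned in this abstract follows from the Boltzmann form of an equilibrium density. As a short worked derivation (the symbols U, k_B T, and F are introduced here for illustration and do not appear in the abstract), assume the CG equilibrium distribution is p(x) ∝ exp(-U(x)/k_B T):

```latex
\nabla_x \log p(x)
  = -\frac{1}{k_B T}\,\nabla_x U(x)
  = \frac{F(x)}{k_B T}
\quad\Longrightarrow\quad
F(x) = k_B T\,\nabla_x \log p(x)
```

Under this assumption, a model that approximates the score of the (nearly noise-free) data distribution approximates the CG force up to the factor k_B T. How the cited work chooses the noise level and network architecture is not reproduced here.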
Affiliation(s)
- Marloes Arts
  - Department of Computer Science, University of Copenhagen, Universitetsparken 1, Copenhagen 2100, Denmark
- Victor Garcia Satorras
  - AI4Science, Microsoft Research, Evert van de Beekstraat 354, Amsterdam 1118 CZ, The Netherlands
- Chin-Wei Huang
  - AI4Science, Microsoft Research, Evert van de Beekstraat 354, Amsterdam 1118 CZ, The Netherlands
- Daniel Zügner
  - AI4Science, Microsoft Research, Karl-Liebknecht-Straße 32, Berlin 10178, Germany
- Marco Federici
  - Informatics Institute, University of Amsterdam, Science Park 904, Amsterdam 1098 XH, The Netherlands
- Cecilia Clementi
  - AI4Science, Microsoft Research, Karl-Liebknecht-Straße 32, Berlin 10178, Germany
  - Department of Physics, Freie Universität Berlin, Arnimallee 12, Berlin 14195, Germany
- Frank Noé
  - AI4Science, Microsoft Research, Karl-Liebknecht-Straße 32, Berlin 10178, Germany
- Robert Pinsler
  - AI4Science, Microsoft Research, 21 Station Road, Cambridge CB1 2FB, U.K.
- Rianne van den Berg
  - AI4Science, Microsoft Research, Evert van de Beekstraat 354, Amsterdam 1118 CZ, The Netherlands
3. Zaporozhets I, Clementi C. Multibody Terms in Protein Coarse-Grained Models: A Top-Down Perspective. J Phys Chem B 2023; 127:6920-6927. [PMID: 37499123 DOI: 10.1021/acs.jpcb.3c04493] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 07/29/2023]
Abstract
Coarse-grained models allow computational investigation of biomolecular processes occurring on long time and length scales, intractable with atomistic simulation. Traditionally, many coarse-grained models rely mostly on pairwise interaction potentials. However, the decimation of degrees of freedom should, in principle, lead to a complex many-body effective interaction potential. In this work, we use experimental data on mutant stability to parametrize coarse-grained models for two proteins with and without many-body terms. We demonstrate that many-body terms are necessary to reproduce quantitatively the effects of point mutations on protein stability, particularly to implicitly take into account the effect of the solvent.
Affiliation(s)
- Iryna Zaporozhets
  - Department of Chemistry, Rice University, 6100 Main Street, Houston, Texas 77005, United States
  - Center for Theoretical Biological Physics, Rice University, 6100 Main Street, Houston, Texas 77005, United States
  - Department of Physics, Freie Universität, Arnimallee 12, Berlin 14195, Germany
- Cecilia Clementi
  - Department of Chemistry, Rice University, 6100 Main Street, Houston, Texas 77005, United States
  - Center for Theoretical Biological Physics, Rice University, 6100 Main Street, Houston, Texas 77005, United States
  - Department of Physics, Freie Universität, Arnimallee 12, Berlin 14195, Germany
4. Raddi RM, Ge Y, Voelz VA. BICePs v2.0: Software for Ensemble Reweighting Using Bayesian Inference of Conformational Populations. J Chem Inf Model 2023; 63:2370-2381. [PMID: 37027181 PMCID: PMC10278562 DOI: 10.1021/acs.jcim.2c01296] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/08/2023]
Abstract
Bayesian Inference of Conformational Populations (BICePs) version 2.0 (v2.0) is a free, open-source Python package that reweights theoretical predictions of conformational state populations using sparse and/or noisy experimental measurements. In this article, we describe the implementation and usage of the latest version of BICePs (v2.0), a powerful, user-friendly and extensible package which makes several improvements upon the previous version. The algorithm now supports many experimental NMR observables (NOE distances, chemical shifts, J-coupling constants, and hydrogen-deuterium exchange protection factors), and enables convenient data preparation and processing. BICePs v2.0 can perform automatic analysis of the sampled posterior, including visualization, and evaluation of statistical significance and sampling convergence. We provide specific coding examples for these topics, and present a detailed example illustrating how to use BICePs v2.0 to reweight a theoretical ensemble using experimental measurements.
Affiliation(s)
- Robert M Raddi
  - Department of Chemistry, Temple University, Philadelphia, Pennsylvania 19122, United States
- Yunhui Ge
  - Department of Pharmaceutical Sciences, University of California, Irvine, California 92697, United States
- Vincent A Voelz
  - Department of Chemistry, Temple University, Philadelphia, Pennsylvania 19122, United States
5. Köhler J, Chen Y, Krämer A, Clementi C, Noé F. Flow-Matching: Efficient Coarse-Graining of Molecular Dynamics without Forces. J Chem Theory Comput 2023; 19:942-952. [PMID: 36668906 DOI: 10.1021/acs.jctc.3c00016] [Citation(s) in RCA: 12] [Impact Index Per Article: 12.0] [Indexed: 01/21/2023]
Abstract
Coarse-grained (CG) molecular simulations have become a standard tool to study molecular processes on time and length scales inaccessible to all-atom simulations. Parametrizing CG force fields to match all-atom simulations has mainly relied on force-matching or relative entropy minimization, which require many samples from costly simulations with all-atom or CG resolutions, respectively. Here we present flow-matching, a new training method for CG force fields that combines the advantages of both methods by leveraging normalizing flows, a generative deep learning method. Flow-matching first trains a normalizing flow to represent the CG probability density, which is equivalent to minimizing the relative entropy without requiring iterative CG simulations. Subsequently, the flow generates samples and forces according to the learned distribution in order to train the desired CG free energy model via force-matching. Even without requiring forces from the all-atom simulations, flow-matching outperforms classical force-matching by an order of magnitude in terms of data efficiency and produces CG models that can capture the folding and unfolding transitions of small proteins.
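The second stage described in this abstract (generate samples and forces from a learned density, then fit a CG energy model by force-matching) can be illustrated with a deliberately small sketch. Everything here is an assumption made for illustration: a one-dimensional Gaussian density stands in for the trained normalizing flow, the CG model is a harmonic potential with hypothetical parameters `a` and `b`, and the fit is ordinary least squares. It is not the authors' implementation.

```python
import numpy as np

def fit_harmonic_by_force_matching(x, forces):
    """Least-squares fit of U(x) = 0.5*a*x**2 + b*x so that -dU/dx matches the given forces."""
    # Model force is f(x) = -(a*x + b), so the design matrix columns are -x and -1.
    design = np.column_stack([-x, -np.ones_like(x)])
    (a, b), *_ = np.linalg.lstsq(design, forces, rcond=None)
    return a, b

# Stand-in for the learned density: a Gaussian with known mean and width (assumption).
rng = np.random.default_rng(0)
kT, mu, sigma = 1.0, 0.5, 0.2
x = rng.normal(mu, sigma, size=1000)

# Forces implied by the density: F(x) = kT * d/dx log q(x) for the Gaussian q.
forces = kT * (-(x - mu) / sigma**2)

a, b = fit_harmonic_by_force_matching(x, forces)
print(f"fitted stiffness a = {a:.3f}, linear term b = {b:.3f}")
```

Because the forces come from the density itself rather than from an all-atom simulation, no reference force data is needed, which is the point the abstract makes about the approach.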
Affiliation(s)
- Jonas Köhler
  - Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany
- Yaoyi Chen
  - Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany
- Andreas Krämer
  - Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany
- Cecilia Clementi
  - Department of Physics, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, United States
  - Department of Physics, Rice University, Houston, Texas 77005, United States
  - Department of Chemistry, Rice University, Houston, Texas 77005, United States
- Frank Noé
  - Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany
  - Department of Physics, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany
  - Department of Chemistry, Rice University, Houston, Texas 77005, United States
  - Microsoft Research AI4Science, Karl-Liebknecht Strasse 32, 10178 Berlin, Germany
6. Combining machine-learning and molecular-modeling methods for drug-target affinity predictions. WIREs Computational Molecular Science 2022. [DOI: 10.1002/wcms.1653] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/28/2022]
7. Dhamankar S, Webb MA. Chemically specific coarse-graining of polymers: Methods and prospects. Journal of Polymer Science 2021. [DOI: 10.1002/pol.20210555] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Indexed: 12/14/2022]
Affiliation(s)
- Satyen Dhamankar
  - Department of Chemical and Biological Engineering, Princeton University, Princeton, New Jersey, USA
- Michael A. Webb
  - Department of Chemical and Biological Engineering, Princeton University, Princeton, New Jersey, USA
8. Chen Y, Krämer A, Charron NE, Husic BE, Clementi C, Noé F. Machine learning implicit solvation for molecular dynamics. J Chem Phys 2021; 155:084101. [PMID: 34470360 DOI: 10.1063/5.0059915] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Indexed: 12/21/2022]
Abstract
Accurate modeling of the solvent environment for biological molecules is crucial for computational biology and drug design. A popular approach to achieve long simulation time scales for large system sizes is to incorporate the effect of the solvent in a mean-field fashion with implicit solvent models. However, a challenge with existing implicit solvent models is that they often lack accuracy or certain physical properties compared to explicit solvent models as the many-body effects of the neglected solvent molecules are difficult to model as a mean field. Here, we leverage machine learning (ML) and multi-scale coarse graining (CG) in order to learn implicit solvent models that can approximate the energetic and thermodynamic properties of a given explicit solvent model with arbitrary accuracy, given enough training data. Following the previous ML-CG models CGnet and CGSchnet, we introduce ISSNet, a graph neural network, to model the implicit solvent potential of mean force. ISSNet can learn from explicit solvent simulation data and be readily applied to molecular dynamics simulations. We compare the solute conformational distributions under different solvation treatments for two peptide systems. The results indicate that ISSNet models can outperform widely used generalized Born and surface area models in reproducing the thermodynamics of small protein systems with respect to explicit solvent. The success of this novel method demonstrates the potential benefit of applying machine learning methods in accurate modeling of solvent effects for in silico research and biomedical applications.
Affiliation(s)
- Yaoyi Chen
  - Department of Mathematics and Computer Science, Freie Universität, Berlin, Germany
- Andreas Krämer
  - Department of Mathematics and Computer Science, Freie Universität, Berlin, Germany
- Brooke E Husic
  - Department of Mathematics and Computer Science, Freie Universität, Berlin, Germany
- Cecilia Clementi
  - Department of Physics, Rice University, Houston, Texas 77005, USA
- Frank Noé
  - Department of Mathematics and Computer Science, Freie Universität, Berlin, Germany
9. Unke O, Chmiela S, Sauceda HE, Gastegger M, Poltavsky I, Schütt KT, Tkatchenko A, Müller KR. Machine Learning Force Fields. Chem Rev 2021; 121:10142-10186. [PMID: 33705118 PMCID: PMC8391964 DOI: 10.1021/acs.chemrev.0c01111] [Citation(s) in RCA: 419] [Impact Index Per Article: 139.7] [Received: 10/12/2020] [Indexed: 12/27/2022]
Abstract
In recent years, the use of machine learning (ML) in computational chemistry has enabled numerous advances previously out of reach due to the computational complexity of traditional electronic-structure methods. One of the most promising applications is the construction of ML-based force fields (FFs), with the aim to narrow the gap between the accuracy of ab initio methods and the efficiency of classical FFs. The key idea is to learn the statistical relation between chemical structure and potential energy without relying on a preconceived notion of fixed chemical bonds or knowledge about the relevant interactions. Such universal ML approximations are in principle only limited by the quality and quantity of the reference data used to train them. This review gives an overview of applications of ML-FFs and the chemical insights that can be obtained from them. The core concepts underlying ML-FFs are described in detail, and a step-by-step guide for constructing and testing them from scratch is given. The text concludes with a discussion of the challenges that remain to be overcome by the next generation of ML-FFs.
Affiliation(s)
- Oliver T. Unke
  - Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
  - DFG Cluster of Excellence “Unifying Systems in Catalysis” (UniSysCat), Technische Universität Berlin, 10623 Berlin, Germany
- Stefan Chmiela
  - Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
- Huziel E. Sauceda
  - Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
  - BASLEARN, BASF-TU Joint Lab, Technische Universität Berlin, 10587 Berlin, Germany
- Michael Gastegger
  - Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
  - DFG Cluster of Excellence “Unifying Systems in Catalysis” (UniSysCat), Technische Universität Berlin, 10623 Berlin, Germany
  - BASLEARN, BASF-TU Joint Lab, Technische Universität Berlin, 10587 Berlin, Germany
- Igor Poltavsky
  - Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
- Kristof T. Schütt
  - Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
- Alexandre Tkatchenko
  - Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
- Klaus-Robert Müller
  - Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
  - BIFOLD-Berlin Institute for the Foundations of Learning and Data, Berlin, Germany
  - Department of Artificial Intelligence, Korea University, Anam-dong, Seongbuk-gu, Seoul 02841, Korea
  - Max Planck Institute for Informatics, Stuhlsatzenhausweg, 66123 Saarbrücken, Germany
  - Google Research, Brain Team, Berlin, Germany
10. Lindorff-Larsen K, Kragelund BB. On the potential of machine learning to examine the relationship between sequence, structure, dynamics and function of intrinsically disordered proteins. J Mol Biol 2021; 433:167196. [PMID: 34390736 DOI: 10.1016/j.jmb.2021.167196] [Citation(s) in RCA: 38] [Impact Index Per Article: 12.7] [Received: 06/02/2021] [Revised: 08/03/2021] [Accepted: 08/04/2021] [Indexed: 11/29/2022]
Abstract
Intrinsically disordered proteins (IDPs) constitute a broad set of proteins with few uniting and many diverging properties. IDPs, and intrinsically disordered regions (IDRs) interspersed between folded domains, are generally characterized as having no persistent tertiary structure; instead they interconvert between a large number of different and often expanded structures. IDPs and IDRs are involved in an enormously wide range of biological functions and reveal novel mechanisms of interactions, and while they defy the common structure-function paradigm of folded proteins, their structural preferences and dynamics are important for their function. We here discuss open questions in the field of IDPs and IDRs, focusing on areas where machine learning and other computational methods play a role. We discuss computational methods aimed at predicting transiently formed local and long-range structure, including methods for integrative structural biology. We discuss the many different ways in which IDPs and IDRs can bind to other molecules, both via short linear motifs and in the formation of larger dynamic complexes such as biomolecular condensates. We discuss how experiments are providing insight into such complexes and may enable more accurate predictions. Finally, we discuss the role of IDPs in disease and how new methods are needed to interpret the mechanistic effects of genomic variants in IDPs.
Affiliation(s)
- Kresten Lindorff-Larsen
  - Structural Biology and NMR Laboratory & Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Ole Maaløes Vej 5, DK-2200 Copenhagen N, Denmark
- Birthe B Kragelund
  - Structural Biology and NMR Laboratory & Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Ole Maaløes Vej 5, DK-2200 Copenhagen N, Denmark
11. Wang J, Charron N, Husic B, Olsson S, Noé F, Clementi C. Multi-body effects in a coarse-grained protein force field. J Chem Phys 2021; 154:164113. [PMID: 33940848 DOI: 10.1063/5.0041022] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Indexed: 01/23/2023]
Abstract
The use of coarse-grained (CG) models is a popular approach to study complex biomolecular systems. By reducing the number of degrees of freedom, a CG model can explore long time- and length-scales inaccessible to computational models at higher resolution. If a CG model is designed by formally integrating out some of the system's degrees of freedom, one expects multi-body interactions to emerge in the effective CG model's energy function. In practice, it has been shown that the inclusion of multi-body terms indeed improves the accuracy of a CG model. However, no general approach has been proposed to systematically construct a CG effective energy that includes arbitrary orders of multi-body terms. In this work, we propose a neural network based approach to address this point and construct a CG model as a multi-body expansion. By applying this approach to a small protein, we evaluate the relative importance of the different multi-body terms in the definition of an accurate model. We observe a slow convergence in the multi-body expansion, where up to five-body interactions are needed to reproduce the free energy of an atomistic model.
Affiliation(s)
- Jiang Wang
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, USA
- Nicholas Charron
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, USA
- Brooke Husic
  - Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 6, 14195 Berlin, Germany
- Simon Olsson
  - Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 6, 14195 Berlin, Germany
- Frank Noé
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, USA
- Cecilia Clementi
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, USA
12. Hays JM, Boland E, Kasson PM. Inference of Joint Conformational Distributions from Separately Acquired Experimental Measurements. J Phys Chem Lett 2021; 12:1606-1611. [PMID: 33596657 PMCID: PMC8310705 DOI: 10.1021/acs.jpclett.0c03623] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 05/15/2023]
Abstract
Flexible proteins serve vital roles in a multitude of biological processes. However, determining their full conformational ensembles is extremely difficult because this requires detailed knowledge about the heterogeneity of the protein's degrees of freedom. Label-based experiments such as double electron-electron resonance (DEER) are very useful in studying flexible proteins, as they provide distributional data on heterogeneity. These experiments are typically performed separately, so information about correlation between distributions is lost. We have developed a method to recover correlation information using nonequilibrium work estimates in molecular dynamics refinement. We tested this method on a simple model of an alternating-access transporter for which the true joint distributions are known, and it successfully recovered the true joint distribution. We also applied our method to the protein syntaxin-1a, where it discarded physically implausible conformations. Our method thus provides a way to recover correlation structure in separate experimental measurements of conformational ensembles and refines the resulting structural ensemble.
Affiliation(s)
- Jennifer M. Hays
  - Department of Biomedical Engineering, University of Virginia, Charlottesville, VA, USA
  - Department of Molecular Physiology, University of Virginia, Charlottesville, VA, USA
- Emily Boland
  - Department of Biomedical Engineering, University of Virginia, Charlottesville, VA, USA
  - Department of Molecular Physiology, University of Virginia, Charlottesville, VA, USA
- Peter M. Kasson
  - Department of Biomedical Engineering, University of Virginia, Charlottesville, VA, USA
  - Department of Molecular Physiology, University of Virginia, Charlottesville, VA, USA
  - Science for Life Laboratory, Department of Cell and Molecular Biology, Uppsala, 75124 Sweden
13. Susanty M, Rajab TE, Hertadi R. A Review of Protein Structure Prediction using Deep Learning. BIO Web of Conferences 2021. [DOI: 10.1051/bioconf/20214104003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/14/2022]
Abstract
Proteins are macromolecules composed of 20 types of amino acids in a specific order. Understanding how proteins fold is vital because a protein's three-dimensional structure determines its function. Prediction of protein structure based on amino acid sequence and evolutionary information becomes the basis for other studies such as predicting the function, property or behaviour of a protein and modifying or designing new proteins to perform certain desired functions. Machine learning advances, particularly deep learning, are igniting a paradigm shift in scientific study. In this review, we summarize recent work in applying deep learning techniques to tackle problems in protein structure prediction. We discuss various deep learning approaches used to predict protein structure, as well as future achievements and challenges. This review is expected to help provide perspectives on problems in biochemistry that can take advantage of the deep learning approach. Some of the unanswered challenges with current computational approaches are predicting the location and precise orientation of protein side chains, predicting protein interactions with DNA, RNA and other small molecules, and predicting the structure of protein complexes.
14. Empereur-Mot C, Pesce L, Doni G, Bochicchio D, Capelli R, Perego C, Pavan GM. Swarm-CG: Automatic Parametrization of Bonded Terms in MARTINI-Based Coarse-Grained Models of Simple to Complex Molecules via Fuzzy Self-Tuning Particle Swarm Optimization. ACS Omega 2020; 5:32823-32843. [PMID: 33376921 PMCID: PMC7758974 DOI: 10.1021/acsomega.0c05469] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Received: 11/09/2020] [Accepted: 11/26/2020] [Indexed: 05/23/2023]
Abstract
We present Swarm-CG, a versatile software for the automatic iterative parametrization of bonded parameters in coarse-grained (CG) models, ideal in combination with popular CG force fields such as MARTINI. By coupling fuzzy self-tuning particle swarm optimization to Boltzmann inversion, Swarm-CG performs accurate bottom-up parametrization of bonded terms in CG models composed of up to 200 pseudo atoms within 4-24 h on standard desktop machines, using default settings. The software benefits from a user-friendly interface and two different usage modes (default and advanced). We particularly expect Swarm-CG to support and facilitate the development of new CG models for the study of complex molecular systems interesting for bio- and nanotechnology. Excellent performances are demonstrated using a benchmark of 9 molecules of diverse nature, structural complexity, and size. Swarm-CG is available with all its dependencies via the Python Package Index (PIP package: swarm-cg). Demonstration data are available at: www.github.com/GMPavanLab/SwarmCG.
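Boltzmann inversion, which this abstract names as one ingredient of the parametrization, converts a sampled distribution of a bonded coordinate into an effective potential via U(r) = -k_B T ln P(r). The sketch below is a minimal illustration under stated assumptions: the function names, the synthetic bond-length samples, and the harmonic fit are all hypothetical and chosen for this example. It does not use or reproduce the Swarm-CG package or its particle swarm optimizer.

```python
import numpy as np

KB = 0.0083144626  # Boltzmann constant in kJ/(mol K)

def boltzmann_invert(samples, bins, value_range, temperature=300.0):
    """Return bin centers and U = -kB*T*ln(P), shifted so that min(U) = 0."""
    hist, edges = np.histogram(samples, bins=bins, range=value_range, density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    populated = hist > 0  # empty bins would give log(0), so skip them
    u = -KB * temperature * np.log(hist[populated])
    return centers[populated], u - u.min()

def fit_harmonic(centers, u):
    """Fit U(r) = 0.5*k*(r - r0)**2 + const with a quadratic polynomial."""
    c2, c1, _ = np.polyfit(centers, u, 2)
    return 2.0 * c2, -c1 / (2.0 * c2)  # force constant k, equilibrium length r0

# Synthetic "reference" bond lengths (assumption): harmonic bond at 300 K.
rng = np.random.default_rng(1)
k_true, r0_true, temperature = 1250.0, 0.47, 300.0  # kJ/(mol nm^2), nm, K
sigma = np.sqrt(KB * temperature / k_true)
samples = rng.normal(r0_true, sigma, size=200_000)

window = (r0_true - 2 * sigma, r0_true + 2 * sigma)
centers, u = boltzmann_invert(samples, bins=40, value_range=window, temperature=temperature)
k_est, r0_est = fit_harmonic(centers, u)
print(f"k = {k_est:.0f} kJ/(mol nm^2), r0 = {r0_est:.3f} nm")
```

Restricting the histogram to a window around the mean keeps poorly sampled tail bins out of the fit; in an iterative scheme, an estimate like this would only be a starting point that is refined against the reference distributions.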
Affiliation(s)
- Charly Empereur-Mot
  - Department of Innovative Technologies, University of Applied Sciences and Arts of Southern Switzerland, Galleria 2, Via Cantonale 2c, CH-6928 Manno, Switzerland
- Luca Pesce
  - Department of Innovative Technologies, University of Applied Sciences and Arts of Southern Switzerland, Galleria 2, Via Cantonale 2c, CH-6928 Manno, Switzerland
- Giovanni Doni
  - Department of Innovative Technologies, University of Applied Sciences and Arts of Southern Switzerland, Galleria 2, Via Cantonale 2c, CH-6928 Manno, Switzerland
- Davide Bochicchio
  - Department of Innovative Technologies, University of Applied Sciences and Arts of Southern Switzerland, Galleria 2, Via Cantonale 2c, CH-6928 Manno, Switzerland
- Riccardo Capelli
  - Department of Applied Science and Technology, Politecnico di Torino, Corso Duca degli Abruzzi 24, 10129 Torino, Italy
- Claudio Perego
  - Department of Innovative Technologies, University of Applied Sciences and Arts of Southern Switzerland, Galleria 2, Via Cantonale 2c, CH-6928 Manno, Switzerland
- Giovanni M. Pavan
  - Department of Innovative Technologies, University of Applied Sciences and Arts of Southern Switzerland, Galleria 2, Via Cantonale 2c, CH-6928 Manno, Switzerland
  - Department of Applied Science and Technology, Politecnico di Torino, Corso Duca degli Abruzzi 24, 10129 Torino, Italy
15. Wang J, Chmiela S, Müller KR, Noé F, Clementi C. Ensemble learning of coarse-grained molecular dynamics force fields with a kernel approach. J Chem Phys 2020; 152:194106. [DOI: 10.1063/5.0007276] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Indexed: 02/05/2023]
Affiliation(s)
- Jiang Wang
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, USA
  - Department of Chemistry, Rice University, Houston, Texas 77005, USA
- Stefan Chmiela
  - Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
- Klaus-Robert Müller
  - Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
  - Department of Brain and Cognitive Engineering, Korea University, Seoul 02841, South Korea
  - Max Planck Institute for Informatics, Saarbrücken 66123, Germany
- Frank Noé
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, USA
  - Department of Chemistry, Rice University, Houston, Texas 77005, USA
  - Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 6, 14195 Berlin, Germany
  - Department of Physics, Freie Universität Berlin, Arnimallee 14, 14195 Berlin, Germany
- Cecilia Clementi
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, USA
  - Department of Chemistry, Rice University, Houston, Texas 77005, USA
  - Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 6, 14195 Berlin, Germany
  - Department of Physics, Freie Universität Berlin, Arnimallee 14, 14195 Berlin, Germany
  - Department of Physics, Rice University, Houston, Texas 77005, USA
16. Nerattini F, Figliuzzi M, Cardelli C, Tubiana L, Bianco V, Dellago C, Coluzza I. Identification of Protein Functional Regions. ChemPhysChem 2020; 21:335-347. [PMID: 31944517 DOI: 10.1002/cphc.201900898] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/15/2019] [Revised: 11/01/2019] [Indexed: 11/12/2022]
Abstract
A protein's sequence stores the information relative to both functionality and stability, thus making it difficult to disentangle the two contributions. However, the identification of critical residues for function and stability has important implications for the mapping of the proteome interactions, as well as for many pharmaceutical applications, e.g., the identification of ligand binding regions for targeted pharmaceutical protein design. In this work, we propose a computational method to identify critical residues for protein functionality and stability and to further categorise them as strictly functional, structural, or intermediate. We evaluate single-site conservation and use Direct Coupling Analysis (DCA) to identify co-evolved residues both in natural and artificial evolution processes. We reproduce artificial evolution using protein design and base our approach on the hypothesis that artificial evolution in the absence of any functional constraint would exclusively lead to site conservation and co-evolution events of the structural type. Conversely, natural evolution intrinsically embeds both functional and structural information. By comparing the lists of conserved and co-evolved residues obtained from the analysis of natural and artificial evolution, we identify the functional residues without the need for any a priori knowledge of the biological role of the analysed protein.
Affiliation(s)
- Francesca Nerattini
  - Faculty of Physics, University of Vienna, Boltzmanngasse 5, 1090 Vienna, Austria
- Matteo Figliuzzi
  - Sorbonne Universités, UPMC, Institut de Biologie Paris-Seine, CNRS, Laboratoire de Biologie Computationnelle et Quantitative, UMR 7238, Paris, France
- Chiara Cardelli
  - Faculty of Physics, University of Vienna, Boltzmanngasse 5, 1090 Vienna, Austria
- Luca Tubiana
  - Physics Department, Università degli Studi di Trento, via Sommarive 14, 38123 Trento, Italy
- Valentino Bianco
  - Faculty of Physics, University of Vienna, Boltzmanngasse 5, 1090 Vienna, Austria
  - Faculty of Chemistry, Chemical Physics Department, Universidad Complutense de Madrid, Plaza de las Ciencias, Ciudad Universitaria, Madrid, 28040, Spain
- Christoph Dellago
  - Faculty of Physics, University of Vienna, Boltzmanngasse 5, 1090 Vienna, Austria
- Ivan Coluzza
  - CIC biomaGUNE, Paseo Miramon 182, 20014 San Sebastian, Spain
  - IKERBASQUE, Basque Foundation for Science, 48013 Bilbao, Spain
|
17
|
Machine learning for protein folding and dynamics. Curr Opin Struct Biol 2020; 60:77-84. [DOI: 10.1016/j.sbi.2019.12.005] [Citation(s) in RCA: 60] [Impact Index Per Article: 15.0] [Received: 09/13/2019] [Revised: 11/21/2019] [Accepted: 12/05/2019] [Indexed: 12/17/2022]
|
18
|
Orioli S, Larsen AH, Bottaro S, Lindorff-Larsen K. How to learn from inconsistencies: Integrating molecular simulations with experimental data. Prog Mol Biol Transl Sci 2020; 170:123-176. [PMID: 32145944 DOI: 10.1016/bs.pmbts.2019.12.006] [Citation(s) in RCA: 51] [Impact Index Per Article: 12.8] [Indexed: 12/28/2022]
Abstract
Molecular simulations and biophysical experiments can be used to provide independent and complementary insights into the molecular origin of biological processes. A particularly useful strategy is to use molecular simulations as a modeling tool to interpret experimental measurements, and to use experimental data to refine our biophysical models. Thus, explicit integration and synergy between molecular simulations and experiments is fundamental for furthering our understanding of biological processes. This is especially true in the case where discrepancies between measured and simulated observables emerge. In this chapter, we provide an overview of some of the core ideas behind methods that were developed to improve the consistency between experimental information and numerical predictions. We distinguish between situations where experiments are used to refine our understanding and models of specific systems, and situations where experiments are used more generally to refine transferable models. We discuss different philosophies and attempt to unify them in a single framework. Until now, such integration between experiments and simulations has mostly been applied to equilibrium data, and we discuss more recent developments aimed at analyzing time-dependent or time-resolved data.
Affiliation(s)
- Simone Orioli
  - Structural Biology and NMR Laboratory & Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
  - Structural Biophysics, Niels Bohr Institute, Faculty of Science, University of Copenhagen, Copenhagen, Denmark
- Andreas Haahr Larsen
  - Structural Biology and NMR Laboratory & Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
  - Structural Biophysics, Niels Bohr Institute, Faculty of Science, University of Copenhagen, Copenhagen, Denmark
- Sandro Bottaro
  - Structural Biology and NMR Laboratory & Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
  - Atomistic Simulations Laboratory, Istituto Italiano di Tecnologia, Genova, Italy
- Kresten Lindorff-Larsen
  - Structural Biology and NMR Laboratory & Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
|
19
|
Recent Progress towards Chemically-Specific Coarse-Grained Simulation Models with Consistent Dynamical Properties. Computation 2019. [DOI: 10.3390/computation7030042] [Citation(s) in RCA: 34] [Impact Index Per Article: 6.8] [Indexed: 01/26/2023]
Abstract
Coarse-grained (CG) models can provide computationally efficient and conceptually simple characterizations of soft matter systems. While generic models probe the underlying physics governing an entire family of free-energy landscapes, bottom-up CG models are systematically constructed from a higher-resolution model to retain a high level of chemical specificity. The removal of degrees of freedom from the system modifies the relationship between the relative time scales of distinct dynamical processes through both a loss of friction and a “smoothing” of the free-energy landscape. While these effects typically result in faster dynamics, decreasing the computational expense of the model, they also obscure the connection to the true dynamics of the system. The lack of consistent dynamics is a serious limitation for CG models, which not only prevents quantitatively accurate predictions of dynamical observables but can also lead to qualitatively incorrect descriptions of the characteristic dynamical processes. With many methods available for optimizing the structural and thermodynamic properties of chemically-specific CG models, recent years have seen a stark increase in investigations addressing the accurate description of dynamical properties generated from CG simulations. In this review, we present an overview of these efforts, ranging from bottom-up parameterizations of generalized Langevin equations to refinements of the CG force field based on a Markov state modeling framework. We aim to make connections between seemingly disparate approaches, while laying out some of the major challenges as well as potential directions for future efforts.
|
20
|
Hays JM, Cafiso DS, Kasson PM. Hybrid Refinement of Heterogeneous Conformational Ensembles Using Spectroscopic Data. J Phys Chem Lett 2019; 10:3410-3414. [PMID: 31181934 PMCID: PMC6605767 DOI: 10.1021/acs.jpclett.9b01407] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 05/28/2023]
Abstract
Multistructured biomolecular systems play crucial roles in a wide variety of cellular processes but have resisted traditional methods of structure determination, which often resolve only a few low-energy states. High-resolution structure determination using experimental methods that yield distributional data remains extremely difficult, especially when the underlying conformational ensembles are quite heterogeneous. We have therefore developed a method to integrate sparse, multimodal spectroscopic data to obtain high-resolution estimates of conformational ensembles. We have tested our method by incorporating double electron-electron resonance data on the soluble N-ethylmaleimide-sensitive factor attachment receptor (SNARE) protein syntaxin-1a into biased molecular dynamics simulations. We find that our method substantially outperforms existing state-of-the-art methods in capturing syntaxin's open-closed conformational equilibrium and further yields new conformational states that are consistent with experimental data and may help in understanding syntaxin's function. Our improved methods for refining heterogeneous conformational ensembles from spectroscopic data will greatly accelerate the structural understanding of such systems.
Affiliation(s)
- Jennifer M. Hays
  - Department of Biomedical Engineering, University of Virginia, Charlottesville, VA, 22903
  - Department of Molecular Physiology and Biophysics, University of Virginia, Charlottesville, VA, 22903
- David S. Cafiso
  - Department of Molecular Physiology and Biophysics, University of Virginia, Charlottesville, VA, 22903
  - Department of Chemistry, University of Virginia, Charlottesville, VA, 22903
- Peter M. Kasson
  - Department of Biomedical Engineering, University of Virginia, Charlottesville, VA, 22903
  - Department of Molecular Physiology and Biophysics, University of Virginia, Charlottesville, VA, 22903
  - Science for Life Laboratory, Department of Cell and Molecular Biology, Uppsala University, 75124 Uppsala, Sweden
|
21
|
Wang J, Olsson S, Wehmeyer C, Pérez A, Charron NE, de Fabritiis G, Noé F, Clementi C. Machine Learning of Coarse-Grained Molecular Dynamics Force Fields. ACS Cent Sci 2019; 5:755-767. [PMID: 31139712 PMCID: PMC6535777 DOI: 10.1021/acscentsci.8b00913] [Citation(s) in RCA: 199] [Impact Index Per Article: 39.8] [Received: 12/09/2018] [Indexed: 05/17/2023]
Abstract
Atomistic or ab initio molecular dynamics simulations are widely used to predict thermodynamics and kinetics and relate them to molecular structure. A common approach to go beyond the time- and length-scales accessible with such computationally expensive simulations is the definition of coarse-grained molecular models. Existing coarse-graining approaches define an effective interaction potential to match defined properties of high-resolution models or experimental data. In this paper, we reformulate coarse-graining as a supervised machine learning problem. We use statistical learning theory to decompose the coarse-graining error and cross-validation to select and compare the performance of different models. We introduce CGnets, a deep learning approach, that learns coarse-grained free energy functions and can be trained by a force-matching scheme. CGnets maintain all physically relevant invariances and allow one to incorporate prior physics knowledge to avoid sampling of unphysical structures. We show that CGnets can capture all-atom explicit-solvent free energy surfaces with models using only a few coarse-grained beads and no solvent, while classical coarse-graining methods fail to capture crucial features of the free energy surface. Thus, CGnets are able to capture multibody terms that emerge from the dimensionality reduction.
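The force-matching idea named in this abstract can be illustrated with a deliberately small sketch. It is not the CGnets implementation: it assumes a single coarse-grained coordinate and swaps the neural network for a two-parameter linear force model fitted by ordinary least squares. The synthetic reference forces, the constants `K_TRUE` and `X0_TRUE`, and all function names are hypothetical stand-ins for forces mapped from an atomistic simulation.

```python
import random


def fit_force_matching(xs, forces):
    """Least-squares fit of the linear force model F(x) = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_f = sum(forces) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxf = sum((x - mean_x) * (f - mean_f) for x, f in zip(xs, forces))
    b = sxf / sxx
    a = mean_f - b * mean_x
    return a, b


def force_matching_loss(xs, forces, a, b):
    """Mean squared deviation between reference and model forces."""
    return sum((f - (a + b * x)) ** 2 for x, f in zip(xs, forces)) / len(xs)


# Synthetic "reference" data (assumption): a harmonic mean force plus noise,
# standing in for instantaneous atomistic forces projected onto the CG coordinate.
rng = random.Random(0)
K_TRUE, X0_TRUE = 2.0, 1.5
xs = [rng.uniform(0.0, 3.0) for _ in range(2000)]
forces = [-K_TRUE * (x - X0_TRUE) + rng.gauss(0.0, 0.5) for x in xs]

a, b = fit_force_matching(xs, forces)
k_fit = -b          # F(x) = -k (x - x0)  =>  b = -k
x0_fit = a / k_fit  #                         a = k * x0
print(f"fitted k = {k_fit:.3f}, fitted x0 = {x0_fit:.3f}")
```

The fitted force integrates to a harmonic free energy along the coarse coordinate; the noise in the reference forces averages out in the fit, which is why force matching recovers a mean force rather than the instantaneous one.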
Affiliation(s)
- Jiang Wang
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, United States
  - Department of Chemistry, Rice University, Houston, Texas 77005, United States
- Simon Olsson
  - Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 6, 14195 Berlin, Germany
- Christoph Wehmeyer
  - Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 6, 14195 Berlin, Germany
- Adrià Pérez
  - Computational Science Laboratory, Universitat Pompeu Fabra, PRBB, C/Dr Aiguader 88, 08003 Barcelona, Spain
- Nicholas E. Charron
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, United States
  - Department of Physics, Rice University, Houston, Texas 77005, United States
- Gianni de Fabritiis
  - Computational Science Laboratory, Universitat Pompeu Fabra, PRBB, C/Dr Aiguader 88, 08003 Barcelona, Spain
  - Institucio Catalana de Recerca i Estudis Avançats (ICREA), Passeig Lluis Companys 23, 08010 Barcelona, Spain
- Frank Noé
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, United States
  - Department of Chemistry, Rice University, Houston, Texas 77005, United States
  - Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 6, 14195 Berlin, Germany
- Cecilia Clementi
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, United States
  - Department of Chemistry, Rice University, Houston, Texas 77005, United States
  - Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 6, 14195 Berlin, Germany
  - Department of Physics, Rice University, Houston, Texas 77005, United States
|
22
|
Chen J, Schafer NP, Wolynes PG, Clementi C. Localizing Frustration in Proteins Using All-Atom Energy Functions. J Phys Chem B 2019; 123:4497-4504. [PMID: 31063375 DOI: 10.1021/acs.jpcb.9b01545] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Indexed: 12/22/2022]
Abstract
The problems of protein folding and protein design are two sides of the same coin. Protein folding involves exploring a protein's configuration space given a fixed sequence, whereas protein design involves searching in sequence space given a particular target structure. For a protein to fold quickly and reliably, its energy landscape must be biased toward the folded ensemble throughout its configuration space and must lack deep kinetic traps that would otherwise frustrate folding. Evolution has "designed" the sequences of many naturally occurring proteins, through an eons-long process of random mutation and selection, to yield landscapes with a minimal degree of frustration. The task facing humans hoping to design protein sequences that fold into particular structures is to use the available approximate energy functions to sculpt funneled landscapes that work in the laboratory. In this work, we demonstrate how to calculate several localized frustration measures using an all-atom energy function. Specifically, we employ the Rosetta energy function, which has been used successfully to design proteins and which has a natural pairwise decomposition that is suitably solvent-averaged. We calculate these newly developed frustration measures for both a mutated WW domain, FiP35, and a three-helix bundle that was designed completely by humans, Alpha3D. The structure of FiP35 exhibits less localized frustration than that of Alpha3D. A mutation toward the consensus sequence for WW domains in FiP35, which has been shown unexpectedly in experiment to disrupt folding, induces localized frustration by disrupting the hydrophobic core. By performing a limited redesign on the sequence of Alpha3D, we show that some, but not all, mutations that lower the energy also result in decreased frustration. The results suggest that, in addition to being useful for detecting residual frustration in protein structures, optimizing the localized frustration measures presented here may be a useful and automatic means of balancing positive and negative design in protein design tasks.
|
23
|
Cardelli C, Nerattini F, Tubiana L, Bianco V, Dellago C, Sciortino F, Coluzza I. General Methodology to Identify the Minimum Alphabet Size for Heteropolymer Design. Adv Theory Simul 2019. [DOI: 10.1002/adts.201900031] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 12/20/2022]
Affiliation(s)
- Chiara Cardelli
  - Faculty of Physics, University of Vienna, Boltzmanngasse 5, 1090 Vienna, Austria
- Luca Tubiana
  - Faculty of Physics, University of Vienna, Boltzmanngasse 5, 1090 Vienna, Austria
- Valentino Bianco
  - Faculty of Chemistry, Chemical Physics Department, Universidad Complutense de Madrid, Plaza de las Ciencias, Ciudad Universitaria, Madrid 28040, Spain
- Christoph Dellago
  - Faculty of Physics, University of Vienna, Boltzmanngasse 5, 1090 Vienna, Austria
- Francesco Sciortino
  - Dipartimento di Fisica, Sapienza Università di Roma, Piazzale Aldo Moro 2, 00185 Rome, Italy
- Ivan Coluzza
  - CIC biomaGUNE, Paseo Miramon 182, 20014 San Sebastian, Spain
  - IKERBASQUE, Basque Foundation for Science, 48013 Bilbao, Spain
|
24
|
Cesari A, Bottaro S, Lindorff-Larsen K, Banáš P, Šponer J, Bussi G. Fitting Corrections to an RNA Force Field Using Experimental Data. J Chem Theory Comput 2019; 15:3425-3431. [DOI: 10.1021/acs.jctc.9b00206] [Citation(s) in RCA: 41] [Impact Index Per Article: 8.2] [Indexed: 11/28/2022]
Affiliation(s)
- Andrea Cesari
  - Scuola Internazionale Superiore di Studi Avanzati (SISSA), via Bonomea 265, 34136 Trieste, Italy
- Sandro Bottaro
  - Structural Biology and NMR Laboratory and Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, DK-2200 Copenhagen, Denmark
- Kresten Lindorff-Larsen
  - Structural Biology and NMR Laboratory and Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, DK-2200 Copenhagen, Denmark
- Pavel Banáš
  - Regional Centre of Advanced Technologies and Materials, Department of Physical Chemistry, Faculty of Science, Palacký University, tř. 17 listopadu 12, 771 46, Olomouc, Czech Republic
- Jiří Šponer
  - Regional Centre of Advanced Technologies and Materials, Department of Physical Chemistry, Faculty of Science, Palacký University, tř. 17 listopadu 12, 771 46, Olomouc, Czech Republic
  - Institute of Biophysics of the Czech Academy of Sciences, Kralovopolska 135, Brno 612 65, Czech Republic
- Giovanni Bussi
  - Scuola Internazionale Superiore di Studi Avanzati (SISSA), via Bonomea 265, 34136 Trieste, Italy
|
25
|
Latham AP, Zhang B. Improving Coarse-Grained Protein Force Fields with Small-Angle X-ray Scattering Data. J Phys Chem B 2019; 123:1026-1034. [PMID: 30620594 DOI: 10.1021/acs.jpcb.8b10336] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Indexed: 12/11/2022]
Abstract
Small-angle X-ray scattering (SAXS) experiments provide valuable structural data for biomolecules in solution. We develop a highly efficient maximum entropy approach to fit SAXS data by introducing minimal biases to a coarse-grained protein force field, the associative memory, water mediated, structure, and energy model (AWSEM). We demonstrate that the resulting force field, AWSEM-SAXS, succeeds in reproducing scattering profiles and models protein structures with shapes that are in much better agreement with experimental results. Quantitative metrics further reveal a modest, but consistent, improvement in the accuracy of modeled structures when SAXS data are incorporated into the force field. Additionally, when applied to a multiconformational protein, we find that AWSEM-SAXS is able to recover the population of different protein conformations from SAXS data alone. We, therefore, conclude that the maximum entropy approach is effective in fine-tuning the force field to better characterize both protein structure and conformational fluctuation.
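The maximum entropy principle referred to in this abstract can be sketched in a few lines. This is not the AWSEM-SAXS procedure: it assumes one scalar observable instead of a full scattering profile, and it reweights existing samples rather than adding a bias to a force field during simulation. The sample data, `TARGET`, and all function names are hypothetical. The minimally perturbed ensemble has weights proportional to exp(-lam * s_i), and lam is chosen so that the reweighted average equals the experimental value.

```python
import math
import random


def reweighted_mean(obs, lam):
    """Average of obs under weights w_i proportional to exp(-lam * obs_i)."""
    exponents = [-lam * s for s in obs]
    shift = max(exponents)  # subtract the maximum for numerical stability
    weights = [math.exp(e - shift) for e in exponents]
    return sum(w * s for w, s in zip(weights, obs)) / sum(weights)


def solve_lambda(obs, target, lo=-50.0, hi=50.0, iterations=60):
    """Bisection on lam; the reweighted mean decreases monotonically with lam."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if reweighted_mean(obs, mid) > target:
            lo = mid  # mean still too high: a stronger bias is needed
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical data: an observable sampled from an unbiased simulation
# (mean near 10) and an "experimental" target value of 9.
rng = random.Random(1)
observable = [rng.gauss(10.0, 2.0) for _ in range(2000)]
TARGET = 9.0

lam = solve_lambda(observable, TARGET)
print(f"lambda = {lam:.4f}, reweighted mean = {reweighted_mean(observable, lam):.6f}")
```

Bisection is sufficient here because the derivative of the reweighted mean with respect to lam is minus the variance of the observable, so the mean is strictly decreasing in lam; with several observables one multiplier per observable would be needed and a gradient-based solver would be the more natural choice.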
Affiliation(s)
- Andrew P Latham
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Bin Zhang
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
|
26
|
Advances in coarse-grained modeling of macromolecular complexes. Curr Opin Struct Biol 2018; 52:119-126. [PMID: 30508766 DOI: 10.1016/j.sbi.2018.11.005] [Citation(s) in RCA: 83] [Impact Index Per Article: 13.8] [Received: 08/13/2018] [Revised: 11/05/2018] [Accepted: 11/17/2018] [Indexed: 01/12/2023]
Abstract
Recent progress in coarse-grained (CG) molecular modeling and simulation has facilitated an influx of computational studies on biological macromolecules and their complexes. Given the large separation of length-scales and time-scales that dictate macromolecular biophysics, CG modeling and simulation are well-suited to bridge the microscopic and mesoscopic or macroscopic details observed from all-atom molecular simulations and experiments, respectively. In this review, we first summarize recent innovations in the development of CG models, which broadly include structure-based, knowledge-based, and dynamics-based approaches. We then discuss recent applications of different classes of CG models to explore various macromolecular complexes. Finally, we conclude with an outlook for the future in this ever-growing field of biomolecular modeling.
|
27
|
Bottaro S, Lindorff-Larsen K. Biophysical experiments and biomolecular simulations: A perfect match? Science 2018; 361:355-360. [DOI: 10.1126/science.aat4010] [Citation(s) in RCA: 135] [Impact Index Per Article: 22.5] [Indexed: 12/15/2022]
Abstract
A fundamental challenge in biological research is achieving an atomic-level description and mechanistic understanding of the function of biomolecules. Techniques for biomolecular simulations have undergone substantial developments, and their accuracy and scope have expanded considerably. Progress has been made through an increasingly tight integration of experiments and simulations, with experiments being used to refine simulations and simulations used to interpret experiments. Here we review the underpinnings of this progress, including methods for more efficient conformational sampling, accuracy of the physical models used, and theoretical approaches to integrate experiments and simulations. These developments are enabling detailed studies of complex biomolecular assemblies.
|