1
Meng X, Templeton C, Clementi C, Veit M. The role of an amphiphilic helix and transmembrane region in the efficient acylation of the M2 protein from influenza virus. Sci Rep 2023; 13:18928. [PMID: 37919373 PMCID: PMC10622425 DOI: 10.1038/s41598-023-45945-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2023] [Accepted: 10/26/2023] [Indexed: 11/04/2023] Open
Abstract
Protein palmitoylation, a cellular process occurring at the membrane-cytosol interface, is orchestrated by members of the DHHC enzyme family and plays a pivotal role in regulating various cellular functions. The M2 protein of the influenza virus, which is acylated at a membrane-near amphiphilic helix, serves as a model for studying the intricate signals governing acylation and its interaction with the cognate enzyme, DHHC20. We investigate it here using both experimental and computational assays. We report that altering the biophysical properties of the amphiphilic helix, particularly by shortening or disrupting it, results in a substantial reduction in M2 palmitoylation, but does not entirely abolish the process. Intriguingly, DHHC20 exhibits an augmented affinity for some M2 mutants compared to the wildtype M2. Molecular dynamics simulations unveil interactions between amino acids of the helix and the catalytically significant DHHC and TTXE motifs of DHHC20. Our findings suggest that the binding of M2 to DHHC20, while not highly specific, is mediated by requisite contacts, possibly instigating the transfer of fatty acids. A comprehensive understanding of protein palmitoylation mechanisms is imperative for the development of DHHC-specific inhibitors, holding promise for the treatment of diverse human diseases.
Affiliation(s)
- Xiaorong Meng
- Institute of Virology, Veterinary Faculty, Freie Universität Berlin, Berlin, Germany
- Clark Templeton
- Theoretical and Computational Biophysics, Department of Physics, Freie Universität Berlin, Berlin, Germany
- Cecilia Clementi
- Theoretical and Computational Biophysics, Department of Physics, Freie Universität Berlin, Berlin, Germany
- Michael Veit
- Institute of Virology, Veterinary Faculty, Freie Universität Berlin, Berlin, Germany.
2
Arts M, Garcia Satorras V, Huang CW, Zügner D, Federici M, Clementi C, Noé F, Pinsler R, van den Berg R. Two for One: Diffusion Models and Force Fields for Coarse-Grained Molecular Dynamics. J Chem Theory Comput 2023; 19:6151-6159. [PMID: 37688551 DOI: 10.1021/acs.jctc.3c00702] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/11/2023]
Abstract
Coarse-grained (CG) molecular dynamics enables the study of biological processes at temporal and spatial scales that would be intractable at an atomistic resolution. However, accurately learning a CG force field remains a challenge. In this work, we leverage connections between score-based generative models, force fields, and molecular dynamics to learn a CG force field without requiring any force inputs during training. Specifically, we train a diffusion generative model on protein structures from molecular dynamics simulations, and we show that its score function approximates a force field that can directly be used to simulate CG molecular dynamics. While having a vastly simplified training setup compared to previous work, we demonstrate that our approach leads to improved performance across several protein simulations for systems up to 56 amino acids, reproducing the CG equilibrium distribution and preserving the dynamics of all-atom simulations such as protein folding events.
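The connection between score functions and force fields mentioned in this abstract can be sketched from the Boltzmann distribution. The symbols below ($U$ for the CG potential of mean force, $F$ for the CG force, $k_B T$ for the thermal energy, $p$ for the CG equilibrium density) are notation assumed here for illustration, not taken from the paper:

```latex
p(x) \propto \exp\!\left(-\frac{U(x)}{k_B T}\right)
\quad\Longrightarrow\quad
\nabla_x \log p(x) = -\frac{\nabla_x U(x)}{k_B T} = \frac{F(x)}{k_B T}
```

Under these assumptions, a score model evaluated at a sufficiently low noise level approximates the CG force up to the factor $1/k_B T$, which is consistent with training that needs no force inputs.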
Affiliation(s)
- Marloes Arts
- Department of Computer Science, University of Copenhagen, Universitetsparken 1, Copenhagen 2100, Denmark
- Victor Garcia Satorras
- AI4Science, Microsoft Research, Evert van de Beekstraat 354, Amsterdam 1118 CZ, The Netherlands
- Chin-Wei Huang
- AI4Science, Microsoft Research, Evert van de Beekstraat 354, Amsterdam 1118 CZ, The Netherlands
- Daniel Zügner
- AI4Science, Microsoft Research, Karl-Liebknecht-Straße 32, Berlin 10178, Germany
- Marco Federici
- Informatics Institute, University of Amsterdam, Science Park 904, Amsterdam 1098 XH, The Netherlands
- Cecilia Clementi
- AI4Science, Microsoft Research, Karl-Liebknecht-Straße 32, Berlin 10178, Germany
- Department of Physics, Freie Universität Berlin, Arnimallee 12, Berlin 14195, Germany
- Frank Noé
- AI4Science, Microsoft Research, Karl-Liebknecht-Straße 32, Berlin 10178, Germany
- Robert Pinsler
- AI4Science, Microsoft Research, 21 Station Road, Cambridge CB1 2FB, U.K.
- Rianne van den Berg
- AI4Science, Microsoft Research, Evert van de Beekstraat 354, Amsterdam 1118 CZ, The Netherlands
3
Majewski M, Pérez A, Thölke P, Doerr S, Charron NE, Giorgino T, Husic BE, Clementi C, Noé F, De Fabritiis G. Machine learning coarse-grained potentials of protein thermodynamics. Nat Commun 2023; 14:5739. [PMID: 37714883 PMCID: PMC10504246 DOI: 10.1038/s41467-023-41343-1] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 06/02/2023] [Accepted: 08/29/2023] [Indexed: 09/17/2023] Open
Abstract
A generalized understanding of protein dynamics is an unsolved scientific problem, the solution of which is critical to the interpretation of the structure-function relationships that govern essential biological processes. Here, we approach this problem by constructing coarse-grained molecular potentials based on artificial neural networks and grounded in statistical mechanics. For training, we build a unique dataset of unbiased all-atom molecular dynamics simulations of approximately 9 ms for twelve different proteins with multiple secondary structure arrangements. The coarse-grained models are capable of accelerating the dynamics by more than three orders of magnitude while preserving the thermodynamics of the systems. Coarse-grained simulations identify relevant structural states in the ensemble with comparable energetics to the all-atom systems. Furthermore, we show that a single coarse-grained potential can integrate all twelve proteins and can capture experimental structural features of mutated proteins. These results indicate that machine learning coarse-grained potentials could provide a feasible approach to simulate and understand protein dynamics.
Affiliation(s)
- Maciej Majewski
- Computational Science Laboratory, Universitat Pompeu Fabra, Barcelona Biomedical Research Park (PRBB), Carrer Dr. Aiguader 88, 08003, Barcelona, Spain
- Acellera Labs, Doctor Trueta 183, 08005, Barcelona, Spain
- Adrià Pérez
- Computational Science Laboratory, Universitat Pompeu Fabra, Barcelona Biomedical Research Park (PRBB), Carrer Dr. Aiguader 88, 08003, Barcelona, Spain
- Acellera Labs, Doctor Trueta 183, 08005, Barcelona, Spain
- Philipp Thölke
- Computational Science Laboratory, Universitat Pompeu Fabra, Barcelona Biomedical Research Park (PRBB), Carrer Dr. Aiguader 88, 08003, Barcelona, Spain
- Stefan Doerr
- Acellera Labs, Doctor Trueta 183, 08005, Barcelona, Spain
- Nicholas E Charron
- Department of Physics, Rice University, Houston, TX, 77005, USA
- Center for Theoretical Biological Physics, Rice University, Houston, TX, 77005, USA
- Department of Physics, FU Berlin, Arnimallee 12, 14195, Berlin, Germany
- Toni Giorgino
- Biophysics Institute, National Research Council (CNR-IBF), 20133, Milan, Italy
- Brooke E Husic
- Department of Mathematics and Computer Science, FU Berlin, Arnimallee 12, 14195, Berlin, Germany
- Lewis Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, 08540, USA
- Princeton Center for Theoretical Science, Princeton University, Princeton, NJ, 08540, USA
- Center for the Physics of Biological Function, Princeton University, Princeton, NJ, 08540, USA
- Cecilia Clementi
- Department of Physics, Rice University, Houston, TX, 77005, USA.
- Center for Theoretical Biological Physics, Rice University, Houston, TX, 77005, USA.
- Department of Physics, FU Berlin, Arnimallee 12, 14195, Berlin, Germany.
- Department of Chemistry, Rice University, Houston, TX, 77005, USA.
- Frank Noé
- Department of Physics, FU Berlin, Arnimallee 12, 14195, Berlin, Germany.
- Department of Mathematics and Computer Science, FU Berlin, Arnimallee 12, 14195, Berlin, Germany.
- Department of Chemistry, Rice University, Houston, TX, 77005, USA.
- Microsoft Research AI4Science, Karl-Liebknecht Str. 32, 10178, Berlin, Germany.
- Gianni De Fabritiis
- Computational Science Laboratory, Universitat Pompeu Fabra, Barcelona Biomedical Research Park (PRBB), Carrer Dr. Aiguader 88, 08003, Barcelona, Spain.
- Acellera Labs, Doctor Trueta 183, 08005, Barcelona, Spain.
- Institució Catalana de Recerca i Estudis Avançats (ICREA), Passeig Lluis Companys 23, 08010, Barcelona, Spain.
4
Abstract
Coarse-grained models allow computational investigation of biomolecular processes occurring on long time and length scales, intractable with atomistic simulation. Traditionally, many coarse-grained models rely mostly on pairwise interaction potentials. However, the decimation of degrees of freedom should, in principle, lead to a complex many-body effective interaction potential. In this work, we use experimental data on mutant stability to parametrize coarse-grained models for two proteins with and without many-body terms. We demonstrate that many-body terms are necessary to reproduce quantitatively the effects of point mutations on protein stability, particularly to implicitly take into account the effect of the solvent.
Affiliation(s)
- Iryna Zaporozhets
- Department of Chemistry, Rice University, 6100 Main Street, Houston, Texas 77005, United States
- Center for Theoretical Biological Physics, Rice University, 6100 Main Street, Houston, Texas 77005, United States
- Department of Physics, Freie Universität, Arnimallee 12, Berlin 14195, Germany
- Cecilia Clementi
- Department of Chemistry, Rice University, 6100 Main Street, Houston, Texas 77005, United States
- Center for Theoretical Biological Physics, Rice University, 6100 Main Street, Houston, Texas 77005, United States
- Department of Physics, Freie Universität, Arnimallee 12, Berlin 14195, Germany
5
Conev A, Rigo MM, Devaurs D, Fonseca AF, Kalavadwala H, de Freitas MV, Clementi C, Zanatta G, Antunes DA, Kavraki LE. EnGens: a computational framework for generation and analysis of representative protein conformational ensembles. Brief Bioinform 2023; 24:bbad242. [PMID: 37418278 PMCID: PMC10359083 DOI: 10.1093/bib/bbad242] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2023] [Revised: 05/23/2023] [Accepted: 06/10/2023] [Indexed: 07/08/2023] Open
Abstract
Proteins are dynamic macromolecules that perform vital functions in cells. A protein structure determines its function, but this structure is not static, as proteins change their conformation to achieve various functions. Understanding the conformational landscapes of proteins is essential to understand their mechanism of action. Sets of carefully chosen conformations can summarize such complex landscapes and provide better insights into protein function than single conformations. We refer to these sets as representative conformational ensembles. Recent advances in computational methods have led to an increase in the number of available structural datasets spanning conformational landscapes. However, extracting representative conformational ensembles from such datasets is not an easy task and many methods have been developed to tackle it. Our new approach, EnGens (short for ensemble generation), collects these methods into a unified framework for generating and analyzing representative protein conformational ensembles. In this work, we: (1) provide an overview of existing methods and tools for representative protein structural ensemble generation and analysis; (2) unify existing approaches in an open-source Python package, and a portable Docker image, providing interactive visualizations within a Jupyter Notebook pipeline; (3) test our pipeline on a few canonical examples from the literature. Representative ensembles produced by EnGens can be used for many downstream tasks such as protein-ligand ensemble docking, Markov state modeling of protein dynamics and analysis of the effect of single-point mutations.
Affiliation(s)
- Anja Conev
- Department of Computer Science, Rice University, Houston 77005, TX, USA
- Didier Devaurs
- MRC Institute of Genetics and Cancer, University of Edinburgh, Edinburgh EH4 2XU, UK
- Hussain Kalavadwala
- Department of Biology and Biochemistry, University of Houston, Houston 77004, TX, USA
- Cecilia Clementi
- Department of Physics, Freie Universität Berlin, Berlin 14195, Germany
- Geancarlo Zanatta
- Department of Biophysics, Institute of Biosciences, Federal University of Rio Grande do Sul, Porto Alegre 91501-970, Brazil
- Dinler Amaral Antunes
- Department of Biology and Biochemistry, University of Houston, Houston 77004, TX, USA
- Lydia E Kavraki
- Department of Computer Science, Rice University, Houston 77005, TX, USA
6
Conev A, Rigo MM, Devaurs D, Fonseca AF, Kalavadwala H, de Freitas MV, Clementi C, Zanatta G, Antunes DA, Kavraki L. EnGens: a computational framework for generation and analysis of representative protein conformational ensembles. bioRxiv 2023:2023.04.24.538094. [PMID: 37163076 PMCID: PMC10168271 DOI: 10.1101/2023.04.24.538094] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/11/2023]
Abstract
Proteins are dynamic macromolecules that perform vital functions in cells. A protein structure determines its function, but this structure is not static, as proteins change their conformation to achieve various functions. Understanding the conformational landscapes of proteins is essential to understand their mechanism of action. Sets of carefully chosen conformations can summarize such complex landscapes and provide better insights into protein function than single conformations. We refer to these sets as representative conformational ensembles. Recent advances in computational methods have led to an increase in the number of available structural datasets spanning conformational landscapes. However, extracting representative conformational ensembles from such datasets is not an easy task and many methods have been developed to tackle it. Our new approach, EnGens (short for ensemble generation), collects these methods into a unified framework for generating and analyzing protein conformational ensembles. In this work, we: (1) provide an overview of existing methods and tools for protein structural ensemble generation and analysis; (2) unify existing approaches in an open-source Python package, and a portable Docker image, providing interactive visualizations within a Jupyter Notebook pipeline; (3) test our pipeline on a few canonical examples found in the literature. Representative ensembles produced by EnGens can be used for many downstream tasks such as protein-ligand ensemble docking, Markov state modeling of protein dynamics and analysis of the effect of single-point mutations.
7
Krämer A, Durumeric AEP, Charron NE, Chen Y, Clementi C, Noé F. Statistically Optimal Force Aggregation for Coarse-Graining Molecular Dynamics. J Phys Chem Lett 2023; 14:3970-3979. [PMID: 37079800 DOI: 10.1021/acs.jpclett.3c00444] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Indexed: 05/03/2023]
Abstract
Machine-learned coarse-grained (CG) models have the potential for simulating large molecular complexes beyond what is possible with atomistic molecular dynamics. However, training accurate CG models remains a challenge. A widely used methodology for learning bottom-up CG force fields maps forces from all-atom molecular dynamics to the CG representation and matches them with a CG force field on average. We show that there is flexibility in how to map all-atom forces to the CG representation and that the most commonly used mapping methods are statistically inefficient and potentially even incorrect in the presence of constraints in the all-atom simulation. We define an optimization statement for force mappings and demonstrate that substantially improved CG force fields can be learned from the same simulation data when using optimized force maps. The method is demonstrated on the miniproteins chignolin and tryptophan cage and published as open-source code.
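The force-mapping step described in this abstract can be illustrated with a toy example. This is a minimal sketch, assuming a linear force map given as a matrix and a plain mean-squared force-matching loss; the function names, the toy data, and the simple "sum the forces on each bead's atoms" map are hypothetical choices for illustration, and the sketch does not implement the optimized maps the paper proposes.

```python
import numpy as np

def map_forces(atom_forces, force_map):
    """Apply a linear force map (n_beads x n_atoms) to all-atom forces
    of shape (n_frames, n_atoms, 3); returns (n_frames, n_beads, 3)."""
    return np.einsum("ba,fad->fbd", force_map, atom_forces)

def force_matching_loss(cg_forces, mapped_forces):
    """Mean over frames and beads of the squared force deviation."""
    return float(np.mean(np.sum((cg_forces - mapped_forces) ** 2, axis=-1)))

# Toy data: 2 frames, 4 atoms, grouped into 2 beads of 2 atoms each.
atom_forces = np.arange(24, dtype=float).reshape(2, 4, 3)

# Hypothetical aggregation map: each bead sums the forces on its own atoms.
force_map = np.array([[1.0, 1.0, 0.0, 0.0],
                      [0.0, 0.0, 1.0, 1.0]])

mapped = map_forces(atom_forces, force_map)

# A CG model that reproduced the mapped forces exactly would have zero loss.
loss_perfect = force_matching_loss(mapped, mapped)
```

Changing the entries of `force_map` changes the regression targets while the simulation data stay the same, which is the flexibility the abstract refers to.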
Affiliation(s)
- Andreas Krämer
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany
- Aleksander E P Durumeric
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany
- Nicholas E Charron
- Department of Physics and Astronomy, Rice University, Houston, Texas 77005, United States
- Center for Theoretical Biological Physics, Rice University, Houston, Texas 77251, United States
- Department of Physics, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany
- Yaoyi Chen
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany
- International Max Planck Research School for Biology and Computation (IMPRS-BAC), Max Planck Institute for Molecular Genetics, 14195 Berlin, Germany
- Cecilia Clementi
- Department of Physics and Astronomy, Rice University, Houston, Texas 77005, United States
- Center for Theoretical Biological Physics, Rice University, Houston, Texas 77251, United States
- Department of Physics, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany
- Department of Chemistry, Rice University, Houston, Texas 77005, United States
- Frank Noé
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany
- Department of Physics, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany
- Department of Chemistry, Rice University, Houston, Texas 77005, United States
- Microsoft Research AI4Science, Karl-Liebknecht Straße 32, 10178 Berlin, Germany
8
Durumeric AEP, Charron NE, Templeton C, Musil F, Bonneau K, Pasos-Trejo AS, Chen Y, Kelkar A, Noé F, Clementi C. Machine learned coarse-grained protein force-fields: Are we there yet? Curr Opin Struct Biol 2023; 79:102533. [PMID: 36731338 PMCID: PMC10023382 DOI: 10.1016/j.sbi.2023.102533] [Citation(s) in RCA: 17] [Impact Index Per Article: 17.0] [Received: 11/01/2022] [Revised: 12/14/2022] [Accepted: 12/18/2022] [Indexed: 02/04/2023]
Abstract
The successful recent application of machine learning methods to scientific problems includes the learning of flexible and accurate atomic-level force-fields for materials and biomolecules from quantum chemical data. In parallel, the machine learning of force-fields at coarser resolutions is rapidly gaining relevance as an efficient way to represent the higher-body interactions needed in coarse-grained force-fields to compensate for the omitted degrees of freedom. Coarse-grained models are important for the study of systems at time and length scales exceeding those of atomistic simulations. However, the development of transferable coarse-grained models via machine learning still presents significant challenges. Here, we discuss recent developments in this field and current efforts to address the remaining challenges.
Affiliation(s)
- Aleksander E P Durumeric
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 12, 14195, Berlin, Germany
- Nicholas E Charron
- Department of Physics and Astronomy, Rice University, 6100 Main Street, Houston, 77005, Texas, USA; Department of Physics, Freie Universität Berlin, Arnimallee 12, 14195, Berlin, Germany; Center for Theoretical Biological Physics, Rice University, 6100 Main Street, Houston, 77005, Texas, USA
- Clark Templeton
- Department of Physics, Freie Universität Berlin, Arnimallee 12, 14195, Berlin, Germany. https://twitter.com/pbrun03
- Félix Musil
- Department of Physics, Freie Universität Berlin, Arnimallee 12, 14195, Berlin, Germany. https://twitter.com/FelixMusil
- Klara Bonneau
- Department of Physics, Freie Universität Berlin, Arnimallee 12, 14195, Berlin, Germany
- Aldo S Pasos-Trejo
- Department of Physics, Freie Universität Berlin, Arnimallee 12, 14195, Berlin, Germany. https://twitter.com/sayeg84
- Yaoyi Chen
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 12, 14195, Berlin, Germany. https://twitter.com/hello_yaoyi
- Atharva Kelkar
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 12, 14195, Berlin, Germany
- Frank Noé
- Microsoft Research AI4Science, Karl-Liebknecht Str. 32, Berlin, 10178, Berlin, Germany; Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 12, 14195, Berlin, Germany; Department of Physics, Freie Universität Berlin, Arnimallee 12, 14195, Berlin, Germany; Department of Chemistry, Rice University, 6100 Main Street, Houston, 77005, Texas, USA. https://twitter.com/FrankNoeBerlin
- Cecilia Clementi
- Department of Physics, Freie Universität Berlin, Arnimallee 12, 14195, Berlin, Germany; Center for Theoretical Biological Physics, Rice University, 6100 Main Street, Houston, 77005, Texas, USA; Department of Chemistry, Rice University, 6100 Main Street, Houston, 77005, Texas, USA; Department of Physics and Astronomy, Rice University, 6100 Main Street, Houston, 77005, Texas, USA.
9
Yang W, Templeton C, Rosenberger D, Bittracher A, Nüske F, Noé F, Clementi C. Slicing and Dicing: Optimal Coarse-Grained Representation to Preserve Molecular Kinetics. ACS Cent Sci 2023; 9:186-196. [PMID: 36844497 PMCID: PMC9951291 DOI: 10.1021/acscentsci.2c01200] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 10/10/2022] [Indexed: 05/05/2023]
Abstract
The aim of molecular coarse-graining approaches is to recover relevant physical properties of the molecular system via a lower-resolution model that can be more efficiently simulated. Ideally, the lower resolution still accounts for the degrees of freedom necessary to recover the correct physical behavior. The selection of these degrees of freedom has often relied on the scientist's chemical and physical intuition. In this article, we argue that, in soft matter contexts, desirable coarse-grained models accurately reproduce the long-time dynamics of a system by correctly capturing the rare-event transitions. We propose a bottom-up coarse-graining scheme that correctly preserves the relevant slow degrees of freedom, and we test this idea for three systems of increasing complexity. We show that, in contrast to this method, existing coarse-graining schemes, such as those from information theory or structure-based approaches, are not able to recapitulate the slow time scales of the system.
Affiliation(s)
- Wangfei Yang
- Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, United States
- Graduate Program in Systems, Synthetic and Physical Biology, Rice University, Houston, Texas 77005, United States
- Clark Templeton
- Department of Physics, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany
- David Rosenberger
- Department of Physics, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany
- Andreas Bittracher
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany
- Feliks Nüske
- Max Planck Institute for Dynamics of Complex Technical Systems, Sandtorstrasse 1, 39106 Magdeburg, Germany
- Frank Noé
- Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, United States
- Department of Physics, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany
- Department of Chemistry, Rice University, Houston, Texas 77005, United States
- Cecilia Clementi
- Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, United States
- Department of Physics, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany
- Department of Chemistry, Rice University, Houston, Texas 77005, United States
- Department of Physics, Rice University, Houston, Texas 77005, United States
10
Abstract
Coarse-grained (CG) molecular simulations have become a standard tool to study molecular processes on time and length scales inaccessible to all-atom simulations. Parametrizing CG force fields to match all-atom simulations has mainly relied on force-matching or relative entropy minimization, which require many samples from costly simulations with all-atom or CG resolutions, respectively. Here we present flow-matching, a new training method for CG force fields that combines the advantages of both methods by leveraging normalizing flows, a generative deep learning method. Flow-matching first trains a normalizing flow to represent the CG probability density, which is equivalent to minimizing the relative entropy without requiring iterative CG simulations. Subsequently, the flow generates samples and forces according to the learned distribution in order to train the desired CG free energy model via force-matching. Even without requiring forces from the all-atom simulations, flow-matching outperforms classical force-matching by an order of magnitude in terms of data efficiency and produces CG models that can capture the folding and unfolding transitions of small proteins.
Affiliation(s)
- Jonas Köhler
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany
- Yaoyi Chen
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany
- Andreas Krämer
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany
- Cecilia Clementi
- Department of Physics, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany
- Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, United States
- Department of Physics, Rice University, Houston, Texas 77005, United States
- Department of Chemistry, Rice University, Houston, Texas 77005, United States
- Frank Noé
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany
- Department of Physics, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany
- Department of Chemistry, Rice University, Houston, Texas 77005, United States
- Microsoft Research AI4Science, Karl-Liebknecht Strasse 32, 10178 Berlin, Germany
11
Abstract
We combine replica exchange (parallel tempering) with normalizing flows, a class of deep generative models. These two sampling strategies complement each other, resulting in an efficient method for sampling molecular systems characterized by rare events, which we call learned replica exchange (LREX). In LREX, a normalizing flow is trained to map the configurations of the fastest-mixing replica into configurations belonging to the target distribution, allowing direct exchanges between the two without the need to simulate intermediate replicas. This can significantly reduce the computational cost compared to standard replica exchange. The proposed method also offers several advantages with respect to Boltzmann generators that directly use normalizing flows to sample the target distribution. We apply LREX to some prototypical molecular dynamics systems, highlighting the improvements over previous methods.
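For context, the exchange step of standard replica exchange, the baseline this abstract compares against, can be sketched as follows. This is a minimal sketch of the textbook Metropolis criterion for swapping configurations between two replicas at inverse temperatures `beta_i` and `beta_j`; the function name and arguments are hypothetical, and it does not include the normalizing-flow mapping that LREX adds.

```python
import math

def swap_acceptance(beta_i, beta_j, energy_i, energy_j):
    """Probability of accepting a configuration swap between replica i
    (inverse temperature beta_i, current potential energy energy_i) and
    replica j (beta_j, energy_j) in standard replica exchange."""
    exponent = (beta_i - beta_j) * (energy_i - energy_j)
    # min(1, exp(exponent)), written to avoid overflow for large exponents.
    return 1.0 if exponent >= 0 else math.exp(exponent)

# The colder replica (larger beta) holds the higher-energy configuration,
# so the swap is always accepted.
p_favourable = swap_acceptance(1.0, 0.5, 2.0, 1.0)

# The colder replica already holds the lower-energy configuration,
# so the swap is accepted only with probability exp(-0.5).
p_unfavourable = swap_acceptance(1.0, 0.5, 1.0, 2.0)
```

Acceptance drops quickly as the temperatures move apart and the energy distributions stop overlapping, which is why standard schemes need a ladder of intermediate replicas.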
Affiliation(s)
- Michele Invernizzi
- Department of Mathematics and Computer Science, Freie Universität Berlin, 14195 Berlin, Germany
- Andreas Krämer
- Department of Mathematics and Computer Science, Freie Universität Berlin, 14195 Berlin, Germany
- Cecilia Clementi
- Department of Physics, Freie Universität Berlin, 14195 Berlin, Germany
- Department of Chemistry, Rice University, 77005 Houston, United States
- Center for Theoretical Biological Physics, Rice University, 77005 Houston, United States
- Frank Noé
- Department of Mathematics and Computer Science, Freie Universität Berlin, 14195 Berlin, Germany
- Department of Physics, Freie Universität Berlin, 14195 Berlin, Germany
- Department of Chemistry, Rice University, 77005 Houston, United States
- Microsoft Research AI4Science, 10178 Berlin, Germany
12
Mardt A, Hempel T, Clementi C, Noé F. Deep learning to decompose macromolecules into independent Markovian domains. Nat Commun 2022; 13:7101. [PMID: 36402768 PMCID: PMC9675806 DOI: 10.1038/s41467-022-34603-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2022] [Accepted: 10/27/2022] [Indexed: 11/21/2022] Open
Abstract
The increasing interest in modeling the dynamics of ever larger proteins has revealed a fundamental problem with models that describe the molecular system as being in a global configuration state. This notion limits our ability to gather sufficient statistics of state probabilities or state-to-state transitions because for large molecular systems the number of metastable states grows exponentially with size. In this manuscript, we approach this challenge by introducing a method that combines our recent progress on independent Markov decomposition (IMD) with VAMPnets, a deep learning approach to Markov modeling. We establish a training objective that quantifies how well a given decomposition of the molecular system into independent subdomains with Markovian dynamics approximates the overall dynamics. By constructing an end-to-end learning framework, the decomposition into such subdomains and their individual Markov state models are simultaneously learned, providing a data-efficient and easily interpretable summary of the complex system dynamics. While learning the dynamical coupling between Markovian subdomains is still an open issue, the present results are a significant step towards learning Ising models of large molecular complexes from simulation data.
Affiliation(s)
- Andreas Mardt
- Freie Universität Berlin, Department of Mathematics and Computer Science, Berlin, Germany
- Tim Hempel
- Freie Universität Berlin, Department of Mathematics and Computer Science, Berlin, Germany
- Freie Universität Berlin, Department of Physics, Berlin, Germany
- Cecilia Clementi
- Freie Universität Berlin, Department of Physics, Berlin, Germany
- Rice University, Department of Chemistry, Houston, TX, USA
- Rice University, Center for Theoretical Biological Physics, Houston, TX, USA
- Frank Noé
- Freie Universität Berlin, Department of Mathematics and Computer Science, Berlin, Germany
- Freie Universität Berlin, Department of Physics, Berlin, Germany
- Rice University, Department of Chemistry, Houston, TX, USA
- Microsoft Research AI4Science, Berlin, Germany
13
Musil F, Zaporozhets I, Noé F, Clementi C, Kapil V. Quantum dynamics using path integral coarse-graining. J Chem Phys 2022; 157:181102. [DOI: 10.1063/5.0120386] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/14/2022] Open
Abstract
Vibrational spectra of condensed and gas-phase systems containing light nuclei are influenced by their quantum-mechanical behaviour. The quantum dynamics of light nuclei can be approximated by the imaginary time path integral (PI) formulation, but still at a large computational cost that increases sharply with decreasing temperature. By leveraging advances in machine-learned coarse-graining, we develop a PI method with the reduced computational cost of a classical simulation. We also propose a simple temperature elevation scheme to significantly attenuate the artefacts of standard PI approaches and also eliminate the unfavourable temperature scaling of the computational cost. We illustrate the approach by calculating vibrational spectra using standard models of water molecules and bulk water, demonstrating significant computational savings and dramatically improved accuracy compared to more expensive reference approaches. We believe that our simple, efficient and accurate method could enable routine calculations of vibrational spectra including nuclear quantum effects for a wide range of molecular systems.
Affiliation(s)
- Frank Noé
- Mathematics and Computer Science, Freie Universität Berlin, Germany
- Cecilia Clementi
- Department of Physics, Freie Universität Berlin, Germany
- Venkat Kapil
- Yusuf Hamied Department of Chemistry, University of Cambridge, United Kingdom
14
Clementi C. Fast track to structural biology. Nat Chem 2021; 13:1032-1034. [PMID: 34707232 DOI: 10.1038/s41557-021-00814-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
Affiliation(s)
- Cecilia Clementi
- Department of Physics, The Free University of Berlin, Berlin, Germany
- Center for Theoretical Biological Physics, Department of Chemistry, Department of Physics, and Department of Chemical and Biomolecular Engineering, Rice University, Houston, TX, USA
15
Abstract
Accurate modeling of the solvent environment for biological molecules is crucial for computational biology and drug design. A popular approach to achieve long simulation time scales for large system sizes is to incorporate the effect of the solvent in a mean-field fashion with implicit solvent models. However, a challenge with existing implicit solvent models is that they often lack accuracy or certain physical properties compared to explicit solvent models as the many-body effects of the neglected solvent molecules are difficult to model as a mean field. Here, we leverage machine learning (ML) and multi-scale coarse graining (CG) in order to learn implicit solvent models that can approximate the energetic and thermodynamic properties of a given explicit solvent model with arbitrary accuracy, given enough training data. Following the previous ML-CG models CGnet and CGSchnet, we introduce ISSNet, a graph neural network, to model the implicit solvent potential of mean force. ISSNet can learn from explicit solvent simulation data and be readily applied to molecular dynamics simulations. We compare the solute conformational distributions under different solvation treatments for two peptide systems. The results indicate that ISSNet models can outperform widely used generalized Born and surface area models in reproducing the thermodynamics of small protein systems with respect to explicit solvent. The success of this novel method demonstrates the potential benefit of applying machine learning methods in accurate modeling of solvent effects for in silico research and biomedical applications.
Affiliation(s)
- Yaoyi Chen
- Department of Mathematics and Computer Science, Freie Universität, Berlin, Germany
- Andreas Krämer
- Department of Mathematics and Computer Science, Freie Universität, Berlin, Germany
- Brooke E Husic
- Department of Mathematics and Computer Science, Freie Universität, Berlin, Germany
- Cecilia Clementi
- Department of Physics, Rice University, Houston, Texas 77005, USA
- Frank Noé
- Department of Mathematics and Computer Science, Freie Universität, Berlin, Germany
16
Affiliation(s)
- Michele Ceriotti
- Laboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Cecilia Clementi
- Freie Universität Berlin, Department of Physics, Arnimallee 14, 14195 Berlin, Germany
17
Abstract
Unsupervised learning is becoming an essential tool to analyze the increasingly large amounts of data produced by atomistic and molecular simulations, in material science, solid state physics, biophysics, and biochemistry. In this Review, we provide a comprehensive overview of the methods of unsupervised learning that have been most commonly used to investigate simulation data and indicate likely directions for further developments in the field. In particular, we discuss feature representation of molecular systems and present state-of-the-art algorithms of dimensionality reduction, density estimation, clustering, and kinetic models. We divide our discussion into self-contained sections, each discussing a specific method. In each section, we briefly touch upon the mathematical and algorithmic foundations of the method, highlight its strengths and limitations, and describe the specific ways in which it has been used, or can be used, to analyze molecular simulation data.
Affiliation(s)
- Aldo Glielmo
- International School for Advanced Studies (SISSA), 34014 Trieste, Italy
- Brooke E. Husic
- Freie Universität Berlin, Department of Mathematics and Computer Science, 14195 Berlin, Germany
- Alex Rodriguez
- International Centre for Theoretical Physics (ICTP), Condensed Matter and Statistical Physics Section, 34100 Trieste, Italy
- Cecilia Clementi
- Freie Universität Berlin, Department for Physics, 14195 Berlin, Germany
- Rice University Houston, Department of Chemistry, Houston, Texas 77005, United States
- Frank Noé
- Freie Universität Berlin, Department of Mathematics and Computer Science, 14195 Berlin, Germany
- Freie Universität Berlin, Department for Physics, 14195 Berlin, Germany
- Rice University Houston, Department of Chemistry, Houston, Texas 77005, United States
- Alessandro Laio
- International School for Advanced Studies (SISSA), 34014 Trieste, Italy
- International Centre for Theoretical Physics (ICTP), Condensed Matter and Statistical Physics Section, 34100 Trieste, Italy
18
Abstract
The use of coarse-grained (CG) models is a popular approach to study complex biomolecular systems. By reducing the number of degrees of freedom, a CG model can explore long time- and length-scales inaccessible to computational models at higher resolution. If a CG model is designed by formally integrating out some of the system's degrees of freedom, one expects multi-body interactions to emerge in the effective CG model's energy function. In practice, it has been shown that the inclusion of multi-body terms indeed improves the accuracy of a CG model. However, no general approach has been proposed to systematically construct a CG effective energy that includes arbitrary orders of multi-body terms. In this work, we propose a neural network based approach to address this point and construct a CG model as a multi-body expansion. By applying this approach to a small protein, we evaluate the relative importance of the different multi-body terms in the definition of an accurate model. We observe a slow convergence in the multi-body expansion, where up to five-body interactions are needed to reproduce the free energy of an atomistic model.
Affiliation(s)
- Jiang Wang
- Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, USA
- Nicholas Charron
- Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, USA
- Brooke Husic
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 6, 14195 Berlin, Germany
- Simon Olsson
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 6, 14195 Berlin, Germany
- Frank Noé
- Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, USA
- Cecilia Clementi
- Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, USA
19
Lin X, George JT, Schafer NP, Chau KN, Birnbaum ME, Clementi C, Onuchic JN, Levine H. Rapid Assessment of T-Cell Receptor Specificity of the Immune Repertoire. Nat Comput Sci 2021; 1:362-373. [PMID: 36090450 PMCID: PMC9455901 DOI: 10.1038/s43588-021-00076-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 05/08/2023]
Abstract
Accurate assessment of TCR-antigen specificity at the whole immune repertoire level lies at the heart of improved cancer immunotherapy, but predictive models capable of high-throughput assessment of TCR-peptide pairs are lacking. Recent advances in deep sequencing and crystallography have enriched the data available for studying TCR-p-MHC systems. Here, we introduce a pairwise energy model, RACER, for rapid assessment of TCR-peptide affinity at the immune repertoire level. RACER applies supervised machine learning to efficiently and accurately resolve strong TCR-peptide binding pairs from weak ones. The trained parameters further enable a physical interpretation of interacting patterns encoded in each specific TCR-p-MHC system. When applied to simulate thymic selection of an MHC-restricted T-cell repertoire, RACER accurately estimates recognition rates for tumor-associated neoantigens and foreign peptides, thus demonstrating its utility in helping address the large computational challenge of reliably identifying the properties of tumor antigen-specific T-cells at the level of an individual patient's immune repertoire.
Affiliation(s)
- Xingcheng Lin
- Center for Theoretical Biological Physics, Rice University, Houston, TX
- Department of Physics and Astronomy, Rice University, Houston, TX
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, MA
- Jason T. George
- Center for Theoretical Biological Physics, Rice University, Houston, TX
- Medical Scientist Training Program, Baylor College of Medicine, Houston, TX
- Nicholas P. Schafer
- Center for Theoretical Biological Physics, Rice University, Houston, TX
- Departments of Chemistry, Rice University, Houston, TX
- Kevin Ng Chau
- Department of Physics, Northeastern University, Boston, MA
- Michael E. Birnbaum
- Koch Institute for Integrative Cancer Research, Cambridge, MA
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA
- Ragon Institute of MIT, MGH, and Harvard, Cambridge, MA
- Cecilia Clementi
- Center for Theoretical Biological Physics, Rice University, Houston, TX
- Departments of Chemistry, Rice University, Houston, TX
- Department of Physics, Freie Universität, Berlin, Germany
- José N. Onuchic
- Center for Theoretical Biological Physics, Rice University, Houston, TX
- Department of Physics and Astronomy, Rice University, Houston, TX
- Departments of Chemistry, Rice University, Houston, TX
- Department of Biosciences, Rice University, Houston, TX
- To whom correspondence should be addressed
- Herbert Levine
- Center for Theoretical Biological Physics, Rice University, Houston, TX
- Department of Physics, Northeastern University, Boston, MA
- To whom correspondence should be addressed
20
Abstract
Over recent years, the use of statistical learning techniques applied to chemical problems has gained substantial momentum. This is particularly apparent in the realm of physical chemistry, where the balance between empiricism and physics-based theory has traditionally been rather in favor of the latter. In this guest Editorial for the special topic issue on "Machine Learning Meets Chemical Physics," a brief rationale is provided, followed by an overview of the topics covered. We conclude by making some general remarks.
Affiliation(s)
- Michele Ceriotti
- Laboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Cecilia Clementi
- Department of Physics, Freie Universität Berlin, Arnimallee 14, 14195 Berlin, Germany
21
Doerr S, Majewski M, Pérez A, Krämer A, Clementi C, Noe F, Giorgino T, De Fabritiis G. TorchMD: A Deep Learning Framework for Molecular Simulations. J Chem Theory Comput 2021; 17:2355-2363. [PMID: 33729795 PMCID: PMC8486166 DOI: 10.1021/acs.jctc.0c01343] [Citation(s) in RCA: 63] [Impact Index Per Article: 21.0] [Received: 12/29/2020] [Indexed: 11/28/2022]
Abstract
Molecular dynamics simulations provide a mechanistic description of molecules by relying on empirical potentials. The quality and transferability of such potentials can be improved leveraging data-driven models derived with machine learning approaches. Here, we present TorchMD, a framework for molecular simulations with mixed classical and machine learning potentials. All force computations including bond, angle, dihedral, Lennard-Jones, and Coulomb interactions are expressed as PyTorch arrays and operations. Moreover, TorchMD enables learning and simulating neural network potentials. We validate it using standard Amber all-atom simulations, learning an ab initio potential, performing an end-to-end training, and finally learning and simulating a coarse-grained model for protein folding. We believe that TorchMD provides a useful tool set to support molecular simulations of machine learning potentials. Code and data are freely available at github.com/torchmd.
Affiliation(s)
- Maciej Majewski
- Computational Science Laboratory, Universitat Pompeu Fabra, 08003 Barcelona, Spain
- Adrià Pérez
- Computational Science Laboratory, Universitat Pompeu Fabra, 08003 Barcelona, Spain
- Andreas Krämer
- Department of Mathematics and Computer Science, Freie Universität, 14195 Berlin, Germany
- Cecilia Clementi
- Department of Physics, Freie Universität, 14195 Berlin, Germany
- Department of Chemistry, Rice University, Houston, 77005 Texas, United States
- Frank Noe
- Department of Mathematics and Computer Science, Freie Universität, 14195 Berlin, Germany
- Department of Physics, Freie Universität, 14195 Berlin, Germany
- Department of Chemistry, Rice University, Houston, 77005 Texas, United States
- Toni Giorgino
- Biophysics Institute, National Research Council (CNR-IBF), 20133 Milano, Italy
- Department of Biosciences, Università degli Studi di Milano, 20133 Milano, Italy
- Gianni De Fabritiis
- Acellera, 08005 Barcelona, Spain
- Computational Science Laboratory, Universitat Pompeu Fabra, 08003 Barcelona, Spain
- Institució Catalana de Recerca i Estudis Avançats, 08010 Barcelona, Spain
22
Abella JR, Antunes D, Jackson K, Lizée G, Clementi C, Kavraki LE. Markov state modeling reveals alternative unbinding pathways for peptide-MHC complexes. Proc Natl Acad Sci U S A 2020; 117:30610-30618. [PMID: 33184174 PMCID: PMC7720115 DOI: 10.1073/pnas.2007246117] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 01/11/2023] Open
Abstract
Peptide binding to major histocompatibility complexes (MHCs) is a central component of the immune system, and understanding the mechanism behind stable peptide-MHC binding will aid the development of immunotherapies. While MHC binding is mostly influenced by the identity of the so-called anchor positions of the peptide, secondary interactions from nonanchor positions are known to play a role in complex stability. However, current MHC-binding prediction methods lack an analysis of the major conformational states and might underestimate the impact of secondary interactions. In this work, we present an atomically detailed analysis of peptide-MHC binding that can reveal the contributions of any interaction toward stability. We propose a simulation framework that uses both umbrella sampling and adaptive sampling to generate a Markov state model (MSM) for a coronavirus-derived peptide (QFKDNVILL), bound to one of the most prevalent MHC receptors in humans (HLA-A*24:02). While our model reaffirms the importance of the anchor positions of the peptide in establishing stable interactions, our model also reveals the underestimated importance of position 4 (p4), a nonanchor position. We confirmed our results by simulating the impact of specific peptide mutations and validated these predictions through competitive binding assays. By comparing the MSM of the wild-type system with those of the D4A and D4P mutations, our modeling reveals stark differences in unbinding pathways. The analysis presented here can be applied to any peptide-MHC complex of interest with a structural model as input, representing an important step toward comprehensive modeling of the MHC class I pathway.
Affiliation(s)
- Jayvee R Abella
- Department of Computer Science, Rice University, Houston, TX 77005
- Dinler Antunes
- Department of Computer Science, Rice University, Houston, TX 77005
- Kyle Jackson
- Department of Melanoma Medical Oncology-Research, The University of Texas MD Anderson Cancer Center, Houston, TX 77030
- Gregory Lizée
- Department of Melanoma Medical Oncology-Research, The University of Texas MD Anderson Cancer Center, Houston, TX 77030
- Cecilia Clementi
- Center for Theoretical Biological Physics, Rice University, Houston, TX 77005
- Department of Chemistry, Rice University, Houston, TX 77005
- Lydia E Kavraki
- Department of Computer Science, Rice University, Houston, TX 77005
23
Chen M, Chen X, Schafer NP, Clementi C, Komives EA, Ferreiro DU, Wolynes PG. Surveying biomolecular frustration at atomic resolution. Nat Commun 2020; 11:5944. [PMID: 33230150 PMCID: PMC7683549 DOI: 10.1038/s41467-020-19560-9] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Received: 05/19/2020] [Accepted: 10/13/2020] [Indexed: 01/12/2023] Open
Abstract
To function, biomolecules require sufficient specificity of interaction as well as stability to live in the cell while still being able to move. Thermodynamic stability of only a limited number of specific structures is important so as to prevent promiscuous interactions. The individual interactions in proteins, therefore, have evolved collectively to give funneled minimally frustrated landscapes but some strategic parts of biomolecular sequences located at specific sites in the structure have been selected to be frustrated in order to allow both motion and interaction with partners. We describe a framework efficiently to quantify and localize biomolecular frustration at atomic resolution by examining the statistics of the energy changes that occur when the local environment of a site is changed. The location of patches of highly frustrated interactions correlates with key biological locations needed for physiological function. At atomic resolution, it becomes possible to extend frustration analysis to protein-ligand complexes. At this resolution one sees that drug specificity is correlated with there being a minimally frustrated binding pocket leading to a funneled binding landscape. Atomistic frustration analysis provides a route for screening for more specific compounds for drug discovery. The analysis of biomolecular frustration yielded insights into several aspects of protein behavior. Here the authors describe a framework to efficiently quantify and localize biomolecular frustration within proteins at atomic resolution, and observe that drug specificity is correlated with a minimally frustrated binding pocket leading to a funneled binding landscape.
Affiliation(s)
- Mingchen Chen
- Center for Theoretical Biological Physics, Rice University, Houston, TX, USA
- Xun Chen
- Center for Theoretical Biological Physics, Department of Chemistry, Rice University, Houston, TX, USA
- Nicholas P Schafer
- Center for Theoretical Biological Physics, Department of Chemistry, Rice University, Houston, TX, USA
- Cecilia Clementi
- Center for Theoretical Biological Physics, Department of Chemistry, Rice University, Houston, TX, USA
- Elizabeth A Komives
- Department of Chemistry and Biochemistry, University of California at San Diego, La Jolla, CA, USA
- Diego U Ferreiro
- Protein Physiology Laboratory, University of Buenos Aires, Buenos Aires, Argentina
- Peter G Wolynes
- Center for Theoretical Biological Physics, Department of Chemistry, Rice University, Houston, TX, USA
- Department of Biosciences, Rice University, Houston, TX, USA
24
Husic BE, Charron NE, Lemm D, Wang J, Pérez A, Majewski M, Krämer A, Chen Y, Olsson S, de Fabritiis G, Noé F, Clementi C. Coarse graining molecular dynamics with graph neural networks. J Chem Phys 2020; 153:194101. [PMID: 33218238 PMCID: PMC7671749 DOI: 10.1063/5.0026133] [Citation(s) in RCA: 70] [Impact Index Per Article: 17.5] [Received: 08/21/2020] [Accepted: 10/27/2020] [Indexed: 11/14/2022] Open
Abstract
Coarse graining enables the investigation of molecular dynamics for larger systems and at longer timescales than is possible at an atomic resolution. However, a coarse graining model must be formulated such that the conclusions we draw from it are consistent with the conclusions we would draw from a model at a finer level of detail. It has been proved that a force matching scheme defines a thermodynamically consistent coarse-grained model for an atomistic system in the variational limit. Wang et al. [ACS Cent. Sci. 5, 755 (2019)] demonstrated that the existence of such a variational limit enables the use of a supervised machine learning framework to generate a coarse-grained force field, which can then be used for simulation in the coarse-grained space. Their framework, however, requires the manual input of molecular features to machine learn the force field. In the present contribution, we build upon the advance of Wang et al. and introduce a hybrid architecture for the machine learning of coarse-grained force fields that learn their own features via a subnetwork that leverages continuous filter convolutions on a graph neural network architecture. We demonstrate that this framework succeeds at reproducing the thermodynamics for small biomolecular systems. Since the learned molecular representations are inherently transferable, the architecture presented here sets the stage for the development of machine-learned, coarse-grained force fields that are transferable across molecular systems.
Affiliation(s)
- Dominik Lemm
- Computational Science Laboratory, Universitat Pompeu Fabra, PRBB, C/Dr. Aiguader 88, Barcelona, Spain
- Adrià Pérez
- Computational Science Laboratory, Universitat Pompeu Fabra, PRBB, C/Dr. Aiguader 88, Barcelona, Spain
- Maciej Majewski
- Computational Science Laboratory, Universitat Pompeu Fabra, PRBB, C/Dr. Aiguader 88, Barcelona, Spain
- Andreas Krämer
- Department of Mathematics and Computer Science, Freie Universität, Berlin, Germany
- Simon Olsson
- Department of Mathematics and Computer Science, Freie Universität, Berlin, Germany
25
Abstract
The accurate sampling of protein dynamics is an ongoing challenge despite the utilization of high-performance computer (HPC) systems. Utilizing only "brute force" molecular dynamics (MD) simulations requires an unacceptably long time to solution. Adaptive sampling methods allow a more effective sampling of protein dynamics than standard MD simulations. Depending on the restarting strategy, the speed up can be more than 1 order of magnitude. One challenge limiting the utilization of adaptive sampling by domain experts is the relatively high complexity of efficiently running adaptive sampling on HPC systems. We discuss how the ExTASY framework can set up new adaptive sampling strategies and reliably execute resulting workflows at scale on HPC platforms. Here, the folding dynamics of four proteins are predicted with no a priori information.
Affiliation(s)
- Eugen Hruska
- Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, United States
- Department of Physics & Astronomy, Rice University, Houston, Texas 77005, United States
- Vivekanandan Balasubramanian
- Department of Electrical and Computer Engineering, Rutgers University, Piscataway, New Jersey 08854, United States
- Hyungro Lee
- Department of Electrical and Computer Engineering, Rutgers University, Piscataway, New Jersey 08854, United States
- Shantenu Jha
- Department of Electrical and Computer Engineering, Rutgers University, Piscataway, New Jersey 08854, United States
- Cecilia Clementi
- Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, United States
- Department of Physics & Astronomy, Rice University, Houston, Texas 77005, United States
- Department of Physics, Freie Universität, 14195 Berlin, Germany
- Department of Chemistry, Rice University, Houston, Texas 77005, United States
26
Abella JR, Antunes DA, Clementi C, Kavraki LE. Large-Scale Structure-Based Prediction of Stable Peptide Binding to Class I HLAs Using Random Forests. Front Immunol 2020; 11:1583. [PMID: 32793224 PMCID: PMC7387700 DOI: 10.3389/fimmu.2020.01583] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 01/31/2020] [Accepted: 06/15/2020] [Indexed: 01/13/2023] Open
Abstract
Prediction of stable peptide binding to Class I HLAs is an important component for designing immunotherapies. While the best performing predictors are based on machine learning algorithms trained on peptide-HLA (pHLA) sequences, the use of structure for training predictors deserves further exploration. Given enough pHLA structures, a predictor based on the residue-residue interactions found in these structures has the potential to generalize for alleles with little or no experimental data. We have previously developed APE-Gen, a modeling approach able to produce pHLA structures in a scalable manner. In this work we use APE-Gen to model over 150,000 pHLA structures, the largest dataset of its kind, which were used to train a structure-based pan-allele model. We extract simple, homogenous features based on residue-residue distances between peptide and HLA, and build a random forest model for predicting stable pHLA binding. Our model achieves competitive AUROC values on leave-one-allele-out validation tests using significantly less data when compared to popular sequence-based methods. Additionally, our model offers an interpretation analysis that can reveal how the model composes the features to arrive at any given prediction. This interpretation analysis can be used to check if the model is in line with chemical intuition, and we showcase particular examples. Our work is a significant step toward using structure to achieve generalizable and more interpretable prediction for stable pHLA binding.
Affiliation(s)
- Jayvee R. Abella
- Department of Computer Science, Rice University, Houston, TX, United States
- Dinler A. Antunes
- Department of Computer Science, Rice University, Houston, TX, United States
- Cecilia Clementi
- Center for Theoretical Biological Physics, Rice University, Houston, TX, United States
- Department of Chemistry, Rice University, Houston, TX, United States
- Lydia E. Kavraki
- Department of Computer Science, Rice University, Houston, TX, United States
27
Wang J, Chmiela S, Müller KR, Noé F, Clementi C. Ensemble learning of coarse-grained molecular dynamics force fields with a kernel approach. J Chem Phys 2020; 152:194106. [DOI: 10.1063/5.0007276] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Indexed: 02/05/2023] Open
Affiliation(s)
- Jiang Wang
- Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, USA
- Department of Chemistry, Rice University, Houston, Texas 77005, USA
| | - Stefan Chmiela
- Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
| | - Klaus-Robert Müller
- Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
- Department of Brain and Cognitive Engineering, Korea University, Seoul 02841, South Korea
- Max Planck Institute for Informatics, Saarbrücken 66123, Germany
| | - Frank Noé
- Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, USA
- Department of Chemistry, Rice University, Houston, Texas 77005, USA
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 6, 14195 Berlin, Germany
- Department of Physics, Freie Universität Berlin, Arnimallee 14, 14195 Berlin, Germany
| | - Cecilia Clementi
- Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, USA
- Department of Chemistry, Rice University, Houston, Texas 77005, USA
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 6, 14195 Berlin, Germany
- Department of Physics, Freie Universität Berlin, Arnimallee 14, 14195 Berlin, Germany
- Department of Physics, Rice University, Houston, Texas 77005, USA
| |
|
28
|
Morandi A, Zambon A, Di Santo SG, Mazzone A, Cherubini A, Mossello E, Bo M, Marengoni A, Bellelli G, Rispoli V, Malara A, Spadea F, Di Cello S, Ceravolo F, Fabiano F, Chiaradia G, Gabriele A, Lenino P, Andrea T, Settembrini V, Capomolla D, Citrino A, Scriva A, Bruno I, Secchi R, De Martino E, Muccinelli R, Lupi G, Paonessa P, Fabbri A, Passuti MT, Castellari S, Po A, Gaggioli G, Varesi M, Moneti P, Capurso S, Latini V, Ghidotti S, Riccardelli F, Macchi M, Rigo R, Claudio P, Angelo B, Flavio C, Benedetta B, Boffelli S, Cassinadri A, Franzoni S, Spazzini E, Andretto D, Tonini G, Andreani L, Coralli M, Balotta A, Cancelliere R, Ballardini G, Simoncelli M, Mancini A, Strazzacapa M, Fabio S, De Filippi F, Giudice C, Dentizzi C, Azzini M, Cazzadori M, Mastroeni V, Bertassello P, Claudia Benati HS, Nesta E, Tobaldini C, Guerini F, Elena T, Mombelloni P, Fontanini F, Gabriella L, Pizzorni C, Oliverio M, Del Grosso LL, Giavedoni C, Bidoli G, Mazzei B, Corsonello A, Fusco S, Vena S, De Vuono T, Maiuri G, Luca FF, Andrea A, Giovanni S, Rossella N, Castegnaro E, De Rosa S, Sechi RB, Benvenuti E, Del Lungo I, Giardini S, Giulietti C, Mauro DB, Eleonora B, Roberto F, Paolo B, DuranteMangoni E, Testoni M, Fabio DS, Loredana S, Valeria S, Fabiano M, Annabella DG, Salvatore DC, Martina P, Greco A, Grazia D, Daniele S, Gianluca R, Renzo G, Sergio M, Morena B, Vitali M, Marina P, Paolo DC, Irene F, Cristina S, Alessandra F, Orlandini F, La Regina M, Desirée A, Mirella F, Marco F, Mario B, Paola P, Giuliana B, Riccardo B, Michela T, Eleonora C, Padulo F, Cristina M, Dario R, Giancarla M, Guido R, Elena M, Prete C, Marileda N, Federica S, Igor B, Nicole B, Elena R, Paolillo C, Riccardi A, Claudia B, Barbara R, Francesca M, Silvia V, Chiara C, Ilaria DL, Oliver B, Mauro C, Eleonora M, Giuseppe P, Rosaria T, Maria C, Davide D, Stefania C, Marco C, Massimo P, Bertoletti E, Luca S, Martina DF, Paola V, Lia S, Sandro C, Valentina DS, Erminia B, Paola C, Romina R, Minisola S, D'Amico F, 
Luciano C, Pasquale A, Ilaria L, Francesca C, Guglielmo S, Marco E, Sara R, Paola A, Claudio A, Francesco R, Caronzolo F, Alessandro C, Simona M, Lara F, Paola R, Simonetta C, Antonella C, Generoso U, Fernando G, Giuliano C, Emanuela S, Grippa A, Mariolina S, Alessandro D, Chiara P, Giulia L, Alessandro G, Famularo S, Sandini M, Pinotti E, Gianotti L, Antonella B, Lombardo G, Giulia P, Sante G, Rossi A, Rubele S, Sant S, Marco V, Danila C, Fabio R, Bandirali MP, Nicoletta C, Pipicella T, Laura B, Paolo T, Luciano T, Leonello A, Margherita S, Stefania DN, Pierluigi DS, Laura R, Fabiana T, Giovanna C, Antonino S, Antonino A, Felice C, Giuseppe B, Danilo F, Giovanna DB, Francesco L, Salini S, Angela BM, De Filippi F, Giorgetta C, Francesco C, Giovanni G, Paola C, Gerardo B, Silvio R, Letizia S, Sabrina P, Davide B, Rosaria RM, Maria DA, Raffaele P, Valeria PG, Palmieri VO, Palasciano G, Belfiore A, Portincasa P, Carlo S, Vincenzo S, Alessia D, Valiani V, Carolina B, Tiziana C, Daniela L, Giuseppe M, Francesca C, Giordano C, Roberto S, Paola T, Ugo P, Federica R, Giacomo P, Castellano M, Anna G, Domenico C, Elisa C, Federica C, Antonietta CM, Luigi M, Fabio L, Salvatore B, Giuseppe M, Gelosa G, Viviana AT, Piras V, Giorgio B, Andrea C, Alessandra B, Coen D, Magliola R, Milanesio D, Muzzulini CL, Paolo F, Marinella T, Sofia CM, Marta B, March A, Siano P, Capo G, Napoletano R, Cecilia P, Mancini C, Del Buono C, De Bartolomeo G, Addolorata M, Carmen C, Roberto C, Nitti MT, Giovanni VA, Moschettini G, Franco M, Daniela R, D'Amico G, Mirella P, Endrizzi C, Trotta L, Ciarambino T, Orazio Z, Felici A, Emanuela T, Marta S, Thomas F, Giacomo T, Ignazio DF, Andrea B, Giuseppe O, Emanuela F, Serena A, Elena D, Pavan S, Anna C, Serena B, Erika N, Roberto S, Elena S, Manuela P, Francesca A, Angelo T, Piazzani F, Lunelli A, Dimori S, Margotta A, Soglia T, Postacchini D, Brunelli R, Santini S, Francavilla M, Macchiati I, Sorvillo F, Giuli C, Mecocci P, Longo A, Perticone F, Addesi D, 
Rosa PC, Bencardino G, Falbo T, Grillo N, Marco F, Mirella F, Fantò F, Isaia G, Pezzilli S, Bergamo D, Furno E, Rrodhe S, Lucarini S, Dijk B, Dall'Acqua F, Cappelletto F, Calvani D, Becheri D, Giuseppe M, Costanza M, Vito A, Francesca B, Magherini L, Novella M, Franca B, Lucia Gambardella PM, Valente C, Ilaria B, Alice F, Bo M, Porrino P, Ceci G, Giuliana B, Michela T, Eleonora C, Ettore E, Camellini C, Servello A, Grassi A, Rozzini R, Tironi S, Grassi MG, Troisi E, Carlo C, Simona Gabriella DS, Flaminia F, Federica R, Beatrice P, Sofia T, Gabutto A, Quazzo L, Rosatello A, Suraci D, Tagliabue B, Perrone C, Ferrara L, Castagna A, Tremolada ML, Giuseppe C, Stefano B, Davide O, Piano S, Serviddio G, Lo Buglio A, Gurrera T, Merlo V, Rovai C, Cotroneo AM, Carlucci R, Abbaldo A, Monzani F, Qasem AA, Bini G, Tafuto S, Galli G, Bruni AC, Mancuso G, Mancuso G, Calipari D, Giuseppe Massimiliano DL, Bernardini B, Corsini C, Michele C, Sara DF, Cagnin A, Fragiacomo F, Pompanin S, Piero A, Marco C, Zurlo A, Guerra G, Pala M, Menozzi L, Gatti CD, Magon S, Roberto M, Alfredo DG, Fabio F, Ruana T, Elisa M, Benedetta B, Christian M, Marco P, Massimo G, Di Francesco V, Faccioli S, Pellizzari L, Giorgia F, Barbagallo G, Lunardelli ML, Martini E, Ferrari E, Macchiarulo M, Corneli M, Bacci M, Battaglia G, Anastasio L, Lo Storto MS, Seresin C, Simonato M, Loreggian M, Cestonaro F, Durando M, Latella R, Mazzoleni M, Russo G, Ponte M, Valchera A, Salustri G, Petritola D, Costa A, Sinforiani E, Cotta MR, Piano S, Pizio RN, Cester A, Formilan M, Pietro B, Carbone P, Cazzaniga I, Appollonio I, Cereda D, Stabile A, Xhani R, Acampora R, Tremolizzo L, Federico P, Antonio C, Valerio P, Cesare B, Zhirajr M, Giovanni V, Maria A, Mariaelena S, Bottacchi E, Bucciantini E, Di Giovanni M, Franchi F, Lucchetti L, Mariani C, Grande G, Rapazzini P, Marco M, Romanelli G, Marengoni A, Franco N, Alessio M, Stefano B, Nicola L, Laura P, Nazario P, Carlo C, Chiara G, Soccorso P, Andrea S, Luca B, Francesca S, 
Roberto A, Marco F, Anna C, Francesco C, Anna C, Fugazza L, Guerrini C, De Paduanis G, Iallonardo L, Palumbo P, Zuliani G, Ortolani B, Capatti E, Soavi C, Bianchi L, Francesconi D, Miselli A, Gloria B, Tommaso R, Chiara P, Agata MM, Marco D, Luca M, Gianluca G, Suardi T, Mazzone A, Zaccarini C, Manuela R, Mirra G, Muti E, Bottura R, Gianpaolo M, Secreto P, Bisio E, Cecchettani M, Naldi T, Pallavicino A, Pugliese M, Iozzo RC, Grassi G, Michele B, Raffaella D, Fosca QT, Giorgio GC, Giovanni P, Ernesto C, Soccorso P, Mannironi A, Giorli E, Oberti S, Fierro B, Piccoli T, Giacalone F, Mandas A, Serchisu L, Costaggiu D, Pinna E, Orrù F, Mannai M, Cordioli Z, Pelizzari L, Turcato E, Arduini P, Cacace C, Chiloiro R, Cimino R, Ruberto C, Giovanni R, Pietro G, Laura G, Alberto C, Pietro G, Carmen R, Santo PD, Andriolli A, Burattin G, Rossi L, Andreolli Antonino CG, Giuseppe C, Tezza F, Maddalena P, Laura S, Crippa P, Aloisio P, Di Monda T, Malighetti A, Galbassini G, Salutis D, Ivaldi C, Russo AM, Bennati E, Pino E, Zavarise G, Pesci A, Suigo G, Faverio P, Andrea G, Sabrina P, Zanasi M, Moniello G, Rostagno C, Cartei A, Polidori G, Ungar A, Melis MR, Martellini E, Enrico M, Monica T, Antonella G, Giovanna L, Migliorini M, Caramelli F, Battiston B, Berardino M, Cavallo S, Alessandro M, Anna S, Lombardi B, D'Ippolito P, Furini A, Villani D, Clara R, Guarneri M, Paolucci S, Bassi A, Coiro P, De Angelis D, Morone G, Venturiero V, Palleschi L, Raganato P, Di Niro G, Rosa CA, Loredana B, Imoscopi A, Isaia G, Tibaldi V, Bottignole G G, Calvi E, Clementi C, Zanocchi M, Agosta L, Nortarelli A, Provenzano G, Mari D, Romano FY, Rosini F, Mansi M, Rossi S, Geriatria AR, Inzaghi L, Bonini G, Rossi P, Potena A, Lichii M, Candiani T, Grimaldi W, Bertani E, Alessandra P, Calogero P, Pinto D, Bernardi R, Nicolino F, Galetti C, Gianstefani A, Giulia C, Lorenzo M, Odetti P, Monacelli F, Prefumo M, Fiammetta M, Canepa M, Minaglia C, Paolisso G, Rizzo MR, Prestano R, Dalise AM, Barra D, Bosco 
LD, Asprinio V, Dallape L, Perina E, Incalzi RA, Bartoli IR, Pluderi A, Maina A, Pecoraro E, Sciarra M, Prudente A, Paola M, Francesca M, Manuel V, Luisella C, Maria PL, Tina S, Benini L, Levato F, Mhiuta V, Alius F, Davidoaia D, Giardini V, Garancini M, Bellamoli C, Terranova L, Bozzini C, Tosoni P, Provoli E, Cascone L, Dioli A, Ferrarin G, Gabutto A, Bucci A, Bua G, Fenu S, Bianchi G, Casella S, Romano V, Maurizio P, Mascherona I, Belotti G, Cavaliere S, Cuni E, Merciuc N, Oberti R, Veneziani S, Capoferri E, De Bernardi E, Colombo K, Bravi M, Nicoletta N, D'Arcangelo P, Montenegro N, Galli G, Montanari R, Lamanna P, Gasperini B, Isabella M, Stefania D, Gaia A, Filippo C, Palamà C, Di Emidio C, Scarpini E, Arighi A, Fumagalli G, Basilico P, De Amicis Margherita M, Marta M, Diletta M, D'Amico F, Granata A, Rostagno C, Ranalli C, Cammilli A, Cavallini MC, Tricca M, Natella D, Gabbani L, Tesi F, Martella L, Gurrera T, Imbrici R, Guerrini G, Scotuzzi AM, Sozzi F, Valenti L, Chiarello A, Monia M, Pilotto A, Prete C, Senesi B, Meta AC, Pendenza E, Monzani F, Pasqualetti G, Polini A, Tognini S, Ballino E, Cherubini A, Dell'Aquila G, Gasparrini PM, Marotti E, Migale M, Scrimieri A, Falsetti L, Salvi A, Toigo G, Ceschia G, Rosso A, Tongiorgi C, Scarpa C, Maurizio P, De Dominicis L, Pucci E, Renzi S, Cartechini E, Tomassini PF, Del Gobbo M, Ugenti F, Romeo P, Nardelli A, Lauretani F, Visioli S, Montanari I, Ermini F, Giordano A, Pigato G, Simeone E, Barbujani M, Giampieri M, Amoruso R, Piccinini M, Ferrari C, Gambetti C, Sfrappini M, Semeraro L, Striuli R, Mariani C, Pelliccioni G, Marinelli D, Fabi K, Rossi T, Pesallaccia M, Sabbatini D, Gobbi B, Cerqua R, Tagliani G, Schlauser E, Caser L, Caramello E, Sandigliano F, Rosso G, Ferrari A, Bendini C, Luisa DM, Casella M, Prampolini R, Scevola M, Vitale E, Roberto B, Carlo F, Sergio F, Alberto S, Daniela Z, Giulia B, Serena G, Michele B, Maugeri D, Sorace R, Anzaldi M, De Gesu R, Morrone G, Davolio F, Fabbo A, Palmieri M, 
Barbagallo G, Zoli M, Forti P, Pirazzoli L, Fabbri E, Terenzi L, Bergolari F, Wenter C, Ruffini I, Insam M, Abraham E, Kirchlechner C, Cucinotta D, Antonino L, Basile G, Grazia AM, Parise P, Boccali A, Amici S, Gambacorta M, Ferrari A, Lasagni A, Lovati R, Giovinazzo F, Kimak E, Zappa P, Medici F, Lo Castro M, Mauro F, De Luca A, Sancesario G, Martorana A, Scaricamazza B, Toniolo S, Di Lorenzo F, Liguori C, Lasco A, Basile G, Vita N, Giomi M, Dimori S, Forte F, Padovani A, Rozzini L, Ceraso A, Salvatore C, Padovani A, Cottino M, Vitali S, Marelli E, Tripi G, Miceli S, Urso G, Grioni G, Vezzadini G, Misaggi G, Forlani C, Avanzi S, Serena S, Claudia C, Marilena V, Alberto L, Diego G, Alessandro G, Iemolo F, Giordano A, Sanzaro E, D'Asta G, Proietto M, Carnemolla A, Razza G, Spadaro D, Bertolotti M, Mussi C, Neviani F, Roberto C, Valentina G, Linda M, Francesca V, Tarozzi A, Balestri F, Monica T, Mannarino G, Tesi F, Bigolari M, Natale A, Grassi S, Bottaro C, Stefanelli S, Bovone U, Tortorolo U, Quadri R, Leone G, Ponzetto M, Frasson P, Annoni G, Bellelli G, Bruni A, Confalonieri R, Corsi M, Moretti D, Teruzzi F, Umidi S, Mazzola P, Perego S, Persico I, Olivieri G, Bonfanti A, Hajnalka S, Galeazzi M, Massariello F, Anzuini A, Caffarra P, Barocco F, Spallazzi M, Paolo CG, Simonetta M, Andrea A, Chioatto P, Bortolamei S, Soattin L, Ruotolo G, Beneamino B, Pietro G, Giuseppe B, Carmen R, Castagna A, Bertazzoli M, Rota E, Adobati A, Scarpa A, Granziera S, Zuccher P, Fabbro AD, Zara D, Lo Nigro A, Franchetti L, Toniolo M, Marcuzzo C, Piano S, Rollone M, Guerriero F, Sgarlata C, Massè A, Berardino M, Cavallo S, Anna S, Zatti G, Piatti M, Graci J, Benati G, Boschi F, Biondi M, Fiumi N, Erika T, Locatelli SM, Mauri S, Beretta M, Margheritis L, Desideri G, Liberatore E, Carucci AC, Bonino P, Caput M, Antonietti MP, Polistena G, De la Pierre F, Mari M, Massignani P, Tombesi F, Selvaggio F, Verbo B, Bodoni P, Marchionni N, Mossello E, Cavallini MC, Sabatini T, Mussio E, Magni E, 
Bianchetti A, Crucitti A, Titoldini G, Cossu B, Fascendini S, Licini C, Tomasoni A, Calderazzo M, Daniela T, Valentina L, Ferrari A, Prampolini R, Melotti RM, Lilli A, Buda S, Adversi M, Noro G, Turco R, Ubezio MC, Mantovani AR, Viola MC, Serrati C, Pretta S, Infante M, Gentile S, Morandi A, D'Ambrosio V, Mazzanti P, Brambilla C, Sportelli S, Platto C, Faraci B, Quattrocchi D, Pernigotti LM, Pisu C, Sicuro F, Oliverio M, Del Grosso LL, Zagnoni P, Ghiglia S, Mosca M, Corazzin I, Deola M, Biagini CA, Bencini F, Cantini C, Tonon E, Pierinelli S, Onofrj M, Thomas A, Filomena B, Bonanni L, Gabriella C, Comi G, Magnani G, Santangelo R, Mazzeo S, Giuseppe M, Francesca C, Giordano C, Roberto S, Barbieri C, Giroldi L, Davolio F, Bandini F, Masina M, Malservisi S, Cicognani A, Ricca L, Ricca L, Piccininni M, Ferrari C, Gambetti C, Tassinari T, Brogi D, Sugo A, Alessandra F, Sonia M, Valerio V, Andrea UC, Enrico C, Vera RF, Assunta S, Gianmaria Z, Mauro P, Pietro B, Roberto M, Salvatore C, Barone A, Razzano M, Giuseppe I, Angela B, Francesco S, Valeria D, Federico G, Lucia P, Antonella V, Elisabetta DC, Cristina R, Nadia C, Maria S, Luciano A, Chiara C, Bini P, Pignata M, Enrico B, Maria V, Giovanni C, Giorgio C, Andrea T, Marco M, Anna C, Piera R, Alberto Z, Ceccon A, Magrin L, Marin S, Barbara S, Marco M, Laura G, Matteo M, Marco P, Caterina PM, Carla R, Federica G, Clara T, Melania C, Giampaolo B, Stefano G, Valeria G, Lucia M, Giovambattista D, Ester L, Cecilia CA, Maurizio T, Alessandra F, Vera RF, Nadia B, Grillo A, Arenare F, Tonino M, David K, Giorgio VP, Ubaldo B, Vincenzo S, Stefano M, Marino F, Busonera Flavio MT, Paolo A, Monica M, Francesco B. Understanding Factors Associated With Psychomotor Subtypes of Delirium in Older Inpatients With Dementia. J Am Med Dir Assoc 2020; 21:486-492.e7. [DOI: 10.1016/j.jamda.2020.02.013] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [What about the content of this article? 
(0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/01/2019] [Revised: 02/15/2020] [Accepted: 02/19/2020] [Indexed: 12/12/2022]
|
29
|
Abstract
Machine learning (ML) is transforming all areas of science. The complex and time-consuming calculations in molecular simulations are particularly suitable for an ML revolution and have already been profoundly affected by the application of existing ML methods. Here we review recent ML methods for molecular simulation, with particular focus on (deep) neural networks for the prediction of quantum-mechanical energies and forces, on coarse-grained molecular dynamics, on the extraction of free energy surfaces and kinetics, and on generative network approaches to sample molecular equilibrium structures and compute thermodynamics. To explain these methods and illustrate open methodological problems, we review some important principles of molecular physics and describe how they can be incorporated into ML structures. Finally, we identify and describe a list of open challenges for the interface between ML and molecular simulation.
Affiliation(s)
- Frank Noé
- Department of Mathematics and Computer Science, Freie Universität Berlin, 14195 Berlin, Germany
- Department of Physics, Freie Universität Berlin, 14195 Berlin, Germany
- Department of Chemistry and Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, USA
| | - Alexandre Tkatchenko
- Physics and Materials Science Research Unit, University of Luxembourg, 1511 Luxembourg, Luxembourg
| | - Klaus-Robert Müller
- Department of Computer Science, Technical University Berlin, 10587 Berlin, Germany
- Max-Planck-Institut für Informatik, 66123 Saarbrücken, Germany
- Department of Brain and Cognitive Engineering, Korea University, Seoul 136-713, South Korea
| | - Cecilia Clementi
- Department of Mathematics and Computer Science, Freie Universität Berlin, 14195 Berlin, Germany
- Department of Chemistry and Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, USA
- Department of Physics, Rice University, Houston, Texas 77005, USA
| |
|
30
|
Lehmann M, Lukonin I, Noé F, Schmoranzer J, Clementi C, Loerke D, Haucke V. Nanoscale coupling of endocytic pit growth and stability. Sci Adv 2019; 5:eaax5775. [PMID: 31807703 PMCID: PMC6881173 DOI: 10.1126/sciadv.aax5775] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 04/04/2019] [Accepted: 09/25/2019] [Indexed: 05/21/2023]
Abstract
Clathrin-mediated endocytosis, an essential process for plasma membrane homeostasis and cell signaling, is characterized by stunning heterogeneity in the size and lifetime of clathrin-coated endocytic pits (CCPs). If and how CCP growth and lifetime are coupled and how this relates to their physiological function are unknown. We combine computational modeling, automated tracking of CCP dynamics, electron microscopy, and functional rescue experiments to demonstrate that CCP growth and lifetime are closely correlated and mechanistically linked by the early-acting endocytic F-BAR protein FCHo2. FCHo2 assembles at the rim of CCPs to control CCP growth and lifetime by coupling the invagination of early endocytic intermediates to clathrin lattice assembly. Our data suggest a mechanism for the nanoscale control of CCP growth and stability that may similarly apply to other metastable structures in cells.
Affiliation(s)
- Martin Lehmann
- Leibniz-Forschungsinstitut für Molekulare Pharmakologie (FMP), 13125 Berlin, Germany
- Corresponding author (V.H., M.L.)
| | - Ilya Lukonin
- Leibniz-Forschungsinstitut für Molekulare Pharmakologie (FMP), 13125 Berlin, Germany
| | - Frank Noé
- Freie Universität Berlin, Department of Mathematics and Computer Science and Department of Physics, 14195 Berlin, Germany
- Center for Theoretical Biological Physics and Department of Chemistry, Rice University, Houston, TX 77005, USA
| | - Jan Schmoranzer
- Charité Universitätsmedizin Berlin, Virchowweg 6, 10117 Berlin, Germany
| | - Cecilia Clementi
- Freie Universität Berlin, Department of Mathematics and Computer Science and Department of Physics, 14195 Berlin, Germany
- Center for Theoretical Biological Physics and Department of Chemistry, Rice University, Houston, TX 77005, USA
| | - Dinah Loerke
- Department of Physics and Astronomy, University of Denver, Denver, CO 80208, USA
| | - Volker Haucke
- Leibniz-Forschungsinstitut für Molekulare Pharmakologie (FMP), 13125 Berlin, Germany
- Freie Universität Berlin, Faculty of Biology, Chemistry, Pharmacy, 14195 Berlin, Germany
- Corresponding author (V.H., M.L.)
| |
|
31
|
Affiliation(s)
- Feliks Nüske
- Center for Theoretical Biological Physics and Department of Chemistry, Rice University, Houston, Texas 77005-1892, USA
| | - Lorenzo Boninsegna
- Center for Theoretical Biological Physics and Department of Chemistry, Rice University, Houston, Texas 77005-1892, USA
| | - Cecilia Clementi
- Center for Theoretical Biological Physics and Department of Chemistry, Rice University, Houston, Texas 77005-1892, USA
| |
|
32
|
Wang J, Olsson S, Wehmeyer C, Pérez A, Charron NE, de Fabritiis G, Noé F, Clementi C. Machine Learning of Coarse-Grained Molecular Dynamics Force Fields. ACS Cent Sci 2019; 5:755-767. [PMID: 31139712 PMCID: PMC6535777 DOI: 10.1021/acscentsci.8b00913] [Citation(s) in RCA: 189] [Impact Index Per Article: 37.8] [Received: 12/09/2018] [Indexed: 05/17/2023]
Abstract
Atomistic or ab initio molecular dynamics simulations are widely used to predict thermodynamics and kinetics and relate them to molecular structure. A common approach to go beyond the time- and length-scales accessible with such computationally expensive simulations is the definition of coarse-grained molecular models. Existing coarse-graining approaches define an effective interaction potential to match defined properties of high-resolution models or experimental data. In this paper, we reformulate coarse-graining as a supervised machine learning problem. We use statistical learning theory to decompose the coarse-graining error and cross-validation to select and compare the performance of different models. We introduce CGnets, a deep learning approach, that learns coarse-grained free energy functions and can be trained by a force-matching scheme. CGnets maintain all physically relevant invariances and allow one to incorporate prior physics knowledge to avoid sampling of unphysical structures. We show that CGnets can capture all-atom explicit-solvent free energy surfaces with models using only a few coarse-grained beads and no solvent, while classical coarse-graining methods fail to capture crucial features of the free energy surface. Thus, CGnets are able to capture multibody terms that emerge from the dimensionality reduction.
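The force-matching idea named in this abstract can be illustrated with a deliberately small sketch. It is not the CGnet implementation: the neural network is replaced by a toy one-dimensional harmonic free energy U(x) = 0.5·k·(x − x0)², and the function names, parameter values, and synthetic "reference forces" are all assumptions made for illustration. The point shown is only that the model parameters are chosen to minimize the mean squared difference between the model forces −dU/dx and noisy reference forces.

```python
import numpy as np

def cg_forces(x, k, x0):
    """Model forces -dU/dx for the toy free energy U = 0.5*k*(x - x0)**2."""
    return -k * (x - x0)

def force_matching_loss(x, f_ref, k, x0):
    """Mean squared deviation between model forces and reference forces."""
    return np.mean((cg_forces(x, k, x0) - f_ref) ** 2)

def fit_harmonic(x, f_ref):
    """Minimize the force-matching loss in closed form.

    The model force f = -k*x + k*x0 is linear in (a, b) = (-k, k*x0),
    so ordinary least squares gives the minimizer directly.
    """
    design = np.stack([x, np.ones_like(x)], axis=1)
    (a, b), *_ = np.linalg.lstsq(design, f_ref, rcond=None)
    k = -a
    return k, b / k

# Synthetic data (hypothetical): configurations and noisy instantaneous forces.
rng = np.random.default_rng(0)
x = rng.normal(1.0, 0.5, size=2000)
f_ref = -3.0 * (x - 1.0) + rng.normal(0.0, 0.1, size=x.shape)

k_fit, x0_fit = fit_harmonic(x, f_ref)
```

In this toy setting the fitted `k_fit` and `x0_fit` should land close to the values used to generate the data (3.0 and 1.0). A neural-network model would replace the closed-form fit with gradient-based minimization of the same loss.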
Affiliation(s)
- Jiang Wang
- Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, United States
- Department of Chemistry, Rice University, Houston, Texas 77005, United States
| | - Simon Olsson
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 6, 14195 Berlin, Germany
| | - Christoph Wehmeyer
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 6, 14195 Berlin, Germany
| | - Adrià Pérez
- Computational Science Laboratory, Universitat Pompeu Fabra, PRBB, C/Dr Aiguader 88, 08003 Barcelona, Spain
| | - Nicholas E. Charron
- Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, United States
- Department of Physics, Rice University, Houston, Texas 77005, United States
| | - Gianni de Fabritiis
- Computational Science Laboratory, Universitat Pompeu Fabra, PRBB, C/Dr Aiguader 88, 08003 Barcelona, Spain
- Institucio Catalana de Recerca i Estudis Avançats (ICREA), Passeig Lluis Companys 23, 08010 Barcelona, Spain
| | - Frank Noé
- Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, United States
- Department of Chemistry, Rice University, Houston, Texas 77005, United States
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 6, 14195 Berlin, Germany
| | - Cecilia Clementi
- Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, United States
- Department of Chemistry, Rice University, Houston, Texas 77005, United States
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 6, 14195 Berlin, Germany
- Department of Physics, Rice University, Houston, Texas 77005, United States
| |
|
33
|
Abstract
The problems of protein folding and protein design are two sides of the same coin. Protein folding involves exploring a protein's configuration space given a fixed sequence, whereas protein design involves searching in sequence space given a particular target structure. For a protein to fold quickly and reliably, its energy landscape must be biased toward the folded ensemble throughout its configuration space and must lack deep kinetic traps that would otherwise frustrate folding. Evolution has "designed" the sequences of many naturally occurring proteins, through an eons-long process of random mutation and selection, to yield landscapes with a minimal degree of frustration. The task facing humans hoping to design protein sequences that fold into particular structures is to use the available approximate energy functions to sculpt funneled landscapes that work in the laboratory. In this work, we demonstrate how to calculate several localized frustration measures using an all-atom energy function. Specifically, we employ the Rosetta energy function, which has been used successfully to design proteins and which has a natural pairwise decomposition that is suitably solvent-averaged. We calculate these newly developed frustration measures for both a mutated WW domain, FiP35, and a three-helix bundle that was designed completely by humans, Alpha3D. The structure of FiP35 exhibits less localized frustration than that of Alpha3D. A mutation toward the consensus sequence for WW domains in FiP35, which has been shown unexpectedly in experiment to disrupt folding, induces localized frustration by disrupting the hydrophobic core. By performing a limited redesign on the sequence of Alpha3D, we show that some, but not all, mutations that lower the energy also result in decreased frustration. 
The results suggest that, in addition to being useful for detecting residual frustration in protein structures, optimizing the localized frustration measures presented here may be a useful and automatic means of balancing positive and negative design in protein design tasks.
|
34
|
Clementi C, Carlotti B, Burattini C, Pellegrino RM, Romani A, Elisei F. Effect of hydrogen bonding interaction on the photophysics of α-amino-orcein. Spectrochim Acta A Mol Biomol Spectrosc 2019; 214:522-530. [PMID: 30818151 DOI: 10.1016/j.saa.2019.02.057] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 08/10/2018] [Revised: 11/27/2018] [Accepted: 02/16/2019] [Indexed: 06/09/2023]
Abstract
This paper reports for the first time a detailed spectroscopic investigation into the ground- and excited-state properties of α-amino-orcein (α-AO), one of the main components of the orcein dye, in solvents of different proticity and water at different pHs. In order to gain insight into the nature of the involved transitions and excited state deactivation pathways, the study was carried out by means of UV-Visible steady state and ultrafast spectroscopic techniques with the support of quantum mechanical calculations (DFT and TDDFT). The results highlight that the photophysical and photodynamic behaviour of α-AO is highly sensitive to the solvent proticity and pH. In particular, a protic environment induces a red shift (55 nm) of the absorption spectrum together with a relevant decrease of the fluorescence quantum yield (from 0.19 in acetonitrile to 6.6 × 10⁻³ in methanol) and radiative rate constant (two orders of magnitude). A notable red shift is also caused by increasing the pH, leading the molecule from monocationic to neutral and then monoanionic form through two deprotonation steps (pKa = 3.539 ± 0.006 and 11.180 ± 0.006). Following deprotonation, the molecule assumes spectral and photophysical properties very similar to those retrieved in protic media. The observed behaviour has been rationalized through the occurrence of hydrogen bonding, likely involving to a greater extent the carbonyl oxygen of α-AO and the protic solvent, that favours the charge delocalization on the whole chromophore as well as fast non-radiative excited state deactivation. The ultrafast spectroscopic investigation revealed in fact the presence, in protic solvent, of a short-lived component (tens of picoseconds), assignable to the solvent-complexed S1 state, alongside the long-lived component (few nanoseconds) observed in aprotic media and attributed to the solvent-free S1 state. The results achieved in this study for α-AO provide an important contribution to the interpretation of absorption and fluorescence features of orcein dye mixture in more complex systems (protein based substrates within the many aspects of the cultural heritage and biomedical field) where hydrogen bonds are expected to play a crucial role in mediating the interaction with the environment.
Affiliation(s)
- C Clementi
- Department of Chemistry Biology and Biotechnology, University of Perugia, via Elce di Sotto 8, 06123 Perugia, Italy
| | - B Carlotti
- Department of Chemistry Biology and Biotechnology, University of Perugia, via Elce di Sotto 8, 06123 Perugia, Italy
| | - C Burattini
- Department of Chemistry Biology and Biotechnology, University of Perugia, via Elce di Sotto 8, 06123 Perugia, Italy
| | - R M Pellegrino
- Department of Chemistry Biology and Biotechnology, University of Perugia, via Elce di Sotto 8, 06123 Perugia, Italy
| | - A Romani
- Department of Chemistry Biology and Biotechnology, University of Perugia, via Elce di Sotto 8, 06123 Perugia, Italy; Center of Excellence on Scientific Methodologies applied to Archaeology and Art (SMAArt), University of Perugia, via Elce di Sotto 8, 06123 Perugia, Italy
| | - F Elisei
- Department of Chemistry Biology and Biotechnology, University of Perugia, via Elce di Sotto 8, 06123 Perugia, Italy; Center of Excellence on the Innovative Nanostructured Materials (CEMIN), University of Perugia, via Elce di Sotto 8, 06123 Perugia, Italy
| |
|
35
|
Abella JR, Antunes DA, Clementi C, Kavraki LE. APE-Gen: A Fast Method for Generating Ensembles of Bound Peptide-MHC Conformations. Molecules 2019; 24:E881. [PMID: 30832312 PMCID: PMC6429480 DOI: 10.3390/molecules24050881] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Received: 01/15/2019] [Revised: 02/26/2019] [Accepted: 02/27/2019] [Indexed: 11/16/2022] Open
Abstract
The Class I Major Histocompatibility Complex (MHC) is a central protein in immunology as it binds to intracellular peptides and displays them at the cell surface for recognition by T-cells. The structural analysis of bound peptide-MHC complexes (pMHCs) holds the promise of interpretable and general binding prediction (i.e., testing whether a given peptide binds to a given MHC). However, structural analysis is limited in part by the difficulty in modelling pMHCs given the size and flexibility of the peptides that can be presented by MHCs. This article describes APE-Gen (Anchored Peptide-MHC Ensemble Generator), a fast method for generating ensembles of bound pMHC conformations. APE-Gen generates an ensemble of bound conformations by iterated rounds of (i) anchoring the ends of a given peptide near known pockets in the binding site of the MHC, (ii) sampling peptide backbone conformations with loop modelling, and then (iii) performing energy minimization to fix steric clashes, accumulating conformations at each round. APE-Gen takes only minutes on a standard desktop to generate tens of bound conformations, and we show the ability of APE-Gen to sample conformations found in X-ray crystallography even when only sequence information is used as input. APE-Gen has the potential to be useful for its scalability (i.e., modelling thousands of pMHCs or even non-canonical longer peptides) and for its use as a flexible search tool. We demonstrate an example for studying cross-reactivity.
Affiliation(s)
- Jayvee R Abella
- Department of Computer Science, Rice University, Houston, TX 77005, USA.
| | - Dinler A Antunes
- Department of Computer Science, Rice University, Houston, TX 77005, USA.
| | - Cecilia Clementi
- Center for Theoretical Biological Physics, Rice University, Houston, TX 77005, USA.
- Department of Chemistry, Rice University, Houston, TX 77005, USA.
| | - Lydia E Kavraki
- Department of Computer Science, Rice University, Houston, TX 77005, USA.
| |
Collapse
|
36
|
Clementi C. Multiscale Modeling of Biomolecular Processes by Combining Experiment and Simulation. Biophys J 2019. [DOI: 10.1016/j.bpj.2018.11.226] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/27/2022] Open
|
37
|
Hruska E, Abella JR, Nüske F, Kavraki LE, Clementi C. Quantitative comparison of adaptive sampling methods for protein dynamics. J Chem Phys 2019; 149:244119. [PMID: 30599712 DOI: 10.1063/1.5053582] [Citation(s) in RCA: 33] [Impact Index Per Article: 6.6] [Indexed: 12/12/2022] Open
Abstract
Adaptive sampling methods, often used in combination with Markov state models, are becoming increasingly popular for speeding up rare events in simulation such as molecular dynamics (MD) without biasing the system dynamics. Several adaptive sampling strategies have been proposed, but it is not clear which methods perform better for different physical systems. In this work, we present a systematic evaluation of selected adaptive sampling strategies on a wide selection of fast folding proteins. The adaptive sampling strategies were emulated using models constructed on already existing MD trajectories. We provide theoretical limits for the sampling speed-up and compare the performance of different strategies with and without using some a priori knowledge of the system. The results show that for different goals, different adaptive sampling strategies are optimal. In order to sample slow dynamical processes such as protein folding without a priori knowledge of the system, a strategy based on the identification of a set of metastable regions is consistently the most efficient, while a strategy based on the identification of microstates performs better if the goal is to explore newer regions of the conformational space. Interestingly, the maximum speed-up achievable for the adaptive sampling of slow processes increases for proteins with longer folding times, encouraging the application of these methods for the characterization of slower processes, beyond the fast-folding proteins considered here.
Affiliation(s)
- Eugen Hruska
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, USA
- Jayvee R Abella
  - Department of Computer Science, Rice University, Houston, Texas 77005, USA
- Feliks Nüske
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, USA
- Lydia E Kavraki
  - Department of Computer Science, Rice University, Houston, Texas 77005, USA
- Cecilia Clementi
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, USA
|
38
|
Krylov A, Windus TL, Barnes T, Marin-Rimoldi E, Nash JA, Pritchard B, Smith DGA, Altarawy D, Saxe P, Clementi C, Crawford TD, Harrison RJ, Jha S, Pande VS, Head-Gordon T. Perspective: Computational chemistry software and its advancement as illustrated through three grand challenge cases for molecular science. J Chem Phys 2018; 149:180901. [DOI: 10.1063/1.5052551] [Citation(s) in RCA: 57] [Impact Index Per Article: 9.5] [Indexed: 11/14/2022] Open
Affiliation(s)
- Anna Krylov
  - Department of Chemistry, University of Southern California, Los Angeles, California 90089, USA
- Theresa L. Windus
  - Department of Chemistry, Iowa State University, Ames, Iowa 50011, USA
- Taylor Barnes
  - Molecular Sciences Software Institute, Blacksburg, Virginia 24061, USA
- Jessica A. Nash
  - Molecular Sciences Software Institute, Blacksburg, Virginia 24061, USA
- Doaa Altarawy
  - Molecular Sciences Software Institute, Blacksburg, Virginia 24061, USA
- Paul Saxe
  - Molecular Sciences Software Institute, Blacksburg, Virginia 24061, USA
- Cecilia Clementi
  - Department of Chemistry and Center for Theoretical Biological Physics, Rice University, 6100 Main Street, Houston, Texas 77005, USA
  - Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 6, 14195 Berlin, Germany
- Robert J. Harrison
  - Institute for Advanced Computational Science, Stony Brook University, Stony Brook, New York 11794, USA
- Shantenu Jha
  - Electrical and Computer Engineering, Rutgers The State University of New Jersey, Piscataway, New Jersey 08854, USA
- Vijay S. Pande
  - Department of Bioengineering, Stanford University, Stanford, California 94305, USA
- Teresa Head-Gordon
  - Department of Chemistry, Department of Bioengineering, Department of Chemical and Biomolecular Engineering, Pitzer Center for Theoretical Chemistry, University of California, Berkeley, California 94720, USA
|
39
|
Yrazu FM, Pinamonti G, Clementi C. The Effect of Electrostatic Interactions on the Folding Kinetics of a 3-α-Helical Bundle Protein Family. J Phys Chem B 2018; 122:11800-11806. [PMID: 30277393 DOI: 10.1021/acs.jpcb.8b08676] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 01/02/2023]
Abstract
The trio of protein segment repeats called spectrins diverges by more than 2 orders of magnitude in their folding and unfolding rates, despite having very similar stabilities and almost coincidental topologies. Experimental studies revealed that the mutation of five particular residues dramatically alters the kinetic rates in the slow folders, making them similar to the rates of the fast folder. This is considered to be an exceptional behavior which seems in principle to challenge the current understanding of the protein folding process. In this work, we analyze this scenario, using a simplified computational model, combined with state-of-the-art kinetic analysis techniques. Our model faithfully separates the kinetics of the fast and slow folders and captures the effect of the five mutations. We show that the inclusion of electrostatics in the model is necessary to explain the experimental findings.
Affiliation(s)
- Fernando Miguel Yrazu
  - Department of Chemical and Biomolecular Engineering, Rice University, Houston, Texas 77005, United States
- Giovanni Pinamonti
  - Department of Informatics and Mathematics, Freie Universität Berlin, 14195 Berlin, Germany
- Cecilia Clementi
  - Department of Chemical and Biomolecular Engineering, Rice University, Houston, Texas 77005, United States
  - Department of Informatics and Mathematics, Freie Universität Berlin, 14195 Berlin, Germany
  - Center for Theoretical Biological Physics and Department of Chemistry, Rice University, Houston, Texas 77005, United States
|
40
|
Galarneau G, Fontanillas P, Hu-Seliger T, Clementi C, Schick U, Colaci D, Parfitt D, Hinds D, Yurttas Beim P. Premature menopause genome-wide association study in 75,000 women of European ancestry. Fertil Steril 2018. [DOI: 10.1016/j.fertnstert.2018.07.083] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 10/28/2022]
|
41
|
Abstract
With the rapid increase of available data for complex systems, there is great interest in the extraction of physically relevant information from massive datasets. Recently, a framework called Sparse Identification of Nonlinear Dynamics (SINDy) has been introduced to identify the governing equations of dynamical systems from simulation data. In this study, we extend SINDy to stochastic dynamical systems which are frequently used to model biophysical processes. We prove the asymptotic correctness of stochastic SINDy in the infinite data limit, both in the original and projected variables. We discuss algorithms to solve the sparse regression problem arising from the practical implementation of SINDy and show that cross validation is an essential tool to determine the right level of sparsity. We demonstrate the proposed methodology on two test systems, namely, the diffusion in a one-dimensional potential and the projected dynamics of a two-dimensional diffusion process.
Affiliation(s)
- Lorenzo Boninsegna
  - Department of Chemistry and Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, USA
- Feliks Nüske
  - Department of Chemistry and Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, USA
- Cecilia Clementi
  - Department of Chemistry and Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, USA
|
42
|
Abstract
This Special Topic Issue on Reaction Pathways collects original research articles illustrating the state of the art in the development and application of methods to describe complex chemical systems in terms of relatively simple mechanisms and collective coordinates. A broad range of applications is presented, spanning the sub-fields of biophysics and material science, in an attempt to showcase the similarities in the formulation of the approaches and highlight the different needs of the different application domains.
Affiliation(s)
- Cecilia Clementi
  - Department of Chemistry, Rice University, Houston, Texas 77005, USA
- Graeme Henkelman
  - Department of Chemistry, University of Texas at Austin, Austin, Texas 78712, USA
|
43
|
Affiliation(s)
- Justin Chen
  - Department of Physics and Astronomy, Rice University, Houston, Texas 77005, United States
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, United States
- Jiming Chen
  - Department of Chemical and Biomolecular Engineering, Rice University, Houston, Texas 77005, United States
- Giovanni Pinamonti
  - Department of Mathematics and Computer Science, Freie Universität, Berlin, Germany
- Cecilia Clementi
  - Department of Chemical and Biomolecular Engineering, Rice University, Houston, Texas 77005, United States
  - Department of Mathematics and Computer Science, Freie Universität, Berlin, Germany
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, United States
  - Department of Chemistry, Rice University, Houston, Texas 77005, United States
|
44
|
Litzinger F, Boninsegna L, Wu H, Nüske F, Patel R, Baraniuk R, Noé F, Clementi C. Rapid Calculation of Molecular Kinetics Using Compressed Sensing. J Chem Theory Comput 2018; 14:2771-2783. [DOI: 10.1021/acs.jctc.8b00089] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Indexed: 11/30/2022]
Affiliation(s)
- Florian Litzinger
  - Freie Universität Berlin, Department of Mathematics and Computer Science, Arnimallee 6, 14195 Berlin, Germany
- Lorenzo Boninsegna
  - Rice University, Center for Theoretical Biological Physics and Department of Chemistry, Houston, Texas 77005, United States
- Hao Wu
  - Freie Universität Berlin, Department of Mathematics and Computer Science, Arnimallee 6, 14195 Berlin, Germany
- Feliks Nüske
  - Rice University, Center for Theoretical Biological Physics and Department of Chemistry, Houston, Texas 77005, United States
- Raajen Patel
  - Rice University, Department of Electrical and Computer Engineering, Houston, Texas 77005, United States
- Richard Baraniuk
  - Rice University, Department of Electrical and Computer Engineering, Houston, Texas 77005, United States
- Frank Noé
  - Freie Universität Berlin, Department of Mathematics and Computer Science, Arnimallee 6, 14195 Berlin, Germany
  - Rice University, Center for Theoretical Biological Physics and Department of Chemistry, Houston, Texas 77005, United States
- Cecilia Clementi
  - Rice University, Center for Theoretical Biological Physics and Department of Chemistry, Houston, Texas 77005, United States
|
45
|
Abstract
Macromolecular systems are composed of a very large number of atomic degrees of freedom. There is strong evidence suggesting that structural changes occurring in large biomolecular systems at long time scale dynamics may be captured by models coarser than atomistic, although a suitable or optimal coarse-graining is a priori unknown. Here we propose a systematic approach to learning a coarse representation of a macromolecule from microscopic simulation data. In particular, the definition of effective coarse variables is achieved by partitioning the degrees of freedom both in the structural (physical) space and in the conformational space. The identification of groups of microscopic particles forming dynamical coherent states in different metastable states leads to a multiscale description of the system, in space and time. The application of this approach to the folding dynamics of two proteins provides a revised view of the classical idea of prestructured regions (foldons) that combine during a protein-folding process and suggests a hierarchical characterization of the assembly process of folded structures.
Affiliation(s)
- Lorenzo Boninsegna
  - Department of Chemistry, and Center for Theoretical Biological Physics, Rice University, 6100 Main Street, Houston, Texas 77005, United States
- Ralf Banisch
  - Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 6, 14195 Berlin, Germany
- Cecilia Clementi
  - Department of Chemistry, and Center for Theoretical Biological Physics, Rice University, 6100 Main Street, Houston, Texas 77005, United States
  - Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 6, 14195 Berlin, Germany
|
46
|
Clementi C, Casu G, Gremigni P. An Abbreviated Version of the Mindful Eating Questionnaire. J Nutr Educ Behav 2017; 49:352-356.e1. [PMID: 28391799 DOI: 10.1016/j.jneb.2017.01.016] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Received: 12/03/2016] [Revised: 01/25/2017] [Accepted: 01/31/2017] [Indexed: 06/07/2023]
Abstract
OBJECTIVE: To assess the psychometric properties of the Mindful Eating Questionnaire (MEQ). METHODS: A total of 15 mindfulness experts evaluated the content of the 28 items and 5 factors of the MEQ. A sample of 1,067 Italian adults (61.4% women) completed the MEQ and other measures; 62 participants completed a 4-week test-retest. RESULTS: Content analysis reduced the MEQ to 20 items. Exploratory and confirmatory factor analyses supported a 2-factor model based on awareness and recognition of hunger and satiety cues. Factors showed adequate internal consistency (α = .75 and .83, respectively) and test-retest reliability (intraclass correlation coefficient = 0.73 and 0.85, respectively), and were associated in expected ways, although with small to moderate effect sizes, with general mindfulness, meditation experience, yoga practice, not being on a diet plan, and body mass index categories. CONCLUSIONS AND IMPLICATIONS: Findings provided evidence of validity and reliability for the 20-item MEQ and support its use by clinicians and researchers for addressing eating-related issues.
Affiliation(s)
- Giulia Casu
  - Department of Psychology, University of Bologna, Bologna, Italy
- Paola Gremigni
  - Department of Psychology, University of Bologna, Bologna, Italy.
|
47
|
Noé F, Clementi C. Collective variables for the study of long-time kinetics from molecular trajectories: theory and methods. Curr Opin Struct Biol 2017; 43:141-147. [PMID: 28327454 DOI: 10.1016/j.sbi.2017.02.006] [Citation(s) in RCA: 69] [Impact Index Per Article: 9.9] [Received: 12/12/2016] [Accepted: 02/20/2017] [Indexed: 12/23/2022]
Abstract
Collective variables are an important concept to study high-dimensional dynamical systems, such as molecular dynamics of macromolecules, liquids, or polymers, in particular to define relevant metastable states and state-transition or phase-transition. Over the past decade, a rigorous mathematical theory has been formulated to define optimal collective variables to characterize slow dynamical processes. Here we review recent developments, including a variational principle to find optimal approximations to slow collective variables from simulation data, and algorithms such as the time-lagged independent component analysis. Using these concepts, a distance metric can be defined that quantifies how slowly molecular conformations interconvert. Extensions and open questions are discussed.
Affiliation(s)
- Frank Noé
  - Department of Mathematics and Computer Science, FU Berlin, Arnimallee 6, 14195 Berlin, Germany.
- Cecilia Clementi
  - Center for Theoretical Biological Physics, and Department of Chemistry, Rice University, 6100 Main Street, Houston, TX 77005, United States.
|
48
|
Nüske F, Wu H, Prinz JH, Wehmeyer C, Clementi C, Noé F. Markov state models from short non-equilibrium simulations—Analysis and correction of estimation bias. J Chem Phys 2017. [DOI: 10.1063/1.4976518] [Citation(s) in RCA: 43] [Impact Index Per Article: 6.1] [Indexed: 01/09/2023] Open
Affiliation(s)
- Feliks Nüske
  - Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 6, 14195 Berlin, Germany
- Hao Wu
  - Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 6, 14195 Berlin, Germany
- Jan-Hendrik Prinz
  - Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 6, 14195 Berlin, Germany
- Christoph Wehmeyer
  - Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 6, 14195 Berlin, Germany
- Cecilia Clementi
  - Center for Theoretical Biological Physics and Department of Chemistry, Rice University, Houston, Texas 77005, USA
- Frank Noé
  - Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 6, 14195 Berlin, Germany
|
49
|
Agarwal A, Clementi C, Delle Site L. Path integral-GC-AdResS simulation of a large hydrophobic solute in water: a tool to investigate the interplay between local microscopic structures and quantum delocalization of atoms in space. Phys Chem Chem Phys 2017; 19:13030-13037. [DOI: 10.1039/c7cp01629h] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Indexed: 01/26/2023]
Abstract
We perform large scale quantum (path integral) molecular dynamics simulations of a C60-like molecule in water.
|
50
|
Affiliation(s)
- Frank Noé
  - Department of Mathematics, Computer Science and Bioinformatics, FU Berlin, Arnimallee 6, 14195 Berlin, Germany
- Ralf Banisch
  - Department of Mathematics, Computer Science and Bioinformatics, FU Berlin, Arnimallee 6, 14195 Berlin, Germany
- Cecilia Clementi
  - Center for Theoretical Biological Physics, and Department of Chemistry, Rice University, 6100 Main Street, Houston, Texas 77005, United States
|