1
Tamagnone S, Laio A, Gabrié M. Coarse-Grained Molecular Dynamics with Normalizing Flows. J Chem Theory Comput 2024. [PMID: 39223750 DOI: 10.1021/acs.jctc.4c00700] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/04/2024]
Abstract
We propose a sampling algorithm relying on a collective variable (CV) of midsize dimension modeled by a normalizing flow and using nonequilibrium dynamics to propose full configurational moves from the proposition of a refreshed value of the CV made by the flow. The algorithm takes the form of a Markov chain with nonlocal updates, allowing jumps through energy barriers across metastable states. The flow is trained throughout the algorithm to reproduce the free energy landscape of the CV. The output of the algorithm is a sample of thermalized configurations and the trained network that can be used to efficiently produce more configurations. We show the functioning of the algorithm first in a test case with a mixture of Gaussians. Then, we successfully tested it on a higher-dimensional system consisting of a polymer in solution with a compact state and an extended stable state separated by a high free energy barrier.
Affiliation(s)
- Samuel Tamagnone
  - International School for Advanced Studies (SISSA), Via Bonomea 265, Trieste 34136, Italy
- Alessandro Laio
  - International School for Advanced Studies (SISSA), Via Bonomea 265, Trieste 34136, Italy
  - The Abdus Salam International Centre for Theoretical Physics (ICTP), Strada Costiera 11, Trieste 34151, Italy
- Marylou Gabrié
  - CMAP, CNRS, Institut Polytechnique de Paris, École Polytechnique, 91120 Palaiseau, France
2
Galliano L, Rende R, Coslovich D. Policy-guided Monte Carlo on general state spaces: Application to glass-forming mixtures. J Chem Phys 2024; 161:064503. [PMID: 39132794 DOI: 10.1063/5.0221221] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2024] [Accepted: 07/23/2024] [Indexed: 08/13/2024] Open
Abstract
Policy-guided Monte Carlo is an adaptive method to simulate classical interacting systems. It adjusts the proposal distribution of the Metropolis-Hastings algorithm to maximize the sampling efficiency, using a formalism inspired by reinforcement learning. In this work, we first extend the policy-guided method to deal with a general state space, comprising, for instance, both discrete and continuous degrees of freedom, and then apply it to a few paradigmatic models of glass-forming mixtures. We assess the efficiency of a set of physically inspired moves whose proposal distributions are optimized through on-policy learning. Compared to conventional Monte Carlo methods, the optimized proposals are two orders of magnitude faster for an additive soft sphere mixture but yield a much more limited speed-up for the well-studied Kob-Andersen model. We discuss the current limitations of the method and suggest possible ways to improve it.
Affiliation(s)
- Leonardo Galliano
  - Dipartimento di Fisica, Università di Trieste, Strada Costiera 11, 34151 Trieste, Italy
- Riccardo Rende
  - International School for Advanced Studies (SISSA), Via Bonomea 265, 34136 Trieste, Italy
- Daniele Coslovich
  - Dipartimento di Fisica, Università di Trieste, Strada Costiera 11, 34151 Trieste, Italy
3
Boffi NM, Vanden-Eijnden E. Deep learning probability flows and entropy production rates in active matter. Proc Natl Acad Sci U S A 2024; 121:e2318106121. [PMID: 38861599 PMCID: PMC11194503 DOI: 10.1073/pnas.2318106121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2023] [Accepted: 05/01/2024] [Indexed: 06/13/2024] Open
Abstract
Active matter systems, from self-propelled colloids to motile bacteria, are characterized by the conversion of free energy into useful work at the microscopic scale. They involve physics beyond the reach of equilibrium statistical mechanics, and a persistent challenge has been to understand the nature of their nonequilibrium states. The entropy production rate and the probability current provide quantitative ways to do so by measuring the breakdown of time-reversal symmetry. Yet, their efficient computation has remained elusive, as they depend on the system's unknown and high-dimensional probability density. Here, building upon recent advances in generative modeling, we develop a deep learning framework to estimate the score of this density. We show that the score, together with the microscopic equations of motion, gives access to the entropy production rate, the probability current, and their decomposition into local contributions from individual particles. To represent the score, we introduce a spatially local transformer network architecture that learns high-order interactions between particles while respecting their underlying permutation symmetry. We demonstrate the broad utility and scalability of the method by applying it to several high-dimensional systems of active particles undergoing motility-induced phase separation (MIPS). We show that a single network trained on a system of 4,096 particles at one packing fraction can generalize to other regions of the phase diagram, including to systems with as many as 32,768 particles. We use this observation to quantify the spatial structure of the departure from equilibrium in MIPS as a function of the number of particles and the packing fraction.
Affiliation(s)
- Nicholas M. Boffi
  - Courant Institute of Mathematical Sciences, New York University, New York, NY 10012
- Eric Vanden-Eijnden
  - Courant Institute of Mathematical Sciences, New York University, New York, NY 10012
4
Mehdi S, Smith Z, Herron L, Zou Z, Tiwary P. Enhanced Sampling with Machine Learning. Annu Rev Phys Chem 2024; 75:347-370. [PMID: 38382572 PMCID: PMC11213683 DOI: 10.1146/annurev-physchem-083122-125941] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/23/2024]
Abstract
Molecular dynamics (MD) enables the study of physical systems with excellent spatiotemporal resolution but suffers from severe timescale limitations. To address this, enhanced sampling methods have been developed to improve the exploration of configurational space. However, implementing these methods is challenging and requires domain expertise. In recent years, integration of machine learning (ML) techniques into different domains has shown promise, prompting their adoption in enhanced sampling as well. Although ML is often employed in various fields primarily due to its data-driven nature, its integration with enhanced sampling is more natural with many common underlying synergies. This review explores the merging of ML and enhanced MD by presenting different shared viewpoints. It offers a comprehensive overview of this rapidly evolving field, which can be difficult to stay updated on. We highlight successful strategies such as dimensionality reduction, reinforcement learning, and flow-based methods. Finally, we discuss open problems at the exciting ML-enhanced MD interface.
Affiliation(s)
- Shams Mehdi
  - Institute for Physical Science and Technology, University of Maryland, College Park, Maryland, USA
  - Biophysics Program, University of Maryland, College Park, Maryland, USA
- Zachary Smith
  - Institute for Physical Science and Technology, University of Maryland, College Park, Maryland, USA
  - Biophysics Program, University of Maryland, College Park, Maryland, USA
- Lukas Herron
  - Institute for Physical Science and Technology, University of Maryland, College Park, Maryland, USA
  - Biophysics Program, University of Maryland, College Park, Maryland, USA
- Ziyue Zou
  - Department of Chemistry and Biochemistry, University of Maryland, College Park, Maryland, USA
- Pratyush Tiwary
  - Institute for Physical Science and Technology, University of Maryland, College Park, Maryland, USA
  - Department of Chemistry and Biochemistry, University of Maryland, College Park, Maryland, USA
5
Zuckerman DM, George A. Bayesian Mechanistic Inference, Statistical Mechanics, and a New Era for Monte Carlo. J Chem Theory Comput 2024; 20:2971-2984. [PMID: 38603773 PMCID: PMC11089648 DOI: 10.1021/acs.jctc.4c00014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/13/2024]
Abstract
On the one hand, much of computational chemistry is concerned with "bottom-up" calculations which elucidate observable behavior starting from exact or approximated physical laws, a paradigm exemplified by typical quantum mechanical calculations and molecular dynamics simulations. On the other hand, "top down" computations aiming to formulate mathematical models consistent with observed data, e.g., parametrizing force fields, binding or kinetic models, have been of interest for decades but recently have grown in sophistication with the use of Bayesian inference (BI). Standard BI provides an estimation of parameter values, uncertainties, and correlations among parameters. Used for "model selection," BI can also distinguish between model structures such as the presence or absence of individual states and transitions. Fortunately for physical scientists, BI can be formulated within a statistical mechanics framework, and indeed, BI has led to a resurgence of interest in Monte Carlo (MC) algorithms, many of which have been directly adapted from or inspired by physical strategies. Certain MC algorithms, notably procedures using an "infinite temperature" reference state, can be successful in a 5-20 parameter BI context which would be unworkable in molecular spaces of 10^3 coordinates and more. This Review provides a pedagogical introduction to BI and reviews key aspects of BI through a physical lens, setting the computations in terms of energy landscapes and free energy calculations and describing promising sampling algorithms. Statistical mechanics and basic probability theory also provide a reference for understanding intrinsic limitations of Bayesian inference with regard to model selection and the choice of priors.
Affiliation(s)
- Daniel M Zuckerman
  - Department of Biomedical Engineering, Oregon Health and Science University, Portland, Oregon 97239, United States
- August George
  - Department of Biomedical Engineering, Oregon Health and Science University, Portland, Oregon 97239, United States
6
Chennakesavalu S, Rotskoff GM. Data-Efficient Generation of Protein Conformational Ensembles with Backbone-to-Side-Chain Transformers. J Phys Chem B 2024; 128:2114-2123. [PMID: 38394363 DOI: 10.1021/acs.jpcb.3c08195] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/25/2024]
Abstract
Excitement at the prospect of using data-driven generative models to sample configurational ensembles of biomolecular systems stems from the extraordinary success of these models on a diverse set of high-dimensional sampling tasks. Unlike image generation or even the closely related problem of protein structure prediction, there are currently no data sources with sufficient breadth to parametrize generative models for conformational ensembles. To enable discovery, a fundamentally different approach to building generative models is required: models should be able to propose rare, albeit physical, conformations that may not arise in even the largest data sets. Here we introduce a modular strategy to generate conformations based on "backmapping" from a fixed protein backbone that (1) maintains conformational diversity of the side chains and (2) couples the side-chain fluctuations using global information about the protein conformation. Our model combines simple statistical models of side-chain conformations based on rotamer libraries with the now ubiquitous transformer architecture to sample with atomistic accuracy. Together, these ingredients provide a strategy for rapid data acquisition and hence a crucial ingredient for scalable physical simulation with generative neural networks.
Affiliation(s)
- Grant M Rotskoff
  - Department of Chemistry, Stanford University, Stanford, California 94305, United States
  - Institute for Computational and Mathematical Engineering, Stanford University, Stanford, California 94305, United States
7
Rizzi A, Carloni P, Parrinello M. Free energies at QM accuracy from force fields via multimap targeted estimation. Proc Natl Acad Sci U S A 2023; 120:e2304308120. [PMID: 37931103 PMCID: PMC10655219 DOI: 10.1073/pnas.2304308120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/17/2023] [Accepted: 09/25/2023] [Indexed: 11/08/2023] Open
Abstract
Accurate predictions of ligand binding affinities would greatly accelerate the first stages of drug discovery campaigns. However, using highly accurate interatomic potentials based on quantum mechanics (QM) in free energy methods has been so far largely unfeasible due to their prohibitive computational cost. Here, we present an efficient method to compute QM free energies from simulations using cheap reference potentials, such as force fields (FFs). This task has traditionally been out of reach due to the slow convergence of computing the correction from the FF to the QM potential. To overcome this bottleneck, we generalize targeted free energy methods to employ multiple maps, implemented with normalizing flow neural networks (NNs), that maximize the overlap between the distributions. Critically, the method requires neither a separate expensive training phase for the NNs nor samples from the QM potential. We further propose a one-epoch learning policy to efficiently avoid overfitting, and we combine our approach with enhanced sampling strategies to overcome the pervasive problem of poor convergence due to slow degrees of freedom. On the drug-like molecules in the HiPen dataset, the method accelerates the calculation of the free energy difference of switching from an FF to a DFTB3 potential by three orders of magnitude compared to standard free energy perturbation and by a factor of eight compared to previously published nonequilibrium calculations. Our results suggest that our method, in combination with efficient QM/MM calculations, may be used in lead optimization campaigns in drug discovery and to study protein-ligand molecular recognition processes.
Affiliation(s)
- Andrea Rizzi
  - Computational Biomedicine, Institute of Advanced Simulations IAS-5/Institute for Neuroscience and Medicine INM-9, Forschungszentrum Jülich GmbH, Jülich 52428, Germany
  - Atomistic Simulations, Italian Institute of Technology, Genova 16163, Italy
- Paolo Carloni
  - Computational Biomedicine, Institute of Advanced Simulations IAS-5/Institute for Neuroscience and Medicine INM-9, Forschungszentrum Jülich GmbH, Jülich 52428, Germany
  - Department of Physics and Universitätsklinikum, RWTH Aachen University, Aachen 52074, Germany
- Michele Parrinello
  - Atomistic Simulations, Italian Institute of Technology, Genova 16163, Italy
8
Wang H, Fu T, Du Y, Gao W, Huang K, Liu Z, Chandak P, Liu S, Van Katwyk P, Deac A, Anandkumar A, Bergen K, Gomes CP, Ho S, Kohli P, Lasenby J, Leskovec J, Liu TY, Manrai A, Marks D, Ramsundar B, Song L, Sun J, Tang J, Veličković P, Welling M, Zhang L, Coley CW, Bengio Y, Zitnik M. Scientific discovery in the age of artificial intelligence. Nature 2023; 620:47-60. [PMID: 37532811 DOI: 10.1038/s41586-023-06221-2] [Citation(s) in RCA: 113] [Impact Index Per Article: 113.0] [Received: 03/30/2022] [Accepted: 05/16/2023] [Indexed: 08/04/2023]
Abstract
Artificial intelligence (AI) is being increasingly integrated into scientific discovery to augment and accelerate research, helping scientists to generate hypotheses, design experiments, collect and interpret large datasets, and gain insights that might not have been possible using traditional scientific methods alone. Here we examine breakthroughs over the past decade that include self-supervised learning, which allows models to be trained on vast amounts of unlabelled data, and geometric deep learning, which leverages knowledge about the structure of scientific data to enhance model accuracy and efficiency. Generative AI methods can create designs, such as small-molecule drugs and proteins, by analysing diverse data modalities, including images and sequences. We discuss how these methods can help scientists throughout the scientific process and the central issues that remain despite such advances. Both developers and users of AI tools need a better understanding of when such approaches need improvement, and challenges posed by poor data quality and stewardship remain. These issues cut across scientific disciplines and require developing foundational algorithmic approaches that can contribute to scientific understanding or acquire it autonomously, making them critical areas of focus for AI innovation.
Affiliation(s)
- Hanchen Wang
  - Department of Engineering, University of Cambridge, Cambridge, UK
  - Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, CA, USA
  - Department of Research and Early Development, Genentech Inc, South San Francisco, CA, USA
  - Department of Computer Science, Stanford University, Stanford, CA, USA
- Tianfan Fu
  - Department of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, GA, USA
- Yuanqi Du
  - Department of Computer Science, Cornell University, Ithaca, NY, USA
- Wenhao Gao
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
- Kexin Huang
  - Department of Computer Science, Stanford University, Stanford, CA, USA
- Ziming Liu
  - Department of Physics, Massachusetts Institute of Technology, Cambridge, MA, USA
- Payal Chandak
  - Harvard-MIT Program in Health Sciences and Technology, Cambridge, MA, USA
- Shengchao Liu
  - Mila - Quebec AI Institute, Montreal, Quebec, Canada
  - Université de Montréal, Montreal, Quebec, Canada
- Peter Van Katwyk
  - Department of Earth, Environmental and Planetary Sciences, Brown University, Providence, RI, USA
  - Data Science Institute, Brown University, Providence, RI, USA
- Andreea Deac
  - Mila - Quebec AI Institute, Montreal, Quebec, Canada
  - Université de Montréal, Montreal, Quebec, Canada
- Anima Anandkumar
  - Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, CA, USA
  - NVIDIA, Santa Clara, CA, USA
- Karianne Bergen
  - Department of Earth, Environmental and Planetary Sciences, Brown University, Providence, RI, USA
  - Data Science Institute, Brown University, Providence, RI, USA
- Carla P Gomes
  - Department of Computer Science, Cornell University, Ithaca, NY, USA
- Shirley Ho
  - Center for Computational Astrophysics, Flatiron Institute, New York, NY, USA
  - Department of Astrophysical Sciences, Princeton University, Princeton, NJ, USA
  - Department of Physics, Carnegie Mellon University, Pittsburgh, PA, USA
  - Department of Physics and Center for Data Science, New York University, New York, NY, USA
- Joan Lasenby
  - Department of Engineering, University of Cambridge, Cambridge, UK
- Jure Leskovec
  - Department of Computer Science, Stanford University, Stanford, CA, USA
- Arjun Manrai
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Debora Marks
  - Department of Systems Biology, Harvard Medical School, Boston, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Le Song
  - BioMap, Beijing, China
  - Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, United Arab Emirates
- Jimeng Sun
  - University of Illinois at Urbana-Champaign, Champaign, IL, USA
- Jian Tang
  - Mila - Quebec AI Institute, Montreal, Quebec, Canada
  - HEC Montréal, Montreal, Quebec, Canada
  - CIFAR AI Chair, Toronto, Ontario, Canada
- Petar Veličković
  - Google DeepMind, London, UK
  - Department of Computer Science and Technology, University of Cambridge, Cambridge, UK
- Max Welling
  - University of Amsterdam, Amsterdam, Netherlands
  - Microsoft Research Amsterdam, Amsterdam, Netherlands
- Linfeng Zhang
  - DP Technology, Beijing, China
  - AI for Science Institute, Beijing, China
- Connor W Coley
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
  - Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, USA
- Yoshua Bengio
  - Mila - Quebec AI Institute, Montreal, Quebec, Canada
  - Université de Montréal, Montreal, Quebec, Canada
- Marinka Zitnik
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Harvard Data Science Initiative, Cambridge, MA, USA
  - Kempner Institute for the Study of Natural and Artificial Intelligence, Harvard University, Cambridge, MA, USA
9
Guo P, Zhang R, Zhang J, Shi J, Li B. A Monte Carlo algorithm to improve the measurement efficiency of low-field nuclear magnetic resonance. Sci Rep 2023; 13:10533. [PMID: 37386118 PMCID: PMC10310765 DOI: 10.1038/s41598-023-37731-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/25/2023] [Accepted: 06/27/2023] [Indexed: 07/01/2023] Open
Abstract
Nuclear magnetic resonance (NMR) has shown good applications in engineering fields such as well logging and rubber material ageing assessment. However, due to the low magnetic field strength of NMR sensors and the complex working conditions of engineering sites, the signal-to-noise ratio (SNR) of NMR signals is low, and it is usually necessary to increase the number of repeated measurements to improve the SNR, which means a longer measurement time. Therefore, it is especially important to set the measurement parameters appropriately for onsite NMR. In this paper, we propose a stochastic simulation using Monte Carlo methods to predict the measurement curves of [Formula: see text] and [Formula: see text] and correct the measurement parameters of the next step according to the previous measurement results. The method can update the measurement parameters in real time and perform automatic measurements. At the same time, this method greatly reduces the measurement time. The experimental results show that the method is suitable for the measurement of the self-diffusion coefficient D0 and longitudinal relaxation time T1, which are frequently used in NMR measurements.
Affiliation(s)
- Pan Guo
  - College of Physics and Electronic Engineering, Chongqing Normal University, Chongqing, China
- Ruoshuang Zhang
  - College of Physics and Electronic Engineering, Chongqing Normal University, Chongqing, China
- Jiawen Zhang
  - College of Physics and Electronic Engineering, Chongqing Normal University, Chongqing, China
- Junhao Shi
  - College of Physics and Electronic Engineering, Chongqing Normal University, Chongqing, China
- Bing Li
  - Urumqi Power Supply Company, State Grid Xinjiang Electric Power Co, LTD, Urumqi, China
10
Chennakesavalu S, Toomer DJ, Rotskoff GM. Ensuring thermodynamic consistency with invertible coarse-graining. J Chem Phys 2023; 158:124126. [PMID: 37003724 DOI: 10.1063/5.0141888] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Indexed: 04/03/2023] Open
Abstract
Coarse-grained models are a core computational tool in theoretical chemistry and biophysics. A judicious choice of a coarse-grained model can yield physical insights by isolating the essential degrees of freedom that dictate the thermodynamic properties of a complex, condensed-phase system. The reduced complexity of the model typically leads to lower computational costs and more efficient sampling compared with atomistic models. Designing "good" coarse-grained models is an art. Generally, the mapping from fine-grained configurations to coarse-grained configurations itself is not optimized in any way; instead, the energy function associated with the mapped configurations is. In this work, we explore the consequences of optimizing the coarse-grained representation alongside its potential energy function. We use a graph machine learning framework to embed atomic configurations into a low-dimensional space to produce efficient representations of the original molecular system. Because the representation we obtain is no longer directly interpretable as a real-space representation of the atomic coordinates, we also introduce an inversion process and an associated thermodynamic consistency relation that allows us to rigorously sample fine-grained configurations conditioned on the coarse-grained sampling. We show that this technique is robust, recovering the first two moments of the distribution of several observables in proteins such as chignolin and alanine dipeptide.
Affiliation(s)
- David J Toomer
  - Department of Chemistry, Stanford University, Stanford, California 94305, USA
- Grant M Rotskoff
  - Department of Chemistry, Stanford University, Stanford, California 94305, USA
11
Hashemi M, Vattikonda AN, Jha J, Sip V, Woodman MM, Bartolomei F, Jirsa VK. Amortized Bayesian inference on generative dynamical network models of epilepsy using deep neural density estimators. Neural Netw 2023; 163:178-194. [PMID: 37060871 DOI: 10.1016/j.neunet.2023.03.040] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2022] [Revised: 03/24/2023] [Accepted: 03/30/2023] [Indexed: 04/03/2023]
Abstract
Whole-brain modeling of epilepsy combines personalized anatomical data with dynamical models of abnormal activities to generate spatio-temporal seizure patterns as observed in brain imaging data. Such a parametric simulator is equipped with a stochastic generative process, which itself provides the basis for inference and prediction of the local and global brain dynamics affected by disorders. However, the calculation of likelihood function at whole-brain scale is often intractable. Thus, likelihood-free algorithms are required to efficiently estimate the parameters pertaining to the hypothetical areas, ideally including the uncertainty. In this study, we introduce the simulation-based inference for the virtual epileptic patient model (SBI-VEP), enabling us to amortize the approximate posterior of the generative process from a low-dimensional representation of whole-brain epileptic patterns. The state-of-the-art deep learning algorithms for conditional density estimation are used to readily retrieve the statistical relationships between parameters and observations through a sequence of invertible transformations. We show that the SBI-VEP is able to efficiently estimate the posterior distribution of parameters linked to the extent of the epileptogenic and propagation zones from sparse intracranial electroencephalography recordings. The presented Bayesian methodology can deal with non-linear latent dynamics and parameter degeneracy, paving the way for fast and reliable inference on brain disorders from neuroimaging modalities.
12
Shiba H, Hanai M, Suzumura T, Shimokawabe T. BOTAN: BOnd TArgeting Network for prediction of slow glassy dynamics by machine learning relative motion. J Chem Phys 2023; 158:084503. [PMID: 36859106 DOI: 10.1063/5.0129791] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/03/2023] Open
Abstract
Recent developments in machine learning have enabled accurate predictions of the dynamics of slow structural relaxation in glass-forming systems. However, existing machine learning models for these tasks are mostly designed such that they learn a single dynamic quantity and relate it to the structural features of glassy liquids. In this study, we propose a graph neural network model, "BOnd TArgeting Network," that learns relative motion between neighboring pairs of particles, in addition to the self-motion of particles. By relating the structural features to these two different dynamical variables, the model autonomously acquires the ability to discern how the self motion of particles undergoing slow relaxation is affected by different dynamical processes, strain fluctuations and particle rearrangements, and thus can predict with high precision how slow structural relaxation develops in space and time.
Affiliation(s)
- Hayato Shiba
  - Information Technology Center, University of Tokyo, Chiba 277-0882, Japan
- Masatoshi Hanai
  - Information Technology Center, University of Tokyo, Chiba 277-0882, Japan
- Toyotaro Suzumura
  - Information Technology Center, University of Tokyo, Chiba 277-0882, Japan
13
Köhler J, Chen Y, Krämer A, Clementi C, Noé F. Flow-Matching: Efficient Coarse-Graining of Molecular Dynamics without Forces. J Chem Theory Comput 2023; 19:942-952. [PMID: 36668906 DOI: 10.1021/acs.jctc.3c00016] [Citation(s) in RCA: 12] [Impact Index Per Article: 12.0] [Indexed: 01/21/2023]
Abstract
Coarse-grained (CG) molecular simulations have become a standard tool to study molecular processes on time and length scales inaccessible to all-atom simulations. Parametrizing CG force fields to match all-atom simulations has mainly relied on force-matching or relative entropy minimization, which require many samples from costly simulations with all-atom or CG resolutions, respectively. Here we present flow-matching, a new training method for CG force fields that combines the advantages of both methods by leveraging normalizing flows, a generative deep learning method. Flow-matching first trains a normalizing flow to represent the CG probability density, which is equivalent to minimizing the relative entropy without requiring iterative CG simulations. Subsequently, the flow generates samples and forces according to the learned distribution in order to train the desired CG free energy model via force-matching. Even without requiring forces from the all-atom simulations, flow-matching outperforms classical force-matching by an order of magnitude in terms of data efficiency and produces CG models that can capture the folding and unfolding transitions of small proteins.
Affiliation(s)
- Jonas Köhler
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany
| | - Yaoyi Chen
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany
| | - Andreas Krämer
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany
| | - Cecilia Clementi
- Department of Physics, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany
- Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, United States
- Department of Physics, Rice University, Houston, Texas 77005, United States
- Department of Chemistry, Rice University, Houston, Texas 77005, United States
| | - Frank Noé
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany
- Department of Physics, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany
- Department of Chemistry, Rice University, Houston, Texas 77005, United States
- Microsoft Research AI4Science, Karl-Liebknecht Strasse 32, 10178 Berlin, Germany
|
14
|
Invernizzi M, Krämer A, Clementi C, Noé F. Skipping the Replica Exchange Ladder with Normalizing Flows. J Phys Chem Lett 2022; 13:11643-11649. [PMID: 36484770 DOI: 10.1021/acs.jpclett.2c03327] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 06/17/2023]
Abstract
We combine replica exchange (parallel tempering) with normalizing flows, a class of deep generative models. These two sampling strategies complement each other, resulting in an efficient method for sampling molecular systems characterized by rare events, which we call learned replica exchange (LREX). In LREX, a normalizing flow is trained to map the configurations of the fastest-mixing replica into configurations belonging to the target distribution, allowing direct exchanges between the two without the need to simulate intermediate replicas. This can significantly reduce the computational cost compared to standard replica exchange. The proposed method also offers several advantages with respect to Boltzmann generators that directly use normalizing flows to sample the target distribution. We apply LREX to some prototypical molecular dynamics systems, highlighting the improvements over previous methods.
Affiliation(s)
- Michele Invernizzi
- Department of Mathematics and Computer Science, Freie Universität Berlin, 14195 Berlin, Germany
| | - Andreas Krämer
- Department of Mathematics and Computer Science, Freie Universität Berlin, 14195 Berlin, Germany
| | - Cecilia Clementi
- Department of Physics, Freie Universität Berlin, 14195 Berlin, Germany
- Department of Chemistry, Rice University, 77005 Houston, United States
- Center for Theoretical Biological Physics, Rice University, 77005 Houston, United States
| | - Frank Noé
- Department of Mathematics and Computer Science, Freie Universität Berlin, 14195 Berlin, Germany
- Department of Physics, Freie Universität Berlin, 14195 Berlin, Germany
- Department of Chemistry, Rice University, 77005 Houston, United States
- Microsoft Research AI4Science, 10178 Berlin, Germany
|
15
|
Mahmoud AH, Masters M, Lee SJ, Lill MA. Accurate Sampling of Macromolecular Conformations Using Adaptive Deep Learning and Coarse-Grained Representation. J Chem Inf Model 2022; 62:1602-1617. [PMID: 35352898 DOI: 10.1021/acs.jcim.1c01438] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/28/2022]
Abstract
Conformational sampling of protein structures is essential for understanding biochemical functions and for predicting thermodynamic properties such as free energies. Where previous approaches rely on sequential sampling procedures, recent developments in generative deep neural networks rendered possible the parallel, statistically independent sampling of molecular configurations. To be able to accurately generate samples of large molecular systems from a high-dimensional multimodal equilibrium distribution function, we developed a hierarchical approach based on expressive normalizing flows with rational quadratic neural splines and coarse-grained representation. Furthermore, system-specific priors and adaptive and property-based controlled learning were designed to diminish the likelihood for the generation of high-energy structures during sampling. Finally, backmapping from a coarse-grained to fully atomistic representation is performed through an equivariant transformer model. We demonstrate the applicability of the method on the one-shot configurational sampling of a protein system with more than a hundred amino acids. The results show enhanced expressivity that diminishes the invertibility constraints inherent in the normalizing flow framework. Moreover, the capacity of the hierarchical normalizing flow model was tested on a challenging case study of the folding/unfolding dynamics of the peptide chignolin.
Affiliation(s)
- Amr H Mahmoud
- Department of Pharmaceutical Sciences, University of Basel, Klingelbergstrasse 50, 4056 Basel, Switzerland
| | - Matthew Masters
- Department of Pharmaceutical Sciences, University of Basel, Klingelbergstrasse 50, 4056 Basel, Switzerland
| | - Soo Jung Lee
- Department of Pharmaceutical Sciences, University of Basel, Klingelbergstrasse 50, 4056 Basel, Switzerland
| | - Markus A Lill
- Department of Pharmaceutical Sciences, University of Basel, Klingelbergstrasse 50, 4056 Basel, Switzerland
|
16
|
Brofos JA, Gabrié M, Brubaker MA, Lederman RR. Adaptation of the Independent Metropolis-Hastings Sampler with Normalizing Flow Proposals. Proceedings of Machine Learning Research 2022; 151:5949-5986. [PMID: 36789101 PMCID: PMC9923871] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/18/2023]
Abstract
Markov Chain Monte Carlo (MCMC) methods are a powerful tool for computation with complex probability distributions. However, the performance of such methods is critically dependent on properly tuned parameters, most of which are difficult if not impossible to know a priori for a given target distribution. Adaptive MCMC methods aim to address this by allowing the parameters to be updated during sampling based on previous samples from the chain, at the expense of requiring a new theoretical analysis to ensure convergence. In this work we extend the convergence theory of adaptive MCMC methods to a new class of methods built on a powerful class of parametric density estimators known as normalizing flows. In particular, we consider an independent Metropolis-Hastings sampler where the proposal distribution is represented by a normalizing flow whose parameters are updated using stochastic gradient descent. We explore the practical performance of this procedure both in synthetic settings and in the analysis of a physical field system, and compare it against both adaptive and non-adaptive MCMC methods.
|