1
Lorpaiboon C, Guo SC, Strahan J, Weare J, Dinner AR. Accurate estimates of dynamical statistics using memory. J Chem Phys 2024; 160:084108. [PMID: 38391020 PMCID: PMC10898919 DOI: 10.1063/5.0187145] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/11/2023] [Accepted: 01/29/2024] [Indexed: 02/24/2024]
Abstract
Many chemical reactions and molecular processes occur on time scales that are significantly longer than those accessible by direct simulations. One successful approach to estimating dynamical statistics for such processes is to use many short time series of observations of the system to construct a Markov state model, which approximates the dynamics of the system as memoryless transitions between a set of discrete states. The dynamical Galerkin approximation (DGA) is a closely related framework for estimating dynamical statistics, such as committors and mean first passage times, by approximating solutions to their equations with a projection onto a basis. Because the projected dynamics are generally not memoryless, the Markov approximation can result in significant systematic errors. Inspired by quasi-Markov state models, which employ the generalized master equation to encode memory resulting from the projection, we reformulate DGA to account for memory and analyze its performance on two systems: a two-dimensional triple well and the AIB9 peptide. We demonstrate that our method is robust to the choice of basis and can decrease the time series length required to obtain accurate kinetics by an order of magnitude.
Affiliation(s)
- Chatipat Lorpaiboon
  - Department of Chemistry and James Franck Institute, University of Chicago, Chicago, Illinois 60637, USA
- Spencer C. Guo
  - Department of Chemistry and James Franck Institute, University of Chicago, Chicago, Illinois 60637, USA
- John Strahan
  - Department of Chemistry and James Franck Institute, University of Chicago, Chicago, Illinois 60637, USA
- Jonathan Weare
  - Courant Institute of Mathematical Sciences, New York University, New York, New York 10012, USA
- Aaron R. Dinner
  - Department of Chemistry and James Franck Institute, University of Chicago, Chicago, Illinois 60637, USA
2
Dorbath E, Gulzar A, Stock G. Log-periodic oscillations as real-time signatures of hierarchical dynamics in proteins. J Chem Phys 2024; 160:074103. [PMID: 38364004 DOI: 10.1063/5.0188220] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/20/2023] [Accepted: 01/23/2024] [Indexed: 02/18/2024]
Abstract
The time-dependent relaxation of a dynamical system may exhibit a power-law behavior that is superimposed by log-periodic oscillations. D. Sornette [Phys. Rep. 297, 239 (1998)] showed that this behavior can be explained by a discrete scale invariance of the system, which is associated with discrete and equidistant timescales on a logarithmic scale. Examples include such diverse fields as financial crashes, random diffusion, and quantum topological materials. Recent time-resolved experiments and molecular dynamics simulations suggest that discrete scale invariance may also apply to hierarchical dynamics in proteins, where several fast local conformational changes are a prerequisite for a slow global transition to occur. Applying entropy-based timescale analysis and Markov state modeling to a simple one-dimensional hierarchical model and biomolecular simulation data, it is found that hierarchical systems quite generally give rise to logarithmically spaced discrete timescales. By introducing a one-dimensional reaction coordinate that collectively accounts for the hierarchically coupled degrees of freedom, the free energy landscape exhibits a characteristic staircase shape with two metastable end states, which causes the log-periodic time evolution of the system. The period of the log-oscillations reflects the effective roughness of the energy landscape and can, in simple cases, be interpreted in terms of the barriers of the staircase landscape.
Affiliation(s)
- Emanuel Dorbath
  - Biomolecular Dynamics, Institute of Physics, University of Freiburg, 79104 Freiburg, Germany
- Adnan Gulzar
  - Biomolecular Dynamics, Institute of Physics, University of Freiburg, 79104 Freiburg, Germany
- Gerhard Stock
  - Biomolecular Dynamics, Institute of Physics, University of Freiburg, 79104 Freiburg, Germany
3
Chang L, Mondal A, Singh B, Martínez-Noa Y, Perez A. Revolutionizing Peptide-Based Drug Discovery: Advances in the Post-AlphaFold Era. Wiley Interdisciplinary Reviews: Computational Molecular Science 2024; 14:e1693. [PMID: 38680429 PMCID: PMC11052547 DOI: 10.1002/wcms.1693] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2023] [Accepted: 09/18/2023] [Indexed: 05/01/2024]
Abstract
Peptide-based drugs offer high specificity, potency, and selectivity. However, their inherent flexibility and differences in conformational preferences between their free and bound states create unique challenges that have hindered progress in effective drug discovery pipelines. The emergence of AlphaFold (AF) and Artificial Intelligence (AI) presents new opportunities for enhancing peptide-based drug discovery. We explore recent advancements that facilitate a successful peptide drug discovery pipeline, considering peptides' attractive therapeutic properties and strategies to enhance their stability and bioavailability. AF enables efficient and accurate prediction of peptide-protein structures, addressing a critical requirement in computational drug discovery pipelines. In the post-AF era, we are witnessing rapid progress with the potential to revolutionize peptide-based drug discovery such as the ability to rank peptide binders or classify them as binders/non-binders and the ability to design novel peptide sequences. However, AI-based methods are struggling due to the lack of well-curated datasets, for example to accommodate modified amino acids or unconventional cyclization. Thus, physics-based methods, such as docking or molecular dynamics simulations, continue to hold a complementary role in peptide drug discovery pipelines. Moreover, MD-based tools offer valuable insights into binding mechanisms, as well as the thermodynamic and kinetic properties of complexes. As we navigate this evolving landscape, a synergistic integration of AI and physics-based methods holds the promise of reshaping the landscape of peptide-based drug discovery.
Affiliation(s)
- Liwei Chang
  - Department of Chemistry, University of Florida, Gainesville, FL 32611
- Arup Mondal
  - Department of Chemistry, University of Florida, Gainesville, FL 32611
- Bhumika Singh
  - Department of Chemistry, University of Florida, Gainesville, FL 32611
- Alberto Perez
  - Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, FL 32611
4
Strahan J, Guo SC, Lorpaiboon C, Dinner AR, Weare J. Inexact iterative numerical linear algebra for neural network-based spectral estimation and rare-event prediction. J Chem Phys 2023; 159:014110. [PMID: 37409704 PMCID: PMC10328561 DOI: 10.1063/5.0151309] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 03/20/2023] [Accepted: 06/02/2023] [Indexed: 07/07/2023]
Abstract
Understanding dynamics in complex systems is challenging because there are many degrees of freedom, and those that are most important for describing events of interest are often not obvious. The leading eigenfunctions of the transition operator are useful for visualization, and they can provide an efficient basis for computing statistics, such as the likelihood and average time of events (predictions). Here, we develop inexact iterative linear algebra methods for computing these eigenfunctions (spectral estimation) and making predictions from a dataset of short trajectories sampled at finite intervals. We demonstrate the methods on a low-dimensional model that facilitates visualization and a high-dimensional model of a biomolecular system. Implications for the prediction problem in reinforcement learning are discussed.
Affiliation(s)
- John Strahan
  - Department of Chemistry and James Franck Institute, University of Chicago, Chicago, Illinois 60637, USA
- Spencer C. Guo
  - Department of Chemistry and James Franck Institute, University of Chicago, Chicago, Illinois 60637, USA
- Chatipat Lorpaiboon
  - Department of Chemistry and James Franck Institute, University of Chicago, Chicago, Illinois 60637, USA
- Aaron R. Dinner
  - Department of Chemistry and James Franck Institute, University of Chicago, Chicago, Illinois 60637, USA
- Jonathan Weare
  - Courant Institute of Mathematical Sciences, New York University, New York, New York 10012, USA
5
Esmaeeli R, Bauzá A, Perez A. Structural predictions of protein-DNA binding: MELD-DNA. Nucleic Acids Res 2023; 51:1625-1636. [PMID: 36727436 PMCID: PMC9976882 DOI: 10.1093/nar/gkad013] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 11/11/2022] [Revised: 12/27/2022] [Accepted: 01/30/2023] [Indexed: 02/03/2023]
Abstract
Structural, regulatory and enzymatic proteins interact with DNA to maintain a healthy and functional genome. Yet, our structural understanding of how proteins interact with DNA is limited. We present MELD-DNA, a novel computational approach to predict the structures of protein-DNA complexes. The method combines molecular dynamics simulations with general knowledge or experimental information through Bayesian inference. The physical model is sensitive to sequence-dependent properties and conformational changes required for binding, while information accelerates sampling of bound conformations. MELD-DNA can: (i) sample multiple binding modes; (ii) identify the preferred binding mode from the ensembles; and (iii) provide qualitative binding preferences between DNA sequences. We first assess performance on a dataset of 15 protein-DNA complexes and compare it with state-of-the-art methodologies. Furthermore, for three selected complexes, we show sequence dependence effects of binding in MELD predictions. We expect that the results presented herein, together with the freely available software, will impact structural biology (by complementing DNA structural databases) and molecular recognition (by bringing new insights into aspects governing protein-DNA interactions).
Affiliation(s)
- Reza Esmaeeli
  - Department of Chemistry, Quantum Theory Project, University of Florida, Gainesville, FL 32611, USA
- Antonio Bauzá
  - Department of Chemistry, Universitat de les Illes Balears, Palma de Mallorca (Baleares), 07122, Spain
- Alberto Perez
  - Department of Chemistry, Quantum Theory Project, University of Florida, Gainesville, FL 32611, USA
6
Smith Z, Tiwary P. Making High-Dimensional Molecular Distribution Functions Tractable through Belief Propagation on Factor Graphs. J Phys Chem B 2021; 125:11150-11158. [PMID: 34586819 DOI: 10.1021/acs.jpcb.1c05717] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
Abstract
Molecular dynamics (MD) simulations provide a wealth of high-dimensional data at all-atom and femtosecond resolution, but deciphering mechanistic information from this data is an ongoing challenge in physical chemistry and biophysics. Theoretically speaking, joint probabilities of the equilibrium distribution contain all thermodynamic information, but they prove increasingly difficult to compute and interpret as the dimensionality increases. Here, inspired by tools in probabilistic graphical modeling, we develop a factor graph trained through belief propagation that helps factorize the joint probability into an approximate tractable form that can be easily visualized and used. We validate the study through the analysis of the conformational dynamics of two small peptides with five and nine residues. Our validations include testing the conditional dependency predictions through an intervention scheme inspired by Judea Pearl. Second, we directly use the belief propagation-based approximate probability distribution as a high-dimensional static bias for enhanced sampling, where we achieve spontaneous back-and-forth motion between metastable states that is up to 350 times faster than unbiased MD. We believe this work opens up useful ways of thinking about and dealing with high-dimensional molecular simulations.
Affiliation(s)
- Zachary Smith
  - Biophysics Program and Institute for Physical Science and Technology, University of Maryland, College Park, Maryland 20742, United States
- Pratyush Tiwary
  - Department of Chemistry and Biochemistry and Institute for Physical Science and Technology, University of Maryland, College Park, Maryland 20742, United States
7
Lickert B, Wolf S, Stock G. Data-Driven Langevin Modeling of Nonequilibrium Processes. J Phys Chem B 2021; 125:8125-8136. [PMID: 34270245 DOI: 10.1021/acs.jpcb.1c03828] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 11/29/2022]
Abstract
Given nonstationary data from molecular dynamics simulations, a Markovian Langevin model is constructed that aims to reproduce the time evolution of the underlying process. While at equilibrium the free energy landscape is sampled, nonequilibrium processes can be associated with a biased energy landscape, which accounts for finite sampling effects and external driving. When the data-driven Langevin equation (dLE) approach [Phys. Rev. Lett. 2015, 115, 050602] is extended to the modeling of nonequilibrium processes, an efficient way to calculate multidimensional Langevin fields is outlined. The dLE is shown to correctly account for various nonequilibrium processes, including the enforced dissociation of sodium chloride in water, the pressure-jump induced nucleation of a liquid of hard spheres, and the conformational dynamics of a helical peptide sampled from nonstationary short trajectories.
Affiliation(s)
- Benjamin Lickert
  - Biomolecular Dynamics, Institute of Physics, Albert Ludwigs University, 79104 Freiburg, Germany
- Steffen Wolf
  - Biomolecular Dynamics, Institute of Physics, Albert Ludwigs University, 79104 Freiburg, Germany
- Gerhard Stock
  - Biomolecular Dynamics, Institute of Physics, Albert Ludwigs University, 79104 Freiburg, Germany
8
Rosenberger D, Smith JS, Garcia AE. Modeling of Peptides with Classical and Novel Machine Learning Force Fields: A Comparison. J Phys Chem B 2021; 125:3598-3612. [DOI: 10.1021/acs.jpcb.0c10401] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Indexed: 01/09/2023]
Affiliation(s)
- David Rosenberger
  - Los Alamos National Laboratory, Theoretical Division, Chemistry and Physics of Materials Group, Los Alamos, 87545 New Mexico, United States
  - Los Alamos National Laboratory, Theoretical Division, Center for Nonlinear Studies, Los Alamos, 87545 New Mexico, United States
- Justin S. Smith
  - Los Alamos National Laboratory, Theoretical Division, Chemistry and Physics of Materials Group, Los Alamos, 87545 New Mexico, United States
- Angel E. Garcia
  - Los Alamos National Laboratory, Theoretical Division, Center for Nonlinear Studies, Los Alamos, 87545 New Mexico, United States
9
Abstract
Every protein has a story: how it folds, what it binds, its biological actions, and how it misbehaves in aging or disease. Stories are often inferred from a protein's shape (i.e., its structure). But increasingly, stories are told using computational molecular physics (CMP). CMP is rooted in the principled physics of driving forces and reveals granular detail of conformational populations in space and time. Recent advances are accessing longer time scales, larger actions, and blind testing, enabling more of biology's stories to be told in the language of atomistic physics.
Affiliation(s)
- Emiliano Brini
  - Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY 11794, USA
- Carlos Simmerling
  - Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY 11794, USA
  - Department of Chemistry, Stony Brook University, Stony Brook, NY 11794, USA
- Ken Dill
  - Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY 11794, USA
  - Department of Chemistry, Stony Brook University, Stony Brook, NY 11794, USA
  - Department of Physics and Astronomy, Stony Brook University, Stony Brook, NY 11794, USA
10
Binding Ensembles of p53-MDM2 Peptide Inhibitors by Combining Bayesian Inference and Atomistic Simulations. Molecules 2021; 26:198. [PMID: 33401765 PMCID: PMC7795311 DOI: 10.3390/molecules26010198] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 11/13/2020] [Revised: 12/26/2020] [Accepted: 12/28/2020] [Indexed: 01/21/2023]
Abstract
Designing peptide inhibitors of the p53-MDM2 interaction against cancer is of wide interest. Computational modeling and virtual screening are a well-established step in the rational design of small molecules, but they face challenges for binding flexible peptide molecules that fold upon binding. We look at the ability of five different peptides, three of which are intrinsically disordered, to bind to MDM2 with a new Bayesian inference approach (MELD × MD). The method is able to capture the folding-upon-binding mechanism and differentiate binding preferences between the five peptides. Processing the ensembles with statistical mechanics tools depicts the most likely bound conformations and hints at differences in the binding mechanism. Finally, the study shows the importance of capturing two driving forces to binding in this system: the ability of peptides to adopt bound conformations (ΔG_conformation) and the interaction between interface residues (ΔG_interaction).
11
Nagel D, Weber A, Stock G. MSMPathfinder: Identification of Pathways in Markov State Models. J Chem Theory Comput 2020; 16:7874-7882. [PMID: 33141565 DOI: 10.1021/acs.jctc.0c00774] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 12/19/2022]
Abstract
Markov state models represent a popular means to interpret biomolecular processes in terms of memoryless transitions between metastable conformational states. To gain insight into the underlying mechanism, it is instructive to determine all relevant pathways between initial and final states of the process. Currently available methods, such as Markov chain Monte Carlo and transition path theory, are convenient for identifying the most frequented pathways. They are less suited to account for the typically huge amount of pathways with low probability which, though, may dominate the cumulative flux of the reaction. On the basis of a systematic construction of all possible pathways, the here proposed method MSMPathfinder is able to characterize the multitude of unique pathways (say, up to 10^10) in a complex system and to quantitatively calculate their correct weights and associated waiting times with predefined accuracy. Adopting the chiral transitions of a peptide helix and the folding of the villin headpiece as model problems, mechanisms and associated waiting times of these processes are discussed using a kinetic network representation. The analysis reveals that the waiting time distribution may yield only little insight into the diversity of pathways, because the measured folding times do typically not reflect the most probable path lengths but rather the cumulative effect of many different pathways.
Affiliation(s)
- Daniel Nagel
  - Biomolecular Dynamics, Institute of Physics, Albert Ludwigs University, 79104 Freiburg, Germany
- Anna Weber
  - Biomolecular Dynamics, Institute of Physics, Albert Ludwigs University, 79104 Freiburg, Germany
- Gerhard Stock
  - Biomolecular Dynamics, Institute of Physics, Albert Ludwigs University, 79104 Freiburg, Germany
12
Peter EK, Shea JE, Schug A. CORE-MD, a path correlated molecular dynamics simulation method. J Chem Phys 2020; 153:084114. [DOI: 10.1063/5.0015398] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 11/14/2022]
Affiliation(s)
- Emanuel K. Peter
  - John von Neumann Institute for Computing and Jülich Supercomputing Centre, Institute for Advanced Simulation, Forschungszentrum Jülich, Jülich, Germany
- Joan-Emma Shea
  - Department of Chemistry and Biochemistry, Department of Physics, University of California, Santa Barbara, Santa Barbara, California 93106, USA
- Alexander Schug
  - John von Neumann Institute for Computing and Jülich Supercomputing Centre, Institute for Advanced Simulation, Forschungszentrum Jülich, Jülich, Germany
  - Faculty of Biology, University of Duisburg-Essen, Duisburg, Germany