1
Wang D, Qiu Y, Beyerle ER, Huang X, Tiwary P. Information Bottleneck Approach for Markov Model Construction. J Chem Theory Comput 2024; 20:5352-5367. [PMID: 38859575 PMCID: PMC11199095 DOI: 10.1021/acs.jctc.4c00449] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2024]
Abstract
Markov state models (MSMs) have proven valuable in studying the dynamics of protein conformational changes via statistical analysis of molecular dynamics simulations. In MSMs, the complex configuration space is coarse-grained into conformational states, with dynamics modeled by a series of Markovian transitions among these states at discrete lag times. Constructing the Markovian model at a specific lag time necessitates defining states that circumvent significant internal energy barriers, enabling internal dynamics relaxation within the lag time. This process effectively coarse-grains time and space, integrating out rapid motions within metastable states. Thus, MSMs possess a multiresolution nature, where the granularity of states can be adjusted according to the time-resolution, offering flexibility in capturing system dynamics. This work introduces a continuous embedding approach for molecular conformations using the state predictive information bottleneck (SPIB), a framework that unifies dimensionality reduction and state space partitioning via a continuous, machine learned basis set. Without explicit optimization of the VAMP-based scores, SPIB demonstrates state-of-the-art performance in identifying slow dynamical processes and constructing predictive multiresolution Markovian models. Through applications to well-validated mini-proteins, SPIB showcases unique advantages compared to competing methods. It autonomously and self-consistently adjusts the number of metastable states based on a specified minimal time resolution, eliminating the need for manual tuning. While maintaining efficacy in dynamical properties, SPIB excels in accurately distinguishing metastable states and capturing numerous well-populated macrostates. This contrasts with existing VAMP-based methods, which often emphasize slow dynamics at the expense of incorporating numerous sparsely populated states. 
Furthermore, SPIB's ability to learn a low-dimensional continuous embedding of the underlying MSMs enhances the interpretation of dynamic pathways. With these benefits, we propose SPIB as an easy-to-implement methodology for end-to-end MSM construction.
Affiliation(s)
- Dedi Wang
  - Biophysics Program and Institute for Physical Science and Technology, University of Maryland, College Park, MD 20742, United States
- Yunrui Qiu
  - Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, WI 53706, United States
  - Data Science Institute, University of Wisconsin-Madison, Madison, WI 53706, United States
- Eric R. Beyerle
  - Institute for Physical Science and Technology, University of Maryland, College Park, MD 20742, United States
- Xuhui Huang
  - Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, WI 53706, United States
  - Data Science Institute, University of Wisconsin-Madison, Madison, WI 53706, United States
- Pratyush Tiwary
  - Department of Chemistry and Biochemistry and Institute for Physical Science and Technology, University of Maryland, College Park, MD 20742, United States
  - University of Maryland Institute for Health Computing, Bethesda, MD 20852, United States
2
Nagahata Y, Kobayashi M, Toda M, Maeda S, Taketsugu T, Komatsuzaki T. An encompassed representation of timescale hierarchies in first-order reaction network. Proc Natl Acad Sci U S A 2024; 121:e2317781121. [PMID: 38758700 PMCID: PMC11126998 DOI: 10.1073/pnas.2317781121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2023] [Accepted: 04/16/2024] [Indexed: 05/19/2024]
Abstract
Complex networks are pervasive in various fields such as chemistry, biology, and sociology. In chemistry, first-order reaction networks are represented by a set of first-order differential equations, which can be constructed from the underlying energy landscape. However, as the number of nodes increases, it becomes more challenging to understand complex kinetics across different timescales. Hence, how to construct an interpretable coarse-graining scheme that preserves the underlying timescales of overall reactions is of crucial importance. Here, we develop a scheme to capture the underlying hierarchical subsets of nodes, and a series of coarse-grained (reduced-dimensional) rate equations between the subsets as a function of time resolution from the original reaction network. Each of the coarse-grained representations is guaranteed to preserve the underlying slow characteristic timescales in the original network. The crux is the construction of a lumping scheme incorporating a similarity measure in deciphering the underlying timescale hierarchy, which does not rely on the assumption of equilibrium. As an illustrative example, we apply the scheme to four-state Markovian models and the Claisen rearrangement of allyl vinyl ether (AVE), and demonstrate that the reduced-dimensional representation accurately reproduces not only the slowest but also the faster timescales of overall reactions, whereas other reduction schemes based on the equilibrium assumption reproduce the slowest timescale well but fail to reproduce the second-to-fourth slowest timescales with the same accuracy. Our scheme can be applied not only to reaction networks but also to networks in other fields, which helps us encompass the hierarchical structures of their complex kinetics over timescales.
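The link between a first-order rate matrix and its characteristic timescales can be illustrated with a minimal sketch. It is not the lumping scheme of the paper. It only shows, for a hypothetical four-state chain with made-up rate constants, how the timescales that a coarse-grained model would need to preserve follow from the eigenvalues of the rate matrix. The function names and all numerical values are assumptions for illustration.

```python
import numpy as np

def rate_matrix(rates, n_states):
    """Build K for dp/dt = K p from {(src, dst): k}, where k is the rate src -> dst."""
    K = np.zeros((n_states, n_states))
    for (src, dst), k in rates.items():
        K[dst, src] += k   # population gained by dst from src
        K[src, src] -= k   # population lost by src
    return K

def characteristic_timescales(K, tol=1e-9):
    """Return -1/lambda for every decaying (nonzero) eigenvalue, slowest first."""
    eigvals = np.linalg.eigvals(K).real
    scale = np.abs(eigvals).max()
    decaying = eigvals[np.abs(eigvals) > tol * scale]  # drop the stationary mode
    return np.sort(-1.0 / decaying)[::-1]

# Hypothetical rates (arbitrary units): two fast-exchanging pairs, 0<->1 and 2<->3,
# joined by a slow link 1<->2.
rates = {
    (0, 1): 100.0, (1, 0): 80.0,
    (1, 2): 1.0,   (2, 1): 2.0,
    (2, 3): 50.0,  (3, 2): 60.0,
}
K = rate_matrix(rates, 4)
timescales = characteristic_timescales(K)
print(timescales)
```

With these assumed rates, the slowest timescale is well separated from the two fast ones, which is the kind of timescale hierarchy a lumping of states {0, 1} and {2, 3} would aim to retain.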
Affiliation(s)
- Yutaka Nagahata
  - The Institute for Chemical Reaction Design and Discovery, Hokkaido University, Sapporo 001-0021, Japan
  - Research Center of Mathematics for Social Creativity, Research Institute for Electronic Science, Hokkaido University, Sapporo 001-0020, Japan
- Masato Kobayashi
  - The Institute for Chemical Reaction Design and Discovery, Hokkaido University, Sapporo 001-0021, Japan
  - Research Center of Mathematics for Social Creativity, Research Institute for Electronic Science, Hokkaido University, Sapporo 001-0020, Japan
  - Department of Chemistry, Faculty of Science, Hokkaido University, Sapporo 060-0810, Japan
- Mikito Toda
  - Research Center of Mathematics for Social Creativity, Research Institute for Electronic Science, Hokkaido University, Sapporo 001-0020, Japan
  - Faculty Division of Natural Sciences, Nara Women’s University, Nara 630-8506, Japan
  - Graduate School of Information Science, University of Hyogo, Kobe 650-0047, Japan
- Satoshi Maeda
  - The Institute for Chemical Reaction Design and Discovery, Hokkaido University, Sapporo 001-0021, Japan
  - Research Center of Mathematics for Social Creativity, Research Institute for Electronic Science, Hokkaido University, Sapporo 001-0020, Japan
  - Department of Chemistry, Faculty of Science, Hokkaido University, Sapporo 060-0810, Japan
- Tetsuya Taketsugu
  - The Institute for Chemical Reaction Design and Discovery, Hokkaido University, Sapporo 001-0021, Japan
  - Research Center of Mathematics for Social Creativity, Research Institute for Electronic Science, Hokkaido University, Sapporo 001-0020, Japan
  - Department of Chemistry, Faculty of Science, Hokkaido University, Sapporo 060-0810, Japan
- Tamiki Komatsuzaki
  - The Institute for Chemical Reaction Design and Discovery, Hokkaido University, Sapporo 001-0021, Japan
  - Research Center of Mathematics for Social Creativity, Research Institute for Electronic Science, Hokkaido University, Sapporo 001-0020, Japan
  - Institute for Open and Transdisciplinary Research Initiatives, Osaka University, Suita 565-0871, Japan
  - The Institute of Scientific and Industrial Research, Osaka University, Ibaraki 567-0047, Japan
3
Xu T, Li Y, Gao X, Zhang L. Understanding the Fast-Triggering Unfolding Dynamics of FK-11 upon Photoexcitation of Azobenzene. J Phys Chem Lett 2024; 15:3531-3540. [PMID: 38526058 DOI: 10.1021/acs.jpclett.4c00091] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/26/2024]
Abstract
Photoswitchable molecules can control the activity and functions of biomolecules by triggering conformational changes. However, it is still challenging to fully understand such fast-triggering conformational evolution from nonequilibrium to equilibrium distribution at the molecular level. Herein, we successfully simulated the unfolding of the FK-11 peptide upon the photoinduced trans-to-cis isomerization of azobenzene based on the Markov state model. We found that the ensemble of FK-11 contains five conformational states, constituting two unfolding pathways. More intriguingly, we observed the microsecond-scale conformational propagation of the FK-11 peptide from the fully folded state to the equilibrium populations of the five states. The computed CD spectra match well with the experimental data, validating our simulation method. Overall, our study not only offers a protocol to study the photoisomerization-induced conformational changes of enzymes but also could orientate the rational design of a photoswitchable molecule to manipulate biological functions.
Affiliation(s)
- Tiantian Xu
  - State Key Laboratory of Structural Chemistry, Fujian Institute of Research on the Structure of Matter, Chinese Academy of Sciences, Fuzhou, Fujian 350002, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Yongfang Li
  - State Key Laboratory of Structural Chemistry, Fujian Institute of Research on the Structure of Matter, Chinese Academy of Sciences, Fuzhou, Fujian 350002, China
- Xin Gao
  - Computational Bioscience Research Center, King Abdullah University of Science and Technology (KAUST), Thuwal 23955, Saudi Arabia
  - Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955, Saudi Arabia
- Lu Zhang
  - State Key Laboratory of Structural Chemistry, Fujian Institute of Research on the Structure of Matter, Chinese Academy of Sciences, Fuzhou, Fujian 350002, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
  - Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, Fuzhou, Fujian 361005, China
4
Wu Y, Cao S, Qiu Y, Huang X. Tutorial on how to build non-Markovian dynamic models from molecular dynamics simulations for studying protein conformational changes. J Chem Phys 2024; 160:121501. [PMID: 38516972 DOI: 10.1063/5.0189429] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2023] [Accepted: 02/20/2024] [Indexed: 03/23/2024]
Abstract
Protein conformational changes play crucial roles in their biological functions. In recent years, the Markov State Model (MSM) constructed from extensive Molecular Dynamics (MD) simulations has emerged as a powerful tool for modeling complex protein conformational changes. In MSMs, dynamics are modeled as a sequence of Markovian transitions among metastable conformational states at discrete time intervals (called lag time). A major challenge for MSMs is that the lag time must be long enough to allow transitions among states to become memoryless (or Markovian). However, this lag time is constrained by the length of individual MD simulations available to track these transitions. To address this challenge, we have recently developed Generalized Master Equation (GME)-based approaches, encoding non-Markovian dynamics using a time-dependent memory kernel. In this Tutorial, we introduce the theory behind two recently developed GME-based non-Markovian dynamic models: the quasi-Markov State Model (qMSM) and the Integrative Generalized Master Equation (IGME). We subsequently outline the procedures for constructing these models and provide a step-by-step tutorial on applying qMSM and IGME to study two peptide systems: alanine dipeptide and villin headpiece. This Tutorial is available at https://github.com/xuhuihuang/GME_tutorials. The protocols detailed in this Tutorial aim to be accessible for non-experts interested in studying the biomolecular dynamics using these non-Markovian dynamic models.
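The Markovian picture described in this abstract, transitions among discrete states at a fixed lag time, can be made concrete with a minimal sketch. It is not taken from the tutorial repository and does not implement qMSM or IGME. It only estimates a plain, non-reversible maximum-likelihood MSM transition matrix from a synthetic two-state trajectory. The function names, the "true" matrix, and the trajectory length are assumptions for illustration.

```python
import numpy as np

def count_matrix(dtraj, n_states, lag):
    """Count transitions i -> j observed lag steps apart (sliding window)."""
    C = np.zeros((n_states, n_states))
    np.add.at(C, (dtraj[:-lag], dtraj[lag:]), 1.0)
    return C

def transition_matrix(C):
    """Row-normalize counts; assumes every state has outgoing transitions."""
    row_sums = C.sum(axis=1, keepdims=True)
    if np.any(row_sums == 0):
        raise ValueError("a state has no outgoing transitions at this lag time")
    return C / row_sums

def implied_timescales(T, lag):
    """t_i = -lag / ln(lambda_i) for positive eigenvalues below the stationary one."""
    eigvals = np.sort(np.linalg.eigvals(T).real)[::-1]
    rest = eigvals[1:]
    rest = rest[rest > 0]
    return -lag / np.log(rest)

# Synthetic discrete trajectory drawn from an assumed two-state Markov chain.
T_true = np.array([[0.95, 0.05],
                   [0.10, 0.90]])
rng = np.random.default_rng(0)
n_steps = 20000
u = rng.random(n_steps)
dtraj = np.zeros(n_steps, dtype=int)
for t in range(1, n_steps):
    dtraj[t] = 0 if u[t] < T_true[dtraj[t - 1], 0] else 1

lag = 1
T_est = transition_matrix(count_matrix(dtraj, 2, lag))
its = implied_timescales(T_est, lag)
print(T_est)
print(its)
```

In practice the implied timescales are recomputed over a range of lag times, and the lag time is chosen where they level off; the abstract's point is that this plateau may lie beyond what short MD trajectories can support.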
Affiliation(s)
- Yue Wu
  - Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
- Siqin Cao
  - Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
- Yunrui Qiu
  - Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
- Xuhui Huang
  - Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
  - Data Science Institute, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
5
Ojha AA, Thakur S, Ahn SH, Amaro RE. DeepWEST: Deep Learning of Kinetic Models with the Weighted Ensemble Simulation Toolkit for Enhanced Sampling. J Chem Theory Comput 2023; 19:1342-1359. [PMID: 36719802 DOI: 10.1021/acs.jctc.2c00282] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Indexed: 02/01/2023]
Abstract
Recent advances in computational power and algorithms have enabled molecular dynamics (MD) simulations to reach greater time scales. However, for observing conformational transitions associated with biomolecular processes, MD simulations still have limitations. Several enhanced sampling techniques seek to address this challenge, including the weighted ensemble (WE) method, which samples transitions between metastable states using many weighted trajectories to estimate kinetic rate constants. However, initial sampling of the potential energy surface has a significant impact on the performance of WE, i.e., convergence and efficiency. We therefore introduce deep-learned kinetic modeling approaches that extract statistically relevant information from short MD trajectories to provide a well-sampled initial state distribution for WE simulations. This hybrid approach overcomes any statistical bias to the system, as it runs short unbiased MD trajectories and identifies meaningful metastable states of the system. It is shown to provide a more refined free energy landscape closer to the steady state that could efficiently sample kinetic properties such as rate constants.
Affiliation(s)
- Anupam Anand Ojha
  - Department of Chemistry, University of California San Diego, La Jolla, California 92093, United States
- Saumya Thakur
  - Department of Chemistry, Indian Institute of Technology Bombay, Mumbai, Maharashtra 400076, India
- Surl-Hee Ahn
  - Department of Chemical Engineering, University of California Davis, Davis, California 95616, United States
- Rommie E Amaro
  - Department of Chemistry, University of California San Diego, La Jolla, California 92093, United States
6
Unarta IC, Goonetilleke EC, Wang D, Huang X. Nucleotide addition and cleavage by RNA polymerase II: Coordination of two catalytic reactions using a single active site. J Biol Chem 2022; 299:102844. [PMID: 36581202 PMCID: PMC9860460 DOI: 10.1016/j.jbc.2022.102844] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 07/03/2022] [Revised: 12/19/2022] [Accepted: 12/22/2022] [Indexed: 12/28/2022]
Abstract
RNA polymerase II (Pol II) incorporates complementary ribonucleotides into the growing RNA chain one at a time via the nucleotide addition cycle. The nucleotide addition cycle, however, is prone to misincorporation of noncomplementary nucleotides. Thus, to ensure transcriptional fidelity, Pol II backtracks and then cleaves the misincorporated nucleotides. These two reverse reactions, nucleotide addition and cleavage, are catalyzed in the same active site of Pol II, which is different from DNA polymerases or other endonucleases. Recently, substantial progress has been made to understand how Pol II effectively performs its dual role in the same active site. Our review highlights these recent studies and provides an overall model of the catalytic mechanisms of Pol II. In particular, RNA extension follows the two-metal-ion mechanism, and several Pol II residues play important roles to facilitate the catalysis. In sharp contrast, the cleavage reaction is independent of any Pol II residues. Interestingly, Pol II relies on its residues to recognize the misincorporated nucleotides during the backtracking process, prior to cleavage. In this way, Pol II efficiently compartmentalizes its two distinct catalytic functions using the same active site. Lastly, we also discuss a new perspective on the potential third Mg2+ in the nucleotide addition and intrinsic cleavage reactions.
Affiliation(s)
- Ilona Christy Unarta
  - Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, Wisconsin, USA
- Eshani C Goonetilleke
  - Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, Wisconsin, USA
- Dong Wang
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, California, USA
  - Department of Cellular and Molecular Medicine, School of Medicine, University of California, San Diego, La Jolla, California, USA
  - Department of Chemistry and Biochemistry, University of California, San Diego, La Jolla, California, USA
- Xuhui Huang
  - Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, Wisconsin, USA
7
Mardt A, Hempel T, Clementi C, Noé F. Deep learning to decompose macromolecules into independent Markovian domains. Nat Commun 2022; 13:7101. [PMID: 36402768 PMCID: PMC9675806 DOI: 10.1038/s41467-022-34603-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 05/16/2022] [Accepted: 10/27/2022] [Indexed: 11/21/2022]
Abstract
The increasing interest in modeling the dynamics of ever larger proteins has revealed a fundamental problem with models that describe the molecular system as being in a global configuration state. This notion limits our ability to gather sufficient statistics of state probabilities or state-to-state transitions because for large molecular systems the number of metastable states grows exponentially with size. In this manuscript, we approach this challenge by introducing a method that combines our recent progress on independent Markov decomposition (IMD) with VAMPnets, a deep learning approach to Markov modeling. We establish a training objective that quantifies how well a given decomposition of the molecular system into independent subdomains with Markovian dynamics approximates the overall dynamics. By constructing an end-to-end learning framework, the decomposition into such subdomains and their individual Markov state models are simultaneously learned, providing a data-efficient and easily interpretable summary of the complex system dynamics. While learning the dynamical coupling between Markovian subdomains is still an open issue, the present results are a significant step towards learning Ising models of large molecular complexes from simulation data.
Affiliation(s)
- Andreas Mardt
  - Freie Universität Berlin, Department of Mathematics and Computer Science, Berlin, Germany
- Tim Hempel
  - Freie Universität Berlin, Department of Mathematics and Computer Science, Berlin, Germany
  - Freie Universität Berlin, Department of Physics, Berlin, Germany
- Cecilia Clementi
  - Freie Universität Berlin, Department of Physics, Berlin, Germany
  - Rice University, Department of Chemistry, Houston, TX, USA
  - Rice University, Center for Theoretical Biological Physics, Houston, TX, USA
- Frank Noé
  - Freie Universität Berlin, Department of Mathematics and Computer Science, Berlin, Germany
  - Freie Universität Berlin, Department of Physics, Berlin, Germany
  - Rice University, Department of Chemistry, Houston, TX, USA
  - Microsoft Research AI4Science, Berlin, Germany
8
Avery C, Patterson J, Grear T, Frater T, Jacobs DJ. Protein Function Analysis through Machine Learning. Biomolecules 2022; 12:1246. [PMID: 36139085 PMCID: PMC9496392 DOI: 10.3390/biom12091246] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 07/16/2022] [Revised: 08/22/2022] [Accepted: 08/31/2022] [Indexed: 11/16/2022]
Abstract
Machine learning (ML) has been an important arsenal in computational biology used to elucidate protein function for decades. With the recent burgeoning of novel ML methods and applications, new ML approaches have been incorporated into many areas of computational biology dealing with protein function. We examine how ML has been integrated into a wide range of computational models to improve prediction accuracy and gain a better understanding of protein function. The applications discussed are protein structure prediction, protein engineering using sequence modifications to achieve stability and druggability characteristics, molecular docking in terms of protein-ligand binding, including allosteric effects, protein-protein interactions and protein-centric drug discovery. To quantify the mechanisms underlying protein function, a holistic approach that takes structure, flexibility, stability, and dynamics into account is required, as these aspects become inseparable through their interdependence. Another key component of protein function is conformational dynamics, which often manifest as protein kinetics. Computational methods that use ML to generate representative conformational ensembles and quantify differences in conformational ensembles important for function are included in this review. Future opportunities are highlighted for each of these topics.
Affiliation(s)
- Chris Avery
  - Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
- John Patterson
  - Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
- Tyler Grear
  - Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
  - Department of Physics and Optical Science, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
- Theodore Frater
  - Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
- Donald J. Jacobs
  - Department of Physics and Optical Science, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
9
Konovalov K, Unarta IC, Cao S, Goonetilleke EC, Huang X. Markov State Models to Study the Functional Dynamics of Proteins in the Wake of Machine Learning. JACS Au 2021; 1:1330-1341. [PMID: 34604842 PMCID: PMC8479766 DOI: 10.1021/jacsau.1c00254] [Citation(s) in RCA: 51] [Impact Index Per Article: 17.0] [Received: 06/07/2021] [Indexed: 05/19/2023]
Abstract
Markov state models (MSMs) based on molecular dynamics (MD) simulations are routinely employed to study protein folding; however, their application to functional conformational changes of biomolecules is still limited. In the past few years, the field of computational chemistry has experienced a surge of advancements stemming from machine learning algorithms, and MSMs have not been left out. Unlike global processes, such as protein folding, the application of MSMs to functional conformational changes is challenging because they mostly consist of localized structural transitions. Therefore, it is critical to properly select a subset of structural features that can describe the slowest dynamics of these functional conformational changes. To address this challenge, we recommend several automatic feature selection methods such as Spectral-OASIS. To identify states in MSMs, the chosen features can be subject to dimensionality reduction methods such as TICA or deep learning-based VAMPNets to project MD conformations onto a few collective variables for subsequent clustering. Another challenge for the application of MSMs to the study of functional conformational changes is the ability to comprehend their biophysical mechanisms, as MSMs built for these processes often require a large number of states. We recommend the recently developed quasi-MSMs (qMSMs) to address this issue. Compared to MSMs, qMSMs encode the non-Markovian dynamics via the generalized master equation and can significantly reduce the number of states. As a result, qMSMs can be built with a handful of states to facilitate the interpretation of functional conformational changes. In the wake of machine learning, we believe that the rapid advancement in the MSM methodology will lead to their wider application in studying functional conformational changes of biomolecules.
Affiliation(s)
- Kirill A. Konovalov
  - Department of Chemistry, State Key Laboratory of Molecular Neuroscience, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
  - Hong Kong Center for Neurodegenerative Diseases, Hong Kong Science Park, Hong Kong
- Ilona Christy Unarta
  - Department of Chemical and Biological Engineering, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
  - Hong Kong Center for Neurodegenerative Diseases, Hong Kong Science Park, Hong Kong
- Siqin Cao
  - Department of Chemistry, State Key Laboratory of Molecular Neuroscience, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
  - Hong Kong Center for Neurodegenerative Diseases, Hong Kong Science Park, Hong Kong
- Eshani C. Goonetilleke
  - Department of Chemistry, State Key Laboratory of Molecular Neuroscience, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
  - Hong Kong Center for Neurodegenerative Diseases, Hong Kong Science Park, Hong Kong
- Xuhui Huang
  - Department of Chemistry, State Key Laboratory of Molecular Neuroscience, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
  - Department of Chemical and Biological Engineering, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
  - Hong Kong Center for Neurodegenerative Diseases, Hong Kong Science Park, Hong Kong
10
Jiang H, Fan X. The Two-Step Clustering Approach for Metastable States Learning. Int J Mol Sci 2021; 22:6576. [PMID: 34205252 PMCID: PMC8233889 DOI: 10.3390/ijms22126576] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/31/2021] [Revised: 06/14/2021] [Accepted: 06/14/2021] [Indexed: 01/20/2023]
Abstract
Understanding the energy landscape and the conformational dynamics is crucial for studying many biological or chemical processes, such as protein-protein interaction and RNA folding. Molecular Dynamics (MD) simulations have been a major source of dynamic structure. Although many methods have been proposed for learning metastable states from MD data, some key problems still need further investigation. Here, we give a brief review of recent progress in this field, with an emphasis on some popular methods belonging to a two-step clustering framework, and hope to draw more researchers to contribute to this area.
Affiliation(s)
- Hangjin Jiang
  - Center for Data Science, Zhejiang University, Hangzhou 310058, China
- Xiaodan Fan
  - Department of Statistics, The Chinese University of Hong Kong, Hong Kong, China
11
Cao X, Tian P. "Dividing and Conquering" and "Caching" in Molecular Modeling. Int J Mol Sci 2021; 22:5053. [PMID: 34068835 PMCID: PMC8126232 DOI: 10.3390/ijms22095053] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/03/2021] [Revised: 04/26/2021] [Accepted: 04/27/2021] [Indexed: 11/17/2022]
Abstract
Molecular modeling is widely utilized in subjects including but not limited to physics, chemistry, biology, materials science and engineering. Impressive progress has been made in development of theories, algorithms and software packages. To divide and conquer, and to cache intermediate results have been long standing principles in development of algorithms. Not surprisingly, most important methodological advancements in more than half century of molecular modeling are various implementations of these two fundamental principles. In the mainstream classical computational molecular science, tremendous efforts have been invested on two lines of algorithm development. The first is coarse graining, which is to represent multiple basic particles in higher resolution modeling as a single larger and softer particle in lower resolution counterpart, with resulting force fields of partial transferability at the expense of some information loss. The second is enhanced sampling, which realizes "dividing and conquering" and/or "caching" in configurational space with focus either on reaction coordinates and collective variables as in metadynamics and related algorithms, or on the transition matrix and state discretization as in Markov state models. For this line of algorithms, spatial resolution is maintained but results are not transferable. Deep learning has been utilized to realize more efficient and accurate ways of "dividing and conquering" and "caching" along these two lines of algorithmic research. We proposed and demonstrated the local free energy landscape approach, a new framework for classical computational molecular science. This framework is based on a third class of algorithm that facilitates molecular modeling through partially transferable in resolution "caching" of distributions for local clusters of molecular degrees of freedom. 
Differences, connections and potential interactions among these three algorithmic directions are discussed, with the hope to stimulate development of more elegant, efficient and reliable formulations and algorithms for "dividing and conquering" and "caching" in complex molecular systems.
Affiliation(s)
- Xiaoyong Cao
  - School of Life Sciences, Jilin University, Changchun 130012, China
- Pu Tian
  - School of Life Sciences, Jilin University, Changchun 130012, China
  - School of Artificial Intelligence, Jilin University, Changchun 130012, China
12
|
Affiliation(s)
- Francesco Cocina
- Biochemistry Department, University of Zurich, Zurich CH-8057, Switzerland
- Andreas Vitalis
- Biochemistry Department, University of Zurich, Zurich CH-8057, Switzerland
- Amedeo Caflisch
- Biochemistry Department, University of Zurich, Zurich CH-8057, Switzerland
13
Ligand-bound glutamine binding protein assumes multiple metastable binding sites with different binding affinities. Commun Biol 2020; 3:419. [PMID: 32747735 PMCID: PMC7400645 DOI: 10.1038/s42003-020-01149-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 03/09/2020] [Accepted: 07/14/2020] [Indexed: 11/08/2022] Open
Abstract
Protein dynamics plays key roles in ligand binding. However, the microscopic description of conformational dynamics-coupled ligand binding remains a challenge. In this study, we integrate molecular dynamics simulations, Markov state model (MSM) analysis and experimental methods to characterize the conformational dynamics of ligand-bound glutamine binding protein (GlnBP). We show that ligand-bound GlnBP has high conformational flexibility and additional metastable binding sites, presenting a more complex energy landscape than the scenario in the absence of ligand. The diverse conformations of GlnBP demonstrate different binding affinities and entail complex transition kinetics, implicating a concerted ligand binding mechanism. Single molecule fluorescence resonance energy transfer measurements and mutagenesis experiments are performed to validate our MSM-derived structure ensemble as well as the binding mechanism. Collectively, our study provides deeper insights into the protein dynamics-coupled ligand binding, revealing an intricate regulatory network underlying the apparent binding affinity. Zhang, Wu, Feng et al. show that ligand-bound glutamine binding protein assumes multiple metastable binding sites, presenting a more dynamic energy landscape than its ligand-free form. This study provides insights into the ligand-binding mechanisms coupled with protein dynamics that underlie the apparent binding affinity.
14
Noé F. Machine Learning for Molecular Dynamics on Long Timescales. MACHINE LEARNING MEETS QUANTUM PHYSICS 2020. [DOI: 10.1007/978-3-030-40245-7_16] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Indexed: 12/21/2022]
15
Liu H, Zhong H, Liu X, Zhou S, Tan S, Liu H, Yao X. Disclosing the Mechanism of Spontaneous Aggregation and Template-Induced Misfolding of the Key Hexapeptide (PHF6) of Tau Protein Based on Molecular Dynamics Simulation. ACS Chem Neurosci 2019; 10:4810-4823. [PMID: 31661961 DOI: 10.1021/acschemneuro.9b00488] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Indexed: 02/06/2023] Open
Abstract
The microtubule-associated protein tau is critical for the development and maintenance of the nervous system. Tau dysfunction is associated with a variety of neurodegenerative diseases called tauopathies, which are characterized by neurofibrillary tangles formed by abnormally aggregated tau protein. Studying the aggregation mechanism of tau protein is of great significance for elucidating the etiology of tauopathies. The hexapeptide 306VQIVYK311 (PHF6) of R3 has been shown to play a vital role in promoting tau aggregation. In this study, long-term all-atom molecular dynamics simulations in explicit solvent were performed to investigate the mechanisms of spontaneous aggregation and template-induced misfolding of PHF6, and the dimerization at the early stage of nucleation was further specifically analyzed by the Markov state model (MSM). Our results show that PHF6 can spontaneously aggregate to form multimers enriched with β-sheet structure and the β-sheets in multimers prefer to exist in a parallel way. It is observed that PHF6 monomer can be induced to form a β-sheet structure on either side of the template but in a different way. In detail, the β-sheet structure is easier to form on the left side but does not extend well, but on the right side, the monomer can form the extended β-sheet structure. Furthermore, MSM analysis shows that the formation of dimer mainly occurs in three steps. First, the separated monomers collide with each other at random orientations, and then a dimer with short β-sheet structure at the N-terminal forms; finally, β-sheets elongate to form an extended parallel β-sheet dimer. During these processes, multiple intermediate states are identified and multiple paths can form a parallel β-sheet dimer from the disordered coil structure. Moreover, the residues I308, V309, and Y310 play an essential role in the dimerization. 
In a word, our results uncover the aggregation and misfolding mechanism of PHF6 from the atomic level, which can provide useful theoretical guidance for rational design of effective therapeutic drugs against tauopathies.
Affiliation(s)
- Shuangyan Zhou
- Chongqing Key Laboratory on Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
- Xiaojun Yao
- State Key Laboratory of Quality Research in Chinese Medicine, Macau Institute for Applied Research in Medicine and Health, Macau University of Science and Technology, Taipa, Macau 999078, China
16
Affiliation(s)
- Frank Noé
- Department of Mathematics and Computer Science, Freie Universität Berlin, Berlin, Germany
- Department of Physics, Freie Universität Berlin, Berlin, Germany
- Edina Rosta
- Department of Chemistry, King's College London, London, England
17
Kells A, Mihálka ZÉ, Annibale A, Rosta E. Mean first passage times in variational coarse graining using Markov state models. J Chem Phys 2019; 150:134107. [DOI: 10.1063/1.5083924] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Indexed: 12/31/2022] Open
Affiliation(s)
- Adam Kells
- Department of Chemistry, King's College London, London, England
- Zsuzsanna É. Mihálka
- Laboratory of Theoretical Chemistry, ELTE Eötvös Loránd University, Budapest, Hungary
- Alessia Annibale
- Department of Mathematics, King's College London, London, England
- Edina Rosta
- Department of Chemistry, King's College London, London, England
18
Bacci M, Caflisch A, Vitalis A. On the removal of initial state bias from simulation data. J Chem Phys 2019; 150:104105. [PMID: 30876362 DOI: 10.1063/1.5063556] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 12/30/2022] Open
Abstract
Classical atomistic simulations of biomolecules play an increasingly important role in molecular life science. The structure of current computing architectures favors methods that run multiple trajectories at once without requiring extensive communication between them. Many advanced sampling strategies in the field fit this mold. These approaches often rely on an adaptive logic and create ensembles of comparatively short trajectories whose starting points are not distributed according to the correct Boltzmann weights. This type of bias is notoriously difficult to remove, and Markov state models (MSMs) are one of the few strategies available for recovering the correct kinetics and thermodynamics from these ensembles of trajectories. In this contribution, we analyze the performance of MSMs in the thermodynamic reweighting task for a hierarchical set of systems. We show that MSMs can be rigorous tools to recover the correct equilibrium distribution for systems of sufficiently low dimensionality. This is conditional upon not tampering with local flux imbalances found in the data. For a real-world application, we find that a pure likelihood-based inference of the transition matrix produces the best results. The removal of the bias is incomplete, however, and for this system, all tested MSMs are outperformed by an alternative albeit less general approach rooted in the ideas of statistical resampling. We conclude by formulating some recommendations for how to address the reweighting issue in practice.
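As a rough illustration of the reweighting task this abstract describes, the sketch below estimates a transition matrix by row-normalizing transition counts (the plain maximum-likelihood estimate with no detailed-balance constraint) and then obtains the stationary distribution by power iteration. The count matrix and the function names are hypothetical and are not taken from the paper.

```python
def mle_transition_matrix(counts):
    """Row-normalize a square count matrix (MLE without detailed balance)."""
    matrix = []
    for row in counts:
        total = sum(row)
        if total == 0:
            raise ValueError("state with no outgoing transitions")
        matrix.append([c / total for c in row])
    return matrix


def stationary_distribution(T, n_iter=10000, tol=1e-12):
    """Power iteration: repeatedly apply pi <- pi T until pi stops changing."""
    n = len(T)
    pi = [1.0 / n] * n
    for _ in range(n_iter):
        new = [sum(pi[i] * T[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(new, pi)) < tol:
            return new
        pi = new
    return pi


# Hypothetical transition counts among three states (illustrative only).
counts = [[90, 10, 0],
          [5, 80, 15],
          [0, 30, 70]]
T = mle_transition_matrix(counts)
pi = stationary_distribution(T)
```

The stationary vector `pi` plays the role of the reweighted equilibrium populations; whether it is accurate depends on the sampling issues the abstract discusses.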
Affiliation(s)
- Marco Bacci
- University of Zurich, Department of Biochemistry, Winterthurerstrasse 190, CH-8057 Zurich, Switzerland
- Amedeo Caflisch
- University of Zurich, Department of Biochemistry, Winterthurerstrasse 190, CH-8057 Zurich, Switzerland
- Andreas Vitalis
- University of Zurich, Department of Biochemistry, Winterthurerstrasse 190, CH-8057 Zurich, Switzerland
19
Wang W, Liang T, Sheong FK, Fan X, Huang X. An efficient Bayesian kinetic lumping algorithm to identify metastable conformational states via Gibbs sampling. J Chem Phys 2018; 149:072337. [PMID: 30134698 DOI: 10.1063/1.5027001] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Indexed: 01/29/2023] Open
Abstract
Markov State Model (MSM) has become a popular approach to study the conformational dynamics of complex biological systems in recent years. Built upon a large number of short molecular dynamics simulation trajectories, MSM is able to predict the long time scale dynamics of complex systems. However, to achieve Markovianity, an MSM often contains hundreds or thousands of states (microstates), hindering human interpretation of the underlying system mechanism. One way to reduce the number of states is to lump kinetically similar states together and thus coarse-grain the microstates into macrostates. In this work, we introduce a probabilistic lumping algorithm, the Gibbs lumping algorithm, to assign a probability to any given kinetic lumping using the Bayesian inference. In our algorithm, the transitions among kinetically distinct macrostates are modeled by Poisson processes, which will well reflect the separation of time scales in the underlying free energy landscape of biomolecules. Furthermore, to facilitate the search for the optimal kinetic lumping (i.e., the lumped model with the highest probability), a Gibbs sampling algorithm is introduced. To demonstrate the power of our new method, we apply it to three systems: a 2D potential, alanine dipeptide, and a WW protein domain. In comparison with six other popular lumping algorithms, we show that our method can persistently produce the lumped macrostate model with the highest probability as well as the largest metastability. We anticipate that our Gibbs lumping algorithm holds great promise to be widely applied to investigate conformational changes in biological macromolecules.
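The sketch below is not the Gibbs lumping algorithm itself. It only shows the bookkeeping that any kinetic lumping shares: microstate transition counts are summed into macrostate counts under a given assignment, and the result is scored by metastability, taken here as the sum of macrostate self-transition probabilities. The count matrix, the two candidate assignments, and the function names are hypothetical.

```python
def lump_counts(counts, assignment, n_macro):
    """Sum microstate transition counts into macrostate transition counts."""
    macro = [[0.0] * n_macro for _ in range(n_macro)]
    for i, row in enumerate(counts):
        for j, c in enumerate(row):
            macro[assignment[i]][assignment[j]] += c
    return macro


def metastability(macro_counts):
    """Sum of self-transition probabilities of the lumped model."""
    score = 0.0
    for a, row in enumerate(macro_counts):
        total = sum(row)
        if total > 0:
            score += row[a] / total
    return score


# Hypothetical counts: microstates 0,1 interconvert quickly, as do 2,3.
counts = [[80, 15, 4, 1],
          [20, 75, 3, 2],
          [2, 3, 70, 25],
          [1, 4, 30, 65]]

good = metastability(lump_counts(counts, [0, 0, 1, 1], 2))  # kinetically sensible
bad = metastability(lump_counts(counts, [0, 1, 0, 1], 2))   # mixes the two basins
```

A lumping that groups rapidly interconverting microstates gives the higher score, which is the sense in which the abstract compares lumped models by metastability.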
Affiliation(s)
- Wei Wang
- HKUST-Shenzhen Research Institute, Hi-Tech Park, Nanshan, Shenzhen 518057, China
- Tong Liang
- Department of Statistics, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong
- Fu Kit Sheong
- Department of Chemistry, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
- Xiaodan Fan
- Department of Statistics, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong
- Xuhui Huang
- HKUST-Shenzhen Research Institute, Hi-Tech Park, Nanshan, Shenzhen 518057, China
20
Peng JH, Wang W, Yu YQ, Gu HL, Huang X. Clustering algorithms to analyze molecular dynamics simulation trajectories for complex chemical and biological systems. CHINESE J CHEM PHYS 2018. [DOI: 10.1063/1674-0068/31/cjcp1806147] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Indexed: 12/20/2022]
Affiliation(s)
- Jun-hui Peng
- HKUST-Shenzhen Research Institute, Hi-Tech Park, Nanshan, Shenzhen 518057, China
- Department of Chemistry, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
- Wei Wang
- HKUST-Shenzhen Research Institute, Hi-Tech Park, Nanshan, Shenzhen 518057, China
- Department of Chemistry, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
- Ye-qing Yu
- HKUST-Shenzhen Research Institute, Hi-Tech Park, Nanshan, Shenzhen 518057, China
- Department of Chemistry, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
- Han-lin Gu
- Department of Mathematics, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
- Xuhui Huang
- HKUST-Shenzhen Research Institute, Hi-Tech Park, Nanshan, Shenzhen 518057, China
- Department of Chemistry, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
- Center of Systems Biology and Human Health, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
- State Key Laboratory of Molecular Neuroscience, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
21
Litzinger F, Boninsegna L, Wu H, Nüske F, Patel R, Baraniuk R, Noé F, Clementi C. Rapid Calculation of Molecular Kinetics Using Compressed Sensing. J Chem Theory Comput 2018; 14:2771-2783. [DOI: 10.1021/acs.jctc.8b00089] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Indexed: 11/30/2022]
Affiliation(s)
- Florian Litzinger
- Freie Universität Berlin, Department of Mathematics and Computer Science, Arnimallee 6, 14195 Berlin, Germany
- Lorenzo Boninsegna
- Rice University, Center for Theoretical Biological Physics and Department of Chemistry, Houston, Texas 77005, United States
- Hao Wu
- Freie Universität Berlin, Department of Mathematics and Computer Science, Arnimallee 6, 14195 Berlin, Germany
- Feliks Nüske
- Rice University, Center for Theoretical Biological Physics and Department of Chemistry, Houston, Texas 77005, United States
- Raajen Patel
- Rice University, Department of Electrical and Computer Engineering, Houston, Texas 77005, United States
- Richard Baraniuk
- Rice University, Department of Electrical and Computer Engineering, Houston, Texas 77005, United States
- Frank Noé
- Freie Universität Berlin, Department of Mathematics and Computer Science, Arnimallee 6, 14195 Berlin, Germany
- Rice University, Center for Theoretical Biological Physics and Department of Chemistry, Houston, Texas 77005, United States
- Cecilia Clementi
- Rice University, Center for Theoretical Biological Physics and Department of Chemistry, Houston, Texas 77005, United States
22
Optimal Data-Driven Estimation of Generalized Markov State Models for Non-Equilibrium Dynamics. COMPUTATION 2018. [DOI: 10.3390/computation6010022] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Indexed: 01/01/2023]
23
Zhu L, Sheong FK, Zeng X, Huang X. Elucidation of the conformational dynamics of multi-body systems by construction of Markov state models. Phys Chem Chem Phys 2016; 18:30228-30235. [PMID: 27314275 DOI: 10.1039/c6cp02545e] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Indexed: 12/29/2022]
Abstract
Constructing Markov State Models (MSMs) based on short molecular dynamics simulations is a powerful computational technique to complement experiments in predicting long-time kinetics of biomolecular processes at atomic resolution. Even though the MSM approach has been widely applied to study one-body processes such as protein folding and enzyme conformational changes, the majority of biological processes, e.g. protein-ligand recognition, signal transduction, and protein aggregation, essentially involve multiple entities. Here we review the attempts at constructing MSMs for multi-body systems, point out the challenges therein and discuss recent algorithmic progresses that alleviate these challenges. In particular, we describe an automatic kinetics based partitioning method that achieves optimal definition of the conformational states in a multi-body system, and discuss a novel maximum-likelihood approach that efficiently estimates the slow uphill kinetics utilizing pre-computed equilibrium populations of all states. We expect that these new algorithms and their combinations may boost investigations of important multi-body biological processes via the efficient construction of MSMs.
Affiliation(s)
- Lizhe Zhu
- Department of Chemistry, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
- Centre of Systems Biology and Human Health, School of Science and Institute for Advanced Study, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
- Fu Kit Sheong
- Department of Chemistry, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
- Xiangze Zeng
- Department of Chemistry, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
- Centre of Systems Biology and Human Health, School of Science and Institute for Advanced Study, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
- Xuhui Huang
- Department of Chemistry, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
- Centre of Systems Biology and Human Health, School of Science and Institute for Advanced Study, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
24
Affiliation(s)
- Brooke E. Husic
- Department of Chemistry, Stanford University, Stanford, California 94305, United States
- Vijay S. Pande
- Department of Chemistry, Stanford University, Stanford, California 94305, United States
25
Husic BE, McKiernan KA, Wayment-Steele HK, Sultan MM, Pande VS. A Minimum Variance Clustering Approach Produces Robust and Interpretable Coarse-Grained Models. J Chem Theory Comput 2018; 14:1071-1082. [PMID: 29253336 DOI: 10.1021/acs.jctc.7b01004] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Indexed: 11/30/2022]
Abstract
Markov state models (MSMs) are a powerful framework for the analysis of molecular dynamics data sets, such as protein folding simulations, because of their straightforward construction and statistical rigor. The coarse-graining of MSMs into an interpretable number of macrostates is a crucial step for connecting theoretical results with experimental observables. Here we present the minimum variance clustering approach (MVCA) for the coarse-graining of MSMs into macrostate models. The method utilizes agglomerative clustering with Ward's minimum variance objective function, and the similarity of the microstate dynamics is determined using the Jensen-Shannon divergence between the corresponding rows in the MSM transition probability matrix. We first show that MVCA produces intuitive results for a simple tripeptide system and is robust toward long-duration statistical artifacts. MVCA is then applied to two protein folding simulations of the same protein in different force fields to demonstrate that a different number of macrostates is appropriate for each model, revealing a misfolded state present in only one of the simulations. Finally, we show that the same method can be used to analyze a data set containing many MSMs from simulations in different force fields by aggregating them into groups and quantifying their dynamical similarity in the context of force field parameter choices. The minimum variance clustering approach with the Jensen-Shannon divergence provides a powerful tool to group dynamics by similarity, both among model states and among dynamical models themselves.
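The two ingredients named in this abstract, the Jensen-Shannon divergence between transition-matrix rows and Ward's agglomerative merging, can be sketched in a few lines. This is a minimal pure-Python illustration and not the authors' implementation. It assumes that the divergence can be used directly as the squared dissimilarity in the Lance-Williams update for Ward's method, and the 4-state transition matrix and function names are hypothetical.

```python
import math


def js_divergence(p, q):
    """Jensen-Shannon divergence (natural log) between two discrete distributions."""
    def kl(a, b):
        return sum(x * math.log(x / y) for x, y in zip(a, b) if x > 0)
    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def ward_lump(T, n_macro):
    """Merge rows of T agglomeratively with Ward's Lance-Williams update.

    Assumption: the JS divergence is treated as a squared dissimilarity.
    """
    if n_macro < 1:
        raise ValueError("n_macro must be at least 1")
    n = len(T)
    clusters = {i: [i] for i in range(n)}
    d2 = {(i, j): js_divergence(T[i], T[j])
          for i in range(n) for j in range(i + 1, n)}
    next_id = n
    while len(clusters) > n_macro:
        a, b = min(d2, key=d2.get)  # closest pair, stored with a < b
        na, nb = len(clusters[a]), len(clusters[b])
        updated = {}
        for k in clusters:
            if k in (a, b):
                continue
            nk = len(clusters[k])
            dak = d2[(min(a, k), max(a, k))]
            dbk = d2[(min(b, k), max(b, k))]
            updated[k] = ((na + nk) * dak + (nb + nk) * dbk
                          - nk * d2[(a, b)]) / (na + nb + nk)
        merged = clusters.pop(a) + clusters.pop(b)
        d2 = {pair: v for pair, v in d2.items()
              if a not in pair and b not in pair}
        for k, v in updated.items():
            d2[(k, next_id)] = v  # next_id is larger than every existing id
        clusters[next_id] = merged
        next_id += 1
    return [sorted(members) for members in clusters.values()]


# Hypothetical 4-microstate transition matrix with two kinetic basins.
T = [[0.80, 0.15, 0.04, 0.01],
     [0.20, 0.75, 0.03, 0.02],
     [0.02, 0.03, 0.70, 0.25],
     [0.01, 0.04, 0.30, 0.65]]
macrostates = ward_lump(T, n_macro=2)
```

For real data one would more likely rely on an established hierarchical clustering library than on this quadratic-memory loop; the point here is only how row-wise divergences drive the merges.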
Affiliation(s)
- Brooke E Husic
- Department of Chemistry, Stanford University, Stanford, California 94305, United States
- Keri A McKiernan
- Department of Chemistry, Stanford University, Stanford, California 94305, United States
- Mohammad M Sultan
- Department of Chemistry, Stanford University, Stanford, California 94305, United States
- Vijay S Pande
- Department of Chemistry, Stanford University, Stanford, California 94305, United States
26
Mardt A, Pasquali L, Wu H, Noé F. VAMPnets for deep learning of molecular kinetics. Nat Commun 2018; 9:5. [PMID: 29295994 PMCID: PMC5750224 DOI: 10.1038/s41467-017-02388-1] [Citation(s) in RCA: 232] [Impact Index Per Article: 38.7] [Received: 07/14/2017] [Accepted: 11/22/2017] [Indexed: 12/15/2022] Open
Abstract
There is an increasing demand for computing the relevant structures, equilibria, and long-timescale kinetics of biomolecular processes, such as protein-drug binding, from high-throughput molecular dynamics simulations. Current methods employ transformation of simulated coordinates into structural features, dimension reduction, clustering the dimension-reduced data, and estimation of a Markov state model or related model of the interconversion rates between molecular structures. This handcrafted approach demands a substantial amount of modeling expertise, as poor decisions at any step will lead to large modeling errors. Here we employ the variational approach for Markov processes (VAMP) to develop a deep learning framework for molecular kinetics using neural networks, dubbed VAMPnets. A VAMPnet encodes the entire mapping from molecular coordinates to Markov states, thus combining the whole data processing pipeline in a single end-to-end framework. Our method performs equally or better than state-of-the-art Markov modeling methods and provides easily interpretable few-state kinetic models. Extracting kinetic models from high-throughput molecular dynamics (MD) simulations is laborious and prone to human error. Here the authors introduce a deep learning framework that automates construction of Markov state models from MD simulation data.
Affiliation(s)
- Andreas Mardt
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 6, 14195, Berlin, Germany
- Luca Pasquali
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 6, 14195, Berlin, Germany
- Hao Wu
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 6, 14195, Berlin, Germany
- Frank Noé
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 6, 14195, Berlin, Germany
27
Wang W, Cao S, Zhu L, Huang X. Constructing Markov State Models to elucidate the functional conformational changes of complex biomolecules. WILEY INTERDISCIPLINARY REVIEWS-COMPUTATIONAL MOLECULAR SCIENCE 2017. [DOI: 10.1002/wcms.1343] [Citation(s) in RCA: 52] [Impact Index Per Article: 7.4] [Indexed: 01/08/2023]
Affiliation(s)
- Wei Wang
- Department of Chemistry, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
- Center of Systems Biology and Human Health, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
- Siqin Cao
- Department of Chemistry, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
- Lizhe Zhu
- Department of Chemistry, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
- Center of Systems Biology and Human Health, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
- Xuhui Huang
- Department of Chemistry, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
- Center of Systems Biology and Human Health, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
- Hong Kong Branch of Chinese National Engineering Research Center for Tissue Restoration & Reconstruction, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
- HKUST-Shenzhen Research Institute, Shenzhen, China
28
Meng L, Sheong FK, Zeng X, Zhu L, Huang X. Path lumping: An efficient algorithm to identify metastable path channels for conformational dynamics of multi-body systems. J Chem Phys 2017; 147:044112. [DOI: 10.1063/1.4995558] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Indexed: 02/07/2023] Open
Affiliation(s)
- Luming Meng
- Department of Chemistry, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
- Fu Kit Sheong
- Department of Chemistry, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
- Xiangze Zeng
- Department of Chemistry, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
- Lizhe Zhu
- Department of Chemistry, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
- Center of Systems Biology and Human Health, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
- Xuhui Huang
- Department of Chemistry, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
- Center of Systems Biology and Human Health, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
- Hong Kong Branch of Chinese National Engineering Research Center for Tissue Restoration and Reconstruction, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
- HKUST-Shenzhen Research Institute, Hi-Tech Park, Nanshan, Shenzhen 518057, China
29
Elucidating Mechanisms of Molecular Recognition Between Human Argonaute and miRNA Using Computational Approaches. Methods Mol Biol 2016. [PMID: 27924488 DOI: 10.1007/978-1-4939-6563-2_18] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 06/05/2024]
Abstract
MicroRNA (miRNA) and Argonaute (AGO) protein together form the RNA-induced silencing complex (RISC) that plays an essential role in the regulation of gene expression. Elucidating the underlying mechanism of AGO-miRNA recognition is thus of great importance not only for the in-depth understanding of miRNA function but also for inspiring new drugs targeting miRNAs. In this chapter we introduce a combined computational approach of molecular dynamics (MD) simulations, Markov state models (MSMs), and protein-RNA docking to investigate AGO-miRNA recognition. Constructed from MD simulations, MSMs can elucidate the conformational dynamics of AGO at biologically relevant timescales. Protein-RNA docking can then efficiently identify the AGO conformations that are geometrically accessible to miRNA. Using our recent work on human AGO2 as an example, we explain the rationale and the workflow of our method in details. This combined approach holds great promise to complement experiments in unraveling the mechanisms of molecular recognition between large, flexible, and complex biomolecules.
30
Liu S, Zhu L, Sheong FK, Wang W, Huang X. Adaptive partitioning by local density-peaks: An efficient density-based clustering algorithm for analyzing molecular dynamics trajectories. J Comput Chem 2016; 38:152-160. [PMID: 27868222 DOI: 10.1002/jcc.24664] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Received: 07/16/2016] [Revised: 10/09/2016] [Accepted: 10/26/2016] [Indexed: 12/11/2022]
Abstract
We present an efficient density-based adaptive-resolution clustering method APLoD for analyzing large-scale molecular dynamics (MD) trajectories. APLoD performs the k-nearest-neighbors search to estimate the density of MD conformations in a local fashion, which can group MD conformations in the same high-density region into a cluster. APLoD greatly improves the popular density peaks algorithm by reducing the running time and the memory usage by 2-3 orders of magnitude for systems ranging from alanine dipeptide to a 370-residue Maltose-binding protein. In addition, we demonstrate that APLoD can produce clusters with various sizes that are adaptive to the underlying density (i.e., larger clusters at low-density regions, while smaller clusters at high-density regions), which is a clear advantage over other popular clustering algorithms including k-centers and k-medoids. We anticipate that APLoD can be widely applied to split ultra-large MD datasets containing millions of conformations for subsequent construction of Markov State Models. © 2016 Wiley Periodicals, Inc.
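The local density estimate mentioned in this abstract can be illustrated with a brute-force k-nearest-neighbors sketch. This is not APLoD: it omits the clustering step and the speed-ups the paper reports, and it assumes the common estimator in which density is proportional to k divided by the volume of the ball reaching the k-th neighbor. The sample points and function name are hypothetical.

```python
import math


def knn_density(points, k):
    """Estimate the density at each point as k / (n * volume of k-th neighbor ball)."""
    n = len(points)
    dim = len(points[0])
    # Volume of the unit ball in `dim` dimensions.
    unit_ball = math.pi ** (dim / 2) / math.gamma(dim / 2 + 1)
    densities = []
    for i, p in enumerate(points):
        # Brute-force distances to all other points, O(n^2) overall.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        r_k = dists[k - 1]  # assumes no duplicate points, so r_k > 0
        densities.append(k / (n * unit_ball * r_k ** dim))
    return densities


# Hypothetical 2D "conformations": four tightly packed points and one outlier.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
density = knn_density(points, k=2)
```

Points in the packed group receive a much higher density than the outlier, which is the signal a density-based method uses to place cluster centers.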
Affiliation(s)
- Song Liu
- Department of Chemistry, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
- Lizhe Zhu
- Department of Chemistry, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
- Center of Systems Biology and Human Health, State Key Laboratory of Molecular Neuroscience, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
- Fu Kit Sheong
- Department of Chemistry, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
- Wei Wang
- Department of Chemistry, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
- Center of Systems Biology and Human Health, State Key Laboratory of Molecular Neuroscience, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
- Xuhui Huang
- Department of Chemistry, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
- Center of Systems Biology and Human Health, State Key Laboratory of Molecular Neuroscience, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
31
Koltai P, Ciccotti G, Schütte C. On metastability and Markov state models for non-stationary molecular dynamics. J Chem Phys 2016; 145:174103. [DOI: 10.1063/1.4966157] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Indexed: 01/17/2023] Open
32
Sittel F, Stock G. Robust Density-Based Clustering To Identify Metastable Conformational States of Proteins. J Chem Theory Comput 2016; 12:2426-35. [PMID: 27058020 DOI: 10.1021/acs.jctc.5b01233] [Citation(s) in RCA: 56] [Impact Index Per Article: 7.0] [Indexed: 12/20/2022]
Abstract
A density-based clustering method is proposed that is deterministic, computationally efficient, and self-consistent in its parameter choice. By calculating a geometric coordinate space density for every point of a given data set, a local free energy is defined. On the basis of these free energy estimates, the frames are lumped into local free energy minima, ultimately forming microstates separated by local free energy barriers. The algorithm is embedded into a complete workflow to robustly generate Markov state models from molecular dynamics trajectories. It consists of (i) preprocessing of the data via principal component analysis in order to reduce the dimensionality of the problem, (ii) proposed density-based clustering to generate microstates, and (iii) dynamical clustering via the most probable path algorithm to construct metastable states. To characterize the resulting state-resolved conformational distribution, dihedral angle content color plots are introduced which identify structural differences of protein states in a concise way. To illustrate the performance of the method, three well-established model problems are adopted: conformational transitions of hepta-alanine, folding of villin headpiece, and functional dynamics of bovine pancreatic trypsin inhibitor.
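The first step this abstract describes, turning a local point density into a local free energy, can be sketched as follows. The specific choices here are assumptions for illustration: density is the number of frames within a fixed radius, and the free energy in units of kT is the negative log of that count relative to the maximum. The sample frames, the radius, and the function name are hypothetical.

```python
import math


def local_free_energy(points, radius):
    """Return free energies (in units of kT) from neighbor counts within `radius`."""
    counts = []
    for p in points:
        # Each point counts itself, so every count is at least 1.
        counts.append(sum(1 for q in points if math.dist(p, q) <= radius))
    max_count = max(counts)
    return [-math.log(c / max_count) for c in counts]


# Hypothetical frames in a 2D reduced space: one dense basin and one outlier.
frames = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (2.0, 2.0)]
free_energy = local_free_energy(frames, radius=0.2)
```

Frames in the densest region get a free energy of zero and sparser frames get positive values, which gives the landscape on which frames can then be lumped into local minima.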
Affiliation(s)
- Florian Sittel
  - Biomolecular Dynamics, Institute of Physics, Albert Ludwigs University, 79104 Freiburg, Germany
- Gerhard Stock
  - Biomolecular Dynamics, Institute of Physics, Albert Ludwigs University, 79104 Freiburg, Germany
|
33
|
Zhang L, Pardo-Avila F, Unarta IC, Cheung PPH, Wang G, Wang D, Huang X. Elucidation of the Dynamics of Transcription Elongation by RNA Polymerase II using Kinetic Network Models. Acc Chem Res 2016; 49:687-94. [PMID: 26991064 DOI: 10.1021/acs.accounts.5b00536] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.9] [Indexed: 01/22/2023]
Abstract
RNA polymerase II (Pol II) is an essential enzyme that catalyzes transcription with high efficiency and fidelity in eukaryotic cells. During transcription elongation, Pol II catalyzes the nucleotide addition cycle (NAC) to synthesize mRNA using DNA as the template. The transitions between the states of the NAC require conformational changes of both the protein and nucleotides. Although X-ray structures are available for most of these states, the dynamics of the transitions between states are largely unknown. Molecular dynamics (MD) simulations can predict structure-based molecular details and shed light on the mechanisms of these dynamic transitions. However, the employment of MD simulations on a macromolecule (tens to hundreds of nanoseconds) such as Pol II is challenging due to the difficulty of reaching biologically relevant timescales (tens of microseconds or even longer). For this challenge to be overcome, kinetic network models (KNMs), such as Markov State Models (MSMs), have become a popular approach to access long-timescale conformational changes using many short MD simulations. We describe here our application of KNMs to characterize the molecular mechanisms of the NAC of Pol II. First, we introduce the general background of MSMs and further explain procedures for the construction and validation of MSMs by providing some technical details. Next, we review our previous studies in which we applied MSMs to investigate the individual steps of the NAC, including translocation and pyrophosphate ion release. In particular, we describe in detail how we prepared the initial conformations of Pol II elongation complex, performed MD simulations, extracted MD conformations to construct MSMs, and further validated them. We also summarize our major findings on molecular mechanisms of Pol II elongation based on these MSMs. 
In addition, we have included discussions regarding various key points and challenges for applications of MSMs to systems as large as the Pol II elongation complex. Finally, to study the overall NAC, we combine the individual steps of the NAC into a five-state KNM based on a nonbranched Brownian ratchet scheme to explain the single-molecule optical tweezers experimental data. The studies complement experimental observations and provide molecular mechanisms for the transcription elongation cycle. In the long term, incorporation of sequence-dependent kinetic parameters into KNMs has great potential for identifying error-prone sequences and predicting transcription dynamics in genome-wide transcriptomes.
Affiliation(s)
- Lu Zhang
  - Department of Chemistry and State Key Laboratory of Molecular Neuroscience, Center for System Biology and Human Health, School of Science, and IAS, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
- Fátima Pardo-Avila
  - Department of Chemistry and State Key Laboratory of Molecular Neuroscience, Center for System Biology and Human Health, School of Science, and IAS, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
- Ilona Christy Unarta
  - Department of Chemistry and State Key Laboratory of Molecular Neuroscience, Center for System Biology and Human Health, School of Science, and IAS, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
- Peter Pak-Hang Cheung
  - Department of Chemistry and State Key Laboratory of Molecular Neuroscience, Center for System Biology and Human Health, School of Science, and IAS, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
- Guo Wang
  - Department of Chemistry and State Key Laboratory of Molecular Neuroscience, Center for System Biology and Human Health, School of Science, and IAS, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
- Dong Wang
  - Department of Cellular and Molecular Medicine, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, California 92093, United States
- Xuhui Huang
  - Department of Chemistry and State Key Laboratory of Molecular Neuroscience, Center for System Biology and Human Health, School of Science, and IAS, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
|
34
|
Shukla D, Peck A, Pande VS. Conformational heterogeneity of the calmodulin binding interface. Nat Commun 2016; 7:10910. [PMID: 27040077 PMCID: PMC4822001 DOI: 10.1038/ncomms10910] [Citation(s) in RCA: 44] [Impact Index Per Article: 5.5] [Received: 03/19/2015] [Accepted: 01/28/2016] [Indexed: 01/13/2023]
Abstract
Calmodulin (CaM) is a ubiquitous Ca²⁺ sensor and a crucial signalling hub in many pathways aberrantly activated in disease. However, the mechanistic basis of its ability to bind diverse signalling molecules including G-protein-coupled receptors, ion channels and kinases remains poorly understood. Here we harness the high resolution of molecular dynamics simulations and the analytical power of Markov state models to dissect the molecular underpinnings of CaM binding diversity. Our computational model indicates that in the absence of Ca²⁺, sub-states in the folded ensemble of CaM's C-terminal domain present chemically and sterically distinct topologies that may facilitate conformational selection. Furthermore, we find that local unfolding is off-pathway for the exchange process relevant for peptide binding, in contrast to prior hypotheses that unfolding might account for binding diversity. Finally, our model predicts a novel binding interface that is well-populated in the Ca²⁺-bound regime and, thus, a candidate for pharmacological intervention.
Affiliation(s)
- Diwakar Shukla
  - Department of Chemistry, Stanford University, Stanford, California 94305, USA
  - SIMBIOS NIH Center for Biomedical Computation, Stanford University, Stanford, California 94305, USA
  - Department of Chemical and Biomolecular Engineering, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
- Ariana Peck
  - Department of Biochemistry, Stanford University School of Medicine, Stanford, California 94305, USA
- Vijay S. Pande
  - Department of Chemistry, Stanford University, Stanford, California 94305, USA
  - SIMBIOS NIH Center for Biomedical Computation, Stanford University, Stanford, California 94305, USA
|
35
|
Zhou G, Voelz VA. Using Kinetic Network Models To Probe Non-Native Salt-Bridge Effects on α-Helix Folding. J Phys Chem B 2016; 120:926-35. [PMID: 26769494 DOI: 10.1021/acs.jpcb.5b11767] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Indexed: 02/04/2023]
Abstract
Salt-bridge interactions play an important role in stabilizing many protein structures, and have been shown to be designable features for protein design. In this work, we study the effects of non-native salt bridges on the folding of a soluble alanine-based peptide (Fs peptide) using extensive all-atom molecular dynamics simulations performed on the Folding@home distributed computing platform. Using Markov State Models, we show how non-native salt-bridges affect the folding kinetics of Fs peptide by perturbing specific conformational states. Furthermore, we present methods for the automatic detection and analysis of such states. These results provide insight into helix folding mechanisms and useful information to guide simulation-based computational protein design.
Affiliation(s)
- Guangfeng Zhou
  - Department of Chemistry, Temple University, 1901 North 13th Street, Beury Hall, Philadelphia, Pennsylvania 19122, United States
- Vincent A Voelz
  - Department of Chemistry, Temple University, 1901 North 13th Street, Beury Hall, Philadelphia, Pennsylvania 19122, United States
|
36
|
Qiao Q, Qi R, Wei G, Huang X. Dynamics of the conformational transitions during the dimerization of an intrinsically disordered peptide: a case study on the human islet amyloid polypeptide fragment. Phys Chem Chem Phys 2016; 18:29892-29904. [DOI: 10.1039/c6cp05590g] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Indexed: 12/11/2022]
Abstract
Dimerization pathways of the human islet amyloid polypeptide fragment are elucidated from extensive molecular dynamics simulations.
Affiliation(s)
- Qin Qiao
  - Hefei National Laboratory for Physical Sciences at the Microscale and Collaborative Innovation Center of Chemistry for Energy Materials (iChEM), University of Science and Technology of China, Hefei, China
- Ruxi Qi
  - State Key Laboratory of Surface Physics, Key Laboratory for Computational Physical Sciences (MOE), and Department of Physics, Fudan University, Shanghai
- Guanghong Wei
  - State Key Laboratory of Surface Physics, Key Laboratory for Computational Physical Sciences (MOE), and Department of Physics, Fudan University, Shanghai
- Xuhui Huang
  - Department of Chemistry, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
  - Division of Biomedical Engineering
|
37
|
Zhang L, Jiang H, Sheong F, Pardo-Avila F, Cheung PH, Huang X. Constructing Kinetic Network Models to Elucidate Mechanisms of Functional Conformational Changes of Enzymes and Their Recognition with Ligands. Methods Enzymol 2016; 578:343-71. [DOI: 10.1016/bs.mie.2016.05.026] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Indexed: 12/31/2022]
|
38
|
Boninsegna L, Gobbo G, Noé F, Clementi C. Investigating Molecular Kinetics by Variationally Optimized Diffusion Maps. J Chem Theory Comput 2015; 11:5947-60. [DOI: 10.1021/acs.jctc.5b00749] [Citation(s) in RCA: 41] [Impact Index Per Article: 4.6] [Indexed: 12/19/2022]
Affiliation(s)
- Lorenzo Boninsegna
  - Center for Theoretical Biological Physics and Department of Chemistry, Rice University, 6100 Main Street, Houston, Texas 77005, United States
- Gianpaolo Gobbo
  - Maxwell Institute for Mathematical Sciences and School of Mathematics, The University of Edinburgh, Peter Guthrie Tait Road, Edinburgh EH9 3FD, United Kingdom
- Frank Noé
  - Department of Mathematics, Computer Science and Bioinformatics, Freie Universität Berlin, Arnimallee 6, 14195 Berlin, Germany
- Cecilia Clementi
  - Center for Theoretical Biological Physics and Department of Chemistry, Rice University, 6100 Main Street, Houston, Texas 77005, United States
|
39
|
Scherer MK, Trendelkamp-Schroer B, Paul F, Pérez-Hernández G, Hoffmann M, Plattner N, Wehmeyer C, Prinz JH, Noé F. PyEMMA 2: A Software Package for Estimation, Validation, and Analysis of Markov Models. J Chem Theory Comput 2015; 11:5525-42. [PMID: 26574340 DOI: 10.1021/acs.jctc.5b00743] [Citation(s) in RCA: 715] [Impact Index Per Article: 79.4] [Indexed: 12/19/2022]
Abstract
Markov (state) models (MSMs) and related models of molecular kinetics have recently received a surge of interest as they can systematically reconcile simulation data from either a few long or many short simulations and allow us to analyze the essential metastable structures, thermodynamics, and kinetics of the molecular system under investigation. However, the estimation, validation, and analysis of such models is far from trivial and involves sophisticated and often numerically sensitive methods. In this work we present the open-source Python package PyEMMA ( http://pyemma.org ) that provides accurate and efficient algorithms for kinetic model construction. PyEMMA can read all common molecular dynamics data formats, helps in the selection of input features, provides easy access to dimension reduction algorithms such as principal component analysis (PCA) and time-lagged independent component analysis (TICA) and clustering algorithms such as k-means, and contains estimators for MSMs, hidden Markov models, and several other models. Systematic model validation and error calculation methods are provided. PyEMMA offers a wealth of analysis functions such that the user can conveniently compute molecular observables of interest. We have derived a systematic and accurate way to coarse-grain MSMs to few states and to illustrate the structures of the metastable states of the system. Plotting functions to produce a manuscript-ready presentation of the results are available. In this work, we demonstrate the features of the software and show new methodological concepts and results produced by PyEMMA.
Affiliation(s)
- Martin K Scherer
  - Department for Mathematics and Computer Science, Freie Universität, Arnimallee 6, Berlin 14195, Germany
- Fabian Paul
  - Department for Mathematics and Computer Science, Freie Universität, Arnimallee 6, Berlin 14195, Germany
- Guillermo Pérez-Hernández
  - Department for Mathematics and Computer Science, Freie Universität, Arnimallee 6, Berlin 14195, Germany
- Moritz Hoffmann
  - Department for Mathematics and Computer Science, Freie Universität, Arnimallee 6, Berlin 14195, Germany
- Nuria Plattner
  - Department for Mathematics and Computer Science, Freie Universität, Arnimallee 6, Berlin 14195, Germany
- Christoph Wehmeyer
  - Department for Mathematics and Computer Science, Freie Universität, Arnimallee 6, Berlin 14195, Germany
- Jan-Hendrik Prinz
  - Department for Mathematics and Computer Science, Freie Universität, Arnimallee 6, Berlin 14195, Germany
- Frank Noé
  - Department for Mathematics and Computer Science, Freie Universität, Arnimallee 6, Berlin 14195, Germany
|
40
|
Hierarchical Conformational Analysis of Native Lysozyme Based on Sub-Millisecond Molecular Dynamics Simulations. PLoS One 2015; 10:e0129846. [PMID: 26057625 PMCID: PMC4461368 DOI: 10.1371/journal.pone.0129846] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 03/12/2015] [Accepted: 05/12/2015] [Indexed: 11/19/2022]
Abstract
Hierarchical organization of free energy landscape (FEL) for native globular proteins has been widely accepted by the biophysics community. However, FEL of native proteins is usually projected onto one or a few dimensions. Here we generated collectively 0.2 milli-second molecular dynamics simulation trajectories in explicit solvent for hen egg white lysozyme (HEWL), and carried out detailed conformational analysis based on backbone torsional degrees of freedom (DOF). Our results demonstrated that at micro-second and coarser temporal resolutions, FEL of HEWL exhibits hub-like topology with crystal structures occupying the dominant structural ensemble that serves as the hub of conformational transitions. However, at 100ns and finer temporal resolutions, conformational substates of HEWL exhibit network-like topology, crystal structures are associated with kinetic traps that are important but not dominant ensembles. Backbone torsional state transitions on time scales ranging from nanoseconds to beyond microseconds were found to be associated with various types of molecular interactions. Even at nanoseconds temporal resolution, the number of conformational substates that are of statistical significance is quite limited. These observations suggest that detailed analysis of conformational substates at multiple temporal resolutions is both important and feasible. Transition state ensembles among various conformational substates at microsecond temporal resolution were observed to be considerably disordered. Life times of these transition state ensembles are found to be nearly independent of the time scales of the participating torsional DOFs.
|
41
|
Sheong FK, Silva DA, Meng L, Zhao Y, Huang X. Automatic state partitioning for multibody systems (APM): an efficient algorithm for constructing Markov state models to elucidate conformational dynamics of multibody systems. J Chem Theory Comput 2014; 11:17-27. [PMID: 26574199 DOI: 10.1021/ct5007168] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.2] [Indexed: 12/16/2022]
Abstract
The conformational dynamics of multibody systems plays crucial roles in many important problems. Markov state models (MSMs) are powerful kinetic network models that can predict long-time-scale dynamics using many short molecular dynamics simulations. Although MSMs have been successfully applied to conformational changes of individual proteins, the analysis of multibody systems is still a challenge because of the complexity of the dynamics that occur on a mixture of drastically different time scales. In this work, we have developed a new algorithm, automatic state partitioning for multibody systems (APM), for constructing MSMs to elucidate the conformational dynamics of multibody systems. The APM algorithm effectively addresses different time scales in the multibody systems by directly incorporating dynamics into geometric clustering when identifying the metastable conformational states. We have applied the APM algorithm to a 2D potential that can mimic a protein-ligand binding system and the aggregation of two hydrophobic particles in water and have shown that it can yield tremendous enhancements in the computational efficiency of MSM construction and the accuracy of the models.
Affiliation(s)
- Fu Kit Sheong
  - HKUST Shenzhen Research Institute, Nanshan, Shenzhen 518057, China
- Luming Meng
  - HKUST Shenzhen Research Institute, Nanshan, Shenzhen 518057, China
- Xuhui Huang
  - HKUST Shenzhen Research Institute, Nanshan, Shenzhen 518057, China
|
42
|
Voelz VA, Elman B, Razavi AM, Zhou G. Surprisal Metrics for Quantifying Perturbed Conformational Dynamics in Markov State Models. J Chem Theory Comput 2014; 10:5716-28. [DOI: 10.1021/ct500827g] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Indexed: 12/17/2022]
Affiliation(s)
- Vincent A. Voelz
  - Department of Chemistry, Temple University, Philadelphia, Pennsylvania 19122, United States
- Brandon Elman
  - Department of Chemistry, Temple University, Philadelphia, Pennsylvania 19122, United States
- Asghar M. Razavi
  - Department of Chemistry, Temple University, Philadelphia, Pennsylvania 19122, United States
- Guangfeng Zhou
  - Department of Chemistry, Temple University, Philadelphia, Pennsylvania 19122, United States
|
43
|
Gu S, Silva DA, Meng L, Yue A, Huang X. Quantitatively characterizing the ligand binding mechanisms of choline binding protein using Markov state model analysis. PLoS Comput Biol 2014; 10:e1003767. [PMID: 25101697 PMCID: PMC4125059 DOI: 10.1371/journal.pcbi.1003767] [Citation(s) in RCA: 54] [Impact Index Per Article: 5.4] [Received: 11/29/2013] [Accepted: 06/22/2014] [Indexed: 01/05/2023]
Abstract
Protein-ligand recognition plays key roles in many biological processes. One of the most fascinating questions about protein-ligand recognition is to understand its underlying mechanism, which often results from a combination of induced fit and conformational selection. In this study, we have developed a three-pronged approach of Markov State Models, Molecular Dynamics simulations, and flux analysis to determine the contribution of each model. Using this approach, we have quantified the recognition mechanism of the choline binding protein (ChoX) to be ∼90% conformational selection dominant under experimental conditions. This is achieved by recovering all the necessary parameters for the flux analysis in combination with available experimental data. Our results also suggest that ChoX has several metastable conformational states, of which an apo-closed state is dominant, consistent with previous experimental findings. Our methodology holds great potential to be widely applied to understand recognition mechanisms underlining many fundamental biological processes.
Affiliation(s)
- Shuo Gu
  - Department of Chemistry, Institute for Advance Study and School of Science, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
- Daniel-Adriano Silva
  - Department of Chemistry, Institute for Advance Study and School of Science, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
  - Department of Biochemistry, University of Washington, Seattle, Washington, United States of America
- Luming Meng
  - Department of Chemistry, Institute for Advance Study and School of Science, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
- Alexander Yue
  - Department of Chemistry, Institute for Advance Study and School of Science, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
- Xuhui Huang
  - Department of Chemistry, Institute for Advance Study and School of Science, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
  - Division of Biomedical Engineering, Institute for Advance Study and School of Science, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
  - Center of Systems Biology and Human Health, Institute for Advance Study and School of Science, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
|
44
|
Chodera JD, Noé F. Markov state models of biomolecular conformational dynamics. Curr Opin Struct Biol 2014; 25:135-44. [PMID: 24836551 DOI: 10.1016/j.sbi.2014.04.002] [Citation(s) in RCA: 502] [Impact Index Per Article: 50.2] [Received: 03/11/2014] [Revised: 04/08/2014] [Accepted: 04/12/2014] [Indexed: 10/25/2022]
Abstract
It has recently become practical to construct Markov state models (MSMs) that reproduce the long-time statistical conformational dynamics of biomolecules using data from molecular dynamics simulations. MSMs can predict both stationary and kinetic quantities on long timescales (e.g. milliseconds) using a set of atomistic molecular dynamics simulations that are individually much shorter, thus addressing the well-known sampling problem in molecular dynamics simulation. In addition to providing predictive quantitative models, MSMs greatly facilitate both the extraction of insight into biomolecular mechanism (such as folding and functional dynamics) and quantitative comparison with single-molecule and ensemble kinetics experiments. A variety of methodological advances and software packages now bring the construction of these models closer to routine practice. Here, we review recent progress in this field, considering theoretical and methodological advances, new software tools, and recent applications of these approaches in several domains of biochemistry and biophysics, commenting on remaining challenges.
Affiliation(s)
- John D Chodera
  - Computational Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA.
- Frank Noé
  - Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 6, 14195 Berlin, Germany.
|
45
|
Bowman GR, Meng L, Huang X. Quantitative comparison of alternative methods for coarse-graining biological networks. J Chem Phys 2014; 139:121905. [PMID: 24089717 DOI: 10.1063/1.4812768] [Citation(s) in RCA: 42] [Impact Index Per Article: 4.2] [Indexed: 12/29/2022]
Abstract
Markov models and master equations are a powerful means of modeling dynamic processes like protein conformational changes. However, these models are often difficult to understand because of the enormous number of components and connections between them. Therefore, a variety of methods have been developed to facilitate understanding by coarse-graining these complex models. Here, we employ Bayesian model comparison to determine which of these coarse-graining methods provides the models that are most faithful to the original set of states. We find that the Bayesian agglomerative clustering engine and the hierarchical Nyström expansion graph (HNEG) typically provide the best performance. Surprisingly, the original Perron cluster cluster analysis (PCCA) method often provides the next best results, outperforming the newer PCCA+ method and the most probable paths algorithm. We also show that the differences between the models are qualitatively significant, rather than being minor shifts in the boundaries between states. The performance of the methods correlates well with the entropy of the resulting coarse-grainings, suggesting that finding states with more similar populations (i.e., avoiding low population states that may just be noise) gives better results.
Affiliation(s)
- Gregory R Bowman
  - Departments of Chemistry and Molecular and Cell Biology, University of California, Berkeley, California 94720, USA
|
46
|
Application of Markov State Models to Simulate Long Timescale Dynamics of Biological Macromolecules. Adv Exp Med Biol 2014; 805:29-66. [DOI: 10.1007/978-3-319-02970-2_2] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Indexed: 12/12/2022]
|
47
|
Haack F, Fackeldey K, Röblitz S, Scharkoi O, Weber M, Schmidt B. Adaptive spectral clustering with application to tripeptide conformation analysis. J Chem Phys 2013; 139:194110. [DOI: 10.1063/1.4830409] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Indexed: 11/14/2022]
|