1
Wang J, Li Z. Electric field modulated configuration and orientation of aqueous molecule chains. J Chem Phys 2024; 161:094305. [PMID: 39230558 DOI: 10.1063/5.0222122] [Received: 06/05/2024] [Accepted: 08/22/2024] [Indexed: 09/05/2024]
Abstract
Understanding how external electric fields (EFs) impact the properties of aqueous molecules is crucial for various applications in chemistry, biology, and engineering. In this paper, we present a study utilizing molecular dynamics simulation to explore how direct-current (DC) and alternating-current (AC) EFs affect hydrophobic (n-triacontane) and hydrophilic (PEG-10) oligomer chains. Through a machine learning approach, we extract a 2-dimensional free energy (FE) landscape of these molecules, revealing that electric fields modulate the FE landscape to favor stretched configurations and enhance the alignment of the chain with the electric field. Our observations indicate that DC EFs have a more prominent impact on modulation compared to AC EFs and that EFs have a stronger effect on hydrophobic chains than on hydrophilic oligomers. We analyze the orientation of water dipole moments and hydrogen bonds, finding that EFs align water molecules and induce more directional hydrogen bond networks, forming 1D water structures. This favors the stretched configuration and alignment of the studied oligomers simultaneously, as it minimizes the disruption of 1D structures. This research deepens our understanding of the mechanisms by which electric fields modulate molecular properties and could guide the broader application of EFs to control other aqueous molecules, such as proteins or other biomolecules.
Affiliation(s)
- Jiang Wang
- College of Science, Guizhou Institute of Technology, Boshi Road, Dangwu Town, Gui'an New District, Guizhou 550025, China
- Zhiling Li
- College of Science, Guizhou Institute of Technology, Boshi Road, Dangwu Town, Gui'an New District, Guizhou 550025, China
2
Dasetty S, Bidone TC, Ferguson AL. Data-driven prediction of αIIbβ3 integrin activation paths using manifold learning and deep generative modeling. Biophys J 2024; 123:2716-2729. [PMID: 38098231 PMCID: PMC11393677 DOI: 10.1016/j.bpj.2023.12.009] [Received: 10/14/2023] [Revised: 12/05/2023] [Accepted: 12/11/2023] [Indexed: 01/06/2024]
Abstract
The integrin heterodimer is a transmembrane protein critical for driving cellular processes and is a therapeutic target in the treatment of multiple diseases linked to its malfunction. Activation of integrin involves conformational transitions between bent and extended states. Some of the conformations that are intermediate between bent and extended states of the heterodimer have been experimentally characterized, but the full activation pathways remain unresolved, both experimentally, due to their transient nature, and computationally, due to the challenges in simulating rare barrier crossing events in these large molecular systems. Understanding the activation pathways can provide new fundamental insight into the biophysical processes associated with the dynamic interconversions between bent and extended states and can unveil new putative therapeutic targets. In this work, we apply nonlinear manifold learning to coarse-grained molecular dynamics simulations of bent, extended, and two intermediate states of αIIbβ3 integrin to learn a low-dimensional embedding of the configurational phase space. We then train deep generative models to learn an inverse mapping between the low-dimensional embedding and high-dimensional molecular space and use these models to interpolate the molecular configurations constituting the activation pathways between the experimentally characterized states. This work furnishes plausible predictions of integrin activation pathways and reports a generic and transferable multiscale technique to predict transition pathways for biomolecular systems.
Affiliation(s)
- Siva Dasetty
- Pritzker School of Molecular Engineering, University of Chicago, Chicago, Illinois
- Tamara C Bidone
- Department of Biomedical Engineering, University of Utah, Salt Lake City, Utah; Scientific Computing and Imaging Institute, University of Utah, Salt Lake City, Utah
- Andrew L Ferguson
- Pritzker School of Molecular Engineering, University of Chicago, Chicago, Illinois
3
Domingues TS, Coifman R, Haji-Akbari A. Estimating Position-Dependent and Anisotropic Diffusivity Tensors from Molecular Dynamics Trajectories: Existing Methods and Future Outlook. J Chem Theory Comput 2024; 20:4427-4455. [PMID: 38815171 DOI: 10.1021/acs.jctc.4c00148] [Indexed: 06/01/2024]
Abstract
Confinement can substantially alter the physicochemical properties of materials by breaking translational isotropy and rendering all physical properties position-dependent. Molecular dynamics (MD) simulations have proven instrumental in characterizing such spatial heterogeneities and probing the impact of confinement on materials' properties. For static properties, this is a straightforward task and can be achieved via simple spatial binning. Such an approach, however, cannot be readily applied to transport coefficients due to lack of natural extensions of autocorrelations used for their calculation in the bulk. The prime example of this challenge is diffusivity, which, in the bulk, can be readily estimated from the particles' mobility statistics, which satisfy the Fokker-Planck equation. Under confinement, however, such statistics will follow the Smoluchowski equation, which lacks a closed-form analytical solution. This brief review explores the rich history of estimating profiles of the diffusivity tensor from MD simulations and discusses various approximate methods and algorithms developed for this purpose. Besides discussing heuristic extensions of bulk methods, we overview more rigorous algorithms, including kernel-based methods, Bayesian approaches, and operator discretization techniques. Additionally, we outline methods based on applying biasing potentials or imposing constraints on tracer particles. Finally, we discuss approaches that estimate diffusivity from mean first passage time or committor probability profiles, a conceptual framework originally developed in the context of collective variable spaces describing rare events in computational chemistry and biology. In summary, this paper offers a concise survey of diverse approaches for estimating diffusivity from MD trajectories, highlighting challenges and opportunities in this area.
Affiliation(s)
- Tiago S Domingues
- Department of Chemical and Environmental Engineering, Yale University, New Haven, Connecticut 06520, United States
- Ronald Coifman
- Department of Mathematics, Yale University, New Haven, Connecticut 06520, United States
- Department of Computer Science, Yale University, New Haven, Connecticut 06520, United States
- Amir Haji-Akbari
- Department of Chemical and Environmental Engineering, Yale University, New Haven, Connecticut 06520, United States
4
Evangelou N, Cui T, Bello-Rivas JM, Makeev A, Kevrekidis IG. Tipping points of evolving epidemiological networks: Machine learning-assisted, data-driven effective modeling. Chaos (Woodbury, N.Y.) 2024; 34:063128. [PMID: 38865091 DOI: 10.1063/5.0187511] [Received: 11/14/2023] [Accepted: 04/20/2024] [Indexed: 06/13/2024]
Abstract
We study the tipping point collective dynamics of an adaptive susceptible-infected-susceptible (SIS) epidemiological network in a data-driven, machine learning-assisted manner. We identify a parameter-dependent effective stochastic differential equation (eSDE) in terms of physically meaningful coarse mean-field variables through a deep-learning ResNet architecture inspired by numerical stochastic integrators. We construct an approximate effective bifurcation diagram based on the identified drift term of the eSDE and contrast it with the mean-field SIS model bifurcation diagram. We observe a subcritical Hopf bifurcation in the evolving network's effective SIS dynamics that causes the tipping point behavior; this takes the form of large amplitude collective oscillations that spontaneously, yet rarely, arise from the neighborhood of a (noisy) stationary state. We study the statistics of these rare events both through repeated brute force simulations and by using established mathematical/computational tools exploiting the right-hand side of the identified SDE. We demonstrate that such a collective SDE can also be identified (and the rare event computations also performed) in terms of data-driven coarse observables, obtained here via manifold learning techniques, in particular, Diffusion Maps. The workflow of our study is straightforwardly applicable to other complex dynamic problems exhibiting tipping point dynamics.
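The abstract above refers to numerical stochastic integrators as the inspiration for the network architecture. As a rough illustration only, the simplest such integrator (Euler-Maruyama) can be sketched as follows. This is not the authors' learned eSDE: the `drift` and `diffusion` callables, the step size, and the example parameters are all hypothetical placeholders.

```python
import math
import random


def euler_maruyama(drift, diffusion, x0, dt, n_steps, rng):
    """Integrate dx = drift(x) dt + diffusion(x) dW with the Euler-Maruyama scheme.

    drift, diffusion: hypothetical user-supplied functions of the state x.
    rng: a random.Random instance, passed in so runs are reproducible.
    """
    path = [x0]
    x = x0
    for _ in range(n_steps):
        noise = rng.gauss(0.0, 1.0)  # standard normal increment
        x = x + drift(x) * dt + diffusion(x) * math.sqrt(dt) * noise
        path.append(x)
    return path


# Illustrative use: a noisy relaxation toward x = 0 (assumed toy model).
trajectory = euler_maruyama(
    drift=lambda x: -x,
    diffusion=lambda x: 0.1,
    x0=1.0,
    dt=0.01,
    n_steps=100,
    rng=random.Random(0),
)
```

Setting the diffusion term to zero reduces the scheme to the forward Euler method, which is a convenient sanity check.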
Affiliation(s)
- Nikolaos Evangelou
- Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, Maryland 21218, USA
- Tianqi Cui
- Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, Maryland 21218, USA
- Juan M Bello-Rivas
- Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, Maryland 21218, USA
- Alexei Makeev
- Faculty of Computational Mathematics and Cybernetics, Moscow State University, 119991 Moscow, Russia
- Ioannis G Kevrekidis
- Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, Maryland 21218, USA
5
Kacirani A, Uralcan B, Domingues TS, Haji-Akbari A. Effect of Pressure on the Conformational Landscape of Human γD-Crystallin from Replica Exchange Molecular Dynamics Simulations. J Phys Chem B 2024; 128:4931-4942. [PMID: 38685567 DOI: 10.1021/acs.jpcb.4c00178] [Indexed: 05/02/2024]
Abstract
Human γD-crystallin belongs to a crucial family of proteins known as crystallins located in the fiber cells of the human lens. Since crystallins do not undergo any turnover after birth, they need to possess remarkable thermodynamic stability. However, their sporadic misfolding and aggregation, triggered by environmental perturbations or genetic mutations, constitute the molecular basis of cataracts, which are the leading cause of blindness worldwide according to the World Health Organization. Here, we investigate the impact of high pressure on the conformational landscape of wild-type HγD-crystallin using replica exchange molecular dynamics simulations augmented with principal component analysis. We find pressure to have a modest impact on global measures of protein stability, such as root-mean-square displacement and radius of gyration. Upon projecting our trajectories along the first two principal components from principal component analysis, however, we observe the emergence of distinct free energy basins at high pressures. By screening local order parameters previously shown or hypothesized as markers of HγD-crystallin stability, we establish correlations between a tyrosine-tyrosine aromatic contact within the N-terminal domain and the protein's end-to-end distance with projections along the first and second principal components, respectively. Furthermore, we observe the simultaneous contraction of the hydrophobic core and its intrusion by water molecules. This exploration sheds light on the intricate responses of HγD-crystallin to elevated pressures, offering insights into potential mechanisms underlying its stability and susceptibility to environmental perturbations, crucial for understanding cataract formation.
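Free energy basins in a two-dimensional projection, as mentioned in the abstract above, are commonly estimated by Boltzmann inversion of a histogram, F = -kT ln P. The following is a minimal sketch of that general idea, assuming the projections onto the two components are already available; the function name, the `bin_width` parameter, and the sample data are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def free_energy_kT(points, bin_width):
    """Estimate a 2D free energy surface, in units of kT, from projected samples.

    points: iterable of (pc1, pc2) projections (assumed precomputed).
    Returns a dict mapping grid cell -> F/kT, shifted so the most
    populated cell sits at zero.
    """
    counts = Counter(
        (math.floor(x / bin_width), math.floor(y / bin_width)) for x, y in points
    )
    most_populated = max(counts.values())
    # F/kT = -ln(P / P_max); unvisited cells are simply absent from the result.
    return {cell: -math.log(n / most_populated) for cell, n in counts.items()}


# Hypothetical projected samples: a deep basin and a sparsely visited one.
pc_points = [(0.1, 0.1)] * 4 + [(1.5, 0.2)]
surface = free_energy_kT(pc_points, bin_width=1.0)
```

A histogram estimate like this is only meaningful for cells that are well sampled; with replica exchange data, only frames from the replica at the temperature and pressure of interest would normally be used.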
Affiliation(s)
- Arlind Kacirani
- Department of Chemical and Environmental Engineering, Yale University, New Haven, Connecticut 06520, United States
- Integrated Graduate Program in Physical and Engineering Biology, Yale University, New Haven, Connecticut 06520, United States
- Betül Uralcan
- Department of Chemical and Environmental Engineering, Yale University, New Haven, Connecticut 06520, United States
- Department of Chemical Engineering, Boğaziçi University, Istanbul 34342, Turkey
- Tiago S Domingues
- Department of Chemical and Environmental Engineering, Yale University, New Haven, Connecticut 06520, United States
- Graduate Program in Applied Mathematics, Yale University, New Haven, Connecticut 06520, United States
- Amir Haji-Akbari
- Department of Chemical and Environmental Engineering, Yale University, New Haven, Connecticut 06520, United States
6
Russo A, Duran-Olivencia MA, Kevrekidis IG, Kalliadasis S. Machine Learning Memory Kernels as Closure for Non-Markovian Stochastic Processes. IEEE Trans Neural Netw Learn Syst 2024; 35:6531-6543. [PMID: 36374895 DOI: 10.1109/tnnls.2022.3210695] [Indexed: 06/16/2023]
Abstract
Finding the dynamical law of observable quantities lies at the core of physics. Within the particular field of statistical mechanics, the generalized Langevin equation (GLE) comprises a general model for the evolution of observables covering a great deal of physical systems with many degrees of freedom and an inherently stochastic nature. Although formally exact, GLE brings its own great challenges. It depends on the complete history of the observables under scrutiny, as well as the microscopic degrees of freedom, all of which are often inaccessible. We show that these drawbacks can be overcome by adopting elements of machine learning from empirical data, in particular coupling a multilayer perceptron (MLP) with the formal structure of GLE and calibrating the MLP with the data. This yields a powerful computational tool capable of describing noisy complex systems beyond the realms of statistical mechanics. It is exemplified with a number of representative examples from different fields: from a single colloidal particle and particle chains in a thermal bath to climatology and finance, showing in all cases excellent agreement with the actual observable dynamics. The new framework offers an alternative perspective for the study of nonequilibrium processes opening also a new route for stochastic modeling.
7
Lee SC, Z Y. Interpretation of autoencoder-learned collective variables using Morse-Smale complex and sublevelset persistent homology: An application on molecular trajectories. J Chem Phys 2024; 160:144104. [PMID: 38591676 DOI: 10.1063/5.0191446] [Received: 12/13/2023] [Accepted: 03/22/2024] [Indexed: 04/10/2024]
Abstract
Dimensionality reduction often serves as the first step toward a minimalist understanding of physical systems as well as the accelerated simulations of them. In particular, neural network-based nonlinear dimensionality reduction methods, such as autoencoders, have shown promising outcomes in uncovering collective variables (CVs). However, the physical meaning of these CVs remains largely elusive. In this work, we constructed a framework that (1) determines the optimal number of CVs needed to capture the essential molecular motions using an ensemble of hierarchical autoencoders and (2) provides topology-based interpretations to the autoencoder-learned CVs with Morse-Smale complex and sublevelset persistent homology. This approach was exemplified using a series of n-alkanes and can be regarded as a general, explainable nonlinear dimensionality reduction method.
Affiliation(s)
- Shao-Chun Lee
- Department of Nuclear, Plasma, and Radiological Engineering, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
- Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
- Y Z
- Department of Nuclear, Plasma, and Radiological Engineering, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
- Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
- Department of Nuclear Engineering and Radiological Sciences, Department of Materials Science and Engineering, Department of Robotics, and Applied Physics Program, University of Michigan, Ann Arbor, Michigan 48105, USA
8
Feng L, Gao T, Xiao W, Duan J. Early warning indicators via latent stochastic dynamical systems. Chaos (Woodbury, N.Y.) 2024; 34:031101. [PMID: 38442235 DOI: 10.1063/5.0195042] [Received: 12/29/2023] [Accepted: 02/15/2024] [Indexed: 03/07/2024]
Abstract
Detecting early warning indicators for abrupt dynamical transitions in complex systems or high-dimensional observation data is essential in many real-world applications, such as brain diseases, natural disasters, and engineering reliability. To this end, we develop a novel approach: the directed anisotropic diffusion map that captures the latent evolutionary dynamics in the low-dimensional manifold. Then three effective warning signals (Onsager-Machlup indicator, sample entropy indicator, and transition probability indicator) are derived through the latent coordinates and the latent stochastic dynamical systems. To validate our framework, we apply this methodology to authentic electroencephalogram data. We find that our early warning indicators are capable of detecting the tipping point during state transition. This framework not only bridges the latent dynamics with real-world data but also shows potential for automatic labeling of complex high-dimensional time series.
Affiliation(s)
- Lingyu Feng
- School of Mathematics and Statistics, Huazhong University of Science and Technology, Wuhan 430074, China
- Center for Mathematical Science, Huazhong University of Science and Technology, Wuhan 430074, China
- Steklov-Wuhan Institute for Mathematical Exploration, Huazhong University of Science and Technology, Wuhan 430074, China
- Ting Gao
- School of Mathematics and Statistics, Huazhong University of Science and Technology, Wuhan 430074, China
- Center for Mathematical Science, Huazhong University of Science and Technology, Wuhan 430074, China
- Steklov-Wuhan Institute for Mathematical Exploration, Huazhong University of Science and Technology, Wuhan 430074, China
- Wang Xiao
- School of Mathematics and Statistics, Huazhong University of Science and Technology, Wuhan 430074, China
- Center for Mathematical Science, Huazhong University of Science and Technology, Wuhan 430074, China
- Steklov-Wuhan Institute for Mathematical Exploration, Huazhong University of Science and Technology, Wuhan 430074, China
- Jinqiao Duan
- Steklov-Wuhan Institute for Mathematical Exploration, Huazhong University of Science and Technology, Wuhan 430074, China
- Department of Mathematics and Department of Physics, Great Bay University, Dongguan 523000, China
- Dongguan Key Laboratory for Data Science and Intelligent Medicine, Dongguan 523000, China
9
da Hora GCA, Oh M, Mifflin MC, Digal L, Roberts AG, Swanson JMJ. Lasso Peptides: Exploring the Folding Landscape of Nature's Smallest Interlocked Motifs. J Am Chem Soc 2024; 146:4444-4454. [PMID: 38166378 PMCID: PMC11282585 DOI: 10.1021/jacs.3c10126] [Indexed: 01/04/2024]
Abstract
Lasso peptides make up a class of natural products characterized by a threaded structure. Given their small size and stability, chemical synthesis would offer tremendous potential for the development of novel therapeutics. However, the accessibility of the pre-folded lasso architecture has limited this advance. To better understand the folding process de novo, simulations are used herein to characterize the folding propensity of microcin J25 (MccJ25), a lasso peptide known for its antimicrobial properties. New algorithms are developed to unambiguously distinguish threaded from nonthreaded precursors and determine handedness, a key feature in natural lasso peptides. We find that MccJ25 indeed forms right-handed pre-lassos, in contrast to past predictions but consistent with all natural lasso peptides. Additionally, the native pre-lasso structure is shown to be metastable prior to ring formation but to readily transition to entropically favored unfolded and nonthreaded structures, suggesting that de novo lasso folding is rare. However, by altering the ring forming residues and appending thiol and thioester functionalities, we are able to increase the stability of pre-lasso conformations. Furthermore, conditions leading to protonation of a histidine imidazole side chain further stabilize the modified pre-lasso ensemble. This work highlights the use of computational methods to characterize lasso folding and demonstrates that de novo access to lasso structures can be facilitated by optimizing sequence, unnatural modifications, and reaction conditions like pH.
Affiliation(s)
- Gabriel C A da Hora
- Department of Chemistry, University of Utah, Salt Lake City, Utah 84112, United States
- Myongin Oh
- Department of Chemistry, University of Utah, Salt Lake City, Utah 84112, United States
- Marcus C Mifflin
- Department of Chemistry, University of Utah, Salt Lake City, Utah 84112, United States
- Lori Digal
- Department of Chemistry, University of Utah, Salt Lake City, Utah 84112, United States
- Andrew G Roberts
- Department of Chemistry, University of Utah, Salt Lake City, Utah 84112, United States
- Jessica M J Swanson
- Department of Chemistry, University of Utah, Salt Lake City, Utah 84112, United States
10
Herringer NSM, Dasetty S, Gandhi D, Lee J, Ferguson AL. Permutationally Invariant Networks for Enhanced Sampling (PINES): Discovery of Multimolecular and Solvent-Inclusive Collective Variables. J Chem Theory Comput 2024; 20:178-198. [PMID: 38150421 DOI: 10.1021/acs.jctc.3c00923] [Indexed: 12/29/2023]
Abstract
The typically rugged nature of molecular free-energy landscapes can frustrate efficient sampling of the thermodynamically relevant phase space due to the presence of high free-energy barriers. Enhanced sampling techniques can improve phase space exploration by accelerating sampling along particular collective variables (CVs). A number of techniques exist for the data-driven discovery of CVs parametrizing the important large-scale motions of the system. A challenge to CV discovery is learning CVs invariant to the symmetries of the molecular system, frequently rigid translation, rigid rotation, and permutational relabeling of identical particles. Of these, permutational invariance has proved a persistent challenge in frustrating the data-driven discovery of multimolecular CVs in systems of self-assembling particles and solvent-inclusive CVs for solvated systems. In this work, we integrate permutation invariant vector (PIV) featurizations with autoencoding neural networks to learn nonlinear CVs invariant to translation, rotation, and permutation and perform interleaved rounds of CV discovery and enhanced sampling to iteratively expand the sampling of configurational phase space and obtain converged CVs and free-energy landscapes. We demonstrate the permutationally invariant network for enhanced sampling (PINES) approach in applications to the self-assembly of a 13-atom argon cluster, association/dissociation of a NaCl ion pair in water, and hydrophobic collapse of a C45H92 n-pentatetracontane polymer chain. We make the approach freely available as a new module within the PLUMED2 enhanced sampling libraries.
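To make the invariance requirement in the abstract above concrete, here is a deliberately simplified sketch of a permutation-invariant featurization for identical particles: the sorted list of pairwise distances. It is an assumption-laden toy and not the published PIV featurization or the PINES module, which may differ in detail (for example by applying switching functions or sorting within blocks of atom types). The function name and the coordinates are hypothetical.

```python
import math
from itertools import combinations


def sorted_pair_distances(coords):
    """Return the sorted pairwise distances of a set of identical particles.

    Distances are unchanged by rigid translation and rotation, and sorting
    removes the dependence on how the particles are labeled.
    """
    return sorted(math.dist(a, b) for a, b in combinations(coords, 2))


# Hypothetical three-particle cluster, then the same cluster relabeled and shifted.
cluster = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
relabeled = [cluster[2], cluster[0], cluster[1]]
shifted = [(x + 5.0, y - 3.0, z + 1.0) for x, y, z in cluster]

features = sorted_pair_distances(cluster)
features_relabeled = sorted_pair_distances(relabeled)
features_shifted = sorted_pair_distances(shifted)
```

All three feature vectors agree up to floating-point rounding, which is the property a downstream autoencoder would rely on.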
Affiliation(s)
- Siva Dasetty
- Pritzker School of Molecular Engineering, University of Chicago, Chicago, Illinois 60637, United States
- Diya Gandhi
- Pritzker School of Molecular Engineering, University of Chicago, Chicago, Illinois 60637, United States
- Junhee Lee
- Pritzker School of Molecular Engineering, University of Chicago, Chicago, Illinois 60637, United States
- Andrew L Ferguson
- Pritzker School of Molecular Engineering, University of Chicago, Chicago, Illinois 60637, United States
11
Qiu Y, O’Connor MS, Xue M, Liu B, Huang X. An Efficient Path Classification Algorithm Based on Variational Autoencoder to Identify Metastable Path Channels for Complex Conformational Changes. J Chem Theory Comput 2023; 19:4728-4742. [PMID: 37382437 PMCID: PMC11042546 DOI: 10.1021/acs.jctc.3c00318] [Indexed: 06/30/2023]
Abstract
Conformational changes (i.e., dynamic transitions between pairs of conformational states) play important roles in many chemical and biological processes. Constructing the Markov state model (MSM) from extensive molecular dynamics (MD) simulations is an effective approach to dissect the mechanism of conformational changes. When combined with transition path theory (TPT), MSM can be applied to elucidate the ensemble of kinetic pathways connecting pairs of conformational states. However, the application of TPT to analyze complex conformational changes often results in a vast number of kinetic pathways with comparable fluxes. This obstacle is particularly pronounced in heterogeneous self-assembly and aggregation processes. The large number of kinetic pathways makes it challenging to comprehend the molecular mechanisms underlying conformational changes of interest. To address this challenge, we have developed a path classification algorithm named latent-space path clustering (LPC) that efficiently lumps parallel kinetic pathways into distinct metastable path channels, making them easier to comprehend. In our algorithm, MD conformations are first projected onto a low-dimensional space containing a small set of collective variables (CVs) by time-structure-based independent component analysis (tICA) with kinetic mapping. Then, MSM and TPT are constructed to obtain the ensemble of pathways, and a deep learning architecture named the variational autoencoder (VAE) is used to learn the spatial distributions of kinetic pathways in the continuous CV space. Based on the trained VAE model, the TPT-generated ensemble of kinetic pathways can be embedded into a latent space, where the classification becomes clear. We show that LPC can efficiently and accurately identify the metastable path channels in three systems: a 2D potential, the aggregation of two hydrophobic particles in water, and the folding of the Fip35 WW domain. 
Using the 2D potential, we further demonstrate that our LPC algorithm outperforms the previous path-lumping algorithms by making substantially fewer incorrect assignments of individual pathways to four path channels. We expect that LPC can be widely applied to identify the dominant kinetic pathways underlying complex conformational changes.
Affiliation(s)
- Yunrui Qiu
- Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, WI, 53706, USA
- Michael S. O’Connor
- Biophysics Graduate Program, University of Wisconsin-Madison, Madison, WI, 53706, USA
- Mingyi Xue
- Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, WI, 53706, USA
- Bojun Liu
- Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, WI, 53706, USA
- Xuhui Huang
- Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, WI, 53706, USA
- Biophysics Graduate Program, University of Wisconsin-Madison, Madison, WI, 53706, USA
12
Topel M, Ejaz A, Squires A, Ferguson AL. Learned Reconstruction of Protein Folding Trajectories from Noisy Single-Molecule Time Series. J Chem Theory Comput 2023; 19:4654-4667. [PMID: 36701162 DOI: 10.1021/acs.jctc.2c00920] [Indexed: 01/27/2023]
Abstract
Single-molecule Förster resonance energy transfer (smFRET) is an experimental methodology to track the real-time dynamics of molecules using fluorescent probes to follow one or more intramolecular distances. These distances provide a low-dimensional representation of the full atomistic dynamics. Under mild technical conditions, Takens' Delay Embedding Theorem guarantees that the full three-dimensional atomistic dynamics of a system are diffeomorphic (i.e., related by a smooth and invertible transformation) to a time-delayed embedding of one or more scalar observables. Appealing to these theoretical guarantees, we employ manifold learning, artificial neural networks, and statistical mechanics to learn from molecular simulation training data the a priori unknown transformation between the atomic coordinates and delay-embedded intramolecular distances accessible to smFRET. This learned transformation may then be used to reconstruct atomistic coordinates from smFRET time series data. We term this approach Single-molecule TAkens Reconstruction (STAR). We have previously applied STAR to reconstruct molecular configurations of a C24H50 polymer chain and the mini-protein Chignolin with accuracies better than 0.2 nm from simulated smFRET data under noise free and high time resolution conditions. In the present work, we investigate the role of signal-to-noise ratio, data volume, and time resolution in simulated smFRET data to assess the performance of STAR under conditions more representative of experimental realities. We show that STAR can reconstruct the Chignolin and Villin mini-proteins to accuracies of 0.12 and 0.42 nm, respectively, and place bounds on these conditions for accurate reconstructions. These results demonstrate that it is possible to reconstruct dynamical trajectories of protein folding from time series in noisy, time binned, experimentally measurable observables and lay the foundations for the application of STAR to real experimental data.
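The time-delayed embedding that the abstract above builds on can be illustrated in a few lines. This is only a sketch of the delay-embedding construction itself, assuming a single scalar observable sampled at uniform intervals; the embedding dimension, the lag, and the sample distances are hypothetical choices, and the learned mapping back to atomic coordinates used in STAR is not shown.

```python
from typing import List, Sequence


def delay_embed(series: Sequence[float], dim: int, lag: int) -> List[List[float]]:
    """Return delay vectors [x(t), x(t - lag), ..., x(t - (dim - 1) * lag)]."""
    if dim < 1 or lag < 1:
        raise ValueError("dim and lag must be positive integers")
    start = (dim - 1) * lag  # first index at which a full history is available
    return [
        [series[t - k * lag] for k in range(dim)]
        for t in range(start, len(series))
    ]


# Hypothetical intramolecular distance time series (nm).
distances = [1.0, 1.2, 1.5, 1.1, 0.9, 0.8]
vectors = delay_embed(distances, dim=3, lag=2)
```

Each row of `vectors` is one point in the delay-embedded space; a series of length N yields N - (dim - 1) * lag such points.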
Collapse
Affiliation(s)
- Maximilian Topel
- Department of Physics, University of Chicago, Chicago, Illinois 60637, United States
| | - Ayesha Ejaz
- Pritzker School of Molecular Engineering, University of Chicago, Chicago, Illinois 60637, United States
| | - Allison Squires
- Pritzker School of Molecular Engineering, University of Chicago, Chicago, Illinois 60637, United States
| | - Andrew L Ferguson
- Pritzker School of Molecular Engineering, University of Chicago, Chicago, Illinois 60637, United States
| |
|
13
|
Mansbach R, Patel LA, Watson NA, Kubicek-Sutherland JZ, Gnanakaran S. Inferring Pathways of Oxidative Folding from Prefolding Free Energy Landscapes of Disulfide-Rich Toxins. J Phys Chem B 2023; 127:1689-1703. [PMID: 36791259 PMCID: PMC9987446 DOI: 10.1021/acs.jpcb.2c07124] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2022] [Revised: 12/07/2022] [Indexed: 02/17/2023]
Abstract
Short, cysteine-rich peptides can exist in stable or metastable structural ensembles due to the number of possible patterns of formation of their disulfide bonds. One interesting subset of this peptide group is the conotoxins, which are produced by aquatic snails in the family Conidae. The μ conotoxins, which are antagonists and blockers of the voltage-gated sodium channel, exist in a folding spectrum: on one end of the spectrum are more hirudin-like folders, which form disulfide bonds and then reshuffle them, leading to an ensemble of kinetically trapped isomers, and on the other end are more BPTI-like folders, which form the native disulfide bonds one by one in a particular order, leading to a preponderance of conformations existing in a single stable state. In this Article, we employ the composite diffusion map approach to study the unified free energy surface of prefolding μ-conotoxin equilibrium. We identify the two most important nonlinear collective modes of the unified folding landscape and demonstrate that in the absence of their disulfides, the conotoxins can be thought of as largely disordered polymers. A small increase in the number of hydrophobic residues in the protein shifts the free energy landscape toward hydrophobically collapsed coil conformations responsible for cysteine proximity in hirudin-like folders, compared to semiextended coil conformations with more distal cysteines in BPTI-like folders. Overall, this work sheds important light on the folding processes and free energy landscapes of cysteine-rich peptides and demonstrates the extent to which sequence and length contribute to these landscapes.
Affiliation(s)
| | - Lara A. Patel
- OpenEye Scientific Research, Santa Fe, New Mexico 87508, United States
| | - Natalya A. Watson
- Physics Department, University of Concordia, Montreal, QC H4B 1R6, Canada
| | | | - S. Gnanakaran
- Physical Chemistry and Applied Spectroscopy Group, Chemistry Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
| |
|
14
|
Evans L, Cameron MK, Tiwary P. Computing committors via Mahalanobis diffusion maps with enhanced sampling data. J Chem Phys 2022; 157:214107. [PMID: 36511548 DOI: 10.1063/5.0122990] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/03/2022] Open
Abstract
The study of phenomena such as protein folding and conformational changes in molecules is a central theme in chemical physics. Molecular dynamics (MD) simulation is the primary tool for the study of transition processes in biomolecules, but it is hampered by a huge timescale gap between the processes of interest and atomic vibrations that dictate the time step size. Therefore, it is imperative to combine MD simulations with other techniques in order to quantify the transition processes taking place on large timescales. In this work, the diffusion map with Mahalanobis kernel, a meshless approach for approximating the Backward Kolmogorov Operator (BKO) in collective variables, is upgraded to incorporate standard enhanced sampling techniques, such as metadynamics. The resulting algorithm, which we call the target measure Mahalanobis diffusion map (tm-mmap), is suitable for a moderate number of collective variables in which one can approximate the diffusion tensor and free energy. Imposing appropriate boundary conditions allows use of the approximated BKO to solve for the committor function and utilization of transition path theory to find the reactive current delineating the transition channels and the transition rate. The proposed algorithm, tm-mmap, is tested on the two-dimensional Moro-Cardin two-well system with position-dependent diffusion coefficient and on alanine dipeptide in two collective variables where the committor, the reactive current, and the transition rate are compared to those computed by the finite element method (FEM). Finally, tm-mmap is applied to alanine dipeptide in four collective variables where the use of finite elements is infeasible.
Affiliation(s)
- L Evans
- Department of Mathematics, University of Maryland, College Park, Maryland 20742, USA
| | - M K Cameron
- Department of Mathematics, University of Maryland, College Park, Maryland 20742, USA
| | - P Tiwary
- Department of Chemistry and Biochemistry and Institute for Physical Science and Technology, University of Maryland, College Park, Maryland 20742, USA
| |
|
15
|
Bhakat S. Collective variable discovery in the age of machine learning: reality, hype and everything in between. RSC Adv 2022; 12:25010-25024. [PMID: 36199882 PMCID: PMC9437778 DOI: 10.1039/d2ra03660f] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 06/13/2022] [Accepted: 08/20/2022] [Indexed: 11/21/2022] Open
Abstract
Understanding the kinetics and thermodynamics profile of biomolecules is necessary to understand their functional roles, which has a major impact in mechanism-driven drug discovery. Molecular dynamics simulation has been routinely used to understand conformational dynamics and molecular recognition in biomolecules. Statistical analysis of high-dimensional spatiotemporal data generated from molecular dynamics simulation requires identification of a few low-dimensional variables which can describe the essential dynamics of a system without significant loss of information. In physical chemistry, these low-dimensional variables are often called collective variables. Collective variables are used to generate reduced representations of free energy surfaces and calculate transition probabilities between different metastable basins. However, the choice of collective variables is not trivial for complex systems. Collective variables range from geometric criteria such as distances and dihedral angles to abstract ones such as weighted linear combinations of multiple geometric variables. The advent of machine learning algorithms led to increasing use of abstract collective variables to represent biomolecular dynamics. In this review, I will highlight several nuances of commonly used collective variables ranging from geometric to abstract ones. Further, I will put forward some cases where machine-learning-based collective variables were used to describe simple systems which in principle could have been described by geometric ones. Finally, I will put forward my thoughts on artificial general intelligence and how it can be used to discover and predict collective variables from spatiotemporal data generated by molecular dynamics simulations.
Affiliation(s)
- Soumendranath Bhakat
- Department of Biochemistry and Biophysics, Perelman School of Medicine, University of Pennsylvania, Pennsylvania 19104-6059, USA; +1 30549 32620
| |
|
16
|
Oide M, Sugita Y. Protein Folding Intermediates on the Dimensionality Reduced Landscape with UMAP and Native Contact Likelihood. J Chem Phys 2022; 157:075101. [DOI: 10.1063/5.0099094] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/14/2022] Open
Abstract
To understand protein folding mechanisms from molecular dynamics (MD) simulations, it is important to explore not only folded/unfolded states but also representative intermediate structures on the conformational landscape. Here, we propose a novel approach to construct the landscape using the uniform manifold approximation and projection (UMAP) method, which reduces the dimensionality without losing data-point proximity. In the approach, native contact likelihood is used as feature variables rather than the conventional Cartesian coordinates or dihedral angles of protein structures. We tested the performance of UMAP for coarse-grained MD simulation trajectories of B1 domain in protein G and observed on-pathway transient structures and other metastable states on the UMAP conformational landscape. In contrast, these structures were not clearly distinguished on the dimensionality reduced landscape using principal component analysis (PCA) or time-lagged independent component analysis (tICA). This approach is also useful to obtain dynamical information through Markov State Modeling and would be applicable to large-scale conformational changes in many other biomacromolecules.
Affiliation(s)
| | - Yuji Sugita
- Theoretical Molecular Science Laboratory, RIKEN, Japan
| |
|
17
|
Morishita T. Time-dependent principal component analysis: A unified approach to high-dimensional data reduction using adiabatic dynamics. J Chem Phys 2021; 155:134114. [PMID: 34624975 DOI: 10.1063/5.0061874] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/14/2022] Open
Abstract
Systematic reduction of the dimensionality is highly demanded in making a comprehensive interpretation of experimental and simulation data. Principal component analysis (PCA) is a widely used technique for reducing the dimensionality of molecular dynamics (MD) trajectories, which assists our understanding of MD simulation data. Here, we propose an approach that incorporates time dependence in the PCA algorithm. In the standard PCA, the eigenvectors obtained by diagonalizing the covariance matrix are time independent. In contrast, they are functions of time in our new approach, and their time evolution is implemented in the framework of Car-Parrinello or Born-Oppenheimer type adiabatic dynamics. Thanks to the time dependence, each of the step-by-step structural changes or intermittent collective fluctuations is clearly identified, which are often keys to provoking a drastic structural transformation but are easily masked in the standard PCA. The time dependence also allows for reoptimization of the principal components (PCs) according to the structural development, which can be exploited for enhanced sampling in MD simulations. The present approach is applied to phase transitions of a water model and conformational changes of a coarse-grained protein model. In the former, collective dynamics associated with the dihedral-motion in the tetrahedral network structure is found to play a key role in crystallization. In the latter, various conformations of the protein model were successfully sampled by enhancing structural fluctuation along the periodically optimized PC. Both applications clearly demonstrate the virtue of the new approach, which we refer to as time-dependent PCA.
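For reference, the standard (time-independent) PCA that this abstract contrasts against can be sketched in a few lines. This is a minimal illustration, not the authors' time-dependent method: it uses power iteration instead of a full diagonalization, so it recovers only the first principal component, and the function names and toy data are assumptions made for the example.

```python
import math


def covariance_matrix(frames):
    """Sample covariance matrix of a list of equal-length coordinate tuples."""
    n = len(frames)
    d = len(frames[0])
    means = [sum(f[j] for f in frames) / n for j in range(d)]
    centered = [[f[j] - means[j] for j in range(d)] for f in frames]
    return [
        [sum(row[i] * row[j] for row in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def leading_eigenpair(matrix, n_iter=500):
    """Largest eigenvalue and its eigenvector by power iteration.

    Assumes a symmetric positive semidefinite matrix and a start vector
    that is not orthogonal to the leading eigenvector.
    """
    d = len(matrix)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(n_iter):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            raise ValueError("start vector lies in the null space")
        v = [x / norm for x in w]
    # Rayleigh quotient v^T M v gives the eigenvalue for the unit vector v.
    eigval = sum(
        v[i] * sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)
    )
    return eigval, v


# Hypothetical two-coordinate "trajectory" lying along the direction (1, 2).
frames = [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
cov = covariance_matrix(frames)
variance, pc1 = leading_eigenpair(cov)
```

Here `pc1` is a single fixed direction for the whole data set, which is the property the time-dependent variant relaxes by letting the eigenvectors evolve in time.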
Affiliation(s)
- Tetsuya Morishita
- Research Center for Computational Design of Advanced Functional Materials (CD-FMat), National Institute of Advanced Industrial Science and Technology (AIST), Central 2, 1-1-1 Umezono, Tsukuba 305-8568, Japan and Mathematics for Advanced Materials Open Innovation Laboratory (MathAM-OIL), National Institute of Advanced Industrial Science and Technology (AIST), c/o AIMR, Tohoku University, 2-1-1 Katahira, Aoba-ku, Sendai 980-8577, Japan
| |
|
18
|
Musil F, Grisafi A, Bartók AP, Ortner C, Csányi G, Ceriotti M. Physics-Inspired Structural Representations for Molecules and Materials. Chem Rev 2021; 121:9759-9815. [PMID: 34310133 DOI: 10.1021/acs.chemrev.1c00021] [Citation(s) in RCA: 145] [Impact Index Per Article: 48.3] [Indexed: 12/11/2022]
Abstract
The first step in the construction of a regression model or a data-driven analysis, aiming to predict or elucidate the relationship between the atomic-scale structure of matter and its properties, involves transforming the Cartesian coordinates of the atoms into a suitable representation. The development of atomic-scale representations has played, and continues to play, a central role in the success of machine-learning methods for chemistry and materials science. This review summarizes the current understanding of the nature and characteristics of the most commonly used structural and chemical descriptions of atomistic structures, highlighting the deep underlying connections between different frameworks and the ideas that lead to computationally efficient and universally applicable models. It emphasizes the link between properties, structures, their physical chemistry, and their mathematical description, provides examples of recent applications to a diverse set of chemical and materials science problems, and outlines the open questions and the most promising research directions in the field.
Affiliation(s)
- Felix Musil
- Laboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland; National Centre for Computational Design and Discovery of Novel Materials (MARVEL), École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
| | - Andrea Grisafi
- Laboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
| | - Albert P Bartók
- Department of Physics and Warwick Centre for Predictive Modelling, School of Engineering, University of Warwick, Coventry CV4 7AL, United Kingdom
| | - Christoph Ortner
- University of British Columbia, Vancouver, British Columbia V6T 1Z2, Canada
| | - Gábor Csányi
- Engineering Laboratory, University of Cambridge, Trumpington Street, Cambridge CB2 1PZ, United Kingdom
| | - Michele Ceriotti
- Laboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland; National Centre for Computational Design and Discovery of Novel Materials (MARVEL), École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
| |
|
19
|
Lattice Light-Sheet Microscopy Multi-dimensional Analyses (LaMDA) of T-Cell Receptor Dynamics Predict T-Cell Signaling States. Cell Syst 2021; 10:433-444.e5. [PMID: 32437685 DOI: 10.1016/j.cels.2020.04.006] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 10/18/2019] [Revised: 02/29/2020] [Accepted: 04/21/2020] [Indexed: 12/19/2022]
Abstract
Lattice light-sheet microscopy provides large amounts of high-dimensional, high-spatiotemporal resolution imaging data of cell surface receptors across the 3D surface of live cells, but user-friendly analysis pipelines are lacking. Here, we introduce lattice light-sheet microscopy multi-dimensional analyses (LaMDA), an end-to-end pipeline comprised of publicly available software packages that combines machine learning, dimensionality reduction, and diffusion maps to analyze surface receptor dynamics and classify cellular signaling states without the need for complex biochemical measurements or other prior information. We use LaMDA to analyze images of T-cell receptor (TCR) microclusters on the surface of live primary T cells under resting and stimulated conditions. We observe global spatial and temporal changes of TCRs across the 3D cell surface, accurately differentiate stimulated cells from unstimulated cells, precisely predict attenuated T-cell signaling after CD4 and CD28 receptor blockades, and reliably discriminate between structurally similar TCR ligands. All instructions needed to implement LaMDA are included in this paper.
|
20
|
Dhabal D, Jiang Z, Pallath A, Patel AJ. Characterizing the Interplay between Polymer Solvation and Conformation. J Phys Chem B 2021; 125:5434-5442. [PMID: 33978411 DOI: 10.1021/acs.jpcb.1c02191] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 11/29/2022]
Abstract
Conformational transitions of flexible molecules, especially those driven by hydrophobic effects, tend to be hindered by desolvation barriers. For such transitions, it is thus important to characterize and understand the interplay between solvation and conformation. Using specialized molecular simulations, here we perform such a characterization for a hydrophobic polymer solvated in water. We find that an external potential, which unfavorably perturbs the polymer hydration waters, can trigger a coil-to-globule or collapse transition, and that the relative stabilities of the collapsed and extended states can be quantified by the strength of the requisite potential. Our results also provide mechanistic insights into the collapse transition, highlighting that the bottleneck to polymer collapse is the formation of a sufficiently large cluster, and the collective dewetting of such a cluster. We also study the collapse of the hydrophobic polymer in octane, a nonpolar solvent, and interestingly, we find that the mechanistic details of the transition are qualitatively similar to that in water.
Affiliation(s)
- Debdas Dhabal
- Department of Chemical and Biomolecular Engineering, University of Pennsylvania, Philadelphia, Pennsylvania 19104, United States
| | - Zhitong Jiang
- Department of Chemical and Biomolecular Engineering, University of Pennsylvania, Philadelphia, Pennsylvania 19104, United States
| | - Akash Pallath
- Department of Chemical and Biomolecular Engineering, University of Pennsylvania, Philadelphia, Pennsylvania 19104, United States
| | - Amish J Patel
- Department of Chemical and Biomolecular Engineering, University of Pennsylvania, Philadelphia, Pennsylvania 19104, United States
| |
|
21
|
Tsai ST, Tiwary P. On the distance between A and B in molecular configuration space. Molecular Simulation 2021. [DOI: 10.1080/08927022.2020.1761548] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/24/2022]
Affiliation(s)
- Sun-Ting Tsai
- Department of Physics and Institute for Physical Science and Technology, University of Maryland, College Park, MD, USA
| | - Pratyush Tiwary
- Department of Chemistry and Biochemistry and Institute for Physical Science and Technology, University of Maryland, College Park, MD, USA
| |
|
22
|
Cruz-Chú ER, Hosseinizadeh A, Mashayekhi G, Fung R, Ourmazd A, Schwander P. Selecting XFEL single-particle snapshots by geometric machine learning. Structural Dynamics (Melville, N.Y.) 2021; 8:014701. [PMID: 33644252 PMCID: PMC7902084 DOI: 10.1063/4.0000060] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/04/2020] [Accepted: 01/21/2021] [Indexed: 05/05/2023]
Abstract
A promising new route for structural biology is single-particle imaging with an X-ray Free-Electron Laser (XFEL). This method has the advantage that the samples do not require crystallization and can be examined at room temperature. However, high-resolution structures can only be obtained from a sufficiently large number of diffraction patterns of individual molecules, so-called single particles. Here, we present a method that allows for efficient identification of single particles in very large XFEL datasets, operates at low signal levels, and is tolerant to background. This method uses supervised Geometric Machine Learning (GML) to extract low-dimensional feature vectors from a training dataset, fuse test datasets into the feature space of training datasets, and separate the data into binary distributions of "single particles" and "non-single particles." As a proof of principle, we tested simulated and experimental datasets of the Coliphage PR772 virus. We created a training dataset and classified three types of test datasets: First, a noise-free simulated test dataset, which gave near perfect separation. Second, simulated test datasets that were modified to reflect different levels of photon counts and background noise. These modified datasets were used to quantify the predictive limits of our approach. Third, an experimental dataset collected at the Stanford Linear Accelerator Center. The single-particle identification for this experimental dataset was compared with previously published results and it was found that GML covers a wide photon-count range, outperforming other single-particle identification methods. Moreover, a major advantage of GML is its ability to retrieve single particles in the presence of structural variability.
Affiliation(s)
- Eduardo R. Cruz-Chú
- Department of Physics, University of Wisconsin-Milwaukee, 3135 N. Maryland Ave, Milwaukee, Wisconsin 53211, USA
| | - Ahmad Hosseinizadeh
- Department of Physics, University of Wisconsin-Milwaukee, 3135 N. Maryland Ave, Milwaukee, Wisconsin 53211, USA
| | - Ghoncheh Mashayekhi
- Department of Physics, University of Wisconsin-Milwaukee, 3135 N. Maryland Ave, Milwaukee, Wisconsin 53211, USA
| | - Russell Fung
- Department of Physics, University of Wisconsin-Milwaukee, 3135 N. Maryland Ave, Milwaukee, Wisconsin 53211, USA
| | - Abbas Ourmazd
- Department of Physics, University of Wisconsin-Milwaukee, 3135 N. Maryland Ave, Milwaukee, Wisconsin 53211, USA
| | - Peter Schwander
- Department of Physics, University of Wisconsin-Milwaukee, 3135 N. Maryland Ave, Milwaukee, Wisconsin 53211, USA
| |
|
23
|
Topel M, Ferguson AL. Reconstruction of protein structures from single-molecule time series. J Chem Phys 2020; 153:194102. [DOI: 10.1063/5.0024732] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/14/2022] Open
Affiliation(s)
- Maximilian Topel
- Pritzker School of Molecular Engineering, University of Chicago, Chicago, Illinois 60637, USA
| | - Andrew L. Ferguson
- Pritzker School of Molecular Engineering, University of Chicago, Chicago, Illinois 60637, USA
| |
|
24
|
Ford DM, Dendukuri A, Kalyoncu G, Luu K, Patitz MJ. Machine learning to identify variables in thermodynamically small systems. Comput Chem Eng 2020. [DOI: 10.1016/j.compchemeng.2020.106989] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/25/2022]
|
25
|
Chiappini M, Patti A, Dijkstra M. Helicoidal dynamics of biaxial curved rods in twist-bend nematic phases unveiled by unsupervised machine learning techniques. Phys Rev E 2020; 102:040601. [PMID: 33212681 DOI: 10.1103/physreve.102.040601] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 04/14/2020] [Accepted: 09/08/2020] [Indexed: 06/11/2023]
Abstract
Uniaxial rods in a nematic phase diffuse preferentially in the direction parallel to the nematic director $\hat{n}$. The nematic director field $\hat{n}(r)$ of a chiral twist-bend nematic ($N_{TB}$) phase of achiral banana-shaped particles, recently discovered experimentally, displays a heliconical twist of given handedness and periodicity. Using simulations, we investigate the long-time macroscopic diffusion in $N_{TB}$ phases, and find that the predilection of curved rods to diffuse in the direction of the twisting $\hat{n}(r)$ yields a fascinating chiral dynamics along helices, even though achiral curved rods display Brownian motion with a nontrivial rototranslational coupling. We devise a machine learning protocol to characterize the helicoidal particle trajectories, finding that their pitch and radius are determined by the pitch and conical angle of the $N_{TB}$ phase, thereby connecting its structural and dynamical properties.
Affiliation(s)
- Massimiliano Chiappini
- Department of Physics, Soft Condensed Matter, Debye Institute for Nanomaterials Science, Utrecht University, Princetonplein 1, Utrecht 3584 CC, The Netherlands
| | - Alessandro Patti
- Department of Chemical Engineering and Analytical Science, The University of Manchester, Manchester M13 9PL, United Kingdom
| | - Marjolein Dijkstra
- Department of Physics, Soft Condensed Matter, Debye Institute for Nanomaterials Science, Utrecht University, Princetonplein 1, Utrecht 3584 CC, The Netherlands
| |
|
26
|
Xie WJ, Qi Y, Zhang B. Characterizing chromatin folding coordinate and landscape with deep learning. PLoS Comput Biol 2020; 16:e1008262. [PMID: 32986691 PMCID: PMC7544120 DOI: 10.1371/journal.pcbi.1008262] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 05/04/2020] [Revised: 10/08/2020] [Accepted: 08/14/2020] [Indexed: 12/13/2022] Open
Abstract
Genome organization is critical for setting up the spatial environment of gene transcription, and substantial progress has been made towards its high-resolution characterization. The underlying molecular mechanism for its establishment is much less understood. We applied a deep-learning approach, variational autoencoder (VAE), to analyze the fluctuation and heterogeneity of chromatin structures revealed by single-cell imaging and to identify a reaction coordinate for chromatin folding. This coordinate connects the seemingly random structures observed in individual cohesin-depleted cells as intermediate states along a folding pathway that leads to the formation of topologically associating domains (TAD). We showed that folding into wild-type-like structures remains energetically favorable in cohesin-depleted cells, potentially as a result of the phase separation between the two chromatin segments with active and repressive histone marks. The energetic stabilization, however, is not strong enough to overcome the entropic penalty, leading to the formation of only partially folded structures and the disappearance of TADs from contact maps upon averaging. Our study suggests that machine learning techniques, when combined with rigorous statistical mechanical analysis, are powerful tools for analyzing structural ensembles of chromatin.
Affiliation(s)
- Wen Jun Xie
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
| | - Yifeng Qi
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
| | - Bin Zhang
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
| |
|
27
|
Bernetti M, Bertazzo M, Masetti M. Data-Driven Molecular Dynamics: A Multifaceted Challenge. Pharmaceuticals (Basel) 2020; 13:E253. [PMID: 32961909 PMCID: PMC7557855 DOI: 10.3390/ph13090253] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 08/25/2020] [Revised: 09/14/2020] [Accepted: 09/16/2020] [Indexed: 12/18/2022] Open
Abstract
The big data concept is currently revolutionizing several fields of science including drug discovery and development. While opening up new perspectives for better drug design and related strategies, big data analysis strongly challenges our current ability to manage and exploit an extraordinarily large and possibly diverse amount of information. The recent renewal of machine learning (ML)-based algorithms is key in providing the proper framework for addressing this issue. In this respect, the impact on the exploitation of molecular dynamics (MD) simulations, which have recently reached mainstream status in computational drug discovery, can be remarkable. Here, we review the recent progress in the use of ML methods coupled to biomolecular simulations with potentially relevant implications for drug design. Specifically, we show how different ML-based strategies can be applied to the outcome of MD simulations for gaining knowledge and enhancing sampling. Finally, we discuss how intrinsic limitations of MD in accurately modeling biomolecular systems can be alleviated by including information coming from experimental data.
Affiliation(s)
- Mattia Bernetti
- Scuola Internazionale Superiore di Studi Avanzati (SISSA), via Bonomea 265, I-34136 Trieste, Italy
| | - Martina Bertazzo
- Computational Sciences, Istituto Italiano di Tecnologia, via Morego 30, I-16163 Genova, Italy
| | - Matteo Masetti
- Department of Pharmacy and Biotechnology, Alma Mater Studiorum—Università di Bologna, via Belmeloro 6, I-40126 Bologna, Italy
| |
|
28
|
Gkeka P, Stoltz G, Barati Farimani A, Belkacemi Z, Ceriotti M, Chodera JD, Dinner AR, Ferguson AL, Maillet JB, Minoux H, Peter C, Pietrucci F, Silveira A, Tkatchenko A, Trstanova Z, Wiewiora R, Lelièvre T. Machine Learning Force Fields and Coarse-Grained Variables in Molecular Dynamics: Application to Materials and Biological Systems. J Chem Theory Comput 2020; 16:4757-4775. [PMID: 32559068 PMCID: PMC8312194 DOI: 10.1021/acs.jctc.0c00355] [Citation(s) in RCA: 82] [Impact Index Per Article: 20.5] [Indexed: 12/21/2022]
Abstract
Machine learning encompasses tools and algorithms that are now becoming popular in almost all scientific and technological fields. This is true for molecular dynamics as well, where machine learning offers promises of extracting valuable information from the enormous amounts of data generated by simulation of complex systems. We provide here a review of our current understanding of goals, benefits, and limitations of machine learning techniques for computational studies on atomistic systems, focusing on the construction of empirical force fields from ab initio databases and the determination of reaction coordinates for free energy computation and enhanced sampling.
Affiliation(s)
- Paraskevi Gkeka
- Integrated Drug Discovery, Sanofi R&D, 91385 Chilly-Mazarin, France
| | - Gabriel Stoltz
- CERMICS, Ecole des Ponts, Marne-la-Vallée, France
- Matherials Project-Team, Inria Paris, 75012 Paris, France
| | | | - Zineb Belkacemi
- Integrated Drug Discovery, Sanofi R&D, 91385 Chilly-Mazarin, France
- CERMICS, Ecole des Ponts, Marne-la-Vallée, France
| | - Michele Ceriotti
- Laboratory of Computational Science and Modelling, Institute of Materials, École Polytechnique Fédérale de Lausanne, CH-1015 Lausanne, Switzerland
| | - John D Chodera
- Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, New York 10065, United States
| | - Aaron R Dinner
- Department of Chemistry, The University of Chicago, Chicago, Illinois 60637, United States
| | - Andrew L Ferguson
- Pritzker School of Molecular Engineering, University of Chicago, 5640 South Ellis Avenue, Chicago, Illinois 60637, United States
| | | | - Hervé Minoux
- Integrated Drug Discovery, Sanofi R&D, 94403 Vitry-sur-Seine, France
| | | | - Fabio Pietrucci
- UMR CNRS 7590, MNHN, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, Sorbonne Université, 75005 Paris, France
| | - Ana Silveira
- Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, New York 10065, United States
| | - Alexandre Tkatchenko
- Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
| | - Zofia Trstanova
- School of Mathematics, The University of Edinburgh, Edinburgh EH9 3FD, U.K
| | - Rafal Wiewiora
- Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, New York 10065, United States
| | - Tony Lelièvre
- CERMICS, Ecole des Ponts, Marne-la-Vallée, France
- Matherials Project-Team, Inria Paris, 75012 Paris, France
|
29
|
Spiwok V, Kříž P. Time-Lagged t-Distributed Stochastic Neighbor Embedding (t-SNE) of Molecular Simulation Trajectories. Front Mol Biosci 2020; 7:132. [PMID: 32714941 PMCID: PMC7344294 DOI: 10.3389/fmolb.2020.00132] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 03/06/2020] [Accepted: 06/03/2020] [Indexed: 11/30/2022] Open
Abstract
Molecular simulation trajectories represent high-dimensional data. Such data can be visualized by methods of dimensionality reduction. Non-linear dimensionality reduction methods are likely to be more efficient than linear ones due to the fact that motions of atoms are non-linear. Here we test a popular non-linear t-distributed Stochastic Neighbor Embedding (t-SNE) method on analysis of trajectories of 200 ns alanine dipeptide dynamics and 208 μs Trp-cage folding and unfolding. Furthermore, we introduce a time-lagged variant of t-SNE in order to focus on rarely occurring transitions in the molecular system. This time-lagged t-SNE efficiently separates states according to distance in time. Using this method it is possible to visualize key states of studied systems (e.g., unfolded and folded protein) as well as possible kinetic traps using a two-dimensional plot. Time-lagged t-SNE is a visualization method and other applications, such as clustering and free energy modeling, must be done with caution.
Affiliation(s)
- Vojtěch Spiwok
- Department of Biochemistry and Microbiology, University of Chemistry and Technology, Prague, Czechia
| | - Pavel Kříž
- Department of Mathematics, University of Chemistry and Technology, Prague, Czechia
|
30
|
Fabrizio A, Meyer B, Corminboeuf C. Machine learning models of the energy curvature vs particle number for optimal tuning of long-range corrected functionals. J Chem Phys 2020; 152:154103. [DOI: 10.1063/5.0005039] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 01/16/2023] Open
Affiliation(s)
- Alberto Fabrizio
- Laboratory for Computational Molecular Design, Institute of Chemical Sciences and Engineering, École Polytechnique Fédérale de Lausanne, CH-1015 Lausanne, Switzerland
- National Centre for Computational Design and Discovery of Novel Materials (MARVEL), École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
| | - Benjamin Meyer
- Laboratory for Computational Molecular Design, Institute of Chemical Sciences and Engineering, École Polytechnique Fédérale de Lausanne, CH-1015 Lausanne, Switzerland
- National Centre for Computational Design and Discovery of Novel Materials (MARVEL), École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
| | - Clemence Corminboeuf
- Laboratory for Computational Molecular Design, Institute of Chemical Sciences and Engineering, École Polytechnique Fédérale de Lausanne, CH-1015 Lausanne, Switzerland
- National Centre for Computational Design and Discovery of Novel Materials (MARVEL), École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
|
31
|
Sherman ZM, Howard MP, Lindquist BA, Jadrich RB, Truskett TM. Inverse methods for design of soft materials. J Chem Phys 2020; 152:140902. [DOI: 10.1063/1.5145177] [Citation(s) in RCA: 42] [Impact Index Per Article: 10.5] [Indexed: 02/02/2023] Open
Affiliation(s)
- Zachary M. Sherman
- McKetta Department of Chemical Engineering, University of Texas at Austin, Austin, Texas 78712, USA
| | - Michael P. Howard
- McKetta Department of Chemical Engineering, University of Texas at Austin, Austin, Texas 78712, USA
| | - Beth A. Lindquist
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, USA
| | - Ryan B. Jadrich
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, USA
- Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, USA
| | - Thomas M. Truskett
- McKetta Department of Chemical Engineering, University of Texas at Austin, Austin, Texas 78712, USA
- Department of Physics, University of Texas at Austin, Austin, Texas 78712, USA
|
32
|
Shmilovich K, Mansbach RA, Sidky H, Dunne OE, Panda SS, Tovar JD, Ferguson AL. Discovery of Self-Assembling π-Conjugated Peptides by Active Learning-Directed Coarse-Grained Molecular Simulation. J Phys Chem B 2020; 124:3873-3891. [PMID: 32180410 DOI: 10.1021/acs.jpcb.0c00708] [Citation(s) in RCA: 54] [Impact Index Per Article: 13.5] [Indexed: 12/16/2022]
Abstract
Electronically active organic molecules have demonstrated great promise as novel soft materials for energy harvesting and transport. Self-assembled nanoaggregates formed from π-conjugated oligopeptides composed of an aromatic core flanked by oligopeptide wings offer emergent optoelectronic properties within a water-soluble and biocompatible substrate. Nanoaggregate properties can be controlled by tuning core chemistry and peptide composition, but the sequence-structure-function relations remain poorly characterized. In this work, we employ coarse-grained molecular dynamics simulations within an active learning protocol employing deep representational learning and Bayesian optimization to efficiently identify molecules capable of assembling pseudo-1D nanoaggregates with good stacking of the electronically active π-cores. We consider the DXXX-OPV3-XXXD oligopeptide family, where D is an Asp residue and OPV3 is an oligophenylenevinylene oligomer (1,4-distyrylbenzene), to identify the top performing XXX tripeptides within all 20³ = 8000 possible sequences. By direct simulation of only 2.3% of this space, we identify molecules predicted to exhibit superior assembly relative to those reported in prior work. Spectral clustering of the top candidates reveals new design rules governing assembly. This work establishes new understanding of DXXX-OPV3-XXXD assembly, identifies promising new candidates for experimental testing, and presents a computational design platform that can be generically extended to other peptide-based and peptide-like systems.
Affiliation(s)
- Kirill Shmilovich
- Pritzker School of Molecular Engineering, University of Chicago, Chicago, Illinois 60637, United States
| | - Rachael A Mansbach
- Theoretical Biology and Biophysics Group, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
| | - Hythem Sidky
- Pritzker School of Molecular Engineering, University of Chicago, Chicago, Illinois 60637, United States
| | - Olivia E Dunne
- Pritzker School of Molecular Engineering, University of Chicago, Chicago, Illinois 60637, United States
| | - Sayak Subhra Panda
- Department of Chemistry, Johns Hopkins University, Baltimore, Maryland 21218, United States; Institute of NanoBioTechnology, Johns Hopkins University, Baltimore, Maryland 21218, United States
| | - John D Tovar
- Department of Chemistry, Johns Hopkins University, Baltimore, Maryland 21218, United States; Institute of NanoBioTechnology, Johns Hopkins University, Baltimore, Maryland 21218, United States; Department of Materials Science and Engineering, Johns Hopkins University, Baltimore, Maryland 21218, United States
| | - Andrew L Ferguson
- Pritzker School of Molecular Engineering, University of Chicago, Chicago, Illinois 60637, United States
|
33
|
Sidky H, Chen W, Ferguson AL. Machine learning for collective variable discovery and enhanced sampling in biomolecular simulation. Mol Phys 2020. [DOI: 10.1080/00268976.2020.1737742] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Indexed: 12/13/2022]
Affiliation(s)
- Hythem Sidky
- Pritzker School of Molecular Engineering, University of Chicago, Chicago, IL, USA
| | - Wei Chen
- Department of Physics, University of Illinois at Urbana-Champaign, Urbana, IL, USA
| | - Andrew L. Ferguson
- Pritzker School of Molecular Engineering, University of Chicago, Chicago, IL, USA
|
34
|
Bejagam KK, Singh SK, Ahn R, Deshmukh SA. Unraveling the Conformations of Backbone and Side Chains in Thermosensitive Bottlebrush Polymers. Macromolecules 2019. [DOI: 10.1021/acs.macromol.9b01021] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Indexed: 12/13/2022]
Affiliation(s)
- Karteek K. Bejagam
- Department of Chemical Engineering, Virginia Tech, Blacksburg, Virginia 24061, United States
| | | | - Rebecca Ahn
- Department of Chemical Engineering, Virginia Tech, Blacksburg, Virginia 24061, United States
| | - Sanket A. Deshmukh
- Department of Chemical Engineering, Virginia Tech, Blacksburg, Virginia 24061, United States
|
35
|
Ma Y, Ferguson AL. Inverse design of self-assembling colloidal crystals with omnidirectional photonic bandgaps. Soft Matter 2019; 15:8808-8826. [PMID: 31603182 DOI: 10.1039/c9sm01500k] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 06/10/2023]
Abstract
Open colloidal lattices possessing omnidirectional photonic bandgaps in the visible or near-visible regime are attractive optical materials the realization of which has remained elusive. We report the use of an inverse design strategy termed landscape engineering that rationally sculpts the free energy self-assembly landscape using evolutionary algorithms to discover anisotropic patchy colloids capable of spontaneously assembling pyrochlore and cubic diamond lattices possessing complete photonic bandgaps. We validate the designs in computer simulations to demonstrate the defect-free formation of these lattices via a two-stage hierarchical assembly mechanism. Our approach demonstrates a principled strategy for the inverse design of self-assembling colloids for the bottom-up fabrication of desired crystal lattices.
Affiliation(s)
- Yutao Ma
- Pritzker School of Molecular Engineering, University of Chicago, 5640 South Ellis Avenue, Chicago, IL 60637, USA.
| | - Andrew L Ferguson
- Pritzker School of Molecular Engineering, University of Chicago, 5640 South Ellis Avenue, Chicago, IL 60637, USA.
|
36
|
Li C, Dammak H, Dezanneau G. Identification of oxygen diffusion mechanisms in Nd1-xAExBaInO4-x/2 (AE = Ca, Sr, Ba) compounds through molecular dynamics. Phys Chem Chem Phys 2019; 21:21506-21516. [PMID: 31535110 DOI: 10.1039/c9cp03048d] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 11/21/2022]
Abstract
Molecular dynamics simulations have been widely adopted to study oxygen ion diffusion mechanisms in materials for application in solid oxide fuel cells. Indeed, understanding the fundamental aspects of oxygen diffusion is important to develop new materials for this application. In this work, Nd1-xAExBaInO4-x/2 (AE = Ca, Sr, Ba) compounds have been studied by MD simulations focusing on oxygen diffusion mechanisms. Two general clustering methods were used, namely a convex hull classification method and a DBSCAN machine learning algorithm, to identify oxygen ion diffusion pathways. Here, relevant details are provided for an efficient use of these two approaches during MD analysis of ion conductors. The calculations show that Ca is the most favorable dopant for substituting Nd in NdBaInO4, while Ba is the least desired. Indeed, the substitution of Nd by Ca hardly changes the pristine lattice parameters of NdBaInO4 and leads to the highest oxygen diffusion coefficient compared to other dopants. The oxygen vacancies induced by doping mainly locate on two specific oxygen sites over four oxygen sites available. Concerning the diffusion process, jumps involving these two sites play the main role and are associated with smaller migration enthalpies. For the main diffusion path, ions migrate along the b (2 routes) and c (4 routes) directions. Some other oxygen sites can be considered as barriers for the diffusion process inducing a strong anisotropy in the diffusion process. Additionally, the residence time analysis of oxygen ions confirms that ions at different sites have different motion abilities. As a whole, the approach presented here can be extrapolated to other ion conductors for gaining detailed information about the diffusion process.
Affiliation(s)
- Chenyi Li
- Laboratoire Structures, Propriétés et Modélisation des Solides, UMR 8580 CNRS, CentraleSupélec, Université Paris-Saclay, 91190, Gif-sur-Yvette, France.
|
37
|
Eberhardt J, Stote RH, Dejaegere A. Unrolr: Structural analysis of protein conformations using stochastic proximity embedding. J Comput Chem 2019; 39:2551-2557. [PMID: 30447084 DOI: 10.1002/jcc.25599] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 05/21/2018] [Revised: 08/24/2018] [Accepted: 08/24/2018] [Indexed: 01/29/2023]
Abstract
Molecular dynamics (MD) simulations are widely used to explore the conformational space of biological macromolecules. Advances in hardware, as well as in methods, make the generation of large and complex MD datasets much more common. Although different clustering and dimensionality reduction methods have been applied to MD simulations, there remains a need for improved strategies that handle nonlinear data and/or can be applied to very large datasets. We present an original implementation of the pivot-based version of the stochastic proximity embedding method aimed at large MD datasets using the dihedral distance as a metric. The advantages of the algorithm in terms of data storage and computational efficiency are presented, as well as the implementation realized. Application and testing through the analysis of a 200 ns accelerated MD simulation of a 35-residue villin headpiece is discussed. Analysis of the simulation shows the promise of this method to organize large conformational ensembles. © 2018 Wiley Periodicals, Inc.
Affiliation(s)
- Jérôme Eberhardt
- Biologie structurale intégrative Institut de Génétique et de Biologie Moléculaire et Cellulaire (IGBMC), Institut National de La Santé et de La Recherche Médicale (INSERM), U1258/Centre National de Recherche Scientifique (CNRS), UMR7104/Université de Strasbourg, Illkirch, France
| | - Roland H Stote
- Biologie structurale intégrative Institut de Génétique et de Biologie Moléculaire et Cellulaire (IGBMC), Institut National de La Santé et de La Recherche Médicale (INSERM), U1258/Centre National de Recherche Scientifique (CNRS), UMR7104/Université de Strasbourg, Illkirch, France
| | - Annick Dejaegere
- Biologie structurale intégrative Institut de Génétique et de Biologie Moléculaire et Cellulaire (IGBMC), Institut National de La Santé et de La Recherche Médicale (INSERM), U1258/Centre National de Recherche Scientifique (CNRS), UMR7104/Université de Strasbourg, Illkirch, France
|
38
|
Tan Q, Duan M, Li M, Han L, Huo S. Approximating dynamic proximity with a hybrid geometry energy-based kernel for diffusion maps. J Chem Phys 2019; 151:105101. [PMID: 31521094 DOI: 10.1063/1.5100968] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/14/2022] Open
Abstract
The diffusion map is a dimensionality reduction method. The reduction coordinates are associated with the leading eigenfunctions of the backward Fokker-Planck operator, providing a dynamic meaning for these coordinates. One of the key factors that affect the accuracy of diffusion map embedding is the dynamic measure implemented in the Gaussian kernel. A common practice in diffusion map study of molecular systems is to approximate dynamic proximity with RMSD (root-mean-square deviation). In this paper, we present a hybrid geometry-energy based kernel. Since high energy-barriers may exist between geometrically similar conformations, taking both RMSD and energy difference into account in the kernel can better describe conformational transitions between neighboring conformations and lead to accurate embedding. We applied our diffusion map method to the β-hairpin of the B1 domain of streptococcal protein G and to Trp-cage. Our results in β-hairpin show that the diffusion map embedding achieves better results with the hybrid kernel than that with the RMSD-based kernel in terms of free energy landscape characterization and a new correlation measure between the cluster center Euclidean distances in the reduced-dimension space and the reciprocals of the total net flow between these clusters. In addition, our diffusion map analysis of the ultralong molecular dynamics trajectory of Trp-cage has provided a unified view of its folding mechanism. These promising results demonstrate the effectiveness of our diffusion map approach in the analysis of the dynamics and thermodynamics of molecular systems. The hybrid geometry-energy criterion could be also useful as a general dynamic measure for other purposes.
Affiliation(s)
- Qingzhe Tan
- Gustaf H. Carlson School of Chemistry and Biochemistry, Clark University, 950 Main Street, Worcester, Massachusetts 01610, USA
| | - Mojie Duan
- Gustaf H. Carlson School of Chemistry and Biochemistry, Clark University, 950 Main Street, Worcester, Massachusetts 01610, USA
| | - Minghai Li
- Gustaf H. Carlson School of Chemistry and Biochemistry, Clark University, 950 Main Street, Worcester, Massachusetts 01610, USA
| | - Li Han
- Department of Math and Computer Science, Clark University, Worcester, Massachusetts 01610, USA
| | - Shuanghong Huo
- Gustaf H. Carlson School of Chemistry and Biochemistry, Clark University, 950 Main Street, Worcester, Massachusetts 01610, USA
|
39
|
Thiede EH, Giannakis D, Dinner AR, Weare J. Galerkin approximation of dynamical quantities using trajectory data. J Chem Phys 2019; 150:244111. [PMID: 31255053 PMCID: PMC6824902 DOI: 10.1063/1.5063730] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Received: 09/30/2018] [Accepted: 05/13/2019] [Indexed: 11/14/2022] Open
Abstract
Understanding chemical mechanisms requires estimating dynamical statistics such as expected hitting times, reaction rates, and committors. Here, we present a general framework for calculating these dynamical quantities by approximating boundary value problems using dynamical operators with a Galerkin expansion. A specific choice of basis set in the expansion corresponds to the estimation of dynamical quantities using a Markov state model. More generally, the boundary conditions impose restrictions on the choice of basis sets. We demonstrate how an alternative basis can be constructed using ideas from diffusion maps. In our numerical experiments, this basis gives results of comparable or better accuracy to Markov state models. Additionally, we show that delay embedding can reduce the information lost when projecting the system's dynamics for model construction; this improves estimates of dynamical statistics considerably over the standard practice of increasing the lag time.
Affiliation(s)
- Erik H Thiede
- Department of Chemistry and James Franck Institute, The University of Chicago, Chicago, Illinois 60637, USA
| | - Dimitrios Giannakis
- Courant Institute of Mathematical Sciences, New York University, New York, New York 10012, USA
| | - Aaron R Dinner
- Department of Chemistry and James Franck Institute, The University of Chicago, Chicago, Illinois 60637, USA
| | - Jonathan Weare
- Courant Institute of Mathematical Sciences, New York University, New York, New York 10012, USA
|
40
|
Xu X, Wei Q, Li H, Wang Y, Chen Y, Jiang Y. Recognition of polymer configurations by unsupervised learning. Phys Rev E 2019; 99:043307. [PMID: 31108670 DOI: 10.1103/physreve.99.043307] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 01/31/2019] [Indexed: 12/30/2022]
Abstract
Unsupervised learning as an important branch of machine learning is commonly adopted to discover patterns, with the purpose of conducting data clustering without being labeled in advance. In this study, we elucidate the striking ability of unsupervised learning techniques in exploring the phase transitions of polymer configurations. In order to extract the low-dimensional representation of polymer configurations, principal component analysis and diffusion map are applied to distinguish the coiled state and collapsed states and further detect the delicate distinction among collapsed states, respectively. These dimensionality reduction techniques not only identify the distinct states in the feature space, but also offer significant insights to understand the relation between salient features and order parameters in physics. In addition, a hybrid neural network scheme combining the supervised learning and unsupervised learning is utilized to precisely detect the critical point of phase transition between polymer configurations. Our study demonstrates a promising strategy based on the unsupervised learning, particularly in the exploration of phase transition in polymeric systems.
Affiliation(s)
- Xin Xu
- School of Chemistry & Key Laboratory of Bio-Inspired Smart Interfacial Science and Technology of Ministry of Education & Center of Soft Matter Physics and Its Applications, Beihang University, Beijing 100191, China
| | - Qianshi Wei
- School of Chemistry & Key Laboratory of Bio-Inspired Smart Interfacial Science and Technology of Ministry of Education & Center of Soft Matter Physics and Its Applications, Beihang University, Beijing 100191, China; Department of Physics and Astronomy, University of Waterloo, Waterloo, Ontario, Canada N2L 3G1
| | - Huaping Li
- School of Chemistry & Key Laboratory of Bio-Inspired Smart Interfacial Science and Technology of Ministry of Education & Center of Soft Matter Physics and Its Applications, Beihang University, Beijing 100191, China
| | - Yuzhang Wang
- School of Chemistry & Key Laboratory of Bio-Inspired Smart Interfacial Science and Technology of Ministry of Education & Center of Soft Matter Physics and Its Applications, Beihang University, Beijing 100191, China
| | - Yuguo Chen
- School of Chemistry & Key Laboratory of Bio-Inspired Smart Interfacial Science and Technology of Ministry of Education & Center of Soft Matter Physics and Its Applications, Beihang University, Beijing 100191, China
| | - Ying Jiang
- School of Chemistry & Key Laboratory of Bio-Inspired Smart Interfacial Science and Technology of Ministry of Education & Center of Soft Matter Physics and Its Applications, Beihang University, Beijing 100191, China; Beijing Advanced Innovation Center for Biomedical Engineering, Beihang University, Beijing 100191, China
|
41
|
Rosenberger D, van der Vegt NFA. Relative entropy indicates an ideal concentration for structure-based coarse graining of binary mixtures. Phys Rev E 2019; 99:053308. [PMID: 31212527 DOI: 10.1103/physreve.99.053308] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 01/31/2019] [Indexed: 06/09/2023]
Abstract
Many methodological approaches have been proposed to improve systematic or bottom-up coarse-graining techniques to enhance the representability and transferability of the derived interaction potentials. Transferability describes the ability of a coarse-grained (CG) model to be predictive, i.e., to describe a system at state points different from those chosen for parametrization, whereas representability characterizes the accuracy with which a CG model reproduces target properties of the underlying reference or fine-grained model at a given state point. In this article, we shift the focus away from methodological aspects and rather raise the question whether we can overcome the disadvantages of a given method in terms of representability and transferability by systematically selecting the state point at which the CG model gets parametrized. We answer this question by applying the inverse Monte Carlo (IMC) approach, a structure-based coarse-graining method, to derive effective interactions for binary mixtures of simple Lennard-Jones (LJ) particles, which are different in size. For such simple systems we indeed can identify a concentration where the derived potentials show the best performance in terms of structural representability and transferability. This specific concentration is identified by computing the relative entropy which quantifies the information loss between different IMC models and the reference LJ model at varying mixture compositions. Further, we show that an IMC model for mixtures of n-hexane and n-perfluorohexane shows the same trend in transferability as the IMC models for the LJ system. All derived models are more transferable in the direction of increasing concentration of the larger-sized compound.
Affiliation(s)
- David Rosenberger
- Eduard Zintl Institut für Anorganische und Physikalische Chemie, Technische Universität Darmstadt, Darmstadt, 64287, Germany
| | - Nico F A van der Vegt
- Eduard Zintl Institut für Anorganische und Physikalische Chemie, Technische Universität Darmstadt, Darmstadt, 64287, Germany
|
42
|
Jin J, Han Y, Voth GA. Coarse-graining involving virtual sites: Centers of symmetry coarse-graining. J Chem Phys 2019; 150:154103. [DOI: 10.1063/1.5067274] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Indexed: 12/29/2022] Open
Affiliation(s)
- Jaehyeok Jin
- Department of Chemistry, James Franck Institute, and Institute for Biophysical Dynamics, The University of Chicago, Chicago, Illinois 60637, USA
| | - Yining Han
- Department of Chemistry, James Franck Institute, and Institute for Biophysical Dynamics, The University of Chicago, Chicago, Illinois 60637, USA
| | - Gregory A. Voth
- Department of Chemistry, James Franck Institute, and Institute for Biophysical Dynamics, The University of Chicago, Chicago, Illinois 60637, USA
|
43
|
Ceriotti M. Unsupervised machine learning in atomistic simulations, between predictions and understanding. J Chem Phys 2019; 150:150901. [PMID: 31005087 DOI: 10.1063/1.5091842] [Citation(s) in RCA: 82] [Impact Index Per Article: 16.4] [Indexed: 12/22/2022] Open
Abstract
Automated analyses of the outcome of a simulation have been an important part of atomistic modeling since the early days, addressing the need of linking the behavior of individual atoms and the collective properties that are usually the final quantity of interest. Methods such as clustering and dimensionality reduction have been used to provide a simplified, coarse-grained representation of the structure and dynamics of complex systems from proteins to nanoparticles. In recent years, the rise of machine learning has led to an even more widespread use of these algorithms in atomistic modeling and to consider different classification and inference techniques as part of a coherent toolbox of data-driven approaches. This perspective briefly reviews some of the unsupervised machine-learning methods (those geared toward classification and coarse-graining of molecular simulations), seen in relation to the fundamental mathematical concepts that underlie all machine-learning techniques. It discusses the importance of using concise yet complete representations of atomic structures as the starting point of the analyses and highlights the risk of introducing preconceived biases when using machine learning to rationalize and understand structure-property relations. Supervised machine-learning techniques that explicitly attempt to predict the properties of a material given its structure are less susceptible to such biases. Current developments in the field suggest that using these two classes of approaches side-by-side and in a fully integrated mode, while keeping in mind the relations between the data analysis framework and the fundamental physical principles, will be key to realizing the full potential of machine learning to help understand the behavior of complex molecules and materials.
Affiliation(s)
- Michele Ceriotti
- Laboratory of Computational Science and Modeling, Institute des Materiaux, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
|
44
|
Dixit PD. Introducing User-Prescribed Constraints in Markov Chains for Nonlinear Dimensionality Reduction. Neural Comput 2019; 31:980-997. [PMID: 30883279 DOI: 10.1162/neco_a_01184] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 12/17/2022]
Abstract
Stochastic kernel-based dimensionality-reduction approaches have become popular in the past decade. The central component of many of these methods is a symmetric kernel that quantifies the vicinity between pairs of data points and a kernel-induced Markov chain on the data. Typically, the Markov chain is fully specified by the kernel through row normalization. However, in many cases, it is desirable to impose user-specified stationary-state and dynamical constraints on the Markov chain. Unfortunately, no systematic framework exists to impose such user-defined constraints. Here, based on our previous work on inference of Markov models, we introduce a path entropy maximization based approach to derive the transition probabilities of Markov chains using a kernel and additional user-specified constraints. We illustrate the usefulness of these Markov chains with examples.
Affiliation(s)
- Purushottam D Dixit
- Department of Systems Biology, Columbia University, New York, NY 10032, U.S.A.
|
45
|
Abstract
This chapter discusses the way in which dimensionality reduction algorithms such as diffusion maps and sketch-map can be used to analyze molecular dynamics trajectories. The first part discusses how these various algorithms function as well as practical issues such as landmark selection and how these algorithms can be used when the data to be analyzed comes from enhanced sampling trajectories. In the later part a comparison between the results obtained by applying various algorithms to two sets of sample data is performed and discussed. This section is then followed by a summary of how one algorithm in particular, sketch-map, has been applied to a range of problems. The chapter concludes with a discussion on the directions that we believe this field is currently moving.
|
46
|
Wang J, Ferguson AL. Recovery of Protein Folding Funnels from Single-Molecule Time Series by Delay Embeddings and Manifold Learning. J Phys Chem B 2018; 122:11931-11952. [PMID: 30428261 DOI: 10.1021/acs.jpcb.8b08800] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 11/30/2022]
Abstract
The stability and folding of proteins are governed by the underlying single-molecule free energy surface (smFES) mapping the free energy of the molecule as a function of configurational state. Ascertaining the smFES is of great value in understanding and engineering protein structure and function. By integrating tools from dynamical systems theory and nonlinear manifold learning, we describe an approach to reconstruct the multidimensional smFES for a protein from a time series in a single experimentally measurable observable. We employ Takens' delay embeddings to project the time series into a high-dimensional space in which the projected dynamics are C1-equivalent to the true system dynamics and employ diffusion maps to recover a low-dimensional reconstruction of the smFES that is equivalent to the true smFES up to a smooth and invertible transformation. We validate the approach in molecular dynamics simulations of Trp-cage, Villin, and BBA to demonstrate that landscapes recovered from univariate time series in the head-to-tail distance are topologically identical (they precisely preserve the metastable states and folding pathways) and topographically approximate (the free energy barrier heights and well depths are approximately preserved) to the true landscapes determined from complete knowledge of all atomic coordinates. We go on to show that the reconstructed landscapes reliably predict temperature denaturation and identify point mutations and groups of mutations critical to folding. These results demonstrate that protein folding funnels can be reconstructed from experimentally measurable time series and used to understand and engineer folding.
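The delay-embedding step mentioned in this abstract can be illustrated with a short sketch that stacks lagged copies of a univariate time series into delay vectors. The function name `delay_embed` and the parameters `dim` and `lag` are hypothetical names chosen here; the embedding dimension and delay used in the paper are not stated in the abstract, and the subsequent diffusion map step is not shown.

```python
import numpy as np

def delay_embed(series, dim, lag):
    """Stack lagged copies of a 1D series into delay vectors of length `dim`.

    `dim` and `lag` are illustrative parameters, not values from the paper.
    """
    series = np.asarray(series, dtype=float)
    n_vectors = series.size - (dim - 1) * lag
    if n_vectors <= 0:
        raise ValueError("series is too short for this dim and lag")
    # Row t is [x(t), x(t + lag), ..., x(t + (dim - 1) * lag)].
    columns = [series[k * lag : k * lag + n_vectors] for k in range(dim)]
    return np.stack(columns, axis=1)

# Toy series standing in for a measured observable such as a head-to-tail distance.
t = np.arange(10.0)
embedded = delay_embed(t, dim=3, lag=2)
```

Each row of `embedded` is one point in the higher-dimensional delay space; a manifold learning method would then be applied to these rows.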
Affiliation(s)
- Jiang Wang
- Department of Physics, University of Illinois at Urbana-Champaign, 1110 West Green Street, Urbana, Illinois 61801, United States
- Andrew L Ferguson
- Institute for Molecular Engineering, University of Chicago, 5640 South Ellis Avenue, Chicago, Illinois 60637, United States
|
47
|
Bejagam KK, An Y, Singh S, Deshmukh SA. Machine-Learning Enabled New Insights into the Coil-to-Globule Transition of Thermosensitive Polymers Using a Coarse-Grained Model. J Phys Chem Lett 2018; 9:6480-6488. [PMID: 30372083 DOI: 10.1021/acs.jpclett.8b02956] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Indexed: 06/08/2023]
Abstract
We present a computational framework that integrates coarse-grained (CG) molecular dynamics (MD) simulations and a data-driven machine-learning (ML) method to gain insights into the conformations of polymers in solutions. We employ this framework to study the conformational transition of a model thermosensitive polymer, poly(N-isopropylacrylamide) (PNIPAM). Here, we have developed the first of its kind, a temperature-independent CG model of PNIPAM that can accurately predict its experimental lower critical solution temperature (LCST) while retaining the tacticity in the presence of an explicit water model. The CG model was extensively validated by performing CG MD simulations with different initial conformations, varying the radius of gyration of the chain, the chain length, and the angle between the adjacent monomers of the initial configuration of PNIPAM (total simulation time = 90 μs). Moreover, for the first time, we utilize the nonmetric multidimensional scaling (NMDS) method, a data-driven ML approach, to gain further insights into the mechanisms and pathways of this coil-to-globule transition by analyzing CG MD simulation trajectories. NMDS analysis provides entirely new insights and shows multiple metastable states of PNIPAM during its coil-to-globule transition above the LCST.
Affiliation(s)
- Karteek K Bejagam
- Department of Chemical Engineering, Virginia Tech, Blacksburg, Virginia 24061, United States
- Yaxin An
- Department of Chemical Engineering, Virginia Tech, Blacksburg, Virginia 24061, United States
- Samrendra Singh
- CNH Industrial, Burr Ridge, Illinois 60527, United States
- Sanket A Deshmukh
- Department of Chemical Engineering, Virginia Tech, Blacksburg, Virginia 24061, United States
|
48
|
Chen W, Ferguson AL. Molecular enhanced sampling with autoencoders: On-the-fly collective variable discovery and accelerated free energy landscape exploration. J Comput Chem 2018; 39:2079-2102. [PMID: 30368832 DOI: 10.1002/jcc.25520] [Citation(s) in RCA: 113] [Impact Index Per Article: 18.8] [Received: 11/20/2017] [Accepted: 06/14/2018] [Indexed: 01/08/2023]
Abstract
Macromolecular and biomolecular folding landscapes typically contain high free energy barriers that impede efficient sampling of configurational space by standard molecular dynamics simulation. Biased sampling can artificially drive the simulation along prespecified collective variables (CVs), but success depends critically on the availability of good CVs associated with the important collective dynamical motions. Nonlinear machine learning techniques can identify such CVs but typically do not furnish an explicit relationship with the atomic coordinates necessary to perform biased sampling. In this work, we employ auto-associative artificial neural networks ("autoencoders") to learn nonlinear CVs that are explicit and differentiable functions of the atomic coordinates. Our approach offers substantial speedups in exploration of configurational space, and is distinguished from existing approaches by its capacity to simultaneously discover and directly accelerate along data-driven CVs. We demonstrate the approach in simulations of alanine dipeptide and Trp-cage, and have developed an open-source and freely available implementation within OpenMM. © 2018 Wiley Periodicals, Inc.
Affiliation(s)
- Wei Chen
- Department of Physics, University of Illinois at Urbana-Champaign, 1110 West Green Street, Urbana, Illinois 61801
- Andrew L Ferguson
- Department of Physics, University of Illinois at Urbana-Champaign, 1110 West Green Street, Urbana, Illinois 61801; Department of Materials Science and Engineering, University of Illinois at Urbana-Champaign, 1304 W Green Street, Urbana, Illinois 61801; Department of Chemical and Biomolecular Engineering, University of Illinois at Urbana-Champaign, 600 South Mathews Avenue, Urbana, Illinois 61801
|
49
|
Chen W, Tan AR, Ferguson AL. Collective variable discovery and enhanced sampling using autoencoders: Innovations in network architecture and error function design. J Chem Phys 2018; 149:072312. [PMID: 30134681 DOI: 10.1063/1.5023804] [Citation(s) in RCA: 80] [Impact Index Per Article: 13.3] [Indexed: 01/01/2023] Open
Abstract
Auto-associative neural networks ("autoencoders") present a powerful nonlinear dimensionality reduction technique to mine data-driven collective variables from molecular simulation trajectories. This technique furnishes explicit and differentiable expressions for the nonlinear collective variables, making it ideally suited for integration with enhanced sampling techniques for accelerated exploration of configurational space. In this work, we describe a number of sophistications of the neural network architectures to improve and generalize the process of interleaved collective variable discovery and enhanced sampling. We employ circular network nodes to accommodate periodicities in the collective variables, hierarchical network architectures to rank-order the collective variables, and generalized encoder-decoder architectures to support bespoke error functions for network training to incorporate prior knowledge. We demonstrate our approach in blind collective variable discovery and enhanced sampling of the configurational free energy landscapes of alanine dipeptide and Trp-cage using an open-source plugin developed for the OpenMM molecular simulation package.
Affiliation(s)
- Wei Chen
- Department of Physics, University of Illinois at Urbana-Champaign, 1110 West Green Street, Urbana, Illinois 61801, USA
- Aik Rui Tan
- Department of Materials Science and Engineering, University of Illinois at Urbana-Champaign, 1304 West Green Street, Urbana, Illinois 61801, USA
- Andrew L Ferguson
- Department of Physics, University of Illinois at Urbana-Champaign, 1110 West Green Street, Urbana, Illinois 61801, USA
|
50
|
Cai Z, Zhang Y. Hydrophobicity-driven unfolding of Trp-cage encapsulated between graphene sheets. Colloids Surf B Biointerfaces 2018; 168:103-108. [PMID: 29627125 DOI: 10.1016/j.colsurfb.2018.03.039] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/25/2017] [Revised: 03/23/2018] [Accepted: 03/24/2018] [Indexed: 11/17/2022]
Abstract
Understanding the interaction between proteins and graphene not only helps elucidate the behaviors of proteins in confined geometries, but is also imperative to the development of a plethora of graphene-based biotechnologies, such as the graphene liquid cell transmission electron microscopy. To discuss the overall geometrical-thermal effects on proteins, we performed molecular dynamics simulations of hydrated Trp-cage miniprotein sandwiched between two graphene sheets and in the bulk environment at the temperatures below and above its unfolding temperature. The structural fluctuations of Trp-cage were characterized using the backbone root mean square displacement and the radius of gyration, from which the free energy landscape of Trp-cage was further constructed. We observed that at both temperatures the confined protein became adsorbed to the graphene surfaces and exhibited unfolded structures. Residue-specific analyses clearly showed the preference for the graphene to interact with the hydrophobic regions of Trp-cage. These results suggested that the conformation space accessible to the protein results from the competition between the thermodynamic driving forces and the geometrical restraints. While confinement usually tends to restrict the conformation of proteins by volume exclusion, it may also induce the unfolding of proteins by hydrophobic interactions.
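As an illustration of the kind of analysis this abstract describes, the sketch below computes a radius of gyration and estimates a two-dimensional free energy surface from samples of two order parameters. It assumes the common Boltzmann inversion of a histogram, F = -kT ln P; the abstract does not state how the landscape was actually constructed, and the function names, bin count, and `kT` value are hypothetical.

```python
import numpy as np

def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration of an (N, 3) coordinate array."""
    coords = np.asarray(coords, dtype=float)
    if masses is None:
        masses = np.ones(len(coords))  # assumption: equal masses by default
    masses = np.asarray(masses, dtype=float)
    center = np.average(coords, axis=0, weights=masses)
    sq_dist = np.sum((coords - center) ** 2, axis=1)
    return np.sqrt(np.average(sq_dist, weights=masses))

def free_energy_2d(x, y, bins=20, kT=1.0):
    """Estimate F = -kT ln P on a 2D grid of two order parameters.

    Assumes Boltzmann inversion of a histogram; empty bins get F = inf,
    and the surface is shifted so that its minimum is zero.
    """
    hist, x_edges, y_edges = np.histogram2d(x, y, bins=bins)
    prob = hist / hist.sum()
    with np.errstate(divide="ignore"):
        free_energy = -kT * np.log(prob)
    free_energy -= free_energy[np.isfinite(free_energy)].min()
    return free_energy, x_edges, y_edges

# Toy data standing in for per-frame RMSD and radius of gyration values.
rng = np.random.default_rng(1)
rmsd_samples = rng.normal(2.0, 0.5, size=5000)
rg_samples = rng.normal(7.0, 0.3, size=5000)
fes, rmsd_edges, rg_edges = free_energy_2d(rmsd_samples, rg_samples)
rg_two_points = radius_of_gyration([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
```

With real trajectory data, `rmsd_samples` and `rg_samples` would be replaced by the per-frame values of the two structural measures, and `kT` by the thermal energy at the simulation temperature.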
Affiliation(s)
- Zhikun Cai
- Department of Nuclear, Plasma, and Radiological Engineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA; Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Yang Zhang
- Department of Nuclear, Plasma, and Radiological Engineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA; Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA; Department of Materials Science and Engineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA; Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA; Center for Biophysics and Quantitative Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA.
|