1. Sawade K, Marx A, Peter C, Kukharenko O. Combining molecular dynamics simulations and scoring method to computationally model ubiquitylated linker histones in chromatosomes. PLoS Comput Biol 2023; 19:e1010531. [PMID: 37527265 PMCID: PMC10442151 DOI: 10.1371/journal.pcbi.1010531] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/01/2022] [Revised: 08/21/2023] [Accepted: 06/15/2023] [Indexed: 08/03/2023]
Abstract
The chromatin in eukaryotic cells plays a fundamental role in all processes during a cell's life cycle. This nucleoprotein is normally tightly packed but needs to be unpacked for expression and division. The linker histones are critical for such packaging processes and while most experimental and simulation works recognize their crucial importance, the focus is nearly always set on the nucleosome as the basic chromatin building block. Linker histones can undergo several modifications, but only a few studies on their ubiquitylation have been conducted. Mono-ubiquitylated linker histones (HUb), while poorly understood, are expected to influence DNA compaction. The size of ubiquitin and the globular domain of the linker histone are comparable and one would expect an increased disorder upon ubiquitylation of the linker histone. However, the formation of higher order chromatin is not hindered and ubiquitylation of the linker histone may even promote gene expression. Structural data on chromatosomes is rare and HUb has never been modeled in a chromatosome so far. Descriptions of the chromatin complex with HUb would greatly benefit from computational structural data. In this study we generate molecular dynamics simulation data for six differently linked HUb variants with the help of a sampling scheme tailored to drive the exploration of phase space. We identify conformational sub-states of the six HUb variants using the sketch-map algorithm for dimensionality reduction and iterative HDBSCAN for clustering on the excessively sampled, shallow free energy landscapes. We present a highly efficient geometric scoring method to identify sub-states of HUb that fit into the nucleosome. We predict HUb conformations inside a nucleosome using on-dyad and off-dyad chromatosome structures as reference and show that unbiased simulations of HUb produce significantly more fitting than non-fitting HUb conformations. A tetranucleosome array is used to show that ubiquitylation can even occur in chromatin without too many steric clashes.
Affiliation(s)
- Kevin Sawade
  - Department of Chemistry, University of Konstanz, Konstanz, Germany
- Andreas Marx
  - Department of Chemistry, University of Konstanz, Konstanz, Germany
- Christine Peter
  - Department of Chemistry, University of Konstanz, Konstanz, Germany
- Oleksandra Kukharenko
  - Department of Chemistry, University of Konstanz, Konstanz, Germany
  - Theory Department, Max-Planck Institute for Polymer Research, Mainz, Germany
2. Zheng LE, Barethiya S, Nordquist E, Chen J. Machine Learning Generation of Dynamic Protein Conformational Ensembles. Molecules 2023; 28:4047. [PMID: 37241789 PMCID: PMC10220786 DOI: 10.3390/molecules28104047] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/10/2023] [Revised: 05/04/2023] [Accepted: 05/09/2023] [Indexed: 05/28/2023]
Abstract
Machine learning has achieved remarkable success across a broad range of scientific and engineering disciplines, particularly its use for predicting native protein structures from sequence information alone. However, biomolecules are inherently dynamic, and there is a pressing need for accurate predictions of dynamic structural ensembles across multiple functional levels. These problems range from the relatively well-defined task of predicting conformational dynamics around the native state of a protein, which traditional molecular dynamics (MD) simulations are particularly adept at handling, to generating large-scale conformational transitions connecting distinct functional states of structured proteins or numerous marginally stable states within the dynamic ensembles of intrinsically disordered proteins. Machine learning has been increasingly applied to learn low-dimensional representations of protein conformational spaces, which can then be used to drive additional MD sampling or directly generate novel conformations. These methods promise to greatly reduce the computational cost of generating dynamic protein ensembles, compared to traditional MD simulations. In this review, we examine recent progress in machine learning approaches towards generative modeling of dynamic protein ensembles and emphasize the crucial importance of integrating advances in machine learning, structural data, and physical principles to achieve these ambitious goals.
Affiliation(s)
- Li-E Zheng
  - Department of Gynecology, The First Affiliated Hospital of Fujian Medical University, Fuzhou 350005, China
- Shrishti Barethiya
  - Department of Chemistry, University of Massachusetts Amherst, Amherst, MA 01003, USA
- Erik Nordquist
  - Department of Chemistry, University of Massachusetts Amherst, Amherst, MA 01003, USA
- Jianhan Chen
  - Department of Chemistry, University of Massachusetts Amherst, Amherst, MA 01003, USA
3. Hunkler S, Diederichs K, Kukharenko O, Peter C. Fast conformational clustering of extensive molecular dynamics simulation data. J Chem Phys 2023; 158:144109. [PMID: 37061476 DOI: 10.1063/5.0142797] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/17/2023]
Abstract
We present an unsupervised data processing workflow that is specifically designed to obtain a fast conformational clustering of long molecular dynamics simulation trajectories. In this approach, we combine two dimensionality reduction algorithms (cc_analysis and encodermap) with a density-based spatial clustering algorithm (hierarchical density-based spatial clustering of applications with noise). The proposed scheme benefits from the strengths of the three algorithms while avoiding most of the drawbacks of the individual methods. Here, the cc_analysis algorithm is applied for the first time to molecular simulation data. The encodermap algorithm complements cc_analysis by providing an efficient way to process and assign large amounts of data to clusters. The main goal of the procedure is to maximize the number of assigned frames of a given trajectory while keeping a clear conformational identity of the clusters that are found. In practice, we achieve this by using an iterative clustering approach and a tunable root-mean-square-deviation-based criterion in the final cluster assignment. This allows us to find clusters of different densities and different degrees of structural identity. With the help of four protein systems, we illustrate the capability and performance of this clustering workflow: wild-type and thermostable mutant of the Trp-cage protein (TC5b and TC10b), NTL9, and Protein B. Each of these test systems poses their individual challenges to the scheme, which, in total, give a nice overview of the advantages and potential difficulties that can arise when using the proposed method.
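The RMSD-based criterion in the final cluster assignment can be pictured with a minimal sketch. It assumes that each cluster is represented by a single reference structure and that a frame joins its closest cluster only if the best-fit RMSD stays below a tunable cutoff. The function names, the single-reference representation, and the cutoff are illustrative assumptions, not the workflow's actual implementation.

```python
import numpy as np


def kabsch_rmsd(a: np.ndarray, b: np.ndarray) -> float:
    """Best-fit RMSD between two (n_atoms, 3) coordinate arrays."""
    a = a - a.mean(axis=0)  # remove translation
    b = b - b.mean(axis=0)
    u, s, vt = np.linalg.svd(a.T @ b)
    # A negative determinant means the optimal orthogonal map is a reflection;
    # flipping the smallest singular value restricts the fit to proper rotations.
    if np.linalg.det(u @ vt) < 0:
        s[-1] = -s[-1]
    e = (a ** 2).sum() + (b ** 2).sum() - 2.0 * s.sum()
    return float(np.sqrt(max(e, 0.0) / len(a)))


def assign_frames(frames, references, cutoff: float) -> np.ndarray:
    """Label each frame with its closest reference, or -1 if none is within cutoff.

    `references` holds one hypothetical representative structure per cluster.
    """
    labels = []
    for frame in frames:
        rmsds = [kabsch_rmsd(frame, ref) for ref in references]
        best = int(np.argmin(rmsds))
        labels.append(best if rmsds[best] <= cutoff else -1)
    return np.array(labels)
```

Lowering the cutoff gives clusters with a tighter structural identity at the price of more unassigned frames, which is the trade-off the abstract describes.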
Affiliation(s)
- Simon Hunkler
  - Department of Chemistry, University of Konstanz, Konstanz, Germany
- Kay Diederichs
  - Department of Chemistry, University of Konstanz, Konstanz, Germany
- Christine Peter
  - Department of Chemistry, University of Konstanz, Konstanz, Germany
4. Gupta A, Dey S, Hicks A, Zhou HX. Artificial intelligence guided conformational mining of intrinsically disordered proteins. Commun Biol 2022; 5:610. [PMID: 35725761 PMCID: PMC9209487 DOI: 10.1038/s42003-022-03562-y] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 12/14/2021] [Accepted: 06/07/2022] [Indexed: 12/29/2022]
Abstract
Artificial intelligence recently achieved the breakthrough of predicting the three-dimensional structures of proteins. The next frontier is presented by intrinsically disordered proteins (IDPs), which, representing 30% to 50% of proteomes, readily access vast conformational space. Molecular dynamics (MD) simulations are promising in sampling IDP conformations, but only at extremely high computational cost. Here, we developed generative autoencoders that learn from short MD simulations and generate full conformational ensembles. An encoder represents IDP conformations as vectors in a reduced-dimensional latent space. The mean vector and covariance matrix of the training dataset are calculated to define a multivariate Gaussian distribution, from which vectors are sampled and fed to a decoder to generate new conformations. The ensembles of generated conformations cover those sampled by long MD simulations and are validated by small-angle X-ray scattering profile and NMR chemical shifts. This work illustrates the vast potential of artificial intelligence in conformational mining of IDPs.
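The sampling step described in this abstract (mean vector and covariance matrix of the latent training vectors, then draws from the resulting multivariate Gaussian) can be sketched as follows. The encoder and decoder are not shown; random numbers stand in for encoded conformations, and all function names and array sizes are assumptions made for illustration.

```python
import numpy as np


def fit_latent_gaussian(latent: np.ndarray):
    """Mean vector and covariance matrix of (n_samples, n_dims) latent vectors."""
    mean = latent.mean(axis=0)
    cov = np.cov(latent, rowvar=False)  # columns are latent dimensions
    return mean, cov


def sample_latent(mean: np.ndarray, cov: np.ndarray, n_samples: int, seed: int = 0):
    """Draw new latent vectors from the fitted multivariate Gaussian."""
    rng = np.random.default_rng(seed)
    return rng.multivariate_normal(mean, cov, size=n_samples)


# Placeholder for encoder output: 500 training conformations in a 4-D latent space.
training_latent = np.random.default_rng(1).normal(size=(500, 4))
mean, cov = fit_latent_gaussian(training_latent)
new_latent = sample_latent(mean, cov, n_samples=1000)
# In the described approach these vectors would then be passed to the decoder.
```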
Affiliation(s)
- Aayush Gupta
  - Department of Chemistry, University of Illinois at Chicago, Chicago, IL 60607 USA
- Souvik Dey
  - Department of Chemistry, University of Illinois at Chicago, Chicago, IL 60607 USA
- Alan Hicks
  - Department of Chemistry, University of Illinois at Chicago, Chicago, IL 60607 USA
- Huan-Xiang Zhou
  - Department of Chemistry, University of Illinois at Chicago, Chicago, IL 60607 USA
  - Department of Physics, University of Illinois at Chicago, Chicago, IL 60607 USA
5. Montepietra D, Cecconi C, Brancolini G. Combining enhanced sampling and deep learning dimensionality reduction for the study of the heat shock protein B8 and its pathological mutant K141E. RSC Adv 2022; 12:31996-32011. [PMID: 36380940 PMCID: PMC9641792 DOI: 10.1039/d2ra04913a] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/06/2022] [Accepted: 10/28/2022] [Indexed: 11/11/2022]
Abstract
The biological functions of proteins closely depend on their conformational dynamics. This aspect is especially relevant for intrinsically disordered proteins (IDP) for which structural ensembles often offer more useful representations than individual conformations. Here we employ extensive enhanced sampling temperature replica-exchange atomistic simulations (TREMD) and deep learning dimensionality reduction to study the conformational ensembles of the human heat shock protein B8 and its pathological mutant K141E, for which no experimental 3D structures are available. First, we combined homology modelling with TREMD to generate high-dimensional data sets of 3D structures. Then, we employed a recently developed machine learning based post-processing algorithm, EncoderMap, to project the large conformational data sets into meaningful two-dimensional maps that helped us interpret the data and extract the most significant conformations adopted by both proteins during TREMD. These studies provide the first 3D structural characterization of HSPB8 and reveal the effects of the pathogenic K141E mutation on its conformational ensembles. In particular, this missense mutation appears to increase the compactness of the protein and its structural variability, at the same time rearranging the hydrophobic patches exposed on the protein surface. These results offer the possibility of rationalizing the pathogenic effects of the K141E mutation in terms of conformational changes. The study provides the first 3D structural characterization of HSPB8 and its K141E mutant: extensive TREMD are combined with a deep learning algorithm to rationalize the disordered ensemble of structures adopted by each variant.
Affiliation(s)
- Daniele Montepietra
  - Department of Physics, Computer Science and Mathematics, University of Modena and Reggio Emilia, Via Campi 213/A, 41100 Modena, Italy
  - Istituto Nanoscienze – CNR-NANO, Center S3, Via G. Campi 213/A, 41100 Modena, Italy
- Ciro Cecconi
  - Department of Physics, Computer Science and Mathematics, University of Modena and Reggio Emilia, Via Campi 213/A, 41100 Modena, Italy
  - Istituto Nanoscienze – CNR-NANO, Center S3, Via G. Campi 213/A, 41100 Modena, Italy
- Giorgia Brancolini
  - Istituto Nanoscienze – CNR-NANO, Center S3, Via G. Campi 213/A, 41100 Modena, Italy
6. Bhattacharya S, Xu L, Thompson D. Characterization of Amyloidogenic Peptide Aggregability in Helical Subspace. Methods Mol Biol 2022; 2340:401-448. [PMID: 35167084 DOI: 10.1007/978-1-0716-1546-1_18] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Prototypical amyloidogenic peptides amyloid-β (Aβ) and α-synuclein (αS) can undergo helix-helix associations via partially folded helical conformers, which may influence pathological progression to Alzheimer's (AD) and Parkinson's disease (PD), respectively. At the other extreme, stable folded helical conformers have been reported to resist self-assembly and amyloid formation. Experimental characterisation of such disparities in aggregation profiles due to subtle differences in peptide stabilities is precluded by the conformational heterogeneity of helical subspace. The diverse physical models used in molecular simulations allow sampling distinct regions of the phase space and are extensive in capturing the ensemble of rich helical subspace. Robust and powerful computational predictive methods utilizing network theory and free energy mapping can model the origin of helical population shifts in amyloidogenic peptides, which highlight their inherent aggregability. In this chapter, we discuss computational models, methods, design rules, and strategies to identify the driving force behind helical self-assembly and the molecular origin of aggregation resistance in helical intermediates of Aβ42 and αS. By extensive multiscale mapping of intrapeptide interactions, we show that the computational models can capture features that are otherwise imperceptible to experiments. Our models predict that targeting terminal residues may allow modulation and control of initial pathogenic aggregability of amyloidogenic peptides.
Affiliation(s)
- Shayon Bhattacharya
  - Department of Physics, Bernal Institute, University of Limerick, Limerick, Ireland
- Liang Xu
  - Department of Physics, Bernal Institute, University of Limerick, Limerick, Ireland
- Damien Thompson
  - Department of Physics, Bernal Institute, University of Limerick, Limerick, Ireland
7. Chen M. Collective variable-based enhanced sampling and machine learning. THE EUROPEAN PHYSICAL JOURNAL. B 2021; 94:211. [PMID: 34697536 PMCID: PMC8527828 DOI: 10.1140/epjb/s10051-021-00220-w] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 04/02/2021] [Accepted: 10/03/2021] [Indexed: 05/14/2023]
Abstract
Collective variable-based enhanced sampling methods have been widely used to study thermodynamic properties of complex systems. Efficiency and accuracy of these enhanced sampling methods are affected by two factors: constructing appropriate collective variables for enhanced sampling and generating accurate free energy surfaces. Recently, many machine learning techniques have been developed to improve the quality of collective variables and the accuracy of free energy surfaces. Although machine learning has achieved great successes in improving enhanced sampling methods, there are still many challenges and open questions. In this perspective, we shall review recent developments on integrating machine learning techniques and collective variable-based enhanced sampling approaches. We also discuss challenges and future research directions including generating kinetic information, exploring high-dimensional free energy surfaces, and efficiently sampling all-atom configurations.
Affiliation(s)
- Ming Chen
  - Department of Chemistry, Purdue University, West Lafayette, IN 47907 USA
8. Kirchner B, Blasius J, Esser L, Reckien W. Predicting Vibrational Spectroscopy for Flexible Molecules and Molecules with Non‐Idle Environments. ADVANCED THEORY AND SIMULATIONS 2020. [DOI: 10.1002/adts.202000223] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 12/21/2022]
Affiliation(s)
- Barbara Kirchner
  - Mulliken Center for Theoretical Chemistry, Rheinische Friedrich‐Wilhelms‐Universität Bonn, Beringstr. 4+6, D‐53115 Bonn, Germany
- Jan Blasius
  - Mulliken Center for Theoretical Chemistry, Rheinische Friedrich‐Wilhelms‐Universität Bonn, Beringstr. 4+6, D‐53115 Bonn, Germany
- Lars Esser
  - Mulliken Center for Theoretical Chemistry, Rheinische Friedrich‐Wilhelms‐Universität Bonn, Beringstr. 4+6, D‐53115 Bonn, Germany
- Werner Reckien
  - Mulliken Center for Theoretical Chemistry, Rheinische Friedrich‐Wilhelms‐Universität Bonn, Beringstr. 4+6, D‐53115 Bonn, Germany
9. Hernández-Segura T, Pastor N. Identification of an α-MoRF in the Intrinsically Disordered Region of the Escargot Transcription Factor. ACS OMEGA 2020; 5:18331-18341. [PMID: 32743208 PMCID: PMC7392517 DOI: 10.1021/acsomega.0c02051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2020] [Accepted: 07/02/2020] [Indexed: 06/11/2023]
Abstract
Molecular recognition features (MoRFs) are common in intrinsically disordered proteins (IDPs) and intrinsically disordered regions (IDRs). MoRFs are in constant order-disorder structural transitions and adopt well-defined structures once they are bound to their targets. Here, we study Escargot (Esg), a transcription factor in Drosophila melanogaster that regulates multiple cellular functions, and consists of a disordered N-terminal domain and a group of zinc fingers at its C-terminal domain. We analyzed the N-terminal domain of Esg with disorder predictors and identified a region of 45 amino acids with high probability to form ordered structures, which we named S2. Through 54 μs of molecular dynamics (MD) simulations using CHARMM36 and implicit solvent (generalized Born/surface area (GBSA)), we characterized the conformational landscape of S2 and found an α-MoRF of ∼16 amino acids stabilized by key contacts within the helix. To test the importance of these contacts in the stability of the α-MoRF, we evaluated the effect of point mutations that would impair these interactions, running 24 μs of MD for each mutation. The mutations had mild effects on the MoRF, and in some cases, led to gain of residual structure through long-range contacts of the α-MoRF and the rest of the S2 region. As this could be an effect of the force field and solvent model we used, we benchmarked our simulation protocol by carrying out 32 μs of MD for the (AAQAA)3 peptide. The results of the benchmark indicate that the global amount of helix in shorter peptides like (AAQAA)3 is reasonably predicted. Careful analysis of the runs of S2 and its mutants suggests that the mutation to hydrophobic residues may have nucleated long-range hydrophobic and aromatic interactions that stabilize the MoRF. Finally, we have identified a set of residues that stabilize an α-MoRF in a region still without functional annotations in Esg.
Affiliation(s)
- Teresa Hernández-Segura
  - Laboratorio de Dinámica de Proteínas, Centro de Investigación en Dinámica Celular-IICBA, Universidad Autónoma del Estado de Morelos, Av. Universidad 1001, Chamilpa, 62209 Cuernavaca, México
  - Doctorado en Ciencias CIDC-IICBA, Universidad Autónoma del Estado de Morelos, Cuernavaca 62209, Morelos, México
- Nina Pastor
  - Laboratorio de Dinámica de Proteínas, Centro de Investigación en Dinámica Celular-IICBA, Universidad Autónoma del Estado de Morelos, Av. Universidad 1001, Chamilpa, 62209 Cuernavaca, México
10. Zhao Y, Cortes-Huerto R, Kremer K, Rudzinski JF. Investigating the Conformational Ensembles of Intrinsically Disordered Proteins with a Simple Physics-Based Model. J Phys Chem B 2020; 124:4097-4113. [PMID: 32345021 PMCID: PMC7246978 DOI: 10.1021/acs.jpcb.0c01949] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 11/28/2022]
Abstract
Intrinsically disordered proteins (IDPs) play an important role in an array of biological processes but present a number of fundamental challenges for computational modeling. Recently, simple polymer models have regained popularity for interpreting the experimental characterization of IDPs. Homopolymer theory provides a strong foundation for understanding generic features of phenomena ranging from single-chain conformational dynamics to the properties of entangled polymer melts, but is difficult to extend to the copolymer context. This challenge is magnified for proteins due to the variety of competing interactions and large deviations in side-chain properties. In this work, we apply a simple physics-based coarse-grained model for describing largely disordered conformational ensembles of peptides, based on the premise that sampling sterically forbidden conformations can compromise the faithful description of both static and dynamical properties. The Hamiltonian of the employed model can be easily adjusted to investigate the impact of distinct interactions and sequence specificity on the randomness of the resulting conformational ensemble. In particular, starting with a bead–spring-like model and then adding more detailed interactions one by one, we construct a hierarchical set of models and perform a detailed comparison of their properties. Our analysis clarifies the role of generic attractions, electrostatics, and side-chain sterics, while providing a foundation for developing efficient models for IDPs that retain an accurate description of the hierarchy of conformational dynamics, which is nontrivially influenced by interactions with surrounding proteins and solvent molecules.
Affiliation(s)
- Yani Zhao
  - Max Planck Institute for Polymer Research, Ackermannweg 10, 55128 Mainz, Germany
- Kurt Kremer
  - Max Planck Institute for Polymer Research, Ackermannweg 10, 55128 Mainz, Germany
- Joseph F Rudzinski
  - Max Planck Institute for Polymer Research, Ackermannweg 10, 55128 Mainz, Germany
11. Bozkurt Varolgüneş Y, Bereau T, Rudzinski JF. Interpretable embeddings from molecular simulations using Gaussian mixture variational autoencoders. MACHINE LEARNING-SCIENCE AND TECHNOLOGY 2020. [DOI: 10.1088/2632-2153/ab80b7] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Indexed: 02/08/2023]
12. Lemke T, Berg A, Jain A, Peter C. EncoderMap(II): Visualizing Important Molecular Motions with Improved Generation of Protein Conformations. J Chem Inf Model 2019; 59:4550-4560. [PMID: 31647645 DOI: 10.1021/acs.jcim.9b00675] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Indexed: 12/17/2022]
Abstract
Dimensionality reduction can be used to project high-dimensional molecular data into a simplified, low-dimensional map. One feature of our recently introduced dimensionality reduction technique EncoderMap, which relies on the combination of an autoencoder with multidimensional scaling, is its ability to do the reverse. It is able to generate conformations for any selected points in the low-dimensional map. This transfers the simplified, low-dimensional map back into the high-dimensional conformational space. Although the output is again high-dimensional, certain aspects of the simplification are preserved. The generated conformations only mirror the most dominant conformational differences that determine the positions of conformational states in the low-dimensional map. This allows depicting such differences and, in consequence, visualizing molecular motions and gives a unique perspective on high-dimensional conformational data. In our previous work, protein conformations described in backbone dihedral angle space were used as the input for EncoderMap, and conformations were also generated in this space. For large proteins, however, the generation of conformations is inaccurate with this approach due to the local character of backbone dihedral angles. Here, we present an improved variant of EncoderMap which is able to generate large protein conformations that are accurate in short-range and long-range orders. This is achieved by differentiable reconstruction of Cartesian coordinates from the generated dihedrals, which allows adding a contribution to the cost function that monitors the accuracy of all pairwise distances between the Cα-atoms of the generated conformations. The improved capabilities to generate conformations of large, even multidomain, proteins are demonstrated for two examples: diubiquitin and a part of the Ssa1 Hsp70 yeast chaperone. We show that the improved variant of EncoderMap can nicely visualize motions of protein domains relative to each other but is also able to highlight important conformational changes within the individual domains.
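The pairwise Cα-distance contribution mentioned in this abstract can be illustrated with a small sketch. The abstract does not give the functional form, so the mean absolute deviation used here is an assumption, as are the function names. Plain NumPy is used for readability; in training, the same expression would have to be written in an automatic-differentiation framework.

```python
import numpy as np


def pairwise_distances(coords: np.ndarray) -> np.ndarray:
    """All pairwise Euclidean distances for (n_atoms, 3) coordinates."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.linalg.norm(diff, axis=-1)


def ca_distance_cost(generated: np.ndarray, reference: np.ndarray) -> float:
    """Mean absolute deviation between two C-alpha distance matrices.

    Only the upper triangle is used so that each atom pair counts once.
    """
    upper = np.triu_indices(generated.shape[0], k=1)
    d_gen = pairwise_distances(generated)[upper]
    d_ref = pairwise_distances(reference)[upper]
    return float(np.mean(np.abs(d_gen - d_ref)))
```

Because the term depends only on internal distances, it is unaffected by rigid translation or rotation of the generated structure, and it penalizes long-range errors that per-residue dihedral terms do not see.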
Affiliation(s)
- Tobias Lemke
  - Theoretical Chemistry, University of Konstanz, 78547 Konstanz, Baden-Württemberg, Germany
- Andrej Berg
  - Theoretical Chemistry, University of Konstanz, 78547 Konstanz, Baden-Württemberg, Germany
- Alok Jain
  - Theoretical Chemistry, University of Konstanz, 78547 Konstanz, Baden-Württemberg, Germany
  - Department of Biotechnology, National Institute of Pharmaceutical Education and Research Ahmedabad, Gandhinagar, Gujarat 382355, India
- Christine Peter
  - Theoretical Chemistry, University of Konstanz, 78547 Konstanz, Baden-Württemberg, Germany
13. Hunkler S, Lemke T, Peter C, Kukharenko O. Back-mapping based sampling: Coarse grained free energy landscapes as a guideline for atomistic exploration. J Chem Phys 2019; 151:154102. [DOI: 10.1063/1.5115398] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Indexed: 11/14/2022]
14. Eberhardt J, Stote RH, Dejaegere A. Unrolr: Structural analysis of protein conformations using stochastic proximity embedding. J Comput Chem 2019; 39:2551-2557. [PMID: 30447084 DOI: 10.1002/jcc.25599] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 05/21/2018] [Revised: 08/24/2018] [Accepted: 08/24/2018] [Indexed: 01/29/2023]
Abstract
Molecular dynamics (MD) simulations are widely used to explore the conformational space of biological macromolecules. Advances in hardware, as well as in methods, make the generation of large and complex MD datasets much more common. Although different clustering and dimensionality reduction methods have been applied to MD simulations, there remains a need for improved strategies that handle nonlinear data and/or can be applied to very large datasets. We present an original implementation of the pivot-based version of the stochastic proximity embedding method aimed at large MD datasets using the dihedral distance as a metric. The advantages of the algorithm in terms of data storage and computational efficiency are presented, as well as the implementation realized. Application and testing through the analysis of a 200 ns accelerated MD simulation of a 35-residue villin headpiece is discussed. Analysis of the simulation shows the promise of this method to organize large conformational ensembles. © 2018 Wiley Periodicals, Inc.
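A dihedral distance has to respect the periodicity of angles, so that -180° and 180° count as the same conformation. The sketch below shows one commonly used form built on the cosine of the angle differences. Whether it matches the exact normalization used in the cited implementation is an assumption, and the function name is hypothetical.

```python
import math
from typing import Sequence


def dihedral_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Periodic distance between two conformations given as dihedral angles (radians).

    Each angle pair contributes 0.5 * (1 - cos(delta)), which is 0 for identical
    angles and 1 for angles that differ by pi; the result lies in [0, 1].
    """
    if len(a) != len(b) or not a:
        raise ValueError("both conformations need the same, non-zero number of angles")
    total = sum(0.5 * (1.0 - math.cos(x - y)) for x, y in zip(a, b))
    return math.sqrt(total / len(a))
```

A metric of this kind needs no structural superposition, which helps when very many pairwise comparisons are evaluated during the embedding.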
Affiliation(s)
- Jérôme Eberhardt
  - Biologie structurale intégrative, Institut de Génétique et de Biologie Moléculaire et Cellulaire (IGBMC), Institut National de La Santé et de La Recherche Médicale (INSERM), U1258/Centre National de Recherche Scientifique (CNRS), UMR7104/Université de Strasbourg, Illkirch, France
- Roland H Stote
  - Biologie structurale intégrative, Institut de Génétique et de Biologie Moléculaire et Cellulaire (IGBMC), Institut National de La Santé et de La Recherche Médicale (INSERM), U1258/Centre National de Recherche Scientifique (CNRS), UMR7104/Université de Strasbourg, Illkirch, France
- Annick Dejaegere
  - Biologie structurale intégrative, Institut de Génétique et de Biologie Moléculaire et Cellulaire (IGBMC), Institut National de La Santé et de La Recherche Médicale (INSERM), U1258/Centre National de Recherche Scientifique (CNRS), UMR7104/Université de Strasbourg, Illkirch, France
15. Berg A, Peter C. Simulating and analysing configurational landscapes of protein-protein contact formation. Interface Focus 2019; 9:20180062. [PMID: 31065336 DOI: 10.1098/rsfs.2018.0062] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Accepted: 02/01/2019] [Indexed: 01/04/2023]
Abstract
Interacting proteins can form aggregates and protein-protein interfaces with multiple patterns and different stabilities. Using molecular simulation one would like to understand the formation of these aggregates and which of the observed states are relevant for protein function and recognition. To characterize the complex configurational ensemble of protein aggregates, one needs a quantitative measure for the similarity of structures. We present well-suited descriptors that capture the essential features of non-covalent protein contact formation and domain motion. This set of collective variables is used with a nonlinear multi-dimensional scaling-based dimensionality reduction technique to obtain a low-dimensional representation of the configurational landscape of two ubiquitin proteins from coarse-grained simulations. We show that this two-dimensional representation is a powerful basis to identify meaningful states in the ensemble of aggregated structures and to calculate distributions and free energy landscapes for different sets of simulations. By using a measure to quantitatively compare free energy landscapes we can show how the introduction of a covalent bond between two ubiquitin proteins at different positions alters the configurational states of these dimers.
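Free energy landscapes on a two-dimensional projection are commonly estimated from the sampled density as F = -kT ln p. The sketch below applies this to histogram counts. The bin count, the reduced unit kT = 1, and the function name are illustrative assumptions, and the comparison measure mentioned in the abstract is not reproduced here.

```python
import numpy as np


def free_energy_landscape(points: np.ndarray, bins: int = 50, kT: float = 1.0):
    """Estimate F = -kT ln p on a grid from (n_samples, 2) projected points.

    Returns the free energy grid (shifted so its minimum is zero) and the bin
    edges along both axes; unvisited bins get an infinite free energy.
    """
    hist, x_edges, y_edges = np.histogram2d(points[:, 0], points[:, 1], bins=bins)
    prob = hist / hist.sum()
    with np.errstate(divide="ignore"):  # log(0) for empty bins is expected
        free_energy = -kT * np.log(prob)
    free_energy -= free_energy[np.isfinite(free_energy)].min()
    return free_energy, x_edges, y_edges
```

Evaluating several simulation sets on the same grid (fixed bin edges) keeps their landscapes directly comparable bin by bin.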
Affiliation(s)
- Andrej Berg
  - Department of Chemistry, University of Konstanz, Universitätsstraße 10, Konstanz 78457, Germany
- Christine Peter
  - Department of Chemistry, University of Konstanz, Universitätsstraße 10, Konstanz 78457, Germany
16. Lemke T, Peter C. EncoderMap: Dimensionality Reduction and Generation of Molecule Conformations. J Chem Theory Comput 2019; 15:1209-1215. [DOI: 10.1021/acs.jctc.8b00975] [Citation(s) in RCA: 45] [Impact Index Per Article: 9.0] [Indexed: 01/22/2023]
Affiliation(s)
- Tobias Lemke
- Theoretical Chemistry, University of Konstanz, 78547 Konstanz, Germany
- Christine Peter
- Theoretical Chemistry, University of Konstanz, 78547 Konstanz, Germany
17
Abstract
This chapter discusses the way in which dimensionality reduction algorithms such as diffusion maps and sketch-map can be used to analyze molecular dynamics trajectories. The first part discusses how these various algorithms function as well as practical issues such as landmark selection and how these algorithms can be used when the data to be analyzed comes from enhanced sampling trajectories. In the later part a comparison between the results obtained by applying various algorithms to two sets of sample data is performed and discussed. This section is then followed by a summary of how one algorithm in particular, sketch-map, has been applied to a range of problems. The chapter concludes with a discussion of the directions in which we believe this field is currently moving.
18
Berg A, Kukharenko O, Scheffner M, Peter C. Towards a molecular basis of ubiquitin signaling: A dual-scale simulation study of ubiquitin dimers. PLoS Comput Biol 2018; 14:e1006589. [PMID: 30444864; PMCID: PMC6268000; DOI: 10.1371/journal.pcbi.1006589] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 05/11/2018] [Revised: 11/30/2018] [Accepted: 10/22/2018] [Indexed: 12/25/2022]
Abstract
Covalent modification of proteins by ubiquitin or ubiquitin chains is one of the most prevalent post-translational modifications in eukaryotes. Different types of ubiquitin chains are assumed to selectively signal respectively modified proteins for different fates. In support of this hypothesis, structural studies have shown that the eight possible ubiquitin dimers adopt different conformations. However, at least in some cases, these structures cannot sufficiently explain the molecular basis of the selective signaling mechanisms. This indicates that the available structures represent only a few distinct conformations within the entire conformational space adopted by a ubiquitin dimer. Here, molecular simulations on different levels of resolution can complement the structural information. We have combined exhaustive coarse-grained and atomistic simulations of all eight possible ubiquitin dimers with a suitable dimensionality reduction technique and a new method to characterize protein-protein interfaces and the conformational landscape of protein conjugates. We found that ubiquitin dimers exhibit characteristic linkage type-dependent properties in solution, such as interface stability and the character of contacts between the subunits, which can be directly correlated with experimentally observed linkage-specific properties.
Post-translational modification of proteins by covalent attachment of ubiquitin is a key cellular process, regulating for example the fate and recycling of proteins. We present a new method to combine multiscale simulation with advanced analysis methods to characterize the states of ubiquitin-ubiquitin conjugates. We found that the linkage position affects the conformational space of ubiquitin dimers, determining the number and stability of relevant states, the character of subunit contacts and the nature of the surface exposed to possible binding partners.
Affiliation(s)
- Andrej Berg
- Department of Chemistry, University of Konstanz, Konstanz, Germany
- Martin Scheffner
- Department of Biology, University of Konstanz, Konstanz, Germany
- Christine Peter
- Department of Chemistry, University of Konstanz, Konstanz, Germany
19
Zimmerman MI, Porter JR, Sun X, Silva RR, Bowman GR. Choice of Adaptive Sampling Strategy Impacts State Discovery, Transition Probabilities, and the Apparent Mechanism of Conformational Changes. J Chem Theory Comput 2018; 14:5459-5475. [PMID: 30240203; PMCID: PMC6571142; DOI: 10.1021/acs.jctc.8b00500] [Citation(s) in RCA: 40] [Impact Index Per Article: 6.7] [Indexed: 11/28/2022]
Abstract
Interest in atomically detailed simulations has grown significantly with recent advances in computational hardware and Markov state modeling (MSM) methods, yet outstanding questions remain that hinder their widespread adoption. Namely, how do alternative sampling strategies explore conformational space and how might this influence predictions generated from the data? Here, we seek to answer these questions for four commonly used sampling methods: (1) a single long simulation, (2) many short simulations run in parallel, (3) adaptive sampling, and (4) our recently developed goal-oriented sampling algorithm, FAST. We first develop a theoretical framework for analytically calculating the probability of discovering select states on simple landscapes, where we uncover the drastic effects of varying the number and length of simulations. We then use kinetic Monte Carlo simulations on a variety of physically inspired landscapes to characterize the probability of discovering particular states and transition pathways for each of the four methods. Consistently, we find that FAST simulations discover each target state with the highest probability, while traversing realistic pathways. Furthermore, we uncover the potential pathology that short parallel simulations sometimes predict an incorrect transition pathway by crossing large energy barriers that long simulations would typically circumnavigate. We refer to this pathology as "pathway tunneling". To protect against this phenomenon when using adaptive-sampling and FAST simulations, we introduce the FAST-string method. This method enhances sampling along the highest-flux transition paths to refine an MSM's transition probabilities and discriminate between competing pathways. Additionally, we compare the performance of a variety of MSM estimators in describing accurate thermodynamics and kinetics. For adaptive sampling, we recommend simply normalizing the transition counts out of each state after adding small pseudocounts to avoid creating sources or sinks. Lastly, we evaluate whether our insights from simple landscapes hold for all-atom molecular dynamics simulations of the folding of the λ-repressor protein. Remarkably, we find that FAST-contacts predicts the same folding pathway as a set of long simulations but with orders of magnitude less simulation time.
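The estimator recommended in this abstract (add small pseudocounts, then normalize the transition counts out of each state) can be sketched as follows. This is an illustrative sketch rather than the authors' implementation; the function name and the default pseudocount of 1/n_states are assumptions, since the abstract only says the pseudocounts are "small".

```python
import numpy as np

def normalize_with_pseudocounts(counts, pseudocount=None):
    """Row-normalize a transition count matrix after adding a pseudocount.

    counts[i, j] is the number of observed transitions from state i to j.
    The default pseudocount of 1/n_states is an assumption for illustration.
    """
    counts = np.asarray(counts, dtype=float)
    n_states = counts.shape[0]
    if pseudocount is None:
        pseudocount = 1.0 / n_states
    smoothed = counts + pseudocount  # every transition gets nonzero weight
    return smoothed / smoothed.sum(axis=1, keepdims=True)

# Example: state 2 was discovered but no transitions out of it were observed.
counts = [[90, 10, 0],
          [5, 80, 15],
          [0, 0, 0]]
T = normalize_with_pseudocounts(counts)
```

Without the pseudocount, the last row would have no outgoing counts and could not be normalized; with it, that row becomes uniform and every row is a valid probability distribution.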
Affiliation(s)
- Maxwell I. Zimmerman
- Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, Missouri 63110, United States
- Justin R. Porter
- Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, Missouri 63110, United States
- Xianqiang Sun
- Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, Missouri 63110, United States
- Roseane R. Silva
- Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, Missouri 63110, United States
- Gregory R. Bowman
- Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, Missouri 63110, United States
- Department of Biomedical Engineering, Washington University in St. Louis, St. Louis, Missouri, 63110, United States
- Center for Biological Systems Engineering, Washington University in St. Louis, St. Louis, Missouri, 63110, United States
20
Lemke T, Peter C, Kukharenko O. Efficient Sampling and Characterization of Free Energy Landscapes of Ion–Peptide Systems. J Chem Theory Comput 2018; 14:5476-5488. [DOI: 10.1021/acs.jctc.8b00560] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Indexed: 12/19/2022]
Affiliation(s)
- Tobias Lemke
- Theoretical Chemistry, University of Konstanz, 78547 Konstanz, Germany
- Christine Peter
- Theoretical Chemistry, University of Konstanz, 78547 Konstanz, Germany
21
Bhattacharya S, Xu L, Thompson D. Revisiting the earliest signatures of amyloidogenesis: Roadmaps emerging from computational modeling and experiment. Wiley Interdisciplinary Reviews: Computational Molecular Science 2018. [DOI: 10.1002/wcms.1359] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 01/31/2023]
Affiliation(s)
- Shayon Bhattacharya
- Department of Physics, Bernal Institute, University of Limerick, Limerick, Ireland
- Liang Xu
- Department of Physics, Bernal Institute, University of Limerick, Limerick, Ireland
- Damien Thompson
- Department of Physics, Bernal Institute, University of Limerick, Limerick, Ireland