1
|
Barrett R, Ansari M, Ghoshal G, White AD. Simulation-based inference with approximately correct parameters via maximum entropy. Machine Learning: Science and Technology 2022. [DOI: 10.1088/2632-2153/ac6286] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022] Open
Abstract
Inferring the input parameters of simulators from observations is a crucial challenge with applications from epidemiology to molecular dynamics. Here we show a simple approach in the regime of sparse data and approximately correct models, which is common when trying to use an existing model to infer latent variables with observed data. This approach is based on the principle of maximum entropy (MaxEnt) and provably makes the smallest change in the latent joint distribution to fit new data. This method requires no likelihood or model derivatives and its fit is insensitive to prior strength, removing the need to balance observed data fit with prior belief. The method requires the ansatz that data is fit in expectation, which is true in some settings and may be reasonable in all settings with few data points. The method is based on sample reweighting, so its asymptotic run time is independent of prior distribution dimension. We demonstrate this MaxEnt approach and compare with other likelihood-free inference methods across three systems: a point particle moving in a gravitational field, a compartmental model of epidemic spread and molecular dynamics simulation of a protein.
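The reweighting idea summarized in this abstract (fit the data in expectation while changing the prior samples as little as possible) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `maxent_weights` and `_weights`, the restriction to a single scalar observable, and the Newton iteration on the multiplier are all choices made here for brevity.

```python
import numpy as np


def _weights(lam, g):
    """Normalised weights w_i proportional to exp(lam * g_i)."""
    logw = lam * g
    logw -= logw.max()  # subtract the max so exp() cannot overflow
    w = np.exp(logw)
    return w / w.sum()


def maxent_weights(g, target, n_iter=100, tol=1e-10):
    """Reweight samples so the weighted mean of g matches target.

    g      : observable evaluated on each prior sample (1-D array)
    target : value the reweighted average should match
    """
    g = np.asarray(g, dtype=float)
    lam = 0.0  # lam = 0 leaves the prior samples uniformly weighted
    for _ in range(n_iter):
        w = _weights(lam, g)
        mean = np.dot(w, g)
        err = mean - target
        if abs(err) < tol:
            break
        var = np.dot(w, (g - mean) ** 2)
        lam -= err / var  # Newton step: d(mean)/d(lam) is the weighted variance
    return _weights(lam, g), lam


# Toy usage (hypothetical data): 5000 prior samples, shifted to mean 0.5.
rng = np.random.default_rng(0)
g = rng.normal(size=5000)
w, lam = maxent_weights(g, target=0.5)
print(np.dot(w, g), lam)
```

Because only the sample weights change, the cost of the fit depends on the number of samples and constraints rather than on the dimension of the prior, which is consistent with the run-time claim in the abstract.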
|
2
|
Conformational ensembles of intrinsically disordered proteins and flexible multidomain proteins. Biochem Soc Trans 2022; 50:541-554. [PMID: 35129612 DOI: 10.1042/bst20210499] [Citation(s) in RCA: 31] [Impact Index Per Article: 15.5] [Received: 12/10/2021] [Revised: 01/13/2022] [Accepted: 01/17/2022] [Indexed: 12/29/2022]
Abstract
Intrinsically disordered proteins (IDPs) and multidomain proteins with flexible linkers show a high level of structural heterogeneity and are best described by ensembles consisting of multiple conformations with associated thermodynamic weights. Determining conformational ensembles usually involves the integration of biophysical experiments and computational models. In this review, we discuss current approaches to determine conformational ensembles of IDPs and multidomain proteins, including the choice of biophysical experiments, computational models used to sample protein conformations, models to calculate experimental observables from protein structure, and methods to refine ensembles against experimental data. We also provide examples of recent applications of integrative conformational ensemble determination to study IDPs and multidomain proteins and suggest future directions for research in the field.
|
3
|
Voelz VA, Ge Y, Raddi RM. Reconciling Simulations and Experiments With BICePs: A Review. Front Mol Biosci 2021; 8:661520. [PMID: 34046431 PMCID: PMC8144449 DOI: 10.3389/fmolb.2021.661520] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 01/30/2021] [Accepted: 04/12/2021] [Indexed: 02/04/2023] Open
Abstract
Bayesian Inference of Conformational Populations (BICePs) is an algorithm developed to reconcile simulated ensembles with sparse experimental measurements. The Bayesian framework of BICePs enables population reweighting as a post-simulation processing step, with several advantages over existing methods, including the proper use of reference potentials, and the estimation of a Bayes factor-like quantity called the BICePs score for model selection. Here, we summarize the theory underlying this method in context with related algorithms, review the history of BICePs applications to date, and discuss current shortcomings along with future plans for improvement.
Affiliation(s)
- Vincent A. Voelz
  - Department of Chemistry, Temple University, Philadelphia, PA, United States
- Yunhui Ge
  - Department of Pharmaceutical Sciences, University of California, Irvine, Irvine, CA, United States
- Robert M. Raddi
  - Department of Chemistry, Temple University, Philadelphia, PA, United States
|
4
|
Hays JM, Boland E, Kasson PM. Inference of Joint Conformational Distributions from Separately Acquired Experimental Measurements. J Phys Chem Lett 2021; 12:1606-1611. [PMID: 33596657 PMCID: PMC8310705 DOI: 10.1021/acs.jpclett.0c03623] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 05/15/2023]
Abstract
Flexible proteins serve vital roles in a multitude of biological processes. However, determining their full conformational ensembles is extremely difficult because this requires detailed knowledge about the heterogeneity of the protein's degrees of freedom. Label-based experiments such as double electron-electron resonance (DEER) are very useful in studying flexible proteins, as they provide distributional data on heterogeneity. These experiments are typically performed separately, so information about correlation between distributions is lost. We have developed a method to recover correlation information using nonequilibrium work estimates in molecular dynamics refinement. We tested this method on a simple model of an alternating-access transporter for which the true joint distributions are known, and it successfully recovered the true joint distribution. We also applied our method to the protein syntaxin-1a, where it discarded physically implausible conformations. Our method thus provides a way to recover correlation structure in separate experimental measurements of conformational ensembles and refines the resulting structural ensemble.
Affiliation(s)
- Jennifer M. Hays
  - Department of Biomedical Engineering, University of Virginia, Charlottesville, VA, USA
  - Department of Molecular Physiology, University of Virginia, Charlottesville, VA, USA
- Emily Boland
  - Department of Biomedical Engineering, University of Virginia, Charlottesville, VA, USA
  - Department of Molecular Physiology, University of Virginia, Charlottesville, VA, USA
- Peter M. Kasson
  - Department of Biomedical Engineering, University of Virginia, Charlottesville, VA, USA
  - Department of Molecular Physiology, University of Virginia, Charlottesville, VA, USA
  - Science for Life Laboratory, Department of Cell and Molecular Biology, Uppsala, 75124 Sweden
|
5
|
Kauffmann C, Zawadzka-Kazimierczuk A, Kontaxis G, Konrat R. Using Cross-Correlated Spin Relaxation to Characterize Backbone Dihedral Angle Distributions of Flexible Protein Segments. ChemPhysChem 2021; 22:18-28. [PMID: 33119214 PMCID: PMC7839595 DOI: 10.1002/cphc.202000789] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 09/17/2020] [Revised: 10/28/2020] [Indexed: 01/11/2023]
Abstract
Crucial to the function of proteins is their existence as conformational ensembles sampling numerous and structurally diverse substates. Despite this widely accepted notion there is still a high demand for meaningful and reliable approaches to characterize protein ensembles in solution. As it is usually conducted in solution, NMR spectroscopy offers unique possibilities to address this challenge. Particularly, cross-correlated relaxation (CCR) effects have long been established to encode both protein structure and dynamics in a compelling manner. However, this wealth of information often limits their use in practice as structure and dynamics might prove difficult to disentangle. Using a modern Maximum Entropy (MaxEnt) reweighting approach to interpret CCR rates of Ubiquitin, we demonstrate that these uncertainties do not necessarily impair resolving CCR-encoded structural information. Instead, a suitable balance between complementary CCR experiments and prior information is found to be the most crucial factor in mapping backbone dihedral angle distributions. Experimental and systematic deviations such as oversimplified dynamics appear to be of minor importance. Using Ubiquitin as an example, we demonstrate that CCR rates are capable of characterizing rigid and flexible residues alike, indicating their unharnessed potential in studying disordered proteins.
Affiliation(s)
- Clemens Kauffmann
  - Department of Structural and Computational Biology, Max Perutz Laboratories, University of Vienna, Vienna Biocenter Campus 5, A-1030 Vienna, Austria
- Anna Zawadzka-Kazimierczuk
  - Biological and Chemical Research Centre, Faculty of Chemistry, University of Warsaw, Żwirki i Wigury 101, 02-089 Warsaw, Poland
- Georg Kontaxis
  - Department of Structural and Computational Biology, Max Perutz Laboratories, University of Vienna, Vienna Biocenter Campus 5, A-1030 Vienna, Austria
- Robert Konrat
  - Department of Structural and Computational Biology, Max Perutz Laboratories, University of Vienna, Vienna Biocenter Campus 5, A-1030 Vienna, Austria
|
6
|
Kauffmann C, Kazimierczuk K, Schwarz TC, Konrat R, Zawadzka-Kazimierczuk A. A novel high-dimensional NMR experiment for resolving protein backbone dihedral angle ambiguities. Journal of Biomolecular NMR 2020; 74:257-265. [PMID: 32239382 PMCID: PMC7211790 DOI: 10.1007/s10858-020-00308-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 11/22/2019] [Accepted: 03/12/2020] [Indexed: 05/07/2023]
Abstract
Intrinsically disordered proteins (IDPs) are challenging established structural biology perception and urge a reassessment of the conventional understanding of the subtle interplay between protein structure and dynamics. Due to their importance in eukaryotic life and central role in protein interaction networks, IDP research is a fascinating and highly relevant research area in which NMR spectroscopy is destined to be a key player. The flexible nature of IDPs, as a result of the sampling of a vast conformational space, however, poses a tremendous scientific challenge, both technically and theoretically. Pronounced signal averaging results in narrow signal dispersion and requires higher dimensionality NMR techniques. Moreover, a fundamental problem in the structural characterization of IDPs is the definition of the conformational ensemble sampled by the polypeptide chain in solution, where often the interpretation relies on the concept of 'residual structure' or 'conformational preference'. An important source of structural information is information-rich NMR experiments that probe protein backbone dihedral angles in a unique manner. Cross-correlated relaxation experiments have proven to fulfil this task as they provide unique information about protein backbones, particularly in IDPs. Here we present a novel cross-correlation experiment that utilizes non-uniform sampling detection schemes to resolve protein backbone dihedral ambiguities in IDPs. The sensitivity of this novel technique is illustrated with an application to the prototypical IDP α-synuclein, for which unexpected deviations from random-coil-like behaviour could be observed.
Affiliation(s)
- Clemens Kauffmann
  - Max Perutz Laboratories, Department of Structural and Computational Biology, University of Vienna, Vienna Biocenter Campus 5, 1030, Vienna, Austria
- Thomas C Schwarz
  - Max Perutz Laboratories, Department of Structural and Computational Biology, University of Vienna, Vienna Biocenter Campus 5, 1030, Vienna, Austria
- Robert Konrat
  - Max Perutz Laboratories, Department of Structural and Computational Biology, University of Vienna, Vienna Biocenter Campus 5, 1030, Vienna, Austria
- Anna Zawadzka-Kazimierczuk
  - Max Perutz Laboratories, Department of Structural and Computational Biology, University of Vienna, Vienna Biocenter Campus 5, 1030, Vienna, Austria
  - Faculty of Chemistry, Biological and Chemical Research Centre, University of Warsaw, Żwirki i Wigury 101, 02-089, Warsaw, Poland
|
7
|
Orioli S, Larsen AH, Bottaro S, Lindorff-Larsen K. How to learn from inconsistencies: Integrating molecular simulations with experimental data. Progress in Molecular Biology and Translational Science 2020; 170:123-176. [PMID: 32145944 DOI: 10.1016/bs.pmbts.2019.12.006] [Citation(s) in RCA: 48] [Impact Index Per Article: 12.0] [Indexed: 12/28/2022]
Abstract
Molecular simulations and biophysical experiments can be used to provide independent and complementary insights into the molecular origin of biological processes. A particularly useful strategy is to use molecular simulations as a modeling tool to interpret experimental measurements, and to use experimental data to refine our biophysical models. Thus, explicit integration and synergy between molecular simulations and experiments is fundamental for furthering our understanding of biological processes. This is especially true in the case where discrepancies between measured and simulated observables emerge. In this chapter, we provide an overview of some of the core ideas behind methods that were developed to improve the consistency between experimental information and numerical predictions. We distinguish between situations where experiments are used to refine our understanding and models of specific systems, and situations where experiments are used more generally to refine transferable models. We discuss different philosophies and attempt to unify them in a single framework. Until now, such integration between experiments and simulations has mostly been applied to equilibrium data, and we discuss more recent developments aimed at analyzing time-dependent or time-resolved data.
Affiliation(s)
- Simone Orioli
  - Structural Biology and NMR Laboratory & Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark; Structural Biophysics, Niels Bohr Institute, Faculty of Science, University of Copenhagen, Copenhagen, Denmark
- Andreas Haahr Larsen
  - Structural Biology and NMR Laboratory & Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark; Structural Biophysics, Niels Bohr Institute, Faculty of Science, University of Copenhagen, Copenhagen, Denmark
- Sandro Bottaro
  - Structural Biology and NMR Laboratory & Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark; Atomistic Simulations Laboratory, Istituto Italiano di Tecnologia, Genova, Italy
- Kresten Lindorff-Larsen
  - Structural Biology and NMR Laboratory & Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
|
8
|
Integrative Approaches in Structural Biology: A More Complete Picture from the Combination of Individual Techniques. Biomolecules 2019; 9:biom9080370. [PMID: 31416261 PMCID: PMC6723403 DOI: 10.3390/biom9080370] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 07/19/2019] [Revised: 08/08/2019] [Accepted: 08/11/2019] [Indexed: 11/21/2022] Open
Abstract
With recent technological and computational advancements, structural biology has begun to tackle increasingly difficult questions, including complex biochemical pathways and transient interactions among macromolecules. This has demonstrated that, to approach the complexity of biology, a single technique is largely insufficient and unable to yield thorough answers, whereas integrated approaches have been increasingly adopted with successful results. Traditional structural techniques (X-ray crystallography and Nuclear Magnetic Resonance (NMR)) and the emerging ones (cryo-electron microscopy (cryo-EM), Small Angle X-ray Scattering (SAXS)), together with molecular modeling, have pros and cons that complement one another well. In this review, three examples of synergistic approaches chosen from our previous research will be revisited. The first shows how the joint use of both solution and solid-state NMR (SSNMR), X-ray crystallography, and cryo-EM is crucial to elucidate the structure of polyethylene glycol (PEG)ylated asparaginase, which would not be obtainable through any of the techniques taken alone. The second deals with the integrated use of NMR, X-ray crystallography, and SAXS to elucidate the catalytic mechanism of an enzyme that is based on the flexibility of the enzyme itself. The third shows how experimental data from X-ray crystallography and NMR restraints can be combined to refine a protein model and obtain a structure that simultaneously satisfies both experimental datasets and is therefore closer to the ‘real structure’.
|
9
|
Hermann MR, Hub JS. SAXS-Restrained Ensemble Simulations of Intrinsically Disordered Proteins with Commitment to the Principle of Maximum Entropy. J Chem Theory Comput 2019; 15:5103-5115. [DOI: 10.1021/acs.jctc.9b00338] [Citation(s) in RCA: 35] [Impact Index Per Article: 7.0] [Indexed: 11/29/2022]
Affiliation(s)
- Markus R. Hermann
  - Institute for Microbiology and Genetics, Georg-August-Universität Göttingen, 37077 Göttingen, Germany
- Jochen S. Hub
  - Theoretical Physics and Center for Biophysics, Saarland University, Campus E2 6, 66123 Saarbrücken, Germany
|
10
|
Hays JM, Cafiso DS, Kasson PM. Hybrid Refinement of Heterogeneous Conformational Ensembles Using Spectroscopic Data. J Phys Chem Lett 2019; 10:3410-3414. [PMID: 31181934 PMCID: PMC6605767 DOI: 10.1021/acs.jpclett.9b01407] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 05/28/2023]
Abstract
Multistructured biomolecular systems play crucial roles in a wide variety of cellular processes but have resisted traditional methods of structure determination, which often resolve only a few low-energy states. High-resolution structure determination using experimental methods that yield distributional data remains extremely difficult, especially when the underlying conformational ensembles are quite heterogeneous. We have therefore developed a method to integrate sparse, multimodal spectroscopic data to obtain high-resolution estimates of conformational ensembles. We have tested our method by incorporating double electron-electron resonance data on the soluble N-ethylmaleimide-sensitive factor attachment receptor (SNARE) protein syntaxin-1a into biased molecular dynamics simulations. We find that our method substantially outperforms existing state-of-the-art methods in capturing syntaxin's open-closed conformational equilibrium and further yields new conformational states that are consistent with experimental data and may help in understanding syntaxin's function. Our improved methods for refining heterogeneous conformational ensembles from spectroscopic data will greatly accelerate the structural understanding of such systems.
Affiliation(s)
- Jennifer M. Hays
  - Department of Biomedical Engineering, University of Virginia, Charlottesville, VA, 22903
  - Department of Molecular Physiology and Biophysics, University of Virginia, Charlottesville, VA, 22903
- David S. Cafiso
  - Department of Molecular Physiology and Biophysics, University of Virginia, Charlottesville, VA, 22903
  - Department of Chemistry, University of Virginia, Charlottesville, VA, 22903
- Peter M. Kasson
  - Department of Biomedical Engineering, University of Virginia, Charlottesville, VA, 22903
  - Department of Molecular Physiology and Biophysics, University of Virginia, Charlottesville, VA, 22903
  - Science for Life Laboratory, Department of Cell and Molecular Biology, Uppsala University, 75124 Uppsala, Sweden
|
11
|
Amirkulova DB, White AD. Recent advances in maximum entropy biasing techniques for molecular dynamics. Molecular Simulation 2019. [DOI: 10.1080/08927022.2019.1608988] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 12/27/2022]
Affiliation(s)
- D. B. Amirkulova
  - Department of Chemical Engineering, University of Rochester, Rochester, NY, USA
- A. D. White
  - Department of Chemical Engineering, University of Rochester, Rochester, NY, USA
|
12
|
Matsunaga Y, Sugita Y. Linking time-series of single-molecule experiments with molecular dynamics simulations by machine learning. eLife 2018; 7:32668. [PMID: 29723137 PMCID: PMC5933924 DOI: 10.7554/elife.32668] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 10/10/2017] [Accepted: 04/23/2018] [Indexed: 12/27/2022] Open
Abstract
Single-molecule experiments and molecular dynamics (MD) simulations are indispensable tools for investigating protein conformational dynamics. The former provide time-series data, such as donor-acceptor distances, whereas the latter give atomistic information, although this information is often biased by model parameters. Here, we devise a machine-learning method to combine the complementary information from the two approaches and construct a consistent model of conformational dynamics. It is applied to the folding dynamics of the formin-binding protein WW domain. MD simulations over 400 μs led to an initial Markov state model (MSM), which was then "refined" using single-molecule Förster resonance energy transfer (FRET) data through hidden Markov modeling. The refined or data-assimilated MSM reproduces the FRET data and features hairpin one in the transition-state ensemble, consistent with mutation experiments. The folding pathway in the data-assimilated MSM suggests interplay between hydrophobic contacts and turn formation. Our method provides a general framework for investigating conformational transitions in other proteins.
Affiliation(s)
- Yasuhiro Matsunaga
  - Computational Biophysics Research Team, RIKEN Center for Computational Science, Kobe, Japan; JST PRESTO, Kawaguchi, Japan
- Yuji Sugita
  - Computational Biophysics Research Team, RIKEN Center for Computational Science, Kobe, Japan; Theoretical Molecular Science Laboratory, RIKEN Cluster for Pioneering Research, Wako, Japan; Laboratory for Biomolecular Function Simulation, RIKEN Center for Biosystems Dynamics Research, Kobe, Japan
|
13
|
Vasile F, Panigada M, Siccardi A, Potenza D, Tiana G. A Combined NMR-Computational Study of the Interaction between Influenza Virus Hemagglutinin and Sialic Derivatives from Human and Avian Receptors on the Surface of Transfected Cells. Int J Mol Sci 2018; 19:E1267. [PMID: 29695047 PMCID: PMC5983646 DOI: 10.3390/ijms19051267] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 03/16/2018] [Revised: 04/18/2018] [Accepted: 04/19/2018] [Indexed: 12/31/2022] Open
Abstract
The development of small-molecule inhibitors of influenza virus hemagglutinin could help counter the spread of new pandemic viruses. In this work, we made use of Nuclear Magnetic Resonance (NMR) spectroscopy to study the interaction between two derivatives of sialic acid, Neu5Ac-α-(2,6)-Gal-β-(1-4)-GlcNAc and Neu5Ac-α-(2,3)-Gal-β-(1-4)-GlcNAc, and hemagglutinin directly expressed on the surface of recombinant human cells. We analyzed the interaction of these trisaccharides with 293T cells transfected with the H5 and H1 variants of hemagglutinin, which thus retain their native trimeric conformation in such a realistic environment. By exploiting the magnetization transfer between the protein and the ligand, we obtained evidence of the binding event and identified the epitope. We analyzed the conformational features of the glycans with an approach combining NMR spectroscopy and data-driven molecular dynamics simulations, thus obtaining useful information for efficient drug design.
Affiliation(s)
- Francesca Vasile
  - Department of Chemistry, University of Milano, Via Golgi 19, 20133 Milano, Italy
- Maddalena Panigada
  - Molecular Immunology Unit, San Raffaele Research Institute, via Olgettina 58, 20132 Milano, Italy
- Antonio Siccardi
  - Molecular Immunology Unit, San Raffaele Research Institute, via Olgettina 58, 20132 Milano, Italy
- Donatella Potenza
  - Department of Chemistry, University of Milano, Via Golgi 19, 20133 Milano, Italy
- Guido Tiana
  - Center for Complexity and Biosystems and Department of Physics, University of Milano and INFN, Via Celoria 16, 20133 Milano, Italy
|
14
|
Gaalswyk K, Muniyat MI, MacCallum JL. The emerging role of physical modeling in the future of structure determination. Curr Opin Struct Biol 2018; 49:145-153. [DOI: 10.1016/j.sbi.2018.03.005] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 12/02/2017] [Revised: 03/04/2018] [Accepted: 03/05/2018] [Indexed: 10/17/2022]
|
15
|
De Martino A, De Martino D. An introduction to the maximum entropy approach and its application to inference problems in biology. Heliyon 2018; 4:e00596. [PMID: 29862358 PMCID: PMC5968179 DOI: 10.1016/j.heliyon.2018.e00596] [Citation(s) in RCA: 37] [Impact Index Per Article: 6.2] [Received: 01/31/2018] [Revised: 03/31/2018] [Accepted: 04/03/2018] [Indexed: 11/15/2022] Open
Abstract
A cornerstone of statistical inference, the maximum entropy framework is being increasingly applied to construct descriptive and predictive models of biological systems, especially complex biological networks, from large experimental data sets. Both its broad applicability and the success it obtained in different contexts hinge upon its conceptual simplicity and mathematical soundness. Here we try to concisely review the basic elements of the maximum entropy principle, starting from the notion of 'entropy', and describe its usefulness for the analysis of biological systems. As examples, we focus specifically on the problem of reconstructing gene interaction networks from expression data and on recent work attempting to expand our system-level understanding of bacterial metabolism. Finally, we highlight some extensions and potential limitations of the maximum entropy approach, and point to more recent developments that are likely to play a key role in the upcoming challenges of extracting structures and information from increasingly rich, high-throughput biological data.
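For orientation, the standard textbook form of the principle that this review introduces can be written compactly. The notation below (states x, observables f_k, constraint values F_k, Lagrange multipliers λ_k) is an assumption chosen for illustration and is not taken from the article.

```latex
\begin{aligned}
&\text{maximize}\quad S[p] = -\sum_{x} p(x)\,\ln p(x)
\quad\text{subject to}\quad \sum_{x} p(x) = 1,\qquad \sum_{x} p(x)\,f_k(x) = F_k, \\
&\Longrightarrow\quad p(x) = \frac{1}{Z(\lambda)}\exp\!\Big(-\sum_{k}\lambda_k f_k(x)\Big),
\qquad Z(\lambda) = \sum_{x}\exp\!\Big(-\sum_{k}\lambda_k f_k(x)\Big).
\end{aligned}
```

The multipliers λ_k are fixed by requiring that the constraints hold for the resulting distribution.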
Affiliation(s)
- Andrea De Martino
  - Soft & Living Matter Lab, Institute of Nanotechnology (NANOTEC), Consiglio Nazionale delle Ricerche, Rome, Italy
  - Italian Institute for Genomic Medicine (IIGM), Turin, Italy
|
16
|
Ge Y, Voelz VA. Model Selection Using BICePs: A Bayesian Approach for Force Field Validation and Parameterization. J Phys Chem B 2018. [PMID: 29518328 DOI: 10.1021/acs.jpcb.7b11871] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Indexed: 11/28/2022]
Abstract
The Bayesian Inference of Conformational Populations (BICePs) algorithm reconciles theoretical predictions of conformational state populations with sparse and/or noisy experimental measurements. Among its key advantages is its ability to perform objective model selection through a quantity we call the BICePs score, which reflects the integrated posterior evidence in favor of a given model, computed through free energy estimation methods. Here, we explore how the BICePs score can be used for force field validation and parametrization. Using a 2D lattice protein as a toy model, we demonstrate that BICePs is able to select the correct value of an interaction energy parameter given ensemble-averaged experimental distance measurements. We show that if conformational states are sufficiently fine-grained, the results are robust to experimental noise and measurement sparsity. Using these insights, we apply BICePs to perform force field evaluations for all-atom simulations of designed β-hairpin peptides against experimental NMR chemical shift measurements. These tests suggest that BICePs scores can be used for model selection in the context of all-atom simulations. We expect this approach to be particularly useful for the computational foldamer design as a tool for improving general-purpose force fields given sparse experimental measurements.
Affiliation(s)
- Yunhui Ge
  - Department of Chemistry, Temple University, Philadelphia, Pennsylvania 19122, United States
- Vincent A Voelz
  - Department of Chemistry, Temple University, Philadelphia, Pennsylvania 19122, United States
|
17
|
Combining experimental and simulation data of molecular processes via augmented Markov models. Proc Natl Acad Sci U S A 2017; 114:8265-8270. [PMID: 28716931 DOI: 10.1073/pnas.1704803114] [Citation(s) in RCA: 63] [Impact Index Per Article: 9.0] [Indexed: 01/08/2023] Open
Abstract
Accurate mechanistic description of structural changes in biomolecules is an increasingly important topic in structural and chemical biology. Markov models have emerged as a powerful way to approximate the molecular kinetics of large biomolecules while keeping full structural resolution in a divide-and-conquer fashion. However, the accuracy of these models is limited by that of the force fields used to generate the underlying molecular dynamics (MD) simulation data. Whereas the quality of classical MD force fields has improved significantly in recent years, remaining errors in the Boltzmann weights are still on the order of a few [Formula: see text], which may lead to significant discrepancies when comparing to experimentally measured rates or state populations. Here we take the view that simulations using a sufficiently good force-field sample conformations that are valid but have inaccurate weights, yet these weights may be made accurate by incorporating experimental data a posteriori. To do so, we propose augmented Markov models (AMMs), an approach that combines concepts from probability theory and information theory to consistently treat systematic force-field error and statistical errors in simulation and experiment. Our results demonstrate that AMMs can reconcile conflicting results for protein mechanisms obtained by different force fields and correct for a wide range of stationary and dynamical observables even when only equilibrium measurements are incorporated into the estimation process. This approach constitutes a unique avenue to combine experiment and computation into integrative models of biomolecular structure and dynamics.
|
18
|
The Exact Nuclear Overhauser Enhancement: Recent Advances. Molecules 2017; 22:molecules22071176. [PMID: 28708092 PMCID: PMC6152122 DOI: 10.3390/molecules22071176] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Received: 06/22/2017] [Accepted: 07/10/2017] [Indexed: 02/04/2023] Open
Abstract
Although often depicted as rigid structures, proteins are highly dynamic systems, whose motions are essential to their functions. Despite this, it is difficult to investigate protein dynamics due to the rapid timescale at which they sample their conformational space, leading most NMR-determined structures to represent only an averaged snapshot of the dynamic picture. While NMR relaxation measurements can help to determine local dynamics, it is difficult to detect translational or concerted motion, and only recently have significant advances been made to make it possible to acquire a more holistic representation of the dynamics and structural landscapes of proteins. Here, we briefly revisit our most recent progress in the theory and use of exact nuclear Overhauser enhancements (eNOEs) for the calculation of structural ensembles that describe their conformational space. New developments are primarily targeted at increasing the number and improving the quality of extracted eNOE distance restraints, such that the multi-state structure calculation can be applied to proteins of higher molecular weights. We then review the implications of the exact NOE to the protein dynamics and function of cyclophilin A and the WW domain of Pin1, and finally discuss our current research and future directions.
|
19
|
Allison JR. Using simulation to interpret experimental data in terms of protein conformational ensembles. Curr Opin Struct Biol 2017; 43:79-87. [DOI: 10.1016/j.sbi.2016.11.018] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.6] [Received: 07/28/2016] [Revised: 11/15/2016] [Accepted: 11/21/2016] [Indexed: 01/03/2023]
|
20
|
Antonov LD, Olsson S, Boomsma W, Hamelryck T. Bayesian inference of protein ensembles from SAXS data. Phys Chem Chem Phys 2017; 18:5832-8. [PMID: 26548662 DOI: 10.1039/c5cp04886a] [Citation(s) in RCA: 41] [Impact Index Per Article: 5.9] [Indexed: 02/06/2023]
Abstract
The inherent flexibility of intrinsically disordered proteins (IDPs) and multi-domain proteins with intrinsically disordered regions (IDRs) presents challenges to structural analysis. These macromolecules need to be represented by an ensemble of conformations, rather than a single structure. Small-angle X-ray scattering (SAXS) experiments capture ensemble-averaged data for the set of conformations. We present a Bayesian approach to ensemble inference from SAXS data, called Bayesian ensemble SAXS (BE-SAXS). We address two issues with existing methods: the use of a finite ensemble of structures to represent the underlying distribution, and the selection of that ensemble as a subset of an initial pool of structures. This is achieved through the formulation of a Bayesian posterior of the conformational space. BE-SAXS modifies a structural prior distribution in accordance with the experimental data. It uses multi-step expectation maximization, with alternating rounds of Markov-chain Monte Carlo simulation and empirical Bayes optimization. We demonstrate the method by employing it to obtain a conformational ensemble of the antitoxin PaaA2 and comparing the results to a published ensemble.
Affiliation(s)
- L D Antonov: Bioinformatics Centre, Department of Biology, University of Copenhagen, Ole Maaloes Vej 5, DK-2200 Copenhagen N, Denmark.
- S Olsson: Laboratory of Physical Chemistry, Swiss Federal Institute of Technology, ETH-Hönggerberg, Vladimir-Prelog-Weg 2, CH-8093 Zürich, Switzerland and Institute for Research in Biomedicine, Università della Svizzera Italiana, Via Vincenzo Vela 6, CH-6500 Bellinzona, Switzerland.
- W Boomsma: Structural Biology and NMR Laboratory, Department of Biology, University of Copenhagen, Ole Maaloes Vej 5, DK-2200 Copenhagen N, Denmark.
- T Hamelryck: Bioinformatics Centre, Department of Biology, University of Copenhagen, Ole Maaloes Vej 5, DK-2200 Copenhagen N, Denmark.
|
21
|
Ravera E, Sgheri L, Parigi G, Luchinat C. A critical assessment of methods to recover information from averaged data. Phys Chem Chem Phys 2017; 18:5686-701. [PMID: 26565805 DOI: 10.1039/c5cp04077a] [Citation(s) in RCA: 54] [Impact Index Per Article: 7.7] [Indexed: 01/25/2023]
Abstract
Conformational heterogeneity is key to the function of many biomacromolecules, but only a few groups have tried to characterize it until recently. Now, thanks to the increased throughput of experimental data and the increased computational power, the problem of the characterization of protein structural variability has become more and more popular. Several groups have devoted their efforts in trying to create quantitative, reliable and accurate protocols for extracting such information from averaged data. We analyze here different approaches, discussing strengths and weaknesses of each. All approaches can roughly be clustered into two groups: those satisfying the maximum entropy principle and those recovering ensembles composed of a restricted number of molecular conformations. In the first case, the solution focuses on the features that are common to all the infinite solutions satisfying the experimental data; in the second case, the reconstructed ensemble shows the conformational regions where a large probability can be placed. The upper limits for conformational probabilities (MaxOcc) can also be calculated. We also give an overview of the mainstream experimental observables, with considerations on the assumptions underlying their usage.
Affiliation(s)
- Enrico Ravera: Center for Magnetic Resonance (CERM) and Department of Chemistry "Ugo Schiff", University of Florence, Via L. Sacconi 6, 50019, Sesto Fiorentino, Italy.
- Luca Sgheri: Istituto per le Applicazioni del Calcolo, Sezione di Firenze, CNR, Via Madonna del Piano 10, 50019 Sesto Fiorentino, Italy.
- Giacomo Parigi: Center for Magnetic Resonance (CERM) and Department of Chemistry "Ugo Schiff", University of Florence, Via L. Sacconi 6, 50019, Sesto Fiorentino, Italy.
- Claudio Luchinat: Center for Magnetic Resonance (CERM) and Department of Chemistry "Ugo Schiff", University of Florence, Via L. Sacconi 6, 50019, Sesto Fiorentino, Italy.
|
22
|
Bonomi M, Heller GT, Camilloni C, Vendruscolo M. Principles of protein structural ensemble determination. Curr Opin Struct Biol 2017; 42:106-116. [PMID: 28063280 DOI: 10.1016/j.sbi.2016.12.004] [Citation(s) in RCA: 222] [Impact Index Per Article: 31.7] [Received: 09/15/2016] [Revised: 11/18/2016] [Accepted: 12/06/2016] [Indexed: 01/19/2023]
Abstract
The biological functions of protein molecules are intimately dependent on their conformational dynamics. This aspect is particularly evident for disordered proteins, which constitute perhaps one-third of the human proteome. Therefore, structural ensembles often offer more useful representations of proteins than individual conformations. Here, we describe how the well-established principles of protein structure determination should be extended to the case of protein structural ensembles determination. These principles concern primarily how to deal with conformationally heterogeneous states, and with experimental measurements that are averaged over such states and affected by a variety of errors. We first review the growing literature of recent methods that combine experimental and computational information to model structural ensembles, highlighting their similarities and differences. We then address some conceptual problems in the determination of structural ensembles and define future goals towards the establishment of objective criteria for the comparison, validation, visualization and dissemination of such ensembles.
Affiliation(s)
- Carlo Camilloni: Department of Chemistry and Institute for Advanced Study, Technische Universität München, D-85747 Garching, Germany.
|
23
|
Olsson S, Noé F. Mechanistic Models of Chemical Exchange Induced Relaxation in Protein NMR. J Am Chem Soc 2016; 139:200-210. [DOI: 10.1021/jacs.6b09460] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Indexed: 12/16/2022]
Affiliation(s)
- Simon Olsson: Computational Molecular Biology, FB Mathematik und Informatik, Freie Universität Berlin, Berlin 14195, Germany.
- Frank Noé: Computational Molecular Biology, FB Mathematik und Informatik, Freie Universität Berlin, Berlin 14195, Germany.
|
24
|
Cesari A, Gil-Ley A, Bussi G. Combining Simulations and Solution Experiments as a Paradigm for RNA Force Field Refinement. J Chem Theory Comput 2016; 12:6192-6200. [DOI: 10.1021/acs.jctc.6b00944] [Citation(s) in RCA: 78] [Impact Index Per Article: 9.8] [Indexed: 11/29/2022]
Affiliation(s)
- Andrea Cesari: Scuola Internazionale Superiore di Studi Avanzati (SISSA), via Bonomea 265, 34136 Trieste, Italy.
- Alejandro Gil-Ley: Scuola Internazionale Superiore di Studi Avanzati (SISSA), via Bonomea 265, 34136 Trieste, Italy.
- Giovanni Bussi: Scuola Internazionale Superiore di Studi Avanzati (SISSA), via Bonomea 265, 34136 Trieste, Italy.
|
25
|
van Gunsteren WF, Allison JR, Daura X, Dolenc J, Hansen N, Mark AE, Oostenbrink C, Rusu VH, Smith LJ. Bestimmung von Strukturinformation aus experimentellen Messdaten für Biomoleküle [Determining structural information from experimentally measured data for biomolecules]. Angew Chem Int Ed Engl 2016. [DOI: 10.1002/ange.201601828] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Indexed: 12/27/2022]
Affiliation(s)
- Wilfred F. van Gunsteren: Laboratorium für Physikalische Chemie, Eidgenössische Technische Hochschule Zürich, 8093 Zürich, Switzerland.
- Jane R. Allison: Centre for Theor. Chem. and Phys. & Institute of Natural and Mathematical Sciences, Massey Univ., Auckland, New Zealand; Biomolecular Interaction Centre, University of Canterbury, Christchurch, New Zealand; Maurice Wilkins Centre for Molecular Biodiscovery, New Zealand.
- Xavier Daura: Institute of Biotechnology and Biomedicine, Universitat Autònoma de Barcelona (UAB), 08193 Barcelona, Spain; Catalan Institution for Research and Advanced Studies (ICREA), 08010 Barcelona, Spain.
- Jožica Dolenc: Laboratorium für Physikalische Chemie, Eidgenössische Technische Hochschule Zürich, 8093 Zürich, Switzerland.
- Niels Hansen: Institut für Technische Thermodynamik und Thermische Verfahrenstechnik, Universität Stuttgart, Pfaffenwaldring 9, 70569 Stuttgart, Germany.
- Alan E. Mark: School of Chemistry and Molecular Biosciences, University of Queensland, St. Lucia QLD 4072, Australia.
- Chris Oostenbrink: Institut für Molekulare Modellierung und Simulation, Universität für Bodenkultur Wien, Vienna, Austria.
- Victor H. Rusu: Laboratorium für Physikalische Chemie, Eidgenössische Technische Hochschule Zürich, 8093 Zürich, Switzerland.
- Lorna J. Smith: Department of Chemistry, University of Oxford, Inorganic Chemistry Laboratory, South Parks Road, Oxford OX1 3QR, United Kingdom.
|
26
|
van Gunsteren WF, Allison JR, Daura X, Dolenc J, Hansen N, Mark AE, Oostenbrink C, Rusu VH, Smith LJ. Deriving Structural Information from Experimentally Measured Data on Biomolecules. Angew Chem Int Ed Engl 2016; 55:15990-16010. [PMID: 27862777 DOI: 10.1002/anie.201601828] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Received: 02/22/2016] [Revised: 07/08/2016] [Indexed: 12/27/2022]
Abstract
During the past half century, the number and accuracy of experimental techniques that can deliver values of observables for biomolecular systems have been steadily increasing. The conversion of a measured value Qexp of an observable quantity Q into structural information is, however, a task beset with theoretical and practical problems: 1) insufficient or inaccurate values of Qexp, 2) inaccuracies in the function Q(r→) used to relate the quantity Q to structure r→, 3) how to account for the averaging inherent in the measurement of Qexp, 4) how to handle the possible multiple-valuedness of the inverse r→(Q) of the function Q(r→), to mention a few. These apply to a variety of observable quantities Q and measurement techniques such as X-ray and neutron diffraction, small-angle and wide-angle X-ray scattering, free-electron laser imaging, cryo-electron microscopy, nuclear magnetic resonance, electron paramagnetic resonance, infrared and Raman spectroscopy, circular dichroism, Förster resonance energy transfer, atomic force microscopy and ion-mobility mass spectrometry. The process of deriving structural information from measured data is reviewed with an eye to non-experts and newcomers in the field using examples from the literature of the effect of the various choices and approximations involved in the process. A list of choices to be avoided is provided.
Affiliation(s)
- Wilfred F van Gunsteren: Laboratory of Physical Chemistry, Swiss Federal Institute of Technology, ETH, 8093, Zurich, Switzerland.
- Jane R Allison: Centre for Theor. Chem. and Phys. & Institute of Natural and Mathematical Sciences, Massey Univ., Auckland, New Zealand; Biomolecular Interaction Centre, University of Canterbury, Christchurch, New Zealand; Maurice Wilkins Centre for Molecular Biodiscovery, New Zealand.
- Xavier Daura: Institute of Biotechnology and Biomedicine, Universitat Autònoma de Barcelona (UAB), 08193, Barcelona, Spain; Catalan Institution for Research and Advanced Studies (ICREA), 08010, Barcelona, Spain.
- Jožica Dolenc: Laboratory of Physical Chemistry, Swiss Federal Institute of Technology, ETH, 8093, Zurich, Switzerland.
- Niels Hansen: Institute of Thermodynamics and Thermal Process Engineering, University of Stuttgart, Pfaffenwaldring 9, 70569, Stuttgart, Germany.
- Alan E Mark: School of Chemistry and Molecular Biosciences, University of Queensland, St. Lucia, QLD 4072, Australia.
- Chris Oostenbrink: Institute of Molecular Modeling and Simulation, University of Natural Resources and Life Sciences, Vienna, Austria.
- Victor H Rusu: Laboratory of Physical Chemistry, Swiss Federal Institute of Technology, ETH, 8093, Zurich, Switzerland.
- Lorna J Smith: Department of Chemistry, University of Oxford, Inorganic Chemistry Laboratory, South Parks Road, Oxford, OX1 3QR, UK.
|
27
|
The Dynamic Basis for Signal Propagation in Human Pin1-WW. Structure 2016; 24:1464-75. [PMID: 27499442 DOI: 10.1016/j.str.2016.06.013] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 03/25/2016] [Revised: 06/11/2016] [Accepted: 06/14/2016] [Indexed: 12/23/2022]
Abstract
Allostery is the structural manifestation of information transduction in biomolecules. Its hallmark is conformational change induced by perturbations at a distal site. An increasing body of evidence demonstrates the presence of allostery in very flexible and even disordered proteins, encouraging a thermodynamic description of this phenomenon. Still, resolving such processes at atomic resolution is difficult. Here we establish a protocol to determine atomistic thermodynamic models of such systems using high-resolution solution state nuclear magnetic resonance data and extensive molecular simulations. Using this methodology, we study information transduction in the WW domain of a key cell-cycle regulator Pin1. Pin1 binds promiscuously to phospho-Ser/Thr-Pro motifs, however, disparate structural and dynamic responses have been reported upon binding different ligands. Our model consists of two topologically distinct states whose relative population may be specifically skewed by an incoming ligand. This model provides a canonical basis for the understanding of multi-functionality in Pin1.
|
28
|
ENCORE: Software for Quantitative Ensemble Comparison. PLoS Comput Biol 2015; 11:e1004415. [PMID: 26505632 PMCID: PMC4624683 DOI: 10.1371/journal.pcbi.1004415] [Citation(s) in RCA: 50] [Impact Index Per Article: 5.6] [Received: 03/12/2015] [Accepted: 06/24/2015] [Indexed: 12/15/2022] Open
Abstract
There is increasing evidence that protein dynamics and conformational changes can play an important role in modulating biological function. As a result, experimental and computational methods are being developed, often synergistically, to study the dynamical heterogeneity of a protein or other macromolecules in solution. Thus, methods such as molecular dynamics simulations or ensemble refinement approaches have provided conformational ensembles that can be used to understand protein function and biophysics. These developments have in turn created a need for algorithms and software that can be used to compare structural ensembles in the same way as the root-mean-square-deviation is often used to compare static structures. Although a few such approaches have been proposed, these can be difficult to implement efficiently, hindering broader application and further development. Here, we present an easily accessible software toolkit, called ENCORE, which can be used to compare conformational ensembles generated either from simulations alone or synergistically with experiments. ENCORE implements three previously described methods for ensemble comparison, each of which can be used to quantify the similarity between conformational ensembles by estimating the overlap between the probability distributions that underlie them. We demonstrate the kinds of insights that can be obtained by providing examples of three typical use-cases: comparing ensembles generated with different molecular force fields, assessing convergence in molecular simulations, and calculating differences and similarities in structural ensembles refined with various sources of experimental data. We also demonstrate efficient computational scaling for typical analyses, and robustness against both the size and sampling of the ensembles. ENCORE is freely available and extendable, integrates with the established MDAnalysis software package, reads ensemble data in many common formats, and can work with large trajectory files.
|
29
|
Olsson S, Cavalli A. Quantification of Entropy-Loss in Replica-Averaged Modeling. J Chem Theory Comput 2015; 11:3973-7. [DOI: 10.1021/acs.jctc.5b00579] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 12/23/2022]
Affiliation(s)
- Simon Olsson: Institute for Research in Biomedicine, Via Vincenzo Vela 6, CH-6500 Bellinzona, Ticino, Switzerland; Laboratory of Physical Chemistry, Swiss Federal Institute of Technology, ETH-Hönggerberg, Vladimir-Prelog-Weg 2, CH-8093 Zürich, Zürich, Switzerland.
- Andrea Cavalli: Institute for Research in Biomedicine, Via Vincenzo Vela 6, CH-6500 Bellinzona, Ticino, Switzerland; Department of Chemistry, University of Cambridge, Cambridge, CB2 1EW United Kingdom.
|
30
|
White AD, Dama JF, Voth GA. Designing Free Energy Surfaces That Match Experimental Data with Metadynamics. J Chem Theory Comput 2015; 11:2451-60. [DOI: 10.1021/acs.jctc.5b00178] [Citation(s) in RCA: 45] [Impact Index Per Article: 5.0] [Indexed: 01/19/2023]
Affiliation(s)
- Andrew D. White: Department of Chemistry, James Franck Institute, Institute for Biophysical Dynamics, and Computation Institute, The University of Chicago, 5735 South Ellis Avenue, Chicago, Illinois 60637, United States; Center for Nonlinear Studies, Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States.
- James F. Dama: Department of Chemistry, James Franck Institute, Institute for Biophysical Dynamics, and Computation Institute, The University of Chicago, 5735 South Ellis Avenue, Chicago, Illinois 60637, United States; Center for Nonlinear Studies, Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States.
- Gregory A. Voth: Department of Chemistry, James Franck Institute, Institute for Biophysical Dynamics, and Computation Institute, The University of Chicago, 5735 South Ellis Avenue, Chicago, Illinois 60637, United States; Center for Nonlinear Studies, Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States.
|
31
|
Olsson S, Ekonomiuk D, Sgrignani J, Cavalli A. Molecular Dynamics of Biomolecules through Direct Analysis of Dipolar Couplings. J Am Chem Soc 2015; 137:6270-8. [PMID: 25895902 DOI: 10.1021/jacs.5b01289] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Indexed: 11/29/2022]
Abstract
Residual dipolar couplings (RDCs) are important probes in structural biology, but their analysis is often complicated by the determination of an alignment tensor or its associated assumptions. We here apply the maximum entropy principle to derive a tensor-free formalism which allows for direct, dynamic analysis of RDCs and holds the classic tensor formalism as a special case. Specifically, the framework enables us to robustly analyze data regardless of whether a clear separation of internal and overall dynamics is possible. Such a separation is often difficult in the core subjects of current structural biology, which include multidomain and intrinsically disordered proteins as well as nucleic acids. We demonstrate the method is tractable and self-consistent and generalizes to data sets comprised of observations from multiple different alignment conditions.
Affiliation(s)
- Simon Olsson: Institute for Research in Biomedicine, Via Vincenzo Vela 6, CH-6500 Bellinzona, Switzerland; Laboratory of Physical Chemistry, Eidgenössische Technische Hochschule Zürich, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland.
- Dariusz Ekonomiuk: Institute for Research in Biomedicine, Via Vincenzo Vela 6, CH-6500 Bellinzona, Switzerland.
- Jacopo Sgrignani: Institute for Research in Biomedicine, Via Vincenzo Vela 6, CH-6500 Bellinzona, Switzerland.
- Andrea Cavalli: Institute for Research in Biomedicine, Via Vincenzo Vela 6, CH-6500 Bellinzona, Switzerland; Department of Chemistry, University of Cambridge, Cambridge, CB2 1EW United Kingdom.
|
32
|
Beauchamp KA, Pande VS, Das R. Bayesian energy landscape tilting: towards concordant models of molecular ensembles. Biophys J 2014; 106:1381-90. [PMID: 24655513 DOI: 10.1016/j.bpj.2014.02.009] [Citation(s) in RCA: 43] [Impact Index Per Article: 4.3] [Received: 11/18/2013] [Revised: 01/19/2014] [Accepted: 02/06/2014] [Indexed: 11/28/2022] Open
Abstract
Predicting biological structure has remained challenging for systems such as disordered proteins that take on myriad conformations. Hybrid simulation/experiment strategies have been undermined by difficulties in evaluating errors from computational model inaccuracies and data uncertainties. Building on recent proposals from maximum entropy theory and nonequilibrium thermodynamics, we address these issues through a Bayesian energy landscape tilting (BELT) scheme for computing Bayesian hyperensembles over conformational ensembles. BELT uses Markov chain Monte Carlo to directly sample maximum-entropy conformational ensembles consistent with a set of input experimental observables. To test this framework, we apply BELT to model trialanine, starting from disagreeing simulations with the force fields ff96, ff99, ff99sbnmr-ildn, CHARMM27, and OPLS-AA. BELT incorporation of limited chemical shift and ³J measurements gives convergent values of the peptide's α, β, and PPII conformational populations in all cases. As a test of predictive power, all five BELT hyperensembles recover set-aside measurements not used in the fitting and report accurate errors, even when starting from highly inaccurate simulations. BELT's principled framework thus enables practical predictions for complex biomolecular systems from discordant simulations and sparse data.
Affiliation(s)
- Kyle A Beauchamp: Computational Biology Center, Memorial Sloan-Kettering Cancer Center, New York, New York.
- Vijay S Pande: Departments of Chemistry, Computer Science, and Structural Biology and Biophysics Program, Stanford University, Stanford, California.
- Rhiju Das: Departments of Biochemistry and Physics and Biophysics Program, Stanford University, Stanford, California.
|
33
|
Hansen N, Heller F, Schmid N, van Gunsteren WF. Time-averaged order parameter restraints in molecular dynamics simulations. JOURNAL OF BIOMOLECULAR NMR 2014; 60:169-187. [PMID: 25312596 DOI: 10.1007/s10858-014-9866-7] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Received: 05/13/2014] [Accepted: 09/25/2014] [Indexed: 06/04/2023]
Abstract
A method is described that allows experimental S² order parameters to be enforced as a time-averaged quantity in molecular dynamics simulations. The two parameters that characterize time-averaged restraining, the memory relaxation time and the weight of the restraining potential energy term in the potential energy function used in the simulation, are systematically investigated based on two model systems, a vector with one end restrained in space and a pentapeptide. For the latter it is shown that the backbone N-H order parameter of individual residues can be enforced such that the spatial fluctuations of quantities depending on atomic coordinates are not significantly perturbed. The applicability to realistic systems is illustrated for the B3 domain of protein G in aqueous solution.
Affiliation(s)
- Niels Hansen: Laboratory of Physical Chemistry, Swiss Federal Institute of Technology, ETH, 8093, Zurich, Switzerland.
|
34
|
Voelz VA, Zhou G. Bayesian inference of conformational state populations from computational models and sparse experimental observables. J Comput Chem 2014; 35:2215-24. [PMID: 25250719 DOI: 10.1002/jcc.23738] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Received: 06/09/2014] [Revised: 08/25/2014] [Accepted: 08/31/2014] [Indexed: 12/29/2022]
Abstract
We present a Bayesian inference approach to estimating conformational state populations from a combination of molecular modeling and sparse experimental data. Unlike alternative approaches, our method is designed for use with small molecules and emphasizes high-resolution structural models, using inferential structure determination with reference potentials, and Markov Chain Monte Carlo to sample the posterior distribution of conformational states. As an application of the method, we determine solution-state conformational populations of the 14-membered macrocycle cineromycin B, using a combination of previously published sparse Nuclear Magnetic Resonance (NMR) observables and replica-exchange molecular dynamic/Quantum Mechanical (QM)-refined conformational ensembles. Our results agree better with experimental data compared to previous modeling efforts. Bayes factors are calculated to quantify the consistency of computational modeling with experiment, and the relative importance of reference potentials and other model parameters.
Affiliation(s)
- Vincent A Voelz: Department of Chemistry, Temple University, Philadelphia, Pennsylvania.
|
35
|
Equilibrium simulations of proteins using molecular fragment replacement and NMR chemical shifts. Proc Natl Acad Sci U S A 2014; 111:13852-7. [PMID: 25192938 DOI: 10.1073/pnas.1404948111] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.8] [Indexed: 02/03/2023] Open
Abstract
Methods of protein structure determination based on NMR chemical shifts are becoming increasingly common. The most widely used approaches adopt the molecular fragment replacement strategy, in which structural fragments are repeatedly reassembled into different complete conformations in molecular simulations. Although these approaches are effective in generating individual structures consistent with the chemical shift data, they do not enable the sampling of the conformational space of proteins with correct statistical weights. Here, we present a method of molecular fragment replacement that makes it possible to perform equilibrium simulations of proteins, and hence to determine their free energy landscapes. This strategy is based on the encoding of the chemical shift information in a probabilistic model in Markov chain Monte Carlo simulations. First, we demonstrate that with this approach it is possible to fold proteins to their native states starting from extended structures. Second, we show that the method satisfies the detailed balance condition and hence it can be used to carry out an equilibrium sampling from the Boltzmann distribution corresponding to the force field used in the simulations. Third, by comparing the results of simulations carried out with and without chemical shift restraints we describe quantitatively the effects that these restraints have on the free energy landscapes of proteins. Taken together, these results demonstrate that the molecular fragment replacement strategy can be used in combination with chemical shift information to characterize not only the native structures of proteins but also their conformational fluctuations.
|
36
|
Olsson S, Vögeli BR, Cavalli A, Boomsma W, Ferkinghoff-Borg J, Lindorff-Larsen K, Hamelryck T. Probabilistic Determination of Native State Ensembles of Proteins. J Chem Theory Comput 2014; 10:3484-91. [DOI: 10.1021/ct5001236] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.6] [Indexed: 11/29/2022]
Affiliation(s)
- Simon Olsson
  - Bioinformatics Centre, Department of Biology, Faculty of Science, University of Copenhagen, Copenhagen, Denmark
  - Institute for Research in Biomedicine, CH-6500 Bellinzona, Switzerland
- Beat Rolf Vögeli
  - Laboratory of Physical Chemistry, Eidgenössische Technische Hochschule Zürich, 8093 Zürich, Switzerland
- Andrea Cavalli
  - Institute for Research in Biomedicine, CH-6500 Bellinzona, Switzerland
- Wouter Boomsma
  - Structural Biology and NMR Laboratory, Department of Biology, Faculty of Science, University of Copenhagen, Copenhagen, Denmark
- Jesper Ferkinghoff-Borg
  - Cellular Signal Integration Group, Center for Biological Sequence Analysis, Technical University of Denmark, Lyngby, Denmark
- Kresten Lindorff-Larsen
  - Structural Biology and NMR Laboratory, Department of Biology, Faculty of Science, University of Copenhagen, Copenhagen, Denmark
- Thomas Hamelryck
  - Bioinformatics Centre, Department of Biology, Faculty of Science, University of Copenhagen, Copenhagen, Denmark
|
37
|
Vögeli B, Orts J, Strotz D, Chi C, Minges M, Wälti MA, Güntert P, Riek R. Towards a true protein movie: a perspective on the potential impact of the ensemble-based structure determination using exact NOEs. JOURNAL OF MAGNETIC RESONANCE (SAN DIEGO, CALIF. : 1997) 2014; 241:53-59. [PMID: 24656080 DOI: 10.1016/j.jmr.2013.11.016] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Received: 09/26/2013] [Revised: 11/15/2013] [Accepted: 11/18/2013] [Indexed: 06/03/2023]
Abstract
Confined by the Boltzmann distribution of the energies of the states, a multitude of structural states are inherent to biomolecules. For a detailed understanding of a protein's function, its entire structural landscape at atomic resolution and insight into the interconversion between all the structural states (i.e. dynamics) are required. Whereas dedicated trickery with NMR relaxation provides aspects of local dynamics, and 3D structure determination by NMR is well established, only recently have several attempts been made to formulate a more comprehensive description of the dynamics and the structural landscape of a protein. Here, a perspective is given on the use of exact NOEs (eNOEs) for the elucidation of structural ensembles of a protein describing the covered conformational space.
Affiliation(s)
- Beat Vögeli
  - Laboratory of Physical Chemistry, ETH Zurich, ETH-Hönggerberg, CH-8093 Zürich, Switzerland
- Julien Orts
  - Laboratory of Physical Chemistry, ETH Zurich, ETH-Hönggerberg, CH-8093 Zürich, Switzerland
- Dean Strotz
  - Laboratory of Physical Chemistry, ETH Zurich, ETH-Hönggerberg, CH-8093 Zürich, Switzerland
- Celestine Chi
  - Laboratory of Physical Chemistry, ETH Zurich, ETH-Hönggerberg, CH-8093 Zürich, Switzerland
- Martina Minges
  - Laboratory of Physical Chemistry, ETH Zurich, ETH-Hönggerberg, CH-8093 Zürich, Switzerland
- Marielle Aulikki Wälti
  - Laboratory of Physical Chemistry, ETH Zurich, ETH-Hönggerberg, CH-8093 Zürich, Switzerland
- Peter Güntert
  - Institute of Biophysical Chemistry, Center for Biomolecular Magnetic Resonance, and Frankfurt Institute for Advanced Studies, J.W. Goethe-Universität, Max-von-Laue-Str. 9, 60438 Frankfurt am Main, Germany
  - Graduate School of Science, Tokyo Metropolitan University, Hachioji, 192-0397 Tokyo, Japan
- Roland Riek
  - Laboratory of Physical Chemistry, ETH Zurich, ETH-Hönggerberg, CH-8093 Zürich, Switzerland
|
38
|
Vögeli B. The nuclear Overhauser effect from a quantitative perspective. PROGRESS IN NUCLEAR MAGNETIC RESONANCE SPECTROSCOPY 2014; 78:1-46. [PMID: 24534087 DOI: 10.1016/j.pnmrs.2013.11.001] [Citation(s) in RCA: 91] [Impact Index Per Article: 9.1] [Received: 11/06/2013] [Accepted: 11/13/2013] [Indexed: 05/26/2023]
Abstract
The nuclear Overhauser enhancement or effect (NOE) is the most important measure in liquid-state NMR with macromolecules. Thus, the NOE is the subject of numerous reviews and books. Here, the NOE is revisited in light of our recently introduced measurements of exact nuclear Overhauser enhancements (eNOEs), which enabled the determination of multiple-state 3D protein structures. This review encompasses all relevant facets from the theoretical considerations to the use of eNOEs in multiple-state structure calculation. Important aspects include a detailed presentation of the relaxation theory relevant for the nuclear Overhauser effect, the estimation of the correction for spin diffusion, the experimental determination of the eNOEs, the conversion of eNOE rates into distances and validation of their quality, the distance-restraint classification and the protocols for calculation of structures and ensembles.
Affiliation(s)
- Beat Vögeli
  - Laboratory of Physical Chemistry, HCI F217, Wolfgang-Pauli-Str. 10, Swiss Federal Institute of Technology, ETH-Hönggerberg, CH-8093 Zürich, Switzerland
|
39
|
Boomsma W, Ferkinghoff-Borg J, Lindorff-Larsen K. Combining experiments and simulations using the maximum entropy principle. PLoS Comput Biol 2014; 10:e1003406. [PMID: 24586124 PMCID: PMC3930489 DOI: 10.1371/journal.pcbi.1003406] [Citation(s) in RCA: 140] [Impact Index Per Article: 14.0] [Indexed: 11/30/2022] Open
Abstract
A key component of computational biology is to compare the results of computer modelling with experimental measurements. Despite substantial progress in the models and algorithms used in many areas of computational biology, such comparisons sometimes reveal that the computations are not in quantitative agreement with experimental data. The principle of maximum entropy is a general procedure for constructing probability distributions in the light of new data, making it a natural tool in cases when an initial model provides results that are at odds with experiments. The number of maximum entropy applications in our field has grown steadily in recent years, in areas as diverse as sequence analysis, structural modelling, and neurobiology. In this Perspectives article, we give a broad introduction to the method, in an attempt to encourage its further adoption. The general procedure is explained in the context of a simple example, after which we proceed with a real-world application in the field of molecular simulations, where the maximum entropy procedure has recently provided new insight. Given the limited accuracy of force fields, macromolecular simulations sometimes produce results that are not in complete and quantitative accordance with experiments. A common solution to this problem is to explicitly ensure agreement between the two by perturbing the potential energy function towards the experimental data. So far, a general consensus for how such perturbations should be implemented has been lacking. Three very recent papers have explored this problem using the maximum entropy approach, providing both new theoretical and practical insights to the problem. We highlight each of these contributions in turn and conclude with a discussion on remaining challenges.
Affiliation(s)
- Wouter Boomsma
  - Structural Biology and NMR Laboratory, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Jesper Ferkinghoff-Borg
  - Cellular Signal Integration Group, Center for Biological Sequence Analysis, Technical University of Denmark, Lyngby, Denmark
- Kresten Lindorff-Larsen
  - Structural Biology and NMR Laboratory, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- * E-mail: (WB); (JFB); (KLL)
|
40
|
Sanchez-Martinez M, Crehuet R. Application of the maximum entropy principle to determine ensembles of intrinsically disordered proteins from residual dipolar couplings. Phys Chem Chem Phys 2014; 16:26030-9. [DOI: 10.1039/c4cp03114h] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Indexed: 12/28/2022]
Abstract
We present a method based on the maximum entropy principle that can re-weight an ensemble of protein structures based on data from residual dipolar couplings (RDCs).
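As a rough illustration of what maximum entropy re-weighting of an ensemble means in general, the following sketch re-weights a set of conformers so that the weighted average of one scalar observable matches a target value. It is not the authors' RDC procedure: the function name `maxent_reweight`, the single-observable setup, the bisection solver and the toy numbers are all assumptions made for this example, and a real RDC application would need back-calculated couplings and many simultaneous restraints.

```python
import numpy as np

def maxent_reweight(values, target, prior=None,
                    lam_bounds=(-50.0, 50.0), tol=1e-10, max_iter=200):
    """Hypothetical helper: MaxEnt re-weighting for ONE scalar observable.

    Weights have the form w_i ∝ prior_i * exp(-lam * values_i); lam is the
    Lagrange multiplier chosen so that sum_i w_i * values_i == target.
    Assumes strictly positive prior weights and a target lying strictly
    between min(values) and max(values).
    """
    values = np.asarray(values, dtype=float)
    n = values.size
    if prior is None:
        prior = np.full(n, 1.0 / n)           # uniform starting ensemble
    else:
        prior = np.asarray(prior, dtype=float)
        prior = prior / prior.sum()

    def weights(lam):
        logw = np.log(prior) - lam * values
        logw -= logw.max()                     # guard against overflow
        w = np.exp(logw)
        return w / w.sum()

    # The re-weighted mean decreases monotonically with lam
    # (its derivative is minus the weighted variance), so bisection works.
    lo, hi = lam_bounds
    mid = 0.5 * (lo + hi)
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        mean = weights(mid) @ values
        if abs(mean - target) < tol:
            break
        if mean > target:
            lo = mid                           # larger lam lowers the mean
        else:
            hi = mid
    return weights(mid), mid

# Toy usage: four conformers with a made-up observable, target average 3.0.
obs = np.array([1.0, 2.0, 3.0, 4.0])
w, lam = maxent_reweight(obs, target=3.0)
```

Among all weight sets that reproduce the target average, the exponential form above is the one with the smallest relative entropy to the starting weights, which is why the solver only has to search over the single parameter `lam`.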
Affiliation(s)
- R. Crehuet
  - Institute of Advanced Chemistry of Catalunya (IQAC), CSIC, Spain
|