51
|
Fröhlking T, Bernetti M, Calonaci N, Bussi G. Toward empirical force fields that match experimental observables. J Chem Phys 2020; 152:230902. [PMID: 32571067 DOI: 10.1063/5.0011346] [Citation(s) in RCA: 41] [Impact Index Per Article: 8.2] [Indexed: 12/11/2022]
Abstract
Biomolecular force fields have traditionally been derived from a mixture of reference quantum chemistry data and experimental information obtained on small fragments. However, the ability to run extensive molecular dynamics simulations on larger systems while achieving ergodic sampling is paving the way to using such simulations directly, together with solution experiments obtained on macromolecular systems. Recently, a number of methods have been introduced to automate this approach. Here, we review these methods, highlight their relationship with machine learning methods, and discuss the open challenges in the field.
Affiliation(s)
- Thorben Fröhlking
- Scuola Internazionale Superiore di Studi Avanzati, Via Bonomea 265, Trieste 34136, Italy
- Mattia Bernetti
- Scuola Internazionale Superiore di Studi Avanzati, Via Bonomea 265, Trieste 34136, Italy
- Nicola Calonaci
- Scuola Internazionale Superiore di Studi Avanzati, Via Bonomea 265, Trieste 34136, Italy
- Giovanni Bussi
- Scuola Internazionale Superiore di Studi Avanzati, Via Bonomea 265, Trieste 34136, Italy
|
52
|
Cuturello F, Tiana G, Bussi G. Assessing the accuracy of direct-coupling analysis for RNA contact prediction. RNA 2020; 26:637-647. [PMID: 32115426 PMCID: PMC7161351 DOI: 10.1261/rna.074179.119] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 11/29/2019] [Accepted: 02/26/2020] [Indexed: 05/31/2023]
Abstract
Many noncoding RNAs are known to play a role in the cell directly linked to their structure. Structure prediction based on sequence alone is, however, a challenging task. On the other hand, thanks to the low cost of sequencing technologies, a very large number of homologous sequences are becoming available for many RNA families. In the protein community, the idea of exploiting the covariance of mutations within a family to predict the protein structure using the direct-coupling-analysis (DCA) method has emerged in the last decade. The application of DCA to RNA systems has been limited so far. We here perform an assessment of the DCA method on 17 riboswitch families, comparing it with the commonly used mutual information analysis and with the state-of-the-art R-scape covariance method. We also compare different flavors of DCA, including mean-field, pseudolikelihood, and a proposed stochastic procedure (Boltzmann learning) for solving the DCA inverse problem exactly. Boltzmann learning outperforms the other methods in predicting contacts observed in high-resolution crystal structures.
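As a rough illustration of the mutual information baseline mentioned in the abstract above, the sketch below scores the covariation between two columns of a multiple sequence alignment. It is a minimal sketch under stated assumptions, not the procedure used in the paper: frequencies are plain counts with no pseudocounts, no sequence reweighting, and no background correction, and the function names and toy alignment are hypothetical.

```python
from collections import Counter
from math import log


def mutual_information(col_i, col_j):
    """Mutual information (in nats) between two alignment columns.

    Frequencies are raw counts; no pseudocounts or sequence reweighting
    are applied (an assumption made to keep the example short).
    """
    if len(col_i) != len(col_j) or not col_i:
        raise ValueError("columns must be non-empty and of equal length")
    n = len(col_i)
    count_i = Counter(col_i)
    count_j = Counter(col_j)
    count_ij = Counter(zip(col_i, col_j))
    mi = 0.0
    for (a, b), c in count_ij.items():
        p_ab = c / n
        # p_ab / (p_a * p_b), written in terms of counts
        mi += p_ab * log(p_ab * n * n / (count_i[a] * count_j[b]))
    return mi


def column(alignment, k):
    """Extract column k from a list of equal-length aligned sequences."""
    return [seq[k] for seq in alignment]


# Hypothetical toy alignment: positions 0 and 2 covary like a base pair,
# position 1 is fully conserved.
toy_alignment = ["GAC", "CAG", "GAC", "CAG", "AAU", "UAA"]
paired_score = mutual_information(column(toy_alignment, 0), column(toy_alignment, 2))
conserved_score = mutual_information(column(toy_alignment, 0), column(toy_alignment, 1))
```

Here `paired_score` is positive while `conserved_score` is zero, because a conserved column carries no covariation signal. Scores of this kind treat each pair of positions in isolation, which is the limitation that global models such as DCA are designed to address.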
Affiliation(s)
- Francesca Cuturello
- Scuola Internazionale Superiore di Studi Avanzati, International School for Advanced Studies, 34136 Trieste, Italy
- Guido Tiana
- Center for Complexity and Biosystems and Department of Physics, Università degli Studi di Milano and INFN, 20133 Milano, Italy
- Giovanni Bussi
- Scuola Internazionale Superiore di Studi Avanzati, International School for Advanced Studies, 34136 Trieste, Italy
|
53
|
Kauffmann C, Kazimierczuk K, Schwarz TC, Konrat R, Zawadzka-Kazimierczuk A. A novel high-dimensional NMR experiment for resolving protein backbone dihedral angle ambiguities. J Biomol NMR 2020; 74:257-265. [PMID: 32239382 PMCID: PMC7211790 DOI: 10.1007/s10858-020-00308-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 11/22/2019] [Accepted: 03/12/2020] [Indexed: 05/07/2023]
Abstract
Intrinsically disordered proteins (IDPs) are challenging established structural biology perception and urge a reassessment of the conventional understanding of the subtle interplay between protein structure and dynamics. Due to their importance in eukaryotic life and central role in protein interaction networks, IDP research is a fascinating and highly relevant research area in which NMR spectroscopy is destined to be a key player. The flexible nature of IDPs, as a result of the sampling of a vast conformational space, however, poses a tremendous scientific challenge, both technically and theoretically. Pronounced signal averaging results in narrow signal dispersion and requires higher dimensionality NMR techniques. Moreover, a fundamental problem in the structural characterization of IDPs is the definition of the conformational ensemble sampled by the polypeptide chain in solution, where often the interpretation relies on the concept of 'residual structure' or 'conformational preference'. An important source of structural information is information-rich NMR experiments that probe protein backbone dihedral angles in a unique manner. Cross-correlated relaxation experiments have proven to fulfil this task as they provide unique information about protein backbones, particularly in IDPs. Here we present a novel cross-correlation experiment that utilizes non-uniform sampling detection schemes to resolve protein backbone dihedral ambiguities in IDPs. The sensitivity of this novel technique is illustrated with an application to the prototypical IDP α-synuclein, for which unexpected deviations from random-coil-like behaviour could be observed.
Affiliation(s)
- Clemens Kauffmann
- Max Perutz Laboratories, Department of Structural and Computational Biology, University of Vienna, Vienna Biocenter Campus 5, 1030, Vienna, Austria
- Thomas C Schwarz
- Max Perutz Laboratories, Department of Structural and Computational Biology, University of Vienna, Vienna Biocenter Campus 5, 1030, Vienna, Austria
- Robert Konrat
- Max Perutz Laboratories, Department of Structural and Computational Biology, University of Vienna, Vienna Biocenter Campus 5, 1030, Vienna, Austria
- Anna Zawadzka-Kazimierczuk
- Max Perutz Laboratories, Department of Structural and Computational Biology, University of Vienna, Vienna Biocenter Campus 5, 1030, Vienna, Austria
- Faculty of Chemistry, Biological and Chemical Research Centre, University of Warsaw, Żwirki i Wigury 101, 02-089, Warsaw, Poland
|
54
|
Jia Z, Li J, Ge X, Wu Y, Guo Y, Wu Q. Tandem CTCF sites function as insulators to balance spatial chromatin contacts and topological enhancer-promoter selection. Genome Biol 2020; 21:75. [PMID: 32293525 PMCID: PMC7087399 DOI: 10.1186/s13059-020-01984-7] [Citation(s) in RCA: 45] [Impact Index Per Article: 9.0] [Received: 02/04/2020] [Accepted: 03/04/2020] [Indexed: 11/18/2022]
Abstract
BACKGROUND: CTCF is a key insulator-binding protein, and mammalian genomes contain numerous CTCF sites, many of which are organized in tandem.
RESULTS: Using CRISPR DNA-fragment editing, in conjunction with chromosome conformation capture, we find that CTCF sites, if located between enhancers and promoters in the protocadherin (Pcdh) and β-globin clusters, function as an enhancer-blocking insulator by forming distinct directional chromatin loops, regardless of whether enhancers contain CTCF sites or not. Moreover, computational simulation in silico and genetic deletions in vivo as well as dCas9 blocking in vitro revealed balanced promoter usage in cell populations and stochastic monoallelic expression in single cells by large arrays of tandem CTCF sites in the Pcdh and immunoglobulin heavy chain (Igh) clusters. Furthermore, CTCF insulators promote, counter-intuitively, long-range chromatin interactions with distal directional CTCF sites, consistent with the cohesin "loop extrusion" model. Finally, gene expression levels are negatively correlated with CTCF insulators located between enhancers and promoters on a genome-wide scale. Thus, single CTCF insulators ensure proper enhancer insulation and promoter activation while tandem CTCF topological insulators determine balanced spatial contacts and promoter choice.
CONCLUSIONS: These findings have interesting implications for the role of topological chromatin insulators in 3D genome folding and developmental gene regulation.
Affiliation(s)
- Zhilian Jia
- MOE Key Lab of Systems Biomedicine, Center for Comparative Biomedicine, State Key Lab of Oncogenes and Related Genes, Shanghai Cancer Institute, Joint International Research Laboratory of Metabolic & Developmental Sciences, Institute of Systems Biomedicine, Xin Hua Hospital, Shanghai Jiao Tong University, Shanghai, 200240, China
- Jingwei Li
- MOE Key Lab of Systems Biomedicine, Center for Comparative Biomedicine, State Key Lab of Oncogenes and Related Genes, Shanghai Cancer Institute, Joint International Research Laboratory of Metabolic & Developmental Sciences, Institute of Systems Biomedicine, Xin Hua Hospital, Shanghai Jiao Tong University, Shanghai, 200240, China
- Xiao Ge
- MOE Key Lab of Systems Biomedicine, Center for Comparative Biomedicine, State Key Lab of Oncogenes and Related Genes, Shanghai Cancer Institute, Joint International Research Laboratory of Metabolic & Developmental Sciences, Institute of Systems Biomedicine, Xin Hua Hospital, Shanghai Jiao Tong University, Shanghai, 200240, China
- Yonghu Wu
- MOE Key Lab of Systems Biomedicine, Center for Comparative Biomedicine, State Key Lab of Oncogenes and Related Genes, Shanghai Cancer Institute, Joint International Research Laboratory of Metabolic & Developmental Sciences, Institute of Systems Biomedicine, Xin Hua Hospital, Shanghai Jiao Tong University, Shanghai, 200240, China
- Ya Guo
- MOE Key Lab of Systems Biomedicine, Center for Comparative Biomedicine, State Key Lab of Oncogenes and Related Genes, Shanghai Cancer Institute, Joint International Research Laboratory of Metabolic & Developmental Sciences, Institute of Systems Biomedicine, Xin Hua Hospital, Shanghai Jiao Tong University, Shanghai, 200240, China
- Qiang Wu
- MOE Key Lab of Systems Biomedicine, Center for Comparative Biomedicine, State Key Lab of Oncogenes and Related Genes, Shanghai Cancer Institute, Joint International Research Laboratory of Metabolic & Developmental Sciences, Institute of Systems Biomedicine, Xin Hua Hospital, Shanghai Jiao Tong University, Shanghai, 200240, China
- The Third Affiliated Hospital of Guangzhou Medical University, Guangzhou, 510150, China
|
55
|
Reißer S, Zucchelli S, Gustincich S, Bussi G. Conformational ensembles of an RNA hairpin using molecular dynamics and sparse NMR data. Nucleic Acids Res 2020; 48:1164-1174. [PMID: 31889193 PMCID: PMC7026608 DOI: 10.1093/nar/gkz1184] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 10/14/2019] [Revised: 12/05/2019] [Accepted: 12/09/2019] [Indexed: 01/12/2023]
Abstract
Solution nuclear magnetic resonance (NMR) experiments allow RNA dynamics to be determined in an aqueous environment. However, when a limited number of peaks are assigned, it is difficult to obtain structural information. We here show a protocol based on the combination of experimental data (Nuclear Overhauser Effect, NOE) and molecular dynamics simulations with enhanced sampling methods. This protocol makes it possible to (a) obtain a maximum entropy ensemble compatible with NMR restraints and (b) obtain a minimal set of metastable conformations compatible with the experimental data (maximum parsimony). The method is applied to a hairpin of 29 nt from an inverted SINEB2, which is part of the SINEUP family and has been shown to enhance protein translation. A clustering procedure is introduced where the annotation of base-base interactions and glycosidic bond angles is used as a metric. By reweighting the contributions of the clusters, minimal sets of four conformations could be found which are compatible with the experimental data. A motif search on the structural database showed that some identified low-population states are present in experimental structures of other RNA transcripts. The introduced method can be applied to characterize RNA dynamics in systems where a limited amount of NMR information is available.
Affiliation(s)
- Sabine Reißer
- Scuola Internazionale Superiore di Studi Avanzati (SISSA), Via Bonomea 265, 34136 Trieste, Italy
- Silvia Zucchelli
- Scuola Internazionale Superiore di Studi Avanzati (SISSA), Via Bonomea 265, 34136 Trieste, Italy
- Department of Health Sciences, Center for Autoimmune and Allergic Diseases (CAAD) and Interdisciplinary Research Center of Autoimmune Diseases (IRCAD), University of Piemonte Orientale, Novara, Italy
- Stefano Gustincich
- Central RNA Laboratory and Department of Neuroscience and Brain Technologies, Istituto Italiano di Tecnologia (IIT), 16163 Genova, Italy
- Giovanni Bussi
- Scuola Internazionale Superiore di Studi Avanzati (SISSA), Via Bonomea 265, 34136 Trieste, Italy
|
56
|
Bradshaw RT, Marinelli F, Faraldo-Gómez JD, Forrest LR. Interpretation of HDX Data by Maximum-Entropy Reweighting of Simulated Structural Ensembles. Biophys J 2020; 118:1649-1664. [PMID: 32105651 PMCID: PMC7136279 DOI: 10.1016/j.bpj.2020.02.005] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 09/17/2019] [Revised: 01/28/2020] [Accepted: 02/05/2020] [Indexed: 01/12/2023]
Abstract
Hydrogen-deuterium exchange combined with mass spectrometry (HDX-MS) is a widely applied biophysical technique that probes the structure and dynamics of biomolecules without the need for site-directed modifications or bio-orthogonal labels. The mechanistic interpretation of HDX data, however, is often qualitative and subjective, owing to a lack of quantitative methods to rigorously translate observed deuteration levels into atomistic structural information. To help address this problem, we have developed a methodology to generate structural ensembles that faithfully reproduce HDX-MS measurements. In this approach, an ensemble of protein conformations is first generated, typically using molecular dynamics simulations. A maximum-entropy bias is then applied post hoc to the resulting ensemble such that averaged peptide-deuteration levels, as predicted by an empirical model, agree with target values within a given level of uncertainty. We evaluate this approach, referred to as HDX ensemble reweighting (HDXer), for artificial target data reflecting the two major conformational states of a binding protein. We demonstrate that the information provided by HDX-MS experiments and by the model of exchange is sufficient to recover correctly weighted structural ensembles from simulations, even when the relevant conformations are rarely observed. Degrading the information content of the target data (e.g., by reducing sequence coverage, by averaging exchange levels over longer peptide segments, or by incorporating different sources of uncertainty) reduces the structural accuracy of the reweighted ensemble but still allows for useful insights into the distinctive structural features reflected by the target data. Finally, we describe a quantitative metric to rank candidate structural ensembles according to their correspondence with target data and illustrate the use of HDXer to describe changes in the conformational ensemble of the membrane protein LeuT. In summary, HDXer is designed to facilitate objective structural interpretations of HDX-MS data and to inform experimental approaches and further developments of theoretical exchange models.
Affiliation(s)
- Richard T Bradshaw
- Computational Structural Biology Section, National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, Maryland
- Fabrizio Marinelli
- Theoretical Molecular Biophysics Unit, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland
- José D Faraldo-Gómez
- Theoretical Molecular Biophysics Unit, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland
- Lucy R Forrest
- Computational Structural Biology Section, National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, Maryland
|
57
|
Geraets JA, Pothula KR, Schröder GF. Integrating cryo-EM and NMR data. Curr Opin Struct Biol 2020; 61:173-181. [PMID: 32028106 DOI: 10.1016/j.sbi.2020.01.008] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 10/15/2019] [Revised: 01/13/2020] [Accepted: 01/14/2020] [Indexed: 01/06/2023]
Abstract
Single-particle cryo-electron microscopy (cryo-EM) is increasingly used as a technique to determine the atomic structure of challenging biological systems. Recent advances in microscope engineering, electron detection, and image processing have allowed the structural determination of bigger and more flexible targets than possible with the complementary techniques X-ray crystallography and NMR spectroscopy. However, there exist many biological targets for which atomic resolution cannot be currently achieved with cryo-EM, making unambiguous determination of the protein structure impossible. Although determining the structure of large biological systems using solely NMR is often difficult, highly complementary experimental atomic-level data for each molecule can be derived from the spectra, and used in combination with cryo-EM data. We review here strategies with which both techniques can be synergistically combined, in order to reach detail and understanding unattainable by each technique acting alone; and the types of biological systems for which such an approach would be desirable.
Affiliation(s)
- James A Geraets
- Institute of Biological Information Processing (IBI-7: Structural Biochemistry) and JuStruct, Jülich Center for Structural Biology, Forschungszentrum Jülich, 52425 Jülich, Germany
- Karunakar R Pothula
- Institute of Biological Information Processing (IBI-7: Structural Biochemistry) and JuStruct, Jülich Center for Structural Biology, Forschungszentrum Jülich, 52425 Jülich, Germany
- Gunnar F Schröder
- Institute of Biological Information Processing (IBI-7: Structural Biochemistry) and JuStruct, Jülich Center for Structural Biology, Forschungszentrum Jülich, 52425 Jülich, Germany; Physics Department, Heinrich-Heine-Universität Düsseldorf, 40225 Düsseldorf, Germany
|
58
|
Orioli S, Larsen AH, Bottaro S, Lindorff-Larsen K. How to learn from inconsistencies: Integrating molecular simulations with experimental data. Prog Mol Biol Transl Sci 2020; 170:123-176. [PMID: 32145944 DOI: 10.1016/bs.pmbts.2019.12.006] [Citation(s) in RCA: 51] [Impact Index Per Article: 10.2] [Indexed: 12/28/2022]
Abstract
Molecular simulations and biophysical experiments can be used to provide independent and complementary insights into the molecular origin of biological processes. A particularly useful strategy is to use molecular simulations as a modeling tool to interpret experimental measurements, and to use experimental data to refine our biophysical models. Thus, explicit integration and synergy between molecular simulations and experiments is fundamental for furthering our understanding of biological processes. This is especially true in the case where discrepancies between measured and simulated observables emerge. In this chapter, we provide an overview of some of the core ideas behind methods that were developed to improve the consistency between experimental information and numerical predictions. We distinguish between situations where experiments are used to refine our understanding and models of specific systems, and situations where experiments are used more generally to refine transferable models. We discuss different philosophies and attempt to unify them in a single framework. Until now, such integration between experiments and simulations has mostly been applied to equilibrium data, and we discuss more recent developments aimed at analyzing time-dependent or time-resolved data.
Affiliation(s)
- Simone Orioli
- Structural Biology and NMR Laboratory & Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark; Structural Biophysics, Niels Bohr Institute, Faculty of Science, University of Copenhagen, Copenhagen, Denmark
- Andreas Haahr Larsen
- Structural Biology and NMR Laboratory & Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark; Structural Biophysics, Niels Bohr Institute, Faculty of Science, University of Copenhagen, Copenhagen, Denmark
- Sandro Bottaro
- Structural Biology and NMR Laboratory & Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark; Atomistic Simulations Laboratory, Istituto Italiano di Tecnologia, Genova, Italy
- Kresten Lindorff-Larsen
- Structural Biology and NMR Laboratory & Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
|
59
|
Ahmed MC, Crehuet R, Lindorff-Larsen K. Computing, Analyzing, and Comparing the Radius of Gyration and Hydrodynamic Radius in Conformational Ensembles of Intrinsically Disordered Proteins. Methods Mol Biol 2020; 2141:429-445. [PMID: 32696370 DOI: 10.1007/978-1-0716-0524-0_21] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Indexed: 06/11/2023]
Abstract
The level of compaction of an intrinsically disordered protein may affect both its physical and biological properties, and can be probed via different types of biophysical experiments. Small-angle X-ray scattering (SAXS) probes the radius of gyration (Rg), whereas pulsed-field-gradient nuclear magnetic resonance (NMR) diffusion, fluorescence correlation spectroscopy, and dynamic light scattering experiments can be used to determine the hydrodynamic radius (Rh). Here we show how to calculate Rg and Rh from a computationally generated conformational ensemble of an intrinsically disordered protein. We further describe how to use a Bayesian/Maximum Entropy procedure to integrate data from SAXS and NMR diffusion experiments, so as to derive conformational ensembles in agreement with those experiments.
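To make the Rg calculation mentioned in the abstract above concrete, here is a minimal sketch that computes the mass-weighted radius of gyration of one conformation and a weighted ensemble value. It is not the chapter's own code: the function names are hypothetical, coordinates are plain tuples rather than a trajectory object, and reporting the root of the weighted mean of Rg² (rather than the mean of Rg) is an assumption made here. The hydrodynamic radius is not covered, since it requires an empirical relation that the abstract does not state.

```python
from math import sqrt


def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration of one conformation.

    coords: sequence of (x, y, z) tuples; masses: optional sequence of
    the same length (equal masses are assumed when omitted).
    """
    n = len(coords)
    if n == 0:
        raise ValueError("need at least one atom")
    if masses is None:
        masses = [1.0] * n
    total = sum(masses)
    # centre of mass, one component at a time
    com = [sum(m * r[k] for m, r in zip(masses, coords)) / total for k in range(3)]
    # mass-weighted sum of squared distances from the centre of mass
    sq = sum(
        m * sum((r[k] - com[k]) ** 2 for k in range(3))
        for m, r in zip(masses, coords)
    )
    return sqrt(sq / total)


def ensemble_rg(frames, weights=None, masses=None):
    """Root of the weighted mean of Rg^2 over an ensemble of frames.

    weights: optional per-frame statistical weights (uniform if omitted),
    for example weights produced by a reweighting procedure.
    """
    if weights is None:
        weights = [1.0] * len(frames)
    wsum = sum(weights)
    mean_sq = sum(
        w * radius_of_gyration(f, masses) ** 2 for w, f in zip(weights, frames)
    ) / wsum
    return sqrt(mean_sq)
```

Passing per-frame weights to `ensemble_rg` is what allows the same function to be reused before and after an ensemble has been reweighted against experimental data.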
Affiliation(s)
- Mustapha Carab Ahmed
- Structural Biology and NMR Laboratory, Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen N, Denmark
- Ramon Crehuet
- Structural Biology and NMR Laboratory, Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen N, Denmark
- Institute for Advanced Chemistry of Catalonia (IQAC-CSIC), Barcelona, Spain
- Kresten Lindorff-Larsen
- Structural Biology and NMR Laboratory, Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen N, Denmark
|
60
|
Integrating Molecular Simulation and Experimental Data: A Bayesian/Maximum Entropy Reweighting Approach. Methods Mol Biol 2020; 2112:219-240. [PMID: 32006288 DOI: 10.1007/978-1-0716-0270-6_15] [Citation(s) in RCA: 86] [Impact Index Per Article: 17.2] [Indexed: 12/02/2022]
Abstract
We describe a Bayesian/Maximum entropy (BME) procedure and software to construct a conformational ensemble of a biomolecular system by integrating molecular simulations and experimental data. First, an initial conformational ensemble is constructed using, for example, Molecular Dynamics or Monte Carlo simulations. Due to potential inaccuracies in the model and finite sampling effects, properties predicted from simulations may not agree with experimental data. In BME we use the experimental data to refine the simulation so that the new conformational ensemble has the following properties: (1) the calculated averages are close to the experimental values taking uncertainty into account and (2) it maximizes the relative Shannon entropy with respect to the original simulation ensemble. The output of this procedure is a set of optimized weights that can be used to calculate other properties and distributions of these. Here, we provide a practical guide on how to obtain and use such weights, how to choose adjustable parameters and discuss shortcomings of the method.
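The two properties listed in the abstract above (averages close to experiment within uncertainty, and maximal relative entropy with respect to the simulation) can be illustrated for a single observable. The sketch below is not the BME software: it assumes the commonly used exponential form of the optimal weights, w_j ∝ w0_j exp(-λ f_j), and finds the single Lagrange multiplier λ by bisection on the derivative of the dual function log Z(λ) + λ f_exp + θ σ² λ² / 2. The function name `reweight` and the parameters `theta`, `sigma`, and `lam_bounds` are hypothetical.

```python
from math import exp, log


def reweight(values, f_exp, sigma, theta, prior=None,
             lam_bounds=(-50.0, 50.0), tol=1e-12):
    """Maximum-entropy reweighting against one experimental average.

    values: observable computed for each frame; f_exp, sigma: experimental
    value and its uncertainty; theta: confidence parameter (theta = 0
    enforces the average exactly); prior: strictly positive initial weights.
    Returns (weights, lam).
    """
    n = len(values)
    if prior is None:
        prior = [1.0 / n] * n

    def weights(lam):
        # log-space evaluation with a shift to avoid overflow
        logs = [log(w0) - lam * f for w0, f in zip(prior, values)]
        shift = max(logs)
        unnorm = [exp(x - shift) for x in logs]
        z = sum(unnorm)
        return [u / z for u in unnorm]

    def gradient(lam):
        # derivative of the dual function; increasing in lam because
        # its slope is Var(f) + theta * sigma**2 >= 0
        w = weights(lam)
        avg = sum(wi * f for wi, f in zip(w, values))
        return f_exp - avg + theta * sigma ** 2 * lam

    lo, hi = lam_bounds
    if gradient(lo) > 0 or gradient(hi) < 0:
        raise ValueError("no root inside lam_bounds; target may be unreachable")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gradient(mid) > 0:
            hi = mid
        else:
            lo = mid
    lam = 0.5 * (lo + hi)
    return weights(lam), lam
```

With `theta = 0` the reweighted average matches `f_exp` whenever it lies within the range of `values`; larger `theta` keeps the weights closer to the prior, which is the trade-off the adjustable parameter controls. Several observables would require a multidimensional minimizer in place of the bisection used here.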
|
61
|
Latham AP, Zhang B. Maximum Entropy Optimized Force Field for Intrinsically Disordered Proteins. J Chem Theory Comput 2019; 16:773-781. [PMID: 31756104 DOI: 10.1021/acs.jctc.9b00932] [Citation(s) in RCA: 57] [Impact Index Per Article: 9.5] [Indexed: 01/13/2023]
Abstract
Intrinsically disordered proteins (IDPs) constitute a significant fraction of eukaryotic proteomes. High-resolution characterization of IDP conformational ensembles can help elucidate their roles in a wide range of biological processes but remains challenging both experimentally and computationally. Here, we present a generic algorithm to improve the accuracy of coarse-grained IDP models using a diverse set of experimental measurements. It combines maximum entropy optimization and least-squares regression to systematically adjust model parameters and improve the agreement between simulation and experiment. We successfully applied the algorithm to derive a transferable force field, which we term the maximum entropy optimized force field (MOFF), for de novo prediction of IDP structures. Statistical analysis of force field parameters reveals features of amino acid interactions not captured by potentials designed to work well for folded proteins. We anticipate its combination of efficiency and accuracy will make MOFF useful for studying the phase separation of IDPs, which drives the formation of various biological compartments.
Affiliation(s)
- Andrew P Latham
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Bin Zhang
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
|
62
|
Abstract
Bayesian and Maximum Entropy approaches allow for a statistically sound and systematic fitting of experimental and computational data. Unfortunately, assessing the relative confidence in these two types of data remains difficult as several steps add unknown error. Here we propose the use of a validation-set method to determine the balance, and thus the amount of fitting. We apply the method to synthetic NMR chemical shift data of an intrinsically disordered protein. We show that the method gives consistent results even when other methods to assess the amount of fitting cannot be applied. Finally, we also describe how the errors in the chemical shift predictor can lead to an incorrect fitting and how using secondary chemical shifts could alleviate this problem.
|
63
|
Hermann MR, Hub JS. SAXS-Restrained Ensemble Simulations of Intrinsically Disordered Proteins with Commitment to the Principle of Maximum Entropy. J Chem Theory Comput 2019; 15:5103-5115. [DOI: 10.1021/acs.jctc.9b00338] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.8] [Indexed: 11/29/2022]
Affiliation(s)
- Markus R. Hermann
- Institute for Microbiology and Genetics, Georg-August-Universität Göttingen, 37077 Göttingen, Germany
- Jochen S. Hub
- Theoretical Physics and Center for Biophysics, Saarland University, Campus E2 6, 66123 Saarbrücken, Germany
|
64
|
Bonomi M, Vendruscolo M. Determination of protein structural ensembles using cryo-electron microscopy. Curr Opin Struct Biol 2019; 56:37-45. [DOI: 10.1016/j.sbi.2018.10.006] [Citation(s) in RCA: 33] [Impact Index Per Article: 5.5] [Received: 09/13/2018] [Revised: 10/24/2018] [Accepted: 10/26/2018] [Indexed: 10/27/2022]
|
65
|
Köfinger J, Stelzl LS, Reuter K, Allande C, Reichel K, Hummer G. Efficient Ensemble Refinement by Reweighting. J Chem Theory Comput 2019; 15:3390-3401. [PMID: 30939006 PMCID: PMC6727217 DOI: 10.1021/acs.jctc.8b01231] [Citation(s) in RCA: 59] [Impact Index Per Article: 9.8] [Received: 12/07/2018] [Indexed: 01/24/2023]
Abstract
Ensemble refinement produces structural ensembles of flexible and dynamic biomolecules by integrating experimental data and molecular simulations. Here we present two efficient numerical methods to solve the computationally challenging maximum-entropy problem arising from a Bayesian formulation of ensemble refinement. Recasting the resulting constrained weight optimization problem into an unconstrained form enables the use of gradient-based algorithms. In two complementary formulations that differ in their dimensionality, we optimize either the log-weights directly or the generalized forces appearing in the explicit analytical form of the solution. We first demonstrate the robustness, accuracy, and efficiency of the two methods using synthetic data. We then use NMR J-couplings to reweight an all-atom molecular dynamics simulation ensemble of the disordered peptide Ala-5 simulated with the AMBER99SB*-ildn-q force field. After reweighting, we find a consistent increase in the population of the polyproline-II conformations and a decrease of α-helical-like conformations. Ensemble refinement makes it possible to infer detailed structural models for biomolecules exhibiting significant dynamics, such as intrinsically disordered proteins, by combining input from experiment and simulation in a balanced manner.
Affiliation(s)
- Jürgen Köfinger
- Department
of Theoretical Biophysics, Max Planck Institute
of Biophysics, Max-von-Laue-Straße
3, 60438 Frankfurt
am Main, Germany
| | - Lukas S. Stelzl
- Department
of Theoretical Biophysics, Max Planck Institute
of Biophysics, Max-von-Laue-Straße
3, 60438 Frankfurt
am Main, Germany
| | - Klaus Reuter
- Max Planck Computing and
Data Facility, Gießenbachstr. 2, 85748 Garching, Germany
| | - César Allande
- Max Planck Computing and
Data Facility, Gießenbachstr. 2, 85748 Garching, Germany
| | - Katrin Reichel
- Department
of Theoretical Biophysics, Max Planck Institute
of Biophysics, Max-von-Laue-Straße
3, 60438 Frankfurt
am Main, Germany
| | - Gerhard Hummer
- Department of Theoretical Biophysics, Max Planck Institute of Biophysics, Max-von-Laue-Straße 3, 60438 Frankfurt am Main, Germany
- Institute for Biophysics, Goethe University, 60438 Frankfurt am Main, Germany
| |
|
66
|
Cesari A, Bottaro S, Lindorff-Larsen K, Banáš P, Šponer J, Bussi G. Fitting Corrections to an RNA Force Field Using Experimental Data. J Chem Theory Comput 2019; 15:3425-3431. [DOI: 10.1021/acs.jctc.9b00206] [Citation(s) in RCA: 41] [Impact Index Per Article: 6.8] [Indexed: 11/28/2022]
Affiliation(s)
- Andrea Cesari
- Scuola Internazionale Superiore di Studi Avanzati (SISSA), via Bonomea 265, 34136 Trieste, Italy
| | - Sandro Bottaro
- Structural Biology and NMR Laboratory and Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, DK-2200 Copenhagen, Denmark
| | - Kresten Lindorff-Larsen
- Structural Biology and NMR Laboratory and Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, DK-2200 Copenhagen, Denmark
| | - Pavel Banáš
- Regional Centre of Advanced Technologies and Materials, Department of Physical Chemistry, Faculty of Science, Palacký University, tř. 17 listopadu 12, 771 46, Olomouc, Czech Republic
| | - Jiří Šponer
- Regional Centre of Advanced Technologies and Materials, Department of Physical Chemistry, Faculty of Science, Palacký University, tř. 17 listopadu 12, 771 46, Olomouc, Czech Republic
- Institute of Biophysics of the Czech Academy of Sciences, Kralovopolska 135, Brno 612 65, Czech Republic
| | - Giovanni Bussi
- Scuola Internazionale Superiore di Studi Avanzati (SISSA), via Bonomea 265, 34136 Trieste, Italy
| |
|
67
|
Dixit PD, Dill KA. Building Markov state models using optimal transport theory. J Chem Phys 2019; 150:054105. [DOI: 10.1063/1.5086681] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 12/12/2022] Open
Affiliation(s)
- Purushottam D. Dixit
- Department of Systems Biology, Columbia University, New York, New York 10032, USA
| | - Ken A. Dill
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York 11794, USA
- Department of Chemistry, Stony Brook University, Stony Brook, New York 11794, USA
- Department of Physics and Astronomy, Stony Brook University, Stony Brook, New York 11794, USA
| |
|
68
|
Latham AP, Zhang B. Improving Coarse-Grained Protein Force Fields with Small-Angle X-ray Scattering Data. J Phys Chem B 2019; 123:1026-1034. [PMID: 30620594 DOI: 10.1021/acs.jpcb.8b10336] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Indexed: 12/11/2022]
Abstract
Small-angle X-ray scattering (SAXS) experiments provide valuable structural data for biomolecules in solution. We develop a highly efficient maximum entropy approach to fit SAXS data by introducing minimal biases to a coarse-grained protein force field, the associative memory, water mediated, structure, and energy model (AWSEM). We demonstrate that the resulting force field, AWSEM-SAXS, succeeds in reproducing scattering profiles and models protein structures with shapes that are in much better agreement with experimental results. Quantitative metrics further reveal a modest, but consistent, improvement in the accuracy of modeled structures when SAXS data are incorporated into the force field. Additionally, when applied to a multiconformational protein, we find that AWSEM-SAXS is able to recover the population of different protein conformations from SAXS data alone. We, therefore, conclude that the maximum entropy approach is effective in fine-tuning the force field to better characterize both protein structure and conformational fluctuation.
Affiliation(s)
- Andrew P Latham
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Bin Zhang
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| |
|
69
|
Köfinger J, Różycki B, Hummer G. Inferring Structural Ensembles of Flexible and Dynamic Macromolecules Using Bayesian, Maximum Entropy, and Minimal-Ensemble Refinement Methods. Methods Mol Biol 2019; 2022:341-352. [PMID: 31396910 DOI: 10.1007/978-1-4939-9608-7_14] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 06/10/2023]
Abstract
The flexible and dynamic nature of biomolecules and biomolecular complexes is essential for many cellular functions in living organisms but poses a challenge for experimental methods to determine high-resolution structural models. To meet this challenge, experiments are combined with molecular simulations. The latter propose models for structural ensembles, and the experimental data can be used to steer these simulations and to select ensembles that most likely underlie the experimental data. Here, we explain in detail how the "Bayesian Inference Of ENsembles" (BioEn) method can be used to refine such ensembles using a wide range of experimental data. The "Ensemble Refinement of SAXS" (EROS) method is a special case of BioEn, inspired by the Gull-Daniell formulation of maximum entropy image processing and focused originally on X-ray solution scattering experiments (SAXS) and then extended to integrative structural modeling. We also briefly sketch the "minimum ensemble method," a maximum-parsimony refinement method that seeks to represent an ensemble with a minimal number of representative structures.
Affiliation(s)
- Jürgen Köfinger
- Max Planck Institute of Biophysics, Frankfurt am Main, Germany
| | - Bartosz Różycki
- Institute of Physics, Polish Academy of Sciences, Warsaw, Poland
| | - Gerhard Hummer
- Max Planck Institute of Biophysics, Frankfurt am Main, Germany
- Department of Physics, Goethe University Frankfurt, Frankfurt am Main, Germany
| |
|
70
|
Rangan R, Bonomi M, Heller GT, Cesari A, Bussi G, Vendruscolo M. Determination of Structural Ensembles of Proteins: Restraining vs Reweighting. J Chem Theory Comput 2018; 14:6632-6641. [DOI: 10.1021/acs.jctc.8b00738] [Citation(s) in RCA: 45] [Impact Index Per Article: 6.4] [Indexed: 11/29/2022]
Affiliation(s)
- Ramya Rangan
- Department of Chemistry, University of Cambridge, Cambridge CB2 1EW, United Kingdom
| | - Massimiliano Bonomi
- Department of Chemistry, University of Cambridge, Cambridge CB2 1EW, United Kingdom
| | - Gabriella T. Heller
- Department of Chemistry, University of Cambridge, Cambridge CB2 1EW, United Kingdom
| | - Andrea Cesari
- Scuola Internazionale Superiore di Studi Avanzati (SISSA), 34136 Trieste, Italy
| | - Giovanni Bussi
- Scuola Internazionale Superiore di Studi Avanzati (SISSA), 34136 Trieste, Italy
| | - Michele Vendruscolo
- Department of Chemistry, University of Cambridge, Cambridge CB2 1EW, United Kingdom
| |
|
71
|
Capelli R, Tiana G, Camilloni C. An implementation of the maximum-caliber principle by replica-averaged time-resolved restrained simulations. J Chem Phys 2018; 148:184114. [PMID: 29764124 DOI: 10.1063/1.5030339] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Indexed: 12/20/2022] Open
Abstract
Inferential methods can be used to integrate experimental information and molecular simulations. The maximum entropy principle provides a framework for using equilibrium experimental data, and it has been shown that replica-averaged simulations, restrained using a static potential, are a practical and powerful implementation of such a principle. Here we show that replica-averaged simulations restrained using a time-dependent potential are equivalent to the principle of maximum caliber, the dynamic version of the principle of maximum entropy, and thus may allow us to integrate time-resolved data in molecular dynamics simulations. We provide an analytical proof of the equivalence as well as a computational validation making use of simple models and synthetic data. Some limitations and possible solutions are also discussed.
Affiliation(s)
- Riccardo Capelli
- Center for Complexity and Biosystems and Department of Physics, Università degli Studi di Milano and INFN, Via Celoria 16, I-20133 Milano, Italy
| | - Guido Tiana
- Center for Complexity and Biosystems and Department of Physics, Università degli Studi di Milano and INFN, Via Celoria 16, I-20133 Milano, Italy
| | - Carlo Camilloni
- Dipartimento di Bioscienze, Università degli Studi di Milano, Via Celoria 26, I-20133 Milano, Italy
| |
|