1
Hoff SE, Thomasen FE, Lindorff-Larsen K, Bonomi M. Accurate model and ensemble refinement using cryo-electron microscopy maps and Bayesian inference. PLoS Comput Biol 2024; 20:e1012180. [PMID: 39008528 PMCID: PMC11271924 DOI: 10.1371/journal.pcbi.1012180] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2023] [Revised: 07/25/2024] [Accepted: 05/20/2024] [Indexed: 07/17/2024]
Abstract
Converting cryo-electron microscopy (cryo-EM) data into high-quality structural models is a challenging problem of outstanding importance. Current refinement methods often generate unbalanced models in which physico-chemical quality is sacrificed for excellent fit to the data. Furthermore, these techniques struggle to represent the conformational heterogeneity averaged out in low-resolution regions of density maps. Here we introduce EMMIVox, a Bayesian inference approach to determine single-structure models as well as structural ensembles from cryo-EM maps. EMMIVox automatically balances experimental information with accurate physico-chemical models of the system and the surrounding environment, including waters, lipids, and ions. Explicit treatment of data correlation and noise as well as inference of accurate B-factors enable determination of structural models and ensembles with both excellent fit to the data and high stereochemical quality, thus outperforming state-of-the-art refinement techniques. EMMIVox represents a flexible approach to determine high-quality structural models that will contribute to advancing our understanding of the molecular mechanisms underlying biological functions.
Affiliation(s)
- Samuel E. Hoff
  - Institut Pasteur, Université Paris Cité, CNRS UMR 3528, Computational Structural Biology Unit, Paris, France
- F. Emil Thomasen
  - Structural Biology and NMR Laboratory, Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Kresten Lindorff-Larsen
  - Structural Biology and NMR Laboratory, Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Massimiliano Bonomi
  - Institut Pasteur, Université Paris Cité, CNRS UMR 3528, Computational Structural Biology Unit, Paris, France
2
Cruz-León S, Majtner T, Hoffmann PC, Kreysing JP, Kehl S, Tuijtel MW, Schaefer SL, Geißler K, Beck M, Turoňová B, Hummer G. High-confidence 3D template matching for cryo-electron tomography. Nat Commun 2024; 15:3992. [PMID: 38734767 PMCID: PMC11088655 DOI: 10.1038/s41467-024-47839-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Accepted: 04/12/2024] [Indexed: 05/13/2024]
Abstract
Visual proteomics attempts to build atlases of the molecular content of cells but the automated annotation of cryo electron tomograms remains challenging. Template matching (TM) and methods based on machine learning detect structural signatures of macromolecules. However, their applicability remains limited in terms of both the abundance and size of the molecular targets. Here we show that the performance of TM is greatly improved by using template-specific search parameter optimization and by including higher-resolution information. We establish a TM pipeline with systematically tuned parameters for the automated, objective and comprehensive identification of structures with confidence 10 to 100-fold above the noise level. We demonstrate high-fidelity and high-confidence localizations of nuclear pore complexes, vaults, ribosomes, proteasomes, fatty acid synthases, lipid membranes and microtubules, and individual subunits inside crowded eukaryotic cells. We provide software tools for the generic implementation of our method that is broadly applicable towards realizing visual proteomics.
Affiliation(s)
- Sergio Cruz-León
  - Department of Theoretical Biophysics, Max Planck Institute of Biophysics, Max-von-Laue-Str. 3, 60438, Frankfurt am Main, Germany
- Tomáš Majtner
  - Department of Molecular Sociology, Max Planck Institute of Biophysics, Max-von-Laue-Str. 3, 60438, Frankfurt am Main, Germany
- Patrick C Hoffmann
  - Department of Molecular Sociology, Max Planck Institute of Biophysics, Max-von-Laue-Str. 3, 60438, Frankfurt am Main, Germany
- Jan Philipp Kreysing
  - Department of Molecular Sociology, Max Planck Institute of Biophysics, Max-von-Laue-Str. 3, 60438, Frankfurt am Main, Germany
  - IMPRS on Cellular Biophysics, Max-von-Laue-Str. 3, 60438, Frankfurt am Main, Germany
- Sebastian Kehl
  - Max Planck Computing and Data Facility, Gießenbachstraße 2, 85748, Garching, Germany
- Maarten W Tuijtel
  - Department of Molecular Sociology, Max Planck Institute of Biophysics, Max-von-Laue-Str. 3, 60438, Frankfurt am Main, Germany
- Stefan L Schaefer
  - Department of Theoretical Biophysics, Max Planck Institute of Biophysics, Max-von-Laue-Str. 3, 60438, Frankfurt am Main, Germany
- Katharina Geißler
  - Department of Molecular Sociology, Max Planck Institute of Biophysics, Max-von-Laue-Str. 3, 60438, Frankfurt am Main, Germany
  - IMPRS on Cellular Biophysics, Max-von-Laue-Str. 3, 60438, Frankfurt am Main, Germany
- Martin Beck
  - Department of Molecular Sociology, Max Planck Institute of Biophysics, Max-von-Laue-Str. 3, 60438, Frankfurt am Main, Germany
  - Institute of Biochemistry, Goethe University Frankfurt, 60438, Frankfurt am Main, Germany
- Beata Turoňová
  - Department of Molecular Sociology, Max Planck Institute of Biophysics, Max-von-Laue-Str. 3, 60438, Frankfurt am Main, Germany
- Gerhard Hummer
  - Department of Theoretical Biophysics, Max Planck Institute of Biophysics, Max-von-Laue-Str. 3, 60438, Frankfurt am Main, Germany
  - Institute of Biophysics, Goethe University Frankfurt, 60438, Frankfurt am Main, Germany
3
Yoshidome T. Four-dimensional imaging for cryo-electron microscopy experiments using molecular simulations and manifold learning. J Comput Chem 2024; 45:738-751. [PMID: 38112413 DOI: 10.1002/jcc.27290] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/30/2023] [Revised: 11/20/2023] [Accepted: 12/01/2023] [Indexed: 12/21/2023]
Abstract
Elucidating protein conformational changes is essential because they are closely related to protein function. Cryo-electron microscopy (cryo-EM) experiments can be used to reconstruct protein conformational changes from the experimental data (two-dimensional protein images). In this study, a reconstruction method, referred to as "four-dimensional imaging," was proposed. In our four-dimensional imaging technique, the protein conformational change was obtained from the two-dimensional protein images (the three-dimensional electron density maps used in previously proposed techniques were not used). The protein conformation for each two-dimensional protein image was obtained using our original protocol with molecular dynamics simulations. Using a manifold-learning technique and the two-dimensional protein images, the protein conformations were arranged according to the conformational change of the protein. By arranging the protein conformations according to the arrangement of the protein images, the four-dimensional imaging was constructed. A simulation of a cryo-EM experiment demonstrated the validity of our four-dimensional imaging technique.
Affiliation(s)
- Takashi Yoshidome
  - Department of Applied Physics, Graduate School of Engineering, Tohoku University, Sendai, Japan
4
Krieger JM, Sorzano COS, Carazo JM. Scipion-EM-ProDy: A Graphical Interface for the ProDy Python Package within the Scipion Workflow Engine Enabling Integration of Databases, Simulations and Cryo-Electron Microscopy Image Processing. Int J Mol Sci 2023; 24:14245. [PMID: 37762547 PMCID: PMC10532346 DOI: 10.3390/ijms241814245] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/08/2023] [Revised: 09/10/2023] [Accepted: 09/15/2023] [Indexed: 09/29/2023]
Abstract
Macromolecular assemblies, such as protein complexes, undergo continuous structural dynamics, including global reconfigurations critical for their function. Two fast analytical methods are widely used to study these global dynamics, namely elastic network model normal mode analysis and principal component analysis of ensembles of structures. These approaches have found wide use in various computational studies, driving the development of complex pipelines in several software packages. One common theme has been conformational sampling through hybrid simulations incorporating all-atom molecular dynamics and global modes of motion. However, wide functionality is only available for experienced programmers with limited capabilities for other users. We have, therefore, integrated one popular and extensively developed software for such analyses, the ProDy Python application programming interface, into the Scipion workflow engine. This enables a wider range of users to access a complete range of macromolecular dynamics pipelines beyond the core functionalities available in its command-line applications and the normal mode wizard in VMD. The new protocols and pipelines can be further expanded and integrated into larger workflows, together with other software packages for cryo-electron microscopy image analysis and molecular simulations. We present the resulting plugin, Scipion-EM-ProDy, in detail, highlighting the rich functionality made available by its development.
Affiliation(s)
- James M. Krieger
  - Biocomputing Unit, National Centre for Biotechnology (CNB CSIC), Campus Universidad Autónoma de Madrid, Darwin 3, Cantoblanco, 28049 Madrid, Spain
- Jose Maria Carazo
  - Biocomputing Unit, National Centre for Biotechnology (CNB CSIC), Campus Universidad Autónoma de Madrid, Darwin 3, Cantoblanco, 28049 Madrid, Spain
5
DiIorio MC, Kulczyk AW. Novel Artificial Intelligence-Based Approaches for Ab Initio Structure Determination and Atomic Model Building for Cryo-Electron Microscopy. Micromachines 2023; 14:1674. [PMID: 37763837 PMCID: PMC10534518 DOI: 10.3390/mi14091674] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 08/02/2023] [Revised: 08/21/2023] [Accepted: 08/25/2023] [Indexed: 09/29/2023]
Abstract
Single particle cryo-electron microscopy (cryo-EM) has emerged as the prevailing method for near-atomic structure determination, shedding light on the important molecular mechanisms of biological macromolecules. However, the inherent dynamics and structural variability of biological complexes coupled with the large number of experimental images generated by a cryo-EM experiment make data processing nontrivial. In particular, ab initio reconstruction and atomic model building remain major bottlenecks that demand substantial computational resources and manual intervention. Approaches utilizing recent innovations in artificial intelligence (AI) technology, particularly deep learning, have the potential to overcome the limitations that cannot be adequately addressed by traditional image processing approaches. Here, we review newly proposed AI-based methods for ab initio volume generation, heterogeneous 3D reconstruction, and atomic model building. We highlight the advancements made by the implementation of AI methods, as well as discuss remaining limitations and areas for future development.
Affiliation(s)
- Megan C. DiIorio
  - Institute for Quantitative Biomedicine, Rutgers University, 174 Frelinghuysen Road, Piscataway, NJ 08854, USA
- Arkadiusz W. Kulczyk
  - Institute for Quantitative Biomedicine, Rutgers University, 174 Frelinghuysen Road, Piscataway, NJ 08854, USA
  - Department of Biochemistry & Microbiology, Rutgers University, 76 Lipman Drive, New Brunswick, NJ 08901, USA
6
Tang WS, Zhong ED, Hanson SM, Thiede EH, Cossio P. Conformational heterogeneity and probability distributions from single-particle cryo-electron microscopy. Curr Opin Struct Biol 2023; 81:102626. [PMID: 37311334 DOI: 10.1016/j.sbi.2023.102626] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2023] [Revised: 04/25/2023] [Accepted: 05/16/2023] [Indexed: 06/15/2023]
Abstract
Single-particle cryo-electron microscopy (cryo-EM) is a technique that takes projection images of biomolecules frozen at cryogenic temperatures. A major advantage of this technique is its ability to image single biomolecules in heterogeneous conformations. While this poses a challenge for data analysis, recent algorithmic advances have enabled the recovery of heterogeneous conformations from the noisy imaging data. Here, we review methods for the reconstruction and heterogeneity analysis of cryo-EM images, ranging from linear-transformation-based methods to nonlinear deep generative models. We overview the dimensionality-reduction techniques used in heterogeneous 3D reconstruction methods and specify what information each method can infer from the data. Then, we review the methods that use cryo-EM images to estimate probability distributions over conformations in reduced subspaces or predefined by atomistic simulations. We conclude with the ongoing challenges for the cryo-EM community.
Affiliation(s)
- Wai Shing Tang
  - Center for Computational Mathematics, Flatiron Institute, 162 5th Ave, New York, NY, 10010, United States. https://twitter.com/WaiShingTang
- Ellen D Zhong
  - Department of Computer Science, Princeton University, 35 Olden St, Princeton, NJ, 08544, United States. https://twitter.com/ZhongingAlong
- Sonya M Hanson
  - Center for Computational Mathematics, Flatiron Institute, 162 5th Ave, New York, NY, 10010, United States; Center for Computational Biology, Flatiron Institute, 162 5th Ave, New York, NY, 10010, United States. https://twitter.com/sonyahans
- Erik H Thiede
  - Center for Computational Mathematics, Flatiron Institute, 162 5th Ave, New York, NY, 10010, United States. https://twitter.com/erik_der_elch
- Pilar Cossio
  - Center for Computational Mathematics, Flatiron Institute, 162 5th Ave, New York, NY, 10010, United States; Center for Computational Biology, Flatiron Institute, 162 5th Ave, New York, NY, 10010, United States
7
Lee S, Seok C, Park H. Benchmarking applicability of medium-resolution cryo-EM protein structures for structure-based drug design. J Comput Chem 2023; 44:1360-1368. [PMID: 36847771 DOI: 10.1002/jcc.27091] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2022] [Revised: 01/18/2023] [Accepted: 02/05/2023] [Indexed: 03/01/2023]
Abstract
Cryo-electron microscopy (cryo-EM) is gaining considerable attention for high-resolution protein structure determination in solution. However, a very high percentage of cryo-EM structures correspond to resolutions of 3-5 Å, making the structures difficult to use in in silico drug design. In this study, we analyze how useful cryo-EM protein structures are for in silico drug design by evaluating ligand docking accuracy. In realistic cross-docking scenarios using medium-resolution (3-5 Å) cryo-EM structures and the popular docking tool AutoDock Vina, only 20% of docking attempts succeeded, whereas the success rate doubles in the same kind of cross-docking using high-resolution (<2 Å) crystal structures instead. We decipher the reasons for failure by decomposing the contributions of resolution-dependent and resolution-independent factors. The heterogeneity in protein side-chain and backbone conformations is identified as the major resolution-dependent factor causing docking difficulty, while intrinsic receptor flexibility mainly accounts for the resolution-independent factor. We demonstrate that the flexibility implemented in current ligand docking tools is able to rescue only a portion of failures (10%), and that the limited performance was due mainly to potential structural errors rather than conformational changes. Our work suggests a strong need for more robust ligand docking and EM modeling methods in order to fully utilize cryo-EM structures for in silico drug design.
Affiliation(s)
- Seho Lee
  - Department of Chemistry, Seoul National University, Seoul, Republic of Korea
- Chaok Seok
  - Department of Chemistry, Seoul National University, Seoul, Republic of Korea
  - Galux Inc., Seoul, Republic of Korea
- Hahnbeom Park
  - Brain Science Institute, Korea Institute of Science and Technology, Seoul, Republic of Korea
8
End-to-end differentiable blind tip reconstruction for noisy atomic force microscopy images. Sci Rep 2023; 13:129. [PMID: 36599879 DOI: 10.1038/s41598-022-27057-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 11/09/2022] [Accepted: 12/23/2022] [Indexed: 01/06/2023]
Abstract
Observing the structural dynamics of biomolecules is vital to deepening our understanding of biomolecular functions. High-speed (HS) atomic force microscopy (AFM) is a powerful method to measure biomolecular behavior at near physiological conditions. In the AFM, measured image profiles on a molecular surface are distorted by the tip shape through the interactions between the tip and molecule. Once the tip shape is known, AFM images can be approximately deconvolved to reconstruct the surface geometry of the sample molecule. Thus, knowing the correct tip shape is an important issue in the AFM image analysis. The blind tip reconstruction (BTR) method developed by Villarrubia (J Res Natl Inst Stand Technol 102:425, 1997) is an algorithm that estimates tip shape only from AFM images using mathematical morphology operators. While the BTR works perfectly for noise-free AFM images, the algorithm is susceptible to noise. To overcome this issue, we here propose an alternative BTR method, called end-to-end differentiable BTR, based on a modern machine learning approach. In the method, we introduce a loss function including a regularization term to prevent overfitting to noise, and the tip shape is optimized with automatic differentiation and backpropagations developed in deep learning frameworks. Using noisy pseudo-AFM images of myosin V motor domain as test cases, we show that our end-to-end differentiable BTR is robust against noise in AFM images. The method can also detect a double-tip shape and deconvolve doubled molecular images. Finally, application to real HS-AFM data of myosin V walking on an actin filament shows that the method can reconstruct the accurate surface geometry of actomyosin consistent with the structural model. Our method serves as a general post-processing for reconstructing hidden molecular surfaces from any AFM images. Codes are available at https://github.com/matsunagalab/differentiable_BTR.
9
Development of hidden Markov modeling method for molecular orientations and structure estimation from high-speed atomic force microscopy time-series images. PLoS Comput Biol 2022; 18:e1010384. [PMID: 36580448 PMCID: PMC9833559 DOI: 10.1371/journal.pcbi.1010384] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/15/2022] [Revised: 01/11/2023] [Accepted: 12/20/2022] [Indexed: 12/30/2022]
Abstract
High-speed atomic force microscopy (HS-AFM) is a powerful technique for capturing the time-resolved behavior of biomolecules. However, structural information in HS-AFM images is limited to the surface geometry of a sample molecule. Inferring latent three-dimensional structures from the surface geometry is thus important for getting more insights into conformational dynamics of a target biomolecule. Existing methods for estimating the structures are based on the rigid-body fitting of candidate structures to each frame of HS-AFM images. Here, we extend the existing frame-by-frame rigid-body fitting analysis to multiple frames to exploit orientational correlations of a sample molecule between adjacent frames in HS-AFM data due to the interaction with the stage. In the method, we treat HS-AFM data as time-series data, and they are analyzed with the hidden Markov modeling. Using simulated HS-AFM images of the taste receptor type 1 as a test case, the proposed method shows a more robust estimation of molecular orientations than the frame-by-frame analysis. The method is applicable in integrative modeling of conformational dynamics using HS-AFM data.
10
Gama Lima Costa R, Fushman D. Reweighting methods for elucidation of conformation ensembles of proteins. Curr Opin Struct Biol 2022; 77:102470. [PMID: 36183447 PMCID: PMC9771963 DOI: 10.1016/j.sbi.2022.102470] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 06/10/2022] [Revised: 08/24/2022] [Accepted: 08/28/2022] [Indexed: 12/24/2022]
Abstract
Proteins are inherently dynamic macromolecules that exist in equilibrium among multiple conformational states, and motions of protein backbone and side chains are fundamental to biological function. The ability to characterize the conformational landscape is particularly important for intrinsically disordered proteins, multidomain proteins, and weakly bound complexes, where single-structure representations are inadequate. As the focus of structural biology shifts from relatively rigid macromolecules toward larger and more complex systems and molecular assemblies, there is a need for structural approaches that can paint a more realistic picture of such conformationally heterogeneous systems. Here, we review reweighting methods for elucidation of structural ensembles based on experimental data, with the focus on applications to multidomain proteins.
Affiliation(s)
- Raquel Gama Lima Costa
  - Chemical Physics Program, Institute for Physical Sciences and Technology, University of Maryland, College Park, 20742, MD, USA
- David Fushman
  - Chemical Physics Program, Institute for Physical Sciences and Technology, University of Maryland, College Park, 20742, MD, USA; Department of Chemistry and Biochemistry, Center for Biomolecular Structure and Organization, University of Maryland, College Park, 20742, MD, USA
11
Ray KK, Verma AR, Gonzalez RL, Kinz-Thompson CD. Inferring the shape of data: a probabilistic framework for analysing experiments in the natural sciences. Proc Math Phys Eng Sci 2022; 478:20220177. [PMID: 37767180 PMCID: PMC10521765 DOI: 10.1098/rspa.2022.0177] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/13/2022] [Accepted: 09/26/2022] [Indexed: 09/29/2023]
Abstract
A critical step in data analysis for many different types of experiments is the identification of features with theoretically defined shapes in N-dimensional datasets; examples of this process include finding peaks in multi-dimensional molecular spectra or emitters in fluorescence microscopy images. Identifying such features involves determining if the overall shape of the data is consistent with an expected shape; however, it is generally unclear how to quantitatively make this determination. In practice, many analysis methods employ subjective, heuristic approaches, which complicates the validation of any ensuing results, especially as the amount and dimensionality of the data increase. Here, we present a probabilistic solution to this problem by using Bayes' rule to calculate the probability that the data have any one of several potential shapes. This probabilistic approach may be used to objectively compare how well different theories describe a dataset, identify changes between datasets and detect features within data using a corollary method called Bayesian Inference-based Template Search; several proof-of-principle examples are provided. Altogether, this mathematical framework serves as an automated 'engine' capable of computationally executing analysis decisions currently made by visual inspection across the sciences.
Affiliation(s)
- Korak Kumar Ray
  - Department of Chemistry, Columbia University, New York, NY 10027, USA
- Anjali R. Verma
  - Department of Chemistry, Columbia University, New York, NY 10027, USA
- Ruben L. Gonzalez
  - Department of Chemistry, Columbia University, New York, NY 10027, USA
12
Conformational ensembles of intrinsically disordered proteins and flexible multidomain proteins. Biochem Soc Trans 2022; 50:541-554. [PMID: 35129612 DOI: 10.1042/bst20210499] [Citation(s) in RCA: 31] [Impact Index Per Article: 15.5] [Received: 12/10/2021] [Revised: 01/13/2022] [Accepted: 01/17/2022] [Indexed: 12/29/2022]
Abstract
Intrinsically disordered proteins (IDPs) and multidomain proteins with flexible linkers show a high level of structural heterogeneity and are best described by ensembles consisting of multiple conformations with associated thermodynamic weights. Determining conformational ensembles usually involves the integration of biophysical experiments and computational models. In this review, we discuss current approaches to determine conformational ensembles of IDPs and multidomain proteins, including the choice of biophysical experiments, computational models used to sample protein conformations, models to calculate experimental observables from protein structure, and methods to refine ensembles against experimental data. We also provide examples of recent applications of integrative conformational ensemble determination to study IDPs and multidomain proteins and suggest future directions for research in the field.
13
Wälti MA, Canagarajah B, Schwieters CD, Clore GM. Visualization of Sparsely-populated Lower-order Oligomeric States of Human Mitochondrial Hsp60 by Cryo-electron Microscopy. J Mol Biol 2021; 433:167322. [PMID: 34688687 PMCID: PMC8627483 DOI: 10.1016/j.jmb.2021.167322] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 06/03/2021] [Revised: 10/14/2021] [Accepted: 10/15/2021] [Indexed: 11/19/2022]
Abstract
Human mitochondrial Hsp60 (mtHsp60) is a class I chaperonin, 51% identical in sequence to the prototypical E. coli chaperonin GroEL. mtHsp60 maintains the proteome within the mitochondrion and is associated with various neurodegenerative diseases and cancers. The oligomeric assembly of mtHsp60 into heptameric ring structures that enclose a folding chamber only occurs upon addition of ATP and is significantly more labile than that of GroEL, where the only oligomeric species is a tetradecamer. The lability of the mtHsp60 heptamer provides an opportunity to detect and visualize lower-order oligomeric states that may represent intermediates along the assembly/disassembly pathway. Using cryo-electron microscopy we show that, in addition to the fully-formed heptamer and an "inverted" tetradecamer in which the two heptamers associate via their apical domains, thereby blocking protein substrate access, well-defined lower-order oligomeric species, populated at less than 6% of the total particles, are observed. Specifically, we observe open trimers, tetramers, pentamers and hexamers (comprising ∼4% of the total particles) with rigid body rotations from one subunit to the next within ∼1.5-3.5° of that for the heptamer, indicating that these may lie directly on the assembly/disassembly pathway. We also observe a closed-ring hexamer (∼2% of the particles) which may represent an off-pathway species in the assembly/disassembly process in so far that conversion to the mature heptamer would require the closed-ring hexamer to open to accept an additional subunit. Lastly, we observe several classes of tetramers where additional subunits characterized by fuzzy electron density are caught in the act of oligomer extension.
Affiliation(s)
- Marielle A Wälti
  - Laboratory of Chemical Physics, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, MD 20892-0520, USA
- Bertram Canagarajah
  - Laboratory of Cell and Molecular Biology, and National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, MD 20892-0520, USA
- Charles D Schwieters
  - Computational Biomolecular Nuclear Magnetic Resonance Core, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, MD 20892-0520, USA
- G Marius Clore
  - Laboratory of Chemical Physics, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, MD 20892-0520, USA
14
Shekhar M, Terashi G, Gupta C, Sarkar D, Debussche G, Sisco NJ, Nguyen J, Mondal A, Vant J, Fromme P, Van Horn WD, Tajkhorshid E, Kihara D, Dill K, Perez A, Singharoy A. CryoFold: determining protein structures and data-guided ensembles from cryo-EM density maps. Matter 2021; 4:3195-3216. [PMID: 35874311 PMCID: PMC9302471 DOI: 10.1016/j.matt.2021.09.004] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Indexed: 05/05/2023]
Abstract
Cryo-electron microscopy (EM) requires molecular modeling to refine structural details from data. Ensemble models arrive at low free-energy molecular structures, but are computationally expensive and limited to resolving only small proteins that cannot be resolved by cryo-EM. Here, we introduce CryoFold - a pipeline of molecular dynamics simulations that determines ensembles of protein structures directly from sequence by integrating density data of varying sparsity at 3-5 Å resolution with coarse-grained topological knowledge of the protein folds. We present six examples showing its broad applicability for folding proteins between 72 to 2000 residues, including large membrane and multi-domain systems, and results from two EMDB competitions. Driven by data from a single state, CryoFold discovers ensembles of common low-energy models together with rare low-probability structures that capture the equilibrium distribution of proteins constrained by the density maps. Many of these conformations, unseen by traditional methods, are experimentally validated and functionally relevant. We arrive at a set of best practices for data-guided protein folding that are controlled using a Python GUI.
Affiliation(s)
- Mrinal Shekhar
  - Center for Biophysics and Quantitative Biology, Department of Biochemistry, NIH Center for Macromolecular Modeling and Bioinformatics, Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana, Illinois, 61801, USA
- Genki Terashi
  - Department of Biological Sciences, Purdue University, West Lafayette, IN 47907, USA
- Chitrak Gupta
  - The School of Molecular Sciences, Arizona State University, Tempe, AZ 85287, USA
  - The Biodesign Institute Center for Structural Discovery, Arizona State University, Tempe, AZ 85281, USA
- Daipayan Sarkar
  - Department of Biological Sciences, Purdue University, West Lafayette, IN 47907, USA
  - The School of Molecular Sciences, Arizona State University, Tempe, AZ 85287, USA
- Gaspard Debussche
  - Department of Mathematics and Computer Sciences, Grenoble INP, 38000 Grenoble, France
- Nicholas J Sisco
  - The School of Molecular Sciences, Arizona State University, Tempe, AZ 85287, USA
  - The Biodesign Institute Virginia G. Piper Center for Personalized Diagnostics, Arizona State University, Tempe, AZ 85281, USA
- Jonathan Nguyen
  - The School of Molecular Sciences, Arizona State University, Tempe, AZ 85287, USA
  - The Biodesign Institute Center for Structural Discovery, Arizona State University, Tempe, AZ 85281, USA
- Arup Mondal
  - Chemistry Department, Quantum Theory Project, University of Florida, Gainesville, Florida, 32611, USA
- John Vant
  - The School of Molecular Sciences, Arizona State University, Tempe, AZ 85287, USA
  - The Biodesign Institute Center for Structural Discovery, Arizona State University, Tempe, AZ 85281, USA
- Petra Fromme
  - The School of Molecular Sciences, Arizona State University, Tempe, AZ 85287, USA
  - The Biodesign Institute Center for Structural Discovery, Arizona State University, Tempe, AZ 85281, USA
- Wade D Van Horn
  - The School of Molecular Sciences, Arizona State University, Tempe, AZ 85287, USA
  - The Biodesign Institute Virginia G. Piper Center for Personalized Diagnostics, Arizona State University, Tempe, AZ 85281, USA
- Emad Tajkhorshid
  - Center for Biophysics and Quantitative Biology, Department of Biochemistry, NIH Center for Macromolecular Modeling and Bioinformatics, Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana, Illinois, 61801, USA
| | - Daisuke Kihara
- Department of Biological Sciences, Purdue University, West Lafayette, IN 47907, USA
- Department of Computer Science, Purdue University, West Lafayette, IN 47907, USA
| | - Ken Dill
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York 11794, United States
| | - Alberto Perez
- Chemistry Department, Quantum Theory Project, University of Florida, Gainesville, Florida, 32611, USA
| | - Abhishek Singharoy
- The School of Molecular Sciences, Arizona State University, Tempe, AZ 85287, USA
- The Biodesign Institute Center for Structural Discovery, Arizona State University, Tempe, AZ 85281, USA
|
15
|
Eigenfeld M, Kerpes R, Becker T. Recombinant protein linker production as a basis for non-invasive determination of single-cell yeast age in heterogeneous yeast populations. RSC Adv 2021; 11:31923-31932. [PMID: 35495491 PMCID: PMC9041608 DOI: 10.1039/d1ra05276d] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/08/2021] [Accepted: 09/16/2021] [Indexed: 11/30/2022] Open
Abstract
The physiological and metabolic diversity of a yeast culture is the sum of individual cell phenotypes. As well as environmental conditions, genetics, and numbers of cell divisions, a major factor influencing cell characteristics is cell age. A postcytokinesis bud scar on the mother cell, a benchmark in the replicative life span, is a quantifiable indicator of cell age, characterized by significant amounts of chitin. We developed a binding process for visualizing the bud scars of Saccharomyces pastorianus var. carlsbergensis using a protein linker containing a polyhistidine tag, a superfolder green fluorescent protein (sfGFP), and a chitin-binding domain (His6-SUMO-sfGFP-ChBD). The binding did not affect yeast viability; thus, our method provides the basis for non-invasive cell age determination using flow cytometry. The His6-SUMO-sfGFP-ChBD protein was synthesized in Escherichia coli, purified using two-stage chromatography, and checked for monodispersity and purity. Linker-cell binding and the characteristics of the bound complex were determined using flow cytometry and confocal laser scanning microscopy (CLSM). Flow cytometry showed that protein binding increased to 60 455 ± 2706 fluorescence units per cell. The specific coupling of the linker to yeast cells was additionally verified by CLSM and adsorption isotherms using yeast cells, E. coli cells, and chitin resin. We found a relationship between the median bud scar number, the median of the fluorescence units, and the chitin content of yeast cells. A fast measurement of yeast population dynamics by flow cytometry is possible, using this protein binding technique. Rapid qualitative determination of yeast cell age distribution can therefore be performed.
Affiliation(s)
- Marco Eigenfeld
- Technical University of Munich, Chair of Brewing and Beverage Technology, Research Group Beverage and Cereal Biotechnology Weihenstephaner Steig 20 85354 Freising Germany
| | - Roland Kerpes
- Technical University of Munich, Chair of Brewing and Beverage Technology, Research Group Beverage and Cereal Biotechnology Weihenstephaner Steig 20 85354 Freising Germany
| | - Thomas Becker
- Technical University of Munich, Chair of Brewing and Beverage Technology, Research Group Beverage and Cereal Biotechnology Weihenstephaner Steig 20 85354 Freising Germany
|
16
|
Yagi K, Re S, Mori T, Sugita Y. Weight average approaches for predicting dynamical properties of biomolecules. Curr Opin Struct Biol 2021; 72:88-94. [PMID: 34592697 DOI: 10.1016/j.sbi.2021.08.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2021] [Revised: 08/21/2021] [Accepted: 08/24/2021] [Indexed: 11/16/2022]
Abstract
Recent advances in atomistic molecular dynamics (MD) simulations of biomolecules allow us to explore their conformational spaces widely, observing large-scale conformational fluctuations or transitions between distinct structures. To reproduce or refine experimental data using MD simulations, structure ensembles, which are characterized by multiple structures and their statistical weights on the rugged free-energy landscapes, are often used. Here, we summarize weight average approaches for various experimental measurements. Weight average approaches are now applied to hybrid quantum mechanics/molecular mechanics MD simulations to predict fast vibrational motions in a protein with a high accuracy for better understanding of molecular functions from atomic structures.
Affiliation(s)
- Kiyoshi Yagi
- RIKEN Cluster for Pioneering Research, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan
| | - Suyong Re
- RIKEN Center for Biosystems Dynamics Research, 1-6-5 Minatojima-Minamimachi, Chuo-ku, Kobe, Hyogo 650-0047, Japan; Artificial Intelligence Center for Health and Biomedical Research, National Institutes of Biomedical Innovation, Health, and Nutrition 7-6-8, Saito-Asagi, Ibaraki, Osaka, 567-0085, Japan
| | - Takaharu Mori
- RIKEN Cluster for Pioneering Research, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan
| | - Yuji Sugita
- RIKEN Cluster for Pioneering Research, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan; RIKEN Center for Biosystems Dynamics Research, 1-6-5 Minatojima-Minamimachi, Chuo-ku, Kobe, Hyogo 650-0047, Japan; RIKEN Center for Computational Science, 7-1-26 Minatojima-minamimachi, Chuo-ku, Kobe, Hyogo 650-0047, Japan.
|
17
|
Giraldo-Barreto J, Ortiz S, Thiede EH, Palacio-Rodriguez K, Carpenter B, Barnett AH, Cossio P. A Bayesian approach to extracting free-energy profiles from cryo-electron microscopy experiments. Sci Rep 2021; 11:13657. [PMID: 34211017 PMCID: PMC8249403 DOI: 10.1038/s41598-021-92621-1] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Received: 02/04/2021] [Accepted: 06/01/2021] [Indexed: 11/08/2022] Open
Abstract
Cryo-electron microscopy (cryo-EM) extracts single-particle density projections of individual biomolecules. Although cryo-EM is widely used for 3D reconstruction, due to its single-particle nature it has the potential to provide information about a biomolecule's conformational variability and underlying free-energy landscape. However, treating cryo-EM as a single-molecule technique is challenging because of the low signal-to-noise ratio (SNR) in individual particles. In this work, we propose the cryo-BIFE method (cryo-EM Bayesian Inference of Free-Energy profiles), which uses a path collective variable to extract free-energy profiles and their uncertainties from cryo-EM images. We test the framework on several synthetic systems where the imaging parameters and conditions were controlled. We found that for realistic cryo-EM environments and relevant biomolecular systems, it is possible to recover the underlying free energy, with the pose accuracy and SNR as crucial determinants. We then use the method to study the conformational transitions of a calcium-activated channel with real cryo-EM particles. Interestingly, we recover not only the most probable conformation (used to generate a high-resolution reconstruction of the calcium-bound state) but also a metastable state that corresponds to the calcium-unbound conformation. As expected for turnover transitions within the same sample, the activation barriers are on the order of [Formula: see text]. We expect our tool for extracting free-energy profiles from cryo-EM images to enable more complete characterization of the thermodynamic ensemble of biomolecules.
Affiliation(s)
- Julian Giraldo-Barreto
- Biophysics of Tropical Diseases Max Planck Tandem Group, University of Antioquia UdeA, Calle 70 No. 52-21, Medellín, Colombia
- Magnetism and Simulation Group, University of Antioquia UdeA, Calle 70 No. 52-21, Medellín, Colombia
| | - Sebastian Ortiz
- Biophysics of Tropical Diseases Max Planck Tandem Group, University of Antioquia UdeA, Calle 70 No. 52-21, Medellín, Colombia
| | - Erik H Thiede
- Center for Computational Mathematics, Flatiron Institute, New York City, USA
| | - Karen Palacio-Rodriguez
- Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, Sorbonne Université, Paris, France
| | - Bob Carpenter
- Center for Computational Mathematics, Flatiron Institute, New York City, USA
| | - Alex H Barnett
- Center for Computational Mathematics, Flatiron Institute, New York City, USA
| | - Pilar Cossio
- Biophysics of Tropical Diseases Max Planck Tandem Group, University of Antioquia UdeA, Calle 70 No. 52-21, Medellín, Colombia.
- Department of Theoretical Biophysics, Max Planck Institute of Biophysics, 60438, Frankfurt am Main, Germany.
|
18
|
Kulik M, Mori T, Sugita Y. Multi-Scale Flexible Fitting of Proteins to Cryo-EM Density Maps at Medium Resolution. Front Mol Biosci 2021; 8:631854. [PMID: 33842541 PMCID: PMC8025875 DOI: 10.3389/fmolb.2021.631854] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 11/21/2020] [Accepted: 01/26/2021] [Indexed: 11/13/2022] Open
Abstract
Structure determination using cryo-electron microscopy (cryo-EM) medium-resolution density maps is often facilitated by flexible fitting. Avoiding overfitting, adjusting force constants driving the structure to the density map, and emulating complex conformational transitions are major concerns in the fitting. To address them, we develop a new method based on a three-step multi-scale protocol. First, flexible fitting molecular dynamics (MD) simulations with a coarse-grained structure-based force field and a replica-exchange scheme between replicas with different force constants are performed. Second, fitted Cα atom positions guide the all-atom structure in targeted MD. Finally, all-atom flexible fitting refinement in implicit solvent adjusts the positions of the side chains in the density map. Final models obtained via the multi-scale protocol are significantly better resolved and more reliable in comparison with long all-atom flexible fitting simulations. The protocol is useful for multi-domain systems with intricate structural transitions as it preserves the secondary structure of single domains.
Affiliation(s)
- Marta Kulik
- Theoretical Molecular Science Laboratory, RIKEN Cluster for Pioneering Research, Wako-shi, Japan
| | - Takaharu Mori
- Theoretical Molecular Science Laboratory, RIKEN Cluster for Pioneering Research, Wako-shi, Japan
| | - Yuji Sugita
- Theoretical Molecular Science Laboratory, RIKEN Cluster for Pioneering Research, Wako-shi, Japan; RIKEN Center for Computational Science, Kobe, Japan; RIKEN Center for Biosystems Dynamics Research, Kobe, Japan
|
19
|
Kinz-Thompson CD, Ray KK, Gonzalez RL. Bayesian Inference: The Comprehensive Approach to Analyzing Single-Molecule Experiments. Annu Rev Biophys 2021; 50:191-208. [PMID: 33534607 DOI: 10.1146/annurev-biophys-082120-103921] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Indexed: 12/13/2022]
Abstract
Biophysics experiments performed at single-molecule resolution provide exceptional insight into the structural details and dynamic behavior of biological systems. However, extracting this information from the corresponding experimental data unequivocally requires applying a biophysical model. In this review, we discuss how to use probability theory to apply these models to single-molecule data. Many current single-molecule data analysis methods apply parts of probability theory, sometimes unknowingly, and thus miss out on the full set of benefits provided by this self-consistent framework. The full application of probability theory involves a process called Bayesian inference that fully accounts for the uncertainties inherent to single-molecule experiments. Additionally, using Bayesian inference provides a scientifically rigorous method of incorporating information from multiple experiments into a single analysis and finding the best biophysical model for an experiment without the risk of overfitting the data. These benefits make the Bayesian approach ideal for analyzing any type of single-molecule experiment.
Affiliation(s)
- Colin D Kinz-Thompson
- Department of Chemistry, Columbia University, New York, New York 10027, USA; Department of Chemistry, Rutgers University-Newark, Newark, New Jersey 07102, USA
| | - Korak Kumar Ray
- Department of Chemistry, Columbia University, New York, New York 10027, USA
| | - Ruben L Gonzalez
- Department of Chemistry, Columbia University, New York, New York 10027, USA
|
20
|
Ortiz S, Stanisic L, Rodriguez BA, Rampp M, Hummer G, Cossio P. Validation tests for cryo-EM maps using an independent particle set. J Struct Biol X 2020; 4:100032. [PMID: 32743544 PMCID: PMC7385033 DOI: 10.1016/j.yjsbx.2020.100032] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/27/2022] Open
Abstract
Cryo-electron microscopy (cryo-EM) has revolutionized structural biology by providing 3D density maps of biomolecules at near-atomic resolution. However, map validation is still an open issue. Despite several efforts from the community, it is possible to overfit 3D maps to noisy data. Here, we develop a novel methodology that uses a small independent particle set (not used during the 3D refinement) to validate the maps. The main idea is to monitor how the map probability evolves over the control set during the 3D refinement. The method is complementary to the gold-standard procedure, which generates two reconstructions at each iteration. We low-pass filter the two reconstructions for different frequency cutoffs, and we calculate the probability of each filtered map given the control set. For high-quality maps, the probability should increase as a function of the frequency cutoff and the refinement iteration. We also compute the similarity between the densities of probability distributions of the two reconstructions. As higher frequencies are included, the distributions become more dissimilar. We optimized the BioEM package to perform these calculations, and tested it over systems ranging from quality data to pure noise. Our results show that with our methodology, it is possible to discriminate datasets that are constructed from noise particles. We conclude that validation against a control particle set provides a powerful tool to assess the quality of cryo-EM maps.
Affiliation(s)
- Sebastian Ortiz
- Biophysics of Tropical Diseases, Max Planck Tandem Group, University of Antioquia UdeA, Calle 70 No. 52-21, Medellín, Colombia
| | - Luka Stanisic
- Max Planck Computing and Data Facility, 85748 Garching, Germany
| | - Boris A Rodriguez
- Grupo de Física Atómica y Molecular, Instituto de Física, Facultad de Ciencias Exactas y Naturales, Universidad de Antioquia UdeA, Calle 70 No. 52-21, Medellín, Colombia
| | - Markus Rampp
- Max Planck Computing and Data Facility, 85748 Garching, Germany
| | - Gerhard Hummer
- Department of Theoretical Biophysics, Max Planck Institute of Biophysics, 60438 Frankfurt am Main, Germany
- Institute of Biophysics, Goethe University, 60438 Frankfurt am Main, Germany
| | - Pilar Cossio
- Biophysics of Tropical Diseases, Max Planck Tandem Group, University of Antioquia UdeA, Calle 70 No. 52-21, Medellín, Colombia
- Department of Theoretical Biophysics, Max Planck Institute of Biophysics, 60438 Frankfurt am Main, Germany
|
21
|
Abstract
Cross-validation is used to determine the validity of a model on unseen data by assessing if the model is overfitted to noise. It is widely used in many fields, from artificial intelligence to structural biology in X-ray crystallography and nuclear magnetic resonance. Although there are concerns of map overfitting in cryo-electron microscopy (cryo-EM), cross-validation is rarely used. The problem is that establishing a performance metric of the maps over unseen data (given by 2D-projection images) is difficult due to the low signal-to-noise ratios in the individual particles. Here, I present recent advances for cryo-EM map reconstruction. I highlight that the gold-standard procedure can fail to detect map overfitting in certain cases, showing the necessity of assessing the map quality on unbiased data. Finally, I describe the challenges and advantages of developing a robust cross-validation methodology for cryo-EM.
Affiliation(s)
- Pilar Cossio
- Biophysics of Tropical Diseases, Max Planck Tandem Group, University of Antioquia UdeA, Calle 70 No. 52-21, Medellín, Colombia; Department of Theoretical Biophysics, Max Planck Institute of Biophysics, 60438 Frankfurt am Main, Germany
|
22
|
Fraser JS, Lindorff-Larsen K, Bonomi M. What Will Computational Modeling Approaches Have to Say in the Era of Atomistic Cryo-EM Data? J Chem Inf Model 2020; 60:2410-2412. [PMID: 32090567 DOI: 10.1021/acs.jcim.0c00123] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 12/18/2022]
Affiliation(s)
- James S Fraser
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, California 94107, United States
| | - Kresten Lindorff-Larsen
- Structural Biology and NMR Laboratory, Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, DK-2200 Copenhagen, Denmark
| | - Massimiliano Bonomi
- Structural Bioinformatics Unit, Department of Structural Biology and Chemistry; CNRS UMR 3528; C3BI, CNRS USR 3756; Institut Pasteur, 75015 Paris, France
|
23
|
Reißer S, Zucchelli S, Gustincich S, Bussi G. Conformational ensembles of an RNA hairpin using molecular dynamics and sparse NMR data. Nucleic Acids Res 2020; 48:1164-1174. [PMID: 31889193 PMCID: PMC7026608 DOI: 10.1093/nar/gkz1184] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 10/14/2019] [Revised: 12/05/2019] [Accepted: 12/09/2019] [Indexed: 01/12/2023] Open
Abstract
Solution nuclear magnetic resonance (NMR) experiments allow RNA dynamics to be determined in an aqueous environment. However, when a limited number of peaks are assigned, it is difficult to obtain structural information. We here show a protocol based on the combination of experimental data (Nuclear Overhauser Effect, NOE) and molecular dynamics simulations with enhanced sampling methods. This protocol makes it possible to (a) obtain a maximum entropy ensemble compatible with NMR restraints and (b) obtain a minimal set of metastable conformations compatible with the experimental data (maximum parsimony). The method is applied to a hairpin of 29 nt from an inverted SINEB2, which is part of the SINEUP family and has been shown to enhance protein translation. A clustering procedure is introduced where the annotation of base-base interactions and glycosidic bond angles is used as a metric. By reweighting the contributions of the clusters, minimal sets of four conformations could be found which are compatible with the experimental data. A motif search on the structural database showed that some identified low-population states are present in experimental structures of other RNA transcripts. The introduced method can be applied to characterize RNA dynamics in systems where a limited amount of NMR information is available.
Affiliation(s)
- Sabine Reißer
- Scuola Internazionale Superiore di Studi Avanzati (SISSA), Via Bonomea 265, 34136 Trieste, Italy
| | - Silvia Zucchelli
- Scuola Internazionale Superiore di Studi Avanzati (SISSA), Via Bonomea 265, 34136 Trieste, Italy
- Department of Health Sciences, Center for Autoimmune and Allergic Diseases (CAAD) and Interdisciplinary Research Center of Autoimmune Diseases (IRCAD), University of Piemonte Orientale, Novara, Italy
| | - Stefano Gustincich
- Central RNA Laboratory and Department of Neuroscience and Brain Technologies, Istituto Italiano di Tecnologia (IIT), 16163 Genova, Italy
| | - Giovanni Bussi
- Scuola Internazionale Superiore di Studi Avanzati (SISSA), Via Bonomea 265, 34136 Trieste, Italy
|
24
|
Orioli S, Larsen AH, Bottaro S, Lindorff-Larsen K. How to learn from inconsistencies: Integrating molecular simulations with experimental data. Prog Mol Biol Transl Sci 2020; 170:123-176. [PMID: 32145944 DOI: 10.1016/bs.pmbts.2019.12.006] [Citation(s) in RCA: 51] [Impact Index Per Article: 12.8] [Indexed: 12/28/2022]
Abstract
Molecular simulations and biophysical experiments can be used to provide independent and complementary insights into the molecular origin of biological processes. A particularly useful strategy is to use molecular simulations as a modeling tool to interpret experimental measurements, and to use experimental data to refine our biophysical models. Thus, explicit integration and synergy between molecular simulations and experiments is fundamental for furthering our understanding of biological processes. This is especially true in the case where discrepancies between measured and simulated observables emerge. In this chapter, we provide an overview of some of the core ideas behind methods that were developed to improve the consistency between experimental information and numerical predictions. We distinguish between situations where experiments are used to refine our understanding and models of specific systems, and situations where experiments are used more generally to refine transferable models. We discuss different philosophies and attempt to unify them in a single framework. Until now, such integration between experiments and simulations has mostly been applied to equilibrium data, and we discuss more recent developments aimed at analyzing time-dependent or time-resolved data.
Affiliation(s)
- Simone Orioli
- Structural Biology and NMR Laboratory & Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark; Structural Biophysics, Niels Bohr Institute, Faculty of Science, University of Copenhagen, Copenhagen, Denmark
| | - Andreas Haahr Larsen
- Structural Biology and NMR Laboratory & Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark; Structural Biophysics, Niels Bohr Institute, Faculty of Science, University of Copenhagen, Copenhagen, Denmark
| | - Sandro Bottaro
- Structural Biology and NMR Laboratory & Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark; Atomistic Simulations Laboratory, Istituto Italiano di Tecnologia, Genova, Italy
| | - Kresten Lindorff-Larsen
- Structural Biology and NMR Laboratory & Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark.
|
25
|
Integrative Approaches in Structural Biology: A More Complete Picture from the Combination of Individual Techniques. Biomolecules 2019; 9:biom9080370. [PMID: 31416261 PMCID: PMC6723403 DOI: 10.3390/biom9080370] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 07/19/2019] [Revised: 08/08/2019] [Accepted: 08/11/2019] [Indexed: 11/21/2022] Open
Abstract
With the recent technological and computational advancements, structural biology has begun to tackle more and more difficult questions, including complex biochemical pathways and transient interactions among macromolecules. This has demonstrated that, to approach the complexity of biology, one single technique is largely insufficient and unable to yield thorough answers, whereas integrated approaches have been more and more adopted with successful results. Traditional structural techniques (X-ray crystallography and Nuclear Magnetic Resonance (NMR)) and the emerging ones (cryo-electron microscopy (cryo-EM), Small Angle X-ray Scattering (SAXS)), together with molecular modeling, have pros and cons which very nicely complement one another. In this review, three examples of synergistic approaches chosen from our previous research will be revisited. The first shows how the joint use of both solution and solid-state NMR (SSNMR), X-ray crystallography, and cryo-EM is crucial to elucidate the structure of polyethylene glycol (PEG)ylated asparaginase, which would not be obtainable through any of the techniques taken alone. The second deals with the integrated use of NMR, X-ray crystallography, and SAXS in order to elucidate the catalytic mechanism of an enzyme that is based on the flexibility of the enzyme itself. The third one shows how experimental data from X-ray crystallography and NMR restraints can be combined to refine a protein model, yielding a structure which simultaneously satisfies both experimental datasets and is therefore closer to the ‘real structure’.
|
26
|
Bonomi M, Vendruscolo M. Determination of protein structural ensembles using cryo-electron microscopy. Curr Opin Struct Biol 2019; 56:37-45. [DOI: 10.1016/j.sbi.2018.10.006] [Citation(s) in RCA: 33] [Impact Index Per Article: 6.6] [Received: 09/13/2018] [Revised: 10/24/2018] [Accepted: 10/26/2018] [Indexed: 10/27/2022]
|
27
|
Köfinger J, Różycki B, Hummer G. Inferring Structural Ensembles of Flexible and Dynamic Macromolecules Using Bayesian, Maximum Entropy, and Minimal-Ensemble Refinement Methods. Methods Mol Biol 2019; 2022:341-352. [PMID: 31396910 DOI: 10.1007/978-1-4939-9608-7_14] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 06/10/2023]
Abstract
The flexible and dynamic nature of biomolecules and biomolecular complexes is essential for many cellular functions in living organisms but poses a challenge for experimental methods to determine high-resolution structural models. To meet this challenge, experiments are combined with molecular simulations. The latter propose models for structural ensembles, and the experimental data can be used to steer these simulations and to select ensembles that most likely underlie the experimental data. Here, we explain in detail how the "Bayesian Inference Of ENsembles" (BioEn) method can be used to refine such ensembles using a wide range of experimental data. The "Ensemble Refinement of SAXS" (EROS) method is a special case of BioEn, inspired by the Gull-Daniell formulation of maximum entropy image processing and focused originally on X-ray solution scattering experiments (SAXS) and then extended to integrative structural modeling. We also briefly sketch the "minimum ensemble method," a maximum-parsimony refinement method that seeks to represent an ensemble with a minimal number of representative structures.
Affiliation(s)
- Jürgen Köfinger
- Max Planck Institute of Biophysics, Frankfurt am Main, Germany.
| | - Bartosz Różycki
- Institute of Physics, Polish Academy of Sciences, Warsaw, Poland
| | - Gerhard Hummer
- Max Planck Institute of Biophysics, Frankfurt am Main, Germany.
- Department of Physics, Goethe University Frankfurt, Frankfurt am Main, Germany.
|
28
|
Cossio P, Allegretti M, Mayer F, Müller V, Vonck J, Hummer G. Bayesian inference of rotor ring stoichiometry from electron microscopy images of archaeal ATP synthase. Microscopy (Oxf) 2018; 67:266-273. [PMID: 30032235 DOI: 10.1093/jmicro/dfy033] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 12/19/2017] [Accepted: 06/20/2018] [Indexed: 12/24/2022] Open
Abstract
The 'Bayesian inference of electron microscopy' (BioEM) framework makes it possible to determine the stoichiometry of protein complexes using 3D coarse-grained models and a relatively small number of cryo-electron microscopy images as input. We applied the method to determine the most probable rotor ring stoichiometry of the archaeal Na+ ATP synthase from Pyrococcus furiosus, a multisubunit complex able to produce ATP under extreme conditions. Archaeal ATP synthases consist of a catalytic A1 part and a membrane-embedded AO portion. The AO portion is composed of a rotor ring and the a-subunit. The rotor ring of P. furiosus ATP synthase is composed of 16-kDa c-subunits, each consisting of four helices forming a bundle, with only one Na+-binding site per bundle. This ring was proposed to be decameric from LILBID-MS analysis of the entire ATP synthase. By contrast, the BioEM posterior favors a c9 ring stoichiometry. With BioEM, we ranked coarse-grained models of the whole complex with different ring geometry, using 6400 unprocessed particle images of the A1AO complex collected in vitreous ice. BioEM makes it possible to probabilistically establish the domain stoichiometry using low-resolution information and comparably few particle images.
Affiliation(s)
- Pilar Cossio
- Department of Theoretical Biophysics, Max Planck Institute of Biophysics, Frankfurt am Main, Germany; Biophysics of Tropical Diseases, Max Planck Tandem Group, University of Antioquia, Medellín, Colombia
| | - Matteo Allegretti
- Department of Structural Biology, Max Planck Institute of Biophysics, Frankfurt am Main, Germany
| | - Florian Mayer
- Department of Molecular Microbiology & Bioenergetics, Goethe University Frankfurt, Max-von-Laue-Strasse 9, Frankfurt am Main, Germany
| | - Volker Müller
- Department of Molecular Microbiology & Bioenergetics, Goethe University Frankfurt, Max-von-Laue-Strasse 9, Frankfurt am Main, Germany
| | - Janet Vonck
- Department of Structural Biology, Max Planck Institute of Biophysics, Frankfurt am Main, Germany
| | - Gerhard Hummer
- Department of Theoretical Biophysics, Max Planck Institute of Biophysics, Frankfurt am Main, Germany; Department of Physics, Goethe University Frankfurt, Max-von-Laue-Strasse 9, Frankfurt am Main, Germany
|
29
|
Cossio P, Hummer G. Likelihood-based structural analysis of electron microscopy images. Curr Opin Struct Biol 2018; 49:162-168. [PMID: 29579548 DOI: 10.1016/j.sbi.2018.03.004] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 09/20/2017] [Revised: 01/24/2018] [Accepted: 03/06/2018] [Indexed: 10/17/2022]
Abstract
Likelihood-based analysis of single-particle electron microscopy images has contributed much to the recent improvements in resolution. By treating particle orientations and classes probabilistically, uncertainties in the reconstruction process are explicitly accounted for, and the risk of bias towards the initial model is diminished. As a result, the quality and reliability of the reconstructions have greatly improved at manageable computational cost. Likelihood-based analysis of electron microscopy images also offers a route to direct coordinate refinement for dynamic systems, as an alternative to 3D density reconstruction. Here, we review recent developments in the algorithms used for reconstructions of high-resolution maps, and in the integrative framework of combining likelihood methods with simulations to address conformational variability in cryo-electron microscopy.
Affiliation(s)
- Pilar Cossio
- Biophysics of Tropical Diseases, Max Planck Tandem Group, University of Antioquia, Medellín, Colombia; Department of Theoretical Biophysics, Max Planck Institute of Biophysics, 60438 Frankfurt am Main, Germany.
| | - Gerhard Hummer
- Department of Theoretical Biophysics, Max Planck Institute of Biophysics, 60438 Frankfurt am Main, Germany; Institute of Biophysics, Goethe University Frankfurt, 60438 Frankfurt am Main, Germany
|
30
|
Using the Maximum Entropy Principle to Combine Simulations and Solution Experiments. Computation 2018. [DOI: 10.3390/computation6010015] [Citation(s) in RCA: 68] [Impact Index Per Article: 11.3] [Indexed: 12/31/2022]
|
31
|
Shevchuk R, Hub JS. Bayesian refinement of protein structures and ensembles against SAXS data using molecular dynamics. PLoS Comput Biol 2017; 13:e1005800. [PMID: 29045407 PMCID: PMC5662244 DOI: 10.1371/journal.pcbi.1005800] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.0] [Received: 04/06/2017] [Revised: 10/30/2017] [Accepted: 09/29/2017] [Indexed: 12/24/2022]
Abstract
Small-angle X-ray scattering is an increasingly popular technique used to detect protein structures and ensembles in solution. However, the refinement of structures and ensembles against SAXS data is often ambiguous due to the low information content of SAXS data, unknown systematic errors, and unknown scattering contributions from the solvent. We offer a solution to such problems by combining Bayesian inference with all-atom molecular dynamics simulations and explicit-solvent SAXS calculations. The Bayesian formulation correctly weights the SAXS data versus prior physical knowledge, it quantifies the precision or ambiguity of fitted structures and ensembles, and it accounts for unknown systematic errors due to poor buffer matching. The method further provides a probabilistic criterion for identifying the number of states required to explain the SAXS data. The method is validated by refining ensembles of a periplasmic binding protein against calculated SAXS curves. Subsequently, we derive the solution ensembles of the eukaryotic chaperone heat shock protein 90 (Hsp90) against experimental SAXS data. We find that the SAXS data of the apo state of Hsp90 is compatible with a single wide-open conformation, whereas the SAXS data of Hsp90 bound to ATP or to an ATP-analogue strongly suggest heterogeneous ensembles of a closed and a wide-open state.
Affiliation(s)
- Roman Shevchuk
- Institute for Microbiology and Genetics, University of Göttingen, Göttingen, Germany
- Göttingen Center for Molecular Biosciences (GZMB), University of Goettingen, Goettingen, Germany
| | - Jochen S. Hub
- Institute for Microbiology and Genetics, University of Göttingen, Göttingen, Germany
- Göttingen Center for Molecular Biosciences (GZMB), University of Goettingen, Goettingen, Germany
|
32
|
Bonomi M, Heller GT, Camilloni C, Vendruscolo M. Principles of protein structural ensemble determination. Curr Opin Struct Biol 2017; 42:106-116. [PMID: 28063280 DOI: 10.1016/j.sbi.2016.12.004] [Citation(s) in RCA: 230] [Impact Index Per Article: 32.9] [Received: 09/15/2016] [Revised: 11/18/2016] [Accepted: 12/06/2016] [Indexed: 01/19/2023]
Abstract
The biological functions of protein molecules are intimately dependent on their conformational dynamics. This aspect is particularly evident for disordered proteins, which constitute perhaps one-third of the human proteome. Therefore, structural ensembles often offer more useful representations of proteins than individual conformations. Here, we describe how the well-established principles of protein structure determination should be extended to the case of protein structural ensembles determination. These principles concern primarily how to deal with conformationally heterogeneous states, and with experimental measurements that are averaged over such states and affected by a variety of errors. We first review the growing literature of recent methods that combine experimental and computational information to model structural ensembles, highlighting their similarities and differences. We then address some conceptual problems in the determination of structural ensembles and define future goals towards the establishment of objective criteria for the comparison, validation, visualization and dissemination of such ensembles.
Affiliation(s)
| | | | - Carlo Camilloni
- Department of Chemistry and Institute for Advanced Study, Technische Universität München, D-85747 Garching, Germany
|
33
|
Hummer G, Köfinger J. Bayesian ensemble refinement by replica simulations and reweighting. J Chem Phys 2016; 143:243150. [PMID: 26723635 DOI: 10.1063/1.4937786] [Citation(s) in RCA: 137] [Impact Index Per Article: 17.1] [Indexed: 01/24/2023]
Abstract
We describe different Bayesian ensemble refinement methods, examine their interrelation, and discuss their practical application. With ensemble refinement, the properties of dynamic and partially disordered (bio)molecular structures can be characterized by integrating a wide range of experimental data, including measurements of ensemble-averaged observables. We start from a Bayesian formulation in which the posterior is a functional that ranks different configuration space distributions. By maximizing this posterior, we derive an optimal Bayesian ensemble distribution. For discrete configurations, this optimal distribution is identical to that obtained by the maximum entropy "ensemble refinement of SAXS" (EROS) formulation. Bayesian replica ensemble refinement enhances the sampling of relevant configurations by imposing restraints on averages of observables in coupled replica molecular dynamics simulations. We show that the strength of the restraints should scale linearly with the number of replicas to ensure convergence to the optimal Bayesian result in the limit of infinitely many replicas. In the "Bayesian inference of ensembles" method, we combine the replica and EROS approaches to accelerate the convergence. An adaptive algorithm can be used to sample directly from the optimal ensemble, without replicas. We discuss the incorporation of single-molecule measurements and dynamic observables such as relaxation parameters. The theoretical analysis of different Bayesian ensemble refinement approaches provides a basis for practical applications and a starting point for further investigations.
Affiliation(s)
- Gerhard Hummer
- Department of Theoretical Biophysics, Max Planck Institute of Biophysics, Max-von-Laue Str. 3, 60438 Frankfurt am Main, Germany
| | - Jürgen Köfinger
- Department of Theoretical Biophysics, Max Planck Institute of Biophysics, Max-von-Laue Str. 3, 60438 Frankfurt am Main, Germany
|
34
|
Goh BC, Hadden JA, Bernardi RC, Singharoy A, McGreevy R, Rudack T, Cassidy CK, Schulten K. Computational Methodologies for Real-Space Structural Refinement of Large Macromolecular Complexes. Annu Rev Biophys 2016; 45:253-78. [PMID: 27145875 DOI: 10.1146/annurev-biophys-062215-011113] [Citation(s) in RCA: 53] [Impact Index Per Article: 6.6] [Indexed: 12/16/2022]
Abstract
The rise of the computer as a powerful tool for model building and refinement has revolutionized the field of structure determination for large biomolecular systems. Despite the wide availability of robust experimental methods capable of resolving structural details across a range of spatiotemporal resolutions, computational hybrid methods have the unique ability to integrate the diverse data from multimodal techniques such as X-ray crystallography and electron microscopy into consistent, fully atomistic structures. Here, commonly employed strategies for computational real-space structural refinement are reviewed, and their specific applications are illustrated for several large macromolecular complexes: ribosome, virus capsids, chemosensory array, and photosynthetic chromatophore. The increasingly important role of computational methods in large-scale structural refinement, along with current and future challenges, is discussed.
Affiliation(s)
- Boon Chong Goh
- Beckman Institute, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801; Department of Physics, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801
| | - Jodi A Hadden
- Beckman Institute, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801; Energy Biosciences Institute, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801
| | - Rafael C Bernardi
- Beckman Institute, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801; Energy Biosciences Institute, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801
| | - Abhishek Singharoy
- Beckman Institute, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801
| | - Ryan McGreevy
- Beckman Institute, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801
| | - Till Rudack
- Beckman Institute, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801
| | - C Keith Cassidy
- Beckman Institute, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801; Department of Physics, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801
| | - Klaus Schulten
- Beckman Institute, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801; Department of Physics, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801; Energy Biosciences Institute, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801; Center for the Physics of Living Cells, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801; Center for Biophysics and Computational Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801
|
35
|
Bonomi M, Camilloni C, Cavalli A, Vendruscolo M. Metainference: A Bayesian inference method for heterogeneous systems. Sci Adv 2016; 2:e1501177. [PMID: 26844300 PMCID: PMC4737209 DOI: 10.1126/sciadv.1501177] [Citation(s) in RCA: 130] [Impact Index Per Article: 16.3] [Received: 08/27/2015] [Accepted: 11/21/2015] [Indexed: 05/22/2023]
Abstract
Modeling a complex system is almost invariably a challenging task. The incorporation of experimental observations can be used to improve the quality of a model and thus to obtain better predictions about the behavior of the corresponding system. This approach, however, is affected by a variety of different errors, especially when a system simultaneously populates an ensemble of different states and experimental data are measured as averages over such states. To address this problem, we present a Bayesian inference method, called "metainference," that is able to deal with errors in experimental measurements and with experimental measurements averaged over multiple states. To achieve this goal, metainference models a finite sample of the distribution of models using a replica approach, in the spirit of the replica-averaging modeling based on the maximum entropy principle. To illustrate the method, we present its application to a heterogeneous model system and to the determination of an ensemble of structures corresponding to the thermal fluctuations of a protein molecule. Metainference thus provides an approach to modeling complex systems with heterogeneous components and interconverting between different states by taking into account all possible sources of errors.
Affiliation(s)
- Massimiliano Bonomi
- Department of Chemistry, University of Cambridge, Cambridge CB2 1EW, UK
- Corresponding author.
| | - Carlo Camilloni
- Department of Chemistry, University of Cambridge, Cambridge CB2 1EW, UK
| | - Andrea Cavalli
- Department of Chemistry, University of Cambridge, Cambridge CB2 1EW, UK
- Institute for Research in Biomedicine, CH-6500 Bellinzona, Switzerland
| | - Michele Vendruscolo
- Department of Chemistry, University of Cambridge, Cambridge CB2 1EW, UK
- Corresponding author.
|
36
|
Schröder GF. Hybrid methods for macromolecular structure determination: experiment with expectations. Curr Opin Struct Biol 2015; 31:20-7. [DOI: 10.1016/j.sbi.2015.02.016] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.8] [Received: 11/05/2014] [Revised: 02/22/2015] [Accepted: 02/26/2015] [Indexed: 12/15/2022]
|
37
|
Różycki B, Boura E. Large, dynamic, multi-protein complexes: a challenge for structural biology. J Phys Condens Matter 2014; 26:463103. [PMID: 25335513 DOI: 10.1088/0953-8984/26/46/463103] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Indexed: 06/04/2023]
Abstract
Structural biology elucidates atomic structures of macromolecules such as proteins, DNA, RNA, and their complexes to understand the basic mechanisms of their functions. Among proteins that pose the most difficult problems to current efforts are those which have several large domains connected by long, flexible polypeptide segments. Although abundant and critically important in biological cells, such proteins have proven intractable by conventional techniques. This gap has recently led to the advancement of hybrid methods that use state-of-the-art computational tools to combine complementary data from various high- and low-resolution experiments. In this review, we briefly discuss the individual experimental techniques to illustrate their strengths and limitations, and then focus on the use of hybrid methods in structural biology. We describe how representative structures of dynamic multi-protein complexes are obtained utilizing the EROS hybrid method that we have co-developed.
Affiliation(s)
- Bartosz Różycki
- Institute of Physics, Polish Academy of Sciences, Al. Lotników 32/46, 02-668 Warsaw, Poland
|
38
|
Schneidman-Duhovny D, Pellarin R, Sali A. Uncertainty in integrative structural modeling. Curr Opin Struct Biol 2014; 28:96-104. [PMID: 25173450 PMCID: PMC4252396 DOI: 10.1016/j.sbi.2014.08.001] [Citation(s) in RCA: 79] [Impact Index Per Article: 7.9] [Received: 06/18/2014] [Revised: 07/24/2014] [Accepted: 08/05/2014] [Indexed: 01/08/2023]
Abstract
Integrative structural modeling uses multiple types of input information and proceeds in four stages: (i) gathering information, (ii) designing model representation and converting information into a scoring function, (iii) sampling good-scoring models, and (iv) analyzing models and information. In the first stage, uncertainty originates from data that are sparse, noisy, ambiguous, or derived from heterogeneous samples. In the second stage, uncertainty can originate from a representation that is too coarse for the available information or a scoring function that does not accurately capture the information. In the third stage, the major source of uncertainty is insufficient sampling. In the fourth stage, clustering, cross-validation, and other methods are used to estimate the precision and accuracy of the models and information.
Affiliation(s)
- Dina Schneidman-Duhovny
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, CA 94158, USA.
| | - Riccardo Pellarin
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, CA 94158, USA
| | - Andrej Sali
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, CA 94158, USA; Department of Pharmaceutical Chemistry, and California Institute for Quantitative Biosciences (QB3), University of California, San Francisco, CA 94158, USA.
|