1
Gilles MA, Singer A. Cryo-EM Heterogeneity Analysis using Regularized Covariance Estimation and Kernel Regression. bioRxiv 2024:2023.10.28.564422. [PMID: 37961393 PMCID: PMC10634927 DOI: 10.1101/2023.10.28.564422] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 11/15/2023]
Abstract
Proteins and the complexes they form are central to nearly all cellular processes. Their flexibility, expressed through a continuum of states, provides a window into their biological functions. Cryogenic electron microscopy (cryo-EM) is an ideal tool to study these dynamic states as it captures specimens in non-crystalline conditions and enables high-resolution reconstructions. However, analyzing the heterogeneous distributions of conformations from cryo-EM data is challenging. We present RECOVAR, a method for analyzing these distributions based on principal component analysis (PCA) computed using a REgularized COVARiance estimator. RECOVAR is fast, robust, interpretable, expressive, and competitive with the state-of-the-art neural network methods on heterogeneous cryo-EM datasets. The regularized covariance method efficiently computes a large number of high-resolution principal components that can encode rich heterogeneous distributions of conformations and does so robustly thanks to an automatic regularization scheme. The novel reconstruction method based on adaptive kernel regression resolves conformational states to a higher resolution than all other tested methods on extensive independent benchmarks while remaining highly interpretable. Additionally, we exploit favorable properties of the PCA embedding to estimate the conformational density accurately. This density allows for better interpretability of the latent space by identifying stable states and low free-energy motions. Finally, we present a scheme to navigate the high-dimensional latent space by automatically identifying these low free-energy trajectories. We make the code freely available at https://github.com/ma-gilles/recovar.
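The link between conformational density and free energy mentioned in this abstract can be illustrated with a Boltzmann inversion. This is only a minimal sketch under the assumption that the density follows Boltzmann statistics; the function name `relative_free_energy` and the `kT` parameter are hypothetical and are not taken from the RECOVAR code.

```python
import numpy as np

def relative_free_energy(density, kT=1.0):
    """Free energy relative to the most populated state, in units of kT.

    Assumes Boltzmann statistics, F = -kT * ln(p / p_max); states with
    zero density are assigned infinite free energy.
    """
    density = np.asarray(density, dtype=float)
    energy = np.full(density.shape, np.inf)
    populated = density > 0
    energy[populated] = -kT * np.log(density[populated] / density.max())
    return energy
```

Under this assumption, high-density regions of the latent space correspond to low free-energy (stable) states, and a path that stays in high-density regions is a low free-energy trajectory.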
2
Verbeke EJ, Gilles MA, Bendory T, Singer A. Self Fourier shell correlation: properties and application to cryo-ET. Commun Biol 2024; 7:101. [PMID: 38228756 PMCID: PMC10791666 DOI: 10.1038/s42003-023-05724-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2023] [Accepted: 12/19/2023] [Indexed: 01/18/2024]
Abstract
The Fourier shell correlation (FSC) is a measure of the similarity between two signals computed over corresponding shells in the frequency domain and has broad applications in microscopy. In structural biology, the FSC is ubiquitous in methods for validation, resolution determination, and signal enhancement. Computing the FSC usually requires two independent measurements of the same underlying signal, which can be limiting for some applications. Here, we analyze and extend an approach to estimate the FSC from a single measurement. In particular, we derive the necessary conditions required to estimate the FSC from downsampled versions of a single noisy measurement. These conditions reveal additional corrections which we implement to increase the applicability of the method. We then illustrate two applications of our approach, first as an estimate of the global resolution from a single 3-D structure and second as a data-driven method for denoising tomographic reconstructions in electron cryo-tomography. These results provide general guidelines for computing the FSC from a single measurement and suggest new applications of the FSC in microscopy.
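For readers unfamiliar with the quantity, the conventional two-measurement FSC described in the first sentences of this abstract can be sketched with NumPy as follows. This is an illustrative sketch of the standard definition for cubic volumes, not the single-measurement estimator the paper develops; the function name `fourier_shell_correlation` and the integer-radius shell binning are assumptions made here.

```python
import numpy as np

def fourier_shell_correlation(vol_a, vol_b):
    """Standard FSC between two cubic volumes, one value per integer-radius shell."""
    assert vol_a.shape == vol_b.shape and vol_a.ndim == 3
    n = vol_a.shape[0]
    assert vol_a.shape == (n, n, n), "sketch assumes cubic volumes"

    # Centered 3-D Fourier transforms of both volumes.
    fa = np.fft.fftshift(np.fft.fftn(vol_a))
    fb = np.fft.fftshift(np.fft.fftn(vol_b))

    # Integer shell index (rounded radial frequency) for every Fourier voxel.
    freqs = np.fft.fftshift(np.fft.fftfreq(n)) * n
    kx, ky, kz = np.meshgrid(freqs, freqs, freqs, indexing="ij")
    shell = np.rint(np.sqrt(kx**2 + ky**2 + kz**2)).astype(int)

    # Keep only shells below the Nyquist radius.
    n_shells = n // 2
    mask = shell < n_shells
    idx = shell[mask]

    # Per-shell cross-correlation and per-shell power of each volume.
    cross = np.bincount(idx, weights=(fa * np.conj(fb)).real[mask], minlength=n_shells)
    pow_a = np.bincount(idx, weights=(np.abs(fa) ** 2)[mask], minlength=n_shells)
    pow_b = np.bincount(idx, weights=(np.abs(fb) ** 2)[mask], minlength=n_shells)

    # Normalized correlation; shells with no power are reported as 0.
    denom = np.sqrt(pow_a * pow_b)
    return np.divide(cross, denom, out=np.zeros_like(cross), where=denom > 0)
```

Calling the function on two independent half-map reconstructions gives the usual FSC curve; the need for those two independent inputs is the limitation the abstract refers to.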
Affiliation(s)
- Eric J Verbeke
  - Program in Applied and Computational Mathematics, Princeton University, Princeton, NJ, USA
- Marc Aurèle Gilles
  - Program in Applied and Computational Mathematics, Princeton University, Princeton, NJ, USA
- Tamir Bendory
  - School of Electrical Engineering, Tel Aviv University, Tel Aviv, Israel
- Amit Singer
  - Program in Applied and Computational Mathematics, Princeton University, Princeton, NJ, USA
  - Department of Mathematics, Princeton University, Princeton, NJ, USA
3
Vuillemot R, Harastani M, Hamitouche I, Jonic S. MDSPACE and MDTOMO Software for Extracting Continuous Conformational Landscapes from Datasets of Single Particle Images and Subtomograms Based on Molecular Dynamics Simulations: Latest Developments in ContinuousFlex Software Package. Int J Mol Sci 2023; 25:20. [PMID: 38203192 PMCID: PMC10779004 DOI: 10.3390/ijms25010020] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/06/2023] [Revised: 12/16/2023] [Accepted: 12/17/2023] [Indexed: 01/12/2024]
Abstract
Cryo electron microscopy (cryo-EM) instrumentation enables 3D reconstruction of the structure of biomolecular complexes in vitro (purified complexes studied by single particle analysis) and in situ (complexes studied in cells by cryo electron tomography). Standard cryo-EM approaches allow high-resolution reconstruction of only a few conformational states of a molecular complex, as they rely on data classification into a given number of classes to increase the resolution of the reconstruction from the most populated classes while discarding all other classes. Such discrete classification approaches give only a partial picture of the full conformational variability of the complex, because continuous conformational transitions pass through countless intermediate states. In this article, we present software with a user-friendly graphical interface for running two recently introduced methods, namely, MDSPACE and MDTOMO, to obtain continuous conformational landscapes of biomolecules by analyzing in vitro and in situ cryo-EM data (single particle images and subtomograms) based on molecular dynamics simulations of an available atomic model of one of the conformations. The MDSPACE and MDTOMO software is part of the open-source ContinuousFlex software package (starting from version 3.4.2 of ContinuousFlex), which can be run as a plugin of the Scipion software package (version 3.1 and later), broadly used in the cryo-EM field.
Affiliation(s)
- Slavica Jonic
  - IMPMC-UMR 7590 CNRS, Sorbonne Université, MNHN, 75005 Paris, France
4
Verbeke EJ, Gilles MA, Bendory T, Singer A. Self Fourier shell correlation: properties and application to cryo-ET. bioRxiv 2023:2023.11.07.565363. [PMID: 37986736 PMCID: PMC10659293 DOI: 10.1101/2023.11.07.565363] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2023]
Abstract
The Fourier shell correlation (FSC) is a measure of the similarity between two signals computed over corresponding shells in the frequency domain and has broad applications in microscopy. In structural biology, the FSC is ubiquitous in methods for validation, resolution determination, and signal enhancement. Computing the FSC usually requires two independent measurements of the same underlying signal, which can be limiting for some applications. Here, we analyze and extend an approach proposed by Koho et al. [1] to estimate the FSC from a single measurement. In particular, we derive the necessary conditions required to estimate the FSC from downsampled versions of a single noisy measurement. These conditions reveal additional corrections which we implement to increase the applicability of the method. We then illustrate two applications of our approach, first as an estimate of the global resolution from a single 3-D structure and second as a data-driven method for denoising tomographic reconstructions in electron cryo-tomography. These results provide general guidelines for computing the FSC from a single measurement and suggest new applications of the FSC in microscopy.
Affiliation(s)
- Eric J Verbeke
  - Program in Applied and Computational Mathematics, Princeton University, Princeton, NJ, USA
- Marc Aurèle Gilles
  - Program in Applied and Computational Mathematics, Princeton University, Princeton, NJ, USA
- Tamir Bendory
  - School of Electrical Engineering, Tel Aviv University, Tel Aviv, Israel
- Amit Singer
  - Program in Applied and Computational Mathematics, Princeton University, Princeton, NJ, USA
  - Department of Mathematics, Princeton University, Princeton, NJ, USA
5
Forsberg BO, Shah PNM, Burt A. A robust normalized local filter to estimate compositional heterogeneity directly from cryo-EM maps. Nat Commun 2023; 14:5802. [PMID: 37726277 PMCID: PMC10509264 DOI: 10.1038/s41467-023-41478-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 03/31/2023] [Accepted: 09/06/2023] [Indexed: 09/21/2023]
Abstract
Cryo electron microscopy (cryo-EM) is used in biological research to visualize biomolecular complexes in 3D, but the heterogeneity of cryo-EM reconstructions is not easily estimated. Current processing paradigms nevertheless exert great effort to reduce flexibility and heterogeneity to improve the quality of the reconstruction. Clustering algorithms are typically employed to identify populations of data with reduced variability, but lack assessment of remaining heterogeneity. Here we develop a fast and simple algorithm based on spatial filtering to estimate the heterogeneity of a reconstruction. In the absence of flexibility, this estimate approximates macromolecular component occupancy. We show that our implementation can derive reasonable input parameters, that composition heterogeneity can be estimated based on contrast loss, and that the reconstruction can be modified accordingly to emulate altered constituent occupancy. This stands to benefit conventionally employed maximum-likelihood classification methods, whereas we here limit considerations to cryo-EM map interpretation, quantification, and particle-image signal subtraction.
Affiliation(s)
- Björn O Forsberg
  - Department of Physiology and Pharmacology, Karolinska Institute, 171 77, Stockholm, Sweden
  - Division of Structural Biology, University of Oxford, OX3 7BN, Oxford, UK
- Pranav N M Shah
  - Division of Structural Biology, University of Oxford, OX3 7BN, Oxford, UK
- Alister Burt
  - MRC Laboratory of Molecular Biology, Cambridge, CB2 0QH, UK
6
Punjani A, Fleet DJ. 3DFlex: determining structure and motion of flexible proteins from cryo-EM. Nat Methods 2023; 20:860-870. [PMID: 37169929 PMCID: PMC10250194 DOI: 10.1038/s41592-023-01853-8] [Citation(s) in RCA: 58] [Impact Index Per Article: 58.0] [Received: 07/28/2022] [Accepted: 03/16/2023] [Indexed: 05/13/2023]
Abstract
Modeling flexible macromolecules is one of the foremost challenges in single-particle cryogenic-electron microscopy (cryo-EM), with the potential to illuminate fundamental questions in structural biology. We introduce Three-Dimensional Flexible Refinement (3DFlex), a motion-based neural network model for continuous molecular heterogeneity for cryo-EM data. 3DFlex exploits knowledge that conformational variability of a protein is often the result of physical processes that transport density over space and tend to preserve local geometry. From two-dimensional image data, 3DFlex enables the determination of high-resolution 3D density, and provides an explicit model of a flexible protein's motion over its conformational landscape. Experimentally, for large molecular machines (tri-snRNP spliceosome complex, translocating ribosome) and small flexible proteins (TRPV1 ion channel, αVβ8 integrin, SARS-CoV-2 spike), 3DFlex learns nonrigid molecular motions while resolving details of moving secondary structure elements. 3DFlex can improve 3D density resolution beyond the limits of existing methods because particle images contribute coherent signal over the conformational landscape.
Affiliation(s)
- Ali Punjani
  - Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
  - Vector Institute for Artificial Intelligence, Toronto, Ontario, Canada
  - Structura Biotechnology Inc., Toronto, Ontario, Canada
- David J Fleet
  - Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
  - Vector Institute for Artificial Intelligence, Toronto, Ontario, Canada
  - Google Research, Toronto, Ontario, Canada
7
Toader B, Sigworth FJ, Lederman RR. Methods for Cryo-EM Single Particle Reconstruction of Macromolecules Having Continuous Heterogeneity. J Mol Biol 2023; 435:168020. [PMID: 36863660 PMCID: PMC10164696 DOI: 10.1016/j.jmb.2023.168020] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 11/16/2022] [Revised: 02/15/2023] [Accepted: 02/16/2023] [Indexed: 03/02/2023]
Abstract
Macromolecules change their shape (conformation) in the process of carrying out their functions. The imaging by cryo-electron microscopy of rapidly-frozen, individual copies of macromolecules (single particles) is a powerful and general approach to understanding the motions and energy landscapes of macromolecules. Widely-used computational methods already allow the recovery of a few distinct conformations from heterogeneous single-particle samples, but the treatment of complex forms of heterogeneity such as the continuum of possible transitory states and flexible regions remains largely an open problem. In recent years there has been a surge of new approaches for treating the more general problem of continuous heterogeneity. This paper surveys the current state of the art in this area.
Affiliation(s)
- Bogdan Toader
  - Department of Statistics and Data Science, Yale University, United States
- Fred J Sigworth
  - Department of Cellular and Molecular Physiology, Yale University, United States
- Roy R Lederman
  - Department of Statistics and Data Science, Yale University, United States. https://twitter.com/roylederman
8
Vuillemot R, Mirzaei A, Harastani M, Hamitouche I, Fréchin L, Klaholz BP, Miyashita O, Tama F, Rouiller I, Jonic S. MDSPACE: Extracting Continuous Conformational Landscapes from Cryo-EM Single Particle Datasets Using 3D-to-2D Flexible Fitting based on Molecular Dynamics Simulation. J Mol Biol 2023; 435:167951. [PMID: 36638910 DOI: 10.1016/j.jmb.2023.167951] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 10/16/2022] [Revised: 12/08/2022] [Accepted: 01/03/2023] [Indexed: 01/12/2023]
Abstract
This article presents an original approach for extracting atomic-resolution landscapes of continuous conformational variability of biomolecular complexes from cryo electron microscopy (cryo-EM) single particle images. This approach is based on a new 3D-to-2D flexible fitting method, which uses molecular dynamics (MD) simulation and is embedded in an iterative conformational-landscape refinement scheme. This new approach is referred to as MDSPACE, which stands for Molecular Dynamics simulation for Single Particle Analysis of Continuous Conformational hEterogeneity. The article describes the MDSPACE approach and shows its performance using synthetic and experimental datasets.
Affiliation(s)
- Rémi Vuillemot
  - IMPMC-UMR 7590 CNRS, Sorbonne Université, Muséum National d'Histoire Naturelle, Paris, France; Department of Biochemistry & Pharmacology and Bio21 Molecular Science and Biotechnology Institute, The University of Melbourne, Victoria, Australia
- Alex Mirzaei
  - IMPMC-UMR 7590 CNRS, Sorbonne Université, Muséum National d'Histoire Naturelle, Paris, France
- Mohamad Harastani
  - IMPMC-UMR 7590 CNRS, Sorbonne Université, Muséum National d'Histoire Naturelle, Paris, France
- Ilyes Hamitouche
  - IMPMC-UMR 7590 CNRS, Sorbonne Université, Muséum National d'Histoire Naturelle, Paris, France
- Léo Fréchin
  - Centre for Integrative Biology, Department of Integrated Structural Biology, IGBMC-UMR 7104 CNRS, U964 Inserm, Université de Strasbourg, Strasbourg, France
- Bruno P Klaholz
  - Centre for Integrative Biology, Department of Integrated Structural Biology, IGBMC-UMR 7104 CNRS, U964 Inserm, Université de Strasbourg, Strasbourg, France
- Florence Tama
  - RIKEN Center for Computational Science, Kobe, Japan; Institute of Transformative Biomolecules, Graduate School of Science, Nagoya University, Nagoya, Japan; Department of Physics, Graduate School of Science, Nagoya University, Nagoya, Japan
- Isabelle Rouiller
  - Department of Biochemistry & Pharmacology and Bio21 Molecular Science and Biotechnology Institute, The University of Melbourne, Victoria, Australia
- Slavica Jonic
  - IMPMC-UMR 7590 CNRS, Sorbonne Université, Muséum National d'Histoire Naturelle, Paris, France
9
DiIorio MC, Kulczyk AW. Exploring the Structural Variability of Dynamic Biological Complexes by Single-Particle Cryo-Electron Microscopy. Micromachines 2022; 14:118. [PMID: 36677177 PMCID: PMC9866264 DOI: 10.3390/mi14010118] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/08/2022] [Revised: 12/27/2022] [Accepted: 12/30/2022] [Indexed: 05/15/2023]
Abstract
Biological macromolecules and assemblies precisely rearrange their atomic 3D structures to execute cellular functions. Understanding the mechanisms by which these molecular machines operate requires insight into the ensemble of structural states they occupy during the functional cycle. Single-particle cryo-electron microscopy (cryo-EM) has become the preferred method to provide near-atomic resolution, structural information about dynamic biological macromolecules elusive to other structure determination methods. Recent advances in cryo-EM methodology have allowed structural biologists not only to probe the structural intermediates of biochemical reactions, but also to resolve different compositional and conformational states present within the same dataset. This article reviews newly developed sample preparation and single-particle analysis (SPA) techniques for high-resolution structure determination of intrinsically dynamic and heterogeneous samples, shedding light upon the intricate mechanisms employed by molecular machines and helping to guide drug discovery efforts.
Affiliation(s)
- Megan C. DiIorio
  - Institute for Quantitative Biomedicine, Rutgers University, 174 Frelinghuysen Road, Piscataway, NJ 08854, USA
- Arkadiusz W. Kulczyk
  - Institute for Quantitative Biomedicine, Rutgers University, 174 Frelinghuysen Road, Piscataway, NJ 08854, USA
  - Department of Biochemistry and Microbiology, Rutgers University, 75 Lipman Drive, New Brunswick, NJ 08901, USA
10
Abstract
Cryo-electron microscopy (CryoEM) has become a vital technique in structural biology. It is an interdisciplinary field that takes advantage of advances in biochemistry, physics, and image processing, among other disciplines. Innovations in these three basic pillars have contributed to the rise of CryoEM in the past decade. This work reviews the main contributions in image processing to the current reconstruction workflow of single particle analysis (SPA) by CryoEM. Our review emphasizes the evolution over time of the algorithms across the different steps of the workflow, differentiating between two groups of approaches: analytical methods and deep learning algorithms. We present an analysis of the current state of the art. Finally, we discuss the emerging problems and challenges still to be addressed in the evolution of CryoEM image processing methods in SPA.
Affiliation(s)
- Jose Luis Vilas
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Darwin, 3, Campus Universidad Autonoma, 28049 Cantoblanco, Madrid, Spain
- Jose Maria Carazo
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Darwin, 3, Campus Universidad Autonoma, 28049 Cantoblanco, Madrid, Spain
- Carlos Oscar S. Sorzano
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Darwin, 3, Campus Universidad Autonoma, 28049 Cantoblanco, Madrid, Spain
11
Hamitouche I, Jonic S. DeepHEMNMA: ResNet-based hybrid analysis of continuous conformational heterogeneity in cryo-EM single particle images. Front Mol Biosci 2022; 9:965645. [PMID: 36158571 PMCID: PMC9493108 DOI: 10.3389/fmolb.2022.965645] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 06/10/2022] [Accepted: 08/08/2022] [Indexed: 11/13/2022]
Abstract
Single-particle cryo-electron microscopy (cryo-EM) is a technique for biomolecular structure reconstruction from vitrified samples containing many copies of a biomolecular complex (known as single particles) at random unknown 3D orientations and positions. Cryo-EM allows reconstructing multiple conformations of the complexes from images of the same sample, which usually requires many rounds of 2D and 3D classifications to disentangle and interpret the combined conformational, orientational, and translational heterogeneity. The elucidation of different conformations is the key to understanding molecular mechanisms behind the biological functions of the complexes and the key to novel drug discovery. Continuous conformational heterogeneity, due to gradual conformational transitions giving rise to many intermediate conformational states of the complexes, is both an obstacle for high-resolution 3D reconstruction of the conformational states and an opportunity to obtain information about multiple coexisting conformational states at once. The HEMNMA method, specifically developed for analyzing continuous conformational heterogeneity in cryo-EM, determines the conformation, orientation, and position of the complex in each single particle image by image analysis using normal modes (the motion directions simulated for a given atomic structure or EM map), which in turn allows determining the full conformational space of the complex but at the price of high computational cost. In this article, we present a new method, referred to as DeepHEMNMA, which speeds up HEMNMA by combining it with a residual neural network (ResNet) based deep learning approach. The performance of DeepHEMNMA is shown using synthetic and experimental single particle images.
Affiliation(s)
- Slavica Jonic
  - IMPMC - UMR 7590 CNRS, Sorbonne Université, MNHN, Paris, France
12
Shi Y, Singer A. Ab-initio contrast estimation and denoising of cryo-EM images. Comput Methods Programs Biomed 2022; 224:107018. [PMID: 35901641 PMCID: PMC9392052 DOI: 10.1016/j.cmpb.2022.107018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2022] [Revised: 06/22/2022] [Accepted: 07/08/2022] [Indexed: 06/15/2023]
Abstract
BACKGROUND AND OBJECTIVE: The contrast of cryo-EM images varies from one to another, primarily due to the uneven thickness of the ice layer. This contrast variation can affect the quality of 2-D class averaging, 3-D ab-initio modeling, and 3-D heterogeneity analysis. Contrast estimation is currently performed during 3-D iterative refinement. As a result, the estimates are not available at the earlier computational stages of class averaging and ab-initio modeling. This paper aims to solve the contrast estimation problem directly from the picked particle images in the ab-initio stage, without estimating the 3-D volume, image rotations, or class averages. METHODS: The key observation underlying our analysis is that the 2-D covariance matrix of the raw images is related to the covariance of the underlying clean images, the noise variance, and the contrast variability between images. We show that the contrast variability can be derived from the 2-D covariance matrix and we apply the existing Covariance Wiener Filtering (CWF) framework to estimate it. We also demonstrate a modification of CWF to estimate the contrast of individual images. RESULTS: Our method improves the contrast estimation by a large margin, compared to the previous CWF method. Its estimation accuracy is often comparable to that of an oracle that knows the ground truth covariance of the clean images. The more accurate contrast estimation also improves the quality of image restoration as demonstrated in both synthetic and experimental datasets. CONCLUSIONS: This paper proposes an effective method for contrast estimation directly from noisy images without using any 3-D volume information. It enables contrast correction in the earlier stage of single particle analysis, and may improve the accuracy of downstream processing.
Affiliation(s)
- Yunpeng Shi
  - Program in Applied and Computational Mathematics, Princeton University, United States
- Amit Singer
  - Program in Applied and Computational Mathematics, Princeton University, United States; Department of Mathematics, Princeton University, United States
13
Wu Z, Chen E, Zhang S, Ma Y, Mao Y. Visualizing Conformational Space of Functional Biomolecular Complexes by Deep Manifold Learning. Int J Mol Sci 2022; 23:8872. [PMID: 36012133 PMCID: PMC9408802 DOI: 10.3390/ijms23168872] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2022] [Revised: 07/29/2022] [Accepted: 08/03/2022] [Indexed: 11/17/2022]
Abstract
Cellular functions are executed by biological macromolecular complexes in nonequilibrium dynamic processes, which exhibit a vast diversity of conformational states. Solving the conformational continuum of important biomolecular complexes at the atomic level is essential to understanding their functional mechanisms and guiding structure-based drug discovery. Here, we introduce a deep manifold learning framework, named AlphaCryo4D, which enables atomic-level cryogenic electron microscopy (cryo-EM) reconstructions that approximately visualize the conformational space of biomolecular complexes of interest. AlphaCryo4D integrates 3D deep residual learning with manifold embedding of pseudo-energy landscapes, which simultaneously improves 3D classification accuracy and reconstruction resolution via an energy-based particle-voting algorithm. In blind assessments using simulated heterogeneous datasets, AlphaCryo4D achieved 3D classification accuracy three times that of alternative methods and reconstructed continuous conformational changes of a 130-kDa protein at sub-3 Å resolution. By applying this approach to analyze several experimental datasets of the proteasome, ribosome and spliceosome, we demonstrate its potential generality in exploring hidden conformational space or transient states of macromolecular complexes that have hitherto remained invisible. Integration of this approach with time-resolved cryo-EM further allows visualization of the conformational continuum in a nonequilibrium regime at the atomic level, thus potentially enabling therapeutic discovery against highly dynamic biomolecular targets.
Affiliation(s)
- Zhaolong Wu
  - State Key Laboratory for Artificial Microstructure and Mesoscopic Physics, School of Physics, Peking University, Beijing 100871, China
  - Peking-Tsinghua Joint Center for Life Sciences, Academy of Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
  - Center for Quantitative Biology, Academy of Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
- Enbo Chen
  - State Key Laboratory for Artificial Microstructure and Mesoscopic Physics, School of Physics, Peking University, Beijing 100871, China
  - Peking-Tsinghua Joint Center for Life Sciences, Academy of Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
- Shuwen Zhang
  - State Key Laboratory for Artificial Microstructure and Mesoscopic Physics, School of Physics, Peking University, Beijing 100871, China
  - Peking-Tsinghua Joint Center for Life Sciences, Academy of Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
- Yinping Ma
  - Computing Center, Peking University, Beijing 100871, China
- Youdong Mao
  - State Key Laboratory for Artificial Microstructure and Mesoscopic Physics, School of Physics, Peking University, Beijing 100871, China
  - Peking-Tsinghua Joint Center for Life Sciences, Academy of Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
  - Center for Quantitative Biology, Academy of Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
  - National Biomedical Imaging Center, Peking University, Beijing 100871, China
14
Zhou Y, Moscovich A, Bartesaghi A. Data-driven determination of number of discrete conformations in single-particle cryo-EM. Comput Methods Programs Biomed 2022; 221:106892. [PMID: 35597206 PMCID: PMC10131080 DOI: 10.1016/j.cmpb.2022.106892] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/15/2022] [Revised: 05/11/2022] [Accepted: 05/12/2022] [Indexed: 05/20/2023]
Abstract
BACKGROUND AND OBJECTIVE: One of the strengths of single-particle cryo-EM compared to other structural determination techniques is its ability to image heterogeneous samples containing multiple molecular species, different oligomeric states or distinct conformations. This is achieved using routines for in-silico 3D classification that are now well established in the field and have successfully been used to characterize the structural heterogeneity of important biomolecules. These techniques, however, rely on expert-user knowledge and trial-and-error experimentation to determine the correct number of conformations, making it a labor-intensive, subjective, and difficult-to-reproduce procedure. METHODS: We propose an approach to address the problem of automatically determining the number of discrete conformations present in heterogeneous single-particle cryo-EM datasets. We do this by systematically evaluating all possible partitions of the data and selecting the result that maximizes the average variance of similarities measured between particle images and the corresponding 3D reconstructions. RESULTS: Using this strategy, we successfully analyzed datasets of heterogeneous protein complexes, including: 1) in-silico mixtures obtained by combining closely related antibody-bound HIV-1 Env trimers and other important membrane channels, and 2) naturally occurring mixtures from diverse and dynamic protein complexes representing varying degrees of structural heterogeneity and conformational plasticity. CONCLUSIONS: The availability of unsupervised strategies for 3D classification, combined with existing approaches for fully automatic pre-processing and 3D refinement, represents an important step towards converting single-particle cryo-EM into a high-throughput technique.
Affiliation(s)
- Ye Zhou
  - Department of Computer Science, Duke University, Durham, NC 27708, USA
- Amit Moscovich
  - Department of Statistics and Operations Research, Tel Aviv University, Tel Aviv, Israel
- Alberto Bartesaghi
  - Department of Computer Science, Duke University, Durham, NC 27708, USA; Department of Biochemistry, Duke University School of Medicine, Durham, NC 27708, USA; Department of Electrical and Computer Engineering, Duke University, Durham, NC 27708, USA.
|
15
|
Seitz E, Acosta-Reyes F, Maji S, Schwander P, Frank J. Recovery of Conformational Continuum From Single-Particle Cryo-EM Images: Optimization of ManifoldEM Informed by Ground Truth. IEEE TRANSACTIONS ON COMPUTATIONAL IMAGING 2022; 8:462-478. [PMID: 36258699 PMCID: PMC9575687 DOI: 10.1109/tci.2022.3174801] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 05/28/2023]
Abstract
This work is based on the manifold-embedding approach to study biological molecules exhibiting continuous conformational changes. Previous work established a method, now termed ManifoldEM, capable of reconstructing 3D movies and accompanying free-energy landscapes from single-particle cryo-EM images of macromolecules exercising multiple conformational degrees of freedom. While ManifoldEM has proven its viability in several experimental studies, critical limitations and uncertainties have been found throughout its extended development and use. Guided by insights from studies with cryo-EM ground-truth data, simulated from atomic structures undergoing conformational changes, we have built a novel framework, ESPER, able to retrieve the free-energy landscape and respective 3D Coulomb potential maps for all states simulated. As shown by a direct comparison of ground truth vs. recovered maps, and analysis of experimental data from the 80S ribosome and ryanodine receptor, ESPER offers substantial improvements relative to the previous work.
Affiliation(s)
- Evan Seitz
  - Department of Biochemistry and Molecular Biophysics, Columbia University Medical Center, New York, NY 10032 USA, and also with the Department of Biological Sciences, Columbia University, New York, NY 10027 USA
- Francisco Acosta-Reyes
  - Department of Biochemistry and Molecular Biophysics, Columbia University Medical Center, New York, NY 10032 USA
- Suvrajit Maji
  - Department of Biochemistry and Molecular Biophysics, Columbia University Medical Center, New York, NY 10032 USA
- Peter Schwander
  - Department of Physics, University of Wisconsin-Milwaukee, Milwaukee, WI 53211 USA
- Joachim Frank
  - Department of Biochemistry and Molecular Biophysics, Columbia University Medical Center, New York, NY 10032 USA, and also with the Department of Biological Sciences, Columbia University, New York, NY 10027 USA
|
16
|
Wang X, Lu Y, Lin X. Heterogeneous cryo-EM projection image classification using a two-stage spectral clustering based on novel distance measures. Brief Bioinform 2022; 23:6543485. [PMID: 35255494 DOI: 10.1093/bib/bbac032] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2021] [Revised: 01/17/2022] [Accepted: 01/23/2022] [Indexed: 11/13/2022] Open
Abstract
Single-particle cryo-electron microscopy (cryo-EM) has become one of the mainstream technologies in the field of structural biology to determine the three-dimensional (3D) structures of biological macromolecules. Heterogeneous cryo-EM projection image classification is an effective way to discover conformational heterogeneity of biological macromolecules in different functional states. However, due to the low signal-to-noise ratio of the projection images, the classification of heterogeneous cryo-EM projection images is a very challenging task. In this paper, two novel distance measures between projection images integrating the reliability of common lines, pixel intensity and class averages are designed, and then a two-stage spectral clustering algorithm based on the two distance measures is proposed for heterogeneous cryo-EM projection image classification. In the first stage, the novel distance measure integrating common lines and pixel intensities of projection images is used to obtain preliminary classification results through spectral clustering. In the second stage, another novel distance measure integrating the first novel distance measure and class averages generated from each group of projection images is used to obtain the final classification results through spectral clustering. The proposed two-stage spectral clustering algorithm is applied on a simulated and a real cryo-EM dataset for heterogeneous reconstruction. Results show that the two novel distance measures can be used to improve the classification performance of spectral clustering, and using the proposed two-stage spectral clustering algorithm can achieve higher classification and reconstruction accuracy than using RELION and XMIPP.
Affiliation(s)
- Xiangwen Wang
  - School of Information Science and Engineering, Lanzhou University, 730000, Lanzhou, China; College of Computer Science and Engineering, Northwest Normal University, 730070, Lanzhou, China
- Yonggang Lu
  - School of Information Science and Engineering, Lanzhou University, 730000, Lanzhou, China
- Xianghong Lin
  - College of Computer Science and Engineering, Northwest Normal University, 730070, Lanzhou, China
|
17
|
Abstract
Correlated motions in proteins arising from the collective movements of residues have long been proposed to be fundamentally important to key properties of proteins, from allostery and catalysis to evolvability. Recent breakthroughs in structural biology have made it possible to capture proteins undergoing complex conformational changes, yet intrinsic correlated motions within a conformation remain one of the least understood facets of protein structure. For many decades, the analysis of total X-ray scattering held the promise of animating crystal structures with correlated motions. With recent advances in both X-ray detectors and data interpretation methods, this long-held promise can now be met. In this Perspective, we will introduce how correlated motions are captured in total scattering and provide guidelines for the collection, interpretation, and validation of data. As structural biology continues to push the boundaries, we see an opportunity to gain atomistic insight into correlated motions using total scattering as a bridge between theory and experiment.
Affiliation(s)
- Da Xu
  - Department of Chemistry and Chemical Biology, Cornell University, 259 East Avenue, Ithaca, New York 14853, United States
- Steve P Meisburger
  - Department of Chemistry and Chemical Biology, Cornell University, 259 East Avenue, Ithaca, New York 14853, United States
- Nozomi Ando
  - Department of Chemistry and Chemical Biology, Cornell University, 259 East Avenue, Ithaca, New York 14853, United States
|
18
|
Punjani A, Fleet DJ. 3D variability analysis: Resolving continuous flexibility and discrete heterogeneity from single particle cryo-EM. J Struct Biol 2021; 213:107702. [PMID: 33582281 DOI: 10.1016/j.jsb.2021.107702] [Citation(s) in RCA: 455] [Impact Index Per Article: 151.7] [Received: 10/07/2020] [Revised: 01/12/2021] [Accepted: 01/26/2021] [Indexed: 01/06/2023]
Abstract
Single particle cryo-EM excels in determining static structures of protein molecules, but existing 3D reconstruction methods have been ineffective in modelling flexible proteins. We introduce 3D variability analysis (3DVA), an algorithm that fits a linear subspace model of conformational change to cryo-EM data at high resolution. 3DVA enables the resolution and visualization of detailed molecular motions of both large and small proteins, revealing new biological insight from single particle cryo-EM data. Experimental results demonstrate the ability of 3DVA to resolve multiple flexible motions of α-helices in the sub-50 kDa transmembrane domain of a GPCR complex, bending modes of a sodium ion channel, five types of symmetric and symmetry-breaking flexibility in a proteasome, large motions in a spliceosome complex, and discrete conformational states of a ribosome assembly. 3DVA is implemented in the cryoSPARC software package.
Affiliation(s)
- Ali Punjani
  - Department of Computer Sciences, University of Toronto M5S 3G4, Canada; Vector Institute, 710-661 University Ave., Toronto M5G 1M1, Canada; Structura Biotechnology Inc., 129-100 College Ave., Toronto M5G 1L5, Canada.
- David J Fleet
  - Department of Computer Sciences, University of Toronto M5S 3G4, Canada; Vector Institute, 710-661 University Ave., Toronto M5G 1M1, Canada.
|
19
|
Zhong ED, Bepler T, Berger B, Davis JH. CryoDRGN: reconstruction of heterogeneous cryo-EM structures using neural networks. Nat Methods 2021; 18:176-185. [PMID: 33542510 PMCID: PMC8183613 DOI: 10.1038/s41592-020-01049-4] [Citation(s) in RCA: 241] [Impact Index Per Article: 80.3] [Received: 03/25/2020] [Accepted: 12/18/2020] [Indexed: 12/18/2022]
Abstract
Cryo-electron microscopy (cryo-EM) single-particle analysis has proven powerful in determining the structures of rigid macromolecules. However, many imaged protein complexes exhibit conformational and compositional heterogeneity that poses a major challenge to existing three-dimensional reconstruction methods. Here, we present cryoDRGN, an algorithm that leverages the representation power of deep neural networks to directly reconstruct continuous distributions of 3D density maps and map per-particle heterogeneity of single-particle cryo-EM datasets. Using cryoDRGN, we uncovered residual heterogeneity in high-resolution datasets of the 80S ribosome and the RAG complex, revealed a new structural state of the assembling 50S ribosome, and visualized large-scale continuous motions of a spliceosome complex. CryoDRGN contains interactive tools to visualize a dataset's distribution of per-particle variability, generate density maps for exploratory analysis, extract particle subsets for use with other tools and generate trajectories to visualize molecular motions. CryoDRGN is open-source software freely available at http://cryodrgn.csail.mit.edu .
Affiliation(s)
- Ellen D Zhong
  - Computational and Systems Biology, Massachusetts Institute of Technology, Cambridge, MA, USA; Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
- Tristan Bepler
  - Computational and Systems Biology, Massachusetts Institute of Technology, Cambridge, MA, USA; Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
- Bonnie Berger
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA; Department of Mathematics, Massachusetts Institute of Technology, Cambridge, MA, USA.
- Joseph H Davis
  - Computational and Systems Biology, Massachusetts Institute of Technology, Cambridge, MA, USA; Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA.
|
20
|
Slater TJA, Wang YC, Leteba GM, Quiroz J, Camargo PHC, Haigh SJ, Allen CS. Automated Single-Particle Reconstruction of Heterogeneous Inorganic Nanoparticles. MICROSCOPY AND MICROANALYSIS : THE OFFICIAL JOURNAL OF MICROSCOPY SOCIETY OF AMERICA, MICROBEAM ANALYSIS SOCIETY, MICROSCOPICAL SOCIETY OF CANADA 2020; 26:1168-1175. [PMID: 33176893 DOI: 10.1017/s1431927620024642] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/11/2023]
Abstract
Single-particle reconstruction can be used to perform three-dimensional (3D) imaging of homogeneous populations of nano-sized objects, in particular viruses and proteins. Here, it is demonstrated that it can also be used to obtain 3D reconstructions of heterogeneous populations of inorganic nanoparticles. An automated acquisition scheme in a scanning transmission electron microscope is used to collect images of thousands of nanoparticles. Particle images are subsequently semi-automatically clustered in terms of their properties and separate 3D reconstructions are performed from selected particle image clusters. The result is a 3D dataset that is representative of the full population. The study demonstrates a methodology that allows 3D imaging and analysis of inorganic nanoparticles in a fully automated manner that is truly representative of large particle populations.
Affiliation(s)
- Thomas J A Slater
  - Electron Physical Sciences Imaging Centre, Diamond Light Source Ltd., Oxfordshire OX11 0DE, UK
- Yi-Chi Wang
  - School of Materials, University of Manchester, Oxford Road, Manchester M13 9PL, UK
  - Chinese Academy of Sciences, Beijing Institute of Nanoenergy and Nanosystems, Beijing 100083, P.R. China
- Gerard M Leteba
  - Department of Chemical Engineering, Catalysis Institute, University of Cape Town, Rondebosch 7701, South Africa
- Jhon Quiroz
  - Department of Chemistry, University of Helsinki, A.I. Virtasen aukio 1, Helsinki, Finland
- Pedro H C Camargo
  - Department of Chemistry, University of Helsinki, A.I. Virtasen aukio 1, Helsinki, Finland
- Sarah J Haigh
  - School of Materials, University of Manchester, Oxford Road, Manchester M13 9PL, UK
- Christopher S Allen
  - Electron Physical Sciences Imaging Centre, Diamond Light Source Ltd., Oxfordshire OX11 0DE, UK
  - Department of Materials, University of Oxford, Parks Road, Oxford OX1 3PH, UK
|
21
|
Poitevin F, Kushner A, Li X, Dao Duc K. Structural Heterogeneities of the Ribosome: New Frontiers and Opportunities for Cryo-EM. Molecules 2020; 25:E4262. [PMID: 32957592 PMCID: PMC7570653 DOI: 10.3390/molecules25184262] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 08/25/2020] [Revised: 09/11/2020] [Accepted: 09/15/2020] [Indexed: 12/18/2022] Open
Abstract
The extent of ribosomal heterogeneity has caught increasing interest over the past few years, as recent studies have highlighted the presence of structural variations of the ribosome. More precisely, the heterogeneity of the ribosome covers multiple scales, including the dynamical aspects of ribosomal motion at the single particle level, specialization at the cellular and subcellular scale, or evolutionary differences across species. Upon solving the ribosome atomic structure at medium to high resolution, cryogenic electron microscopy (cryo-EM) has enabled investigating all these forms of heterogeneity. In this review, we present some recent advances in quantifying ribosome heterogeneity, with a focus on the conformational and evolutionary variations of the ribosome and their functional implications. These efforts highlight the need for new computational methods and comparative tools, to comprehensively model the continuous conformational transition pathways of the ribosome, as well as its evolution. While developing these methods presents some important challenges, it also provides an opportunity to extend our interpretation and usage of cryo-EM data, which would more generally benefit the study of molecular dynamics and evolution of proteins and other complexes.
Affiliation(s)
- Frédéric Poitevin
  - Department of LCLS Data Analytics, Linac Coherent Light Source, SLAC National Accelerator Laboratory, Menlo Park, CA 94025, USA
- Artem Kushner
  - Department of Mathematics, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
  - Department of Computer Science, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
- Xinpei Li
  - Department of Mathematics, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
  - Department of Computer Science, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
- Khanh Dao Duc
  - Department of Mathematics, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
  - Department of Computer Science, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
  - Department of Zoology, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
|
22
|
Ortiz S, Stanisic L, Rodriguez BA, Rampp M, Hummer G, Cossio P. Validation tests for cryo-EM maps using an independent particle set. J Struct Biol X 2020; 4:100032. [PMID: 32743544 PMCID: PMC7385033 DOI: 10.1016/j.yjsbx.2020.100032] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/27/2022] Open
Abstract
Cryo-electron microscopy (cryo-EM) has revolutionized structural biology by providing 3D density maps of biomolecules at near-atomic resolution. However, map validation is still an open issue. Despite several efforts from the community, it is possible to overfit 3D maps to noisy data. Here, we develop a novel methodology that uses a small independent particle set (not used during the 3D refinement) to validate the maps. The main idea is to monitor how the map probability evolves over the control set during the 3D refinement. The method is complementary to the gold-standard procedure, which generates two reconstructions at each iteration. We low-pass filter the two reconstructions for different frequency cutoffs, and we calculate the probability of each filtered map given the control set. For high-quality maps, the probability should increase as a function of the frequency cutoff and the refinement iteration. We also compute the similarity between the densities of probability distributions of the two reconstructions. As higher frequencies are included, the distributions become more dissimilar. We optimized the BioEM package to perform these calculations, and tested it over systems ranging from quality data to pure noise. Our results show that with our methodology, it is possible to discriminate datasets that are constructed from noise particles. We conclude that validation against a control particle set provides a powerful tool to assess the quality of cryo-EM maps.
Affiliation(s)
- Sebastian Ortiz
  - Biophysics of Tropical Diseases, Max Planck Tandem Group, University of Antioquia UdeA, Calle 70 No. 52-21, Medellín, Colombia
- Luka Stanisic
  - Max Planck Computing and Data Facility, 85748 Garching, Germany
- Boris A Rodriguez
  - Grupo de Física Atómica y Molecular, Instituto de Física, Facultad de Ciencias Exactas y Naturales, Universidad de Antioquia UdeA, Calle 70 No. 52-21, Medellín, Colombia
- Markus Rampp
  - Max Planck Computing and Data Facility, 85748 Garching, Germany
- Gerhard Hummer
  - Department of Theoretical Biophysics, Max Planck Institute of Biophysics, 60438 Frankfurt am Main, Germany
  - Institute of Biophysics, Goethe University, 60438 Frankfurt am Main, Germany
- Pilar Cossio
  - Biophysics of Tropical Diseases, Max Planck Tandem Group, University of Antioquia UdeA, Calle 70 No. 52-21, Medellín, Colombia
  - Department of Theoretical Biophysics, Max Planck Institute of Biophysics, 60438 Frankfurt am Main, Germany
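The low-pass filtering step mentioned in the abstract of Ortiz et al. (filtering a reconstruction at several frequency cutoffs before scoring it) can be illustrated with a minimal NumPy sketch. This is not the BioEM implementation: the function name `low_pass_filter`, the hard spherical cutoff, and the cutoff values are assumptions chosen for illustration, and the probability calculation against the control set is not shown.

```python
import numpy as np


def low_pass_filter(volume, cutoff):
    """Zero all Fourier components whose radial frequency exceeds `cutoff`.

    `cutoff` is in cycles per voxel (0.5 is the Nyquist frequency along an axis).
    A hard spherical mask is an assumption of this sketch; a smooth roll-off
    could be substituted.
    """
    # Frequency coordinate of every Fourier voxel along each axis.
    freqs = [np.fft.fftfreq(n) for n in volume.shape]
    grid = np.meshgrid(*freqs, indexing="ij")
    radius = np.sqrt(sum(g ** 2 for g in grid))
    mask = radius <= cutoff
    # The mask is symmetric under frequency negation, so the result is real.
    return np.fft.ifftn(np.fft.fftn(volume) * mask).real


# Hypothetical usage: filter one map at increasing cutoffs.
rng = np.random.default_rng(0)
demo_map = rng.normal(size=(16, 16, 16))
filtered_maps = {c: low_pass_filter(demo_map, c) for c in (0.1, 0.2, 0.3, 0.4)}
```

Each entry of `filtered_maps` would then be scored separately; how that score is computed depends on the validation method and is outside this sketch.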
|
23
|
Abstract
Single-particle electron cryomicroscopy (cryo-EM) is an increasingly popular technique for elucidating the three-dimensional structure of proteins and other biologically significant complexes at near-atomic resolution. It is an imaging method that does not require crystallization and can capture molecules in their native states. In single-particle cryo-EM, the three-dimensional molecular structure needs to be determined from many noisy two-dimensional tomographic projections of individual molecules, whose orientations and positions are unknown. The high level of noise and the unknown pose parameters are two key elements that make reconstruction a challenging computational problem. Even more challenging is the inference of structural variability and flexible motions when the individual molecules being imaged are in different conformational states. This review discusses computational methods for structure determination by single-particle cryo-EM and their guiding principles from statistical inference, machine learning, and signal processing that also play a significant role in many other data science applications.
Affiliation(s)
- Amit Singer
  - Department of Mathematics and Program in Applied and Computational Mathematics, Princeton University, Princeton, NJ 08544, USA
- Fred J Sigworth
  - Departments of Cellular and Molecular Physiology, Biomedical Engineering, and Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA
|
24
|
Maji S, Liao H, Dashti A, Mashayekhi G, Ourmazd A, Frank J. Propagation of Conformational Coordinates Across Angular Space in Mapping the Continuum of States from Cryo-EM Data by Manifold Embedding. J Chem Inf Model 2020; 60:2484-2491. [PMID: 32207941 PMCID: PMC7466846 DOI: 10.1021/acs.jcim.9b01115] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Indexed: 11/29/2022]
Abstract
Recent approaches to the study of biological molecules employ manifold learning to single-particle cryo-EM data sets to map the continuum of states of a molecule into a low-dimensional space spanned by eigenvectors or "conformational coordinates". This is done separately for each projection direction (PD) on an angular grid. One important step in deriving a consolidated map of occupancies, from which the free energy landscape of the molecule can be derived, is to propagate the conformational coordinates from a given choice of "anchor PD" across the entire angular space. Even when one eigenvector dominates, its sign might invert from one PD to the next. The propagation of the second eigenvector is particularly challenging when eigenvalues of the second and third eigenvector are closely matched, leading to occasional inversions in their ranking as we move across the angular grid. In the absence of a computational approach, this propagation across the angular space has been done thus far "by hand" using visual clues, thus greatly limiting the general use of the technique. In this work we have developed a method that is able to solve the propagation problem computationally, by using optical flow and a probabilistic graphical model. We demonstrate its utility by selected examples.
Affiliation(s)
- Suvrajit Maji
  - Department of Biochemistry and Molecular Biophysics, Columbia University, 650 West 168 Street, New York, NY 10032, USA
- Hstau Liao
  - Department of Biochemistry and Molecular Biophysics, Columbia University, 650 West 168 Street, New York, NY 10032, USA
- Ali Dashti
  - Department of Physics, University of Wisconsin Milwaukee, 3135 N. Maryland Ave, Milwaukee, WI 53211, USA
- Ghoncheh Mashayekhi
  - Department of Physics, University of Wisconsin Milwaukee, 3135 N. Maryland Ave, Milwaukee, WI 53211, USA
- Abbas Ourmazd
  - Department of Physics, University of Wisconsin Milwaukee, 3135 N. Maryland Ave, Milwaukee, WI 53211, USA
- Joachim Frank
  - Department of Biochemistry and Molecular Biophysics, Columbia University, 650 West 168 Street, New York, NY 10032, USA
  - Department of Biological Sciences, Columbia University, 600 Fairchild Center, New York, NY 10027, USA
|
25
|
Zelesko N, Moscovich A, Kileel J, Singer A. EARTHMOVER-BASED MANIFOLD LEARNING FOR ANALYZING MOLECULAR CONFORMATION SPACES. PROCEEDINGS. IEEE INTERNATIONAL SYMPOSIUM ON BIOMEDICAL IMAGING 2020; 2020:1715-1719. [PMID: 36570366 PMCID: PMC9788962 DOI: 10.1109/isbi45749.2020.9098723] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 12/27/2022]
Abstract
In this paper, we propose a novel approach for manifold learning that combines the Earthmover's distance (EMD) with the diffusion maps method for dimensionality reduction. We demonstrate the potential benefits of this approach for learning shape spaces of proteins and other flexible macromolecules using a simulated dataset of 3-D density maps that mimic the non-uniform rotary motion of ATP synthase. Our results show that EMD-based diffusion maps require far fewer samples to recover the intrinsic geometry than the standard diffusion maps algorithm that is based on the Euclidean distance. To reduce the computational burden of calculating the EMD for all volume pairs, we employ a wavelet-based approximation to the EMD which reduces the computation of the pairwise EMDs to a computation of pairwise weighted-ℓ1 distances between wavelet coefficient vectors.
Affiliation(s)
- Amit Moscovich
  - Program in Applied and Computational Mathematics, Princeton University
- Joe Kileel
  - Program in Applied and Computational Mathematics, Princeton University
- Amit Singer
  - Program in Applied and Computational Mathematics, Princeton University; Department of Mathematics, Princeton University
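The pipeline outlined in the abstract of Zelesko et al. (pairwise weighted-ℓ1 distances between coefficient vectors, fed into diffusion maps) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' code: the function names, the Gaussian kernel, the bandwidth `epsilon`, and the random placeholder coefficients and weights are all hypothetical, and no wavelet transform is performed here.

```python
import numpy as np


def weighted_l1_distances(coeffs, weights):
    """Pairwise weighted-l1 distances.

    coeffs:  (n_samples, n_coeffs) array, one coefficient vector per volume.
    weights: (n_coeffs,) non-negative weight per coefficient.
    """
    diff = np.abs(coeffs[:, None, :] - coeffs[None, :, :])
    return diff @ weights  # shape (n_samples, n_samples)


def diffusion_map(dist, epsilon, n_components=2):
    """Diffusion-map embedding from a distance matrix (Gaussian kernel assumed)."""
    kernel = np.exp(-dist ** 2 / epsilon)
    degree = kernel.sum(axis=1)
    # D^{-1/2} K D^{-1/2} is symmetric and has the same eigenvalues
    # as the row-stochastic Markov matrix D^{-1} K.
    sym = kernel / np.sqrt(np.outer(degree, degree))
    eigvals, eigvecs = np.linalg.eigh(sym)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Convert to right eigenvectors of the Markov matrix.
    psi = eigvecs / np.sqrt(degree)[:, None]
    # Drop the trivial constant eigenvector (eigenvalue 1).
    keep = slice(1, n_components + 1)
    return eigvals[keep], psi[:, keep] * eigvals[keep]


# Hypothetical usage with placeholder data standing in for wavelet coefficients.
rng = np.random.default_rng(0)
placeholder_coeffs = rng.normal(size=(30, 16))
placeholder_weights = 2.0 ** -np.arange(16)  # illustrative scale weights only
distances = weighted_l1_distances(placeholder_coeffs, placeholder_weights)
spectrum, embedding = diffusion_map(distances, epsilon=np.median(distances) ** 2)
```

The appeal of this structure is that the expensive pairwise transport problem is replaced by a single transform per volume followed by cheap vector distances; the specific weights that make the ℓ1 distance approximate the EMD are not reproduced here.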
|
26
|
Bendory T, Bartesaghi A, Singer A. Single-particle cryo-electron microscopy: Mathematical theory, computational challenges, and opportunities. IEEE SIGNAL PROCESSING MAGAZINE 2020; 37:58-76. [PMID: 32395065 PMCID: PMC7213211 DOI: 10.1109/msp.2019.2957822] [Citation(s) in RCA: 36] [Impact Index Per Article: 9.0] [Indexed: 05/09/2023]
Abstract
In recent years, an abundance of new molecular structures have been elucidated using cryo-electron microscopy (cryo-EM), largely due to advances in hardware technology and data processing techniques. Owing to these new exciting developments, cryo-EM was selected by Nature Methods as Method of the Year 2015, and the Nobel Prize in Chemistry 2017 was awarded to three pioneers in the field. The main goal of this article is to introduce the challenging and exciting computational tasks involved in reconstructing 3-D molecular structures by cryo-EM. Determining molecular structures requires a wide range of computational tools in a variety of fields, including signal processing, estimation and detection theory, high-dimensional statistics, convex and non-convex optimization, spectral algorithms, dimensionality reduction, and machine learning. The tools from these fields must be adapted to work under exceptionally challenging conditions, including extreme noise levels, the presence of missing data, and massive datasets as large as several terabytes. In addition, we present two statistical models, multi-reference alignment and multi-target detection, that abstract away many of the intricacies of cryo-EM while retaining some of its essential features. Based on these abstractions, we discuss some recent intriguing results in the mathematical theory of cryo-EM, and delineate relations with group theory, invariant theory, and information theory.
Affiliation(s)
- Tamir Bendory
  - Tel Aviv University, Electrical Engineering, Tel Aviv, Israel
- Alberto Bartesaghi
  - Computer Science, Biochemistry, and Electrical and Computer Engineering, Durham, NC, USA, Duke University
- Amit Singer
  - Princeton University, Applied and Computational Mathematics, Princeton, NJ USA
|
27
|
Abstract
Cross-validation is used to determine the validity of a model on unseen data by assessing if the model is overfitted to noise. It is widely used in many fields, from artificial intelligence to structural biology in X-ray crystallography and nuclear magnetic resonance. Although there are concerns of map overfitting in cryo-electron microscopy (cryo-EM), cross-validation is rarely used. The problem is that establishing a performance metric of the maps over unseen data (given by 2D-projection images) is difficult due to the low signal-to-noise ratios in the individual particles. Here, I present recent advances for cryo-EM map reconstruction. I highlight that the gold-standard procedure can fail to detect map overfitting in certain cases, showing the necessity of assessing the map quality on unbiased data. Finally, I describe the challenges and advantages of developing a robust cross-validation methodology for cryo-EM.
Affiliation(s)
- Pilar Cossio
  - Biophysics of Tropical Diseases, Max Planck Tandem Group, University of Antioquia UdeA, Calle 70 No. 52-21, Medellin, Colombia; Department of Theoretical Biophysics, Max Planck Institute of Biophysics, 60438 Frankfurt am Main, Germany
|