1
Abstract
Cryo-electron microscopy (CryoEM) has become a vital technique in structural biology. It is an interdisciplinary field that draws on advances in biochemistry, physics, and image processing, among other disciplines. Innovations in these three pillars have driven the rapid growth of CryoEM over the past decade. This work reviews the main image-processing contributions to the current reconstruction workflow of single particle analysis (SPA) by CryoEM. Our review emphasizes how the algorithms at each step of the workflow have evolved over time, distinguishing two groups of approaches: analytical methods and deep learning algorithms. We present an analysis of the current state of the art. Finally, we discuss the emerging problems and challenges that remain to be addressed as CryoEM image processing methods for SPA evolve.
Affiliation(s)
- Jose Luis Vilas
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Darwin, 3, Campus Universidad Autonoma, 28049 Cantoblanco, Madrid, Spain
- Jose Maria Carazo
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Darwin, 3, Campus Universidad Autonoma, 28049 Cantoblanco, Madrid, Spain
- Carlos Oscar S. Sorzano
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Darwin, 3, Campus Universidad Autonoma, 28049 Cantoblanco, Madrid, Spain
2
Sorzano COS, Jiménez-Moreno A, Maluenda D, Martínez M, Ramírez-Aportela E, Krieger J, Melero R, Cuervo A, Conesa J, Filipovic J, Conesa P, del Caño L, Fonseca YC, Jiménez-de la Morena J, Losana P, Sánchez-García R, Strelak D, Fernández-Giménez E, de Isidro-Gómez FP, Herreros D, Vilas JL, Marabini R, Carazo JM. On bias, variance, overfitting, gold standard and consensus in single-particle analysis by cryo-electron microscopy. Acta Crystallogr D Struct Biol 2022; 78:410-423. [PMID: 35362465 PMCID: PMC8972802 DOI: 10.1107/s2059798322001978] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 08/09/2021] [Accepted: 02/18/2022] [Indexed: 12/05/2022]
Abstract
Cryo-electron microscopy (cryoEM) has become a well-established technique to elucidate the 3D structures of biological macromolecules. Projection images from thousands of macromolecules that are assumed to be structurally identical are combined into a single 3D map representing the Coulomb potential of the macromolecule under study. This article discusses possible caveats along the image-processing path and how to avoid them to obtain a reliable 3D structure. Some of these problems are very well known in the community and may be referred to as sample-related (such as specimen denaturation at interfaces or non-uniform projection geometry leading to underrepresented projection directions). The rest are related to the algorithms used. While some have been discussed in depth in the literature, such as the use of an incorrect initial volume, others have received much less attention even though they are fundamental in any data-analysis approach. Chief among them are instabilities in estimating many of the key parameters required for a correct 3D reconstruction; these instabilities occur all along the processing workflow and may significantly affect the reliability of the whole process. In the field, the term overfitting has been coined to refer to some particular kinds of artifacts. It is argued that overfitting is a statistical bias in key parameter-estimation steps in the 3D reconstruction process, including intrinsic algorithmic bias. It is also shown that common tools (Fourier shell correlation) and strategies (gold standard) that are normally used to detect or prevent overfitting do not fully protect against it. Instead, it is proposed that detecting the bias that leads to overfitting is much easier when addressed at the level of parameter estimation, rather than once the particle images have been combined into a 3D map.
Comparing the results from multiple algorithms (or at least, independent executions of the same algorithm) can detect parameter bias. These multiple executions could then be averaged to give a lower variance estimate of the underlying parameters.
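The consensus idea in the last two sentences can be illustrated with a minimal sketch. It assumes a single scalar parameter per particle (for example a defocus value) estimated by several independent runs; the function name `consensus`, the `tolerance` threshold, and the sample numbers are hypothetical and are not taken from the article.

```python
from statistics import mean

def consensus(estimates_per_run, tolerance):
    """Combine per-particle estimates from several independent runs.

    estimates_per_run: list of runs, each a list with one value per particle.
    tolerance: hypothetical threshold on the spread (max - min) across runs.
    Returns (averaged, unstable): the per-particle mean across runs, and the
    indices of particles whose estimates disagree by more than `tolerance`.
    """
    averaged, unstable = [], []
    # zip(*runs) regroups the data so that each tuple holds one particle's
    # estimates from every run.
    for idx, values in enumerate(zip(*estimates_per_run)):
        averaged.append(mean(values))
        if max(values) - min(values) > tolerance:
            unstable.append(idx)
    return averaged, unstable

# Three hypothetical runs, three particles each (arbitrary units).
runs = [
    [1.50, 2.10, 1.80],
    [1.52, 2.08, 2.60],
    [1.49, 2.12, 1.75],
]
averaged, unstable = consensus(runs, tolerance=0.2)
print(averaged)
print(unstable)  # only the third particle (index 2) is flagged
```

Averaging reduces variance only for particles on which the runs agree; the flagged particles are the ones where the estimate itself is unreliable. Angular parameters would need a rotation-aware average rather than the arithmetic mean used here.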
Affiliation(s)
- C. O. S. Sorzano
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Calle Darwin 3, 28049 Cantoblanco, Madrid, Spain
- A. Jiménez-Moreno
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Calle Darwin 3, 28049 Cantoblanco, Madrid, Spain
- D. Maluenda
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Calle Darwin 3, 28049 Cantoblanco, Madrid, Spain
- M. Martínez
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Calle Darwin 3, 28049 Cantoblanco, Madrid, Spain
- E. Ramírez-Aportela
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Calle Darwin 3, 28049 Cantoblanco, Madrid, Spain
- J. Krieger
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Calle Darwin 3, 28049 Cantoblanco, Madrid, Spain
- R. Melero
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Calle Darwin 3, 28049 Cantoblanco, Madrid, Spain
- A. Cuervo
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Calle Darwin 3, 28049 Cantoblanco, Madrid, Spain
- J. Conesa
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Calle Darwin 3, 28049 Cantoblanco, Madrid, Spain
- P. Conesa
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Calle Darwin 3, 28049 Cantoblanco, Madrid, Spain
- L. del Caño
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Calle Darwin 3, 28049 Cantoblanco, Madrid, Spain
- Y. C. Fonseca
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Calle Darwin 3, 28049 Cantoblanco, Madrid, Spain
- J. Jiménez-de la Morena
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Calle Darwin 3, 28049 Cantoblanco, Madrid, Spain
- P. Losana
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Calle Darwin 3, 28049 Cantoblanco, Madrid, Spain
- R. Sánchez-García
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Calle Darwin 3, 28049 Cantoblanco, Madrid, Spain
- D. Strelak
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Calle Darwin 3, 28049 Cantoblanco, Madrid, Spain
  - Masaryk University, Brno, Czech Republic
- E. Fernández-Giménez
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Calle Darwin 3, 28049 Cantoblanco, Madrid, Spain
- F. P. de Isidro-Gómez
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Calle Darwin 3, 28049 Cantoblanco, Madrid, Spain
- D. Herreros
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Calle Darwin 3, 28049 Cantoblanco, Madrid, Spain
- J. L. Vilas
  - School of Engineering and Applied Science, Yale University, New Haven, CT 06520-829, USA
- R. Marabini
  - Escuela Politecnica Superior, Universidad Autónoma de Madrid, 28049 Cantoblanco, Madrid, Spain
- J. M. Carazo
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Calle Darwin 3, 28049 Cantoblanco, Madrid, Spain
4
A Survey of the Use of Iterative Reconstruction Algorithms in Electron Microscopy. BioMed Research International 2017; 2017:6482567. [PMID: 29312997 PMCID: PMC5623807 DOI: 10.1155/2017/6482567] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 09/16/2016] [Accepted: 03/09/2017] [Indexed: 11/18/2022]
Abstract
One of the key steps in Electron Microscopy is the tomographic reconstruction of a three-dimensional (3D) map of the specimen being studied from a set of two-dimensional (2D) projections acquired at the microscope. This tomographic reconstruction may be performed with different reconstruction algorithms that can be grouped into several large families: direct Fourier inversion methods, back-projection methods, Radon methods, or iterative algorithms. In this review, we focus on the last of these families, explaining the mathematical rationale behind its different algorithms as they have been introduced in the field of Electron Microscopy. We cover their use in Single Particle Analysis (SPA) as well as in Electron Tomography (ET).
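As a concrete instance of the iterative family, the sketch below implements the algebraic reconstruction technique (ART, the Kaczmarz row-action method) on a toy problem. The tiny 2x2 "image", its five ray sums, and the names `art`, `sweeps`, and `relaxation` are illustrative assumptions, not material from the survey.

```python
def art(rows, measurements, n_unknowns, sweeps=500, relaxation=1.0):
    """Algebraic reconstruction technique (Kaczmarz row-action iterations).

    rows: list of projection rows; rows[i][j] is the weight of pixel j in ray i.
    measurements: measured ray sums, one per row.
    Each step projects the current estimate onto the hyperplane defined by
    a single ray equation, scaled by the relaxation factor.
    """
    x = [0.0] * n_unknowns
    for _ in range(sweeps):
        for a, b in zip(rows, measurements):
            norm2 = sum(v * v for v in a)
            if norm2 == 0.0:
                continue  # an empty ray carries no information
            residual = b - sum(ai * xi for ai, xi in zip(a, x))
            step = relaxation * residual / norm2
            x = [xi + step * ai for xi, ai in zip(x, a)]
    return x

# Toy 2x2 image, pixels ordered [top-left, top-right, bottom-left, bottom-right].
true_image = [1.0, 2.0, 3.0, 4.0]
rows = [
    [1, 1, 0, 0],  # top row sum
    [0, 0, 1, 1],  # bottom row sum
    [1, 0, 1, 0],  # left column sum
    [0, 1, 0, 1],  # right column sum
    [1, 0, 0, 1],  # main diagonal sum
]
measurements = [sum(a * p for a, p in zip(row, true_image)) for row in rows]
estimate = art(rows, measurements, n_unknowns=4)
print([round(v, 6) for v in estimate])  # close to [1.0, 2.0, 3.0, 4.0]
```

The five rays make this toy system consistent and full rank, so the iterations converge to the true image. With noisy or incomplete data, a relaxation factor below 1 and an early stopping rule are the usual safeguards.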
6
Schrapp MJ, Herman GT. Data fusion in X-ray computed tomography using a superiorization approach. The Review of Scientific Instruments 2014; 85:053701. [PMID: 24880376 DOI: 10.1063/1.4872378] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 06/03/2023]
Abstract
X-ray computed tomography (CT) is an important and widespread inspection technique in industrial non-destructive testing. However, large and heavily absorbing objects cause artifacts, due either to the lack of penetration of the specimen in specific directions or to data being available from only a limited angular range of views. In such cases, valuable information about the specimen is not revealed by the CT measurements alone. Further imaging modalities, such as optical scanning and ultrasonic testing, are able to provide data (such as an edge map) that are complementary to the CT acquisition. In this paper, a superiorization approach (a newly developed method for constrained optimization) is used to incorporate the complementary data into the CT reconstruction; this allows precise localization of edges that are not resolvable from the CT data alone. Superiorization, as presented in this paper, exploits the fact that the simultaneous algebraic reconstruction technique (SART), often used for CT reconstruction, is resilient to perturbations; i.e., it can be modified to produce an output that is as consistent with the CT measurements as the output of unmodified SART, but is more consistent with the complementary data. The application of this superiorized SART method to measured data of a turbine blade demonstrates a clear improvement in the quality of the reconstructed image.
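The perturbation-resilience idea can be sketched on a toy problem. This is not the paper's method: a plain row-action ART sweep stands in for SART, and the secondary criterion `phi` is assumed to be the squared distance to a hypothetical reference image standing in for the complementary data. All names, the step schedule `gamma ** k`, and the numbers are illustrative assumptions.

```python
import math

def kaczmarz_sweep(x, rows, b):
    """One pass of row-action projections (ART), a stand-in for SART here."""
    for a, bi in zip(rows, b):
        norm2 = sum(v * v for v in a)
        residual = bi - sum(ai * xi for ai, xi in zip(a, x))
        x = [xi + residual / norm2 * ai for xi, ai in zip(x, a)]
    return x

def phi(x, reference):
    """Assumed secondary criterion: squared distance to the complementary data."""
    return sum((xi - ri) ** 2 for xi, ri in zip(x, reference))

def reconstruct(rows, b, n, reference=None, sweeps=300, gamma=0.5):
    """Run ART; if `reference` is given, interleave superiorization steps."""
    x = [0.0] * n
    for k in range(sweeps):
        if reference is not None:
            # Perturbation toward lower phi. Step sizes gamma**k are summable,
            # so the perturbations fade and the data-consistency iterations
            # still dominate in the long run.
            diff = [ri - xi for xi, ri in zip(x, reference)]
            dist = math.sqrt(sum(d * d for d in diff))
            if dist > 0.0:
                step = min(gamma ** k, dist)  # never overshoot the reference
                x = [xi + step * d / dist for xi, d in zip(x, diff)]
        x = kaczmarz_sweep(x, rows, b)
    return x

def max_residual(x, rows, b):
    return max(abs(bi - sum(ai * xi for ai, xi in zip(a, x)))
               for a, bi in zip(rows, b))

# Underdetermined toy case: only row and column sums of a 2x2 image.
rows = [[1, 1, 0, 0], [0, 0, 1, 1], [1, 0, 1, 0], [0, 1, 0, 1]]
b = [1.0, 1.0, 1.0, 1.0]
reference = [1.0, 0.0, 0.0, 1.0]  # hypothetical complementary data

plain = reconstruct(rows, b, 4)
superiorized = reconstruct(rows, b, 4, reference=reference)
print(max_residual(plain, rows, b), phi(plain, reference))
print(max_residual(superiorized, rows, b), phi(superiorized, reference))
```

Both outputs fit the measurements equally well, but the superiorized one scores lower on the secondary criterion, which is the behavior the abstract describes. The measurements alone cannot distinguish the two images because they differ only in a direction the rays do not see.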
Affiliation(s)
- Michael J Schrapp
  - Siemens AG, CT Munich, Germany and Physics Department E21, Technical University of Munich, Munich, Germany
- Gabor T Herman
  - Department of Computer Science, The Graduate Center, City University of New York, New York, New York 10016, USA
7
Abstract
Three-dimensional (3D) reconstruction of an object's mass density from the set of its 2D line projections lies at the core of both the single-particle reconstruction technique and electron tomography. Both techniques use an electron microscope to collect a set of projections of either multiple objects representing in principle the same macromolecular complex in an isolated form, or a subcellular structure isolated in situ. Therefore, the goal of macromolecular electron microscopy is to invert the projection transformation to recover the distribution of the mass density of the original object. The problem is interesting in that, in its discrete form, it is ill-posed and not invertible. Various algorithms have been proposed to cope with the practical difficulties of this inversion problem, and they differ widely in their robustness with respect to noise in the data, completeness of the collected projection dataset, errors in projection orientation parameters, ability to handle large datasets efficiently, and other obstacles typically encountered in molecular electron microscopy. Here, we review the theoretical foundations of 3D reconstruction from line projections, followed by an overview of reconstruction algorithms routinely used in the practice of electron microscopy.
Affiliation(s)
- Pawel A Penczek
  - Department of Biochemistry and Molecular Biology, The University of Texas, Houston Medical School, Houston, Texas, USA
8
Abstract
Image restoration techniques are used to obtain, given experimental measurements, the best possible approximation of the original object within the limits imposed by instrumental conditions and the noise level in the data. In molecular electron microscopy (EM), we are mainly interested in linear methods that preserve the respective relationships between mass densities within the restored map. Here, we describe the methodology of image restoration in structural EM and, more specifically, focus on the problem of the optimum recovery of Fourier amplitudes given electron microscope data collected under various defocus settings. We discuss in detail two classes of commonly used linear methods. The first consists of methods based on pseudoinverse restoration and is further subdivided into mean-square error, chi-square error, and constraint-based restorations, where the methods in the latter two subclasses explicitly incorporate the non-white distribution of noise in the data. The second class of methods is based on the Wiener filtration approach. We show that the Wiener filter-based methodology can be used to obtain a solution to the problem of amplitude correction (or "sharpening") of the EM map that makes it visually comparable to maps determined by X-ray crystallography, and thus amenable to comparative interpretation. Finally, we present a semiheuristic Wiener filter-based solution to the problem of image restoration given sets of heterogeneous solutions. We conclude the chapter with a discussion of image restoration protocols implemented in commonly used single particle software packages.
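To make the Wiener approach concrete, the sketch below merges observations of one Fourier coefficient recorded at several defocus settings. It assumes the standard multi-image Wiener form with a single spectral signal-to-noise ratio shared by all images; the function name `wiener_merge` and the numeric values are hypothetical and do not come from the chapter.

```python
def wiener_merge(observations, ctfs, snr):
    """Wiener estimate of one Fourier coefficient from several defocus images.

    observations: complex Fourier coefficients G_i measured at one frequency.
    ctfs: real contrast transfer function values CTF_i at that frequency.
    snr: assumed spectral signal-to-noise ratio, common to all images.
    Computes sum(CTF_i * G_i) / (sum(CTF_i**2) + 1/snr).
    """
    numerator = sum(c * g for c, g in zip(ctfs, observations))
    denominator = sum(c * c for c in ctfs) + 1.0 / snr
    return numerator / denominator

# Noise-free toy data: each image sees the true coefficient scaled by its CTF.
true_coefficient = 2.0 + 1.0j
ctfs = [0.9, -0.4, 0.05]  # different defocus values, one near a CTF zero
observations = [c * true_coefficient for c in ctfs]

estimate = wiener_merge(observations, ctfs, snr=1e6)
print(estimate)  # close to (2+1j) at this high SNR

# At a frequency where every CTF is zero the estimate falls to zero
# instead of dividing by zero, thanks to the 1/snr term.
print(wiener_merge([0j], [0.0], snr=1.0))
```

The 1/snr term is what separates this from a plain pseudoinverse: it damps frequencies where the combined CTF power is small compared with the noise, while images with complementary defocus fill in each other's CTF zeros.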
Affiliation(s)
- Pawel A Penczek
  - Department of Biochemistry and Molecular Biology, The University of Texas, Houston Medical School, Houston, Texas, USA