1
Zhang H, Li H, Zhang F, Zhu P. A strategy combining denoising and cryo-EM single particle analysis. Brief Bioinform 2023; 24:7140293. [PMID: 37096633 DOI: 10.1093/bib/bbad148] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2023] [Revised: 02/21/2023] [Accepted: 03/28/2023] [Indexed: 04/26/2023] [Open Access]
Abstract
In cryogenic electron microscopy (cryo-EM) single particle analysis (SPA), high-resolution three-dimensional structures of biological macromolecules are determined by iteratively aligning and averaging a large number of two-dimensional projections of molecules. Since the correlation measures are sensitive to the signal-to-noise ratio, various parameter estimation steps in SPA are disturbed by the high-intensity noise in cryo-EM. However, denoising algorithms tend to damage high frequencies and suppress the mid- and high-frequency contrast of micrographs, which is exactly what precise parameter estimation relies on, thereby limiting their application in SPA. In this study, we suggest combining a cryo-EM image processing pipeline with denoising and maximizing the signal's contribution in various parameter estimation steps. To address the inherent flaws of denoising algorithms, we design an algorithm named MScale to correct the amplitude distortion caused by denoising and propose a new orientation determination strategy to compensate for the high-frequency loss. In experiments on several real datasets, the denoised particles are successfully applied in the class assignment estimation and orientation determination tasks, ultimately enhancing the quality of biomacromolecule reconstruction. The case study on classification indicates that our strategy not only improves the resolution of difficult classes (up to 5 Å) but also resolves an additional class. In the case study on orientation determination, our strategy improves the resolution of the final reconstructed density map by 0.34 Å compared with the conventional strategy. The code is available at https://github.com/zhanghui186/Mscale.
Affiliation(s)
- Hui Zhang
  - National Laboratory of Biomacromolecules, CAS Center for Excellence in Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Hongjia Li
  - University of Chinese Academy of Sciences, Beijing 100049, China
  - High Performance Computer Research Center, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
- Fa Zhang
  - School of Medical Technology, Beijing Institute of Technology, Beijing, 100081, China
- Ping Zhu
  - National Laboratory of Biomacromolecules, CAS Center for Excellence in Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
2
Sorzano COS, Vilas JL, Ramírez-Aportela E, Krieger J, Del Hoyo D, Herreros D, Fernandez-Giménez E, Marchán D, Macías JR, Sánchez I, Del Caño L, Fonseca-Reyna Y, Conesa P, García-Mena A, Burguet J, García Condado J, Méndez García J, Martínez M, Muñoz-Barrutia A, Marabini R, Vargas J, Carazo JM. Image processing tools for the validation of CryoEM maps. Faraday Discuss 2022; 240:210-227. [PMID: 35861059 DOI: 10.1039/d2fd00059h] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/09/2023]
Abstract
The number of maps deposited in public databases (Electron Microscopy Data Bank, EMDB) determined by cryo-electron microscopy has grown quickly in recent years. With this rapid growth, it is critical to guarantee their quality. So far, map validation has primarily focused on the agreement between maps and models. From the image processing perspective, the validation has been mostly restricted to using two half-maps and the measurement of their internal consistency. In this article, we suggest that map validation can be taken much further from the point of view of image processing if 2D classes, particles, angles, coordinates, defoci, and micrographs are also provided. We present a progressive validation scheme that grades the validation status of a result from 0 to 5 and offers three optional qualifiers (A, W, and O) that can be added. The simplest validation state is 0, while the most complete would be 5AWO. This scheme has been implemented in a website, https://biocomp.cnb.csic.es/EMValidationService/, to which reconstructed maps and their ESI can be uploaded.
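The abstract above mentions measuring the internal consistency of two half-maps. One widely used measure of this kind is the Fourier shell correlation (FSC). The following is a minimal NumPy sketch, not the authors' implementation: the function name `fourier_shell_correlation`, the shell count, and the synthetic half-maps are all assumptions made for illustration.

```python
import numpy as np

def fourier_shell_correlation(map_a, map_b, n_shells=8):
    """Correlate two equally sized maps shell by shell in Fourier space."""
    fa = np.fft.fftn(map_a)
    fb = np.fft.fftn(map_b)
    # Radial spatial frequency (cycles per voxel) of every Fourier coefficient.
    axes = [np.fft.fftfreq(n) for n in map_a.shape]
    grid = np.meshgrid(*axes, indexing="ij")
    radius = np.sqrt(sum(g ** 2 for g in grid))
    # Shells of equal width between 0 and the Nyquist frequency (0.5).
    edges = np.linspace(0.0, 0.5, n_shells + 1)
    shell = np.digitize(radius, edges) - 1
    curve = np.zeros(n_shells)
    for s in range(n_shells):
        mask = shell == s
        cross = np.sum(fa[mask] * np.conj(fb[mask])).real
        norm = np.sqrt(np.sum(np.abs(fa[mask]) ** 2) * np.sum(np.abs(fb[mask]) ** 2))
        curve[s] = cross / norm if norm > 0 else 0.0
    return curve

# Synthetic stand-ins for two half-maps (assumption: random data, not real maps).
rng = np.random.default_rng(0)
half_a = rng.normal(size=(16, 16, 16))
half_b = half_a + 0.5 * rng.normal(size=half_a.shape)
fsc_curve = fourier_shell_correlation(half_a, half_b)
```

Each value of `fsc_curve` lies between -1 and 1; identical maps give 1 in every shell. Coefficients beyond the Nyquist radius (the corners of the Fourier cube) are ignored in this sketch.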
Affiliation(s)
- C O S Sorzano
  - Natl. Center of Biotechnology, CSIC, c/Darwin, 3, 28049, Madrid, Spain
- J L Vilas
  - Natl. Center of Biotechnology, CSIC, c/Darwin, 3, 28049, Madrid, Spain
- J Krieger
  - Natl. Center of Biotechnology, CSIC, c/Darwin, 3, 28049, Madrid, Spain
- D Del Hoyo
  - Natl. Center of Biotechnology, CSIC, c/Darwin, 3, 28049, Madrid, Spain
- D Herreros
  - Natl. Center of Biotechnology, CSIC, c/Darwin, 3, 28049, Madrid, Spain
- D Marchán
  - Natl. Center of Biotechnology, CSIC, c/Darwin, 3, 28049, Madrid, Spain
- J R Macías
  - Natl. Center of Biotechnology, CSIC, c/Darwin, 3, 28049, Madrid, Spain
- I Sánchez
  - Natl. Center of Biotechnology, CSIC, c/Darwin, 3, 28049, Madrid, Spain
- L Del Caño
  - Natl. Center of Biotechnology, CSIC, c/Darwin, 3, 28049, Madrid, Spain
- Y Fonseca-Reyna
  - Natl. Center of Biotechnology, CSIC, c/Darwin, 3, 28049, Madrid, Spain
- P Conesa
  - Natl. Center of Biotechnology, CSIC, c/Darwin, 3, 28049, Madrid, Spain
- A García-Mena
  - Natl. Center of Biotechnology, CSIC, c/Darwin, 3, 28049, Madrid, Spain
- J Burguet
  - Depto. de Óptica, Univ. Complutense de Madrid, Pl. Ciencias, 1, 28040, Madrid, Spain
- J García Condado
  - Biocruces Bizkaia Instituto Investigación Sanitaria, Cruces Plaza, 48903, Barakaldo, Bizkaia, Spain
- M Martínez
  - Natl. Center of Biotechnology, CSIC, c/Darwin, 3, 28049, Madrid, Spain
- A Muñoz-Barrutia
  - Univ. Carlos III de Madrid, Avda. de la Universidad 30, 28911, Leganés, Madrid, Spain
- R Marabini
  - Escuela Politécnica Superior, Univ. Autónoma de Madrid, CSIC, C. Francisco Tomás y Valiente, 11, 28049, Madrid, Spain
- J Vargas
  - Depto. de Óptica, Univ. Complutense de Madrid, Pl. Ciencias, 1, 28040, Madrid, Spain
- J M Carazo
  - Natl. Center of Biotechnology, CSIC, c/Darwin, 3, 28049, Madrid, Spain
3
Abstract
Cryo-electron microscopy (CryoEM) has become a vital technique in structural biology. It is an interdisciplinary field that takes advantage of advances in biochemistry, physics, and image processing, among other disciplines. Innovations in these three basic pillars have contributed to the boosting of CryoEM in the past decade. This work reviews the main contributions in image processing to the current reconstruction workflow of single particle analysis (SPA) by CryoEM. Our review emphasizes the time evolution of the algorithms across the different steps of the workflow differentiating between two groups of approaches: analytical methods and deep learning algorithms. We present an analysis of the current state of the art. Finally, we discuss the emerging problems and challenges still to be addressed in the evolution of CryoEM image processing methods in SPA.
Affiliation(s)
- Jose Luis Vilas
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Darwin, 3, Campus Universidad Autonoma, 28049 Cantoblanco, Madrid, Spain
- Jose Maria Carazo
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Darwin, 3, Campus Universidad Autonoma, 28049 Cantoblanco, Madrid, Spain
- Carlos Oscar S. Sorzano
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Darwin, 3, Campus Universidad Autonoma, 28049 Cantoblanco, Madrid, Spain
4
Sorzano COS, Jiménez-Moreno A, Maluenda D, Martínez M, Ramírez-Aportela E, Krieger J, Melero R, Cuervo A, Conesa J, Filipovic J, Conesa P, del Caño L, Fonseca YC, Jiménez-de la Morena J, Losana P, Sánchez-García R, Strelak D, Fernández-Giménez E, de Isidro-Gómez FP, Herreros D, Vilas JL, Marabini R, Carazo JM. On bias, variance, overfitting, gold standard and consensus in single-particle analysis by cryo-electron microscopy. Acta Crystallogr D Struct Biol 2022; 78:410-423. [PMID: 35362465 PMCID: PMC8972802 DOI: 10.1107/s2059798322001978] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 08/09/2021] [Accepted: 02/18/2022] [Indexed: 12/05/2022] [Open Access]
Abstract
Cryo-electron microscopy (cryoEM) has become a well established technique to elucidate the 3D structures of biological macromolecules. Projection images from thousands of macromolecules that are assumed to be structurally identical are combined into a single 3D map representing the Coulomb potential of the macromolecule under study. This article discusses possible caveats along the image-processing path and how to avoid them to obtain a reliable 3D structure. Some of these problems are very well known in the community. These may be referred to as sample-related (such as specimen denaturation at interfaces or non-uniform projection geometry leading to underrepresented projection directions). The rest are related to the algorithms used. While some have been discussed in depth in the literature, such as the use of an incorrect initial volume, others have received much less attention. However, they are fundamental in any data-analysis approach. Chief among them are instabilities in estimating many of the key parameters required for a correct 3D reconstruction; these occur all along the processing workflow and may significantly affect the reliability of the whole process. In the field, the term overfitting has been coined to refer to some particular kinds of artifacts. It is argued that overfitting is a statistical bias in key parameter-estimation steps in the 3D reconstruction process, including intrinsic algorithmic bias. It is also shown that common tools (Fourier shell correlation) and strategies (gold standard) that are normally used to detect or prevent overfitting do not fully protect against it. Alternatively, it is proposed that detecting the bias that leads to overfitting is much easier when addressed at the level of parameter estimation, rather than detecting it once the particle images have been combined into a 3D map. Comparing the results from multiple algorithms (or at least, independent executions of the same algorithm) can detect parameter bias. These multiple executions could then be averaged to give a lower variance estimate of the underlying parameters.
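The idea of comparing independent executions and then averaging them can be illustrated with a short sketch. This is an assumption-laden toy, not the procedure from the article: the function name `consensus_estimate`, the max-minus-min spread criterion, the tolerance, and the defocus values are all hypothetical.

```python
import numpy as np

def consensus_estimate(estimates, tolerance):
    """Compare per-particle estimates of one scalar parameter across runs.

    estimates: array-like of shape (n_runs, n_particles).
    Returns the per-particle mean and a mask of particles whose runs agree.
    """
    estimates = np.asarray(estimates, dtype=float)
    # Disagreement between runs, measured here as max minus min (an assumption).
    spread = estimates.max(axis=0) - estimates.min(axis=0)
    agrees = spread <= tolerance
    # Averaging independent runs lowers the variance of the estimate.
    averaged = estimates.mean(axis=0)
    return averaged, agrees

# Hypothetical defocus estimates (micrometres) from three independent runs.
runs = [[1.50, 2.10, 1.80],
        [1.52, 2.90, 1.79],
        [1.49, 2.05, 1.82]]
averaged, agrees = consensus_estimate(runs, tolerance=0.1)
```

Here the second particle is flagged because one run disagrees with the others by far more than the tolerance. A plain mean is only appropriate for scalar parameters such as defocus or shifts; angular parameters would need an average that respects their periodicity.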
Affiliation(s)
- C. O. S. Sorzano
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Calle Darwin 3, 28049 Cantoblanco, Madrid, Spain
- A. Jiménez-Moreno
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Calle Darwin 3, 28049 Cantoblanco, Madrid, Spain
- D. Maluenda
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Calle Darwin 3, 28049 Cantoblanco, Madrid, Spain
- M. Martínez
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Calle Darwin 3, 28049 Cantoblanco, Madrid, Spain
- E. Ramírez-Aportela
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Calle Darwin 3, 28049 Cantoblanco, Madrid, Spain
- J. Krieger
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Calle Darwin 3, 28049 Cantoblanco, Madrid, Spain
- R. Melero
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Calle Darwin 3, 28049 Cantoblanco, Madrid, Spain
- A. Cuervo
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Calle Darwin 3, 28049 Cantoblanco, Madrid, Spain
- J. Conesa
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Calle Darwin 3, 28049 Cantoblanco, Madrid, Spain
- P. Conesa
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Calle Darwin 3, 28049 Cantoblanco, Madrid, Spain
- L. del Caño
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Calle Darwin 3, 28049 Cantoblanco, Madrid, Spain
- Y. C. Fonseca
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Calle Darwin 3, 28049 Cantoblanco, Madrid, Spain
- J. Jiménez-de la Morena
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Calle Darwin 3, 28049 Cantoblanco, Madrid, Spain
- P. Losana
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Calle Darwin 3, 28049 Cantoblanco, Madrid, Spain
- R. Sánchez-García
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Calle Darwin 3, 28049 Cantoblanco, Madrid, Spain
- D. Strelak
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Calle Darwin 3, 28049 Cantoblanco, Madrid, Spain
  - Masaryk University, Brno, Czech Republic
- E. Fernández-Giménez
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Calle Darwin 3, 28049 Cantoblanco, Madrid, Spain
- F. P. de Isidro-Gómez
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Calle Darwin 3, 28049 Cantoblanco, Madrid, Spain
- D. Herreros
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Calle Darwin 3, 28049 Cantoblanco, Madrid, Spain
- J. L. Vilas
  - School of Engineering and Applied Science, Yale University, New Haven, CT 06520-829, USA
- R. Marabini
  - Escuela Politecnica Superior, Universidad Autónoma de Madrid, 28049 Cantoblanco, Madrid, Spain
- J. M. Carazo
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Calle Darwin 3, 28049 Cantoblanco, Madrid, Spain
5
Gomez-Blanco J, Kaur S, Strauss M, Vargas J. Hierarchical autoclassification of cryo-EM samples and macromolecular energy landscape determination. Comput Methods Programs Biomed 2022; 216:106673. [PMID: 35149430 DOI: 10.1016/j.cmpb.2022.106673] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 10/22/2021] [Revised: 01/10/2022] [Accepted: 01/28/2022] [Indexed: 06/14/2023]
Abstract
BACKGROUND AND OBJECTIVE: Cryo-electron microscopy using single particle analysis is a powerful technique for obtaining 3D reconstructions of macromolecules in near native conditions. One of its major advantages is its capacity to reveal conformations of dynamic molecular complexes. Most popular and successful current approaches to analyzing heterogeneous complexes are founded on Bayesian inference. However, these 3D classification methods require the tuning of specific parameters by the user and the use of complicated 3D re-classification procedures for samples affected by extensive heterogeneity. Thus, the success of these approaches highly depends on the user experience. We introduce a robust approach to identify many different conformations present in a cryo-EM dataset, based on Bayesian inference through Relion classification methods, that does not require tuning of parameters or reclassification strategies. METHODS: The algorithm allows both 2D and 3D classification and is based on a hierarchical clustering approach that runs automatically without requiring typical inputs, such as the number of conformations present in the dataset or the required classification iterations. This approach is applied to robustly determine the energy landscapes of macromolecules. RESULTS: We tested the performance of the methods proposed here using four different datasets, comprising structurally homogeneous and highly heterogeneous cases. In all cases, the approach provided excellent results. The routines are publicly available as part of the CryoMethods plugin included in the Scipion package. CONCLUSIONS: Our results show that the proposed method can be used to align and classify homogeneous and heterogeneous datasets without requiring previous alignment information or any prior knowledge about the number of co-existing conformations. The approach can be used for both 2D and 3D autoclassification and only requires an initial volume. In addition, the approach is robust to the "attractor" problem, providing many different conformations/views for samples affected by extensive heterogeneity. The obtained 3D classes can render high resolution 3D structures, while the obtained energy landscapes can be used to determine structural trajectories.
Affiliation(s)
- J Gomez-Blanco
  - Departamento de Óptica, Universidad Complutense de Madrid, Plaza de Ciencias 1, 28040, Spain
- S Kaur
  - Department of Anatomy and Cell Biology, McGill University, 3640 Rue University, Montréal, QC H3A 0C7, Canada
- M Strauss
  - Department of Anatomy and Cell Biology, McGill University, 3640 Rue University, Montréal, QC H3A 0C7, Canada
- J Vargas
  - Departamento de Óptica, Universidad Complutense de Madrid, Plaza de Ciencias 1, 28040, Spain
6
Advances in Xmipp for Cryo-Electron Microscopy: From Xmipp to Scipion. Molecules 2021; 26:6224. [PMID: 34684805 PMCID: PMC8537808 DOI: 10.3390/molecules26206224] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 09/13/2021] [Revised: 09/28/2021] [Accepted: 09/29/2021] [Indexed: 11/21/2022] [Open Access]
Abstract
Xmipp is an open-source software package consisting of multiple programs for processing data originating from electron microscopy and electron tomography, designed and managed by the Biocomputing Unit of the Spanish National Center for Biotechnology, with contributions from many other developers around the world. During its 25 years of existence, Xmipp has undergone multiple changes and updates. While there have been many publications related to new programs and functionality added to Xmipp, there has been no single publication on Xmipp as a package since 2013. In this article, we give an overview of the changes and new work since 2013, describe technologies and techniques used during the development, and take a peek at the future of the package.
7
Sorzano COS, Jiménez-Moreno A, Maluenda D, Ramírez-Aportela E, Martínez M, Cuervo A, Melero R, Conesa JJ, Sánchez-García R, Strelak D, Filipovic J, Fernández-Giménez E, de Isidro-Gómez F, Herreros D, Conesa P, Del Caño L, Fonseca Y, de la Morena JJ, Macías JR, Losana P, Marabini R, Carazo JM. Image Processing in Cryo-Electron Microscopy of Single Particles: The Power of Combining Methods. Methods Mol Biol 2021; 2305:257-289. [PMID: 33950394 DOI: 10.1007/978-1-0716-1406-8_13] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/28/2022]
Abstract
Cryo-electron microscopy has become established as a mature structural biology technique to elucidate the three-dimensional structure of biological macromolecules. The Coulomb potential of the sample is imaged by an electron beam, and fast semi-conductor detectors produce movies of the sample under study. These movies have to be further processed by a whole pipeline of image-processing algorithms that produce the final structure of the macromolecule. In this chapter, we illustrate this whole processing pipeline, highlighting the strength of "meta algorithms," which are the combination of several algorithms, each one with a different mathematical rationale, in order to distinguish correctly from incorrectly estimated parameters. We show how this strategy leads to superior performance of the whole pipeline as well as more confident assessments about the reconstructed structures. The "meta algorithms" strategy is common to many fields and, in particular, it has provided excellent results in bioinformatics. We illustrate this combination using the workflow engine, Scipion.
Affiliation(s)
- Ana Cuervo
  - National Centre for Biotechnology (CSIC), Madrid, Spain
- Robert Melero
  - National Centre for Biotechnology (CSIC), Madrid, Spain
- David Strelak
  - National Centre for Biotechnology (CSIC), Madrid, Spain
- Pablo Conesa
  - National Centre for Biotechnology (CSIC), Madrid, Spain
8
Méndez J, Garduño E, Carazo JM, Sorzano COS. Identification of incorrectly oriented particles in cryo-EM single particle analysis. J Struct Biol 2021; 213:107771. [PMID: 34324977 DOI: 10.1016/j.jsb.2021.107771] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 03/05/2021] [Revised: 06/23/2021] [Accepted: 07/18/2021] [Indexed: 11/24/2022]
Abstract
The quality of a 3D map produced by the single-particle analysis method is highly dependent on an accurate assignment of orientations to the many experimental images. However, the problem's complexity implies the presence of several local minima in the optimized goal functions. Consequently, validation methods to confirm the angular assignment are very useful to yield higher-resolution 3D maps. In this work, we present a graph-signal-processing-based methodology that analyzes the correlation landscape as a function of the orientation, an approach allowing the estimation of the assigned orientations' reliability. Using this method, we may identify low-reliability images that probably incorrectly contribute to the final 3D reconstruction.
Affiliation(s)
- Jeison Méndez
  - Posgrado en Ingeniería Eléctrica, Universidad Nacional Autónoma de México, Cd. Universitaria, C.P. 04510, Mexico City, Mexico
- Edgar Garduño
  - Department of Computer Science, Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas, Universidad Nacional Autónoma de México, Mexico City 04510, Mexico
- José María Carazo
  - National Center of Biotechnology, CSIC, Campus Univ. Autónoma de Madrid, 28049 Cantoblanco, Madrid, Spain
- Carlos Oscar S Sorzano
  - Univ. San Pablo CEU, Campus Urb. Montepríncipe s/n, 28668, Boadilla del Monte, Madrid, Spain
  - National Center of Biotechnology, CSIC, Campus Univ. Autónoma de Madrid, 28049 Cantoblanco, Madrid, Spain
9
Ortiz S, Stanisic L, Rodriguez BA, Rampp M, Hummer G, Cossio P. Validation tests for cryo-EM maps using an independent particle set. J Struct Biol X 2020; 4:100032. [PMID: 32743544 PMCID: PMC7385033 DOI: 10.1016/j.yjsbx.2020.100032] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/27/2022] [Open Access]
Abstract
Cryo-electron microscopy (cryo-EM) has revolutionized structural biology by providing 3D density maps of biomolecules at near-atomic resolution. However, map validation is still an open issue. Despite several efforts from the community, it is possible to overfit 3D maps to noisy data. Here, we develop a novel methodology that uses a small independent particle set (not used during the 3D refinement) to validate the maps. The main idea is to monitor how the map probability evolves over the control set during the 3D refinement. The method is complementary to the gold-standard procedure, which generates two reconstructions at each iteration. We low-pass filter the two reconstructions for different frequency cutoffs, and we calculate the probability of each filtered map given the control set. For high-quality maps, the probability should increase as a function of the frequency cutoff and the refinement iteration. We also compute the similarity between the densities of probability distributions of the two reconstructions. As higher frequencies are included, the distributions become more dissimilar. We optimized the BioEM package to perform these calculations, and tested it over systems ranging from quality data to pure noise. Our results show that, with our methodology, it is possible to discriminate datasets that are constructed from noise particles. We conclude that validation against a control particle set provides a powerful tool to assess the quality of cryo-EM maps.
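The low-pass filtering step described above can be sketched with NumPy. This is an illustrative hard-cutoff filter only, not the BioEM implementation: the function name `low_pass`, the sharp spherical mask, the cutoff values, and the random test volume are assumptions.

```python
import numpy as np

def low_pass(volume, cutoff):
    """Zero every Fourier component above `cutoff` (cycles per voxel)."""
    axes = [np.fft.fftfreq(n) for n in volume.shape]
    grid = np.meshgrid(*axes, indexing="ij")
    radius = np.sqrt(sum(g ** 2 for g in grid))
    spectrum = np.fft.fftn(volume)
    # Hard spherical mask; real pipelines often use a smooth falloff instead.
    spectrum[radius > cutoff] = 0.0
    return np.fft.ifftn(spectrum).real

# Random stand-in for a reconstruction, filtered at several cutoffs.
rng = np.random.default_rng(1)
volume = rng.normal(size=(8, 8, 8))
filtered = {c: low_pass(volume, c) for c in (0.1, 0.2, 0.3)}
```

Each entry of `filtered` could then be scored against a control particle set; that scoring step depends on the likelihood model and is not sketched here.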
Affiliation(s)
- Sebastian Ortiz
  - Biophysics of Tropical Diseases, Max Planck Tandem Group, University of Antioquia UdeA, Calle 70 No. 52-21, Medellín, Colombia
- Luka Stanisic
  - Max Planck Computing and Data Facility, 85748 Garching, Germany
- Boris A Rodriguez
  - Grupo de Física Atómica y Molecular, Instituto de Física, Facultad de Ciencias Exactas y Naturales, Universidad de Antioquia UdeA, Calle 70 No. 52-21, Medellín, Colombia
- Markus Rampp
  - Max Planck Computing and Data Facility, 85748 Garching, Germany
- Gerhard Hummer
  - Department of Theoretical Biophysics, Max Planck Institute of Biophysics, 60438 Frankfurt am Main, Germany
  - Institute of Biophysics, Goethe University, 60438 Frankfurt am Main, Germany
- Pilar Cossio
  - Biophysics of Tropical Diseases, Max Planck Tandem Group, University of Antioquia UdeA, Calle 70 No. 52-21, Medellín, Colombia
  - Department of Theoretical Biophysics, Max Planck Institute of Biophysics, 60438 Frankfurt am Main, Germany
10
Abstract
Cross-validation is used to determine the validity of a model on unseen data by assessing if the model is overfitted to noise. It is widely used in many fields, from artificial intelligence to structural biology in X-ray crystallography and nuclear magnetic resonance. Although there are concerns of map overfitting in cryo-electron microscopy (cryo-EM), cross-validation is rarely used. The problem is that establishing a performance metric of the maps over unseen data (given by 2D-projection images) is difficult due to the low signal-to-noise ratios in the individual particles. Here, I present recent advances for cryo-EM map reconstruction. I highlight that the gold-standard procedure can fail to detect map overfitting in certain cases, showing the necessity of assessing the map quality on unbiased data. Finally, I describe the challenges and advantages of developing a robust cross-validation methodology for cryo-EM.
Affiliation(s)
- Pilar Cossio
  - Biophysics of Tropical Diseases, Max Planck Tandem Group, University of Antioquia UdeA, Calle 70 No. 52-21, Medellin, Colombia
  - Department of Theoretical Biophysics, Max Planck Institute of Biophysics, 60438 Frankfurt am Main, Germany
11
Gomez-Blanco J, Kaur S, Ortega J, Vargas J. A robust approach to ab initio cryo-electron microscopy initial volume determination. J Struct Biol 2019; 208:107397. [PMID: 31568828 DOI: 10.1016/j.jsb.2019.09.014] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 06/20/2019] [Revised: 09/24/2019] [Accepted: 09/26/2019] [Indexed: 10/25/2022]
Abstract
Structural information from macromolecules provides key insights into the way complexes perform their biological functions. The reconstruction process leading to the final three-dimensional (3D) map is iterative and requires an initial volume to prime the refinement procedure. Particle images are aligned to this first reference and subsequently a new map is calculated from these particles. The accurate determination of an ab initio initial volume is still a challenging and open problem in cryo-electron microscopy (cryo-EM). Different algorithms are available to estimate an initial volume from the dataset. Some of these methods provide multiple candidate initial maps, and users looking for robustness typically run different approaches. In this case, users arbitrarily evaluate the different candidate maps obtained, as we lack robust methods to objectively assess the accuracy of initial references. This workflow is subjective and error-prone, preventing implementation of high-throughput data processing procedures. In this work, we present a robust method to determine the best initial map or maps from a set of ab initio initial volumes obtained from one or multiple different approaches. The method is based on evaluating multiple small subsets of candidate initial volumes and particle images through reference-based 3D classifications. The 3D classes obtained from accurate initial maps will be in the majority, and the respective attracted particles will be aligned with high angular accuracies. We have tested the proposed approach with structurally homogeneous and heterogeneous datasets, obtaining satisfactory results with both types of data.
Affiliation(s)
- J Gomez-Blanco
  - Department of Anatomy and Cell Biology, McGill University, 3640 Rue University, Montréal, QC H3A 0C7, Canada
- S Kaur
  - Department of Anatomy and Cell Biology, McGill University, 3640 Rue University, Montréal, QC H3A 0C7, Canada
- J Ortega
  - Department of Anatomy and Cell Biology, McGill University, 3640 Rue University, Montréal, QC H3A 0C7, Canada
- J Vargas
  - Department of Anatomy and Cell Biology, McGill University, 3640 Rue University, Montréal, QC H3A 0C7, Canada
12
Advances in image processing for single-particle analysis by electron cryomicroscopy and challenges ahead. Curr Opin Struct Biol 2018; 52:127-145. [PMID: 30509756 DOI: 10.1016/j.sbi.2018.11.004] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 06/01/2018] [Revised: 10/26/2018] [Accepted: 11/17/2018] [Indexed: 12/20/2022]
Abstract
Electron cryomicroscopy (cryoEM) is essential for the study and functional understanding of non-crystalline macromolecules such as proteins. These molecules cannot be imaged using X-ray crystallography or other popular methods. CryoEM has been successfully used to visualize macromolecular complexes such as ribosomes, viruses, and ion channels. Determination of structural models of these at various conformational states leads to insight on how these molecules function. Recent advances in imaging technology have given cryoEM a scientific rebirth. As a result of these technological advances image processing and analysis have yielded molecular structures at atomic resolution. Nevertheless there continue to be challenges in image processing, and in this article we will touch on the most essential in order to derive an accurate three-dimensional model from noisy projection images. Traditional approaches, such as k-means clustering for class averaging, will be provided as background. We will then highlight new approaches for each image processing subproblem, including a 3D reconstruction method for asymmetric molecules using just two projection images and deep learning algorithms for automated particle picking.
13
Sorzano C, Vargas J, de la Rosa-Trevín J, Jiménez A, Maluenda D, Melero R, Martínez M, Ramírez-Aportela E, Conesa P, Vilas J, Marabini R, Carazo J. A new algorithm for high-resolution reconstruction of single particles by electron microscopy. J Struct Biol 2018; 204:329-337. [DOI: 10.1016/j.jsb.2018.08.002] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 03/04/2018] [Revised: 07/19/2018] [Accepted: 08/04/2018] [Indexed: 01/01/2023]
14
Cossio P, Hummer G. Likelihood-based structural analysis of electron microscopy images. Curr Opin Struct Biol 2018; 49:162-168. [PMID: 29579548 DOI: 10.1016/j.sbi.2018.03.004] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 09/20/2017] [Revised: 01/24/2018] [Accepted: 03/06/2018] [Indexed: 10/17/2022]
Abstract
Likelihood-based analysis of single-particle electron microscopy images has contributed much to the recent improvements in resolution. By treating particle orientations and classes probabilistically, uncertainties in the reconstruction process are explicitly accounted for, and the risk of bias towards the initial model is diminished. As a result, the quality and reliability of the reconstructions have greatly improved at manageable computational cost. Likelihood-based analysis of electron microscopy images also offers a route to direct coordinate refinement for dynamic systems, as an alternative to 3D density reconstruction. Here, we review recent developments in the algorithms used for reconstructions of high-resolution maps, and in the integrative framework of combining likelihood methods with simulations to address conformational variability in cryo-electron microscopy.
Affiliation(s)
- Pilar Cossio: Biophysics of Tropical Diseases, Max Planck Tandem Group, University of Antioquia, Medellín, Colombia; Department of Theoretical Biophysics, Max Planck Institute of Biophysics, 60438 Frankfurt am Main, Germany.
- Gerhard Hummer: Department of Theoretical Biophysics, Max Planck Institute of Biophysics, 60438 Frankfurt am Main, Germany; Institute of Biophysics, Goethe University Frankfurt, 60438 Frankfurt am Main, Germany.
|
15
|
Conesa Mingo P, Gutierrez J, Quintana A, de la Rosa Trevín JM, Zaldívar-Peraza A, Cuenca Alba J, Kazemi M, Vargas J, Del Cano L, Segura J, Sorzano COS, Carazo JM. Scipion web tools: Easy to use cryo-EM image processing over the web. Protein Sci 2017; 27:269-275. [PMID: 28971542 DOI: 10.1002/pro.3315] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 07/31/2017] [Revised: 09/28/2017] [Accepted: 09/28/2017] [Indexed: 11/08/2022]
Abstract
Macromolecular structural determination by Electron Microscopy under cryogenic conditions is revolutionizing the field of structural biology, interesting a large community of potential users. Still, the path from raw images to density maps is complex, and sophisticated image processing suites are required in this process, often demanding the installation and understanding of different software packages. Here, we present Scipion Web Tools, a web-based set of tools/workflows derived from the Scipion image processing framework, specially tailored to nonexpert users in need of very precise answers at several key stages of the structural elucidation process.
Affiliation(s)
- Pablo Conesa Mingo: Centro Nacional de Biotecnología (CNB-CSIC), Cantoblanco, Madrid, 28049, Spain
- José Gutierrez: Centro Nacional de Biotecnología (CNB-CSIC), Cantoblanco, Madrid, 28049, Spain
- Adrián Quintana: Centro Nacional de Biotecnología (CNB-CSIC), Cantoblanco, Madrid, 28049, Spain
- Jesús Cuenca Alba: Centro Nacional de Biotecnología (CNB-CSIC), Cantoblanco, Madrid, 28049, Spain
- Mohsen Kazemi: Centro Nacional de Biotecnología (CNB-CSIC), Cantoblanco, Madrid, 28049, Spain
- Javier Vargas: Centro Nacional de Biotecnología (CNB-CSIC), Cantoblanco, Madrid, 28049, Spain
- Laura Del Cano: Centro Nacional de Biotecnología (CNB-CSIC), Cantoblanco, Madrid, 28049, Spain
- Joan Segura: Centro Nacional de Biotecnología (CNB-CSIC), Cantoblanco, Madrid, 28049, Spain
- Jose María Carazo: Centro Nacional de Biotecnología (CNB-CSIC), Cantoblanco, Madrid, 28049, Spain
|
16
|
Heymann JB. Guidelines for using Bsoft for high resolution reconstruction and validation of biomolecular structures from electron micrographs. Protein Sci 2017; 27:159-171. [PMID: 28891250 DOI: 10.1002/pro.3293] [Citation(s) in RCA: 76] [Impact Index Per Article: 10.9] [Received: 07/05/2017] [Revised: 09/05/2017] [Accepted: 09/06/2017] [Indexed: 12/12/2022]
Abstract
Cryo-electron microscopy (cryoEM) is becoming popular as a tool to solve biomolecular structures with the recent availability of direct electron detectors allowing automated acquisition of high resolution data. The Bsoft software package, developed over 20 years for analyzing electron micrographs, offers a full workflow for validated single particle analysis with extensive functionality, enabling customization for specific cases. With the increasing use of cryoEM and its automation, proper validation of the results is a bigger concern. The three major validation approaches, independent data sets, resolution-limited processing, and coherence testing, can be incorporated into any Bsoft workflow. Here, the main workflow is divided into four phases: (i) micrograph preprocessing, (ii) particle picking, (iii) particle alignment and reconstruction, and (iv) interpretation. Each of these phases represents a conceptual unit that can be automated, followed by a check point to assess the results. The aim in the first three phases is to reconstruct one or more validated maps at the best resolution possible. Map interpretation then involves identification of components, segmentation, quantification, and modeling. The algorithms in Bsoft are well established, with future plans focused on ease of use, automation and institutionalizing validation.
Affiliation(s)
- J Bernard Heymann: Laboratory for Structural Biology Research, National Institute of Arthritis, Musculoskeletal and Skin Diseases, NIH, Bethesda, Maryland, 20892
|
17
|
Quantitative analysis of 3D alignment quality: its impact on soft-validation, particle pruning and homogeneity analysis. Sci Rep 2017; 7:6307. [PMID: 28740215 PMCID: PMC5524947 DOI: 10.1038/s41598-017-06526-z] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 02/28/2017] [Accepted: 06/14/2017] [Indexed: 11/16/2022] Open
Abstract
Single Particle Analysis using cryo-electron microscopy is a structural biology technique aimed at capturing the three-dimensional (3D) conformation of biological macromolecules. Projection images used to construct the 3D density map are characterized by a very low signal-to-noise ratio to minimize radiation damage in the samples. As a consequence, the 3D image alignment process is a challenging and error-prone task which usually determines the success or failure of obtaining a high quality map. In this work, we present an approach able to quantify the alignment precision and accuracy of the 3D alignment process, which is then used to help the reconstruction process in a number of ways, such as: (1) providing quality indicators of the macromolecular map for soft validation, (2) assessing the degree of homogeneity of the sample, and (3) selecting subsets of representative images. We present experimental results in which the quality of the finally obtained 3D maps is clearly improved.
|
18
|
|