1
Sorzano COS, Jiménez-Moreno A, Maluenda D, Martínez M, Ramírez-Aportela E, Krieger J, Melero R, Cuervo A, Conesa J, Filipovic J, Conesa P, del Caño L, Fonseca YC, Jiménez-de la Morena J, Losana P, Sánchez-García R, Strelak D, Fernández-Giménez E, de Isidro-Gómez FP, Herreros D, Vilas JL, Marabini R, Carazo JM. On bias, variance, overfitting, gold standard and consensus in single-particle analysis by cryo-electron microscopy. Acta Crystallogr D Struct Biol 2022; 78:410-423. [PMID: 35362465 PMCID: PMC8972802 DOI: 10.1107/s2059798322001978] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 08/09/2021] [Accepted: 02/18/2022] [Indexed: 12/05/2022] Open
Abstract
Single-particle analysis (SPA) by cryo-electron microscopy comprises the estimation of many parameters along its image-processing pipeline. Overfitting observed in SPA is normally due to misestimated parameters, and the only way to identify these is by comparing the estimates of multiple algorithms or, at least, multiple executions of the same algorithm. Cryo-electron microscopy (cryoEM) has become a well established technique to elucidate the 3D structures of biological macromolecules. Projection images from thousands of macromolecules that are assumed to be structurally identical are combined into a single 3D map representing the Coulomb potential of the macromolecule under study. This article discusses possible caveats along the image-processing path and how to avoid them to obtain a reliable 3D structure. Some of these problems are very well known in the community. These may be referred to as sample-related (such as specimen denaturation at interfaces or non-uniform projection geometry leading to underrepresented projection directions). The rest are related to the algorithms used. While some have been discussed in depth in the literature, such as the use of an incorrect initial volume, others have received much less attention. However, they are fundamental in any data-analysis approach. Chief among them are instabilities, occurring all along the processing workflow, in estimating many of the key parameters required for a correct 3D reconstruction; these may significantly affect the reliability of the whole process. In the field, the term overfitting has been coined to refer to some particular kinds of artifacts. It is argued that overfitting is a statistical bias in key parameter-estimation steps in the 3D reconstruction process, including intrinsic algorithmic bias.
It is also shown that common tools (Fourier shell correlation) and strategies (gold standard) that are normally used to detect or prevent overfitting do not fully protect against it. Alternatively, it is proposed that detecting the bias that leads to overfitting is much easier when addressed at the level of parameter estimation, rather than detecting it once the particle images have been combined into a 3D map. Comparing the results from multiple algorithms (or at least, independent executions of the same algorithm) can detect parameter bias. These multiple executions could then be averaged to give a lower variance estimate of the underlying parameters.
2
Maluenda D, Majtner T, Horvath P, Vilas JL, Jiménez-Moreno A, Mota J, Ramírez-Aportela E, Sánchez-García R, Conesa P, del Caño L, Rancel Y, Fonseca Y, Martínez M, Sharov G, García C, Strelak D, Melero R, Marabini R, Carazo JM, Sorzano COS. Flexible workflows for on-the-fly electron-microscopy single-particle image processing using Scipion. Acta Crystallogr D Struct Biol 2019; 75:882-894. [PMID: 31588920 PMCID: PMC6778851 DOI: 10.1107/s2059798319011860] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 01/16/2019] [Accepted: 08/28/2019] [Indexed: 01/18/2023] Open
Abstract
Electron microscopy of macromolecular structures is an approach that is in increasing demand in the field of structural biology. The automation of image acquisition has greatly increased the potential throughput of electron microscopy. Here, the focus is on the possibilities in Scipion to implement flexible and robust image-processing workflows that allow the electron-microscope operator and the user to monitor the quality of image acquisition, assessing very simple acquisition measures or obtaining a first estimate of the initial volume, or the data resolution and heterogeneity, without any need for programming skills. These workflows can implement intelligent automatic decisions and they can warn the user of possible acquisition failures. These concepts are illustrated by analysis of the well known 2.2 Å resolution β-galactosidase data set.
Affiliation(s)
- D. Maluenda
  - National Center for Biotechnology (CSIC), 28049 Cantoblanco, Madrid, Spain
- T. Majtner
  - National Center for Biotechnology (CSIC), 28049 Cantoblanco, Madrid, Spain
- P. Horvath
  - National Center for Biotechnology (CSIC), 28049 Cantoblanco, Madrid, Spain
- J. L. Vilas
  - National Center for Biotechnology (CSIC), 28049 Cantoblanco, Madrid, Spain
- A. Jiménez-Moreno
  - National Center for Biotechnology (CSIC), 28049 Cantoblanco, Madrid, Spain
- J. Mota
  - National Center for Biotechnology (CSIC), 28049 Cantoblanco, Madrid, Spain
- R. Sánchez-García
  - National Center for Biotechnology (CSIC), 28049 Cantoblanco, Madrid, Spain
- P. Conesa
  - National Center for Biotechnology (CSIC), 28049 Cantoblanco, Madrid, Spain
- L. del Caño
  - National Center for Biotechnology (CSIC), 28049 Cantoblanco, Madrid, Spain
- Y. Rancel
  - National Center for Biotechnology (CSIC), 28049 Cantoblanco, Madrid, Spain
- Y. Fonseca
  - National Center for Biotechnology (CSIC), 28049 Cantoblanco, Madrid, Spain
- M. Martínez
  - National Center for Biotechnology (CSIC), 28049 Cantoblanco, Madrid, Spain
- G. Sharov
  - MRC Laboratory of Molecular Biology, Cambridge Biomedical Campus, England
- D. Strelak
  - National Center for Biotechnology (CSIC), 28049 Cantoblanco, Madrid, Spain
- R. Melero
  - National Center for Biotechnology (CSIC), 28049 Cantoblanco, Madrid, Spain
- R. Marabini
  - Universidad Autónoma de Madrid, Madrid, Spain
- J. M. Carazo
  - National Center for Biotechnology (CSIC), 28049 Cantoblanco, Madrid, Spain
- C. O. S. Sorzano
  - National Center for Biotechnology (CSIC), 28049 Cantoblanco, Madrid, Spain
3
Sorzano COS, Jiménez A, Mota J, Vilas JL, Maluenda D, Martínez M, Ramírez-Aportela E, Majtner T, Segura J, Sánchez-García R, Rancel Y, del Caño L, Conesa P, Melero R, Jonic S, Vargas J, Cazals F, Freyberg Z, Krieger J, Bahar I, Marabini R, Carazo JM. Survey of the analysis of continuous conformational variability of biological macromolecules by electron microscopy. Acta Crystallogr F Struct Biol Commun 2019; 75:19-32. [PMID: 30605122 PMCID: PMC6317454 DOI: 10.1107/s2053230x18015108] [Citation(s) in RCA: 38] [Impact Index Per Article: 7.6] [Received: 05/30/2018] [Accepted: 10/26/2018] [Indexed: 11/10/2022] Open
Abstract
Single-particle analysis by electron microscopy is a well established technique for analyzing the three-dimensional structures of biological macromolecules. Besides its ability to produce high-resolution structures, it also provides insights into the dynamic behavior of the structures by elucidating their conformational variability. Here, the different image-processing methods currently available to study continuous conformational changes are reviewed.
Affiliation(s)
- A. Jiménez
  - National Center of Biotechnology (CSIC), Spain
- J. Mota
  - National Center of Biotechnology (CSIC), Spain
- J. L. Vilas
  - National Center of Biotechnology (CSIC), Spain
- D. Maluenda
  - National Center of Biotechnology (CSIC), Spain
- M. Martínez
  - National Center of Biotechnology (CSIC), Spain
- T. Majtner
  - National Center of Biotechnology (CSIC), Spain
- J. Segura
  - National Center of Biotechnology (CSIC), Spain
- Y. Rancel
  - National Center of Biotechnology (CSIC), Spain
- L. del Caño
  - National Center of Biotechnology (CSIC), Spain
- P. Conesa
  - National Center of Biotechnology (CSIC), Spain
- R. Melero
  - National Center of Biotechnology (CSIC), Spain
- S. Jonic
  - Sorbonne Université, UMR CNRS 7590, Muséum National d’Histoire Naturelle, IRD, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, Paris, France
- F. Cazals
  - Inria Sophia Antipolis – Méditerranée, France