1
Kimanius D, Schwab J. Confronting heterogeneity in cryogenic electron microscopy data: Innovative strategies and future perspectives with data-driven methods. Curr Opin Struct Biol 2024; 86:102815. [PMID: 38657561 DOI: 10.1016/j.sbi.2024.102815] [Received: 01/16/2024] [Revised: 02/26/2024] [Accepted: 03/26/2024] [Indexed: 04/26/2024]
Abstract
The surge in the influx of data from cryogenic electron microscopy (cryo-EM) experiments has intensified the demand for robust algorithms capable of autonomously managing structurally heterogeneous datasets. This presents a wealth of exciting opportunities from a data science viewpoint, inspiring the development of numerous innovative, application-specific methods, many of which leverage contemporary data-driven techniques. However, addressing the challenges posed by heterogeneous datasets remains a paramount yet unresolved issue in the field. Here, we explore the subtleties of this challenge and the array of strategies devised to confront it. We pinpoint the shortcomings of existing methodologies and deliberate on prospective avenues for improvement. Specifically, our discussion focuses on strategies to mitigate model overfitting and manage data noise, as well as the effects of constraints, priors, and invariances on the optimization process.
Affiliation(s)
- Dari Kimanius
- MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge, CB2 0QH, UK; CZ Imaging Institute, 3400 Bridge Parkway, Redwood City, CA 94065, USA.
- Johannes Schwab
- MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge, CB2 0QH, UK
2
Vuillemot R, Harastani M, Hamitouche I, Jonic S. MDSPACE and MDTOMO Software for Extracting Continuous Conformational Landscapes from Datasets of Single Particle Images and Subtomograms Based on Molecular Dynamics Simulations: Latest Developments in ContinuousFlex Software Package. Int J Mol Sci 2023; 25:20. [PMID: 38203192 PMCID: PMC10779004 DOI: 10.3390/ijms25010020] [Received: 11/06/2023] [Revised: 12/16/2023] [Accepted: 12/17/2023] [Indexed: 01/12/2024] [Open Access]
Abstract
Cryo electron microscopy (cryo-EM) instrumentation allows obtaining 3D reconstruction of the structure of biomolecular complexes in vitro (purified complexes studied by single particle analysis) and in situ (complexes studied in cells by cryo electron tomography). Standard cryo-EM approaches allow high-resolution reconstruction of only a few conformational states of a molecular complex, as they rely on data classification into a given number of classes to increase the resolution of the reconstruction from the most populated classes while discarding all other classes. Such discrete classification approaches result in a partial picture of the full conformational variability of the complex, due to continuous conformational transitions with many, uncountable intermediate states. In this article, we present the software with a user-friendly graphical interface for running two recently introduced methods, namely, MDSPACE and MDTOMO, to obtain continuous conformational landscapes of biomolecules by analyzing in vitro and in situ cryo-EM data (single particle images and subtomograms) based on molecular dynamics simulations of an available atomic model of one of the conformations. The MDSPACE and MDTOMO software is part of the open-source ContinuousFlex software package (starting from version 3.4.2 of ContinuousFlex), which can be run as a plugin of the Scipion software package (version 3.1 and later), broadly used in the cryo-EM field.
Affiliation(s)
- Slavica Jonic
- IMPMC-UMR 7590 CNRS, Sorbonne Université, MNHN, 75005 Paris, France
3
Krieger JM, Sorzano COS, Carazo JM. Scipion-EM-ProDy: A Graphical Interface for the ProDy Python Package within the Scipion Workflow Engine Enabling Integration of Databases, Simulations and Cryo-Electron Microscopy Image Processing. Int J Mol Sci 2023; 24:14245. [PMID: 37762547 PMCID: PMC10532346 DOI: 10.3390/ijms241814245] [Received: 08/08/2023] [Revised: 09/10/2023] [Accepted: 09/15/2023] [Indexed: 09/29/2023] [Open Access]
Abstract
Macromolecular assemblies, such as protein complexes, undergo continuous structural dynamics, including global reconfigurations critical for their function. Two fast analytical methods are widely used to study these global dynamics, namely elastic network model normal mode analysis and principal component analysis of ensembles of structures. These approaches have found wide use in various computational studies, driving the development of complex pipelines in several software packages. One common theme has been conformational sampling through hybrid simulations incorporating all-atom molecular dynamics and global modes of motion. However, wide functionality is only available for experienced programmers with limited capabilities for other users. We have, therefore, integrated one popular and extensively developed software for such analyses, the ProDy Python application programming interface, into the Scipion workflow engine. This enables a wider range of users to access a complete range of macromolecular dynamics pipelines beyond the core functionalities available in its command-line applications and the normal mode wizard in VMD. The new protocols and pipelines can be further expanded and integrated into larger workflows, together with other software packages for cryo-electron microscopy image analysis and molecular simulations. We present the resulting plugin, Scipion-EM-ProDy, in detail, highlighting the rich functionality made available by its development.
Affiliation(s)
- James M. Krieger
- Biocomputing Unit, National Centre for Biotechnology (CNB CSIC), Campus Universidad Autónoma de Madrid, Darwin 3, Cantoblanco, 28049 Madrid, Spain
- Jose Maria Carazo
- Biocomputing Unit, National Centre for Biotechnology (CNB CSIC), Campus Universidad Autónoma de Madrid, Darwin 3, Cantoblanco, 28049 Madrid, Spain
4
DiIorio MC, Kulczyk AW. Novel Artificial Intelligence-Based Approaches for Ab Initio Structure Determination and Atomic Model Building for Cryo-Electron Microscopy. Micromachines 2023; 14:1674. [PMID: 37763837 PMCID: PMC10534518 DOI: 10.3390/mi14091674] [Received: 08/02/2023] [Revised: 08/21/2023] [Accepted: 08/25/2023] [Indexed: 09/29/2023]
Abstract
Single particle cryo-electron microscopy (cryo-EM) has emerged as the prevailing method for near-atomic structure determination, shedding light on the important molecular mechanisms of biological macromolecules. However, the inherent dynamics and structural variability of biological complexes coupled with the large number of experimental images generated by a cryo-EM experiment make data processing nontrivial. In particular, ab initio reconstruction and atomic model building remain major bottlenecks that demand substantial computational resources and manual intervention. Approaches utilizing recent innovations in artificial intelligence (AI) technology, particularly deep learning, have the potential to overcome the limitations that cannot be adequately addressed by traditional image processing approaches. Here, we review newly proposed AI-based methods for ab initio volume generation, heterogeneous 3D reconstruction, and atomic model building. We highlight the advancements made by the implementation of AI methods, as well as discuss remaining limitations and areas for future development.
Affiliation(s)
- Megan C. DiIorio
- Institute for Quantitative Biomedicine, Rutgers University, 174 Frelinghuysen Road, Piscataway, NJ 08854, USA
- Arkadiusz W. Kulczyk
- Institute for Quantitative Biomedicine, Rutgers University, 174 Frelinghuysen Road, Piscataway, NJ 08854, USA
- Department of Biochemistry & Microbiology, Rutgers University, 76 Lipman Drive, New Brunswick, NJ 08901, USA
5
Tang WS, Zhong ED, Hanson SM, Thiede EH, Cossio P. Conformational heterogeneity and probability distributions from single-particle cryo-electron microscopy. Curr Opin Struct Biol 2023; 81:102626. [PMID: 37311334 DOI: 10.1016/j.sbi.2023.102626] [Received: 02/01/2023] [Revised: 04/25/2023] [Accepted: 05/16/2023] [Indexed: 06/15/2023]
Abstract
Single-particle cryo-electron microscopy (cryo-EM) is a technique that takes projection images of biomolecules frozen at cryogenic temperatures. A major advantage of this technique is its ability to image single biomolecules in heterogeneous conformations. While this poses a challenge for data analysis, recent algorithmic advances have enabled the recovery of heterogeneous conformations from the noisy imaging data. Here, we review methods for the reconstruction and heterogeneity analysis of cryo-EM images, ranging from linear-transformation-based methods to nonlinear deep generative models. We overview the dimensionality-reduction techniques used in heterogeneous 3D reconstruction methods and specify what information each method can infer from the data. Then, we review the methods that use cryo-EM images to estimate probability distributions over conformations in reduced subspaces or predefined by atomistic simulations. We conclude with the ongoing challenges for the cryo-EM community.
Affiliation(s)
- Wai Shing Tang
- Center for Computational Mathematics, Flatiron Institute, 162 5th Ave, New York, NY, 10010, United States. https://twitter.com/WaiShingTang
- Ellen D Zhong
- Department of Computer Science, Princeton University, 35 Olden St, Princeton, NJ, 08544, United States. https://twitter.com/ZhongingAlong
- Sonya M Hanson
- Center for Computational Mathematics, Flatiron Institute, 162 5th Ave, New York, NY, 10010, United States; Center for Computational Biology, Flatiron Institute, 162 5th Ave, New York, NY, 10010, United States. https://twitter.com/sonyahans
- Erik H Thiede
- Center for Computational Mathematics, Flatiron Institute, 162 5th Ave, New York, NY, 10010, United States. https://twitter.com/erik_der_elch
- Pilar Cossio
- Center for Computational Mathematics, Flatiron Institute, 162 5th Ave, New York, NY, 10010, United States; Center for Computational Biology, Flatiron Institute, 162 5th Ave, New York, NY, 10010, United States.
6
Vuillemot R, Rouiller I, Jonić S. MDTOMO method for continuous conformational variability analysis in cryo electron subtomograms based on molecular dynamics simulations. Sci Rep 2023; 13:10596. [PMID: 37391578 PMCID: PMC10313669 DOI: 10.1038/s41598-023-37037-9] [Received: 03/03/2023] [Accepted: 06/14/2023] [Indexed: 07/02/2023] [Open Access]
Abstract
Cryo electron tomography (cryo-ET) allows observing macromolecular complexes in their native environment. The common routine of subtomogram averaging (STA) allows obtaining the three-dimensional (3D) structure of abundant macromolecular complexes, and can be coupled with discrete classification to reveal conformational heterogeneity of the sample. However, the number of complexes extracted from cryo-ET data is usually small, which restricts the discrete-classification results to a small number of sufficiently populated states and, thus, results in a largely incomplete conformational landscape. Alternative approaches are currently being investigated to explore the continuity of the conformational landscapes that in situ cryo-ET studies could provide. In this article, we present MDTOMO, a method for analyzing continuous conformational variability in cryo-ET subtomograms based on Molecular Dynamics (MD) simulations. MDTOMO allows obtaining an atomic-scale model of conformational variability and the corresponding free-energy landscape, from a given set of cryo-ET subtomograms. The article presents the performance of MDTOMO on a synthetic ABC exporter dataset and an in situ SARS-CoV-2 spike dataset. MDTOMO allows analyzing dynamic properties of molecular complexes to understand their biological functions, which could also be useful for structure-based drug discovery.
Affiliation(s)
- Rémi Vuillemot
- IMPMC-UMR 7590 CNRS, Sorbonne Université, Muséum National d'Histoire Naturelle, CC 115, 4 Place Jussieu, 75005, Paris, France
- Department of Biochemistry and Pharmacology and Bio21 Molecular Science and Biotechnology Institute, The University of Melbourne, Melbourne, VIC, 3010, Australia
- Isabelle Rouiller
- Department of Biochemistry and Pharmacology and Bio21 Molecular Science and Biotechnology Institute, The University of Melbourne, Melbourne, VIC, 3010, Australia
- Australian Research Council Centre for Cryo-Electron Microscopy of Membrane Proteins, Parkville, VIC, 3052, Australia
- Slavica Jonić
- IMPMC-UMR 7590 CNRS, Sorbonne Université, Muséum National d'Histoire Naturelle, CC 115, 4 Place Jussieu, 75005, Paris, France.
7
Chen M, Toader B, Lederman R. Integrating Molecular Models Into CryoEM Heterogeneity Analysis Using Scalable High-resolution Deep Gaussian Mixture Models. J Mol Biol 2023; 435:168014. [PMID: 36806476 PMCID: PMC10164680 DOI: 10.1016/j.jmb.2023.168014] [Received: 10/19/2022] [Revised: 02/07/2023] [Accepted: 02/08/2023] [Indexed: 02/17/2023]
Abstract
Resolving the structural variability of proteins is often key to understanding the structure-function relationship of those macromolecular machines. Single particle analysis using Cryogenic electron microscopy (CryoEM), combined with machine learning algorithms, provides a way to reveal the dynamics within the protein system from noisy micrographs. Here, we introduce an improved computational method that uses Gaussian mixture models for protein structure representation and deep neural networks for conformation space embedding. By integrating information from molecular models into the heterogeneity analysis, we can analyze continuous protein conformational changes using structural information at the frequency of 1/3 Å⁻¹, and present the results in a more interpretable form.
Affiliation(s)
- Muyuan Chen
- Division of CryoEM and Bioimaging, SSRL, SLAC National Accelerator Laboratory, Stanford University, Menlo Park, CA, USA
- Bogdan Toader
- Department of Statistics and Data Science, Yale University, New Haven, CT, USA
- Roy Lederman
- Department of Statistics and Data Science, Yale University, New Haven, CT, USA
8
Verkhivker G, Alshahrani M, Gupta G, Xiao S, Tao P. From Deep Mutational Mapping of Allosteric Protein Landscapes to Deep Learning of Allostery and Hidden Allosteric Sites: Zooming in on "Allosteric Intersection" of Biochemical and Big Data Approaches. Int J Mol Sci 2023; 24:7747. [PMID: 37175454 PMCID: PMC10178073 DOI: 10.3390/ijms24097747] [Received: 04/06/2023] [Revised: 04/22/2023] [Accepted: 04/23/2023] [Indexed: 05/15/2023] [Open Access]
Abstract
The recent advances in artificial intelligence (AI) and machine learning have driven the design of new expert systems and automated workflows that are able to model complex chemical and biological phenomena. In recent years, machine learning approaches have been developed and actively deployed to facilitate computational and experimental studies of protein dynamics and allosteric mechanisms. In this review, we discuss in detail new developments along two major directions of allosteric research through the lens of data-intensive biochemical approaches and AI-based computational methods. Despite considerable progress in applications of AI methods for protein structure and dynamics studies, the intersection between allosteric regulation, the emerging structural biology technologies and AI approaches remains largely unexplored, calling for the development of AI-augmented integrative structural biology. In this review, we focus on the latest remarkable progress in deep high-throughput mining and comprehensive mapping of allosteric protein landscapes and allosteric regulatory mechanisms as well as on the new developments in AI methods for prediction and characterization of allosteric binding sites on the proteome level. We also discuss new AI-augmented structural biology approaches that expand our knowledge of the universe of protein dynamics and allostery. We conclude with an outlook and highlight the importance of developing an open science infrastructure for machine learning studies of allosteric regulation and validation of computational approaches using integrative studies of allosteric mechanisms. The development of community-accessible tools that uniquely leverage the existing experimental and simulation knowledgebase to enable interrogation of the allosteric functions can provide a much-needed boost to further innovation and integration of experimental and computational technologies empowered by the booming AI field.
Affiliation(s)
- Gennady Verkhivker
- Keck Center for Science and Engineering, Graduate Program in Computational and Data Sciences, Schmid College of Science and Technology, Chapman University, Orange, CA 92866, USA
- Department of Biomedical and Pharmaceutical Sciences, Chapman University School of Pharmacy, Irvine, CA 92618, USA
- Mohammed Alshahrani
- Keck Center for Science and Engineering, Graduate Program in Computational and Data Sciences, Schmid College of Science and Technology, Chapman University, Orange, CA 92866, USA
- Grace Gupta
- Keck Center for Science and Engineering, Graduate Program in Computational and Data Sciences, Schmid College of Science and Technology, Chapman University, Orange, CA 92866, USA
- Sian Xiao
- Department of Chemistry, Center for Research Computing, Center for Drug Discovery, Design, and Delivery (CD4), Southern Methodist University, Dallas, TX 75275, USA
- Peng Tao
- Department of Chemistry, Center for Research Computing, Center for Drug Discovery, Design, and Delivery (CD4), Southern Methodist University, Dallas, TX 75275, USA
9
Vuillemot R, Mirzaei A, Harastani M, Hamitouche I, Fréchin L, Klaholz BP, Miyashita O, Tama F, Rouiller I, Jonic S. MDSPACE: Extracting Continuous Conformational Landscapes from Cryo-EM Single Particle Datasets Using 3D-to-2D Flexible Fitting based on Molecular Dynamics Simulation. J Mol Biol 2023; 435:167951. [PMID: 36638910 DOI: 10.1016/j.jmb.2023.167951] [Received: 10/16/2022] [Revised: 12/08/2022] [Accepted: 01/03/2023] [Indexed: 01/12/2023]
Abstract
This article presents an original approach for extracting atomic-resolution landscapes of continuous conformational variability of biomolecular complexes from cryo electron microscopy (cryo-EM) single particle images. This approach is based on a new 3D-to-2D flexible fitting method, which uses molecular dynamics (MD) simulation and is embedded in an iterative conformational-landscape refinement scheme. This new approach is referred to as MDSPACE, which stands for Molecular Dynamics simulation for Single Particle Analysis of Continuous Conformational hEterogeneity. The article describes the MDSPACE approach and shows its performance using synthetic and experimental datasets.
Affiliation(s)
- Rémi Vuillemot
- IMPMC-UMR 7590 CNRS, Sorbonne Université, Muséum National d'Histoire Naturelle, Paris, France; Department of Biochemistry & Pharmacology and Bio21 Molecular Science and Biotechnology Institute, The University of Melbourne, Victoria, Australia
- Alex Mirzaei
- IMPMC-UMR 7590 CNRS, Sorbonne Université, Muséum National d'Histoire Naturelle, Paris, France
- Mohamad Harastani
- IMPMC-UMR 7590 CNRS, Sorbonne Université, Muséum National d'Histoire Naturelle, Paris, France
- Ilyes Hamitouche
- IMPMC-UMR 7590 CNRS, Sorbonne Université, Muséum National d'Histoire Naturelle, Paris, France
- Léo Fréchin
- Centre for Integrative Biology, Department of Integrated Structural Biology, IGBMC-UMR 7104 CNRS, U964 Inserm, Université de Strasbourg, Strasbourg, France
- Bruno P Klaholz
- Centre for Integrative Biology, Department of Integrated Structural Biology, IGBMC-UMR 7104 CNRS, U964 Inserm, Université de Strasbourg, Strasbourg, France
- Florence Tama
- RIKEN Center for Computational Science, Kobe, Japan; Institute of Transformative Biomolecules, Graduate School of Science, Nagoya University, Nagoya, Japan; Department of Physics, Graduate School of Science, Nagoya University, Nagoya, Japan
- Isabelle Rouiller
- Department of Biochemistry & Pharmacology and Bio21 Molecular Science and Biotechnology Institute, The University of Melbourne, Victoria, Australia
- Slavica Jonic
- IMPMC-UMR 7590 CNRS, Sorbonne Université, Muséum National d'Histoire Naturelle, Paris, France.
10
Harastani M, Vuillemot R, Hamitouche I, Moghadam NB, Jonic S. ContinuousFlex: Software package for analyzing continuous conformational variability of macromolecules in cryo electron microscopy and tomography data. J Struct Biol 2022; 214:107906. [PMID: 36244611 DOI: 10.1016/j.jsb.2022.107906] [Received: 06/04/2022] [Revised: 09/02/2022] [Accepted: 10/07/2022] [Indexed: 11/06/2022]
Abstract
ContinuousFlex is a user-friendly open-source software package for analyzing continuous conformational variability of macromolecules in cryo electron microscopy (cryo-EM) and cryo electron tomography (cryo-ET) data. In 2019, ContinuousFlex became available as a plugin for Scipion, an image processing software package extensively used in the cryo-EM field. Currently, ContinuousFlex contains software for running (1) recently published methods HEMNMA-3D, TomoFlow, and NMMD; (2) earlier published methods HEMNMA and StructMap; and (3) methods for simulating cryo-EM and cryo-ET data with conformational variability and methods for data preprocessing. It also includes external software for molecular dynamics simulation (GENESIS) and normal mode analysis (ElNemo), used in some of the mentioned methods. The HEMNMA software has been presented in the past, but not the software of other methods. Besides, ContinuousFlex currently also offers a deep learning extension of HEMNMA, named DeepHEMNMA. In this article, we review these methods in the context of the ContinuousFlex package, developed to facilitate their use by the community.
Affiliation(s)
- Mohamad Harastani
- IMPMC-UMR 7590 CNRS, Sorbonne Université, Muséum National d'Histoire Naturelle, Paris, France
- Rémi Vuillemot
- IMPMC-UMR 7590 CNRS, Sorbonne Université, Muséum National d'Histoire Naturelle, Paris, France
- Ilyes Hamitouche
- IMPMC-UMR 7590 CNRS, Sorbonne Université, Muséum National d'Histoire Naturelle, Paris, France
- Nima Barati Moghadam
- IMPMC-UMR 7590 CNRS, Sorbonne Université, Muséum National d'Histoire Naturelle, Paris, France
- Slavica Jonic
- IMPMC-UMR 7590 CNRS, Sorbonne Université, Muséum National d'Histoire Naturelle, Paris, France.