1
Chen L, Li Q, Nasif KFA, Xie Y, Deng B, Niu S, Pouriyeh S, Dai Z, Chen J, Xie CY. AI-Driven Deep Learning Techniques in Protein Structure Prediction. Int J Mol Sci 2024; 25:8426. [PMID: 39125995 PMCID: PMC11313475 DOI: 10.3390/ijms25158426] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/15/2024] [Revised: 07/29/2024] [Accepted: 07/29/2024] [Indexed: 08/12/2024]
Abstract
Protein structure prediction is important for understanding protein function and behavior. This review presents a comprehensive survey of the computational models used to predict protein structure, covering the progression from established protein modeling to state-of-the-art artificial intelligence (AI) frameworks. The paper starts with a brief introduction to protein structures, protein modeling, and AI. The section on established protein modeling discusses homology modeling, ab initio modeling, and threading. The next section, on deep learning-based models, introduces state-of-the-art AI models such as AlphaFold (AlphaFold, AlphaFold2, AlphaFold3), RoseTTAFold, and ProteinBERT, and discusses how AI techniques have been integrated into established frameworks like Swiss-Model, Rosetta, and I-TASSER. Model performance is compared using the rankings of CASP14 (Critical Assessment of Structure Prediction) and CASP15; CASP16 is ongoing, and its results are not included in this review. Continuous Automated Model EvaluatiOn (CAMEO) complements the biennial CASP experiment. The template modeling score (TM-score), global distance test total score (GDT_TS), and Local Distance Difference Test (lDDT) score are also discussed. The paper then acknowledges the ongoing difficulties in predicting protein structure and emphasizes the need for additional research into areas such as dynamic protein behavior, conformational changes, and protein-protein interactions. The application section introduces applications in fields such as drug design, industry, education, and novel protein development. In summary, this paper provides a comprehensive overview of the latest advancements in established protein modeling and deep learning-based models for protein structure prediction. It emphasizes the significant advancements achieved by AI and identifies potential areas for further investigation.
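As a rough illustration of two of the scores named in the abstract above, the following sketch computes GDT_TS and TM-score from per-residue Cα distances. It is not taken from the review: the function names and inputs are hypothetical, and it assumes the model and reference structures are already superimposed and aligned residue by residue. The standard definitions additionally search over superpositions for the best score, which is omitted here, and the d0 formula is only meaningful for targets longer than about 20 residues.

```python
import numpy as np


def gdt_ts(distances):
    """GDT_TS for one fixed superposition (hypothetical helper).

    distances: per-residue C-alpha deviations in angstroms between
    model and reference. Returns a score in [0, 100].
    """
    d = np.asarray(distances, dtype=float)
    # Fraction of residues within each of the four standard cutoffs.
    fractions = [(d <= cutoff).mean() for cutoff in (1.0, 2.0, 4.0, 8.0)]
    return 100.0 * float(np.mean(fractions))


def tm_score(distances, target_length):
    """TM-score for one fixed superposition (hypothetical helper).

    distances: deviations for the aligned residue pairs, in angstroms.
    target_length: number of residues in the reference (target) protein.
    """
    d = np.asarray(distances, dtype=float)
    # Length-dependent distance scale from the standard TM-score definition.
    d0 = 1.24 * np.cbrt(target_length - 15) - 1.8
    return float(np.sum(1.0 / (1.0 + (d / d0) ** 2)) / target_length)
```

Because the TM-score normalizes by target length and uses a length-dependent scale d0, it is less sensitive to protein size than a raw RMSD, which is one reason it is commonly reported alongside GDT_TS.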
Affiliation(s)
- Lingtao Chen
- College of Computing and Software Engineering, Kennesaw State University, Marietta, GA 30060, USA; (L.C.); (Q.L.); (K.F.A.N.); (Y.X.); (B.D.); (S.P.)
- Qiaomu Li
- College of Computing and Software Engineering, Kennesaw State University, Marietta, GA 30060, USA; (L.C.); (Q.L.); (K.F.A.N.); (Y.X.); (B.D.); (S.P.)
- Kazi Fahim Ahmad Nasif
- College of Computing and Software Engineering, Kennesaw State University, Marietta, GA 30060, USA; (L.C.); (Q.L.); (K.F.A.N.); (Y.X.); (B.D.); (S.P.)
- Ying Xie
- College of Computing and Software Engineering, Kennesaw State University, Marietta, GA 30060, USA; (L.C.); (Q.L.); (K.F.A.N.); (Y.X.); (B.D.); (S.P.)
- Bobin Deng
- College of Computing and Software Engineering, Kennesaw State University, Marietta, GA 30060, USA; (L.C.); (Q.L.); (K.F.A.N.); (Y.X.); (B.D.); (S.P.)
- Shuteng Niu
- Department of Computer Science, Bowling Green State University, Bowling Green, OH 43403, USA
- Seyedamin Pouriyeh
- College of Computing and Software Engineering, Kennesaw State University, Marietta, GA 30060, USA; (L.C.); (Q.L.); (K.F.A.N.); (Y.X.); (B.D.); (S.P.)
- Zhiyu Dai
- Division of Pulmonary and Critical Care Medicine, John T. Milliken Department of Medicine, Washington University School of Medicine in St. Louis, St. Louis, MO 63110, USA
- Jiawei Chen
- College of Computing, Data Science and Society, University of California, Berkeley, CA 94720, USA
- Chloe Yixin Xie
- College of Computing and Software Engineering, Kennesaw State University, Marietta, GA 30060, USA; (L.C.); (Q.L.); (K.F.A.N.); (Y.X.); (B.D.); (S.P.)

2
Premageetha GT, Vinothkumar KR, Bose S. Exploring advances in single particle CryoEM with apoferritin: From blobs to true atomic resolution. Int J Biochem Cell Biol 2024; 169:106536. [PMID: 38307321 DOI: 10.1016/j.biocel.2024.106536] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2023] [Revised: 01/21/2024] [Accepted: 01/23/2024] [Indexed: 02/04/2024]
Abstract
Deciphering the three-dimensional structures of macromolecules is of paramount importance for gaining insights into their functions and roles in human health and disease. Single particle cryoEM has emerged as a powerful technique that enables direct visualization of macromolecules and their complexes and, through subsequent averaging, achieves near-atomic resolution. A major breakthrough was recently achieved with the determination of the apoferritin structure at true atomic resolution. In this review, we discuss the latest technological innovations across the entire single-particle workflow, which have been instrumental in driving the resolution revolution and in transforming cryoEM into a mainstream technique in structural biology. We illustrate these advancements using apoferritin, which has served as an excellent benchmark sample for assessing emerging technologies. We further explore whether the existing technology can routinely generate atomic structures of dynamic macromolecules that more accurately represent real-world samples, the limitations in the workflow, and the current approaches employed to overcome them.
Affiliation(s)
- Gowtham ThambraRajan Premageetha
- Institute for Stem Cell Science and Regenerative Medicine, GKVK Post, Bangalore 560065, India; Manipal Academy of Higher Education, Tiger Circle Road, Manipal, Karnataka 576104, India
- Kutti R Vinothkumar
- National Centre for Biological Sciences, Tata Institute of Fundamental Research, GKVK Post, Bangalore 560065, India
- Sucharita Bose
- Institute for Stem Cell Science and Regenerative Medicine, GKVK Post, Bangalore 560065, India

3
DiIorio MC, Kulczyk AW. Novel Artificial Intelligence-Based Approaches for Ab Initio Structure Determination and Atomic Model Building for Cryo-Electron Microscopy. Micromachines 2023; 14:1674. [PMID: 37763837 PMCID: PMC10534518 DOI: 10.3390/mi14091674] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 08/02/2023] [Revised: 08/21/2023] [Accepted: 08/25/2023] [Indexed: 09/29/2023]
Abstract
Single particle cryo-electron microscopy (cryo-EM) has emerged as the prevailing method for near-atomic structure determination, shedding light on the important molecular mechanisms of biological macromolecules. However, the inherent dynamics and structural variability of biological complexes coupled with the large number of experimental images generated by a cryo-EM experiment make data processing nontrivial. In particular, ab initio reconstruction and atomic model building remain major bottlenecks that demand substantial computational resources and manual intervention. Approaches utilizing recent innovations in artificial intelligence (AI) technology, particularly deep learning, have the potential to overcome the limitations that cannot be adequately addressed by traditional image processing approaches. Here, we review newly proposed AI-based methods for ab initio volume generation, heterogeneous 3D reconstruction, and atomic model building. We highlight the advancements made by the implementation of AI methods, as well as discuss remaining limitations and areas for future development.
Affiliation(s)
- Megan C. DiIorio
- Institute for Quantitative Biomedicine, Rutgers University, 174 Frelinghuysen Road, Piscataway, NJ 08854, USA
- Arkadiusz W. Kulczyk
- Institute for Quantitative Biomedicine, Rutgers University, 174 Frelinghuysen Road, Piscataway, NJ 08854, USA
- Department of Biochemistry & Microbiology, Rutgers University, 76 Lipman Drive, New Brunswick, NJ 08901, USA

4
Conesa P, Fonseca YC, Jiménez de la Morena J, Sharov G, de la Rosa-Trevín JM, Cuervo A, García Mena A, Rodríguez de Francisco B, del Hoyo D, Herreros D, Marchan D, Strelak D, Fernández-Giménez E, Ramírez-Aportela E, de Isidro-Gómez FP, Sánchez I, Krieger J, Vilas JL, del Cano L, Gragera M, Iceta M, Martínez M, Losana P, Melero R, Marabini R, Carazo JM, Sorzano COS. Scipion3: A workflow engine for cryo-electron microscopy image processing and structural biology. Biological Imaging 2023; 3:e13. [PMID: 38510163 PMCID: PMC10951921 DOI: 10.1017/s2633903x23000132] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 01/10/2023] [Revised: 05/29/2023] [Accepted: 06/15/2023] [Indexed: 03/22/2024]
Abstract
Image-processing pipelines require the design of complex workflows combining many different steps that bring the raw acquired data to a final result with biological meaning. In the image-processing domain of cryo-electron microscopy single-particle analysis (cryo-EM SPA), hundreds of steps must be performed to obtain the three-dimensional structure of a biological macromolecule by integrating data spread over thousands of micrographs containing millions of copies of allegedly the same macromolecule. The execution of such complicated workflows demands a specific tool to keep track of all these steps performed. Additionally, due to the extremely low signal-to-noise ratio (SNR), the estimation of any image parameter is heavily affected by noise resulting in a significant fraction of incorrect estimates. Although low SNR and processing millions of images by hundreds of sequential steps requiring substantial computational resources are specific to cryo-EM, these characteristics may be shared by other biological imaging domains. Here, we present Scipion, a Python generic open-source workflow engine specifically adapted for image processing. Its main characteristics are: (a) interoperability, (b) smart object model, (c) gluing operations, (d) comparison operations, (e) wide set of domain-specific operations, (f) execution in streaming, (g) smooth integration in high-performance computing environments, (h) execution with and without graphical capabilities, (i) flexible visualization, (j) user authentication and private access to private data, (k) scripting capabilities, (l) high performance, (m) traceability, (n) reproducibility, (o) self-reporting, (p) reusability, (q) extensibility, (r) software updates, and (s) non-restrictive software licensing.
Affiliation(s)
- Pablo Conesa
- National Center of Biotechnology (CNB-CSIC), Madrid, Spain
- Grigory Sharov
- Structural Studies Division, MRC Laboratory of Molecular Biology, Cambridge, United Kingdom
- Ana Cuervo
- National Center of Biotechnology (CNB-CSIC), Madrid, Spain
- David Herreros
- National Center of Biotechnology (CNB-CSIC), Madrid, Spain
- Daniel Marchan
- National Center of Biotechnology (CNB-CSIC), Madrid, Spain
- David Strelak
- National Center of Biotechnology (CNB-CSIC), Madrid, Spain
- Masaryk University, Brno, Czech Republic
- Irene Sánchez
- National Center of Biotechnology (CNB-CSIC), Madrid, Spain
- James Krieger
- National Center of Biotechnology (CNB-CSIC), Madrid, Spain
- Laura del Cano
- National Center of Biotechnology (CNB-CSIC), Madrid, Spain
- Marcos Gragera
- National Center of Biotechnology (CNB-CSIC), Madrid, Spain
- Mikel Iceta
- National Center of Biotechnology (CNB-CSIC), Madrid, Spain
- Marta Martínez
- National Center of Biotechnology (CNB-CSIC), Madrid, Spain
- Roberto Melero
- National Center of Biotechnology (CNB-CSIC), Madrid, Spain
- Roberto Marabini
- National Center of Biotechnology (CNB-CSIC), Madrid, Spain
- Superior Polytechnic School, Autonomous University of Madrid, Madrid, Spain

5
Wang X, Lu Y, Lin X, Li J, Zhang Z. An Unsupervised Classification Algorithm for Heterogeneous Cryo-EM Projection Images Based on Autoencoders. Int J Mol Sci 2023; 24:8380. [PMID: 37176089 PMCID: PMC10179202 DOI: 10.3390/ijms24098380] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2023] [Revised: 04/29/2023] [Accepted: 04/30/2023] [Indexed: 05/15/2023]
Abstract
Heterogeneous three-dimensional (3D) reconstruction in single-particle cryo-electron microscopy (cryo-EM) is an important but very challenging technique for recovering the conformational heterogeneity of flexible biological macromolecules such as proteins in different functional states. Heterogeneous projection image classification is a feasible solution to solve the structural heterogeneity problem in single-particle cryo-EM. The majority of heterogeneous projection image classification methods are developed using supervised learning technology or require a large amount of a priori knowledge, such as the orientations or common lines of the projection images, which leads to certain limitations in their practical applications. In this paper, an unsupervised heterogeneous cryo-EM projection image classification algorithm based on autoencoders is proposed, which only needs to know the number of heterogeneous 3D structures in the dataset and does not require any labeling information of the projection images or other a priori knowledge. A simple autoencoder with multi-layer perceptrons trained in iterative mode and a complex autoencoder with residual networks trained in one-pass learning mode are implemented to convert heterogeneous projection images into latent variables. The extracted high-dimensional features are reduced to two dimensions using the uniform manifold approximation and projection dimensionality reduction algorithm, and then clustered using the spectral clustering algorithm. The proposed algorithm is applied to two heterogeneous cryo-EM datasets for heterogeneous 3D reconstruction. Experimental results show that the proposed algorithm can effectively extract category features of heterogeneous projection images and achieve high classification and reconstruction accuracy, indicating that the proposed algorithm is effective for heterogeneous 3D reconstruction in single-particle cryo-EM.
Affiliation(s)
- Xiangwen Wang
- College of Computer Science and Engineering, Northwest Normal University, Lanzhou 730070, China
- Yonggang Lu
- School of Information Science and Engineering, Lanzhou University, Lanzhou 730000, China
- Xianghong Lin
- College of Computer Science and Engineering, Northwest Normal University, Lanzhou 730070, China
- Jianwei Li
- School of Information Science and Engineering, Lanzhou University, Lanzhou 730000, China
- Zequn Zhang
- College of Computer Science and Engineering, Northwest Normal University, Lanzhou 730070, China

6
Levy A, Poitevin F, Martel J, Nashed Y, Peck A, Miolane N, Ratner D, Dunne M, Wetzstein G. CryoAI: Amortized Inference of Poses for Ab Initio Reconstruction of 3D Molecular Volumes from Real Cryo-EM Images. Computer Vision - ECCV ... : ... European Conference on Computer Vision : Proceedings. European Conference on Computer Vision 2022; 13681:540-557. [PMID: 36745134 PMCID: PMC9897229 DOI: 10.1007/978-3-031-19803-8_32] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 02/07/2023]
Abstract
Cryo-electron microscopy (cryo-EM) has become a tool of fundamental importance in structural biology, helping us understand the basic building blocks of life. The algorithmic challenge of cryo-EM is to jointly estimate the unknown 3D poses and the 3D electron scattering potential of a biomolecule from millions of extremely noisy 2D images. Existing reconstruction algorithms, however, cannot easily keep pace with the rapidly growing size of cryo-EM datasets due to their high computational and memory cost. We introduce cryoAI, an ab initio reconstruction algorithm for homogeneous conformations that uses direct gradient-based optimization of particle poses and the electron scattering potential from single-particle cryo-EM data. CryoAI combines a learned encoder that predicts the poses of each particle image with a physics-based decoder to aggregate each particle image into an implicit representation of the scattering potential volume. This volume is stored in the Fourier domain for computational efficiency and leverages a modern coordinate network architecture for memory efficiency. Combined with a symmetrized loss function, this framework achieves results of a quality on par with state-of-the-art cryo-EM solvers for both simulated and experimental data, one order of magnitude faster for large datasets and with significantly lower memory requirements than existing methods.
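The abstract above notes that the volume is stored in the Fourier domain for efficiency. One standard result that makes Fourier-domain representations convenient in cryo-EM is the Fourier slice theorem: the 2D Fourier transform of a projection equals a central slice of the volume's 3D Fourier transform. The sketch below checks this numerically on a toy volume. It is an illustration only and not cryoAI's implementation: the random volume is hypothetical, the projection is taken along a grid axis rather than an arbitrary pose, and the contrast transfer function and noise are ignored.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical toy density on a 16^3 grid, indexed as [z, y, x].
volume = rng.normal(size=(16, 16, 16))

# Real-space route: a projection integrates the volume along the
# viewing direction (here the z axis).
projection = volume.sum(axis=0)

# Fourier-space route: take the 3D FFT once, then read off the
# central slice k_z = 0 instead of integrating.
central_slice = np.fft.fftn(volume)[0, :, :]

# Fourier slice theorem: both routes give the same 2D spectrum.
slices_match = np.allclose(np.fft.fft2(projection), central_slice)
```

For an arbitrary pose the slice is a rotated plane through the origin, so simulating a projection only requires evaluating the Fourier-domain volume on a 2D set of coordinates rather than integrating over a 3D grid.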
Affiliation(s)
- Axel Levy
- LCLS, SLAC National Accelerator Laboratory, Menlo Park, CA, USA
- Stanford University, Department of Electrical Engineering, Stanford, CA, USA
- Julien Martel
- Stanford University, Department of Electrical Engineering, Stanford, CA, USA
- Youssef Nashed
- ML Initiative, SLAC National Accelerator Laboratory, Menlo Park, CA, USA
- Ariana Peck
- LCLS, SLAC National Accelerator Laboratory, Menlo Park, CA, USA
- Nina Miolane
- University of California Santa Barbara, Department of Electrical and Computer Engineering, Santa Barbara, CA, USA
- Daniel Ratner
- ML Initiative, SLAC National Accelerator Laboratory, Menlo Park, CA, USA
- Mike Dunne
- LCLS, SLAC National Accelerator Laboratory, Menlo Park, CA, USA
- Gordon Wetzstein
- Stanford University, Department of Electrical Engineering, Stanford, CA, USA

7
Gerle C, Kishikawa JI, Yamaguchi T, Nakanishi A, Çoruh O, Makino F, Miyata T, Kawamoto A, Yokoyama K, Namba K, Kurisu G, Kato T. Structures of Multisubunit Membrane Complexes With the CRYO ARM 200. Microscopy (Oxf) 2022; 71:249-261. [PMID: 35861182 PMCID: PMC9535789 DOI: 10.1093/jmicro/dfac037] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2022] [Revised: 07/18/2022] [Accepted: 07/20/2022] [Indexed: 11/18/2022]
Abstract
Progress in structural membrane biology has been significantly accelerated by the ongoing ‘Resolution Revolution’ in cryo-electron microscopy (cryo-EM). In particular, structure determination by single-particle analysis has evolved into the most powerful method for atomic model building of multisubunit membrane protein complexes. This has created an ever-increasing demand for cryo-EM machine time, and satisfying it requires new and affordable cryo-electron microscopes. Here, we review our experience in using the JEOL CRYO ARM 200 prototype for structure determination by single-particle analysis of three different multisubunit membrane complexes: the Thermus thermophilus V-type ATPase VO complex, the Thermosynechococcus elongatus photosystem I monomer and the flagellar motor lipopolysaccharide peptidoglycan ring (LP ring) from Salmonella enterica.
Affiliation(s)
- Christoph Gerle
- Institute for Protein Research, Osaka University, 3-2 Yamada Oka, Suita, Osaka 565-0871, Japan; RIKEN SPring-8 Center, Life Science Research Infrastructure Group, Sayo-gun, Hyogo 679-5148, Japan
- Jun-Ichi Kishikawa
- Institute for Protein Research, Osaka University, 3-2 Yamada Oka, Suita, Osaka 565-0871, Japan
- Tomoko Yamaguchi
- Graduate School of Frontier Biosciences, Osaka University, Suita, Japan
- Atsuko Nakanishi
- Department of Molecular Biosciences, Kyoto Sangyo University, Kamigamo-Motoyama, Kyoto, Japan; Research Center for Ultra-High Voltage Electron Microscopy, Osaka University, Ibaraki, Osaka 567-0047, Japan
- Orkun Çoruh
- Institute for Protein Research, Osaka University, 3-2 Yamada Oka, Suita, Osaka 565-0871, Japan; Institute of Science and Technology Austria, Klosterneuburg, 3400 Austria
- Fumiaki Makino
- Graduate School of Frontier Biosciences, Osaka University, Suita, Japan; JEOL Ltd., Akishima, Tokyo, Japan
- Tomoko Miyata
- Graduate School of Frontier Biosciences, Osaka University, Suita, Japan
- Akihiro Kawamoto
- Institute for Protein Research, Osaka University, 3-2 Yamada Oka, Suita, Osaka 565-0871, Japan
- Ken Yokoyama
- Department of Molecular Biosciences, Kyoto Sangyo University, Kamigamo-Motoyama, Kyoto, Japan
- Keiichi Namba
- Graduate School of Frontier Biosciences, Osaka University, Suita, Japan; RIKEN Center for Biosystems Dynamics Research, Suita, Osaka, Japan; JEOL YOKOGUSHI Research Alliance Laboratories, Osaka University, Suita, Osaka, Japan
- Genji Kurisu
- Institute for Protein Research, Osaka University, 3-2 Yamada Oka, Suita, Osaka 565-0871, Japan
- Takayuki Kato
- Institute for Protein Research, Osaka University, 3-2 Yamada Oka, Suita, Osaka 565-0871, Japan; Graduate School of Frontier Biosciences, Osaka University, Suita, Japan