1. Jin W, Zhou Y, Bartesaghi A. Accurate size-based protein localization from cryo-ET tomograms. J Struct Biol X 2024; 10:100104. [PMID: 39044770 PMCID: PMC11263962 DOI: 10.1016/j.yjsbx.2024.100104] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/25/2024]
Abstract
Cryo-electron tomography (cryo-ET) combined with sub-tomogram averaging (STA) allows the determination of protein structures imaged within the native context of the cell at near-atomic resolution. Particle picking is an essential step in the cryo-ET/STA image analysis pipeline that consists of locating the positions of proteins within crowded cellular tomograms so that they can be aligned and averaged in 3D to improve resolution. While extensive work in 2D particle picking has been done in the context of single-particle cryo-EM, comparatively fewer strategies have been proposed to pick particles from 3D tomograms, in part due to the challenges associated with working with noisy 3D volumes affected by the missing wedge. While strategies based on 3D template-matching and deep learning are commonly used, these methods are computationally expensive and require either an external template or manual labelling, which can bias the results and limit their applicability. Here, we propose a size-based method to pick particles from tomograms that is fast, accurate, and does not require external templates or user-provided labels. We compare the performance of our approach against a commonly used algorithm based on deep learning, crYOLO, and show that our method: i) has higher detection accuracy, ii) does not require user input for labeling or time-consuming training, and iii) runs efficiently on non-specialized CPU hardware. We demonstrate the effectiveness of our approach by automatically detecting particles from tomograms representing different types of samples and using these particles to determine the high-resolution structures of ribosomes imaged in vitro and in situ.
Affiliation(s)
- Weisheng Jin
  - Department of Computer Science, Duke University, Durham, USA
- Ye Zhou
  - Department of Computer Science, Duke University, Durham, USA
- Alberto Bartesaghi
  - Department of Computer Science, Duke University, Durham, USA
  - Department of Biochemistry, Duke University School of Medicine, Durham, USA
  - Department of Electrical and Computer Engineering, Pratt School of Engineering, Duke University, Durham, USA
2. Galaz-Montoya JG. The advent of preventive high-resolution structural histopathology by artificial-intelligence-powered cryogenic electron tomography. Front Mol Biosci 2024; 11:1390858. [PMID: 38868297 PMCID: PMC11167099 DOI: 10.3389/fmolb.2024.1390858] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/24/2024] [Accepted: 05/08/2024] [Indexed: 06/14/2024]
Abstract
Advances in cryogenic electron microscopy (cryoEM) single particle analysis have revolutionized structural biology by facilitating the in vitro determination of atomic- and near-atomic-resolution structures for fully hydrated macromolecular complexes exhibiting compositional and conformational heterogeneity across a wide range of sizes. Cryogenic electron tomography (cryoET) and subtomogram averaging are rapidly progressing toward delivering similar insights for macromolecular complexes in situ, without requiring tags or harsh biochemical purification. Furthermore, cryoET enables the visualization of cellular and tissue phenotypes directly at molecular, nanometric resolution without chemical fixation or staining artifacts. This forward-looking review covers recent developments in cryoEM/ET and related technologies such as cryogenic focused ion beam milling scanning electron microscopy and correlative light microscopy, increasingly enhanced and supported by artificial intelligence algorithms. Their potential application to emerging concepts is discussed, primarily the prospect of complementing medical histopathology analysis. Machine learning solutions are poised to address current challenges posed by "big data" in cryoET of tissues, cells, and macromolecules, offering the promise of enabling novel, quantitative insights into disease processes, which may translate into the clinic and lead to improved diagnostics and targeted therapeutics.
Affiliation(s)
- Jesús G. Galaz-Montoya
  - Department of Bioengineering, James H. Clark Center, Stanford University, Stanford, CA, United States
3. Gyawali R, Dhakal A, Wang L, Cheng J. CryoSegNet: accurate cryo-EM protein particle picking by integrating the foundational AI image segmentation model and attention-gated U-Net. Brief Bioinform 2024; 25:bbae282. [PMID: 38860738 PMCID: PMC11165428 DOI: 10.1093/bib/bbae282] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/17/2024] [Revised: 05/15/2024] [Accepted: 05/29/2024] [Indexed: 06/12/2024]
Abstract
Picking protein particles in cryo-electron microscopy (cryo-EM) micrographs is a crucial step in the cryo-EM-based structure determination. However, existing methods trained on a limited amount of cryo-EM data still cannot accurately pick protein particles from noisy cryo-EM images. The general foundational artificial intelligence-based image segmentation model such as Meta's Segment Anything Model (SAM) cannot segment protein particles well because their training data do not include cryo-EM images. Here, we present a novel approach (CryoSegNet) of integrating an attention-gated U-shape network (U-Net) specially designed and trained for cryo-EM particle picking and the SAM. The U-Net is first trained on a large cryo-EM image dataset and then used to generate input from original cryo-EM images for SAM to make particle pickings. CryoSegNet shows both high precision and recall in segmenting protein particles from cryo-EM micrographs, irrespective of protein type, shape and size. On several independent datasets of various protein types, CryoSegNet outperforms two top machine learning particle pickers crYOLO and Topaz as well as SAM itself. The average resolution of density maps reconstructed from the particles picked by CryoSegNet is 3.33 Å, 7% better than 3.58 Å of Topaz and 14% better than 3.87 Å of crYOLO. It is publicly available at https://github.com/jianlin-cheng/CryoSegNet.
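
The percentages quoted in this abstract can be checked with a few lines of arithmetic. The sketch below assumes that "better" means the relative reduction in the resolution value (lower Å is better), measured against the competing method; the function name `relative_improvement` is made up for illustration and does not come from the CryoSegNet code.

```python
def relative_improvement(reference: float, candidate: float) -> float:
    """Fractional reduction of `candidate` relative to `reference`.

    For resolution in angstroms a smaller value is better, so a positive
    result means the candidate resolves finer detail than the reference.
    """
    return (reference - candidate) / reference


# Average resolutions (in angstroms) as reported in the abstract.
cryosegnet, topaz, cryolo = 3.33, 3.58, 3.87

vs_topaz = relative_improvement(topaz, cryosegnet)
vs_cryolo = relative_improvement(cryolo, cryosegnet)

print(f"vs Topaz:  {vs_topaz:.1%}")   # about 7%
print(f"vs crYOLO: {vs_cryolo:.1%}")  # about 14%
```

Under that assumption the numbers agree with the abstract's rounded figures of 7% and 14%.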
Affiliation(s)
- Rajan Gyawali
  - Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211, United States
  - NextGen Precision Health, University of Missouri, Columbia, MO 65211, United States
- Ashwin Dhakal
  - Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211, United States
  - NextGen Precision Health, University of Missouri, Columbia, MO 65211, United States
- Liguo Wang
  - Laboratory for BioMolecular Structure (LBMS), Brookhaven National Laboratory, Upton, NY 11973, United States
- Jianlin Cheng
  - Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211, United States
  - NextGen Precision Health, University of Missouri, Columbia, MO 65211, United States
4. Lightowler M, Li S, Ou X, Cho J, Liu B, Li A, Hofer G, Xu J, Yang T, Zou X, Lu M, Xu H. Phase Identification and Discovery of an Elusive Polymorph of Drug-Polymer Inclusion Complex Using Automated 3D Electron Diffraction. Angew Chem Int Ed Engl 2024; 63:e202317695. [PMID: 38380831 DOI: 10.1002/anie.202317695] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/20/2023] [Revised: 02/16/2024] [Accepted: 02/17/2024] [Indexed: 02/22/2024]
Abstract
3D electron diffraction (3D ED) has shown great potential in crystal structure determination in materials, small organic molecules, and macromolecules. In this work, an automated, low-dose and low-bias 3D ED protocol has been implemented to identify six phases from a multiple-phase melt-crystallisation product of an active pharmaceutical ingredient, griseofulvin (GSF). Batch data collection under low-dose conditions using a widely available commercial software was combined with automated data analysis to collect and process over 230 datasets in three days. Accurate unit cell parameters obtained from 3D ED data allowed direct phase identification of GSF Forms III, I and the known GSF inclusion complex (IC) with polyethylene glycol (PEG) (GSF-PEG IC-I), as well as three minor phases, namely GSF Forms II, V and an elusive new phase, GSF-PEG IC-II. Their structures were then directly determined by 3D ED. Furthermore, we reveal how the stabilities of the two GSF-PEG IC polymorphs are closely related to their crystal structures. These results demonstrate the power of automated 3D ED for accurate phase identification and direct structure determination of complex, beam-sensitive crystallisation products, which is significant for drug development where solid form screening is crucial for the overall efficacy of the drug product.
Affiliation(s)
- Molly Lightowler
  - Department of Materials and Environmental Chemistry, Stockholm University, Stockholm, SE-106 91, Sweden
- Shuting Li
  - School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou, 510006, China
- Xiao Ou
  - School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou, 510006, China
- Jungyoun Cho
  - Department of Materials and Environmental Chemistry, Stockholm University, Stockholm, SE-106 91, Sweden
- Binbin Liu
  - School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou, 510006, China
- Ao Li
  - School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou, 510006, China
- Gerhard Hofer
  - Department of Materials and Environmental Chemistry, Stockholm University, Stockholm, SE-106 91, Sweden
- Jiaoyan Xu
  - Department of Materials and Environmental Chemistry, Stockholm University, Stockholm, SE-106 91, Sweden
- Taimin Yang
  - Department of Materials and Environmental Chemistry, Stockholm University, Stockholm, SE-106 91, Sweden
- Xiaodong Zou
  - Department of Materials and Environmental Chemistry, Stockholm University, Stockholm, SE-106 91, Sweden
- Ming Lu
  - School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou, 510006, China
- Hongyi Xu
  - Department of Materials and Environmental Chemistry, Stockholm University, Stockholm, SE-106 91, Sweden
5. Bai R, Yuan M, Zhang P, Luo T, Shi Y, Wan R. Structural basis of U12-type intron engagement by the fully assembled human minor spliceosome. Science 2024; 383:1245-1252. [PMID: 38484052 DOI: 10.1126/science.adn7272] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/24/2023] [Accepted: 02/09/2024] [Indexed: 03/19/2024]
Abstract
The minor spliceosome, which is responsible for the splicing of U12-type introns, comprises five small nuclear RNAs (snRNAs), of which only one is shared with the major spliceosome. In this work, we report the 3.3-angstrom cryo-electron microscopy structure of the fully assembled human minor spliceosome pre-B complex. The atomic model includes U11 small nuclear ribonucleoprotein (snRNP), U12 snRNP, and U4atac/U6atac.U5 tri-snRNP. U11 snRNA is recognized by five U11-specific proteins (20K, 25K, 35K, 48K, and 59K) and the heptameric Sm ring. The 3' half of the 5'-splice site forms a duplex with U11 snRNA; the 5' half is recognized by U11-35K, U11-48K, and U11 snRNA. Two proteins, CENATAC and DIM2/TXNL4B, specifically associate with the minor tri-snRNP. A structural analysis uncovered how two conformationally similar tri-snRNPs are differentiated by the minor and major prespliceosomes for assembly.
Affiliation(s)
- Rui Bai
  - Research Center for Industries of the Future, Key Zhejiang Key Laboratory of Structural Biology, School of Life Sciences, Westlake University, Xihu District, Hangzhou 310024, Zhejiang Province, China
  - Westlake Laboratory of Life Sciences and Biomedicine, Xihu District, Hangzhou 310024, Zhejiang Province, China
  - Institute of Biology, Westlake Institute for Advanced Study, Xihu District, Hangzhou 310024, Zhejiang Province, China
- Meng Yuan
  - Beijing Frontier Research Center for Biological Structure, Tsinghua-Peking Joint Center for Life Sciences, School of Life Sciences, Tsinghua University, Beijing 100084, China
- Pu Zhang
  - Beijing Frontier Research Center for Biological Structure, Tsinghua-Peking Joint Center for Life Sciences, School of Life Sciences, Tsinghua University, Beijing 100084, China
- Ting Luo
  - Research Center for Industries of the Future, Key Zhejiang Key Laboratory of Structural Biology, School of Life Sciences, Westlake University, Xihu District, Hangzhou 310024, Zhejiang Province, China
  - Westlake Laboratory of Life Sciences and Biomedicine, Xihu District, Hangzhou 310024, Zhejiang Province, China
  - Institute of Biology, Westlake Institute for Advanced Study, Xihu District, Hangzhou 310024, Zhejiang Province, China
- Yigong Shi
  - Research Center for Industries of the Future, Key Zhejiang Key Laboratory of Structural Biology, School of Life Sciences, Westlake University, Xihu District, Hangzhou 310024, Zhejiang Province, China
  - Westlake Laboratory of Life Sciences and Biomedicine, Xihu District, Hangzhou 310024, Zhejiang Province, China
  - Institute of Biology, Westlake Institute for Advanced Study, Xihu District, Hangzhou 310024, Zhejiang Province, China
  - Beijing Frontier Research Center for Biological Structure, Tsinghua-Peking Joint Center for Life Sciences, School of Life Sciences, Tsinghua University, Beijing 100084, China
- Ruixue Wan
  - Research Center for Industries of the Future, Key Zhejiang Key Laboratory of Structural Biology, School of Life Sciences, Westlake University, Xihu District, Hangzhou 310024, Zhejiang Province, China
  - Westlake Laboratory of Life Sciences and Biomedicine, Xihu District, Hangzhou 310024, Zhejiang Province, China
  - Institute of Biology, Westlake Institute for Advanced Study, Xihu District, Hangzhou 310024, Zhejiang Province, China
6. Huang Q, Zhou Y, Liu HF, Bartesaghi A. Joint micrograph denoising and protein localization in cryo-electron microscopy. Biol Imaging 2024; 4:e4. [PMID: 38571546 PMCID: PMC10988173 DOI: 10.1017/s2633903x24000035] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2023] [Revised: 12/30/2023] [Accepted: 02/05/2024] [Indexed: 04/05/2024]
Abstract
Cryo-electron microscopy (cryo-EM) is an imaging technique that allows the visualization of proteins and macromolecular complexes at near-atomic resolution. The low electron doses used to prevent radiation damage to the biological samples result in images where the power of noise is 100 times stronger than that of the signal. Accurate identification of proteins from these low signal-to-noise ratio (SNR) images is a critical task, as the detected positions serve as inputs for the downstream 3D structure determination process. Current methods either fail to identify all true positives or result in many false positives, especially when analyzing images from smaller-sized proteins that exhibit extremely low contrast, or require manual labeling that can take days to complete. Acknowledging the fact that accurate protein identification is dependent upon the visual interpretability of micrographs, we propose a framework that can perform denoising and detection in a joint manner and enable particle localization under extremely low SNR conditions using self-supervised denoising and particle identification from sparsely annotated data. We validate our approach on three challenging single-particle cryo-EM datasets and projection images from one cryo-electron tomography dataset with extremely low SNR, showing that it outperforms existing state-of-the-art methods used for cryo-EM image analysis by a significant margin. We also evaluate the performance of our algorithm under decreasing SNR conditions and show that our method is more robust to noise than competing methods.
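
To put the "100 times stronger" figure on the usual logarithmic scale, here is a minimal sketch. It assumes the conventional definition of SNR as the ratio of signal power to noise power; the helper name `snr_db` is illustrative and not taken from the paper.

```python
import math


def snr_db(signal_power: float, noise_power: float) -> float:
    """Signal-to-noise ratio in decibels from a ratio of powers."""
    return 10.0 * math.log10(signal_power / noise_power)


# Noise power 100 times the signal power, as stated in the abstract.
snr = snr_db(signal_power=1.0, noise_power=100.0)
print(f"SNR = {snr:.1f} dB")  # about -20 dB, i.e. a linear SNR of 0.01
```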
Affiliation(s)
- Qinwen Huang
  - Department of Computer Science, Duke University, Durham, NC 27708, USA
- Ye Zhou
  - Department of Computer Science, Duke University, Durham, NC 27708, USA
- Hsuan-Fu Liu
  - Department of Biochemistry, Duke University School of Medicine, Durham, NC 27705, USA
- Alberto Bartesaghi
  - Department of Computer Science, Duke University, Durham, NC 27708, USA
  - Department of Biochemistry, Duke University School of Medicine, Durham, NC 27705, USA
  - Department of Electrical and Computer Engineering, Duke University, Durham, NC 27708, USA
7. Cebi E, Lee J, Subramani VK, Bak N, Oh C, Kim KK. Cryo-electron microscopy-based drug design. Front Mol Biosci 2024; 11:1342179. [PMID: 38501110 PMCID: PMC10945328 DOI: 10.3389/fmolb.2024.1342179] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2023] [Accepted: 01/31/2024] [Indexed: 03/20/2024]
Abstract
Structure-based drug design (SBDD) has gained popularity owing to its ability to develop more potent drugs compared to conventional drug-discovery methods. The success of SBDD relies heavily on obtaining the three-dimensional structures of drug targets. X-ray crystallography is the primary method used for solving structures and aiding the SBDD workflow; however, it is not suitable for all targets. With the resolution revolution, enabling routine high-resolution reconstruction of structures, cryogenic electron microscopy (cryo-EM) has emerged as a promising alternative and has attracted increasing attention in SBDD. Cryo-EM offers various advantages over X-ray crystallography and can potentially replace X-ray crystallography in SBDD. To fully utilize cryo-EM in drug discovery, understanding the strengths and weaknesses of this technique and noting the key advancements in the field are crucial. This review provides an overview of the general workflow of cryo-EM in SBDD and highlights technical innovations that enable its application in drug design. Furthermore, the most recent achievements in the cryo-EM methodology for drug discovery are discussed, demonstrating the potential of this technique for advancing drug development. By understanding the capabilities and advancements of cryo-EM, researchers can leverage the benefits of designing more effective drugs. This review concludes with a discussion of the future perspectives of cryo-EM-based SBDD, emphasizing the role of this technique in driving innovations in drug discovery and development. The integration of cryo-EM into the drug design process holds great promise for accelerating the discovery of new and improved therapeutic agents to combat various diseases.
Affiliation(s)
- Changsuk Oh
  - Department of Precision Medicine, Sungkyunkwan University School of Medicine, Suwon, Republic of Korea
- Kyeong Kyu Kim
  - Department of Precision Medicine, Sungkyunkwan University School of Medicine, Suwon, Republic of Korea
8. Dhakal A, Gyawali R, Wang L, Cheng J. CryoTransformer: a transformer model for picking protein particles from cryo-EM micrographs. Bioinformatics 2024; 40:btae109. [PMID: 38407301 PMCID: PMC10937899 DOI: 10.1093/bioinformatics/btae109] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2023] [Revised: 01/28/2024] [Accepted: 02/22/2024] [Indexed: 02/27/2024]
Abstract
MOTIVATION: Cryo-electron microscopy (cryo-EM) is a powerful technique for determining the structures of large protein complexes. Picking single protein particles from cryo-EM micrographs (images) is a crucial step in reconstructing protein structures from them. However, the widely used template-based particle picking process requires some manual particle picking and is labor-intensive and time-consuming. Though machine learning and artificial intelligence (AI) can potentially automate particle picking, the current AI methods pick particles with low precision or low recall. The erroneously picked particles can severely reduce the quality of reconstructed protein structures, especially for the micrographs with low signal-to-noise ratio.
RESULTS: To address these shortcomings, we devised CryoTransformer based on transformers, residual networks, and image processing techniques to accurately pick protein particles from cryo-EM micrographs. CryoTransformer was trained and tested on the largest labeled cryo-EM protein particle dataset, CryoPPP. It outperforms the current state-of-the-art machine learning methods of particle picking in terms of the resolution of 3D density maps reconstructed from the picked particles as well as F1-score, and is poised to facilitate the automation of the cryo-EM protein particle picking.
AVAILABILITY AND IMPLEMENTATION: The source code and data for CryoTransformer are openly available at: https://github.com/jianlin-cheng/CryoTransformer.
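
The precision, recall and F1-score mentioned in this abstract have standard definitions, sketched below. The function name `picking_scores` and the particle counts are hypothetical, chosen only for illustration; they are not results from the paper, and how a pick is matched to a ground-truth particle is left outside this sketch.

```python
def picking_scores(true_pos: int, false_pos: int, false_neg: int) -> tuple[float, float, float]:
    """Return (precision, recall, F1) from particle-picking counts.

    true_pos:  picks that coincide with a real particle
    false_pos: picks with no matching real particle
    false_neg: real particles that were not picked
    """
    picked = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / picked if picked else 0.0
    recall = true_pos / actual if actual else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2.0 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts: 80 correct picks, 20 spurious picks, 20 missed particles.
precision, recall, f1 = picking_scores(true_pos=80, false_pos=20, false_neg=20)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

Because F1 is a harmonic mean, a picker scores well only when precision and recall are both high, which matters here since the abstract faults existing methods for being low on one or the other.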
Affiliation(s)
- Ashwin Dhakal
  - Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211, United States
  - NextGen Precision Health, University of Missouri, Columbia, MO 65211, United States
- Rajan Gyawali
  - Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211, United States
  - NextGen Precision Health, University of Missouri, Columbia, MO 65211, United States
- Liguo Wang
  - Laboratory for BioMolecular Structure (LBMS), Brookhaven National Laboratory, Upton, NY 11973, United States
- Jianlin Cheng
  - Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211, United States
  - NextGen Precision Health, University of Missouri, Columbia, MO 65211, United States
9. de la Cruz MJ, Eng ET. Scaling up cryo-EM for biology and chemistry: The journey from niche technology to mainstream method. Structure 2023; 31:1487-1498. [PMID: 37820731 PMCID: PMC10841453 DOI: 10.1016/j.str.2023.09.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/29/2023] [Revised: 08/31/2023] [Accepted: 09/14/2023] [Indexed: 10/13/2023]
Abstract
Cryoelectron microscopy (cryo-EM) methods have made meaningful contributions in a wide variety of scientific research fields. In structural biology, cryo-EM routinely elucidates molecular structure from isolated biological macromolecular complexes or in a cellular context by harnessing the high-resolution power of the electron in order to image samples in a frozen, hydrated environment. For structural chemistry, the cryo-EM method popularly known as microcrystal electron diffraction (MicroED) has facilitated atomic structure generation of peptides and small molecules from their three-dimensional crystal forms. As cryo-EM has grown from an emerging technology, it has undergone modernization to enable multimodal transmission electron microscopy (TEM) techniques becoming more routine, reproducible, and accessible to accelerate research across multiple disciplines. We review recent advances in modern cryo-EM and assess how they are contributing to the future of the field with an eye to the past.
Affiliation(s)
- M Jason de la Cruz
  - Structural Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
- Edward T Eng
  - Simons Electron Microscopy Center, New York Structural Biology Center, New York, NY 10027, USA
10. Dhakal A, Gyawali R, Wang L, Cheng J. CryoTransformer: A Transformer Model for Picking Protein Particles from Cryo-EM Micrographs. bioRxiv 2023:2023.10.19.563155. [PMID: 37961171 PMCID: PMC10634673 DOI: 10.1101/2023.10.19.563155] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 11/15/2023]
Abstract
Cryo-electron microscopy (cryo-EM) is a powerful technique for determining the structures of large protein complexes. Picking single protein particles from cryo-EM micrographs (images) is a crucial step in reconstructing protein structures from them. However, the widely used template-based particle picking process requires some manual particle picking and is labor-intensive and time-consuming. Though machine learning and artificial intelligence (AI) can potentially automate particle picking, the current AI methods pick particles with low precision or low recall. The erroneously picked particles can severely reduce the quality of reconstructed protein structures, especially for the micrographs with low signal-to-noise (SNR) ratios. To address these shortcomings, we devised CryoTransformer based on transformers, residual networks, and image processing techniques to accurately pick protein particles from cryo-EM micrographs. CryoTransformer was trained and tested on the largest labelled cryo-EM protein particle dataset - CryoPPP. It outperforms the current state-of-the-art machine learning methods of particle picking in terms of the resolution of 3D density maps reconstructed from the picked particles as well as F1-score and is poised to facilitate the automation of the cryo-EM protein particle picking.
Affiliation(s)
- Ashwin Dhakal
  - Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211, USA
  - NextGen Precision Health, University of Missouri, Columbia, MO 65211, USA
- Rajan Gyawali
  - Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211, USA
  - NextGen Precision Health, University of Missouri, Columbia, MO 65211, USA
- Liguo Wang
  - Laboratory for BioMolecular Structure (LBMS), Brookhaven National Laboratory, Upton, NY 11973, USA
- Jianlin Cheng
  - Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211, USA
  - NextGen Precision Health, University of Missouri, Columbia, MO 65211, USA
11. Poger D, Yen L, Braet F. Big data in contemporary electron microscopy: challenges and opportunities in data transfer, compute and management. Histochem Cell Biol 2023; 160:169-192. [PMID: 37052655 PMCID: PMC10492738 DOI: 10.1007/s00418-023-02191-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 03/21/2023] [Indexed: 04/14/2023]
Abstract
The second decade of the twenty-first century witnessed a new challenge in the handling of microscopy data. Big data, data deluge, large data, data compliance, data analytics, data integrity, data interoperability, data retention and data lifecycle are terms that have introduced themselves to the electron microscopy sciences. This is largely attributed to the booming development of new microscopy hardware tools. As a result, large digital image files with an average size of one terabyte within a single acquisition session are not uncommon nowadays, especially in the field of cryogenic electron microscopy. This brings along numerous challenges in data transfer, compute and management. In this review, we will discuss in detail the current state of international knowledge on big data in contemporary electron microscopy and how big data can be transferred, computed and managed efficiently and sustainably. Workflows, solutions, approaches and suggestions will be provided, with the example of the latest experiences in Australia. Finally, important principles such as data integrity, data lifetime and the FAIR and CARE principles will be considered.
Affiliation(s)
- David Poger
  - Microscopy Australia, The University of Sydney, Sydney, NSW, 2006, Australia
- Lisa Yen
  - Microscopy Australia, The University of Sydney, Sydney, NSW, 2006, Australia
- Filip Braet
  - Australian Centre for Microscopy and Microanalysis, The University of Sydney, Sydney, NSW, 2006, Australia
  - School of Medical Sciences (Molecular and Cellular Biomedicine), The University of Sydney, Sydney, NSW, 2006, Australia
12. Tüting C, Schmidt L, Skalidis I, Sinz A, Kastritis PL. Enabling cryo-EM density interpretation from yeast native cell extracts by proteomics data and AlphaFold structures. Proteomics 2023; 23:e2200096. [PMID: 37016452 DOI: 10.1002/pmic.202200096] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2022] [Revised: 03/23/2023] [Accepted: 03/24/2023] [Indexed: 04/06/2023]
Abstract
In the cellular context, proteins participate in communities to perform their function. The detection and identification of these communities as well as in-community interactions has long been the subject of investigation, mainly through proteomics analysis with mass spectrometry. With the advent of cryogenic electron microscopy and the "resolution revolution," their visualization has recently been made possible, even in complex, native samples. The advances in both fields have resulted in the generation of large amounts of data, whose analysis requires advanced computation, often employing machine learning approaches to reach the desired outcome. In this work, we first performed a robust proteomics analysis of mass spectrometry (MS) data derived from a yeast native cell extract and used this information to identify protein communities and inter-protein interactions. Cryo-EM analysis of the cell extract provided a reconstruction of a biomolecule at medium resolution (∼8 Å (FSC = 0.143)). Utilizing MS-derived proteomics data and systematic fitting of AlphaFold-predicted atomic models, this density was assigned to the 2.6 MDa complex of yeast fatty acid synthase. Our proposed workflow identifies protein complexes in native cell extracts from Saccharomyces cerevisiae by combining proteomics, cryo-EM, and AI-guided protein structure prediction.
Affiliation(s)
- Christian Tüting
  - Interdisciplinary Research Center HALOmem, Charles Tanford Protein Center, Martin Luther University Halle-Wittenberg, Halle (Saale), Germany
  - Institute of Biochemistry and Biotechnology, Martin Luther University Halle-Wittenberg, Halle (Saale), Germany
  - Biozentrum, Martin Luther University Halle-Wittenberg, Halle (Saale), Germany
- Lisa Schmidt
  - Interdisciplinary Research Center HALOmem, Charles Tanford Protein Center, Martin Luther University Halle-Wittenberg, Halle (Saale), Germany
  - Institute of Biochemistry and Biotechnology, Martin Luther University Halle-Wittenberg, Halle (Saale), Germany
- Ioannis Skalidis
  - Interdisciplinary Research Center HALOmem, Charles Tanford Protein Center, Martin Luther University Halle-Wittenberg, Halle (Saale), Germany
  - Institute of Biochemistry and Biotechnology, Martin Luther University Halle-Wittenberg, Halle (Saale), Germany
- Andrea Sinz
  - Institute of Pharmacy, Martin Luther University Halle-Wittenberg, Halle (Saale), Germany
  - Center for Structural Mass Spectrometry, Martin Luther University Halle-Wittenberg, Halle (Saale), Germany
- Panagiotis L Kastritis
  - Interdisciplinary Research Center HALOmem, Charles Tanford Protein Center, Martin Luther University Halle-Wittenberg, Halle (Saale), Germany
  - Institute of Biochemistry and Biotechnology, Martin Luther University Halle-Wittenberg, Halle (Saale), Germany
  - Biozentrum, Martin Luther University Halle-Wittenberg, Halle (Saale), Germany
  - Institute of Chemical Biology, National Hellenic Research Foundation, Athens, Greece
13
|
DiIorio MC, Kulczyk AW. Novel Artificial Intelligence-Based Approaches for Ab Initio Structure Determination and Atomic Model Building for Cryo-Electron Microscopy. Micromachines 2023; 14:1674. [PMID: 37763837 PMCID: PMC10534518 DOI: 10.3390/mi14091674] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 08/02/2023] [Revised: 08/21/2023] [Accepted: 08/25/2023] [Indexed: 09/29/2023]
Abstract
Single particle cryo-electron microscopy (cryo-EM) has emerged as the prevailing method for near-atomic structure determination, shedding light on the important molecular mechanisms of biological macromolecules. However, the inherent dynamics and structural variability of biological complexes coupled with the large number of experimental images generated by a cryo-EM experiment make data processing nontrivial. In particular, ab initio reconstruction and atomic model building remain major bottlenecks that demand substantial computational resources and manual intervention. Approaches utilizing recent innovations in artificial intelligence (AI) technology, particularly deep learning, have the potential to overcome the limitations that cannot be adequately addressed by traditional image processing approaches. Here, we review newly proposed AI-based methods for ab initio volume generation, heterogeneous 3D reconstruction, and atomic model building. We highlight the advancements made by the implementation of AI methods, as well as discuss remaining limitations and areas for future development.
Affiliation(s)
- Megan C. DiIorio
  - Institute for Quantitative Biomedicine, Rutgers University, 174 Frelinghuysen Road, Piscataway, NJ 08854, USA
- Arkadiusz W. Kulczyk
  - Institute for Quantitative Biomedicine, Rutgers University, 174 Frelinghuysen Road, Piscataway, NJ 08854, USA
  - Department of Biochemistry & Microbiology, Rutgers University, 76 Lipman Drive, New Brunswick, NJ 08901, USA
|
14
|
Dhakal A, Gyawali R, Wang L, Cheng J. A large expert-curated cryo-EM image dataset for machine learning protein particle picking. Sci Data 2023; 10:392. [PMID: 37349345 PMCID: PMC10287764 DOI: 10.1038/s41597-023-02280-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 03/01/2023] [Accepted: 05/30/2023] [Indexed: 06/24/2023] Open
Abstract
Cryo-electron microscopy (cryo-EM) is a powerful technique for determining the structures of biological macromolecular complexes. Picking single-protein particles from cryo-EM micrographs is a crucial step in reconstructing protein structures. However, the widely used template-based particle picking process is labor-intensive and time-consuming. Though machine learning and artificial intelligence (AI) based particle picking can potentially automate the process, its development is hindered by lack of large, high-quality labelled training data. To address this bottleneck, we present CryoPPP, a large, diverse, expert-curated cryo-EM image dataset for protein particle picking and analysis. It consists of labelled cryo-EM micrographs (images) of 34 representative protein datasets selected from the Electron Microscopy Public Image Archive (EMPIAR). The dataset is 2.6 terabytes and includes 9,893 high-resolution micrographs with labelled protein particle coordinates. The labelling process was rigorously validated through 2D particle class validation and 3D density map validation with the gold standard. The dataset is expected to greatly facilitate the development of both AI and classical methods for automated cryo-EM protein particle picking.
Affiliation(s)
- Ashwin Dhakal
  - Department of Electrical Engineering and Computer Science, NextGen Precision Health, University of Missouri, Columbia, MO, 65211, USA
- Rajan Gyawali
  - Department of Electrical Engineering and Computer Science, NextGen Precision Health, University of Missouri, Columbia, MO, 65211, USA
- Liguo Wang
  - Laboratory for BioMolecular Structure (LBMS), Brookhaven National Laboratory, Upton, NY, 11973, USA
- Jianlin Cheng
  - Department of Electrical Engineering and Computer Science, NextGen Precision Health, University of Missouri, Columbia, MO, 65211, USA.
|
15
|
Kim HHS, Uddin MR, Xu M, Chang YW. Computational Methods Toward Unbiased Pattern Mining and Structure Determination in Cryo-Electron Tomography Data. J Mol Biol 2023; 435:168068. [PMID: 37003470 PMCID: PMC10164694 DOI: 10.1016/j.jmb.2023.168068] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 10/19/2022] [Revised: 02/19/2023] [Accepted: 03/26/2023] [Indexed: 04/03/2023]
Abstract
Cryo-electron tomography can uniquely probe the native cellular environment for macromolecular structures. Tomograms feature complex data with densities of diverse, densely crowded macromolecular complexes, low signal-to-noise, and artifacts such as the missing wedge effect. Post-processing of this data generally involves isolating regions or particles of interest from tomograms, organizing them into related groups, and rendering final structures through subtomogram averaging. Template-matching and reference-based structure determination are popular analysis methods but are vulnerable to biases and can often require significant user input. Most importantly, these approaches cannot identify novel complexes that reside within the imaged cellular environment. To reliably extract and resolve structures of interest, efficient and unbiased approaches are therefore of great value. This review highlights notable computational software and discusses how they contribute to making automated structural pattern discovery a possibility. Perspectives emphasizing the importance of features for user-friendliness and accessibility are also presented.
Affiliation(s)
- Hannah Hyun-Sook Kim
  - Department of Biochemistry and Biophysics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA. https://twitter.com/hannahinthelab
- Mostofa Rafid Uddin
  - Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA. https://twitter.com/duran_rafid
- Min Xu
  - Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA.
- Yi-Wei Chang
  - Department of Biochemistry and Biophysics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA.
|
16
|
Verkhivker G, Alshahrani M, Gupta G, Xiao S, Tao P. From Deep Mutational Mapping of Allosteric Protein Landscapes to Deep Learning of Allostery and Hidden Allosteric Sites: Zooming in on "Allosteric Intersection" of Biochemical and Big Data Approaches. Int J Mol Sci 2023; 24:7747. [PMID: 37175454 PMCID: PMC10178073 DOI: 10.3390/ijms24097747] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 04/06/2023] [Revised: 04/22/2023] [Accepted: 04/23/2023] [Indexed: 05/15/2023] Open
Abstract
The recent advances in artificial intelligence (AI) and machine learning have driven the design of new expert systems and automated workflows that are able to model complex chemical and biological phenomena. In recent years, machine learning approaches have been developed and actively deployed to facilitate computational and experimental studies of protein dynamics and allosteric mechanisms. In this review, we discuss in detail new developments along two major directions of allosteric research through the lens of data-intensive biochemical approaches and AI-based computational methods. Despite considerable progress in applications of AI methods for protein structure and dynamics studies, the intersection between allosteric regulation, the emerging structural biology technologies and AI approaches remains largely unexplored, calling for the development of AI-augmented integrative structural biology. In this review, we focus on the latest remarkable progress in deep high-throughput mining and comprehensive mapping of allosteric protein landscapes and allosteric regulatory mechanisms as well as on the new developments in AI methods for prediction and characterization of allosteric binding sites on the proteome level. We also discuss new AI-augmented structural biology approaches that expand our knowledge of the universe of protein dynamics and allostery. We conclude with an outlook and highlight the importance of developing an open science infrastructure for machine learning studies of allosteric regulation and validation of computational approaches using integrative studies of allosteric mechanisms. The development of community-accessible tools that uniquely leverage the existing experimental and simulation knowledgebase to enable interrogation of the allosteric functions can provide a much-needed boost to further innovation and integration of experimental and computational technologies empowered by booming AI field.
Affiliation(s)
- Gennady Verkhivker
  - Keck Center for Science and Engineering, Graduate Program in Computational and Data Sciences, Schmid College of Science and Technology, Chapman University, Orange, CA 92866, USA
  - Department of Biomedical and Pharmaceutical Sciences, Chapman University School of Pharmacy, Irvine, CA 92618, USA
- Mohammed Alshahrani
  - Keck Center for Science and Engineering, Graduate Program in Computational and Data Sciences, Schmid College of Science and Technology, Chapman University, Orange, CA 92866, USA
- Grace Gupta
  - Keck Center for Science and Engineering, Graduate Program in Computational and Data Sciences, Schmid College of Science and Technology, Chapman University, Orange, CA 92866, USA
- Sian Xiao
  - Department of Chemistry, Center for Research Computing, Center for Drug Discovery, Design, and Delivery (CD4), Southern Methodist University, Dallas, TX 75275, USA
- Peng Tao
  - Department of Chemistry, Center for Research Computing, Center for Drug Discovery, Design, and Delivery (CD4), Southern Methodist University, Dallas, TX 75275, USA
|
17
|
Zeng X, Kahng A, Xue L, Mahamid J, Chang YW, Xu M. High-throughput cryo-ET structural pattern mining by unsupervised deep iterative subtomogram clustering. Proc Natl Acad Sci U S A 2023; 120:e2213149120. [PMID: 37027429 PMCID: PMC10104553 DOI: 10.1073/pnas.2213149120] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 08/01/2022] [Accepted: 02/24/2023] [Indexed: 04/08/2023] Open
Abstract
Cryoelectron tomography directly visualizes heterogeneous macromolecular structures in their native and complex cellular environments. However, existing computer-assisted structure sorting approaches are low throughput or inherently limited due to their dependency on available templates and manual labels. Here, we introduce a high-throughput template-and-label-free deep learning approach, Deep Iterative Subtomogram Clustering Approach (DISCA), that automatically detects subsets of homogeneous structures by learning and modeling 3D structural features and their distributions. Evaluation on five experimental cryo-ET datasets shows that an unsupervised deep learning based method can detect diverse structures with a wide range of molecular sizes. This unsupervised detection paves the way for systematic unbiased recognition of macromolecular complexes in situ.
Affiliation(s)
- Xiangrui Zeng
  - Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA 15213
- Anson Kahng
  - Computer Science Department, University of Rochester, Rochester, NY 14620
- Liang Xue
  - Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg 69117, Germany
  - Faculty of Biosciences, Collaboration for joint PhD degree between European Molecular Biology Laboratory and Heidelberg University, Heidelberg 69117, Germany
- Julia Mahamid
  - Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg 69117, Germany
- Yi-Wei Chang
  - Department of Biochemistry and Biophysics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104
- Min Xu
  - Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA 15213
|
18
|
Bendory T, Lan TY, Marshall NF, Rukshin I, Singer A. Multi-target detection with rotations. Inverse Problems and Imaging (Springfield, Mo.) 2023; 17:362-380. [PMID: 39175756 PMCID: PMC11340853 DOI: 10.3934/ipi.2022046] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/24/2024]
Abstract
We consider the multi-target detection problem of estimating a two-dimensional target image from a large noisy measurement image that contains many randomly rotated and translated copies of the target image. Motivated by single-particle cryo-electron microscopy, we focus on the low signal-to-noise regime, where it is difficult to estimate the locations and orientations of the target images in the measurement. Our approach uses autocorrelation analysis to estimate rotationally and translationally invariant features of the target image. We demonstrate that, regardless of the level of noise, our technique can be used to recover the target image when the measurement is sufficiently large.
Affiliation(s)
- Tamir Bendory
  - School of Electrical Engineering, Tel Aviv University, Israel
- Ti-Yen Lan
  - Program in Applied and Computational Mathematics, Princeton University, USA
- Iris Rukshin
  - Program in Applied and Computational Mathematics, Princeton University, USA
- Amit Singer
  - Program in Applied and Computational Mathematics and the Department of Mathematics, Princeton University, USA
|
19
|
Dhakal A, Gyawali R, Wang L, Cheng J. CryoPPP: A Large Expert-Labelled Cryo-EM Image Dataset for Machine Learning Protein Particle Picking. bioRxiv: the preprint server for biology 2023:2023.02.21.529443. [PMID: 36865277 PMCID: PMC9980126 DOI: 10.1101/2023.02.21.529443] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 02/25/2023]
Abstract
Cryo-electron microscopy (cryo-EM) is currently the most powerful technique for determining the structures of large protein complexes and assemblies. Picking single-protein particles from cryo-EM micrographs (images) is a key step in reconstructing protein structures. However, the widely used template-based particle picking process is labor-intensive and time-consuming. Though the emerging machine learning-based particle picking can potentially automate the process, its development is severely hindered by lack of large, high-quality, manually labelled training data. Here, we present CryoPPP, a large, diverse, expert-curated cryo-EM image dataset for single protein particle picking and analysis to address this bottleneck. It consists of manually labelled cryo-EM micrographs of 32 non-redundant, representative protein datasets selected from the Electron Microscopy Public Image Archive (EMPIAR). It includes 9,089 diverse, high-resolution micrographs (∼300 cryo-EM images per EMPIAR dataset) in which the coordinates of protein particles were labelled by human experts. The protein particle labelling process was rigorously validated by both 2D particle class validation and 3D density map validation with the gold standard. The dataset is expected to greatly facilitate the development of machine learning and artificial intelligence methods for automated cryo-EM protein particle picking. The dataset and data processing scripts are available at https://github.com/BioinfoMachineLearning/cryoppp.
Affiliation(s)
- Ashwin Dhakal
  - Department of Electrical Engineering and Computer Science, NextGen Precision Health, University of Missouri, Columbia, MO 65211, USA. Fax: 573-882-8318
- Rajan Gyawali
  - Department of Electrical Engineering and Computer Science, NextGen Precision Health, University of Missouri, Columbia, MO 65211, USA. Fax: 573-882-8318
- Liguo Wang
  - Laboratory for BioMolecular Structure (LBMS), Brookhaven National Laboratory, Upton, NY 11973, USA
- Jianlin Cheng
  - Department of Electrical Engineering and Computer Science, NextGen Precision Health, University of Missouri, Columbia, MO 65211, USA. Fax: 573-882-8318
|
20
|
Huang Q, Zhou Y, Liu HF, Bartesaghi A. Multiple-image super-resolution of cryo-electron micrographs based on deep internal learning. Biological Imaging 2023; 3:e3. [PMID: 38510165 PMCID: PMC10951919 DOI: 10.1017/s2633903x2300003x] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/16/2022] [Revised: 12/27/2022] [Accepted: 01/23/2023] [Indexed: 03/22/2024]
Abstract
Single-particle cryo-electron microscopy (cryo-EM) is a powerful imaging modality capable of visualizing proteins and macromolecular complexes at near-atomic resolution. The low electron-doses used to prevent radiation damage to the biological samples, however, result in images where the power of the noise is 100 times greater than the power of the signal. To overcome these low signal-to-noise ratios (SNRs), hundreds of thousands of particle projections are averaged to determine the three-dimensional structure of the molecule of interest. The sampling requirements of high-resolution imaging impose limitations on the pixel sizes that can be used for acquisition, limiting the size of the field of view and requiring data collection sessions of several days to accumulate sufficient numbers of particles. Meanwhile, recent image super-resolution (SR) techniques based on neural networks have shown state-of-the-art performance on natural images. Building on these advances, here, we present a multiple-image SR algorithm based on deep internal learning designed specifically to work under low-SNR conditions. Our approach leverages the internal image statistics of cryo-EM movies and does not require training on ground-truth data. When applied to single-particle datasets of apoferritin and T20S proteasome, we show that the resolution of the 3D structure obtained from SR micrographs can surpass the limits imposed by the imaging system. Our results indicate that the combination of low magnification imaging with in silico image SR has the potential to accelerate cryo-EM data collection by virtue of including more particles in each exposure and doing so without sacrificing resolution.
Affiliation(s)
- Qinwen Huang
  - Department of Computer Science, Duke University, Durham, North Carolina, USA
- Ye Zhou
  - Department of Computer Science, Duke University, Durham, North Carolina, USA
- Hsuan-Fu Liu
  - Department of Biochemistry, Duke University School of Medicine, Durham, North Carolina, USA
- Alberto Bartesaghi
  - Department of Computer Science, Duke University, Durham, North Carolina, USA
  - Department of Biochemistry, Duke University School of Medicine, Durham, North Carolina, USA
  - Department of Electrical and Computer Engineering, Duke University, Durham, North Carolina, USA
|
21
|
DiIorio MC, Kulczyk AW. Exploring the Structural Variability of Dynamic Biological Complexes by Single-Particle Cryo-Electron Microscopy. Micromachines 2022; 14:118. [PMID: 36677177 PMCID: PMC9866264 DOI: 10.3390/mi14010118] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 12/08/2022] [Revised: 12/27/2022] [Accepted: 12/30/2022] [Indexed: 05/15/2023]
Abstract
Biological macromolecules and assemblies precisely rearrange their atomic 3D structures to execute cellular functions. Understanding the mechanisms by which these molecular machines operate requires insight into the ensemble of structural states they occupy during the functional cycle. Single-particle cryo-electron microscopy (cryo-EM) has become the preferred method to provide near-atomic resolution, structural information about dynamic biological macromolecules elusive to other structure determination methods. Recent advances in cryo-EM methodology have allowed structural biologists not only to probe the structural intermediates of biochemical reactions, but also to resolve different compositional and conformational states present within the same dataset. This article reviews newly developed sample preparation and single-particle analysis (SPA) techniques for high-resolution structure determination of intrinsically dynamic and heterogeneous samples, shedding light upon the intricate mechanisms employed by molecular machines and helping to guide drug discovery efforts.
Affiliation(s)
- Megan C. DiIorio
  - Institute for Quantitative Biomedicine, Rutgers University, 174 Frelinghuysen Road, Piscataway, NJ 08854, USA
- Arkadiusz W. Kulczyk
  - Institute for Quantitative Biomedicine, Rutgers University, 174 Frelinghuysen Road, Piscataway, NJ 08854, USA
  - Department of Biochemistry and Microbiology, Rutgers University, 75 Lipman Drive, New Brunswick, NJ 08901, USA
|
22
|
Reynolds MJ, Hachicho C, Carl AG, Gong R, Alushin GM. Bending forces and nucleotide state jointly regulate F-actin structure. Nature 2022; 611:380-386. [PMID: 36289330 PMCID: PMC9646526 DOI: 10.1038/s41586-022-05366-w] [Citation(s) in RCA: 48] [Impact Index Per Article: 24.0] [Received: 06/02/2022] [Accepted: 09/20/2022] [Indexed: 02/05/2023]
Abstract
ATP-hydrolysis-coupled actin polymerization is a fundamental mechanism of cellular force generation [1-3]. In turn, force [4,5] and actin filament (F-actin) nucleotide state [6] regulate actin dynamics by tuning F-actin's engagement of actin-binding proteins through mechanisms that are unclear. Here we show that the nucleotide state of actin modulates F-actin structural transitions evoked by bending forces. Cryo-electron microscopy structures of ADP-F-actin and ADP-Pi-F-actin with sufficient resolution to visualize bound solvent reveal intersubunit interfaces bridged by water molecules that could mediate filament lattice flexibility. Despite extensive ordered solvent differences in the nucleotide cleft, these structures feature nearly identical lattices and essentially indistinguishable protein backbone conformations that are unlikely to be discriminable by actin-binding proteins. We next introduce a machine-learning-enabled pipeline for reconstructing bent filaments, enabling us to visualize both continuous structural variability and side-chain-level detail. Bent F-actin structures reveal rearrangements at intersubunit interfaces characterized by substantial alterations of helical twist and deformations in individual protomers, transitions that are distinct in ADP-F-actin and ADP-Pi-F-actin. This suggests that phosphate rigidifies actin subunits to alter the bending structural landscape of F-actin. As bending forces evoke nucleotide-state dependent conformational transitions of sufficient magnitude to be detected by actin-binding proteins, we propose that actin nucleotide state can serve as a co-regulator of F-actin mechanical regulation.
Affiliation(s)
- Matthew J Reynolds
  - Laboratory of Structural Biophysics and Mechanobiology, The Rockefeller University, New York, NY, USA
- Carla Hachicho
  - Laboratory of Structural Biophysics and Mechanobiology, The Rockefeller University, New York, NY, USA
- Ayala G Carl
  - Laboratory of Structural Biophysics and Mechanobiology, The Rockefeller University, New York, NY, USA
  - Tri-Institutional Program in Chemical Biology, The Rockefeller University, New York, NY, USA
- Rui Gong
  - Laboratory of Structural Biophysics and Mechanobiology, The Rockefeller University, New York, NY, USA
- Gregory M Alushin
  - Laboratory of Structural Biophysics and Mechanobiology, The Rockefeller University, New York, NY, USA.
|
23
|
Bioinformatics in bioscience and bioengineering: Recent advances, applications, and perspectives. J Biosci Bioeng 2022; 134:363-373. [PMID: 36127250 DOI: 10.1016/j.jbiosc.2022.08.004] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Received: 03/19/2022] [Revised: 07/27/2022] [Accepted: 08/14/2022] [Indexed: 11/24/2022]
Abstract
Recent advances have led to the emergence of highly comprehensive and analytical approaches, such as omics analysis and high-resolution, time-resolved bioimaging analysis. These technologies have made it possible to obtain vast data from a single measurement. Subsequently, large datasets have pioneered the data-driven approach, an alternative to the traditional hypothesis-testing system, for researchers. However, processing, interpreting, and elucidating enormous datasets is no longer possible without computation. Bioinformatics is a field that has developed over long periods, intending to understand biological phenomena using methods collected from information science and statistics, thus solving this proposed research challenge. This review presents the latest methodologies and applications in sequencing, imaging, and mass spectrometry that were developed using bioinformatics. We present the features and an outline of the individual techniques in each part, avoiding complex algorithms and formulas so that beginning researchers can gain an overview. In the section on sequencing, we focus on comparative genomic, transcriptomic, and bacterial microbiome analyses, which are frequently used as applications of next-generation sequencing. Bioinformatic methods for handling sequence data and case studies are described. In the section on imaging, we introduce the analytical methods and microscopy imaging informatics techniques used in animal cell biology and plant physiology. In the section on mass spectrometry, we introduce informatics technologies for maximizing the value of measured data, including structure prediction for unknown molecules and untargeted analysis. Finally, we discuss the future outlook of this field. We anticipate that this review will assist biologists in using bioinformatics more effectively.
|
24
|
Abstract
Cryo-electron microscopy (CryoEM) has become a vital technique in structural biology. It is an interdisciplinary field that takes advantage of advances in biochemistry, physics, and image processing, among other disciplines. Innovations in these three basic pillars have contributed to the boosting of CryoEM in the past decade. This work reviews the main contributions in image processing to the current reconstruction workflow of single particle analysis (SPA) by CryoEM. Our review emphasizes the time evolution of the algorithms across the different steps of the workflow differentiating between two groups of approaches: analytical methods and deep learning algorithms. We present an analysis of the current state of the art. Finally, we discuss the emerging problems and challenges still to be addressed in the evolution of CryoEM image processing methods in SPA.
Affiliation(s)
- Jose Luis Vilas
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Darwin, 3, Campus Universidad Autonoma, 28049 Cantoblanco, Madrid, Spain
- Jose Maria Carazo
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Darwin, 3, Campus Universidad Autonoma, 28049 Cantoblanco, Madrid, Spain
- Carlos Oscar S. Sorzano
  - Biocomputing Unit, Centro Nacional de Biotecnologia (CNB-CSIC), Darwin, 3, Campus Universidad Autonoma, 28049 Cantoblanco, Madrid, Spain
|
25
|
Ramírez-Aportela E, Carazo JM, Sorzano COS. Higher resolution in cryo-EM by the combination of macromolecular prior knowledge and image-processing tools. IUCrJ 2022; 9:632-638. [PMID: 36071808 PMCID: PMC9438491 DOI: 10.1107/s2052252522006959] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2022] [Accepted: 07/07/2022] [Indexed: 06/15/2023]
Abstract
Single-particle cryo-electron microscopy has become a powerful technique for the 3D structure determination of biological molecules. The last decade has seen an astonishing development of both hardware and software, and an exponential growth of new structures obtained at medium-high resolution. However, the knowledge accumulated in this field over the years has hardly been utilized as feedback in the reconstruction of new structures. In this context, this article explores the use of the deep-learning approach deepEMhancer as a regularizer in the RELION refinement process. deepEMhancer introduces prior information derived from macromolecular structures, and contributes to noise reduction and signal enhancement, as well as a higher degree of isotropy. These features have a direct effect on image alignment and reduction of overfitting during iterative refinement. The advantages of this combination are demonstrated for several membrane proteins, for which it is especially useful because of their high disorder and flexibility.
Affiliation(s)
- Erney Ramírez-Aportela
  - Biocomputing Unit, National Centre for Biotechnology (CNB CSIC), Darwin 3, Campus Universidad Autónoma de Madrid, Cantoblanco, Madrid 28049, Spain
- Jose M. Carazo
  - Biocomputing Unit, National Centre for Biotechnology (CNB CSIC), Darwin 3, Campus Universidad Autónoma de Madrid, Cantoblanco, Madrid 28049, Spain
- Carlos Oscar S. Sorzano
  - Biocomputing Unit, National Centre for Biotechnology (CNB CSIC), Darwin 3, Campus Universidad Autónoma de Madrid, Cantoblanco, Madrid 28049, Spain
  - Universidad CEU San Pablo, Campus Urb. Montepríncipe, Boadilla del Monte, Madrid 28668, Spain
|
26
|
Hajarolasvadi N, Sunkara V, Khavnekar S, Beck F, Brandt R, Baum D. Volumetric macromolecule identification in cryo-electron tomograms using capsule networks. BMC Bioinformatics 2022; 23:360. [PMID: 36042418 PMCID: PMC9429335 DOI: 10.1186/s12859-022-04901-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/24/2022] [Accepted: 08/23/2022] [Indexed: 11/29/2022] Open
Abstract
Background: Despite recent advances in cellular cryo-electron tomography (CET), developing automated tools for macromolecule identification in submolecular resolution remains challenging due to the lack of annotated data and high structural complexities. To date, the extent of the deep learning methods constructed for this problem is limited to conventional Convolutional Neural Networks (CNNs). Identifying macromolecules of different types and sizes is a tedious and time-consuming task. In this paper, we employ a capsule-based architecture to automate the task of macromolecule identification, that we refer to as 3D-UCaps. In particular, the architecture is composed of three components: feature extractor, capsule encoder, and CNN decoder. The feature extractor converts voxel intensities of input sub-tomograms to activities of local features. The encoder is a 3D Capsule Network (CapsNet) that takes local features to generate a low-dimensional representation of the input. Then, a 3D CNN decoder reconstructs the sub-tomograms from the given representation by upsampling.
Results: We performed binary and multi-class localization and identification tasks on synthetic and experimental data. We observed that the 3D-UNet and the 3D-UCaps had an F1-score mostly above 60% and 70%, respectively, on the test data. In both network architectures, we observed degradation of at least 40% in the F1-score when identifying very small particles (PDB entry 3GL1) compared to a large particle (PDB entry 4D8Q). In the multi-class identification task of experimental data, 3D-UCaps had an F1-score of 91% on the test data in contrast to 64% of the 3D-UNet. The better F1-score of 3D-UCaps compared to 3D-UNet is obtained by a higher precision score. We speculate this to be due to the capsule network employed in the encoder. To study the effect of the CapsNet-based encoder architecture further, we performed an ablation study and perceived that the F1-score is boosted as network depth is increased which is in contrast to the previously reported results for the 3D-UNet. To present a reproducible work, source code, trained models, data as well as visualization results are made publicly available.
Conclusion: Quantitative and qualitative results show that 3D-UCaps successfully perform various downstream tasks including identification and localization of macromolecules and can at least compete with CNN architectures for this task. Given that the capsule layers extract both the existence probability and the orientation of the molecules, this architecture has the potential to lead to representations of the data that are better interpretable than those of 3D-UNet.
Supplementary Information: The online version contains supplementary material available at 10.1186/s12859-022-04901-w.
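For readers comparing the F1-scores quoted in this abstract, the following is a minimal Python sketch of how an F1-score is computed from detection counts. The function name `f1_score` and the counts in the example are illustrative assumptions and are not taken from the paper. It shows how, at equal recall, a higher precision alone raises the F1-score.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return the F1-score (harmonic mean of precision and recall).

    tp, fp, fn are hypothetical counts of true positives, false positives,
    and false negatives for a particle detector.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# Two hypothetical detectors with the same recall (80 of 100 particles found)
# but different numbers of false positives.
high_precision = f1_score(tp=80, fp=10, fn=20)
low_precision = f1_score(tp=80, fp=80, fn=20)
print(f"high precision: {high_precision:.3f}, low precision: {low_precision:.3f}")
```

Because F1 is a harmonic mean, it is pulled toward the lower of precision and recall, so reducing false positives can improve it even when no additional particles are found.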
Affiliation(s)
- Noushin Hajarolasvadi
  - Department of Visual and Data-Centric Computing, Zuse Institute Berlin, Takustraße 7, 14195, Berlin, Germany
- Vikram Sunkara
  - Department of Visual and Data-Centric Computing, Zuse Institute Berlin, Takustraße 7, 14195, Berlin, Germany
- Sagar Khavnekar
  - Department of CryoEM Technology, Max Planck Institute of Biochemistry, Am Klopferspitz 18, 82152, Martinsried, Germany
- Florian Beck
  - Department of CryoEM Technology, Max Planck Institute of Biochemistry, Am Klopferspitz 18, 82152, Martinsried, Germany
- Robert Brandt
  - Materials and Structural Analysis, Thermo Fisher Scientific, Takustraße 7, 14195, Berlin, Germany
- Daniel Baum
  - Department of Visual and Data-Centric Computing, Zuse Institute Berlin, Takustraße 7, 14195, Berlin, Germany
|
27
|
Chung JM, Durie CL, Lee J. Artificial Intelligence in Cryo-Electron Microscopy. Life (Basel) 2022; 12:1267. [PMID: 36013446 PMCID: PMC9410485 DOI: 10.3390/life12081267] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 07/26/2022] [Revised: 08/15/2022] [Accepted: 08/18/2022] [Indexed: 11/17/2022] Open
Abstract
Cryo-electron microscopy (cryo-EM) has become an unrivaled tool for determining the structure of macromolecular complexes. The biological function of macromolecular complexes is inextricably tied to the flexibility of these complexes. Single particle cryo-EM can reveal the conformational heterogeneity of a biochemically pure sample, leading to well-founded mechanistic hypotheses about the roles these complexes play in biology. However, the processing of increasingly large, complex datasets using traditional data processing strategies is exceedingly expensive in both user time and computational resources. Current innovations in data processing capitalize on artificial intelligence (AI) to improve the efficiency of data analysis and validation. Here, we review new tools that use AI to automate the data analysis steps of particle picking, 3D map reconstruction, and local resolution determination. We discuss how the application of AI moves the field forward, and what obstacles remain. We also introduce potential future applications of AI to use cryo-EM in understanding protein communities in cells.
Affiliation(s)
- Jeong Min Chung
  - Department of Biotechnology, The Catholic University of Korea, Bucheon-si 14662, Gyeonggi, Korea
- Clarissa L. Durie
  - Department of Biochemistry, University of Missouri, Columbia, MO 65211, USA
- Jinseok Lee
  - Department of Biomedical Engineering, Kyung Hee University, Yongin-si 17104, Gyeonggi, Korea
|
28
|
Li H, Chen G, Gao S, Li J, Wan X, Zhang F. A Transfer Learning-Based Classification Model for Particle Pruning in Cryo-Electron Microscopy. J Comput Biol 2022; 29:1117-1131. [PMID: 35985012 DOI: 10.1089/cmb.2022.0101] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022]
Abstract
The cryo-electron microscopy (cryo-EM) single-particle analysis requires tens of thousands of particle projections to reveal structural information of macromolecular complexes. However, due to the low signal-to-noise ratio and the presence of high contrast artifacts and contaminants in the micrographs, the semiautomatic and fully automatic particle picking algorithms tend to suffer from high false-positive rates, which degrades the confidence of structure determination. In this study, we introduce PickerOptimizer (PO), a transfer learning-based classification neural network for particle pruning in cryo-EM, as an additional strategy to complement the current automated particle picking algorithms. To achieve high classification performance with minimal human intervention, we adopted two key strategies: (1) utilizing the transfer learning techniques to train the convolutional neural network, where the knowledge gained from public classification datasets is applied to the field of cryo-EM. (2) Designing a multiloss strategy, a combination of multiple loss functions, to guide the optimization of the network parameters. To reduce the domain shift between cryo-EM images and natural images for pretraining, we build the first image classification dataset for cryo-EM, which contains positive and negative samples collected from EMPIAR entries. The PO is tested on 14 public experimental datasets, achieving accuracy and F1 scores above 95% in most cases. Furthermore, three case studies are provided to verify the model performance by applying PO on problematic particle selections, showing that our algorithm achieved better or comparable performance compared with other particle pruning strategies.
Affiliation(s)
- Hongjia Li
  - High Performance Computer Research Center, Institute of Computing Technology, Beijing, China
  - University of Chinese Academy of Sciences, Beijing, China
- Ge Chen
  - University of Chinese Academy of Sciences, Beijing, China
  - Domain-Oriented Computing Technology Research Center, Institute of Computing Technology, Beijing, China
- Shan Gao
  - High Performance Computer Research Center, Institute of Computing Technology, Beijing, China
  - University of Chinese Academy of Sciences, Beijing, China
- Jintao Li
  - High Performance Computer Research Center, Institute of Computing Technology, Beijing, China
- Xiaohua Wan
  - High Performance Computer Research Center, Institute of Computing Technology, Beijing, China
- Fa Zhang
  - High Performance Computer Research Center, Institute of Computing Technology, Beijing, China
|
29
|
Xu Y, Dang S. Recent Technical Advances in Sample Preparation for Single-Particle Cryo-EM. Front Mol Biosci 2022; 9:892459. [PMID: 35813814 PMCID: PMC9263182 DOI: 10.3389/fmolb.2022.892459] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 03/09/2022] [Accepted: 05/12/2022] [Indexed: 11/25/2022] Open
Abstract
Cryo-sample preparation is a vital step in the process of obtaining high-resolution structures of macromolecules by using the single-particle cryo–electron microscopy (cryo-EM) method; however, cryo-sample preparation is commonly hampered by high uncertainty and low reproducibility. Specifically, the existence of air-water interfaces during the sample vitrification process could cause protein denaturation and aggregation, complex disassembly, adoption of preferred orientations, and other serious problems affecting the protein particles, thereby making it challenging to pursue high-resolution 3D reconstruction. Therefore, sample preparation has emerged as a critical research topic, and several new methods for application at various preparation stages have been proposed to overcome the aforementioned hurdles. Here, we summarize the methods developed for enhancing the quality of cryo-samples at distinct stages of sample preparation, and we offer insights for developing future strategies based on diverse viewpoints. We anticipate that cryo-sample preparation will no longer be a limiting step in the single-particle cryo-EM field as increasing numbers of methods are developed in the near future, which will ultimately benefit the entire research community.
Affiliation(s)
- Yixin Xu
  - Division of Life Science, The Hong Kong University of Science and Technology, Kowloon, Hong Kong SAR, China
- Shangyu Dang
  - Division of Life Science, The Hong Kong University of Science and Technology, Kowloon, Hong Kong SAR, China
  - Southern Marine Science and Engineering Guangdong Laboratory (Guangzhou), Guangzhou, China
  - Center of Systems Biology and Human Health, The Hong Kong University of Science and Technology, Kowloon, Hong Kong SAR, China
  - *Correspondence: Shangyu Dang,
|
30
|
Thorn A. Artificial intelligence in the experimental determination and prediction of macromolecular structures. Curr Opin Struct Biol 2022; 74:102368. [DOI: 10.1016/j.sbi.2022.102368] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 08/11/2021] [Revised: 02/22/2022] [Accepted: 03/08/2022] [Indexed: 11/26/2022]
|
31
|
Hao Y, Wan X, Yan R, Liu Z, Li J, Zhang S, Cui X, Zhang F. VP-Detector: A 3D multi-scale dense convolutional neural network for macromolecule localization and classification in cryo-electron tomograms. Comput Methods Programs Biomed 2022; 221:106871. [PMID: 35584579 DOI: 10.1016/j.cmpb.2022.106871] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 02/15/2022] [Revised: 04/28/2022] [Accepted: 05/09/2022] [Indexed: 06/15/2023]
Abstract
BACKGROUND AND OBJECTIVE Cryo-electron tomography (cryo-ET) with subtomogram averaging (STA) is indispensable when studying macromolecule structures and functions in their native environments. Due to the low signal-to-noise ratio, the missing wedge artifacts in tomographic reconstructions, and multiple macromolecules of varied shapes and sizes, macromolecule localization and classification remain challenging. To tackle this bottleneck problem for structural determination by STA, we design an accurate macromolecule localization and classification method named voxelwise particle detector (VP-Detector). METHODS VP-Detector is a two-stage particle detection method based on a 3D multiscale dense convolutional neural network (3D MSDNet). The proposed network uses 3D hybrid dilated convolution (3D HDC) to avoid the resolution loss caused by scaling operations. Meanwhile, it uses 3D dense connectivity to encourage the reuse of feature maps to reduce trainable parameters. In addition, the weighted focal loss is proposed to focus more attention on difficult samples and rare classes, which relieves the class imbalance caused by multiple particles of various sizes. The performance of VP-Detector is evaluated on both simulated and real-world tomograms, and it shows that VP-Detector outperforms state-of-the-art methods. RESULTS The experiments show that VP-Detector outperforms the state-of-the-art methods on particle localization with an F1-score of 0.951 and a precision of 0.978. In addition, VP-Detector can replace manual particle picking in experiment on the real-world tomograms. Furthermore, it performs well in classifying large-, medium-, and small-weight proteins with accuracies of 1, 0.95, and 0.82, respectively. Finally, ablation studies demonstrate the effectiveness of 3D HDC, 3D dense connectivity, weighted focal loss, and training on small training sets. 
CONCLUSIONS VP-Detector can achieve high accuracy in particle detection with few trainable parameters and support training on small datasets. It can also relieve the class imbalance caused by multiple particles with various shapes and sizes.
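To illustrate the idea of a weighted focal loss mentioned in this abstract, here is a minimal Python sketch of the widely used focal loss form with a per-class weight, evaluated for a single voxel. This is an assumption about the general technique and not necessarily the exact formulation used in VP-Detector. The function name `weighted_focal_loss`, the weights, and the value of `gamma` are hypothetical.

```python
import math


def weighted_focal_loss(probs, target, class_weights, gamma=2.0):
    """Class-weighted focal loss for one voxel (illustrative sketch).

    probs: predicted class probabilities for the voxel (sums to 1).
    target: index of the true class.
    class_weights: hypothetical per-class weights (larger for rare classes).
    gamma: focusing parameter; gamma = 0 gives weighted cross-entropy.
    """
    # Clamp the true-class probability to avoid log(0).
    p_t = min(max(probs[target], 1e-12), 1.0)
    # The factor (1 - p_t)^gamma shrinks the loss of well-classified voxels.
    return -class_weights[target] * (1.0 - p_t) ** gamma * math.log(p_t)


probs = [0.9, 0.1]  # hypothetical prediction: background vs. particle
easy = weighted_focal_loss(probs, target=0, class_weights=[1.0, 1.0])
hard = weighted_focal_loss(probs, target=1, class_weights=[1.0, 1.0])
rare = weighted_focal_loss(probs, target=1, class_weights=[1.0, 3.0])
print(f"easy: {easy:.4f}, hard: {hard:.4f}, hard and rare: {rare:.4f}")
```

In this sketch the confidently correct voxel contributes almost nothing, while the misclassified voxel dominates, and the class weight scales the loss further for an under-represented class.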
Affiliation(s)
- Yu Hao
  - High Performance Computer Research Center, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
  - Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
- Xiaohua Wan
  - High Performance Computer Research Center, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
- Rui Yan
  - High Performance Computer Research Center, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
- Zhiyong Liu
  - High Performance Computer Research Center, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
- Jintao Li
  - High Performance Computer Research Center, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
- Shihua Zhang
  - Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
- Xuefeng Cui
  - School of Computer Science and Technology, Shandong University, Qingdao, China
- Fa Zhang
  - High Performance Computer Research Center, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
|
32
|
EPicker is an exemplar-based continual learning approach for knowledge accumulation in cryoEM particle picking. Nat Commun 2022; 13:2468. [PMID: 35513367 PMCID: PMC9072698 DOI: 10.1038/s41467-022-29994-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/24/2021] [Accepted: 01/10/2022] [Indexed: 11/25/2022] Open
Abstract
Deep learning is a popular method for facilitating particle picking in single-particle cryo-electron microscopy (cryo-EM), which is essential for developing automated processing pipelines. Most existing deep learning algorithms for particle picking rely on supervised learning where the features to be identified must be provided through a training procedure. However, the generalization performance of these algorithms on unseen datasets with different features is often unpredictable. In addition, while they perform well on the latest training datasets, these algorithms often fail to maintain the knowledge of old particles. Here, we report an exemplar-based continual learning approach, which can accumulate knowledge from the new dataset into the model by training an existing model on only a few new samples without catastrophic forgetting of old knowledge, implemented in a program called EPicker. Therefore, the ability of EPicker to identify bio-macromolecules can be expanded by continuously learning new knowledge during routine particle picking applications. Powered by the improved training strategy, EPicker is designed to pick not only protein particles but also general biological objects such as vesicles and fibers. Many existing deep learning algorithms for particle picking are not predictable on unseen datasets. Here the authors report an exemplar-based continual learning approach, EPicker, enabling accumulation of new knowledge of cryoEM particle picking without catastrophic forgetting of old knowledge.
|
33
|
Eldar A, Amos I, Shkolnisky Y. ASOCEM: Automatic Segmentation Of Contaminations in cryo-EM. J Struct Biol 2022; 214:107871. [DOI: 10.1016/j.jsb.2022.107871] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/04/2022] [Revised: 05/10/2022] [Accepted: 05/17/2022] [Indexed: 11/25/2022]
|
34
|
Treder KP, Huang C, Kim JS, Kirkland AI. Applications of deep learning in electron microscopy. Microscopy (Oxf) 2022; 71:i100-i115. [DOI: 10.1093/jmicro/dfab043] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/12/2021] [Revised: 08/30/2021] [Accepted: 11/08/2021] [Indexed: 12/25/2022] Open
Abstract
We review the growing use of machine learning in electron microscopy (EM) driven in part by the availability of fast detectors operating at kiloHertz frame rates leading to large data sets that cannot be processed using manually implemented algorithms. We summarize the various network architectures and error metrics that have been applied to a range of EM-related problems including denoising and inpainting. We then provide a review of the application of these in both physical and life sciences, highlighting how conventional networks and training data have been specifically modified for EM.
Affiliation(s)
- Kevin P Treder
  - Department of Materials, University of Oxford, Oxford, Oxfordshire OX1 3PH, UK
- Chen Huang
  - Rosalind Franklin Institute, Harwell Research Campus, Didcot, Oxfordshire OX11 0FA, UK
- Judy S Kim
  - Department of Materials, University of Oxford, Oxford, Oxfordshire OX1 3PH, UK
  - Rosalind Franklin Institute, Harwell Research Campus, Didcot, Oxfordshire OX11 0FA, UK
- Angus I Kirkland
  - Department of Materials, University of Oxford, Oxford, Oxfordshire OX1 3PH, UK
  - Rosalind Franklin Institute, Harwell Research Campus, Didcot, Oxfordshire OX11 0FA, UK
|
35
|
Wu JG, Yan Y, Zhang DX, Liu BW, Zheng QB, Xie XL, Liu SQ, Ge SX, Hou ZG, Xia NS. Machine Learning for Structure Determination in Single-Particle Cryo-Electron Microscopy: A Systematic Review. IEEE Trans Neural Netw Learn Syst 2022; 33:452-472. [PMID: 34932487 DOI: 10.1109/tnnls.2021.3131325] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/14/2023]
Abstract
Recently, single-particle cryo-electron microscopy (cryo-EM) has become an indispensable method for determining macromolecular structures at high resolution to deeply explore the relevant molecular mechanism. Its recent breakthrough is mainly because of the rapid advances in hardware and image processing algorithms, especially machine learning. As an essential support of single-particle cryo-EM, machine learning has powered many aspects of structure determination and greatly promoted its development. In this article, we provide a systematic review of the applications of machine learning in this field. Our review begins with a brief introduction of single-particle cryo-EM, followed by the specific tasks and challenges of its image processing. Then, focusing on the workflow of structure determination, we describe relevant machine learning algorithms and applications at different steps, including particle picking, 2-D clustering, 3-D reconstruction, and other steps. As different tasks exhibit distinct characteristics, we introduce the evaluation metrics for each task and summarize their dynamics of technology development. Finally, we discuss the open issues and potential trends in this promising field.
|
36
|
Lian R, Huang B, Wang L, Liu Q, Lin Y, Ling H. End-to-end orientation estimation from 2D cryo-EM images. Acta Crystallogr D Struct Biol 2022; 78:174-186. [DOI: 10.1107/s2059798321011761] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2021] [Accepted: 11/05/2021] [Indexed: 11/10/2022] Open
Abstract
Cryo-electron microscopy (cryo-EM) is a Nobel Prize-winning technique for determining high-resolution 3D structures of biological macromolecules. A 3D structure is reconstructed from hundreds of thousands of noisy 2D projection images. However, existing 3D reconstruction methods are still time-consuming, and one of the major computational bottlenecks is recovering the unknown orientation of the particle in each 2D image. The dominant methods typically exploit an expensive global search on each image to estimate the missing orientations. Here, a novel end-to-end supervised learning method is introduced to directly recover the missing orientations from 2D cryo-EM images. A neural network is used to approximate the mapping from images to orientations. A robust loss function is proposed for optimizing the parameters of the network, which can handle both asymmetric and symmetric 3D structures. Experiments on synthetic data sets with various symmetry types confirm that the neural network is capable of recovering orientations from 2D cryo-EM images, and the results on a real cryo-EM data set further demonstrate its potential under more challenging imaging conditions.
|
37
|
Lees JA, Dias JM, Han S. Applications of Cryo-EM in small molecule and biologics drug design. Biochem Soc Trans 2021; 49:2627-2638. [PMID: 34812853 PMCID: PMC8786282 DOI: 10.1042/bst20210444] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 09/13/2021] [Revised: 10/22/2021] [Accepted: 10/27/2021] [Indexed: 02/03/2023]
Abstract
Electron cryo-microscopy (cryo-EM) is a powerful technique for the structural characterization of biological macromolecules, enabling high-resolution analysis of targets once inaccessible to structural interrogation. In recent years, pharmaceutical companies have begun to utilize cryo-EM for structure-based drug design. Structural analysis of integral membrane proteins, which comprise a large proportion of druggable targets and pose particular challenges for X-ray crystallography, by cryo-EM has enabled insights into important drug target families such as G protein-coupled receptors (GPCRs), ion channels, and solute carrier (SLCs) proteins. Structural characterization of biologics, such as vaccines, viral vectors, and gene therapy agents, has also become significantly more tractable. As a result, cryo-EM has begun to make major impacts in bringing critical therapeutics to market. In this review, we discuss recent instructive examples of impacts from cryo-EM in therapeutics design, focusing largely on its implementation at Pfizer. We also discuss the opportunities afforded by emerging technological advances in cryo-EM, and the prospects for future development of the technique.
Affiliation(s)
- Joshua A. Lees
  - Discovery Sciences, Medicine Design, Pfizer Worldwide Research and Development, Groton, CT 06340, U.S.A
- Joao M. Dias
  - Discovery Sciences, Medicine Design, Pfizer Worldwide Research and Development, Groton, CT 06340, U.S.A
- Seungil Han
  - Discovery Sciences, Medicine Design, Pfizer Worldwide Research and Development, Groton, CT 06340, U.S.A
|
38
|
Moebel E, Martinez-Sanchez A, Lamm L, Righetto RD, Wietrzynski W, Albert S, Larivière D, Fourmentin E, Pfeffer S, Ortiz J, Baumeister W, Peng T, Engel BD, Kervrann C. Deep learning improves macromolecule identification in 3D cellular cryo-electron tomograms. Nat Methods 2021; 18:1386-1394. [PMID: 34675434 DOI: 10.1038/s41592-021-01275-4] [Citation(s) in RCA: 58] [Impact Index Per Article: 19.3] [Received: 04/15/2020] [Accepted: 08/18/2021] [Indexed: 11/10/2022]
Abstract
Cryogenic electron tomography (cryo-ET) visualizes the 3D spatial distribution of macromolecules at nanometer resolution inside native cells. However, automated identification of macromolecules inside cellular tomograms is challenged by noise and reconstruction artifacts, as well as the presence of many molecular species in the crowded volumes. Here, we present DeepFinder, a computational procedure that uses artificial neural networks to simultaneously localize multiple classes of macromolecules. Once trained, the inference stage of DeepFinder is faster than template matching and performs better than other competitive deep learning methods at identifying macromolecules of various sizes in both synthetic and experimental datasets. On cellular cryo-ET data, DeepFinder localized membrane-bound and cytosolic ribosomes (roughly 3.2 MDa), ribulose 1,5-bisphosphate carboxylase-oxygenase (roughly 560 kDa soluble complex) and photosystem II (roughly 550 kDa membrane complex) with an accuracy comparable to expert-supervised ground truth annotations. DeepFinder is therefore a promising algorithm for the semiautomated analysis of a wide range of molecular targets in cellular tomograms.
Affiliation(s)
- Emmanuel Moebel
  - Serpico Project-Team, Centre Inria Rennes-Bretagne Atlantique and CNRS-UMR 144, Inria, CNRS, Institut Curie, PSL Research University, Campus Universitaire de Beaulieu, Rennes Cedex, France
- Antonio Martinez-Sanchez
  - Department of Computer Science, Faculty of Sciences, University of Oviedo, Oviedo, Spain
  - Health Research Institute of Asturias (ISPA), Avenida Hospital Universitario s/n, Oviedo, Spain
  - Institute of Neuropathology, Cluster of Excellence 'Multiscale Bioimaging: from Molecular Machines to Networks of Excitable Cells', University of Göttingen, Göttingen, Germany
- Lorenz Lamm
  - Helmholtz Pioneer Campus, Helmholtz Zentrum München, Neuherberg, Germany
  - Helmholtz AI, Helmholtz Zentrum München, Neuherberg, Germany
- Ricardo D Righetto
  - Helmholtz Pioneer Campus, Helmholtz Zentrum München, Neuherberg, Germany
- Damien Larivière
  - Fourmentin-Guilbert Scientific Foundation, Noisy-le-Grand, France
- Eric Fourmentin
  - Fourmentin-Guilbert Scientific Foundation, Noisy-le-Grand, France
- Stefan Pfeffer
  - Max Planck Institute of Biochemistry, Martinsried, Germany
  - Zentrum für Molekulare Biologie der Universität Heidelberg, Heidelberg, Germany
- Julio Ortiz
  - Max Planck Institute of Biochemistry, Martinsried, Germany
  - Ernst Ruska-Centre, Wilhelm-Johnen-Straße, Jülich, Germany
- Tingying Peng
  - Helmholtz AI, Helmholtz Zentrum München, Neuherberg, Germany
- Benjamin D Engel
  - Helmholtz Pioneer Campus, Helmholtz Zentrum München, Neuherberg, Germany
  - Department of Chemistry, Technical University of Munich, Garching, Germany
- Charles Kervrann
  - Serpico Project-Team, Centre Inria Rennes-Bretagne Atlantique and CNRS-UMR 144, Inria, CNRS, Institut Curie, PSL Research University, Campus Universitaire de Beaulieu, Rennes Cedex, France
|
39
|
Zielinski M, Röder C, Schröder GF. Challenges in sample preparation and structure determination of amyloids by cryo-EM. J Biol Chem 2021; 297:100938. [PMID: 34224730 PMCID: PMC8335658 DOI: 10.1016/j.jbc.2021.100938] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 02/03/2021] [Revised: 06/28/2021] [Accepted: 07/01/2021] [Indexed: 01/12/2023] Open
Abstract
Amyloids share a common architecture but play disparate biological roles in processes ranging from bacterial defense mechanisms to protein misfolding diseases. Their structures are highly polymorphic, which makes them difficult to study by X-ray diffraction or NMR spectroscopy. Our understanding of amyloid structures is due in large part to recent advances in the field of cryo-EM, which allows for determining the polymorphs separately. In this review, we highlight the main stepping stones leading to the substantial number of high-resolution amyloid fibril structures known today as well as recent developments regarding automation and software in cryo-EM. We discuss that sample preparation should move closer to physiological conditions to understand how amyloid aggregation and disease are linked. We further highlight new approaches to address heterogeneity and polymorphism of amyloid fibrils in EM image processing and give an outlook to the upcoming challenges in researching the structural biology of amyloids.
Affiliation(s)
- Mara Zielinski
  - Institute of Biological Information Processing, Structural Biochemistry (IBI-7) and JuStruct, Jülich Center for Structural Biology, Forschungszentrum Jülich, Jülich, Germany
- Christine Röder
  - Institute of Biological Information Processing, Structural Biochemistry (IBI-7) and JuStruct, Jülich Center for Structural Biology, Forschungszentrum Jülich, Jülich, Germany
  - Institut für Physikalische Biologie, Heinrich-Heine-Universität Düsseldorf, Düsseldorf, Germany
- Gunnar F Schröder
  - Institute of Biological Information Processing, Structural Biochemistry (IBI-7) and JuStruct, Jülich Center for Structural Biology, Forschungszentrum Jülich, Jülich, Germany
  - Physics Department, Heinrich-Heine-Universität Düsseldorf, Düsseldorf, Germany
|
40
41
DeepEMhancer: a deep learning solution for cryo-EM volume post-processing. Commun Biol 2021; 4:874. [PMID: 34267316 PMCID: PMC8282847 DOI: 10.1038/s42003-021-02399-1] [Citation(s) in RCA: 591] [Impact Index Per Article: 197.0] [Received: 08/20/2020] [Accepted: 06/17/2021] [Indexed: 01/02/2023] Open
Abstract
Cryo-EM maps are valuable sources of information for protein structure modeling. However, due to the loss of contrast at high frequencies, they generally need to be post-processed to improve their interpretability. Most popular approaches, based on global B-factor correction, suffer from limitations. For instance, they ignore the heterogeneity in the map local quality that reconstructions tend to exhibit. Aiming to overcome these problems, we present DeepEMhancer, a deep learning approach designed to perform automatic post-processing of cryo-EM maps. Trained on a dataset of pairs of experimental maps and maps sharpened using their respective atomic models, DeepEMhancer has learned how to post-process experimental maps performing masking-like and sharpening-like operations in a single step. DeepEMhancer was evaluated on a testing set of 20 different experimental maps, showing its ability to reduce noise levels and obtain more detailed versions of the experimental maps. Additionally, we illustrated the benefits of DeepEMhancer on the structure of the SARS-CoV-2 RNA polymerase. Sanchez-Garcia et al. present DeepEMhancer, a deep learning-based method that can automatically perform post-processing of raw cryo-electron microscopy density maps. The authors report that DeepEMhancer globally improves local quality of density maps, and may represent a useful tool for novel structures where PDB models are not readily available.
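For orientation, the global B-factor correction that the abstract contrasts with can be written as a single frequency-dependent weight applied to the whole map. The form below is the commonly used convention, not a formula taken from the DeepEMhancer paper; the symbols $F$, $s$ and $B$ are introduced here for illustration only.

```latex
% F(s): Fourier coefficient of the map at spatial frequency s (in 1/Angstrom)
% B:    a single global B-factor (in Angstrom^2); B < 0 sharpens the map
F_{\mathrm{sharp}}(\mathbf{s}) = F(\mathbf{s})\,\exp\!\left(-\frac{B\,\lvert\mathbf{s}\rvert^{2}}{4}\right)
```

Because one value of $B$ rescales every region identically, this kind of correction cannot account for differences in local map quality, which is the limitation the abstract points to.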
42
Mill L, Wolff D, Gerrits N, Philipp P, Kling L, Vollnhals F, Ignatenko A, Jaremenko C, Huang Y, De Castro O, Audinot JN, Nelissen I, Wirtz T, Maier A, Christiansen S. Synthetic Image Rendering Solves Annotation Problem in Deep Learning Nanoparticle Segmentation. Small Methods 2021; 5:e2100223. [PMID: 34927995 DOI: 10.1002/smtd.202100223] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 02/25/2021] [Revised: 04/17/2021] [Indexed: 05/14/2023]
Abstract
Nanoparticles occur in various environments as a consequence of man-made processes, which raises concerns about their impact on the environment and human health. To allow for proper risk assessment, a precise and statistically relevant analysis of particle characteristics (such as size, shape, and composition) is required that would greatly benefit from automated image analysis procedures. While deep learning shows impressive results in object detection tasks, its applicability is limited by the amount of representative, experimentally collected and manually annotated training data. Here, an elegant, flexible, and versatile method to bypass this costly and tedious data acquisition process is presented. It shows that using a rendering software allows to generate realistic, synthetic training data to train a state-of-the art deep neural network. Using this approach, a segmentation accuracy can be derived that is comparable to man-made annotations for toxicologically relevant metal-oxide nanoparticle ensembles which were chosen as examples. The presented study paves the way toward the use of deep learning for automated, high-throughput particle detection in a variety of imaging techniques such as in microscopies and spectroscopies, for a wide range of applications, including the detection of micro- and nanoplastic particles in water and tissue samples.
Affiliation(s)
- Leonid Mill
  - Pattern Recognition Lab, Friedrich-Alexander-University Erlangen-Nuremberg, 91058, Erlangen, Germany
  - Institute of Optics, Information and Photonics, Friedrich-Alexander-University Erlangen-Nuremberg, 91058, Erlangen, Germany
- David Wolff
  - Institut für Nanotechnologie und korrelative Mikroskopie, 91301, Forchheim, Germany
- Nele Gerrits
  - Health Unit, Flemish Institute for Technological Research, Mol, 2400, Belgium
- Patrick Philipp
  - Advanced Instrumentation for Ion Nano-Analytics, Materials Research and Technology Department, Luxembourg Institute of Science and Technology, Belvaux, L-4422, Luxembourg
- Lasse Kling
  - Institut für Nanotechnologie und korrelative Mikroskopie, 91301, Forchheim, Germany
- Florian Vollnhals
  - Institute of Optics, Information and Photonics, Friedrich-Alexander-University Erlangen-Nuremberg, 91058, Erlangen, Germany
  - Institut für Nanotechnologie und korrelative Mikroskopie, 91301, Forchheim, Germany
- Andrew Ignatenko
  - Advanced Instrumentation for Ion Nano-Analytics, Materials Research and Technology Department, Luxembourg Institute of Science and Technology, Belvaux, L-4422, Luxembourg
- Christian Jaremenko
  - Pattern Recognition Lab, Friedrich-Alexander-University Erlangen-Nuremberg, 91058, Erlangen, Germany
  - Institut für Nanotechnologie und korrelative Mikroskopie, 91301, Forchheim, Germany
- Yixing Huang
  - Pattern Recognition Lab, Friedrich-Alexander-University Erlangen-Nuremberg, 91058, Erlangen, Germany
  - Institut für Nanotechnologie und korrelative Mikroskopie, 91301, Forchheim, Germany
- Olivier De Castro
  - Advanced Instrumentation for Ion Nano-Analytics, Materials Research and Technology Department, Luxembourg Institute of Science and Technology, Belvaux, L-4422, Luxembourg
- Jean-Nicolas Audinot
  - Advanced Instrumentation for Ion Nano-Analytics, Materials Research and Technology Department, Luxembourg Institute of Science and Technology, Belvaux, L-4422, Luxembourg
- Inge Nelissen
  - Health Unit, Flemish Institute for Technological Research, Mol, 2400, Belgium
- Tom Wirtz
  - Advanced Instrumentation for Ion Nano-Analytics, Materials Research and Technology Department, Luxembourg Institute of Science and Technology, Belvaux, L-4422, Luxembourg
- Andreas Maier
  - Pattern Recognition Lab, Friedrich-Alexander-University Erlangen-Nuremberg, 91058, Erlangen, Germany
- Silke Christiansen
  - Institute of Optics, Information and Photonics, Friedrich-Alexander-University Erlangen-Nuremberg, 91058, Erlangen, Germany
  - Physics Department, Free University, 14195, Berlin, Germany
  - Correlative Microscopy and Material Data Department, Fraunhofer Institute for Ceramic Technologies and Systems, 01277, Dresden, Germany
43
Ohashi M, Hosokawa F, Shinkawa T, Iwasaki K. Evaluation of automated particle picking for cryogenic electron microscopy using high-precision transmission electron microscope simulation based on a multi-slice method. Acta Crystallogr D Struct Biol 2021; 77:966-979. [PMID: 34196622 DOI: 10.1107/s2059798321005106] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/06/2020] [Accepted: 05/13/2021] [Indexed: 11/10/2022]
Abstract
This work describes the GRIPS automated particle-picking software for cryogenic electron microscopy and the evaluation of this software using elbis, a high-precision transmission electron microscope (TEM) image simulator. The goal was to develop a method that can pick particles under a small defocus condition where the particles are not clearly visible or under a condition where the particles are exhibiting preferred orientation. The proposed method handles these issues by repeatedly performing three processes, namely extraction, two-dimensional classification and positioning, and by introducing mask processing to exclude areas with particles that have already been picked. TEM images for evaluation were generated with a high-precision TEM image simulator. TEM images containing both particles and amorphous ice were simulated by randomly placing O atoms in the specimen. The experimental results indicate that the proposed method can be used to pick particles correctly under a relatively small defocus condition. Moreover, the results show that the mask processing introduced in the proposed method is valid for particles exhibiting preferred orientation. It is further shown that the proposed method is applicable to data collected from real samples.
Affiliation(s)
- Masataka Ohashi
  - BioNet Lab. Inc., 2-3-28 Nishiki-cho, Tachikawa, Tokyo 190-0022, Japan
- Fumio Hosokawa
  - BioNet Lab. Inc., 2-3-28 Nishiki-cho, Tachikawa, Tokyo 190-0022, Japan
- Takao Shinkawa
  - BioNet Lab. Inc., 2-3-28 Nishiki-cho, Tachikawa, Tokyo 190-0022, Japan
- Kenji Iwasaki
  - University of Tsukuba, Tsukuba Advanced Research Alliance, 1-1-1 Tennodai, Tsukuba, Ibaraki 305-8577, Japan
44
Pakhrin SC, Shrestha B, Adhikari B, KC DB. Deep Learning-Based Advances in Protein Structure Prediction. Int J Mol Sci 2021; 22:5553. [PMID: 34074028 PMCID: PMC8197379 DOI: 10.3390/ijms22115553] [Citation(s) in RCA: 47] [Impact Index Per Article: 15.7] [Received: 04/02/2021] [Revised: 05/12/2021] [Accepted: 05/18/2021] [Indexed: 12/29/2022] Open
Abstract
Obtaining an accurate description of protein structure is a fundamental step toward understanding the underpinning of biology. Although recent advances in experimental approaches have greatly enhanced our capabilities to experimentally determine protein structures, the gap between the number of protein sequences and known protein structures is ever increasing. Computational protein structure prediction is one of the ways to fill this gap. Recently, the protein structure prediction field has witnessed a lot of advances due to Deep Learning (DL)-based approaches as evidenced by the success of AlphaFold2 in the most recent Critical Assessment of protein Structure Prediction (CASP14). In this article, we highlight important milestones and progresses in the field of protein structure prediction due to DL-based methods as observed in CASP experiments. We describe advances in various steps of protein structure prediction pipeline viz. protein contact map prediction, protein distogram prediction, protein real-valued distance prediction, and Quality Assessment/refinement. We also highlight some end-to-end DL-based approaches for protein structure prediction approaches. Additionally, as there have been some recent DL-based advances in protein structure determination using Cryo-Electron (Cryo-EM) microscopy based, we also highlight some of the important progress in the field. Finally, we provide an outlook and possible future research directions for DL-based approaches in the protein structure prediction arena.
Affiliation(s)
- Subash C. Pakhrin
  - Department of Electrical Engineering and Computer Science, Wichita State University, Wichita, KS 67260, USA
- Bikash Shrestha
  - Department of Computer Science, University of Missouri-St. Louis, St. Louis, MO 63121, USA
- Badri Adhikari
  - Department of Computer Science, University of Missouri-St. Louis, St. Louis, MO 63121, USA
- Dukka B. KC
  - Department of Electrical Engineering and Computer Science, Wichita State University, Wichita, KS 67260, USA
45
Kyrilis FL, Belapure J, Kastritis PL. Detecting Protein Communities in Native Cell Extracts by Machine Learning: A Structural Biologist's Perspective. Front Mol Biosci 2021; 8:660542. [PMID: 33937337 PMCID: PMC8082361 DOI: 10.3389/fmolb.2021.660542] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 01/29/2021] [Accepted: 03/18/2021] [Indexed: 11/13/2022] Open
Abstract
Native cell extracts hold great promise for understanding the molecular structure of ordered biological systems at high resolution. This is because higher-order biomolecular interactions, dubbed as protein communities, may be retained in their (near-)native state, in contrast to extensively purifying or artificially overexpressing the proteins of interest. The distinct machine-learning approaches are applied to discover protein-protein interactions within cell extracts, reconstruct dedicated biological networks, and report on protein community members from various organisms. Their validation is also important, e.g., by the cross-linking mass spectrometry or cell biology methods. In addition, the cell extracts are amenable to structural analysis by cryo-electron microscopy (cryo-EM), but due to their inherent complexity, sorting structural signatures of protein communities derived by cryo-EM comprises a formidable task. The application of image-processing workflows inspired by machine-learning techniques would provide improvements in distinguishing structural signatures, correlating proteomic and network data to structural signatures and subsequently reconstructed cryo-EM maps, and, ultimately, characterizing unidentified protein communities at high resolution. In this review article, we summarize recent literature in detecting protein communities from native cell extracts and identify the remaining challenges and opportunities. We argue that the progress in, and the integration of, machine learning, cryo-EM, and complementary structural proteomics approaches would provide the basis for a multi-scale molecular description of protein communities within native cell extracts.
Affiliation(s)
- Fotis L. Kyrilis
  - Interdisciplinary Research Center HALOmem, Charles Tanford Protein Center, Martin Luther University Halle-Wittenberg, Halle (Saale), Germany
  - Institute of Biochemistry and Biotechnology, Martin Luther University Halle-Wittenberg, Halle (Saale), Germany
- Jaydeep Belapure
  - Interdisciplinary Research Center HALOmem, Charles Tanford Protein Center, Martin Luther University Halle-Wittenberg, Halle (Saale), Germany
- Panagiotis L. Kastritis
  - Interdisciplinary Research Center HALOmem, Charles Tanford Protein Center, Martin Luther University Halle-Wittenberg, Halle (Saale), Germany
  - Institute of Biochemistry and Biotechnology, Martin Luther University Halle-Wittenberg, Halle (Saale), Germany
  - Biozentrum, Martin Luther University Halle-Wittenberg, Halle (Saale), Germany
46
Abstract
Deep learning is transforming most areas of science and technology, including electron microscopy. This review paper offers a practical perspective aimed at developers with limited familiarity. For context, we review popular applications of deep learning in electron microscopy. Following, we discuss hardware and software needed to get started with deep learning and interface with electron microscopes. We then review neural network components, popular architectures, and their optimization. Finally, we discuss future directions of deep learning in electron microscopy.
47
Jiménez-Moreno A, Střelák D, Filipovič J, Carazo JM, Sorzano COS. DeepAlign, a 3D alignment method based on regionalized deep learning for Cryo-EM. J Struct Biol 2021; 213:107712. [PMID: 33676034 DOI: 10.1016/j.jsb.2021.107712] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 07/02/2020] [Revised: 02/02/2021] [Accepted: 02/21/2021] [Indexed: 02/02/2023]
Abstract
Cryo Electron Microscopy (Cryo-EM) is currently one of the main tools to reveal the structural information of biological specimens at high resolution. Despite the great development of the techniques involved to solve the biological structures with Cryo-EM in the last years, the reconstructed 3D maps can present lower resolution due to errors committed while processing the information acquired by the microscope. One of the main problems comes from the 3D alignment step, which is an error-prone part of the reconstruction workflow due to the very low signal-to-noise ratio (SNR) common in Cryo-EM imaging. In fact, as we will show in this work, it is not unusual to find a disagreement in the alignment parameters in approximately 20-40% of the processed images, when outputs of different alignment algorithms are compared. In this work, we present a novel method to align sets of single particle images in the 3D space, called DeepAlign. Our proposal is based on deep learning networks that have been successfully used in plenty of problems in image classification. Specifically, we propose to design several deep neural networks on a regionalized basis to classify the particle images in sub-regions and, then, make a refinement of the 3D alignment parameters only inside that sub-region. We show that this method results in accurately aligned images, improving the Fourier shell correlation (FSC) resolution obtained with other state-of-the-art methods while decreasing computational time.
Affiliation(s)
- A Jiménez-Moreno
  - Centro Nac. Biotecnología (CSIC), c/Darwin, 3, 28049 Cantoblanco, Madrid, Spain
- D Střelák
  - Centro Nac. Biotecnología (CSIC), c/Darwin, 3, 28049 Cantoblanco, Madrid, Spain; Faculty of Informatics, Masaryk University, Botanická 68a, 662 00 Brno, Czech Republic; Institute of Computer Science, Masaryk University, Botanická 68a, 60200 Brno, Czech Republic
- J Filipovič
  - Institute of Computer Science, Masaryk University, Botanická 68a, 60200 Brno, Czech Republic
- J M Carazo
  - Centro Nac. Biotecnología (CSIC), c/Darwin, 3, 28049 Cantoblanco, Madrid, Spain.
- C O S Sorzano
  - Centro Nac. Biotecnología (CSIC), c/Darwin, 3, 28049 Cantoblanco, Madrid, Spain; Univ. San Pablo - CEU, Campus Urb. Montepríncipe, 28668 Boadilla del Monte, Madrid, Spain.
48
Workflow towards automated segmentation of agglomerated, non-spherical particles from electron microscopy images using artificial neural networks. Sci Rep 2021; 11:4942. [PMID: 33654161 PMCID: PMC7925552 DOI: 10.1038/s41598-021-84287-6] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 10/22/2020] [Accepted: 02/15/2021] [Indexed: 11/08/2022] Open
Abstract
We present a workflow for obtaining fully trained artificial neural networks that can perform automatic particle segmentations of agglomerated, non-spherical nanoparticles from scanning electron microscopy images "from scratch", without the need for large training data sets of manually annotated images. The whole process only requires about 15 min of hands-on time by a user and can typically be finished within less than 12 h when training on a single graphics card (GPU). After training, SEM image analysis can be carried out by the artificial neural network within seconds. This is achieved by using unsupervised learning for most of the training dataset generation, making heavy use of generative adversarial networks and especially unpaired image-to-image translation via cycle-consistent adversarial networks. We compare the segmentation masks obtained with our suggested workflow qualitatively and quantitatively to state-of-the-art methods using various metrics. Finally, we used the segmentation masks for automatically extracting particle size distributions from the SEM images of TiO2 particles, which were in excellent agreement with particle size distributions obtained manually but could be obtained in a fraction of the time.
49
George B, Assaiya A, Roy RJ, Kembhavi A, Chauhan R, Paul G, Kumar J, Philip NS. CASSPER is a semantic segmentation-based particle picking algorithm for single-particle cryo-electron microscopy. Commun Biol 2021; 4:200. [PMID: 33589717 PMCID: PMC7884729 DOI: 10.1038/s42003-021-01721-1] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 01/02/2020] [Accepted: 01/19/2021] [Indexed: 11/27/2022] Open
Abstract
Particle identification and selection, which is a prerequisite for high-resolution structure determination of biological macromolecules via single-particle cryo-electron microscopy poses a major bottleneck for automating the steps of structure determination. Here, we present a generalized deep learning tool, CASSPER, for the automated detection and isolation of protein particles in transmission microscope images. This deep learning tool uses Semantic Segmentation and a collection of visually prepared training samples to capture the differences in the transmission intensities of protein, ice, carbon, and other impurities found in the micrograph. CASSPER is a semantic segmentation based method that does pixel-level classification and completely eliminates the need for manual particle picking. Integration of Contrast Limited Adaptive Histogram Equalization (CLAHE) in CASSPER enables high-fidelity particle detection in micrographs with variable ice thickness and contrast. A generalized CASSPER model works with high efficiency on unseen datasets and can potentially pick particles on-the-fly, enabling data processing automation.
Affiliation(s)
- Blesson George
  - Artificial Intelligence Research and Intelligent Systems (airis4D), Thelliyoor, Kerala, India
  - Department of Physics, CMS College, Kottayam, Kerala, India
- Anshul Assaiya
  - Laboratory of Membrane Protein Biology, National Centre for Cell Science, S. P. Pune University Campus, Pune, India
- Robin J Roy
  - Artificial Intelligence Research and Intelligent Systems (airis4D), Thelliyoor, Kerala, India
- Ajit Kembhavi
  - Inter-University Centre for Astronomy and Astrophysics (IUCAA), S. P. Pune University Campus, Pune, India
- Radha Chauhan
  - Laboratory of Structural Biology, National Centre for Cell Science, S. P. Pune University Campus, Pune, India
- Geetha Paul
  - Artificial Intelligence Research and Intelligent Systems (airis4D), Thelliyoor, Kerala, India
- Janesh Kumar
  - Laboratory of Membrane Protein Biology, National Centre for Cell Science, S. P. Pune University Campus, Pune, India.
- Ninan S Philip
  - Artificial Intelligence Research and Intelligent Systems (airis4D), Thelliyoor, Kerala, India.
50
Nguyen NP, Ersoy I, Gotberg J, Bunyak F, White TA. DRPnet: automated particle picking in cryo-electron micrographs using deep regression. BMC Bioinformatics 2021; 22:55. [PMID: 33557750 PMCID: PMC7869254 DOI: 10.1186/s12859-020-03948-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 07/02/2020] [Accepted: 12/22/2020] [Indexed: 11/15/2022] Open
Abstract
BACKGROUND Identification and selection of protein particles in cryo-electron micrographs is an important step in single particle analysis. In this study, we developed a deep learning-based particle picking network to automatically detect particle centers from cryoEM micrographs. This is a challenging task due to the nature of cryoEM data, having low signal-to-noise ratios with variable particle sizes, shapes, distributions, grayscale variations as well as other undesirable artifacts.
RESULTS We propose a double convolutional neural network (CNN) cascade for automated detection of particles in cryo-electron micrographs. This approach, entitled Deep Regression Picker Network or "DRPnet", is simple but very effective in recognizing different particle sizes, shapes, distributions and grayscale patterns corresponding to 2D views of 3D particles. Particles are detected by the first network, a fully convolutional regression network (FCRN), which maps the particle image to a continuous distance map that acts like a probability density function of particle centers. Particles identified by FCRN are further refined to reduce false particle detections by the second classification CNN. DRPnet's first CNN pretrained with only a single cryoEM dataset can be used to detect particles from different datasets without retraining. Compared to RELION template-based autopicking, DRPnet results in better particle picking performance with drastically reduced user interactions and processing time. DRPnet also outperforms the state-of-the-art particle picking networks in terms of the supervised detection evaluation metrics recall, precision, and F-measure. To further highlight quality of the picked particle sets, we compute and present additional performance metrics assessing the resulting 3D reconstructions such as number of 2D class averages, efficiency/angular coverage, Rosenthal-Henderson plots and local/global 3D reconstruction resolution.
CONCLUSION DRPnet shows greatly improved time-savings to generate an initial particle dataset compared to manual picking, followed by template-based autopicking. Compared to other networks, DRPnet has equivalent or better performance. DRPnet excels on cryoEM datasets that have low contrast or clumped particles. Evaluating other performance metrics, DRPnet is useful for higher resolution 3D reconstructions with decreased particle numbers or unknown symmetry, detecting particles with better angular orientation coverage.
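The recall, precision, and F-measure named in this abstract can be illustrated with a short sketch. This is not DRPnet's evaluation code: the greedy nearest-center matching, the distance tolerance `tol`, and the function names `match_picks` and `detection_scores` are assumptions made here for illustration, and real benchmarks may define a "correct" pick differently.

```python
import math


def match_picks(picked, truth, tol):
    """Count ground-truth centers that have an unused pick within `tol` pixels.

    Greedy matching (an assumption of this sketch): each ground-truth center
    claims its nearest still-unused pick, so a pick is never counted twice.
    """
    unused = list(picked)
    true_positives = 0
    for t in truth:
        best, best_d = None, tol
        for p in unused:
            d = math.dist(p, t)  # Euclidean distance between two centers
            if d <= best_d:
                best, best_d = p, d
        if best is not None:
            unused.remove(best)
            true_positives += 1
    return true_positives


def detection_scores(picked, truth, tol=10.0):
    """Return (precision, recall, F-measure) for a set of picked centers."""
    tp = match_picks(picked, truth, tol)
    precision = tp / len(picked) if picked else 0.0
    recall = tp / len(truth) if truth else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure


if __name__ == "__main__":
    # Hypothetical coordinates, in pixels.
    picked = [(10, 10), (50, 52), (200, 200)]
    truth = [(12, 11), (50, 50), (120, 120), (300, 300)]
    print(detection_scores(picked, truth))
```

In the hypothetical data, two of the three picks fall within the tolerance of a true center, so precision is 2/3, recall is 2/4, and the F-measure is their harmonic mean, 4/7.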
Affiliation(s)
- Nguyen Phuoc Nguyen
  - Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO USA
- Ilker Ersoy
  - Institute for Data Science and Informatics, University of Missouri, Columbia, MO USA
- Jacob Gotberg
  - Research Computing Support Services, University of Missouri, Columbia, MO USA
- Filiz Bunyak
  - Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO USA
- Tommi A. White
  - Department of Biochemistry, University of Missouri, Columbia, MO USA
  - Electron Microscopy Core, University of Missouri, Columbia, MO USA