1
|
Sokolov I. On machine learning analysis of atomic force microscopy images for image classification, sample surface recognition. Phys Chem Chem Phys 2024; 26:11263-11270. [PMID: 38477533 PMCID: PMC11182436 DOI: 10.1039/d3cp05673b] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/14/2024]
Abstract
Atomic force microscopy (AFM or SPM) imaging is one of the best matches with machine learning (ML) analysis among microscopy techniques. The digital format of AFM images allows for direct utilization in ML algorithms without the need for additional processing. Additionally, AFM enables the simultaneous imaging of distributions of over a dozen different physicochemical properties of sample surfaces, a process known as multidimensional imaging. While this wealth of information can be challenging to analyze using traditional methods, ML provides a seamless approach to this task. However, the relatively slow speed of AFM imaging poses a challenge in applying deep learning methods broadly used in image recognition. This perspective focuses on ML recognition/classification when using a relatively small number of AFM images, i.e., a small database. We discuss ML methods other than popular deep-learning neural networks. The described approach has already been successfully used to analyze and classify the surfaces of biological cells. It can be applied to the recognition of medical images, specific material processing, forensic studies, and even the authentication of artworks. A general template for ML analysis specific to AFM is suggested, with a specific example of the identification of cell phenotype. Special attention is given to the analysis of the statistical significance of the obtained results, an important feature that is often overlooked in papers dealing with machine learning. A simple method for finding statistical significance is also described.
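The abstract above does not say which significance test the paper uses. As an illustration only, one common way to check whether a classifier's accuracy on a small image set is better than chance is a label permutation test. The sketch below is an assumption, not the paper's method; the function names and the toy labels are hypothetical.

```python
import random


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_p_value(y_true, y_pred, n_permutations=2000, seed=0):
    """Estimate how often shuffled labels reach the observed accuracy.

    Hypothetical helper: shuffles the true labels, recomputes accuracy
    against the fixed predictions, and counts the shuffles that do at
    least as well as the real labelling.
    """
    rng = random.Random(seed)
    observed = accuracy(y_true, y_pred)
    shuffled = list(y_true)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if accuracy(shuffled, y_pred) >= observed:
            hits += 1
    # The +1 terms keep the estimate strictly above zero.
    return observed, (hits + 1) / (n_permutations + 1)


# Toy data (invented for illustration): 20 images, two classes, 2 errors.
y_true = [0] * 10 + [1] * 10
y_pred = [0] * 9 + [1] + [1] * 9 + [0]
observed, p_value = permutation_p_value(y_true, y_pred)
print(f"accuracy = {observed:.2f}, permutation p-value = {p_value:.4f}")
```

A small p-value here only says that the agreement between predictions and labels is unlikely under random labelling; it does not replace a held-out test set.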
Affiliation(s)
- I Sokolov
- Department of Mechanical Engineering, Tufts University, Medford, MA 02155, USA.
- Department of Biomedical Engineering, Tufts University, Medford, MA 02155, USA
- Department of Physics, Tufts University, Medford, MA, 02155, USA
| |
|
2
|
Dai X, Wu L, Yoo S, Liu Q. Integrating AlphaFold and deep learning for atomistic interpretation of cryo-EM maps. Brief Bioinform 2023; 24:bbad405. [PMID: 37982712 DOI: 10.1093/bib/bbad405] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2023] [Revised: 10/09/2023] [Accepted: 10/23/2023] [Indexed: 11/21/2023] Open
Abstract
Interpretation of cryo-electron microscopy (cryo-EM) maps requires building and fitting 3D atomic models of biological molecules. AlphaFold-predicted models generate initial 3D coordinates; however, model inaccuracy and conformational heterogeneity often necessitate labor-intensive manual model building and fitting into cryo-EM maps. In this work, we designed a protein model-building workflow, which combines a deep-learning cryo-EM map feature enhancement tool, CryoFEM (Cryo-EM Feature Enhancement Model) and AlphaFold. A benchmark test using 36 cryo-EM maps shows that CryoFEM achieves state-of-the-art performance in optimizing the Fourier Shell Correlations between the maps and the ground truth models. Furthermore, in a subset of 17 datasets where the initial AlphaFold predictions are less accurate, the workflow significantly improves their model accuracy. Our work demonstrates that the integration of modern deep learning image enhancement and AlphaFold may lead to automated model building and fitting for the atomistic interpretation of cryo-EM maps.
Affiliation(s)
- Xin Dai
- Computational Science Initiative, Brookhaven National Laboratory, Upton, NY, USA
| | - Longlong Wu
- Condensed Matter Physics and Materials Science Department, Brookhaven National Laboratory, Upton, NY, USA
| | - Shinjae Yoo
- Computational Science Initiative, Brookhaven National Laboratory, Upton, NY, USA
| | - Qun Liu
- Biology Department, Brookhaven National Laboratory, Upton, NY, USA
| |
|
3
|
DiIorio MC, Kulczyk AW. Novel Artificial Intelligence-Based Approaches for Ab Initio Structure Determination and Atomic Model Building for Cryo-Electron Microscopy. Micromachines 2023; 14:1674. [PMID: 37763837 PMCID: PMC10534518 DOI: 10.3390/mi14091674] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 08/02/2023] [Revised: 08/21/2023] [Accepted: 08/25/2023] [Indexed: 09/29/2023]
Abstract
Single particle cryo-electron microscopy (cryo-EM) has emerged as the prevailing method for near-atomic structure determination, shedding light on the important molecular mechanisms of biological macromolecules. However, the inherent dynamics and structural variability of biological complexes coupled with the large number of experimental images generated by a cryo-EM experiment make data processing nontrivial. In particular, ab initio reconstruction and atomic model building remain major bottlenecks that demand substantial computational resources and manual intervention. Approaches utilizing recent innovations in artificial intelligence (AI) technology, particularly deep learning, have the potential to overcome the limitations that cannot be adequately addressed by traditional image processing approaches. Here, we review newly proposed AI-based methods for ab initio volume generation, heterogeneous 3D reconstruction, and atomic model building. We highlight the advancements made by the implementation of AI methods, as well as discuss remaining limitations and areas for future development.
Affiliation(s)
- Megan C. DiIorio
- Institute for Quantitative Biomedicine, Rutgers University, 174 Frelinghuysen Road, Piscataway, NJ 08854, USA
| | - Arkadiusz W. Kulczyk
- Institute for Quantitative Biomedicine, Rutgers University, 174 Frelinghuysen Road, Piscataway, NJ 08854, USA
- Department of Biochemistry & Microbiology, Rutgers University, 76 Lipman Drive, New Brunswick, NJ 08901, USA
| |
|
4
|
Sekmen A, Al Nasr K, Bilgin B, Koku AB, Jones C. Mathematical and Machine Learning Approaches for Classification of Protein Secondary Structure Elements from Cα Coordinates. Biomolecules 2023; 13:923. [PMID: 37371503 DOI: 10.3390/biom13060923] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2023] [Revised: 05/16/2023] [Accepted: 05/16/2023] [Indexed: 06/29/2023] Open
Abstract
Determining Secondary Structure Elements (SSEs) for any protein is crucial as an intermediate step for experimental tertiary structure determination. SSEs are identified using popular tools such as DSSP and STRIDE. These tools use atomic information to locate hydrogen bonds to identify SSEs. When some spatial atomic details are missing, locating SSEs is hindered. To address this problem, three approaches for classifying SSE types using Cα atoms in protein chains were developed: (1) a mathematical approach, (2) a deep learning approach, and (3) an ensemble of five machine learning models. The proposed methods were compared against each other and with a state-of-the-art approach, PCASSO.
Affiliation(s)
- Ali Sekmen
- Department of Computer Science, Tennessee State University, Nashville, TN 37209, USA
| | - Kamal Al Nasr
- Department of Computer Science, Tennessee State University, Nashville, TN 37209, USA
| | - Bahadir Bilgin
- Department of Computer Science, Tennessee State University, Nashville, TN 37209, USA
- Department of Mechanical Engineering, Middle East Technical University, Ankara 06800, Türkiye
| | - Ahmet Bugra Koku
- Department of Mechanical Engineering, Middle East Technical University, Ankara 06800, Türkiye
- Center for Robotics and AI, Middle East Technical University, Ankara 06800, Türkiye
| | - Christopher Jones
- Department of Computer Science, Tennessee State University, Nashville, TN 37209, USA
| |
|
5
|
Giri N, Roy RS, Cheng J. Deep learning for reconstructing protein structures from cryo-EM density maps: Recent advances and future directions. Curr Opin Struct Biol 2023; 79:102536. [PMID: 36773336 PMCID: PMC10023387 DOI: 10.1016/j.sbi.2023.102536] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 09/14/2022] [Revised: 12/20/2022] [Accepted: 01/03/2023] [Indexed: 02/11/2023]
Abstract
Cryo-Electron Microscopy (cryo-EM) has emerged in recent years as a key technology to determine the structure of proteins, particularly large protein complexes and assemblies. A key challenge in cryo-EM data analysis is to automatically reconstruct accurate protein structures from cryo-EM density maps. In this review, we briefly overview various deep learning methods for building protein structures from cryo-EM density maps, analyze their impact, and discuss the challenges of preparing high-quality data sets for training deep learning models. Looking into the future, more advanced deep learning models that effectively integrate cryo-EM data with other sources of complementary data, such as protein sequences and AlphaFold-predicted structures, need to be developed to further advance the field.
Affiliation(s)
- Nabin Giri
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, 65211, Missouri, USA; NextGen Precision Health, University of Missouri, Columbia, 65211, Missouri, USA. https://twitter.com/@nvngiri
| | - Raj S Roy
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, 65211, Missouri, USA. https://twitter.com/@rajshekhorroy
| | - Jianlin Cheng
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, 65211, Missouri, USA; NextGen Precision Health, University of Missouri, Columbia, 65211, Missouri, USA.
| |
|
6
|
Si D, Chen J, Nakamura A, Chang L, Guan H. Smart de novo Macromolecular Structure Modeling from Cryo-EM Maps. J Mol Biol 2023; 435:167967. [PMID: 36681181 DOI: 10.1016/j.jmb.2023.167967] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2022] [Revised: 01/04/2023] [Accepted: 01/12/2023] [Indexed: 01/20/2023]
Abstract
The study of macromolecular structures has expanded our understanding of the amazing cell machinery, and such knowledge has changed how the pharmaceutical industry develops new vaccines in recent years. Traditionally, X-ray crystallography has been the main method for structure determination; however, cryogenic electron microscopy (cryo-EM) has become increasingly popular due to recent advancements in hardware and software. The number of cryo-EM maps deposited in the EMDataResource (formerly EMDatabase) has increased dramatically since 2002 and continues to do so. De novo macromolecular complex modeling is a labor-intensive process; therefore, it is highly desirable to develop software that can automate this process. Here we discuss our automated, data-driven, and artificial intelligence approaches including map processing, feature extraction, model building, and target identification. Recently, we have enabled DNA/RNA modeling in our deep learning-based prediction tool, DeepTracer. We have also developed DeepTracer-ID, a tool that can identify proteins solely based on the cryo-EM map. In this paper, we will present our accumulated experiences in developing deep learning-based methods surrounding macromolecule modeling applications.
Affiliation(s)
- Dong Si
- Division of Computing and Software Systems, University of Washington Bothell, Bothell, WA 98011, United States.
| | - Jason Chen
- Division of Computing and Software Systems, University of Washington Bothell, Bothell, WA 98011, United States
| | - Andrew Nakamura
- Division of Computing and Software Systems, University of Washington Bothell, Bothell, WA 98011, United States
| | - Luca Chang
- Division of Computing and Software Systems, University of Washington Bothell, Bothell, WA 98011, United States
| | - Haowen Guan
- Division of Computing and Software Systems, University of Washington Bothell, Bothell, WA 98011, United States
| |
|
7
|
Garcia Condado J, Muñoz-Barrutia A, Sorzano COS. Automatic determination of the handedness of single-particle maps of macromolecules solved by CryoEM. J Struct Biol 2022; 214:107915. [PMID: 36341955 DOI: 10.1016/j.jsb.2022.107915] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2022] [Revised: 08/29/2022] [Accepted: 10/25/2022] [Indexed: 12/07/2022]
Abstract
Single-Particle Analysis by Cryo-Electron Microscopy is a well-established technique to elucidate the three-dimensional (3D) structure of biological macromolecules. The orientation of the acquired projection images must be initially estimated without any reference to the final structure. In this step, algorithms may find a mirrored version of all the orientations resulting in a mirrored 3D map. It is as compatible with the acquired images as its unmirrored version from the image processing point of view, only that it is not biologically plausible. In this article, we introduce HaPi (Handedness Pipeline), the first method to automatically determine the hand of electron density maps of macromolecules solved by CryoEM. HaPi is built by training two 3D convolutional neural networks. The first determines α-helices in a map, and the second determines whether the α-helix is left-handed or right-handed. A consensus strategy defines the overall map hand. The pipeline is trained on simulated and experimental data. The handedness can be detected only for maps whose resolution is better than 5 Å. HaPi can identify the hand in 89% of new simulated maps correctly. Moreover, we evaluated all the maps deposited at the Electron Microscopy Data Bank and 11 structures uploaded with the incorrect hand were identified.
Affiliation(s)
- J Garcia Condado
- Biocruces Bizkaia Instituto Investigación Sanitaria, Cruces Plaza, 48903 Barakaldo, Bizkaia, Spain; Universidad Carlos III de Madrid, Avda. de la Universidad 30, 28911 Leganés, Madrid, Spain; Centro Nacional de Biotecnologia (CNB-CSIC), Darwin, 3, Campus Universidad Autonoma, 28049 Cantoblanco, Madrid, Spain
| | - A Muñoz-Barrutia
- Universidad Carlos III de Madrid, Avda. de la Universidad 30, 28911 Leganés, Madrid, Spain
| | - C O S Sorzano
- Centro Nacional de Biotecnologia (CNB-CSIC), Darwin, 3, Campus Universidad Autonoma, 28049 Cantoblanco, Madrid, Spain.
| |
|
8
|
Beton JG, Cragnolini T, Kaleel M, Mulvaney T, Sweeney A, Topf M. Integrating model simulation tools and cryo-electron microscopy. WIREs Computational Molecular Science 2022. [DOI: 10.1002/wcms.1642] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]
Affiliation(s)
- Joseph George Beton
- Centre for Structural Systems Biology (CSSB) Leibniz‐Institut für Virologie (LIV) Hamburg Germany
| | - Tristan Cragnolini
- Institute of Structural and Molecular Biology, Birkbeck and University College London London UK
| | - Manaz Kaleel
- Centre for Structural Systems Biology (CSSB) Leibniz‐Institut für Virologie (LIV) Hamburg Germany
| | - Thomas Mulvaney
- Centre for Structural Systems Biology (CSSB) Leibniz‐Institut für Virologie (LIV) Hamburg Germany
| | - Aaron Sweeney
- Centre for Structural Systems Biology (CSSB) Leibniz‐Institut für Virologie (LIV) Hamburg Germany
| | - Maya Topf
- Centre for Structural Systems Biology (CSSB) Leibniz‐Institut für Virologie (LIV) Hamburg Germany
| |
|
9
|
Botifoll M, Pinto-Huguet I, Arbiol J. Machine learning in electron microscopy for advanced nanocharacterization: current developments, available tools and future outlook. Nanoscale Horizons 2022; 7:1427-1477. [PMID: 36239693 DOI: 10.1039/d2nh00377e] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Indexed: 06/16/2023]
Abstract
In the last few years, electron microscopy has experienced a new methodological paradigm aimed to fix the bottlenecks and overcome the challenges of its analytical workflow. Machine learning and artificial intelligence are answering this call providing powerful resources towards automation, exploration, and development. In this review, we evaluate the state-of-the-art of machine learning applied to electron microscopy (and obliquely, to materials and nano-sciences). We start from the traditional imaging techniques to reach the newest higher-dimensionality ones, also covering the recent advances in spectroscopy and tomography. Additionally, the present review provides a practical guide for microscopists, and in general for material scientists, but not necessarily advanced machine learning practitioners, to straightforwardly apply the offered set of tools to their own research. To conclude, we explore the state-of-the-art of other disciplines with a broader experience in applying artificial intelligence methods to their research (e.g., high-energy physics, astronomy, Earth sciences, and even robotics, videogames, or marketing and finances), in order to narrow down the incoming future of electron microscopy, its challenges and outlook.
Affiliation(s)
- Marc Botifoll
- Catalan Institute of Nanoscience and Nanotechnology (ICN2), CSIC and BIST, Campus UAB, Bellaterra, 08193 Barcelona, Catalonia, Spain.
| | - Ivan Pinto-Huguet
- Catalan Institute of Nanoscience and Nanotechnology (ICN2), CSIC and BIST, Campus UAB, Bellaterra, 08193 Barcelona, Catalonia, Spain.
| | - Jordi Arbiol
- Catalan Institute of Nanoscience and Nanotechnology (ICN2), CSIC and BIST, Campus UAB, Bellaterra, 08193 Barcelona, Catalonia, Spain.
- ICREA, Pg. Lluís Companys 23, 08010 Barcelona, Catalonia, Spain
| |
|
10
|
Thorn A. Artificial intelligence in the experimental determination and prediction of macromolecular structures. Curr Opin Struct Biol 2022; 74:102368. [DOI: 10.1016/j.sbi.2022.102368] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 08/11/2021] [Revised: 02/22/2022] [Accepted: 03/08/2022] [Indexed: 11/26/2022]
|
11
|
Behkamal B, Naghibzadeh M, Pagnani A, Saberi MR, Al Nasr K. LPTD: a novel linear programming-based topology determination method for cryo-EM maps. Bioinformatics 2022; 38:2734-2741. [PMID: 35561171 PMCID: PMC9306757 DOI: 10.1093/bioinformatics/btac170] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2021] [Revised: 03/01/2022] [Accepted: 03/18/2022] [Indexed: 02/03/2023] Open
Abstract
SUMMARY: Topology determination is one of the most important intermediate steps toward building the atomic structure of proteins from their medium-resolution cryo-electron microscopy (cryo-EM) map. The main goal in topology determination is to identify correct matches (i.e. assignment and direction) between secondary structure elements (SSEs) (α-helices and β-sheets) detected in a protein sequence and cryo-EM density map. Despite many recent advances in molecular biology technologies, the problem remains challenging. To overcome the problem, this article proposes a linear programming-based topology determination (LPTD) method to solve the secondary structure topology problem in three-dimensional geometrical space. Through modeling of the protein's sequence with the aid of extracting highly reliable features and a distance-based scoring function, the secondary structure matching problem is transformed into a complete weighted bipartite graph matching problem. Subsequently, an algorithm based on linear programming is developed as a decision-making strategy to extract the true topology (native topology) among all possible topologies. The proposed automatic framework is verified using 12 experimental and 15 simulated α-β proteins. Results demonstrate that LPTD is highly efficient and extremely fast: for 77% of cases in the dataset, the native topology was detected as the first-ranked topology in <2 s. In addition, this method is able to successfully handle large complex proteins with as many as 65 SSEs. Such a large number of SSEs has never been solved with current tools/methods. AVAILABILITY AND IMPLEMENTATION: The LPTD package (source code and data) is publicly available at https://github.com/B-Behkamal/LPTD. Moreover, two test samples as well as instructions for using the graphical user interface have been provided in the shared readme file. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
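To make the "complete weighted bipartite graph matching" formulation concrete, the sketch below finds the minimum-cost one-to-one assignment between sequence SSEs and map SSEs for a tiny example. It deliberately swaps in exhaustive search over all assignments for the paper's linear programming step, ignores SSE direction, and uses an invented cost matrix; the function name `best_assignment` is hypothetical and is not part of the LPTD package.

```python
from itertools import permutations


def best_assignment(cost):
    """Return the minimum-cost one-to-one assignment for a square cost matrix.

    cost[i][j] is the (hypothetical) distance-based score for matching
    sequence SSE i to map SSE j. Exhaustive search is only feasible for
    a handful of SSEs; it stands in for a proper LP or assignment solver.
    """
    n = len(cost)
    best_perm, best_cost = None, float("inf")
    for perm in permutations(range(n)):
        total = sum(cost[i][perm[i]] for i in range(n))
        if total < best_cost:
            best_perm, best_cost = perm, total
    return list(best_perm), best_cost


# Invented scores for three sequence SSEs (rows) and three map SSEs (columns).
cost = [
    [1.0, 4.0, 5.0],
    [3.0, 0.5, 6.0],
    [4.0, 7.0, 2.0],
]
assignment, total = best_assignment(cost)
print(assignment, total)  # [0, 1, 2] 3.5
```

The number of candidate assignments grows factorially with the number of SSEs, which is why a polynomial-time formulation such as linear programming matters for proteins with dozens of SSEs.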
Affiliation(s)
- Bahareh Behkamal
- Department of Computer Engineering, Faculty of Engineering, Ferdowsi University of Mashhad, Mashhad 9177948974, Iran
| | - Mahmoud Naghibzadeh
- Department of Computer Engineering, Faculty of Engineering, Ferdowsi University of Mashhad, Mashhad 9177948974, Iran
| | - Andrea Pagnani
- Department of Applied Science and Technology (DISAT), Politecnico di Torino, Torino I-10129, Italy
- Italian Institute for Genomic Medicine (IIGM), IRCC-Candiolo, Candiolo (TO) I-10060, Italy
- INFN Sezione di Torino, Torino I-10125, Italy
| | - Mohammad Reza Saberi
- Medicinal Chemistry Department, School of Pharmacy, Mashhad University of Medical Sciences, Mashhad 9177899191, Iran
- Bioinformatics Research Group, Mashhad University of Medical Sciences, Mashhad 9177899191, Iran
| | - Kamal Al Nasr
- Department of Computer Science, Tennessee State University, Nashville, TN 37209, USA
| |
|
12
|
Wu JG, Yan Y, Zhang DX, Liu BW, Zheng QB, Xie XL, Liu SQ, Ge SX, Hou ZG, Xia NS. Machine Learning for Structure Determination in Single-Particle Cryo-Electron Microscopy: A Systematic Review. IEEE Transactions on Neural Networks and Learning Systems 2022; 33:452-472. [PMID: 34932487 DOI: 10.1109/tnnls.2021.3131325] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/14/2023]
Abstract
Recently, single-particle cryo-electron microscopy (cryo-EM) has become an indispensable method for determining macromolecular structures at high resolution to deeply explore the relevant molecular mechanism. Its recent breakthrough is mainly because of the rapid advances in hardware and image processing algorithms, especially machine learning. As an essential support of single-particle cryo-EM, machine learning has powered many aspects of structure determination and greatly promoted its development. In this article, we provide a systematic review of the applications of machine learning in this field. Our review begins with a brief introduction of single-particle cryo-EM, followed by the specific tasks and challenges of its image processing. Then, focusing on the workflow of structure determination, we describe relevant machine learning algorithms and applications at different steps, including particle picking, 2-D clustering, 3-D reconstruction, and other steps. As different tasks exhibit distinct characteristics, we introduce the evaluation metrics for each task and summarize their dynamics of technology development. Finally, we discuss the open issues and potential trends in this promising field.
|
13
|
Behkamal B, Naghibzadeh M, Saberi MR, Tehranizadeh ZA, Pagnani A, Al Nasr K. Three-Dimensional Graph Matching to Identify Secondary Structure Correspondence of Medium-Resolution Cryo-EM Density Maps. Biomolecules 2021; 11:1773. [PMID: 34944417 PMCID: PMC8698881 DOI: 10.3390/biom11121773] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/30/2021] [Revised: 11/18/2021] [Accepted: 11/20/2021] [Indexed: 01/15/2023] Open
Abstract
Cryo-electron microscopy (cryo-EM) is a structural technique that has played a significant role in protein structure determination in recent years. Compared to the traditional methods of X-ray crystallography and NMR spectroscopy, cryo-EM is capable of producing images of much larger protein complexes. However, cryo-EM reconstructions are limited to medium-resolution (~4-10 Å) for some cases. At this resolution range, a cryo-EM density map can hardly be used to directly determine the structure of proteins at atomic level resolutions, or even at their amino acid residue backbones. At such a resolution, only the position and orientation of secondary structure elements (SSEs) such as α-helices and β-sheets are observable. Consequently, finding the mapping of the secondary structures of the modeled structure (SSEs-A) to the cryo-EM map (SSEs-C) is one of the primary concerns in cryo-EM modeling. To address this issue, this study proposes a novel automatic computational method to identify SSEs correspondence in three-dimensional (3D) space. Initially, through a modeling of the target sequence with the aid of extracting highly reliable features from a generated 3D model and map, the SSEs matching problem is formulated as a 3D vector matching problem. Afterward, the 3D vector matching problem is transformed into a 3D graph matching problem. Finally, a similarity-based voting algorithm combined with the principle of least conflict (PLC) concept is developed to obtain the SSEs correspondence. To evaluate the accuracy of the method, a testing set of 25 experimental and simulated maps with a maximum of 65 SSEs is selected. Comparative studies are also conducted to demonstrate the superiority of the proposed method over some state-of-the-art techniques. The results demonstrate that the method is efficient, robust, and works well in the presence of errors in the predicted secondary structures of the cryo-EM images.
Affiliation(s)
- Bahareh Behkamal
- Department of Computer Engineering, Faculty of Engineering, Ferdowsi University of Mashhad, Mashhad 9177948974, Iran;
| | - Mahmoud Naghibzadeh
- Department of Computer Engineering, Faculty of Engineering, Ferdowsi University of Mashhad, Mashhad 9177948974, Iran;
| | - Mohammad Reza Saberi
- Medicinal Chemistry Department, School of Pharmacy, Mashhad University of Medical Sciences, Mashhad 9177899191, Iran; (M.R.S.); (Z.A.T.)
- Bioinformatics Research Group, Mashhad University of Medical Sciences, Mashhad 9177899191, Iran
| | - Zeinab Amiri Tehranizadeh
- Medicinal Chemistry Department, School of Pharmacy, Mashhad University of Medical Sciences, Mashhad 9177899191, Iran; (M.R.S.); (Z.A.T.)
| | - Andrea Pagnani
- Politecnico di Torino, Corso Duca degli Abruzzi 24, I-10129 Torino, Italy;
- Italian Institute for Genomic Medicine, IRCCS Candiolo, SP-142, I-10060 Candiolo, Italy
- INFN, Sezione di Torino, I-10125 Torino, Italy
| | - Kamal Al Nasr
- Department of Computer Science, Tennessee State University, Nashville, TN 37209, USA
| |
|
14
|
Alshammari M, He J. Combining Cryo-EM Density Map and Residue Contact for Protein Secondary Structure Topologies. Molecules 2021; 26:7049. [PMID: 34834140 PMCID: PMC8624718 DOI: 10.3390/molecules26227049] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/16/2021] [Revised: 11/01/2021] [Accepted: 11/15/2021] [Indexed: 11/23/2022] Open
Abstract
Although atomic structures have been determined directly from cryo-EM density maps with high resolutions, current structure determination methods for medium resolution (5 to 10 Å) cryo-EM maps are limited by the availability of structure templates. Secondary structure traces are lines detected from a cryo-EM density map for α-helices and β-strands of a protein. A topology of secondary structures defines the mapping between a set of sequence segments and a set of traces of secondary structures in three-dimensional space. In order to enhance accuracy in ranking secondary structure topologies, we explored a method that combines three sources of information: a set of sequence segments in 1D, a set of amino acid contact pairs in 2D, and a set of traces in 3D at the secondary structure level. A test of fourteen cases shows that the accuracy of predicted secondary structures is critical for deriving topologies. The use of significant long-range contact pairs is most effective at enriching the rank of the maximum-match topology for proteins with a large number of secondary structures, if the secondary structure prediction is fairly accurate. It was observed that the enrichment depends on the quality of initial topology candidates in this approach. We provide detailed analysis in various cases to show the potential and challenge when combining three sources of information.
Affiliation(s)
| | - Jing He
- Department of Computer Science, Old Dominion University, Norfolk, VA 23529, USA;
| |
|
15
|
He J, Huang SY. EMNUSS: a deep learning framework for secondary structure annotation in cryo-EM maps. Brief Bioinform 2021; 22:bbab156. [PMID: 33954706 PMCID: PMC8574626 DOI: 10.1093/bib/bbab156] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 01/18/2021] [Revised: 03/30/2021] [Accepted: 04/06/2021] [Indexed: 02/06/2023] Open
Abstract
Cryo-electron microscopy (cryo-EM) has become one of the important experimental methods in structure determination. However, despite the rapid growth in the number of deposited cryo-EM maps motivated by advances in microscopy instruments and image processing algorithms, building accurate structure models for cryo-EM maps remains a challenge. Protein secondary structure information, which can be extracted from EM maps, is beneficial for cryo-EM structure modeling. Here, we present a novel secondary structure annotation framework for cryo-EM maps at both intermediate and high resolutions, named EMNUSS. EMNUSS adopts a three-dimensional (3D) nested U-net architecture to assign secondary structures for EM maps. Tested on three diverse datasets including simulated maps, middle-resolution experimental maps, and high-resolution experimental maps, EMNUSS demonstrated its accuracy and robustness in identifying the secondary structures for cryo-EM maps of various resolutions. The EMNUSS program is freely available at http://huanglab.phys.hust.edu.cn/EMNUSS.
Affiliation(s)
- Jiahua He
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P. R. China
| | - Sheng-You Huang
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P. R. China
| |
|
16
|
Mu Y, Sazzed S, Alshammari M, Sun J, He J. A Tool for Segmentation of Secondary Structures in 3D Cryo-EM Density Map Components Using Deep Convolutional Neural Networks. Frontiers in Bioinformatics 2021; 1:710119. [PMID: 36303800 PMCID: PMC9581063 DOI: 10.3389/fbinf.2021.710119] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 05/15/2021] [Accepted: 09/28/2021] [Indexed: 07/20/2023] Open
Abstract
Although cryo-electron microscopy (cryo-EM) has been successfully used to derive atomic structures for many proteins, it is still challenging to derive atomic structures when the resolution of cryo-EM density maps is in the medium resolution range, such as 5-10 Å. Detection of protein secondary structures, such as helices and β-sheets, from cryo-EM density maps provides constraints for deriving atomic structures from such maps. As more deep learning methodologies are being developed for solving various molecular problems, effective tools are needed for users to access them. We have developed an effective software bundle, DeepSSETracer, for the detection of protein secondary structure from cryo-EM component maps in medium resolution. The bundle contains the network architecture and a U-Net model trained with a curriculum and gradient of episodic memory (GEM). The bundle integrates the deep neural network with the visualization capacity provided in ChimeraX. Using a Linux server that is remotely accessed by Windows users, it takes about 6 s on one CPU and one GPU for the trained deep neural network to detect secondary structures in a cryo-EM component map containing 446 amino acids. A test using 28 chain components of cryo-EM maps shows overall residue-level F1 scores of 0.72 and 0.65 to detect helices and β-sheets, respectively. Although deep learning applications are built on software frameworks, such as PyTorch and Tensorflow, our pioneer work here shows that integration of deep learning applications with ChimeraX is a promising and effective approach. Our experiments show that the F1 score measured at the residue level is an effective evaluation of secondary structure detection for individual classes. The test using 28 cryo-EM component maps shows that DeepSSETracer detects β-sheets more accurately than Emap2sec+, with a weighted average residue-level F1 score of 0.65 and 0.42, respectively. 
It also shows that Emap2sec+ detects helices more accurately than DeepSSETracer with a weighted average residue-level F1 score of 0.77 and 0.72 respectively.
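The residue-level F1 score reported in the abstract above can be illustrated with a minimal sketch. It assumes each residue carries a single-letter label (`H` for helix, `E` for β-sheet, `C` for anything else); this encoding and the function names are hypothetical and are not taken from DeepSSETracer or Emap2sec+. The weighted average weights each chain by its residue count, which is an assumption about how the reported averages were formed.

```python
def per_class_f1(true_labels, pred_labels, classes=("H", "E")):
    """Return {class: F1} computed residue by residue.

    true_labels / pred_labels: equal-length sequences of per-residue labels
    (hypothetical encoding: 'H' helix, 'E' beta-sheet, 'C' other).
    """
    if len(true_labels) != len(pred_labels):
        raise ValueError("label sequences must have the same length")
    scores = {}
    for cls in classes:
        tp = fp = fn = 0
        for t, p in zip(true_labels, pred_labels):
            if p == cls and t == cls:
                tp += 1          # residue correctly assigned to this class
            elif p == cls:
                fp += 1          # predicted as this class, but it is not
            elif t == cls:
                fn += 1          # belongs to this class, but was missed
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        scores[cls] = 2 * tp / denom if denom else 0.0
    return scores


def weighted_average_f1(chain_scores, chain_sizes):
    """Average per-chain F1 scores, weighting each chain by its residue count
    (assumed weighting scheme)."""
    total = sum(chain_sizes)
    if total == 0:
        raise ValueError("chain_sizes must sum to a positive number")
    return sum(s * n for s, n in zip(chain_scores, chain_sizes)) / total


# Toy example: 10 residues, one helix residue and one sheet residue mislabeled.
scores = per_class_f1("HHHHCCEEEE", "HHHCCCEEHE")
print(scores)
```

Because the score is computed per class, a method can lead on helices and trail on β-sheets at the same time, which is the pattern the abstract describes for the two tools.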
Affiliation(s)
- Jing He
- Correspondence: Jing He; Jiangwen Sun

17
Zumbado-Corrales M, Esquivel-Rodríguez J. EvoSeg: Automated Electron Microscopy Segmentation through Random Forests and Evolutionary Optimization. Biomimetics (Basel) 2021; 6:biomimetics6020037. [PMID: 34206006 PMCID: PMC8293153 DOI: 10.3390/biomimetics6020037] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/16/2021] [Revised: 05/17/2021] [Accepted: 05/28/2021] [Indexed: 11/30/2022]
Abstract
Electron Microscopy Maps are key in the study of bio-molecular structures, ranging from borderline atomic level to the sub-cellular range. These maps describe the envelopes that cover possibly a very large number of proteins that form molecular machines within the cell. Within those envelopes, we are interested to find what regions correspond to specific proteins so that we can understand how they function, and design drugs that can enhance or suppress a process that they are involved in, along with other experimental purposes. A classic approach by which we can begin the exploration of map regions is to apply a segmentation algorithm. This yields a mask where each voxel in 3D space is assigned an identifier that maps it to a segment; an ideal segmentation would map each segment to one protein unit, which is rarely the case. In this work, we present a method that uses bio-inspired optimization, through an Evolutionary-Optimized Segmentation algorithm, to iteratively improve upon baseline segments obtained from a classical approach, called watershed segmentation. The cost function used by the evolutionary optimization is based on an ideal segmentation classifier trained as part of this development, which uses basic structural information available to scientists, such as the number of expected units, volume and topology. We show that a basic initial segmentation with the additional information allows our evolutionary method to find better segmentation results, compared to the baseline generated by the watershed.

18
DeepTracer for fast de novo cryo-EM protein structure modeling and special studies on CoV-related complexes. Proc Natl Acad Sci U S A 2021; 118:2017525118. [PMID: 33361332 PMCID: PMC7812826 DOI: 10.1073/pnas.2017525118] [Citation(s) in RCA: 102] [Impact Index Per Article: 34.0] [Indexed: 01/18/2023]
Abstract
Electron cryomicroscopy (cryo-EM), a 2017 Nobel prize-awarded technology, provides direct 3D maps of macromolecules and explains the shape and interactions of protein complexes such as SARS-CoV-2 viral proteins and human cell receptors. This understanding can be combined with detailed structural information gathered using other technologies to form the basis for modeling course of diseases and for designing therapeutic drugs. However, ab initio modeling of protein complex structure remains a challenging problem. Here, we present DeepTracer, a fully automated and robust tool that determines the all-atom structure of a protein complex based solely on its cryo-EM map and amino acid sequence, with improved accuracy and efficiency compared to previous methods. We also provide a web service for global access. Information about macromolecular structure of protein complexes and related cellular and molecular mechanisms can assist the search for vaccines and drug development processes. To obtain such structural information, we present DeepTracer, a fully automated deep learning-based method for fast de novo multichain protein complex structure determination from high-resolution cryoelectron microscopy (cryo-EM) maps. We applied DeepTracer on a previously published set of 476 raw experimental cryo-EM maps and compared the results with a current state of the art method. The residue coverage increased by over 30% using DeepTracer, and the rmsd value improved from 1.29 Å to 1.18 Å. Additionally, we applied DeepTracer on a set of 62 coronavirus-related cryo-EM maps, among them 10 with no deposited structure available in EMDataResource. We observed an average residue match of 84% with the deposited structures and an average rmsd of 0.93 Å. Additional tests with related methods further exemplify DeepTracer’s competitive accuracy and efficiency of structure modeling. 
DeepTracer allows for exceptionally fast computations, making it possible to trace around 60,000 residues in 350 chains within only 2 h. The web service is globally accessible at https://deeptracer.uw.edu.

19
Seffernick JT, Lindert S. Hybrid methods for combined experimental and computational determination of protein structure. J Chem Phys 2020; 153:240901. [PMID: 33380110 PMCID: PMC7773420 DOI: 10.1063/5.0026025] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Received: 08/20/2020] [Accepted: 11/10/2020] [Indexed: 02/04/2023]
Abstract
Knowledge of protein structure is paramount to the understanding of biological function, developing new therapeutics, and making detailed mechanistic hypotheses. Therefore, methods to accurately elucidate three-dimensional structures of proteins are in high demand. While there are a few experimental techniques that can routinely provide high-resolution structures, such as x-ray crystallography, nuclear magnetic resonance (NMR), and cryo-EM, which have been developed to determine the structures of proteins, these techniques each have shortcomings and thus cannot be used in all cases. However, additionally, a large number of experimental techniques that provide some structural information, but not enough to assign atomic positions with high certainty have been developed. These methods offer sparse experimental data, which can also be noisy and inaccurate in some instances. In cases where it is not possible to determine the structure of a protein experimentally, computational structure prediction methods can be used as an alternative. Although computational methods can be performed without any experimental data in a large number of studies, inclusion of sparse experimental data into these prediction methods has yielded significant improvement. In this Perspective, we cover many of the successes of integrative modeling, computational modeling with experimental data, specifically for protein folding, protein-protein docking, and molecular dynamics simulations. We describe methods that incorporate sparse data from cryo-EM, NMR, mass spectrometry, electron paramagnetic resonance, small-angle x-ray scattering, Förster resonance energy transfer, and genetic sequence covariation. Finally, we highlight some of the major challenges in the field as well as possible future directions.
Affiliation(s)
- Justin T. Seffernick
- Department of Chemistry and Biochemistry, Ohio State University, Columbus, Ohio 43210, USA
- Steffen Lindert
- Department of Chemistry and Biochemistry, Ohio State University, Columbus, Ohio 43210, USA

20
Behkamal B, Naghibzadeh M, Pagnani A, Saberi MR, Al Nasr K. Solving the α-helix correspondence problem at medium-resolution Cryo-EM maps through modeling and 3D matching. J Mol Graph Model 2020; 103:107815. [PMID: 33338845 DOI: 10.1016/j.jmgm.2020.107815] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 07/06/2020] [Revised: 11/09/2020] [Accepted: 11/18/2020] [Indexed: 11/30/2022]
Abstract
Cryo-electron microscopy (cryo-EM) has recently emerged as a prominent biophysical method for macromolecular structure determination. Many research efforts have been devoted to producing cryo-EM images, density maps, at near-atomic resolution. Despite many advances in technology, the resolution of the generated density maps may not be sufficiently adequate and informative to directly construct the atomic structure of proteins. At medium resolution (∼4-10 Å), secondary structure elements (α-helices and β-sheets) are discernible, whereas finding the correspondence of secondary structure elements detected in the density map with those on the sequence remains a challenging problem. In this paper, an automatic framework is proposed to solve the α-helix correspondence problem in three-dimensional space. Through modeling of the sequence with the aid of a novel strategy, the α-helix correspondence problem is initially transformed into a complete weighted bipartite graph matching problem. An innovative correlation-based scoring function based on a well-known and robust statistical method is proposed for weighting the graph. Moreover, two local optimization algorithms, the Greedy and Improved Greedy algorithms, are presented to find the α-helix correspondence. A widely used data set including 16 reconstructed and 4 experimental cryo-EM maps was chosen to verify the accuracy and reliability of the proposed automatic method. The experimental results demonstrate that the automatic method is highly efficient (86.25% accuracy), robust (11.3% error rate), fast (∼1.4 s), and works independently of the cryo-EM skeleton.
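The abstract above casts helix correspondence as matching on a complete weighted bipartite graph solved by a greedy local optimization. The following is a generic sketch of greedy bipartite matching only; it is not the authors' Greedy or Improved Greedy algorithm, and the helix identifiers and scores are made-up illustrative values rather than output of their correlation-based scoring function.

```python
def greedy_match(weights):
    """Greedily pair map helices with sequence helices.

    weights: dict mapping (map_helix, sequence_helix) -> score, higher is better.
    Returns a dict {map_helix: sequence_helix} in which each helix on either
    side is used at most once.
    """
    assignment = {}
    used_sequence_helices = set()
    # Visit candidate edges from the highest score to the lowest.
    ranked = sorted(weights.items(), key=lambda item: item[1], reverse=True)
    for (map_helix, seq_helix), _score in ranked:
        if map_helix in assignment or seq_helix in used_sequence_helices:
            continue  # one endpoint is already matched, skip this edge
        assignment[map_helix] = seq_helix
        used_sequence_helices.add(seq_helix)
    return assignment


# Hypothetical scores for two helices detected in the map (d1, d2)
# and two helices predicted on the sequence (s1, s2).
example_weights = {
    ("d1", "s1"): 0.90,
    ("d1", "s2"): 0.80,
    ("d2", "s1"): 0.85,
    ("d2", "s2"): 0.10,
}
print(greedy_match(example_weights))
```

In this toy case the greedy pass takes the single best edge first and ends with a total score of 1.00, whereas pairing d1 with s2 and d2 with s1 would total 1.65. That gap is why greedy matching is described as a local optimization and why refinements of it can be worthwhile.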
Affiliation(s)
- Bahareh Behkamal
- Department of Computer Engineering, Faculty of Engineering, Ferdowsi University of Mashhad, Mashhad, 9177948944, Iran
- Mahmoud Naghibzadeh
- Department of Computer Engineering, Faculty of Engineering, Ferdowsi University of Mashhad, Mashhad, 9177948944, Iran
- Andrea Pagnani
- Department of Applied Science and Technology (DISAT), Politecnico di Torino, Corso Duca degli Abruzzi 24, Torino, Italy; Italian Institute for Genomic Medicine (IIGM), IRCC-Candiolo, Candiolo, TO, Italy; INFN Sezione di Torino, Via P. Giuria 1, Torino, Italy
- Mohammad Reza Saberi
- Medicinal Chemistry Department, School of Pharmacy, Mashhad University of Medical Sciences, Mashhad, Iran; Bioinformatics Research group, Mashhad University of Medical Sciences, Mashhad, Iran
- Kamal Al Nasr
- Department of Computer Science, Tennessee State University, Nashville, TN, 37209, USA

21
Alshammari M, He J. Combine Cryo-EM Density Map and Residue Contact for Protein Structure Prediction - A Case Study. ACM-BCB ... ... : THE ... ACM CONFERENCE ON BIOINFORMATICS, COMPUTATIONAL BIOLOGY AND BIOMEDICINE. ACM CONFERENCE ON BIOINFORMATICS, COMPUTATIONAL BIOLOGY AND BIOMEDICINE 2020; 2020:110. [PMID: 35838376 PMCID: PMC9279007 DOI: 10.1145/3388440.3414708] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
Cryo-electron microscopy is a major structure determination technique for large molecular machines and membrane-associated complexes. Although atomic structures have been determined directly from cryo-EM density maps with high resolutions, current structure determination methods for medium resolution (5 to 10 Å) cryo-EM maps are limited by the availability of structure templates. Secondary structure traces are lines detected from a cryo-EM density map for α-helices and β-strands of a protein. When combined with secondary structure sequence segments predicted from a protein sequence, it is possible to generate a set of likely topologies of α-traces and β-sheet traces. A topology describes the overall folding relationship among secondary structures; it is a critical piece of information for deriving the corresponding atomic structure. We propose a method for protein structure prediction that combines three sources of information: the secondary structure traces detected from the cryo-EM density map, predicted secondary structure sequence segments, and amino acid contact pairs predicted using MULTICOM. A case study shows that using amino acid contact prediction from MULTICOM improves the ranking of the true topology. Our observations convey that using a small set of highly voted secondary structure contact pairs enhances the ranking in all experiments conducted for this case.

22
Deng Y, Mu Y, Sazzed S, Sun J, He J. Using Curriculum Learning in Pattern Recognition of 3-dimensional Cryo-electron Microscopy Density Maps. ACM-BCB ... ... : THE ... ACM CONFERENCE ON BIOINFORMATICS, COMPUTATIONAL BIOLOGY AND BIOMEDICINE. ACM CONFERENCE ON BIOINFORMATICS, COMPUTATIONAL BIOLOGY AND BIOMEDICINE 2020; 2020:112. [PMID: 35838357 PMCID: PMC9279008 DOI: 10.1145/3388440.3414710] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
Although cryo-electron microscopy (cryo-EM) has been successfully used to derive atomic structures for many proteins, it is still challenging to derive atomic structures when the resolution of cryo-EM density maps is in the medium range, e.g., 5-10 Å. Studies have attempted to utilize machine learning methods, especially deep neural networks, to build predictive models for the detection of protein secondary structures from cryo-EM images, which ultimately helps to derive the atomic structure of proteins. However, the large variation in data quality makes it challenging to train a deep neural network with high prediction accuracy. Curriculum learning has been shown to be an effective learning paradigm in machine learning. In this paper, we present a study using curriculum learning as a more effective way to utilize cryo-EM density maps of varying quality. We investigated three distinct training curricula that differ in whether and how images used for training in the past are reused while the network is continually trained using new images. A total of 1,382 3-dimensional cryo-EM images were extracted from density maps of the Electron Microscopy Data Bank in our study. Our results indicate that learning with a curriculum significantly improves the performance of the final trained network when the forgetting problem is properly addressed.
Affiliation(s)
- Yangmei Deng
- Department of Computer Science, Old Dominion University, Norfolk VA USA
- Yongcheng Mu
- Department of Computer Science, Old Dominion University, Norfolk VA USA
- Salim Sazzed
- Department of Computer Science, Old Dominion University, Norfolk VA USA
- Jiangwen Sun
- Department of Computer Science, Old Dominion University, Norfolk VA USA
- Jing He
- Department of Computer Science, Old Dominion University, Norfolk VA USA

23
Mostosi P, Schindelin H, Kollmannsberger P, Thorn A. Haruspex: A Neural Network for the Automatic Identification of Oligonucleotides and Protein Secondary Structure in Cryo-Electron Microscopy Maps. Angew Chem Int Ed Engl 2020; 59:14788-14795. [PMID: 32187813 PMCID: PMC7497202 DOI: 10.1002/anie.202000421] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 01/09/2020] [Revised: 03/11/2020] [Indexed: 11/25/2022]
Abstract
In recent years, three-dimensional density maps reconstructed from single particle images obtained by electron cryo-microscopy (cryo-EM) have reached unprecedented resolution. However, map interpretation can be challenging, in particular if the constituting structures require de-novo model building or are very mobile. Herein, we demonstrate the potential of convolutional neural networks for the annotation of cryo-EM maps: our network Haruspex has been trained on a carefully curated set of 293 experimentally derived reconstruction maps to automatically annotate RNA/DNA as well as protein secondary structure elements. It can be straightforwardly applied to newly reconstructed maps in order to support domain placement or as a starting point for main-chain placement. Due to its high recall and precision rates of 95.1 % and 80.3 %, respectively, on an independent test set of 122 maps, it can also be used for validation during model building. The trained network will be available as part of the CCP-EM suite.
Affiliation(s)
- Philipp Mostosi
- Institute of Structural Biology, Rudolf Virchow Center for Experimental Biomedicine, University of Würzburg, Josef-Schneider-Str. 2, 97080 Würzburg, Germany
- Center for Computational and Theoretical Biology, University of Würzburg, Campus Hubland Nord 32, 97074 Würzburg, Germany
- Hermann Schindelin
- Institute of Structural Biology, Rudolf Virchow Center for Experimental Biomedicine, University of Würzburg, Josef-Schneider-Str. 2, 97080 Würzburg, Germany
- Philip Kollmannsberger
- Center for Computational and Theoretical Biology, University of Würzburg, Campus Hubland Nord 32, 97074 Würzburg, Germany
- Andrea Thorn
- Institute of Structural Biology, Rudolf Virchow Center for Experimental Biomedicine, University of Würzburg, Josef-Schneider-Str. 2, 97080 Würzburg, Germany

24
Sazzed S, Scheible P, Alshammari M, Wriggers W, He J. Cylindrical Similarity Measurement for Helices in Medium-Resolution Cryo-Electron Microscopy Density Maps. J Chem Inf Model 2020; 60:2644-2650. [PMID: 32216344 PMCID: PMC8279803 DOI: 10.1021/acs.jcim.0c00010] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 11/29/2022]
Abstract
Cryo-electron microscopy (cryo-EM) density maps at medium resolution (5-10 Å) reveal secondary structural features such as α-helices and β-sheets, but they lack the side chain details that would enable a direct structure determination. Among the more than 800 entries in the Electron Microscopy Data Bank (EMDB) of medium-resolution density maps that are associated with atomic models, a wide variety of similarities can be observed between maps and models. To validate such atomic models and to classify structural features, a local similarity criterion, the F1 score, is proposed and evaluated in this study. The F1 score is theoretically normalized to a range from zero to one, providing a local measure of cylindrical agreement between the density and atomic model of a helix. A systematic scan of 30,994 helices (among 3,247 protein chains modeled into medium-resolution density maps) reveals an actual range of observed F1 scores from 0.171 to 0.848, suggesting that the cylindrical fit of the current data is well stratified by the proposed measure. The best (highest) F1 scores tend to be associated with regions that exhibit high and spatially homogeneous local resolution (between 5 Å and 7.5 Å) in the helical density. The proposed F1 scores can be used as a discriminative classifier for validation studies and as a ranking criterion for cryo-EM density features in databases.
Affiliation(s)
- Salim Sazzed
- Department of Computer Science, Old Dominion University, Norfolk, Virginia 23529, United States
- Peter Scheible
- Department of Computer Science, Old Dominion University, Norfolk, Virginia 23529, United States
- Maytha Alshammari
- Department of Computer Science, Old Dominion University, Norfolk, Virginia 23529, United States
- Willy Wriggers
- Department of Mechanical and Aerospace Engineering, Old Dominion University, Norfolk, Virginia 23529, United States
- Jing He
- Department of Computer Science, Old Dominion University, Norfolk, Virginia 23529, United States

25
Mostosi P, Schindelin H, Kollmannsberger P, Thorn A. Haruspex: A Neural Network for the Automatic Identification of Oligonucleotides and Protein Secondary Structure in Cryo‐Electron Microscopy Maps. Angew Chem Int Ed Engl 2020. [DOI: 10.1002/ange.202000421] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 12/17/2022]
Affiliation(s)
- Philipp Mostosi
- Institute of Structural Biology, Rudolf Virchow Center for Experimental Biomedicine, University of Würzburg, Josef-Schneider-Str. 2, 97080 Würzburg, Germany
- Center for Computational and Theoretical Biology, University of Würzburg, Campus Hubland Nord 32, 97074 Würzburg, Germany
- Hermann Schindelin
- Institute of Structural Biology, Rudolf Virchow Center for Experimental Biomedicine, University of Würzburg, Josef-Schneider-Str. 2, 97080 Würzburg, Germany
- Philip Kollmannsberger
- Center for Computational and Theoretical Biology, University of Würzburg, Campus Hubland Nord 32, 97074 Würzburg, Germany
- Andrea Thorn
- Institute of Structural Biology, Rudolf Virchow Center for Experimental Biomedicine, University of Würzburg, Josef-Schneider-Str. 2, 97080 Würzburg, Germany

26
Si D, Moritz SA, Pfab J, Hou J, Cao R, Wang L, Wu T, Cheng J. Deep Learning to Predict Protein Backbone Structure from High-Resolution Cryo-EM Density Maps. Sci Rep 2020; 10:4282. [PMID: 32152330 PMCID: PMC7063051 DOI: 10.1038/s41598-020-60598-y] [Citation(s) in RCA: 43] [Impact Index Per Article: 10.8] [Received: 11/25/2019] [Accepted: 02/10/2020] [Indexed: 11/29/2022]
Abstract
Cryo-electron microscopy (cryo-EM) has become a leading technology for determining protein structures. Recent advances in this field have allowed for atomic resolution. However, predicting the backbone trace of a protein has remained a challenge on all but the most pristine density maps (<2.5 Å resolution). Here we introduce a deep learning model that uses a set of cascaded convolutional neural networks (CNNs) to predict Cα atoms along a protein's backbone structure. The cascaded-CNN (C-CNN) is a novel deep learning architecture comprised of multiple CNNs, each predicting a specific aspect of a protein's structure. This model predicts secondary structure elements (SSEs), backbone structure, and Cα atoms, combining the results of each to produce a complete prediction map. The cascaded-CNN is a semantic segmentation image classifier and was trained using thousands of simulated density maps. This method is largely automatic and only requires a recommended threshold value for each protein density map. A specialized tabu-search path walking algorithm was used to produce an initial backbone trace with Cα placements. A helix-refinement algorithm made further improvements to the α-helix SSEs of the backbone trace. Finally, a novel quality assessment-based combinatorial algorithm was used to effectively map protein sequences onto Cα traces to obtain full-atom protein structures. This method was tested on 50 experimental maps between 2.6 Å and 4.4 Å resolution. It outperformed several state-of-the-art prediction methods including Rosetta de-novo, MAINMAST, and a Phenix based method by producing the most complete predicted protein structures, as measured by percentage of found Cα atoms. This method accurately predicted 88.9% (mean) of the Cα atoms within 3 Å of a protein's backbone structure surpassing the 66.8% mark achieved by the leading alternate method (Phenix based fully automatic method) on the same set of density maps. 
The C-CNN also achieved an average root-mean-square deviation (RMSD) of 1.24 Å on a set of 50 experimental density maps which was tested by the Phenix based fully automatic method. The source code and demo of this research has been published at https://github.com/DrDongSi/Ca-Backbone-Prediction.
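The two headline metrics in the abstract above, the percentage of Cα atoms found within 3 Å and the RMSD, can be sketched as follows. This is one plausible reading of those metrics, not the authors' evaluation code: each reference Cα atom is matched to its nearest predicted atom, it counts as found if that distance is at most the cutoff, and the RMSD is taken over the found atoms only. The function name and the toy coordinates are hypothetical.

```python
import math


def ca_coverage_and_rmsd(reference_ca, predicted_ca, cutoff=3.0):
    """Return (coverage, rmsd) for predicted C-alpha positions.

    reference_ca, predicted_ca: lists of (x, y, z) tuples in angstroms.
    coverage: fraction of reference atoms with a predicted atom within `cutoff`.
    rmsd: root-mean-square of those nearest-neighbor distances (NaN if none).
    """
    if not reference_ca:
        raise ValueError("reference_ca must not be empty")
    matched_distances = []
    for ref_atom in reference_ca:
        # Nearest predicted atom; infinity when nothing was predicted at all.
        nearest = min(
            (math.dist(ref_atom, pred_atom) for pred_atom in predicted_ca),
            default=math.inf,
        )
        if nearest <= cutoff:
            matched_distances.append(nearest)
    coverage = len(matched_distances) / len(reference_ca)
    if not matched_distances:
        return coverage, math.nan
    rmsd = math.sqrt(sum(d * d for d in matched_distances) / len(matched_distances))
    return coverage, rmsd


# Toy trace: three reference atoms spaced 3.8 angstroms apart, one of them missed.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
predicted = [(0.0, 0.0, 1.0), (3.8, 0.0, 0.0), (20.0, 0.0, 0.0)]
coverage, rmsd = ca_coverage_and_rmsd(reference, predicted)
print(f"coverage={coverage:.3f}, rmsd={rmsd:.3f}")
```

Reporting both numbers together matters because an RMSD computed only over matched atoms can look good even when many atoms are missing, so coverage and RMSD should be compared on the same set of maps.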
Affiliation(s)
- Dong Si
- Division of Computing & Software Systems, University of Washington, Bothell, WA, 98011, USA
- Spencer A Moritz
- Division of Computing & Software Systems, University of Washington, Bothell, WA, 98011, USA
- Jonas Pfab
- Division of Computing & Software Systems, University of Washington, Bothell, WA, 98011, USA
- Jie Hou
- Department of Computer Science, Saint Louis University, Saint Louis, MO, 63103, USA
- Program in Bioinformatics & Computational Biology, Saint Louis University, Saint Louis, MO, 63103, USA
- Renzhi Cao
- Department of Computer Science, Pacific Lutheran University, Tacoma, WA, 98447, USA
- Liguo Wang
- Department of Biological Structure, University of Washington, Seattle, WA, 98185, USA
- Tianqi Wu
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, 65211, USA
- Jianlin Cheng
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, 65211, USA

27
Alnabati E, Kihara D. Advances in Structure Modeling Methods for Cryo-Electron Microscopy Maps. Molecules 2019; 25:molecules25010082. [PMID: 31878333 PMCID: PMC6982917 DOI: 10.3390/molecules25010082] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 12/11/2019] [Revised: 12/20/2019] [Accepted: 12/20/2019] [Indexed: 01/16/2023]
Abstract
Cryo-electron microscopy (cryo-EM) has now become a widely used technique for structure determination of macromolecular complexes. For modeling molecular structures from density maps of different resolutions, many algorithms have been developed. These algorithms can be categorized into rigid fitting, flexible fitting, and de novo modeling methods. It is also observed that machine learning (ML) techniques have been increasingly applied following the rapid progress of the ML field. Here, we review these different categories of macromolecule structure modeling methods and discuss their advances over time.
Affiliation(s)
- Eman Alnabati
- Department of Computer Science, Purdue University, West Lafayette, IN 47907, USA
- Daisuke Kihara
- Department of Computer Science, Purdue University, West Lafayette, IN 47907, USA
- Department of Biological Sciences, Purdue University, West Lafayette, IN 47907, USA

28
Advances in image processing for single-particle analysis by electron cryomicroscopy and challenges ahead. Curr Opin Struct Biol 2018; 52:127-145. [PMID: 30509756 DOI: 10.1016/j.sbi.2018.11.004] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 06/01/2018] [Revised: 10/26/2018] [Accepted: 11/17/2018] [Indexed: 12/20/2022]
Abstract
Electron cryomicroscopy (cryoEM) is essential for the study and functional understanding of non-crystalline macromolecules such as proteins. These molecules cannot be imaged using X-ray crystallography or other popular methods. CryoEM has been successfully used to visualize macromolecular complexes such as ribosomes, viruses, and ion channels. Determination of structural models of these at various conformational states leads to insight on how these molecules function. Recent advances in imaging technology have given cryoEM a scientific rebirth. As a result of these technological advances image processing and analysis have yielded molecular structures at atomic resolution. Nevertheless there continue to be challenges in image processing, and in this article we will touch on the most essential in order to derive an accurate three-dimensional model from noisy projection images. Traditional approaches, such as k-means clustering for class averaging, will be provided as background. We will then highlight new approaches for each image processing subproblem, including a 3D reconstruction method for asymmetric molecules using just two projection images and deep learning algorithms for automated particle picking.

29
Haslam D, Zeng T, Li R, He J. Exploratory Studies Detecting Secondary Structures in Medium Resolution 3D Cryo-EM Images Using Deep Convolutional Neural Networks. ACM-BCB ... ... : THE ... ACM CONFERENCE ON BIOINFORMATICS, COMPUTATIONAL BIOLOGY AND BIOMEDICINE. ACM CONFERENCE ON BIOINFORMATICS, COMPUTATIONAL BIOLOGY AND BIOMEDICINE 2018; 2018:628-632. [PMID: 35838356 PMCID: PMC9279009 DOI: 10.1145/3233547.3233704] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 06/15/2023]
Abstract
Cryo-electron microscopy (cryo-EM) is an emerging biophysical technique for structural determination of protein complexes. However, accurate detection of secondary structures is still challenging when cryo-EM density maps are at medium resolutions (5-10 Å). Most existing methods are image processing methods that do not fully utilize available images in the cryo-EM database. In this paper, we present a deep learning approach to segment secondary structure elements such as helices and β-sheets from medium-resolution density maps. The proposed 3D convolutional neural network is shown to detect secondary structure locations with an F1 score between 0.79 and 0.88 for six simulated test cases. The architecture was also applied to an experimentally derived cryo-EM density map with good accuracy.
Affiliation(s)
- Devin Haslam
- Department of Computer Science, Old Dominion University, Norfolk, VA, 23529
- Tao Zeng
- Department of Computer Science, Washington State University, Pullman, WA 99164
- Jing He
- Corresponding author: Jing He

30
Tiemann JK, Rose AS, Ismer J, Darvish MD, Hilal T, Spahn CM, Hildebrand PW. FragFit: a web-application for interactive modeling of protein segments into cryo-EM density maps. Nucleic Acids Res 2018; 46:W310-W314. [PMID: 29788317 PMCID: PMC6030921 DOI: 10.1093/nar/gky424] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 01/30/2018] [Accepted: 05/10/2018] [Indexed: 11/20/2022]
Abstract
Cryo-electron microscopy (cryo-EM) is a standard method to determine the three-dimensional structures of molecular complexes. However, easy-to-use tools for modeling protein segments into cryo-EM maps are scarce. Here, we present the FragFit web-application, a web server for interactive modeling of segments of up to 35 amino acids in length into cryo-EM density maps. The fragments are provided by a regularly updated database that currently contains about 1 billion entries extracted from PDB structures, and they can be readily integrated into a protein structure. Fragments are selected based on geometric criteria, sequence similarity and fit into a given cryo-EM density map. Web-based molecular visualization with the NGL Viewer allows interactive selection of fragments. The FragFit web-application, accessible at http://proteinformatics.de/FragFit, is free and open to all users, without any login requirements.
Affiliation(s)
- Johanna K S Tiemann
- Institute of Medical Physics and Biophysics, Charité University Medicine Berlin, Berlin 10117, Germany; Institute of Medical Physics and Biophysics, Medical University Leipzig, Leipzig, Sachsen 04107, Germany
- Alexander S Rose
- Institute of Medical Physics and Biophysics, Charité University Medicine Berlin, Berlin 10117, Germany
- Jochen Ismer
- Institute of Medical Physics and Biophysics, Charité University Medicine Berlin, Berlin 10117, Germany
- Mitra D Darvish
- Institute of Medical Physics and Biophysics, Charité University Medicine Berlin, Berlin 10117, Germany
- Tarek Hilal
- Institute of Medical Physics and Biophysics, Charité University Medicine Berlin, Berlin 10117, Germany
- Christian M T Spahn
- Institute of Medical Physics and Biophysics, Charité University Medicine Berlin, Berlin 10117, Germany
- Peter W Hildebrand
- Institute of Medical Physics and Biophysics, Charité University Medicine Berlin, Berlin 10117, Germany; Institute of Medical Physics and Biophysics, Medical University Leipzig, Leipzig, Sachsen 04107, Germany
|
31
|
Haslam D, Sazzed S, Wriggers W, Kovacs J, Song J, Auer M, He J. A Pattern Recognition Tool for Medium-resolution Cryo-EM Density Maps and Low-resolution Cryo-ET Density Maps. BIOINFORMATICS RESEARCH AND APPLICATIONS : 14TH INTERNATIONAL SYMPOSIUM, ISBRA 2018, BEIJING, CHINA, JUNE 8-11, 2018, PROCEEDINGS. ISBRA (CONFERENCE) (14TH : 2018 : BEIJING, CHINA) 2018; 10847:233-238. [PMID: 36383494 PMCID: PMC9645795 DOI: 10.1007/978-3-319-94968-0_22] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 06/16/2023]
Abstract
Cryo-electron microscopy (cryo-EM) and cryo-electron tomography (cryo-ET) produce 3-D density maps of biological molecules at a range of resolution levels. Pattern recognition tools are important for distinguishing biological components in volumetric maps at the available resolutions. One of the most distinctive features of density maps at medium (5-10 Å) resolution is the visibility of protein secondary structures. Although computational methods have been developed, the accurate detection of helices and β-strands from cryo-EM density maps is still an active research area. We have developed a tool for protein secondary structure detection and evaluation of medium-resolution 3-D cryo-EM density maps that combines three computational methods (SSETracer, StrandTwister, and AxisComparison). The program was integrated into UCSF Chimera, a popular visualization program in the cryo-EM community. In related work, we have developed BundleTrac, a computational method to trace filaments in a bundle from lower-resolution cryo-ET density maps. It has been applied to actin filament tracing in stereocilia with good accuracy and could potentially be added as a tool in Chimera.
Affiliation(s)
- Devin Haslam
- Department of Computer Science, Old Dominion University, Norfolk, VA 23529, USA
- Salim Sazzed
- Department of Computer Science, Old Dominion University, Norfolk, VA 23529, USA
- Willy Wriggers
- Department of Mechanical and Aerospace Engineering, Old Dominion University, Norfolk, VA 23529, USA
- Julio Kovacs
- Department of Mechanical and Aerospace Engineering, Old Dominion University, Norfolk, VA 23529, USA
- Junha Song
- Cell and Tissue Imaging, Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Manfred Auer
- Cell and Tissue Imaging, Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Jing He
- Department of Computer Science, Old Dominion University, Norfolk, VA 23529, USA
|
32
|
Tracing Actin Filament Bundles in Three-Dimensional Electron Tomography Density Maps of Hair Cell Stereocilia. Molecules 2018; 23:molecules23040882. [PMID: 29641472 PMCID: PMC6017643 DOI: 10.3390/molecules23040882] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 12/23/2017] [Revised: 03/14/2018] [Accepted: 03/22/2018] [Indexed: 12/20/2022] Open
Abstract
Cryo-electron tomography (cryo-ET) is a powerful method of visualizing the three-dimensional organization of supramolecular complexes, such as the cytoskeleton, in their native cell and tissue contexts. Due to its minimal electron dose and reconstruction artifacts arising from the missing wedge during data collection, cryo-ET typically results in noisy density maps that display anisotropic XY versus Z resolution. Molecular crowding further exacerbates the challenge of automatically detecting supramolecular complexes, such as the actin bundle in hair cell stereocilia. Stereocilia are pivotal to the mechanoelectrical transduction process in inner ear sensory epithelial hair cells. Given the complexity and dense arrangement of actin bundles, traditional approaches to filament detection and tracing have failed in these cases. In this study, we introduce BundleTrac, an effective method to trace hundreds of filaments in a bundle. A comparison between BundleTrac and manually tracing the actin filaments in a stereocilium showed that BundleTrac accurately built 326 of 330 filaments (98.8%), with an overall cross-distance of 1.3 voxels for the 330 filaments. BundleTrac is an effective semi-automatic modeling approach in which a seed point is provided for each filament and the rest of the filament is computationally identified. We also demonstrate the potential of a denoising method that uses a polynomial regression to address the resolution and high-noise anisotropic environment of the density map.
|
33
|
Al Nasr K, Yousef F, Jebril R, Jones C. Analytical Approaches to Improve Accuracy in Solving the Protein Topology Problem. Molecules 2018; 23:E28. [PMID: 29360779 PMCID: PMC6017786 DOI: 10.3390/molecules23020028] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 12/05/2017] [Revised: 01/19/2018] [Accepted: 01/19/2018] [Indexed: 11/17/2022] Open
Abstract
To take advantage of recent advances in genomics and proteomics, it is critical that the three-dimensional physical structure of biological macromolecules be determined. Cryo-electron microscopy (cryo-EM) is a promising and improving method for obtaining these data; however, the resolution is often not sufficient to directly determine the atomic-scale structure. Despite this, the locations of secondary structures are detectable. De novo modeling is a computational approach to modeling these macromolecular structures based on cryo-EM-derived data. During de novo modeling, a mapping between detected secondary structures and the underlying amino acid sequence must be identified. DP-TOSS (Dynamic Programming for determining the Topology Of Secondary Structures) is one tool that attempts to automate the creation of this mapping. By treating the correspondence between the detected structures and the structures predicted from sequence data as a constraint graph problem, DP-TOSS achieved good accuracy in its original iteration. In this paper, we propose modifications to the scoring methodology of DP-TOSS to improve its accuracy. Three scoring schemes were applied to DP-TOSS and tested: (i) a skeleton-based scoring function; (ii) a geometry-based analytical function; and (iii) a multi-well potential energy-based function. A test of 25 proteins shows that a combination of these schemes can improve the performance of DP-TOSS in solving the topology determination problem for macromolecule proteins.
Affiliation(s)
- Kamal Al Nasr
- Department of Computer Science, Tennessee State University, Nashville, TN 37209, USA
- Feras Yousef
- Department of Mathematics, The University of Jordan, Amman 11942, Jordan
- Ruba Jebril
- Department of Computer Science, Tennessee State University, Nashville, TN 37209, USA
- Christopher Jones
- Department of Computer Science, Tennessee State University, Nashville, TN 37209, USA
|
34
|
Islam T, Poteat M, He J. Quantification of Twist from the Central Lines of β-Strands. J Comput Biol 2018; 25:114-120. [PMID: 29313736 DOI: 10.1089/cmb.2017.0174] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/12/2022] Open
Abstract
Since the discovery of the right-handed twist of a β-strand, many studies have been conducted to understand the twist. Given the atomic structure of a protein, twist angles have been defined using atomic positions of the backbone. However, few studies characterize twist when the atomic positions are not available but the central lines of β-strands are. Recent studies in cryoelectron microscopy show that it is possible to predict the central lines of β-strands from a medium-resolution density map. Accurate measurement of twist angles is important in the identification of β-strands from such density maps. We propose an effective method to quantify twist angles from a set of splines. In a data set of 55 pairs of β-strands from 11 β-sheets of 11 proteins, the spline measurement gives results comparable to those of the discrete method that uses atomic positions directly, particularly in capturing the twist angle change along a pair, different levels of twist among different pairs, and the average of twist angles. The proposed method provides an alternative way to characterize twist using the central lines of a β-sheet.
Affiliation(s)
- Tunazzina Islam
- Department of Computer Science, Old Dominion University, Norfolk, Virginia
- Michael Poteat
- Department of Computer Science, Old Dominion University, Norfolk, Virginia
- Jing He
- Department of Computer Science, Old Dominion University, Norfolk, Virginia
|
35
|
Ismer J, Rose AS, Tiemann JKS, Hildebrand PW. A fragment based method for modeling of protein segments into cryo-EM density maps. BMC Bioinformatics 2017; 18:475. [PMID: 29132296 PMCID: PMC5683378 DOI: 10.1186/s12859-017-1904-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 08/24/2017] [Accepted: 11/01/2017] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Single-particle analysis of electron cryo-microscopy (cryo-EM) is a key technology for elucidation of macromolecular structures. Recent technical advances in hardware and software have significantly enhanced the resolution of cryo-EM density maps and broadened the applicability and the circle of users. To facilitate modeling of macromolecules into cryo-EM density maps, fast and easy-to-use modeling methods are now in demand. RESULTS: Here we investigated and benchmarked the suitability of a classical and well-established fragment-based approach for modeling of segments into cryo-EM density maps (termed FragFit). FragFit uses a hierarchical strategy to select fragments from a pre-calculated set of billions of fragments derived from structures deposited in the Protein Data Bank, based on sequence similarity, fit of stem atoms and fit to a cryo-EM density map. The user only has to specify the sequence of the segment and the number of the N- and C-terminal stem-residues in the protein. Using a representative data set of protein structures, we show that protein segments can be accurately modeled into cryo-EM density maps of different resolution by FragFit. Prediction quality depends on segment length, the type of secondary structure of the segment and local quality of the map. CONCLUSION: Fast and automated calculation makes FragFit suitable for implementation in interactive web-applications, e.g., to model missing segments, flexible protein parts or hinge-regions into cryo-EM density maps.
Affiliation(s)
- Jochen Ismer
- Institute of Medical Physics and Biophysics, University Medicine Berlin, Charitéplatz 1, 10117, Berlin, Germany
- Alexander S Rose
- Institute of Medical Physics and Biophysics, University Medicine Berlin, Charitéplatz 1, 10117, Berlin, Germany; RCSB Protein Data Bank, San Diego Supercomputer Center, University of California, San Diego, CA, 92093-0743, USA
- Johanna K S Tiemann
- Institute of Medical Physics and Biophysics, University Medicine Berlin, Charitéplatz 1, 10117, Berlin, Germany; Institute of Medical Physics and Biophysics, University Leipzig, Härtelstraße 16-18, 04107, Leipzig, Germany
- Peter W Hildebrand
- Institute of Medical Physics and Biophysics, University Medicine Berlin, Charitéplatz 1, 10117, Berlin, Germany; Institute of Medical Physics and Biophysics, University Leipzig, Härtelstraße 16-18, 04107, Leipzig, Germany
|
36
|
Ng A, Si D. Beta-Barrel Detection for Medium Resolution Cryo-Electron Microscopy Density Maps Using Genetic Algorithms and Ray Tracing. J Comput Biol 2017; 25:326-336. [PMID: 29035579 DOI: 10.1089/cmb.2017.0155] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 11/13/2022] Open
Abstract
Cryo-electron microscopy (cryo-EM) is a technique that produces three-dimensional density maps of large protein complexes. This allows for the study of the structure of these proteins. Identifying the secondary structures within proteins is vital to understanding the overall structure and function of the protein. The β-barrel is one such secondary structure, commonly found in lipocalins and membrane proteins. In this article, we present a novel approach that utilizes genetic algorithms, kd-trees, and ray tracing to automatically detect and extract β-barrels from cryo-EM density maps. This approach was tested on simulated and experimental density maps with zero, one, or multiple barrels in the density map. The results suggest that the proposed approach is capable of performing automatic detection of β-barrels from medium resolution cryo-EM density maps.
Affiliation(s)
- Albert Ng
- Division of Computing and Software Systems, University of Washington Bothell, Bothell, Washington
- Dong Si
- Division of Computing and Software Systems, University of Washington Bothell, Bothell, Washington
|
37
|
Biswas A, Ranjan D, Zubair M, Zeil S, Nasr KA, He J. An Effective Computational Method Incorporating Multiple Secondary Structure Predictions in Topology Determination for Cryo-EM Images. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2017; 14:578-586. [PMID: 27008671 PMCID: PMC5071113 DOI: 10.1109/tcbb.2016.2543721] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Indexed: 06/05/2023]
Abstract
A key idea in de novo modeling of a medium-resolution density image obtained from cryo-electron microscopy is to compute the optimal mapping between the secondary structure traces observed in the density image and those predicted from the protein sequence. When secondary structures are not determined precisely, either from the image or from the amino acid sequence of the protein, the computational problem becomes more complex. We present an efficient method that addresses the secondary structure placement problem in the presence of multiple secondary structure predictions and computes the optimal mapping. We tested the method using 12 simulated images from α-proteins and two cryo-EM images of α-β proteins. We observed that the rank of the true topologies is consistently improved by using multiple secondary structure predictions instead of a single prediction. The results show that the algorithm is robust and works well even when errors/misses in the predicted secondary structures are present in the image or the sequence. The results also show that the algorithm is efficient and is able to handle proteins with as many as 33 helices.
Affiliation(s)
- Abhishek Biswas
- Dept. of Computer Science, Old Dominion University, Norfolk, VA 23529
- Desh Ranjan
- Dept. of Computer Science, Old Dominion University, Norfolk, VA 23529
- Mohammad Zubair
- Dept. of Computer Science, Old Dominion University, Norfolk, VA 23529
- Stephanie Zeil
- Dept. of Computer Science, Old Dominion University, Norfolk, VA 23529
- Kamal Al Nasr
- Dept. of Computer Science, Tennessee State University, Nashville, TN 37209
- Jing He
- Dept. of Computer Science, Old Dominion University, Norfolk, VA 23529
|
38
|
Li R, Si D, Zeng T, Ji S, He J. Deep Convolutional Neural Networks for Detecting Secondary Structures in Protein Density Maps from Cryo-Electron Microscopy. PROCEEDINGS. IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE 2017; 2016:41-46. [PMID: 29770260 DOI: 10.1109/bibm.2016.7822490] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.7] [Indexed: 11/09/2022]
Abstract
The detection of the secondary structure of proteins using three-dimensional (3D) cryo-electron microscopy (cryo-EM) images is still a challenging task when the spatial resolution of cryo-EM images is at a medium level (5-10 Å). Prior research focused on the use of local features that may not capture the global information of image objects. In this study, we propose to use deep learning methods to extract highly representative global features and then automatically detect secondary structures of proteins. In particular, we build a convolutional neural network (CNN) classifier that predicts, for every individual voxel in a 3D cryo-EM image, the probability of each label with respect to the secondary structure elements of proteins, such as α-helix, β-sheet and background. To effectively incorporate the 3D spatial information in protein structures, we propose to perform 3D convolutions in the convolutional layers of CNNs. We show that the proposed CNN classifier can outperform the existing SVM method in identifying the secondary structure elements of proteins from 3D cryo-EM medium-resolution images.
Affiliation(s)
- Rongjian Li
- Department of Computer Science, Old Dominion University, Norfolk, Virginia 23529
- Dong Si
- Division of Computing and Software Systems, University of Washington Bothell, Bothell, WA 98011
- Tao Zeng
- School of Electrical Engineering and Computer Science, Washington State University, Pullman, WA 99164
- Shuiwang Ji
- School of Electrical Engineering and Computer Science, Washington State University, Pullman, WA 99164
- Jing He
- Department of Computer Science, Old Dominion University, Norfolk, Virginia 23529
|
39
|
Si D, He J. Modeling Beta-Traces for Beta-Barrels from Cryo-EM Density Maps. BIOMED RESEARCH INTERNATIONAL 2017; 2017:1793213. [PMID: 28164115 PMCID: PMC5259677 DOI: 10.1155/2017/1793213] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 09/15/2016] [Accepted: 12/08/2016] [Indexed: 01/09/2023]
Abstract
Cryo-electron microscopy (cryo-EM) has produced density maps of various resolutions. Although α-helices can be detected from density maps at 5-8 Å resolutions, β-strands are challenging to detect in such density maps due to the close spacing of β-strands. The variety of shapes of β-sheets adds to the complexity of β-strand detection from density maps. We propose a new approach to model traces of β-strands for β-barrel density regions that are extracted from cryo-EM density maps. In the test containing eight β-barrels extracted from experimental cryo-EM density maps at 5.5 Å-8.25 Å resolution, StrandRoller detected about 74.26% of the amino acids in the β-strands with an overall 2.05 Å 2-way distance between the detected β-traces and the observed ones, if the best of the fifteen detection cases is considered.
Affiliation(s)
- Dong Si
- Division of Computing and Software Systems, University of Washington Bothell, Bothell, WA 98011, USA
- Jing He
- Department of Computer Science, Old Dominion University, Norfolk, VA 23529, USA
|
40
|
Zeil S, Kovacs J, Wriggers W, He J. Comparing an Atomic Model or Structure to a Corresponding Cryo-electron Microscopy Image at the Central Axis of a Helix. J Comput Biol 2017; 24:52-67. [PMID: 27936925 PMCID: PMC5220566 DOI: 10.1089/cmb.2016.0145] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 10/20/2022] Open
Abstract
Three-dimensional density maps of biological specimens from cryo-electron microscopy (cryo-EM) can be interpreted in the form of atomic models that are modeled into the density, or they can be compared to known atomic structures. When the central axis of a helix is detectable in a cryo-EM density map, it is possible to quantify the agreement between this central axis and a central axis calculated from the atomic model or structure. We propose a novel arc-length association method to compare the two axes reliably. This method was applied to 79 helices in simulated density maps and six case studies using cryo-EM maps at 6.4-7.7 Å resolution. The arc-length association method is then compared to three existing measures that evaluate the separation of two helical axes: a two-way distance between point sets, the length difference between two axes, and the individual amino acid detection accuracy. The results show that our proposed method sensitively distinguishes lateral and longitudinal discrepancies between the two axes, which makes the method particularly suitable for the systematic investigation of cryo-EM map-model pairs.
Affiliation(s)
- Stephanie Zeil
- Department of Computer Science, Old Dominion University, Norfolk, Virginia
- Julio Kovacs
- Department of Mechanical and Aerospace Engineering and Institute of Biomedical Engineering, Old Dominion University, Norfolk, Virginia
- Willy Wriggers
- Department of Mechanical and Aerospace Engineering and Institute of Biomedical Engineering, Old Dominion University, Norfolk, Virginia
- Jing He
- Department of Computer Science, Old Dominion University, Norfolk, Virginia
|
41
|
Haslam D, Zubair M, Ranjan D, Biswas A, He J. CHALLENGES IN MATCHING SECONDARY STRUCTURES IN CRYO-EM: AN EXPLORATION. PROCEEDINGS. IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE 2016; 2016:1714-1719. [PMID: 29770261 PMCID: PMC5952047 DOI: 10.1109/bibm.2016.7822776] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/08/2023]
Abstract
Cryo-electron microscopy is a fast-emerging biophysical technique for structural determination of large protein complexes. While more atomic structures are being determined using this technique, it is still challenging to derive atomic structures from density maps produced at medium resolution when no suitable templates are available. A critical step in structure determination is determining how a protein chain threads through the 3-dimensional density map. A dynamic programming method was previously developed to generate the K best matches of secondary structures between the density map and its protein sequence using shortest paths in a related weighted graph. We discuss challenges associated with the creation of the weighted graph and explore heuristic methods to solve the problem of matching secondary structures.
Affiliation(s)
- Devin Haslam
- Department of Computer Science, Old Dominion University, Norfolk, VA 23529
- Mohammad Zubair
- Department of Computer Science, Old Dominion University, Norfolk, VA 23529
- Desh Ranjan
- Department of Computer Science, Old Dominion University, Norfolk, VA 23529
- Jing He
- Department of Computer Science, Old Dominion University, Norfolk, VA 23529
|
42
|
Constrained cyclic coordinate descent for cryo-EM images at medium resolutions: beyond the protein loop closure problem. ROBOTICA 2016; 34:1777-1790. [DOI: 10.1017/s0263574716000242] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Indexed: 11/05/2022]
Abstract
SUMMARY: The cyclic coordinate descent (CCD) method is a popular loop closure method in protein structure modeling. It is a robotics algorithm originally developed for inverse kinematic applications. We demonstrate an effective method of building the backbone of protein structure models using the principle of CCD and a guiding trace. For medium-resolution 3-dimensional (3D) images derived using cryo-electron microscopy (cryo-EM), it is possible to obtain guiding traces of secondary structures and their skeleton connections. Our new method, constrained cyclic coordinate descent (CCCD), builds α-helices, β-strands, and loops quickly and fairly accurately along predefined traces. We show that it is possible to build the entire backbone of a protein fairly accurately when the guiding traces are accurate. In a test of 10 proteins, the models constructed using CCCD show an average of 3.91 Å of backbone root mean square deviation (RMSD). When the CCCD method is incorporated in a simulated annealing framework to sample possible shift, translation, and rotation freedom, the models built with the true topology were ranked high on the list, with an average backbone RMSD100 of 3.76 Å. CCCD is an effective method for modeling atomic structures after secondary structure traces and skeletons are extracted from 3D cryo-EM images.
|
43
|
He J, Zeil S, Hallak H, McKaig K, Kovacs J, Wriggers W. Comparison of an Atomic Model and Its Cryo-EM Image at the Central Axis of a Helix. PROCEEDINGS. IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE 2015; 2015:1253-1259. [PMID: 27280059 PMCID: PMC4894056 DOI: 10.1109/bibm.2015.7359860] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 06/06/2023]
Abstract
Cryo-electron microscopy (cryo-EM) is an important biophysical technique that produces three-dimensional (3D) density maps at different resolutions. Because more and more models are being produced from cryo-EM density maps, validation of the models is becoming important. We propose a method for measuring local agreement between a model and the density map using the central axis of the helix. This method was tested using 19 helices from cryo-EM density maps between 5.5 Å and 7.2 Å resolution and 94 helices from simulated density maps. This method distinguished most of the well-fitting helices, although challenges exist for shorter helices.
Affiliation(s)
- Jing He
- Department of Computer Science, Old Dominion University, Norfolk, VA 23529
- Stephanie Zeil
- Department of Computer Science, Old Dominion University, Norfolk, VA 23529
- Hussam Hallak
- Department of Computer Science, Old Dominion University, Norfolk, VA 23529
- Kele McKaig
- Department of Computer Science, Old Dominion University, Norfolk, VA 23529
- Julio Kovacs
- Department of Mechanical & Aerospace Engineering, Old Dominion University, Norfolk, VA 23529
- Willy Wriggers
- Department of Mechanical & Aerospace Engineering, Old Dominion University, Norfolk, VA 23529
|
44
|
Wriggers W, He J. Numerical geometry of map and model assessment. J Struct Biol 2015; 192:255-61. [PMID: 26416532 DOI: 10.1016/j.jsb.2015.09.011] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 07/01/2015] [Revised: 09/18/2015] [Accepted: 09/24/2015] [Indexed: 10/23/2022]
Abstract
We describe best practices and assessment strategies for the atomic interpretation of cryo-electron microscopy (cryo-EM) maps. Multiscale numerical geometry strategies in the Situs package and in secondary structure detection software are currently evolving due to the recent increases in cryo-EM resolution. Criteria that aim to predict the accuracy of fitted atomic models at low (worse than 8 Å) and medium (4-8 Å) resolutions remain challenging. However, a high level of confidence in atomic models can be achieved by combining such criteria. The observed errors are due to map-model discrepancies and to the effect of imperfect global docking strategies. Extending the earlier motion capture approach developed for flexible fitting, we use simulated fiducials (pseudoatoms) at varying levels of coarse-graining to track the local drift of structural features. We compare three tracking approaches: naïve vector quantization, a smoothly deformable model, and a tessellation of the structure into rigid Voronoi cells, which are fitted using a multi-fragment refinement approach. The lowest error is an upper bound for the (small) discrepancy between the crystal structure and the EM map due to different conditions in their structure determination. When internal features such as secondary structures are visible in medium-resolution EM maps, it is possible to extend the idea of point-based fiducials to more complex geometric representations such as helical axes, strands, and skeletons. We propose quantitative strategies to assess map-model pairs when such secondary structure patterns are prominent.
Affiliation(s)
- Willy Wriggers
- Department of Mechanical & Aerospace Engineering, Old Dominion University, Norfolk, VA 23529, United States
- Jing He
- Department of Computer Science, Old Dominion University, Norfolk, VA 23529, United States
|
45
|
Biswas A, Ranjan D, Zubair M, He J. A Dynamic Programming Algorithm for Finding the Optimal Placement of a Secondary Structure Topology in Cryo-EM Data. J Comput Biol 2015; 22:837-43. [PMID: 26244416 DOI: 10.1089/cmb.2015.0120] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Indexed: 11/13/2022] Open
Abstract
The determination of secondary structure topology is a critical step in deriving atomic structures from protein density maps obtained by electron cryomicroscopy. This step often relies on matching the secondary structure traces detected in the protein density map to the secondary structure sequence segments predicted from the amino acid sequence. Due to inaccuracies in both sources of information, a pool of possible secondary structure positions needs to be sampled. One way to approach the problem is to first derive a small number of possible topologies using existing matching algorithms, and then find the optimal placement for each possible topology. We present a dynamic programming method of Θ(N q^2 h) to find the optimal placement for a secondary structure topology. We show that our algorithm requires significantly less computational time than the brute force method, which is on the order of Θ(q^N h).
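The gap between the two complexity bounds can be illustrated with a generic chain dynamic program. The sketch below is not the authors' algorithm: it assumes that N counts the segments placed in topology order and that q counts the candidate positions per segment (the abstract does not define the symbols), and the names `best_placement`, `brute_force`, `unary`, and `pair_cost` are hypothetical. Each segment gets a per-candidate cost, and consecutive segments are linked by a pairwise cost.

```python
from itertools import product


def best_placement(unary, pair_cost):
    """Chain DP over candidate positions (illustrative, hypothetical scoring).

    unary[i][a]        cost of putting segment i at its candidate a
    pair_cost(i, a, b) cost of linking candidate a of segment i
                       to candidate b of segment i + 1
    Evaluates about N * q^2 pairwise costs instead of q^N placements.
    """
    best = list(unary[0])   # best[a]: cheapest cost ending at candidate a
    back = []               # back-pointers, one list per later segment
    for i in range(1, len(unary)):
        cur, ptr = [], []
        for b, u in enumerate(unary[i]):
            # Cheapest predecessor candidate for candidate b.
            a_best = min(range(len(best)),
                         key=lambda a: best[a] + pair_cost(i - 1, a, b))
            cur.append(best[a_best] + pair_cost(i - 1, a_best, b) + u)
            ptr.append(a_best)
        best = cur
        back.append(ptr)
    last = min(range(len(best)), key=best.__getitem__)
    total = best[last]
    path = [last]
    for ptr in reversed(back):   # walk the back-pointers to recover the path
        path.append(ptr[path[-1]])
    path.reverse()
    return total, path


def brute_force(unary, pair_cost):
    """Enumerate every combination of candidates (q^N of them)."""
    n = len(unary)

    def cost(path):
        c = sum(unary[i][a] for i, a in enumerate(path))
        return c + sum(pair_cost(i, path[i], path[i + 1]) for i in range(n - 1))

    best = min(product(*(range(len(u)) for u in unary)), key=cost)
    return cost(best), list(best)
```

On small inputs both functions return the same optimal cost, but only the dynamic program stays tractable as the number of segments grows.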
Affiliation(s)
- Abhishek Biswas: Department of Computer Science, Old Dominion University, Norfolk, Virginia
- Desh Ranjan: Department of Computer Science, Old Dominion University, Norfolk, Virginia
- Mohammad Zubair: Department of Computer Science, Old Dominion University, Norfolk, Virginia
- Jing He: Department of Computer Science, Old Dominion University, Norfolk, Virginia
|
46
|
Si D, He J. Tracing Beta Strands Using StrandTwister from Cryo-EM Density Maps at Medium Resolutions. Structure 2014; 22:1665-76. [DOI: 10.1016/j.str.2014.08.017] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Received: 02/04/2014] [Revised: 08/07/2014] [Accepted: 08/08/2014] [Indexed: 10/24/2022]
|
47
|
López-Blanco JR, Chacón P. Structural modeling from electron microscopy data. Wiley Interdisciplinary Reviews: Computational Molecular Science 2014. [DOI: 10.1002/wcms.1199] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Indexed: 12/25/2022]
Affiliation(s)
- José Ramón López-Blanco: Department of Biological Physical Chemistry, Rocasolano Physical Chemistry Institute, CSIC, Madrid, Spain
- Pablo Chacón: Department of Biological Physical Chemistry, Rocasolano Physical Chemistry Institute, CSIC, Madrid, Spain
|
48
|
Al Nasr K, Ranjan D, Zubair M, Chen L, He J. Solving the Secondary Structure Matching Problem in Cryo-EM De Novo Modeling Using a Constrained K-Shortest Path Graph Algorithm. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2014; 11:419-430. [PMID: 26355788 DOI: 10.1109/tcbb.2014.2302803] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Indexed: 06/05/2023]
Abstract
Electron cryomicroscopy is becoming a major experimental technique for solving the structures of large molecular assemblies. More and more three-dimensional images have been obtained at medium resolutions between 5 and 10 Å. In this resolution range, major α-helices can be detected as cylindrical sticks and β-sheets can be detected as plane-like regions. A critical question in de novo modeling from cryo-EM images is how to determine the match between the secondary structures detected in the image and those on the protein sequence. We formulate this matching problem as a constrained graph problem and present an O(Δ^2 N^2 2^N) algorithm for this NP-hard problem. The algorithm incorporates the dynamic programming approach into a constrained K-shortest path algorithm. Our method, DP-TOSS, has been tested using α-proteins with a maximum of 33 helices and α-β proteins with up to five helices and 12 β-strands. The correct match was ranked within the top 35 for 19 of the 20 α-proteins and all nine α-β proteins tested. The results demonstrate that DP-TOSS improves accuracy, time and memory space in deriving the topologies of the secondary structure elements for proteins with a large number of secondary structures and a complex skeleton.
|
49
|
McKnight A, Si D, Al Nasr K, Chernikov A, Chrisochoides N, He J. Estimating loop length from CryoEM images at medium resolutions. BMC Structural Biology 2014; 13 Suppl 1:S5. [PMID: 24565041 PMCID: PMC3953143 DOI: 10.1186/1472-6807-13-s1-s5] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 11/10/2022]
Abstract
Background: De novo protein modeling approaches utilize 3-dimensional (3D) images derived from electron cryomicroscopy (CryoEM) experiments. The skeleton connecting two secondary structures such as α-helices represents the loop in the 3D image. The accuracy of the skeleton and of the detected secondary structures is critical in de novo modeling. It is important to measure the length along the skeleton accurately, since the length can be used as a constraint in modeling the protein. Results: We have developed a novel computational geometric approach to derive a simplified curve in order to estimate the loop length along the skeleton. The method was tested using fifty simulated density images of helix-loop-helix segments of atomic structures and eighteen experimentally derived density data sets from the Electron Microscopy Data Bank (EMDB). The test using simulated density maps shows that it is possible to estimate the length to within 0.5 Å of the expected value for 48 of the 50 cases. The experiments, involving eighteen experimentally derived CryoEM images, show that twelve cases have an error within 2 Å. Conclusions: The tests using both simulated and experimentally derived images show that our proposed method can estimate the loop length along the skeleton if the secondary structure elements, such as α-helices, can be detected accurately and there is a continuous skeleton linking the α-helices.
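To see why a simplified curve matters for length estimation, consider a voxel skeleton that climbs in unit steps: summing its raw segments overestimates the length of the underlying path. The sketch below is not the authors' geometric method. It swaps in a plain moving-average smoothing as a stand-in for curve simplification, and the function names `polyline_length` and `smooth` are hypothetical.

```python
import math


def polyline_length(points):
    """Sum of Euclidean distances between consecutive 3D points."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


def smooth(points, window=3):
    """Moving-average smoothing that keeps both endpoints fixed.

    A stand-in for curve simplification: interior points are replaced
    by the mean of their neighbours inside the window.
    """
    half = window // 2
    out = [points[0]]
    for i in range(1, len(points) - 1):
        lo, hi = max(0, i - half), min(len(points), i + half + 1)
        chunk = points[lo:hi]
        out.append(tuple(sum(c) / len(chunk) for c in zip(*chunk)))
    out.append(points[-1])
    return out


# A staircase-shaped skeleton on a unit grid (coordinates are made up).
staircase = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (2, 1, 0), (2, 2, 0)]
raw = polyline_length(staircase)
simplified = polyline_length(smooth(staircase))
```

Here `raw` counts every grid step, while `simplified` lies between the straight end-to-end distance and the raw value.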
|
50
|
Si D, He J. Combining image processing and modeling to generate traces of beta-strands from cryo-EM density images of beta-barrels. Annual International Conference of the IEEE Engineering in Medicine and Biology Society 2014; 2014:3941-3944. [PMID: 25570854 DOI: 10.1109/embc.2014.6944486] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 06/04/2023]
Abstract
The electron cryo-microscopy (cryo-EM) technique produces 3-dimensional (3D) density images of proteins. When the resolution of the images is not high enough to resolve the molecular details, it is challenging for image processing methods to enhance the molecular features. A β-barrel is a particular structural feature that is formed by multiple β-strands in a barrel shape. There is no existing method to derive β-strands from the 3D image of a β-barrel at medium resolutions. We propose a new method, StrandRoller, to generate a small set of possible β-traces from density images at medium resolutions of 5-10 Å. StrandRoller has been tested using eleven β-barrel images simulated to 10 Å resolution and one image isolated from an experimentally derived cryo-EM density image at 6.7 Å resolution. StrandRoller was able to detect 81.84% of the β-strands with an overall 1.5 Å 2-way distance between the detected and the observed β-traces, if the best of fifteen detections is considered. Our results suggest that it is possible to derive a small set of possible β-traces from the β-barrel cryo-EM image at medium resolutions even when no separation of the β-strands is visible in the images.
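The abstract does not define the 2-way distance. One plausible reading, assumed here and not confirmed by the source, is a symmetric nearest-point distance: average the distance from each detected point to its closest observed point, do the same in the other direction, and take the mean of the two. The function names `one_way` and `two_way` are hypothetical.

```python
import math


def one_way(src, dst):
    """Mean distance from each point in src to its nearest point in dst."""
    return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)


def two_way(a, b):
    """Symmetric trace distance: mean of the two one-way distances.

    Assumed definition for illustration only; the paper may define
    its 2-way distance differently.
    """
    return 0.5 * (one_way(a, b) + one_way(b, a))


# Two parallel traces 1 Å apart (made-up coordinates, in Å).
detected = [(0, 0, 0), (1, 0, 0), (2, 0, 0)]
observed = [(0, 0, 1), (1, 0, 1), (2, 0, 1)]
distance = two_way(detected, observed)
```

Measuring in both directions penalizes a detected trace that covers only part of the observed one, which a single one-way average would miss.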
|