1
Punuru P, Jain A, Kihara D. Secondary Structure Detection and Structure Modeling for Cryo-EM. Methods Mol Biol 2025; 2870:341-355. [PMID: 39543043 DOI: 10.1007/978-1-0716-4213-9_17] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2024]
Abstract
Rapid advancements in cryogenic electron microscopy (cryo-EM) have revolutionized the field of structural biology by enabling the determination of complex macromolecular structures at unprecedented resolutions. When cryo-EM density maps have a resolution around 3 Å, the atomic structure can be modeled manually. However, as the resolution decreases, analyzing these density maps becomes increasingly challenging. For modeling structures in lower resolution maps, deep learning can be used to identify structural features in the maps to assist in structure modeling. Here, we present a suite of deep learning-based tools developed by our lab that enable structural biologists to work with cryo-EM maps of a wide range of resolutions. For cryo-EM maps at near-atomic resolution (5 Å or better), DeepMainmast automatically models all-atom structures by tracing the main chain from local map features of amino acids and atoms detected by deep learning; DAQ score quantifies map-model fit and indicates potential misassignments in protein models. In intermediate resolution maps (5-10 Å), Emap2sec and Emap2sec+ can accurately detect protein secondary structures and nucleic acids. These tools and more are available at our web server: https://em.kiharalab.org/.
Affiliation(s)
- Pranav Punuru
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
- Anika Jain
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
- Daisuke Kihara
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
  - Department of Computer Science, Purdue University, West Lafayette, IN, USA
2
Chen S, Zhang S, Fang X, Lin L, Zhao H, Yang Y. Protein complex structure modeling by cross-modal alignment between cryo-EM maps and protein sequences. Nat Commun 2024; 15:8808. [PMID: 39394203 PMCID: PMC11470027 DOI: 10.1038/s41467-024-53116-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2024] [Accepted: 10/02/2024] [Indexed: 10/13/2024]
Abstract
The cryo-electron microscopy (cryo-EM) technique is widely used for protein structure determination. Current automatic cryo-EM protein complex modeling methods mostly rely on prior chain separation. However, chain separation without sequence guidance often suffers from errors caused by cross-chain interaction or noise densities, which accumulate and mislead the subsequent steps. Here, we present EModelX, a fully automated cryo-EM protein complex structure modeling method, which achieves sequence-guided modeling through cross-modal alignments between cryo-EM maps and protein sequences. EModelX first employs multi-task deep learning to predict Cα atoms, backbone atoms, and amino acid types from cryo-EM maps, which are subsequently used to sample Cα traces with amino acid profiles. The profiles are then aligned with protein sequences to obtain initial structural models, which yielded an average RMSD of 1.17 Å in our test set, approaching atomic-level precision in recovering PDB-deposited structures. After filling unmodeled gaps through sequence-guided Cα threading, the final models achieved an average TM-score of 0.808, outperforming the state-of-the-art method. Further combination with AlphaFold can improve the average TM-score to 0.911. Analyses conducted by comparing some EModelX-built models and PDB structures highlight its potential to improve PDB structures. EModelX is accessible at https://bio-web1.nscc-gz.cn/app/EModelX.
Affiliation(s)
- Sheng Chen
  - School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China
- Sen Zhang
  - School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China
- Xiaoyu Fang
  - School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China
- Liang Lin
  - School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China
- Huiying Zhao
  - Sun Yat-sen Memorial Hospital, Sun Yat-sen University, Guangzhou, China
- Yuedong Yang
  - School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China
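The EModelX abstract above reports model quality as RMSD and TM-score. As a rough illustration of what those two numbers measure, the following Python sketch computes both for paired Cα coordinates that are assumed to be already superposed. It is not EModelX code: the function names are hypothetical, the standard TM-score length scale d0 = 1.24 (L - 15)^(1/3) - 1.8 is used, and the 0.5 Å floor for short chains is a simplifying assumption. Real TM-score programs also search over superpositions, which this sketch skips.

```python
import math


def rmsd(model, reference):
    """RMSD between paired coordinates that are already superposed."""
    if len(model) != len(reference) or not model:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = sum(math.dist(a, b) ** 2 for a, b in zip(model, reference))
    return math.sqrt(total / len(model))


def d0_scale(length):
    """TM-score distance scale; the 0.5 floor for short chains is an assumption."""
    if length <= 15:
        return 0.5
    return max(1.24 * (length - 15) ** (1.0 / 3.0) - 1.8, 0.5)


def tm_score(model, reference, target_length=None):
    """TM-score for one fixed superposition (no search over alignments)."""
    if len(model) != len(reference) or not model:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    length = target_length or len(reference)
    d0 = d0_scale(length)
    terms = (1.0 / (1.0 + (math.dist(a, b) / d0) ** 2)
             for a, b in zip(model, reference))
    return sum(terms) / length


# Toy data: a 100-residue straight "chain" and a copy shifted by 1 Å along x.
reference = [(3.8 * i, 0.0, 0.0) for i in range(100)]
shifted = [(x + 1.0, y, z) for x, y, z in reference]
print(rmsd(shifted, reference), tm_score(shifted, reference))
```

Unlike RMSD, the TM-score is normalized by the target length and down-weights large deviations, which is why the two metrics are usually reported side by side.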
3
Qi J, Feng C, Shi Y, Yang J, Zhang F, Li G, Han R. FP-Zernike: An Open-source Structural Database Construction Toolkit for Fast Structure Retrieval. Genomics, Proteomics & Bioinformatics 2024; 22:qzae007. [PMID: 38894604 PMCID: PMC11423855 DOI: 10.1093/gpbjnl/qzae007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2022] [Revised: 08/16/2023] [Accepted: 09/20/2023] [Indexed: 06/21/2024]
Abstract
The release of AlphaFold2 has sparked a rapid expansion in protein model databases. Efficient protein structure retrieval is crucial for the analysis of structure models, while measuring the similarity between structures is the key challenge in structural retrieval. Although existing structure alignment algorithms can address this challenge, they are often time-consuming. Currently, the state-of-the-art approach involves converting protein structures into three-dimensional (3D) Zernike descriptors and assessing similarity using Euclidean distance. However, the methods for computing 3D Zernike descriptors mainly rely on structural surfaces and are predominantly web-based, thus limiting their application in studying custom datasets. To overcome this limitation, we developed FP-Zernike, a user-friendly toolkit for computing different types of Zernike descriptors based on feature points. Users need to enter only a single command to calculate the Zernike descriptors of all structures in customized datasets. FP-Zernike outperforms the leading method in terms of retrieval accuracy and binary classification accuracy across diverse benchmark datasets. In addition, we showed the application of FP-Zernike in the construction of the descriptor database and the protocol used for the Protein Data Bank (PDB) dataset to facilitate the local deployment of this tool for interested readers. Our demonstration contained 590,685 structures, and at this scale, our system required only 4-9 s to complete a retrieval. The experiments confirmed that it achieved the state-of-the-art accuracy level. FP-Zernike is an open-source toolkit, with the source code and related data accessible at https://ngdc.cncb.ac.cn/biocode/tools/BT007365/releases/0.1, as well as through a webserver at http://www.structbioinfo.cn/.
Affiliation(s)
- Junhai Qi
  - Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao 266237, China
  - BioMap Research, Menlo Park, CA 94025, USA
- Chenjie Feng
  - Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao 266237, China
  - College of Medical Information and Engineering, Ningxia Medical University, Yinchuan 750004, China
- Yulin Shi
  - Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao 266237, China
- Jianyi Yang
  - Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao 266237, China
- Fa Zhang
  - Institute of Engineering Medicine, Beijing Institute of Technology, Beijing 100081, China
- Guojun Li
  - Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao 266237, China
- Renmin Han
  - Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao 266237, China
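The FP-Zernike abstract above describes retrieval as comparing fixed-length descriptor vectors by Euclidean distance. A minimal Python sketch of that retrieval step follows. It assumes the descriptors have already been computed; the `retrieve` function, the structure names, and the three-component vectors are hypothetical toy values rather than FP-Zernike output, and real Zernike descriptors are considerably longer.

```python
import math


def retrieve(query, database, top_k=3):
    """Rank database entries by Euclidean distance to the query descriptor.

    database maps a structure name to its descriptor vector; all vectors
    must have the same length as the query.
    """
    scored = sorted((math.dist(query, vector), name)
                    for name, vector in database.items())
    return [(name, distance) for distance, name in scored[:top_k]]


# Hypothetical descriptors for three structures (toy values).
toy_database = {
    "structure_a": [0.0, 0.0, 0.0],
    "structure_b": [3.0, 4.0, 0.0],
    "structure_c": [1.0, 0.0, 0.0],
}
hits = retrieve([0.0, 0.0, 0.0], toy_database, top_k=2)
print(hits)
```

Because each comparison is a single vector distance rather than a structure alignment, a linear scan like this stays cheap even for large databases, and the same vectors could be handed to a nearest-neighbor index if a scan became too slow.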
4
Beton JG, Cragnolini T, Kaleel M, Mulvaney T, Sweeney A, Topf M. Integrating model simulation tools and cryo-electron microscopy. WIREs Computational Molecular Science 2022. [DOI: 10.1002/wcms.1642] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]
Affiliation(s)
- Joseph George Beton
  - Centre for Structural Systems Biology (CSSB), Leibniz-Institut für Virologie (LIV), Hamburg, Germany
- Tristan Cragnolini
  - Institute of Structural and Molecular Biology, Birkbeck and University College London, London, UK
- Manaz Kaleel
  - Centre for Structural Systems Biology (CSSB), Leibniz-Institut für Virologie (LIV), Hamburg, Germany
- Thomas Mulvaney
  - Centre for Structural Systems Biology (CSSB), Leibniz-Institut für Virologie (LIV), Hamburg, Germany
- Aaron Sweeney
  - Centre for Structural Systems Biology (CSSB), Leibniz-Institut für Virologie (LIV), Hamburg, Germany
- Maya Topf
  - Centre for Structural Systems Biology (CSSB), Leibniz-Institut für Virologie (LIV), Hamburg, Germany
5
Alnabati E, Esquivel-Rodriguez J, Terashi G, Kihara D. MarkovFit: Structure Fitting for Protein Complexes in Electron Microscopy Maps Using Markov Random Field. Front Mol Biosci 2022; 9:935411. [PMID: 35959463 PMCID: PMC9358042 DOI: 10.3389/fmolb.2022.935411] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/03/2022] [Accepted: 06/13/2022] [Indexed: 11/13/2022]
Abstract
An increasing number of protein complex structures are determined by cryo-electron microscopy (cryo-EM). When individual protein structures have been determined and are available, an important task in structure modeling is to fit the individual structures into the density map. Here, we designed a method that fits the atomic structures of proteins in cryo-EM maps of medium to low resolutions using Markov random fields, which allows probabilistic evaluation of fitted models. Our method, MarkovFit, performed better than existing methods on datasets of 31 simulated cryo-EM maps of resolution 10 Å, nine experimentally determined cryo-EM maps of resolution less than 4 Å, and 28 experimentally determined cryo-EM maps of resolution 6 to 20 Å.
Affiliation(s)
- Eman Alnabati
  - Department of Computer Science, Purdue University, West Lafayette, IN, United States
- Genki Terashi
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, United States
- Daisuke Kihara
  - Department of Computer Science, Purdue University, West Lafayette, IN, United States
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, United States
6
Ljung F, André I. ZEAL: protein structure alignment based on shape similarity. Bioinformatics 2021; 37:2874-2881. [PMID: 33772587 DOI: 10.1093/bioinformatics/btab205] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 09/11/2020] [Revised: 02/02/2021] [Accepted: 03/25/2021] [Indexed: 02/02/2023]
Abstract
MOTIVATION: Most protein-structure superimposition tools consider only Cartesian coordinates. Yet, much of biology happens on the surface of proteins, which is why proteins with shared ancestry and similar function often have comparable surface shapes. Superposition of proteins based on surface shape can enable comparison of highly divergent proteins, identify convergent evolution and enable detailed comparison of surface features and binding sites. RESULTS: We present ZEAL, an interactive tool to superpose global and local protein structures based on their shape resemblance using 3D (Zernike-Canterakis) functions to represent the molecular surface. In a benchmark study of structures with the same fold, we show that ZEAL outperforms two other methods for shape-based superposition. In addition, alignments from ZEAL were of comparable quality to the coordinate-based superpositions provided by TM-align. For comparisons of proteins with limited sequence and backbone-fold similarity, where coordinate-based methods typically fail, ZEAL can often find alignments with substantial surface-shape correspondence. In combination with shape-based matching, ZEAL can be used as a general tool to study relationships between shape and protein function. We identify several categories of protein functions where global shape similarity is significantly more likely than expected by random chance, when comparing proteins with little similarity on the fold level. In particular, we find that global surface shape similarity is particularly common among DNA binding proteins. AVAILABILITY AND IMPLEMENTATION: ZEAL can be used online at https://andrelab.org/zeal or as a standalone program with command line or graphical user interface. Source files and installers are available at https://github.com/Andre-lab/ZEAL. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Filip Ljung
  - Division of Biochemistry and Structural Biology, Department of Chemistry, Lund University, Lund SE-22100, Sweden
- Ingemar André
  - Division of Biochemistry and Structural Biology, Department of Chemistry, Lund University, Lund SE-22100, Sweden
7
Wang X, Flannery ST, Kihara D. Protein Docking Model Evaluation by Graph Neural Networks. Front Mol Biosci 2021; 8:647915. [PMID: 34113650 PMCID: PMC8185212 DOI: 10.3389/fmolb.2021.647915] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 12/30/2020] [Accepted: 04/26/2021] [Indexed: 12/03/2022]
Abstract
Physical interactions of proteins play key functional roles in many important cellular processes. To understand molecular mechanisms of such functions, it is crucial to determine the structure of protein complexes. To complement experimental approaches, which usually take a considerable amount of time and resources, various computational methods have been developed for predicting the structures of protein complexes. In computational modeling, one of the challenges is to identify near-native structures from a large pool of generated models. Here, we developed a deep learning-based approach named Graph Neural Network-based DOcking decoy eValuation scorE (GNN-DOVE). To evaluate a protein docking model, GNN-DOVE extracts the interface area and represents it as a graph. The chemical properties of atoms and the inter-atom distances are used as features of nodes and edges in the graph, respectively. GNN-DOVE was trained, validated, and tested on docking models in the Dockground database and further tested on a combined dataset of Dockground and ZDOCK benchmark as well as a CAPRI scoring dataset. GNN-DOVE performed better than existing methods, including DOVE, which is our previous development that uses a convolutional neural network on voxelized structure models.
Affiliation(s)
- Xiao Wang
  - Department of Computer Science, Purdue University, West Lafayette, IN, United States
- Sean T. Flannery
  - Department of Computer Science, Purdue University, West Lafayette, IN, United States
- Daisuke Kihara
  - Department of Computer Science, Purdue University, West Lafayette, IN, United States
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, United States
8
Han X, Terashi G, Christoffer C, Chen S, Kihara D. VESPER: global and local cryo-EM map alignment using local density vectors. Nat Commun 2021; 12:2090. [PMID: 33828103 PMCID: PMC8027200 DOI: 10.1038/s41467-021-22401-y] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 04/10/2020] [Accepted: 03/12/2021] [Indexed: 11/16/2022]
Abstract
An increasing number of density maps of biological macromolecules have been determined by cryo-electron microscopy (cryo-EM) and stored in the public database, EMDB. To interpret the structural information contained in EM density maps, alignment of maps is an essential step for structure modeling, comparison of maps, and for database search. Here, we developed VESPER, which captures the similarity of underlying molecular structures embedded in density maps by taking local gradient directions into consideration. Compared to existing methods, VESPER achieved substantially more accurate global and local alignment of maps as well as database retrieval.
Affiliation(s)
- Xusi Han
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
- Genki Terashi
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
- Siyang Chen
  - Department of Computer Science, Purdue University, West Lafayette, IN, USA
- Daisuke Kihara
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
  - Department of Computer Science, Purdue University, West Lafayette, IN, USA
9
Aderinwale T, Christoffer CW, Sarkar D, Alnabati E, Kihara D. Computational structure modeling for diverse categories of macromolecular interactions. Curr Opin Struct Biol 2020; 64:1-8. [PMID: 32599506 PMCID: PMC7665979 DOI: 10.1016/j.sbi.2020.05.017] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 01/30/2020] [Revised: 05/06/2020] [Accepted: 05/21/2020] [Indexed: 01/23/2023]
Abstract
Computational protein-protein docking is one of the most intensively studied topics in structural bioinformatics. The field has made substantial progress through over three decades of development. The development began with methods for rigid-body docking of two proteins, which have now been extended in different directions to cover the various macromolecular interactions observed in a cell. Here, we overview the recent developments of the variations of docking methods, including multiple protein docking, peptide-protein docking, and disordered protein docking methods.
Affiliation(s)
- Tunde Aderinwale
  - Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA
- Daipayan Sarkar
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA
- Eman Alnabati
  - Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA
- Daisuke Kihara
  - Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA
10
Wang X, Terashi G, Christoffer CW, Zhu M, Kihara D. Protein docking model evaluation by 3D deep convolutional neural networks. Bioinformatics 2020; 36:2113-2118. [PMID: 31746961 DOI: 10.1093/bioinformatics/btz870] [Citation(s) in RCA: 57] [Impact Index Per Article: 11.4] [Received: 06/28/2019] [Revised: 08/25/2019] [Accepted: 11/19/2019] [Indexed: 02/06/2023]
Abstract
MOTIVATION: Many important cellular processes involve physical interactions of proteins. Therefore, determining protein quaternary structures provides critical insights for understanding the molecular mechanisms of the functions of the complexes. To complement experimental methods, many computational methods have been developed to predict structures of protein complexes. One of the challenges in computational protein complex structure prediction is to identify near-native models from a large pool of generated models. RESULTS: We developed a convolutional deep neural network-based approach named DOcking decoy selection with Voxel-based deep neural nEtwork (DOVE) for evaluating protein docking models. To evaluate a protein docking model, DOVE scans the protein-protein interface of the model with a 3D voxel and considers atomic interaction types and their energetic contributions as input features applied to the neural network. The deep learning models were trained and validated on docking models available in the ZDock and DockGround databases. Among the different combinations of features tested, almost all outperformed existing scoring functions. AVAILABILITY AND IMPLEMENTATION: Code is available at http://github.com/kiharalab/DOVE and http://kiharalab.org/dove/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Xiao Wang
  - Department of Computer Science, Purdue University, West Lafayette, IN 47907, USA
- Genki Terashi
  - Department of Biological Sciences, Purdue University, West Lafayette, IN 47907, USA
- Mengmeng Zhu
  - Department of Biological Sciences, Purdue University, West Lafayette, IN 47907, USA
- Daisuke Kihara
  - Department of Computer Science, Purdue University, West Lafayette, IN 47907, USA
  - Department of Biological Sciences, Purdue University, West Lafayette, IN 47907, USA
11
Roy AA, Dhawanjewar AS, Sharma P, Singh G, Madhusudhan MS. Protein Interaction Z Score Assessment (PIZSA): an empirical scoring scheme for evaluation of protein-protein interactions. Nucleic Acids Res 2020; 47:W331-W337. [PMID: 31114890 PMCID: PMC6602501 DOI: 10.1093/nar/gkz368] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 03/06/2019] [Revised: 04/24/2019] [Accepted: 05/15/2019] [Indexed: 11/24/2022]
Abstract
Our web server, PIZSA (http://cospi.iiserpune.ac.in/pizsa), assesses the likelihood of protein–protein interactions by assigning a Z Score computed from interface residue contacts. Our score takes into account the optimal number of atoms that mediate the interaction between pairs of residues and whether these contacts emanate from the main chain or side chain. We tested the score on 174 native interactions for which 100 decoys each were constructed using ZDOCK. The native structure scored better than any of the decoys in 146 cases and ranked within the 95th percentile in 162 cases. This easily outperforms a competing method, CIPS. We also benchmarked our scoring scheme on 15 targets from the CAPRI dataset and found that our method had results comparable to those of CIPS. Further, our method is able to analyse higher order protein complexes without the need to explicitly identify chains as receptors or ligands. The PIZSA server is easy to use and can score any input three-dimensional structure and provide a residue pair-wise breakdown of the results. Attractively, our server offers a platform for users to upload their own potentials and could serve as an ideal testing ground for this class of scoring schemes.
Affiliation(s)
- Ankit A Roy
  - Indian Institute of Science Education and Research, Pune, Dr Homi Bhabha Road, Pashan, Pune 411008, India
- Abhilesh S Dhawanjewar
  - Indian Institute of Science Education and Research, Pune, Dr Homi Bhabha Road, Pashan, Pune 411008, India
  - Presently at School of Biological Sciences, University of Nebraska, Lincoln, NE 68588, USA
- Parichit Sharma
  - Indian Institute of Science Education and Research, Pune, Dr Homi Bhabha Road, Pashan, Pune 411008, India
  - Presently at School of Informatics, Computing & Engineering, Department of Computer Science, Indiana University, Bloomington, IN 47408, USA
- Gulzar Singh
  - Indian Institute of Science Education and Research, Pune, Dr Homi Bhabha Road, Pashan, Pune 411008, India
- M S Madhusudhan
  - Indian Institute of Science Education and Research, Pune, Dr Homi Bhabha Road, Pashan, Pune 411008, India
12
Terashi G, Kagaya Y, Kihara D. MAINMASTseg: Automated Map Segmentation Method for Cryo-EM Density Maps with Symmetry. J Chem Inf Model 2020; 60:2634-2643. [PMID: 32197044 DOI: 10.1021/acs.jcim.9b01110] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 11/29/2022]
Affiliation(s)
- Genki Terashi
  - Department of Biological Sciences, Purdue University, West Lafayette, Indiana 47907, United States
- Yuki Kagaya
  - Graduate School of Information Sciences, Tohoku University, Aramaki Aza, Aoba 6-3-09, Aoba-Ku, Sendai, Miyagi 980-8579, Japan
- Daisuke Kihara
  - Department of Biological Sciences, Purdue University, West Lafayette, Indiana 47907, United States
  - Department of Computer Science, Purdue University, West Lafayette, Indiana 47907, United States
13
Alnabati E, Kihara D. Advances in Structure Modeling Methods for Cryo-Electron Microscopy Maps. Molecules 2019; 25:molecules25010082. [PMID: 31878333 PMCID: PMC6982917 DOI: 10.3390/molecules25010082] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Received: 12/11/2019] [Revised: 12/20/2019] [Accepted: 12/20/2019] [Indexed: 01/16/2023]
Abstract
Cryo-electron microscopy (cryo-EM) has now become a widely used technique for structure determination of macromolecular complexes. For modeling molecular structures from density maps of different resolutions, many algorithms have been developed. These algorithms can be categorized into rigid fitting, flexible fitting, and de novo modeling methods. It is also observed that machine learning (ML) techniques have been increasingly applied following the rapid progress of the ML field. Here, we review these different categories of macromolecule structure modeling methods and discuss their advances over time.
Affiliation(s)
- Eman Alnabati
  - Department of Computer Science, Purdue University, West Lafayette, IN 47907, USA
- Daisuke Kihara
  - Department of Computer Science, Purdue University, West Lafayette, IN 47907, USA
  - Department of Biological Sciences, Purdue University, West Lafayette, IN 47907, USA
14
Christoffer C, Terashi G, Shin WH, Aderinwale T, Maddhuri Venkata Subramaniya SR, Peterson L, Verburgt J, Kihara D. Performance and enhancement of the LZerD protein assembly pipeline in CAPRI 38-46. Proteins 2019; 88:948-961. [PMID: 31697428 DOI: 10.1002/prot.25850] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 08/29/2019] [Revised: 10/07/2019] [Accepted: 11/03/2019] [Indexed: 01/17/2023]
Abstract
We report the performance of the protein docking prediction pipeline of our group and the results for Critical Assessment of Prediction of Interactions (CAPRI) rounds 38-46. The pipeline integrates programs developed in our group as well as other existing scoring functions. The core of the pipeline is the LZerD protein-protein docking algorithm. If templates of the target complex are not found in PDB, the first step of our docking prediction pipeline is to run LZerD for a query protein pair. Meanwhile, in the case of human group prediction, we survey the literature to find information that can guide the modeling, such as protein-protein interface information. In addition to any literature information and binding residue prediction, generated docking decoys were selected by a rank aggregation of statistical scoring functions. The top 10 decoys were relaxed by a short molecular dynamics simulation before submission to remove atom clashes and improve side-chain conformations. In these CAPRI rounds, our group, particularly the LZerD server, showed robust performance. On the other hand, there are failed cases where some other groups were successful. To understand weaknesses of our pipeline, we analyzed sources of errors for failed targets. Since we noted that structure refinement is a step that needs improvement, we newly performed a comparative study of several refinement approaches. Finally, we show several examples that illustrate successful and unsuccessful cases by our group.
Affiliation(s)
- Genki Terashi
  - Department of Biological Sciences, Purdue University, West Lafayette, Indiana
- Woong-Hee Shin
  - Department of Biological Sciences, Purdue University, West Lafayette, Indiana
  - Department of Chemistry Education, Sunchon National University, Suncheon, Jeollanam-do, Republic of Korea
- Tunde Aderinwale
  - Department of Computer Science, Purdue University, West Lafayette, Indiana
- Lenna Peterson
  - Department of Biological Sciences, Purdue University, West Lafayette, Indiana
- Jacob Verburgt
  - Department of Biological Sciences, Purdue University, West Lafayette, Indiana
- Daisuke Kihara
  - Department of Computer Science, Purdue University, West Lafayette, Indiana
  - Department of Biological Sciences, Purdue University, West Lafayette, Indiana
  - Purdue University Center for Cancer Research, Purdue University, West Lafayette, Indiana
  - Department of Pediatrics, University of Cincinnati, Cincinnati, Ohio
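The LZerD abstract above mentions selecting docking decoys by a rank aggregation of several scoring functions but does not say which aggregation rule is used. The Python sketch below illustrates one common choice, a plain rank sum, purely as an assumption. The `rank_sum` function, the scoring-function names, and the decoy scores are all hypothetical, and lower scores are assumed to be better.

```python
def rank_sum(scores_by_function):
    """Order decoys by the sum of their ranks across scoring functions.

    scores_by_function maps a scoring-function name to a dict of
    decoy id -> score, where a lower score is assumed to be better.
    Ties in the rank sum are broken by decoy id.
    """
    totals = {}
    for scores in scores_by_function.values():
        ordered = sorted(scores, key=scores.get)
        for rank, decoy in enumerate(ordered, start=1):
            totals[decoy] = totals.get(decoy, 0) + rank
    return sorted(totals, key=lambda decoy: (totals[decoy], decoy))


# Hypothetical scores from two scoring functions for three decoys.
toy_scores = {
    "score_one": {"decoy_1": -5.0, "decoy_2": -3.0, "decoy_3": -1.0},
    "score_two": {"decoy_1": -2.0, "decoy_2": -4.0, "decoy_3": -1.0},
}
ranking = rank_sum(toy_scores)
print(ranking)
```

Working on ranks rather than raw values avoids having to put scoring functions with different units and ranges on a common scale, which is the usual motivation for rank-based consensus.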
15
Malhotra S, Träger S, Dal Peraro M, Topf M. Modelling structures in cryo-EM maps. Curr Opin Struct Biol 2019; 58:105-114. [PMID: 31394387 DOI: 10.1016/j.sbi.2019.05.024] [Citation(s) in RCA: 34] [Impact Index Per Article: 5.7] [Received: 04/13/2019] [Revised: 05/23/2019] [Accepted: 05/25/2019] [Indexed: 12/20/2022]
Abstract
Recent advances in structure determination of sub-cellular structures using cryo-electron microscopy and tomography have enabled us to understand their architecture in a more detailed manner and gain insight into their function. The choice of approach to use for atomic model building, fitting, refinement and validation in the 3D map resulting from these experiments depends primarily on the resolution of the map and the prior information on the corresponding model. Here, we survey some of these methods and approaches and highlight their uses in specific recent examples.
Affiliation(s)
- Sony Malhotra
  - Institute of Structural and Molecular Biology, Department of Biological Sciences, Birkbeck College, University of London, Malet Street, London WC1E 7HX, United Kingdom
- Sylvain Träger
  - Institute of Bioengineering, School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne, Lausanne 1015, Switzerland
- Matteo Dal Peraro
  - Institute of Bioengineering, School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne, Lausanne 1015, Switzerland
- Maya Topf
  - Institute of Structural and Molecular Biology, Department of Biological Sciences, Birkbeck College, University of London, Malet Street, London WC1E 7HX, United Kingdom
16
Protein secondary structure detection in intermediate-resolution cryo-EM maps using deep learning. Nat Methods 2019; 16:911-917. [PMID: 31358979 PMCID: PMC6717539 DOI: 10.1038/s41592-019-0500-1] [Citation(s) in RCA: 45] [Impact Index Per Article: 7.5] [Received: 12/03/2018] [Accepted: 06/24/2019] [Indexed: 02/05/2023]
Abstract
An increasing number of protein structures have been solved by cryo-electron microscopy (cryo-EM). Although structures determined at near-atomic resolution are now routinely reported, many density maps are still determined at an intermediate resolution, where extracting structure information is still a challenge. We have developed a computational method, Emap2sec, which identifies the secondary structures of proteins (α helices, β sheets, and other structures) in an EM map of 5 to 10 Å resolution. Emap2sec uses a 3D deep convolutional neural network to assign secondary structure to each grid point in an EM map. We tested Emap2sec on 6.0 and 10.0 Å resolution EM maps simulated from 34 structures, as well as on 43 maps determined experimentally at 5.0 to 9.5 Å resolution. Emap2sec was able to clearly identify the secondary structures in many maps tested, and showed substantially better performance than existing methods.
|
17
|
Terashi G, Kihara D. De novo main-chain modeling with MAINMAST in 2015/2016 EM Model Challenge. J Struct Biol 2018; 204:351-359. [PMID: 30075190 PMCID: PMC6179447 DOI: 10.1016/j.jsb.2018.07.013] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/27/2018] [Revised: 07/13/2018] [Accepted: 07/19/2018] [Indexed: 11/15/2022]
Abstract
Protein tertiary structure modeling is a critical step for the interpretation of three-dimensional (3D) electron microscopy density. Our group participated in the 2015/2016 EM Model Challenge using the MAINMAST software for de novo main-chain modeling. The software generates local dense points using the mean shift algorithm and connects them into Cα models by calculating the minimum spanning tree and the longest path. Subsequently, full-atom structure models are generated, which are subject to structural refinement. Here, we summarize the qualities of our submitted models and examine successful and unsuccessful models, including 3D models we did not submit to the Challenge. Our protocol using the MAINMAST software was sometimes able to build correct conformations with 3.4–5.1 Å RMSD. Unsuccessful models had chain-tracing failures; however, their Cα positions and some local structures were built quite correctly. To evaluate the quality of the models, the MAINMAST software provides a confidence score for each Cα position from the consensus of the top 100 scoring models.
Affiliation(s)
- Genki Terashi
- Department of Biological Sciences, Purdue University, West Lafayette, IN 47907, USA
| | - Daisuke Kihara
- Department of Biological Sciences, Purdue University, West Lafayette, IN 47907, USA; Department of Computer Science, Purdue University, West Lafayette, IN 47907, USA.
|
18
|
Kumar A, Zhang KYJ. Advances in the Development of Shape Similarity Methods and Their Application in Drug Discovery. Front Chem 2018; 6:315. [PMID: 30090808 PMCID: PMC6068280 DOI: 10.3389/fchem.2018.00315] [Citation(s) in RCA: 94] [Impact Index Per Article: 13.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/10/2018] [Accepted: 07/09/2018] [Indexed: 12/21/2022] Open
Abstract
Molecular similarity is a key concept in drug discovery. It is based on the assumption that structurally similar molecules frequently have similar properties. Assessment of similarity between small molecules has been highly effective in the discovery and development of various drugs. In particular, two-dimensional (2D) similarity approaches have been quite popular due to their simplicity, accuracy and efficiency. Recently, the focus has shifted toward the development of methods involving the representation and comparison of three-dimensional (3D) conformations of small molecules. Among the 3D similarity methods, evaluation of shape similarity is now gaining attention for its application not only in virtual screening but also in molecular target prediction, drug repurposing and scaffold hopping. A wide range of methods have been developed to describe molecular shape and to determine the shape similarity between small molecules. The most widely used methods include atom distance-based methods, surface-based approaches such as spherical harmonics and 3D Zernike descriptors, and atom-centered Gaussian overlay-based representations. Several of these methods demonstrated excellent virtual screening performance not only retrospectively but also prospectively. In addition to methods assessing the similarity between small molecules, shape similarity approaches have been developed to compare shapes of protein structures and binding pockets. Additionally, shape comparisons between atomic models and 3D density maps allowed the fitting of atomic models into cryo-electron microscopy maps. This review aims to summarize the methodological advances in shape similarity assessment, highlighting advantages, disadvantages and their application in drug discovery.
Affiliation(s)
| | - Kam Y. J. Zhang
- Laboratory for Structural Bioinformatics, Center for Biosystems Dynamics Research, RIKEN, Yokohama, Japan
|
19
|
Cassidy CK, Himes BA, Luthey-Schulten Z, Zhang P. CryoEM-based hybrid modeling approaches for structure determination. Curr Opin Microbiol 2018; 43:14-23. [PMID: 29107896 PMCID: PMC5934336 DOI: 10.1016/j.mib.2017.10.002] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/11/2017] [Revised: 10/04/2017] [Accepted: 10/09/2017] [Indexed: 12/21/2022]
Abstract
Recent advances in cryo-electron microscopy (cryoEM) have dramatically improved the resolutions at which vitrified biological specimens can be studied, revealing new structural and mechanistic insights over a broad range of spatial scales. Bolstered by these advances, much effort has been directed toward the development of hybrid modeling methodologies for the construction and refinement of high-fidelity atomistic models from cryoEM data. In this brief review, we will survey the key elements of cryoEM-based hybrid modeling, providing an overview of available computational tools and strategies as well as several recent applications.
Affiliation(s)
- C Keith Cassidy
- Department of Physics, Beckman Institute, University of Illinois at Urbana-Champaign, Urbana, IL, USA
| | - Benjamin A Himes
- Department of Structural Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
| | - Zaida Luthey-Schulten
- Department of Chemistry, Center for the Physics of Living Cells, University of Illinois at Urbana-Champaign, Urbana, IL, USA
| | - Peijun Zhang
- Department of Structural Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA; Division of Structural Biology, Wellcome Trust Centre for Human Genetics, University of Oxford, Roosevelt Drive, Oxford OX3 7BN, UK; Electron Bio-Imaging Centre, Diamond Light Sources, Harwell Science and Innovation Campus, Didcot OX11 0DE, UK.
|
20
|
Terashi G, Kihara D. De novo main-chain modeling for EM maps using MAINMAST. Nat Commun 2018; 9:1618. [PMID: 29691408 PMCID: PMC5915429 DOI: 10.1038/s41467-018-04053-7] [Citation(s) in RCA: 69] [Impact Index Per Article: 9.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/22/2017] [Accepted: 03/29/2018] [Indexed: 11/09/2022] Open
Abstract
An increasing number of protein structures are determined by cryo-electron microscopy (cryo-EM) at near atomic resolution. However, tracing the main-chains and building full-atom models from EM maps of ~4-5 Å is still not trivial and remains a time-consuming task. Here, we introduce a fully automated de novo structure modeling method, MAINMAST, which builds three-dimensional models of a protein from a near-atomic resolution EM map. The method directly traces the protein's main-chain and identifies Cα positions as tree-graph structures in the EM map. MAINMAST performs significantly better than existing software in building global protein structure models on data sets of 40 simulated density maps at 5 Å resolution and 30 experimentally determined maps at 2.6-4.8 Å resolution. In another benchmark of building missing fragments in protein models for EM maps, MAINMAST builds fragments of 11-161 residues long with an average RMSD of 2.68 Å.
Affiliation(s)
- Genki Terashi
- Department of Biological Sciences, Purdue University, 249 S. Martin Jischke Dr., West Lafayette, IN, 47907, USA
| | - Daisuke Kihara
- Department of Biological Sciences, Purdue University, 249 S. Martin Jischke Dr., West Lafayette, IN, 47907, USA; Department of Computer Science, Purdue University, 305 N. University St., West Lafayette, IN, 47907, USA.
|
21
|
Vreven T, Schweppe DK, Chavez JD, Weisbrod CR, Shibata S, Zheng C, Bruce JE, Weng Z. Integrating Cross-Linking Experiments with Ab Initio Protein-Protein Docking. J Mol Biol 2018; 430:1814-1828. [PMID: 29665372 DOI: 10.1016/j.jmb.2018.04.010] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/04/2017] [Revised: 03/19/2018] [Accepted: 04/10/2018] [Indexed: 12/23/2022]
Abstract
Ab initio protein-protein docking algorithms often rely on experimental data to identify the most likely complex structure. We integrated protein-protein docking with the experimental data of chemical cross-linking followed by mass spectrometry. We tested our approach using 19 cases that resulted from an exhaustive search of the Protein Data Bank for protein complexes with cross-links identified in our experiments. We implemented cross-links as constraints based on Euclidean distance or void-volume distance. For most test cases, the rank of the top-scoring near-native prediction was improved by at least twofold compared with docking without the cross-link information, and the success rate for the top 5 predictions nearly tripled. Our results demonstrate the delicate balance between retaining correct predictions and eliminating false positives. Several test cases had multiple components with distinct interfaces, and we present an approach for assigning cross-links to the interfaces. Employing the symmetry information for these cases further improved the performance of complex structure prediction.
Affiliation(s)
- Thom Vreven
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA 01605, USA.
| | - Devin K Schweppe
- Department of Chemistry and Department of Genome Sciences, University of Washington, Seattle, WA 98109, USA
| | - Juan D Chavez
- Department of Chemistry and Department of Genome Sciences, University of Washington, Seattle, WA 98109, USA
| | - Chad R Weisbrod
- Department of Chemistry and Department of Genome Sciences, University of Washington, Seattle, WA 98109, USA
| | - Sayaka Shibata
- Department of Chemistry and Department of Genome Sciences, University of Washington, Seattle, WA 98109, USA
| | - Chunxiang Zheng
- Department of Chemistry and Department of Genome Sciences, University of Washington, Seattle, WA 98109, USA
| | - James E Bruce
- Department of Chemistry and Department of Genome Sciences, University of Washington, Seattle, WA 98109, USA
| | - Zhiping Weng
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA 01605, USA.
|
22
|
Joseph AP, Lagerstedt I, Patwardhan A, Topf M, Winn M. Improved metrics for comparing structures of macromolecular assemblies determined by 3D electron-microscopy. J Struct Biol 2017; 199:12-26. [PMID: 28552721 PMCID: PMC5479444 DOI: 10.1016/j.jsb.2017.05.007] [Citation(s) in RCA: 42] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/15/2016] [Revised: 05/19/2017] [Accepted: 05/23/2017] [Indexed: 11/28/2022]
Abstract
Recent developments in 3-dimensional electron microscopy (3D-EM) techniques and a concomitant drive to look at complex molecular structures have led to a rapid increase in the amount of volume data available for biomolecules. This creates a demand for better methods to analyse the data, including improved scores for comparison, classification and integration of data at different resolutions. To this end, we developed and evaluated a set of scoring functions that compare 3D-EM volumes. To test our scores we used a benchmark set of volume alignments derived from the Electron Microscopy Data Bank. We find that the performance of different scores varies with the map-type, resolution and the extent of overlap between volumes. Importantly, adding the overlap information to the local scoring functions can significantly improve their precision and accuracy in a range of resolutions. A combined score involving the local mutual information and overlap (LMI_OV) performs best overall, irrespective of the map category, resolution or the extent of overlap, and we recommend this score for general use. The local mutual information score itself is found to be more discriminatory than the cross-correlation coefficient for intermediate-to-low resolution maps or when the map size and density distribution differ significantly. For comparing map surfaces, we implemented two filters to detect the surface points, including one based on the 'extent of surface exposure'. We show that scores that compare surfaces are useful at low resolutions and for maps with evident surface features. All the scores discussed are implemented in TEMPy (http://tempy.ismb.lon.ac.uk/).
Affiliation(s)
- Agnel Praveen Joseph
- Institute of Structural and Molecular Biology, Department of Biological Sciences, Birkbeck College, University of London, Malet Street, London WC1E 7HX, United Kingdom; Scientific Computing Department, Science and Technology Facilities Council, Research Complex at Harwell, Didcot OX11 0FA, United Kingdom
| | - Ingvar Lagerstedt
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom; Computational Chemistry and Cheminformatics, Lilly UK, Windlesham GU20 6PH, United Kingdom
| | - Ardan Patwardhan
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | - Maya Topf
- Institute of Structural and Molecular Biology, Department of Biological Sciences, Birkbeck College, University of London, Malet Street, London WC1E 7HX, United Kingdom.
| | - Martyn Winn
- Scientific Computing Department, Science and Technology Facilities Council, Research Complex at Harwell, Didcot OX11 0FA, United Kingdom.
|
23
|
Variability of Protein Structure Models from Electron Microscopy. Structure 2017; 25:592-602.e2. [PMID: 28262392 DOI: 10.1016/j.str.2017.02.004] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/19/2016] [Revised: 01/10/2017] [Accepted: 02/11/2017] [Indexed: 11/23/2022]
Abstract
An increasing number of biomolecular structures are solved by electron microscopy (EM). However, the quality of structure models determined from EM maps varies substantially. To understand to what extent structure models are supported by information embedded in EM maps, we used two computational structure refinement methods to examine how much structures can be refined using a dataset of 49 maps with accompanying structure models. The extent of structure modification as well as the disagreement between refinement models produced by the two computational methods scaled inversely with the global and the local map resolutions. A general quantitative estimation of deviations of structures for particular map resolutions is provided. Our results indicate that the observed discrepancy between the deposited map and the refined models is due to the lack of structural information present in EM maps and thus these annotations must be used with caution for further applications.
|
24
|
Vreven T, Pierce BG, Borrman TM, Weng Z. Performance of ZDOCK and IRAD in CAPRI rounds 28-34. Proteins 2016; 85:408-416. [PMID: 27718275 DOI: 10.1002/prot.25186] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/01/2016] [Revised: 09/20/2016] [Accepted: 09/29/2016] [Indexed: 11/11/2022]
Abstract
We report the performance of our protein-protein docking pipeline, including the ZDOCK rigid-body docking algorithm, on 19 targets in CAPRI rounds 28-34. Following the docking step, we reranked the ZDOCK predictions using the IRAD scoring function, pruned redundant predictions, performed energy landscape analysis, and utilized our interface prediction approach RCF. In addition, we applied constraints to the search space based on biological information that we culled from the literature, which increased the chance of making a correct prediction. For all but two targets we were able to find and apply biological information and we found the information to be highly accurate, indicating that effective incorporation of biological information is an important component for protein-protein docking. Proteins 2017; 85:408-416. © 2016 Wiley Periodicals, Inc.
Affiliation(s)
- Thom Vreven
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, Massachusetts, 01605
| | - Brian G Pierce
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, Massachusetts, 01605
| | - Tyler M Borrman
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, Massachusetts, 01605
| | - Zhiping Weng
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, Massachusetts, 01605
|
25
|
Wang D, Sun S, Chen X, Yu Z. A 3D shape descriptor based on spherical harmonics through evolutionary optimization. Neurocomputing 2016. [DOI: 10.1016/j.neucom.2016.01.081] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/22/2022]
|
26
|
Pandurangan AP, Vasishtan D, Alber F, Topf M. γ-TEMPy: Simultaneous Fitting of Components in 3D-EM Maps of Their Assembly Using a Genetic Algorithm. Structure 2015; 23:2365-2376. [PMID: 26655474 PMCID: PMC4671957 DOI: 10.1016/j.str.2015.10.013] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/12/2015] [Revised: 09/24/2015] [Accepted: 10/01/2015] [Indexed: 12/02/2022]
Abstract
We have developed a genetic algorithm for building macromolecular complexes using only a 3D-electron microscopy density map and the atomic structures of the relevant components. For efficient sampling the method uses map feature points calculated by vector quantization. The fitness function combines a mutual information score that quantifies the goodness of fit with a penalty score that helps to avoid clashes between components. Testing the method on ten assemblies (containing 3–8 protein components) and simulated density maps at 10, 15, and 20 Å resolution resulted in identification of the correct topology in 90%, 70%, and 60% of the cases, respectively. We further tested it on four assemblies with experimental maps at 7.2–23.5 Å resolution, showing the ability of the method to identify the correct topology in all cases. We have also demonstrated the importance of the map feature-point quality on assembly fitting in the lack of additional experimental information. Highlights: γ-TEMPy uses a genetic algorithm to fit multiple components into 3D-EM density maps. The fitness score is a combination of a Mutual Information score and a clash penalty. Efficient sampling is aided by using map feature points from vector quantization. Native topologies for assemblies containing up to eight components can be predicted.
Affiliation(s)
- Arun Prasad Pandurangan
- Institute of Structural and Molecular Biology, Birkbeck College, University of London, Malet Street, London WC1E 7HX, UK
| | - Daven Vasishtan
- Division of Structural Biology, Oxford Particle Imaging Centre, Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, UK
| | - Frank Alber
- Program in Molecular and Computational Biology, University of Southern California, 1050 Childs Way, RRI413E, Los Angeles, CA 90089, USA
| | - Maya Topf
- Institute of Structural and Molecular Biology, Birkbeck College, University of London, Malet Street, London WC1E 7HX, UK.
|
27
|
Farabella I, Vasishtan D, Joseph AP, Pandurangan AP, Sahota H, Topf M. TEMPy: a Python library for assessment of three-dimensional electron microscopy density fits. J Appl Crystallogr 2015; 48:1314-1323. [PMID: 26306092 PMCID: PMC4520291 DOI: 10.1107/s1600576715010092] [Citation(s) in RCA: 61] [Impact Index Per Article: 6.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/05/2014] [Accepted: 05/24/2015] [Indexed: 12/21/2022] Open
Abstract
TEMPy is an object-oriented Python library that provides the means to validate density fits in electron microscopy reconstructions. This article highlights several features of particular interest for this purpose and includes some customized examples. Three-dimensional electron microscopy is currently one of the most promising techniques used to study macromolecular assemblies. Rigid and flexible fitting of atomic models into density maps is often essential to gain further insights into the assemblies they represent. Currently, tools that facilitate the assessment of fitted atomic models and maps are needed. TEMPy (template and electron microscopy comparison using Python) is a toolkit designed for this purpose. The library includes a set of methods to assess density fits in intermediate-to-low resolution maps, both globally and locally. It also provides procedures for single-fit assessment, ensemble generation of fits, clustering, and multiple and consensus scoring, as well as plots and output files for visualization purposes to help the user in analysing rigid and flexible fits. The modular nature of TEMPy helps the integration of scoring and assessment of fits into large pipelines, making it a tool suitable for both novice and expert structural biologists.
Affiliation(s)
- Irene Farabella
- Institute of Structural and Molecular Biology, Department of Biological Sciences, Birkbeck, University of London, Malet Street, London WC1E 7HX, UK
| | - Daven Vasishtan
- Oxford Particle Imaging Centre, Division of Structural Biology, Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, UK
| | - Agnel Praveen Joseph
- Scientific Computing Department, Science and Technology Facilities Council, Research Complex at Harwell, Didcot, Oxon OX11 0QX, UK
| | - Arun Prasad Pandurangan
- Institute of Structural and Molecular Biology, Department of Biological Sciences, Birkbeck, University of London, Malet Street, London WC1E 7HX, UK
| | - Harpal Sahota
- Institute of Structural and Molecular Biology, Department of Biological Sciences, Birkbeck, University of London, Malet Street, London WC1E 7HX, UK
| | - Maya Topf
- Institute of Structural and Molecular Biology, Department of Biological Sciences, Birkbeck, University of London, Malet Street, London WC1E 7HX, UK
|
28
|
Integrative Modeling of Biomolecular Complexes: HADDOCKing with Cryo-Electron Microscopy Data. Structure 2015; 23:949-960. [DOI: 10.1016/j.str.2015.03.014] [Citation(s) in RCA: 60] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/26/2014] [Revised: 03/12/2015] [Accepted: 03/13/2015] [Indexed: 12/13/2022]
|
29
|
Schröder GF. Hybrid methods for macromolecular structure determination: experiment with expectations. Curr Opin Struct Biol 2015; 31:20-7. [DOI: 10.1016/j.sbi.2015.02.016] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/05/2014] [Revised: 02/22/2015] [Accepted: 02/26/2015] [Indexed: 12/15/2022]
|
30
|
López-Blanco JR, Chacón P. Structural modeling from electron microscopy data. Wiley Interdiscip Rev Comput Mol Sci 2014. [DOI: 10.1002/wcms.1199] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/25/2022]
Affiliation(s)
- José Ramón López-Blanco
- Department of Biological Physical Chemistry; Rocasolano Physical Chemistry Institute, CSIC; Madrid Spain
| | - Pablo Chacón
- Department of Biological Physical Chemistry; Rocasolano Physical Chemistry Institute, CSIC; Madrid Spain
|
31
|
Peterson LX, Kang X, Kihara D. Assessment of protein side-chain conformation prediction methods in different residue environments. Proteins 2014; 82:1971-84. [PMID: 24619909 PMCID: PMC5007623 DOI: 10.1002/prot.24552] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/15/2014] [Revised: 03/02/2014] [Accepted: 03/07/2014] [Indexed: 11/09/2022]
Abstract
Computational prediction of side-chain conformation is an important component of protein structure prediction. Accurate side-chain prediction is crucial for practical applications of protein structure models that need atomic-detailed resolution such as protein and ligand design. We evaluated the accuracy of eight side-chain prediction methods in reproducing the side-chain conformations of experimentally solved structures deposited to the Protein Data Bank. Prediction accuracy was evaluated for a total of four different structural environments (buried, surface, interface, and membrane-spanning) in three different protein types (monomeric, multimeric, and membrane). Overall, the highest accuracy was observed for buried residues in monomeric and multimeric proteins. Notably, side-chains at protein interfaces and membrane-spanning regions were better predicted than surface residues even though the methods did not all use multimeric and membrane proteins for training. Thus, we conclude that the current methods are as practically useful for modeling protein docking interfaces and membrane-spanning regions as for modeling monomers.
Affiliation(s)
- Lenna X. Peterson
- Department of Biological Sciences, Purdue University, West Lafayette IN, 47907, USA
| | - Xuejiao Kang
- Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA
| | - Daisuke Kihara
- Department of Biological Sciences, Purdue University, West Lafayette IN, 47907, USA
- Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA
|
32
|
Esquivel-Rodriguez J, Filos-Gonzalez V, Li B, Kihara D. Pairwise and multimeric protein-protein docking using the LZerD program suite. Methods Mol Biol 2014; 1137:209-34. [PMID: 24573484 DOI: 10.1007/978-1-4939-0366-5_15] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/11/2022]
Abstract
Physical interactions between proteins are involved in many important cell functions and are key for understanding the mechanisms of biological processes. Protein-protein docking programs provide a means to computationally construct three-dimensional (3D) models of a protein complex structure from its component protein units. A protein docking program takes two or more individual 3D protein structures, which are either experimentally solved or computationally modeled, and outputs a series of probable complex structures. In this chapter we present the LZerD protein docking suite, which includes programs for pairwise docking, LZerD and PI-LZerD, and multiple protein docking, Multi-LZerD, developed by our group. PI-LZerD takes protein docking interface residues as additional input information. The methods use a combination of shape-based protein surface features as well as physics-based scoring terms to generate protein complex models. The programs are provided as stand-alone programs and can be downloaded from http://kiharalab.org/proteindocking.
|
33
|
Bajaj C, Bauer B, Bettadapura R, Vollrath A. Nonuniform Fourier Transforms for Rigid-Body and Multi-Dimensional Rotational Correlations. SIAM J Sci Comput 2013; 35:10.1137/120892386. [PMID: 24379643 PMCID: PMC3874283 DOI: 10.1137/120892386] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/03/2023]
Abstract
The task of evaluating correlations is central to computational structural biology. The rigid-body correlation problem seeks the rigid-body transformation (R, t), R ∈ SO(3), t ∈ ℝ³ that maximizes the correlation between a pair of input scalar-valued functions representing molecular structures. Exhaustive solutions to the rigid-body correlation problem take advantage of the fast Fourier transform to achieve a speedup either with respect to the sought translation or rotation. We present PFcorr, a new exhaustive solution, based on the non-equispaced SO(3) Fourier transform, to the rigid-body correlation problem; unlike previous solutions, ours achieves a combination of translational and rotational speedups without requiring equispaced grids. PFcorr can be straightforwardly applied to a variety of problems in protein structure prediction and refinement that involve correlations under rigid-body motions of the protein. Additionally, we show how it applies, along with an appropriate flexibility model, to analogs of the above problems in which the flexibility of the protein is relevant.
Affiliation(s)
- Chandrajit Bajaj
- Computational Visualization Center, Department of Computer Sciences and The Institute of Computational Engineering and Sciences, The University of Texas at Austin, 1 University Station C0200, Austin, Texas 78712, USA
| | - Benedikt Bauer
- Max Planck Institute for Evolutionary Biology, Plön, Germany
| | - Radhakrishna Bettadapura
- Computational Visualization Center, Department of Mechanical Engineering, The University of Texas at Austin, 1 University Station C0200, Austin, Texas 78712, USA
| | - Antje Vollrath
- Institute of Computational Mathematics, TU Braunschweig, Pockelsstr 14, 38106 Braunschweig, Germany
|
34
|
Esquivel-Rodríguez J, Kihara D. Computational methods for constructing protein structure models from 3D electron microscopy maps. J Struct Biol 2013; 184:93-102. [PMID: 23796504 DOI: 10.1016/j.jsb.2013.06.008] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/18/2012] [Revised: 06/11/2013] [Accepted: 06/13/2013] [Indexed: 12/31/2022]
Abstract
Protein structure determination by cryo-electron microscopy (EM) has made significant progress in the past decades. Resolutions of EM maps have been improving as evidenced by recently reported structures that are solved at high resolutions close to 3 Å. Computational methods play a key role in interpreting EM data. Among many computational procedures applied to an EM map to obtain protein structure information, in this article we focus on reviewing computational methods that model protein three-dimensional (3D) structures from a 3D EM density map that is constructed from two-dimensional (2D) maps. The computational methods we discuss range from de novo methods, which identify structural elements in an EM map, to structure fitting methods, where known high resolution structures are fit into a low-resolution EM map. A list of available computational tools is also provided.
Affiliation(s)
- Juan Esquivel-Rodríguez
- Department of Computer Science, College of Science, Purdue University, West Lafayette, IN 47907, USA
|
35
|
Tretyakov K, Goldberg T, Jin VX, Horton P. Summary of talks and papers at ISCB-Asia/SCCG 2012. BMC Genomics 2013. [PMCID: PMC3639071 DOI: 10.1186/1471-2164-14-s2-i1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022] Open
Abstract
The second ISCB-Asia conference of the International Society for Computational Biology took place December 17-19, 2012, in Shenzhen, China. The conference was co-hosted by BGI as the first Shenzhen Conference on Computational Genomics (SCCG).
45 talks were presented at ISCB-Asia/SCCG 2012. The topics covered included software tools, reproducible computing, next-generation sequencing data analysis, transcription and mRNA regulation, protein structure and function, cancer genomics and personalized medicine. Nine of the proceedings track talks are included as full papers in this supplement.
In this report we first give a short overview of the conference by listing some statistics and visualizing the talk abstracts as word clouds. Then we group the talks by topic and briefly summarize each one, providing references to related publications whenever possible. Finally, we close with a few comments on the success of this conference.
|
36
|
Esquivel-Rodríguez J, Kihara D. Effect of conformation sampling strategies in genetic algorithm for multiple protein docking. BMC Proc 2012; 6 Suppl 7:S4. [PMID: 23173833 PMCID: PMC3504801 DOI: 10.1186/1753-6561-6-s7-s4] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 11/23/2022] Open
Abstract
Background: Macromolecular protein complexes play important roles in a cell, and their tertiary structures can help explain the key biological processes underlying their functions. Multiple protein docking is a valuable computational tool for providing structure information of multimeric protein complexes. In a previous study we developed and implemented an algorithm for this purpose, named Multi-LZerD. This method represents a conformation of a multimeric protein complex as a graph, where nodes denote subunits and each edge connecting nodes denotes a pairwise docking conformation of the two subunits. Multi-LZerD employs a genetic algorithm to sample different topologies of the graph and pairwise transformations between subunits, seeking the conformation with the optimal (lowest) energy. In this study we explore different configurations of the genetic algorithm, namely, the population size, whether to include a crossover operation, and the threshold for structural clustering, to find the optimal experimental setup.
Methods: Multi-LZerD was executed to predict the structures of three multimeric protein complexes, using different population sizes, clustering thresholds, and configurations of mutation and crossover. We analyzed the impact of varying these parameters on the computational time and the prediction accuracy.
Results and conclusions: Given that computational resources are key to handling complexes with a large number of subunits and to computing a large number of protein complexes in a genome-scale study, finding a proper setting for sampling the conformation space is of the utmost importance. Our results show that excessive sampling of the conformational space, by increasing the population size or by introducing the crossover operation, is not necessary for improving the accuracy of structure prediction for small complexes. The clustering is effective in reducing redundant pairwise predictions, which leads to successful identification of near-native conformations.
Affiliation(s)
- Juan Esquivel-Rodríguez
- Department of Computer Science, College of Science, Purdue University, West Lafayette, IN 47907, USA.
|