1
Thamkachy R, Medina-Pritchard B, Park SH, Chiodi CG, Zou J, de la Torre-Barranco M, Shimanaka K, Abad MA, Gallego Páramo C, Feederle R, Ruksenaite E, Heun P, Davies OR, Rappsilber J, Schneidman-Duhovny D, Cho US, Jeyaprakash AA. Structural basis for Mis18 complex assembly and its implications for centromere maintenance. EMBO Rep 2024:10.1038/s44319-024-00183-w. [PMID: 38951710 DOI: 10.1038/s44319-024-00183-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2024] [Revised: 05/06/2024] [Accepted: 06/06/2024] [Indexed: 07/03/2024] Open
Abstract
The centromere, defined by the enrichment of CENP-A (a Histone H3 variant) containing nucleosomes, is a specialised chromosomal locus that acts as a microtubule attachment site. To preserve centromere identity, CENP-A levels must be maintained through active CENP-A loading during the cell cycle. A central player mediating this process is the Mis18 complex (Mis18α, Mis18β and Mis18BP1), which recruits the CENP-A-specific chaperone HJURP to centromeres for CENP-A deposition. Here, using a multi-pronged approach, we characterise the structure of the Mis18 complex and show that multiple hetero- and homo-oligomeric interfaces facilitate the hetero-octameric Mis18 complex assembly composed of 4 Mis18α, 2 Mis18β and 2 Mis18BP1. Evaluation of structure-guided/separation-of-function mutants reveals structural determinants essential for cell cycle controlled Mis18 complex assembly and centromere maintenance. Our results provide new mechanistic insights on centromere maintenance, highlighting that while Mis18α can associate with centromeres and deposit CENP-A independently of Mis18β, the latter is indispensable for the optimal level of CENP-A loading required for preserving the centromere identity.
Affiliation(s)
- Reshma Thamkachy
- Wellcome Centre for Cell Biology, University of Edinburgh, Edinburgh, EH9 3BF, UK
- Sang Ho Park
- Department of Biological Chemistry, University of Michigan, Ann Arbor, MI, 48109, USA
- Carla G Chiodi
- Wellcome Centre for Cell Biology, University of Edinburgh, Edinburgh, EH9 3BF, UK
- Juan Zou
- Wellcome Centre for Cell Biology, University of Edinburgh, Edinburgh, EH9 3BF, UK
- Kazuma Shimanaka
- Department of Biological Chemistry, University of Michigan, Ann Arbor, MI, 48109, USA
- Maria Alba Abad
- Wellcome Centre for Cell Biology, University of Edinburgh, Edinburgh, EH9 3BF, UK
- Regina Feederle
- Monoclonal Antibody Core Facility, Helmholtz Zentrum München, German Research Center for Environmental Health (GmbH), 85764, Neuherberg, Germany
- Emilija Ruksenaite
- Institute Novo Nordisk Foundation Centre for Protein Research, Copenhagen, Denmark
- Patrick Heun
- Wellcome Centre for Cell Biology, University of Edinburgh, Edinburgh, EH9 3BF, UK
- Owen R Davies
- Wellcome Centre for Cell Biology, University of Edinburgh, Edinburgh, EH9 3BF, UK
- Juri Rappsilber
- Wellcome Centre for Cell Biology, University of Edinburgh, Edinburgh, EH9 3BF, UK
- Institute of Biotechnology, Technische Universität Berlin, 13355, Berlin, Germany
- Dina Schneidman-Duhovny
- School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel
- Uhn-Soo Cho
- Department of Biological Chemistry, University of Michigan, Ann Arbor, MI, 48109, USA
- A Arockia Jeyaprakash
- Wellcome Centre for Cell Biology, University of Edinburgh, Edinburgh, EH9 3BF, UK.
- Gene Center, Department of Biochemistry, Ludwig Maximilians Universität, Munich, Germany.
2
Shor B, Schneidman-Duhovny D. Integrative modeling meets deep learning: Recent advances in modeling protein assemblies. Curr Opin Struct Biol 2024; 87:102841. [PMID: 38795564 DOI: 10.1016/j.sbi.2024.102841] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/29/2024] [Revised: 04/24/2024] [Accepted: 04/27/2024] [Indexed: 05/28/2024]
Abstract
Recent progress in protein structure prediction based on deep learning revolutionized the field of Structural Biology. Beyond single proteins, it also enabled high-throughput prediction of structures of protein-protein interactions. Despite the success in predicting complex structures, large macromolecular assemblies still require specialized approaches. Here we describe recent advances in modeling macromolecular assemblies using integrative and hierarchical approaches. We highlight applications that predict protein-protein interactions and challenges in modeling complexes based on the interaction networks, including the prediction of complex stoichiometry and heterogeneity.
Affiliation(s)
- Ben Shor
- The Rachel and Selim Benin School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel. https://twitter.com/ben_shor
- Dina Schneidman-Duhovny
- The Rachel and Selim Benin School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel.
3
Shor B, Schneidman-Duhovny D. CombFold: predicting structures of large protein assemblies using a combinatorial assembly algorithm and AlphaFold2. Nat Methods 2024; 21:477-487. [PMID: 38326495 PMCID: PMC10927564 DOI: 10.1038/s41592-024-02174-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2023] [Accepted: 01/09/2024] [Indexed: 02/09/2024]
Abstract
Deep learning models, such as AlphaFold2 and RosettaFold, enable high-accuracy protein structure prediction. However, large protein complexes are still challenging to predict due to their size and the complexity of interactions between multiple subunits. Here we present CombFold, a combinatorial and hierarchical assembly algorithm for predicting structures of large protein complexes utilizing pairwise interactions between subunits predicted by AlphaFold2. CombFold accurately predicted (TM-score >0.7) 72% of the complexes among the top-10 predictions in two datasets of 60 large, asymmetric assemblies. Moreover, the structural coverage of predicted complexes was 20% higher compared to corresponding Protein Data Bank entries. We applied the method on complexes from Complex Portal with known stoichiometry but without known structure and obtained high-confidence predictions. CombFold supports the integration of distance restraints based on crosslinking mass spectrometry and fast enumeration of possible complex stoichiometries. CombFold's high accuracy makes it a promising tool for expanding structural coverage beyond monomeric proteins.
Affiliation(s)
- Ben Shor
- The Rachel and Selim Benin School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel
- Dina Schneidman-Duhovny
- The Rachel and Selim Benin School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel.
4
Shor B, Schneidman-Duhovny D. Predicting structures of large protein assemblies using combinatorial assembly algorithm and AlphaFold2. bioRxiv 2023:2023.05.16.541003. [PMID: 37293053 PMCID: PMC10245790 DOI: 10.1101/2023.05.16.541003] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 06/10/2023]
Abstract
Deep learning models, such as AlphaFold2 and RosettaFold, enable high-accuracy protein structure prediction. However, large protein complexes are still challenging to predict due to their size and the complexity of interactions between multiple subunits. Here we present CombFold, a combinatorial and hierarchical assembly algorithm for predicting structures of large protein complexes utilizing pairwise interactions between subunits predicted by AlphaFold2. CombFold accurately predicted (TM-score > 0.7) 72% of the complexes among the Top-10 predictions in two datasets of 60 large, asymmetric assemblies. Moreover, the structural coverage of predicted complexes was 20% higher compared to corresponding PDB entries. We applied the method on complexes from Complex Portal with known stoichiometry but without known structure and obtained high-confidence predictions. CombFold supports the integration of distance restraints based on crosslinking mass spectrometry and fast enumeration of possible complex stoichiometries. CombFold's high accuracy makes it a promising tool for expanding structural coverage beyond monomeric proteins.
Affiliation(s)
- Ben Shor
- The Rachel and Selim Benin School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel
- Dina Schneidman-Duhovny
- The Rachel and Selim Benin School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel
5
Bryant P, Pozzati G, Zhu W, Shenoy A, Kundrotas P, Elofsson A. Predicting the structure of large protein complexes using AlphaFold and Monte Carlo tree search. Nat Commun 2022; 13:6028. [PMID: 36224222 PMCID: PMC9556563 DOI: 10.1038/s41467-022-33729-4] [Citation(s) in RCA: 69] [Impact Index Per Article: 34.5] [Received: 05/31/2022] [Accepted: 09/29/2022] [Indexed: 11/30/2022] Open
Abstract
AlphaFold can predict the structure of single- and multiple-chain proteins with very high accuracy. However, the accuracy decreases with the number of chains, and the available GPU memory limits the size of protein complexes which can be predicted. Here we show that one can predict the structure of large complexes starting from predictions of subcomponents. We assemble 91 out of 175 complexes with 10–30 chains from predicted subcomponents using Monte Carlo tree search, with a median TM-score of 0.51. There are 30 highly accurate complexes (TM-score ≥0.8, 33% of complete assemblies). We create a scoring function, mpDockQ, that can distinguish if assemblies are complete and predict their accuracy. We find that complexes containing symmetry are accurately assembled, while asymmetrical complexes remain challenging. The method is freely available and accessible as a Colab notebook at https://colab.research.google.com/github/patrickbryant1/MoLPC/blob/master/MoLPC.ipynb. The accuracy of AlphaFold decreases with the number of protein chains and the available GPU memory limits the size of protein complexes that can be predicted. Here, the authors show that complexes with 10–30 chains can be assembled from predicted subcomponents using Monte Carlo tree search.
Affiliation(s)
- Patrick Bryant
- Science for Life Laboratory, 172 21, Solna, Sweden; Department of Biochemistry and Biophysics, Stockholm University, 106 91, Stockholm, Sweden.
- Gabriele Pozzati
- Science for Life Laboratory, 172 21, Solna, Sweden; Department of Biochemistry and Biophysics, Stockholm University, 106 91, Stockholm, Sweden
- Wensi Zhu
- Science for Life Laboratory, 172 21, Solna, Sweden; Department of Biochemistry and Biophysics, Stockholm University, 106 91, Stockholm, Sweden
- Aditi Shenoy
- Science for Life Laboratory, 172 21, Solna, Sweden; Department of Biochemistry and Biophysics, Stockholm University, 106 91, Stockholm, Sweden
- Petras Kundrotas
- Science for Life Laboratory, 172 21, Solna, Sweden; Center for Computational Biology, The University of Kansas, Lawrence, KS, 66047, USA
- Arne Elofsson
- Science for Life Laboratory, 172 21, Solna, Sweden; Department of Biochemistry and Biophysics, Stockholm University, 106 91, Stockholm, Sweden
6
Aderinwale T, Christoffer C, Kihara D. RL-MLZerD: Multimeric protein docking using reinforcement learning. Front Mol Biosci 2022; 9:969394. [PMID: 36090027 PMCID: PMC9459051 DOI: 10.3389/fmolb.2022.969394] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2022] [Accepted: 08/08/2022] [Indexed: 11/24/2022] Open
Abstract
Numerous biological processes in a cell are carried out by protein complexes. To understand the molecular mechanisms of such processes, it is crucial to know the quaternary structures of the complexes. Although the structures of protein complexes have been determined by biophysical experiments at a rapid pace, there are still many important complex structures that are yet to be determined. To supplement experimental structure determination of complexes, many computational protein docking methods have been developed; however, most of these docking methods are designed only for docking with two chains. Here, we introduce a novel method, RL-MLZerD, which builds multiple protein complexes using reinforcement learning (RL). In RL-MLZerD a multi-chain assembly process is considered as a series of episodes of selecting and integrating pre-computed pairwise docking models in a RL framework. RL is effective in correctly selecting plausible pairwise models that fit well with other subunits in a complex. When tested on a benchmark dataset of protein complexes with three to five chains, RL-MLZerD showed better modeling performance than other existing multiple docking methods under different evaluation criteria, except against AlphaFold-Multimer in unbound docking. Also, it emerged that the docking order of multi-chain complexes can be naturally predicted by examining preferred paths of episodes in the RL computation.
Affiliation(s)
- Tunde Aderinwale
- Department of Computer Science, Purdue University, West Lafayette, IN, United States
- Charles Christoffer
- Department of Computer Science, Purdue University, West Lafayette, IN, United States
- Daisuke Kihara
- Department of Computer Science, Purdue University, West Lafayette, IN, United States
- Department of Biological Sciences, Purdue University, West Lafayette, IN, United States
- Correspondence: Daisuke Kihara
7
Boyer B, Laurent B, Robert CH, Prévost C. Modeling Perturbations in Protein Filaments at the Micro and Meso Scale Using NAMD and PTools/Heligeom. Bio Protoc 2021; 11:e4097. [PMID: 34395733 DOI: 10.21769/bioprotoc.4097] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2020] [Revised: 03/28/2021] [Accepted: 04/22/2021] [Indexed: 11/02/2022] Open
Abstract
Protein filaments are dynamic entities that respond to external stimuli by slightly or substantially modifying the internal binding geometries between successive protomers. This results in overall changes in the filament architecture, which are difficult to model due to the helical character of the system. Here, we describe how distortions in RecA nucleofilaments and their consequences on the filament-DNA and bound DNA-DNA interactions at different stages of the homologous recombination process can be modeled using the PTools/Heligeom software and subsequent molecular dynamics simulation with NAMD. Modeling methods dealing with helical macromolecular objects typically rely on symmetric assemblies and take advantage of known symmetry descriptors. Other methods dealing with single objects, such as MMTK or VMD, do not integrate the specificities of regular assemblies. By basing the model building on binding geometries at the protomer-protomer level, PTools/Heligeom frees the building process from a priori knowledge of the system topology and enables irregular architectures and symmetry disruption to be accounted for. Graphical abstract: Model of ATP hydrolysis-induced distortions in the recombinant nucleoprotein, obtained by combining RecA-DNA and two RecA-RecA binding geometries.
Affiliation(s)
- Benjamin Boyer
- Laboratoire de Biochimie Théorique, CNRS, UPR 9080, Université de Paris, F-75005, Paris, France; Institut de Biologie Physico-Chimique, Fondation Edmond de Rothschild, PSL Research University, Paris, France
- Benoist Laurent
- CNRS, FR 550, Institut de Biologie Physico-Chimique, Paris, France
- Charles H Robert
- Laboratoire de Biochimie Théorique, CNRS, UPR 9080, Université de Paris, F-75005, Paris, France; Institut de Biologie Physico-Chimique, Fondation Edmond de Rothschild, PSL Research University, Paris, France
- Chantal Prévost
- Laboratoire de Biochimie Théorique, CNRS, UPR 9080, Université de Paris, F-75005, Paris, France; Institut de Biologie Physico-Chimique, Fondation Edmond de Rothschild, PSL Research University, Paris, France
8
Aderinwale T, Christoffer CW, Sarkar D, Alnabati E, Kihara D. Computational structure modeling for diverse categories of macromolecular interactions. Curr Opin Struct Biol 2020; 64:1-8. [PMID: 32599506 PMCID: PMC7665979 DOI: 10.1016/j.sbi.2020.05.017] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 01/30/2020] [Revised: 05/06/2020] [Accepted: 05/21/2020] [Indexed: 01/23/2023]
Abstract
Computational protein-protein docking is one of the most intensively studied topics in structural bioinformatics. The field has made substantial progress through over three decades of development. The development began with methods for rigid-body docking of two proteins, which have now been extended in different directions to cover the various macromolecular interactions observed in a cell. Here, we overview the recent developments of the variations of docking methods, including multiple protein docking, peptide-protein docking, and disordered protein docking methods.
Affiliation(s)
- Tunde Aderinwale
- Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA
- Daipayan Sarkar
- Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA
- Eman Alnabati
- Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA
- Daisuke Kihara
- Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA; Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA.
9
Ruiz Echartea ME, Ritchie DW, Chauvot de Beauchêne I. Using restraints in EROS-DOCK improves model quality in pairwise and multicomponent protein docking. Proteins 2020; 88:1121-1128. [DOI: 10.1002/prot.25959] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 09/02/2019] [Revised: 03/04/2020] [Accepted: 05/27/2020] [Indexed: 12/26/2022]
10
Abstract
Macromolecular complexes play a key role in cellular function. Predicting the structure and dynamics of these complexes is one of the key challenges in structural biology. Docking applications have traditionally been used to predict pairwise interactions between proteins. However, few methods exist for modeling multi-protein assemblies. Here we present two methods, CombDock and DockStar, that can predict multi-protein assemblies starting from subunit structural models. CombDock can assemble subunits without any assumptions about the pairwise interactions between subunits, while DockStar relies on the interaction graph or, alternatively, a homology model or a cryo-electron microscopy (EM) density map of the entire complex. We demonstrate the two methods using RNA polymerase II with 12 subunits and TRiC/CCT chaperonin with 16 subunits.
Affiliation(s)
- Dina Schneidman-Duhovny
- School of Computer Science and Engineering and the Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel
- Haim J Wolfson
- Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel.
11
Furmanova K, Jurcik A, Kozlikova B, Hauser H, Byska J. Multiscale Visual Drilldown for the Analysis of Large Ensembles of Multi-Body Protein Complexes. IEEE Transactions on Visualization and Computer Graphics 2020; 26:843-852. [PMID: 31425101 DOI: 10.1109/tvcg.2019.2934333] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]
Abstract
When studying multi-body protein complexes, biochemists use computational tools that can suggest hundreds or thousands of their possible spatial configurations. However, it is not feasible to experimentally verify more than only a very small subset of them. In this paper, we propose a novel multiscale visual drilldown approach that was designed in tight collaboration with proteomic experts, enabling a systematic exploration of the configuration space. Our approach takes advantage of the hierarchical structure of the data - from the whole ensemble of protein complex configurations to the individual configurations, their contact interfaces, and the interacting amino acids. Our new solution is based on interactively linked 2D and 3D views for individual hierarchy levels. At each level, we offer a set of selection and filtering operations that enable the user to narrow down the number of configurations that need to be manually scrutinized. Furthermore, we offer a dedicated filter interface, which provides the users with an overview of the applied filtering operations and enables them to examine their impact on the explored ensemble. This way, we maintain the history of the exploration process and thus enable the user to return to an earlier point of the exploration. We demonstrate the effectiveness of our approach on two case studies conducted by collaborating proteomic experts.
12
Developments in integrative modeling with dynamical interfaces. Curr Opin Struct Biol 2019; 56:11-17. [DOI: 10.1016/j.sbi.2018.10.007] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 09/21/2018] [Revised: 10/26/2018] [Accepted: 10/27/2018] [Indexed: 11/19/2022]
13
Bertoni M, Aloy P. DynBench3D, a Web-Resource to Dynamically Generate Benchmark Sets of Large Heteromeric Protein Complexes. J Mol Biol 2018; 430:4431-4438. [PMID: 30274705 DOI: 10.1016/j.jmb.2018.09.011] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 07/11/2018] [Revised: 08/21/2018] [Accepted: 09/11/2018] [Indexed: 11/24/2022]
Abstract
Multi-protein machines are responsible for most cellular tasks, and many efforts have been invested in the systematic identification and characterization of thousands of these macromolecular assemblies. However, unfortunately, the (quasi) atomic details necessary to understand their function are available only for a tiny fraction of the known complexes. The computational biology community is developing strategies to integrate structural data of different nature, from electron microscopy to X-ray crystallography, to model large molecular machines, as it has been done for individual proteins and interactions with remarkable success. However, unlike for binary interactions, there is no reliable gold-standard set of three-dimensional (3D) complexes to benchmark the performance of these methodologies and detect their limitations. Here, we present a strategy to dynamically generate non-redundant sets of 3D heteromeric complexes with three or more components. By changing the values of sequence identity and component overlap between assemblies required to define complex redundancy, we can create sets of representative complexes with known 3D structure (i.e., target complexes). Using an identity threshold of 20% and imposing a fraction of component overlap of <0.5, we identify 495 unique target complexes, which represent a real non-redundant set of heteromeric assemblies with known 3D structure. Moreover, for each target complex, we also identify a set of assemblies, of varying degrees of identity and component overlap, that can be readily used as input in a complex modeling exercise (i.e., template subcomplexes). We hope that resources like this will significantly help the development and progress assessment of novel methodologies, as docking benchmarks and blind prediction contests did. The interactive resource is accessible at https://DynBench3D.irbbarcelona.org.
Affiliation(s)
- Martino Bertoni
- Joint IRB-BSC-CRG Program in Computational Biology, Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain
- Patrick Aloy
- Joint IRB-BSC-CRG Program in Computational Biology, Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain; Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Catalonia, Spain.
14
Peterson LX, Togawa Y, Esquivel-Rodriguez J, Terashi G, Christoffer C, Roy A, Shin WH, Kihara D. Modeling the assembly order of multimeric heteroprotein complexes. PLoS Comput Biol 2018; 14:e1005937. [PMID: 29329283 PMCID: PMC5785014 DOI: 10.1371/journal.pcbi.1005937] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Received: 06/01/2017] [Revised: 01/25/2018] [Accepted: 12/19/2017] [Indexed: 12/31/2022] Open
Abstract
Protein-protein interactions are the cornerstone of numerous biological processes. Although an increasing number of protein complex structures have been determined using experimental methods, relatively few studies have been performed to determine the assembly order of complexes. In addition to the insights into the molecular mechanisms of biological function provided by the structure of a complex, knowing the assembly order is important for understanding the process of complex formation. Assembly order is also practically useful for constructing subcomplexes as a step toward solving the entire complex experimentally, designing artificial protein complexes, and developing drugs that interrupt a critical step in the complex assembly. There are several experimental methods for determining the assembly order of complexes; however, these techniques are resource-intensive. Here, we present a computational method that predicts the assembly order of protein complexes by building the complex structure. The method, named Path-LZerD, uses a multimeric protein docking algorithm that assembles a protein complex structure from individual subunit structures and predicts assembly order by observing the simulated assembly process of the complex. Benchmarked on a dataset of complexes with experimental evidence of assembly order, Path-LZerD was successful in predicting the assembly pathway for the majority of the cases. Moreover, when compared with a simple approach that infers the assembly path from the buried surface area of subunits in the native complex, Path-LZerD has the strong advantage that it can be used for cases where the complex structure is not known. The path prediction accuracy decreased when starting from unbound monomers, particularly for larger complexes of five or more subunits, for which only a part of the assembly path was correctly identified. As the first method of its kind, Path-LZerD opens a new area of computational protein structure modeling and will be an indispensable approach for studying protein complexes.
Affiliation(s)
- Lenna X. Peterson
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana, United States of America
- Yoichiro Togawa
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana, United States of America
- Juan Esquivel-Rodriguez
- Department of Computer Science, Purdue University, West Lafayette, Indiana, United States of America
- Genki Terashi
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana, United States of America
- Charles Christoffer
- Department of Computer Science, Purdue University, West Lafayette, Indiana, United States of America
- Amitava Roy
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana, United States of America
- Department of Medicinal Chemistry and Molecular Pharmacology, Purdue University, West Lafayette, Indiana, United States of America
- Bioinformatics and Computational Biosciences Branch, Rocky Mountain Laboratories, NIAID, National Institutes of Health, Hamilton, Montana, United States of America
- Woong-Hee Shin
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana, United States of America
- Daisuke Kihara
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana, United States of America
- Department of Computer Science, Purdue University, West Lafayette, Indiana, United States of America
15
Ronchi VP, Kim ED, Summa CM, Klein JM, Haas AL. In silico modeling of the cryptic E2∼ubiquitin-binding site of E6-associated protein (E6AP)/UBE3A reveals the mechanism of polyubiquitin chain assembly. J Biol Chem 2017; 292:18006-18023. [PMID: 28924046 DOI: 10.1074/jbc.m117.813477] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 08/21/2017] [Indexed: 12/13/2022] Open
Abstract
To understand the mechanism for assembly of Lys48-linked polyubiquitin degradation signals, we previously demonstrated that the E6AP/UBE3A ligase harbors two functionally distinct E2∼ubiquitin-binding sites: a high-affinity Site 1 required for E6AP Cys820∼ubiquitin thioester formation and a canonical Site 2 responsible for subsequent chain elongation. Ordered binding to Sites 1 and 2 is here revealed by observation of UbcH7∼ubiquitin-dependent substrate inhibition of chain formation at micromolar concentrations. To understand substrate inhibition, we exploited the PatchDock algorithm to model in silico UbcH7∼ubiquitin bound to Site 1, validated by chain assembly kinetics of selected point mutants. The predicted structure buries an extensive solvent-excluded surface bringing the UbcH7∼ubiquitin thioester bond within 6 Å of the Cys820 nucleophile. Modeling onto the active E6AP trimer suggests that substrate inhibition arises from steric hindrance between Sites 1 and 2 of adjacent subunits. Confirmation that Sites 1 and 2 function in trans was demonstrated by examining the effect of E6APC820A on wild-type activity and single-turnover pulse-chase kinetics. A cyclic proximal indexation model proposes that Sites 1 and 2 function in tandem to assemble thioester-linked polyubiquitin chains from the proximal end attached to Cys820 before stochastic en bloc transfer to the target protein. Non-reducing SDS-PAGE confirms assembly of the predicted Cys820-linked 125I-polyubiquitin thioester intermediate. Other studies suggest that Glu550 serves as a general base to generate the Cys820 thiolate within the low dielectric binding interface and Arg506 functions to orient Glu550 and to stabilize the incipient anionic transition state during thioester exchange.
Affiliation(s)
- Elizabeth D Kim
- From the Department of Biochemistry and Molecular Biology and
- Christopher M Summa
- the Department of Computer Science, University of New Orleans, New Orleans, Louisiana 70148
- Arthur L Haas
- From the Department of Biochemistry and Molecular Biology and the Stanley S. Scott Cancer Center, Louisiana State University Health Sciences Center, New Orleans, Louisiana 70112 and
Collapse
|
16
|
Computational modeling of protein assemblies. Curr Opin Struct Biol 2017; 44:179-189. [DOI: 10.1016/j.sbi.2017.04.006] [Citation(s) in RCA: 39] [Impact Index Per Article: 5.6] [Received: 12/05/2016] [Revised: 04/07/2017] [Accepted: 04/11/2017] [Indexed: 01/18/2023]
17
de Vries SJ, Zacharias M. Fast and accurate grid representations for atom-based docking with partner flexibility. J Comput Chem 2017; 38:1538-1546. [DOI: 10.1002/jcc.24795] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 09/22/2016] [Revised: 01/18/2017] [Accepted: 01/19/2017] [Indexed: 12/12/2022]
Affiliation(s)
- Sjoerd J. de Vries
- MTi, UMR-S 973, Physics Department T38; Technische Universität München; James-Franck-Strasse 1 85748 Garching Germany
- Martin Zacharias
- MTi, UMR-S 973, Physics Department T38; Technische Universität München; James-Franck-Strasse 1 85748 Garching Germany
18
Kuzu G, Keskin O, Nussinov R, Gursoy A. PRISM-EM: template interface-based modelling of multi-protein complexes guided by cryo-electron microscopy density maps. Acta Crystallogr D Struct Biol 2016; 72:1137-1148. [PMID: 27710935 PMCID: PMC5053140 DOI: 10.1107/s2059798316013541] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 05/28/2015] [Accepted: 08/23/2016] [Indexed: 12/29/2022] Open
Abstract
The structures of protein assemblies are important for elucidating cellular processes at the molecular level. Three-dimensional electron microscopy (3DEM) is a powerful method to identify the structures of assemblies, especially those that are challenging to study by crystallography. Here, a new approach, PRISM-EM, is reported to computationally generate plausible structural models using a procedure that combines crystallographic structures and density maps obtained from 3DEM. The predictions are validated against seven available structurally different crystallographic complexes. The models display mean deviations in the backbone of <5 Å. PRISM-EM was further tested on different benchmark sets; the accuracy was evaluated with respect to the structure of the complex, and the correlation with EM density maps and interface predictions were evaluated and compared with those obtained using other methods. PRISM-EM was then used to predict the structure of the ternary complex of the HIV-1 envelope glycoprotein trimer, the ligand CD4 and the neutralizing protein m36.
Affiliation(s)
- Guray Kuzu
- Center for Computational Biology and Bioinformatics and College of Engineering, Koc University, 34450 Istanbul, Turkey
- Ozlem Keskin
- Center for Computational Biology and Bioinformatics and College of Engineering, Koc University, 34450 Istanbul, Turkey
- Chemical and Biological Engineering, College of Engineering, Koc University, 34450 Istanbul, Turkey
- Ruth Nussinov
- Cancer and Inflammation Program, Leidos Biomedical Research Inc., National Cancer Institute, Frederick National Laboratory for Cancer Research, Frederick, MD 21702, USA
- Sackler Institute of Molecular Medicine, Department of Human Genetics and Molecular Medicine, Sackler School of Medicine, Tel Aviv University, Tel Aviv 69978, Israel
- Attila Gursoy
- Computer Engineering, Koc University, 34450 Istanbul, Turkey
19
Maximova T, Moffatt R, Ma B, Nussinov R, Shehu A. Principles and Overview of Sampling Methods for Modeling Macromolecular Structure and Dynamics. PLoS Comput Biol 2016; 12:e1004619. [PMID: 27124275 PMCID: PMC4849799 DOI: 10.1371/journal.pcbi.1004619] [Citation(s) in RCA: 132] [Impact Index Per Article: 16.5] [Indexed: 01/08/2023] Open
Abstract
Investigation of macromolecular structure and dynamics is fundamental to understanding how macromolecules carry out their functions in the cell. Significant advances have been made toward this end in silico, with a growing number of computational methods proposed yearly to study and simulate various aspects of macromolecular structure and dynamics. This review aims to provide an overview of recent advances, focusing primarily on methods proposed for exploring the structure space of macromolecules in isolation and in assemblies for the purpose of characterizing equilibrium structure and dynamics. In addition to surveying recent applications that showcase current capabilities of computational methods, this review highlights state-of-the-art algorithmic techniques proposed to overcome challenges posed in silico by the disparate spatial and time scales accessed by dynamic macromolecules. This review is not meant to be exhaustive, as such an endeavor is impossible, but rather aims to balance breadth and depth of strategies for modeling macromolecular structure and dynamics for a broad audience of novices and experts.
Affiliation(s)
- Tatiana Maximova
- Department of Computer Science, George Mason University, Fairfax, Virginia, United States of America
- Ryan Moffatt
- Department of Computer Science, George Mason University, Fairfax, Virginia, United States of America
- Buyong Ma
- Basic Science Program, Leidos Biomedical Research, Inc. Cancer and Inflammation Program, National Cancer Institute, Frederick, Maryland, United States of America
- Ruth Nussinov
- Basic Science Program, Leidos Biomedical Research, Inc. Cancer and Inflammation Program, National Cancer Institute, Frederick, Maryland, United States of America
- Sackler Institute of Molecular Medicine, Department of Human Genetics and Molecular Medicine, Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
- Amarda Shehu
- Department of Computer Science, George Mason University, Fairfax, Virginia, United States of America
- Department of Bioengineering, George Mason University, Fairfax, Virginia, United States of America
- School of Systems Biology, George Mason University, Manassas, Virginia, United States of America
20
Ahnert SE, Marsh JA, Hernández H, Robinson CV, Teichmann SA. Principles of assembly reveal a periodic table of protein complexes. Science 2016; 350:aaa2245. [PMID: 26659058 DOI: 10.1126/science.aaa2245] [Citation(s) in RCA: 153] [Impact Index Per Article: 19.1] [Indexed: 12/11/2022]
Abstract
Structural insights into protein complexes have had a broad impact on our understanding of biological function and evolution. In this work, we sought a comprehensive understanding of the general principles underlying quaternary structure organization in protein complexes. We first examined the fundamental steps by which protein complexes can assemble, using experimental and structure-based characterization of assembly pathways. Most assembly transitions can be classified into three basic types, which can then be used to exhaustively enumerate a large set of possible quaternary structure topologies. These topologies, which include the vast majority of observed protein complex structures, enable a natural organization of protein complexes into a periodic table. On the basis of this table, we can accurately predict the expected frequencies of quaternary structure topologies, including those not yet observed. These results have important implications for quaternary structure prediction, modeling, and engineering.
Affiliation(s)
- Sebastian E Ahnert
- Theory of Condensed Matter Group, Cavendish Laboratory, University of Cambridge, JJ Thomson Avenue, Cambridge CB3 0HE, UK
- Joseph A Marsh
- Medical Research Council Human Genetics Unit, Institute of Genetics and Molecular Medicine, University of Edinburgh, Western General Hospital, Edinburgh EH4 2XU, UK. European Molecular Biology Laboratory-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Helena Hernández
- Physical and Theoretical Chemistry Laboratory, Department of Chemistry, University of Oxford, South Parks Road, Oxford OX1 3QZ, UK
- Carol V Robinson
- Physical and Theoretical Chemistry Laboratory, Department of Chemistry, University of Oxford, South Parks Road, Oxford OX1 3QZ, UK
- Sarah A Teichmann
- Theory of Condensed Matter Group, Cavendish Laboratory, University of Cambridge, JJ Thomson Avenue, Cambridge CB3 0HE, UK. European Molecular Biology Laboratory-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK. Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.
21
Ahmad TA, Eweida AE, Sheweita SA. B-cell epitope mapping for the design of vaccines and effective diagnostics. Trials in Vaccinology 2016. [DOI: 10.1016/j.trivac.2016.04.003] [Citation(s) in RCA: 75] [Impact Index Per Article: 9.4] [Indexed: 12/23/2022]
22
Abstract
We report the performance of our approaches for protein-protein docking and interface analysis in CAPRI rounds 20-26. At the core of our pipeline was the ZDOCK program for rigid-body protein-protein docking. We then reranked the ZDOCK predictions using the ZRANK or IRAD scoring functions, pruned and analyzed energy landscapes using clustering, and analyzed the docking results using our interface prediction approach RCF. When possible, we used biological information from the literature to apply constraints to the search space during or after the ZDOCK runs. For approximately half of the standard docking challenges we made at least one prediction that was acceptable or better. For the scoring challenges we made acceptable or better predictions for all but one target. This indicates that our scoring functions are generally able to select the correct binding mode.
Affiliation(s)
- Thom Vreven
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, Massachusetts, 01605
23
Bourquard T, Landomiel F, Reiter E, Crépieux P, Ritchie DW, Azé J, Poupon A. Unraveling the molecular architecture of a G protein-coupled receptor/β-arrestin/Erk module complex. Sci Rep 2015; 5:10760. [PMID: 26030356 PMCID: PMC4649906 DOI: 10.1038/srep10760] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.9] [Received: 09/15/2014] [Accepted: 01/26/2015] [Indexed: 12/22/2022] Open
Abstract
β-arrestins serve as signaling scaffolds downstream of G protein-coupled receptors, and thus play a crucial role in a plethora of cellular processes. Although it is largely accepted that the ability of β-arrestins to interact simultaneously with many protein partners is key in G protein-independent signaling of GPCRs, only the precise knowledge of these multimeric arrangements will allow a full understanding of the dynamics of these interactions and their functional consequences. However, current experimental procedures for the determination of the three-dimensional structures of protein-protein complexes are not well adapted to analyze these short-lived, multi-component assemblies. We propose a model of the receptor/β-arrestin/Erk1 signaling module, which is consistent with most of the available experimental data. Moreover, for the β-arrestin/Raf1 and the β-arrestin/ERK interactions, we have used the model to design interfering peptides and shown that they compete with both partners, hereby demonstrating the validity of the predicted interaction regions.
Affiliation(s)
- Thomas Bourquard
- [1] BIOS group, INRA, UMR85, Unité Physiologie de la Reproduction et des Comportements, F-37380 Nouzilly, France; CNRS, UMR7247, F-37380 Nouzilly, France; Université François Rabelais, 37041 Tours, France; IFCE, Nouzilly, F-37380 France [2] INRIA Nancy, 615 Rue du Jardin Botanique, Villers-lès-Nancy, 54600 France
- Flavie Landomiel
- BIOS group, INRA, UMR85, Unité Physiologie de la Reproduction et des Comportements, F-37380 Nouzilly, France; CNRS, UMR7247, F-37380 Nouzilly, France; Université François Rabelais, 37041 Tours, France; IFCE, Nouzilly, F-37380 France
- Eric Reiter
- BIOS group, INRA, UMR85, Unité Physiologie de la Reproduction et des Comportements, F-37380 Nouzilly, France; CNRS, UMR7247, F-37380 Nouzilly, France; Université François Rabelais, 37041 Tours, France; IFCE, Nouzilly, F-37380 France
- Pascale Crépieux
- BIOS group, INRA, UMR85, Unité Physiologie de la Reproduction et des Comportements, F-37380 Nouzilly, France; CNRS, UMR7247, F-37380 Nouzilly, France; Université François Rabelais, 37041 Tours, France; IFCE, Nouzilly, F-37380 France
- David W Ritchie
- INRIA Nancy, 615 Rue du Jardin Botanique, Villers-lès-Nancy, 54600 France
- Jérôme Azé
- Bioinformatics group - AMIB INRIA - Laboratoire de Recherche en Informatique, Université Paris-Sud, Orsay, 91405 France
- Anne Poupon
- BIOS group, INRA, UMR85, Unité Physiologie de la Reproduction et des Comportements, F-37380 Nouzilly, France; CNRS, UMR7247, F-37380 Nouzilly, France; Université François Rabelais, 37041 Tours, France; IFCE, Nouzilly, F-37380 France
24
Amir N, Cohen D, Wolfson HJ. DockStar: a novel ILP-based integrative method for structural modeling of multimolecular protein complexes. Bioinformatics 2015; 31:2801-7. [PMID: 25913207 DOI: 10.1093/bioinformatics/btv270] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 01/29/2015] [Accepted: 04/19/2015] [Indexed: 12/24/2022] Open
Abstract
MOTIVATION Atomic resolution modeling of large multimolecular assemblies is a key task in Structural Cell Biology. Experimental techniques can provide atomic resolution structures of single proteins and small complexes, or low resolution data of large multimolecular complexes. RESULTS We present a novel integrative computational modeling method, which integrates both low and high resolution experimental data. The algorithm accepts as input atomic resolution structures of the individual subunits obtained from X-ray, NMR or homology modeling, and interaction data between the subunits obtained from mass spectrometry. The optimal assembly of the individual subunits is formulated as an Integer Linear Programming task. The method was tested on several representative complexes, both in the bound and unbound cases. It placed correctly most of the subunits of multimolecular complexes of up to 16 subunits and significantly outperformed the CombDock and Haddock multimolecular docking methods. AVAILABILITY AND IMPLEMENTATION http://bioinfo3d.cs.tau.ac.il/DockStar CONTACT naamaamir@mail.tau.ac.il or wolfson@tau.ac.il SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Naama Amir
- Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel
- Dan Cohen
- Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel
- Haim J Wolfson
- Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel
25
Boyer B, Ezelin J, Poulain P, Saladin A, Zacharias M, Robert CH, Prévost C. An integrative approach to the study of filamentous oligomeric assemblies, with application to RecA. PLoS One 2015; 10:e0116414. [PMID: 25785454 PMCID: PMC4364692 DOI: 10.1371/journal.pone.0116414] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 06/19/2014] [Accepted: 12/09/2014] [Indexed: 11/19/2022] Open
Abstract
Oligomeric macromolecules in the cell self-organize into a wide variety of geometrical motifs such as helices, rings or linear filaments. The recombinase proteins involved in homologous recombination present many such assembly motifs. Here, we examine in particular the polymorphic characteristics of RecA, the most studied member of the recombinase family, using an integrative approach that relates local modes of monomer/monomer association to the global architecture of their screw-type organization. In our approach, local modes of association are sampled via docking or Monte Carlo simulations. This enables shedding new light on fiber morphologies that may be adopted by the RecA protein. Two distinct RecA helical morphologies, the so-called "extended" and "compressed" forms, are known to play a role in homologous recombination. We investigate the variability within each form in terms of helical parameters and steric accessibility. We also address possible helical discontinuities in RecA filaments due to multiple monomer-monomer association modes. By relating local interface organization to global filament morphology, the strategies developed here to study RecA self-assembly are particularly well suited to other DNA-binding proteins and to filamentous protein assemblies in general.
Affiliation(s)
- Benjamin Boyer
- Laboratoire de Biochimie Théorique, CNRS, UPR 9080, Univ Paris Diderot, Sorbonne Paris Cité, 13 rue Pierre et Marie Curie, 75005 Paris, France
- MTI, INSERM UMR-M 973, Université Paris Diderot-Paris 7, Bât Lamarck, 35 rue Hélène Brion, 75205 Paris Cedex 13, France
- Johann Ezelin
- Laboratoire de Biochimie Théorique, CNRS, UPR 9080, Univ Paris Diderot, Sorbonne Paris Cité, 13 rue Pierre et Marie Curie, 75005 Paris, France
- Pierre Poulain
- DSIMB team, Inserm UMR-S 665 and Univ. Paris Diderot, Sorbonne Paris Cité, INTS, 6 rue Alexandre Cabanel, 75015 Paris, France
- Ets Poulain, Pointe-Noire, Republic of Congo
- Adrien Saladin
- MTI, INSERM UMR-M 973, Université Paris Diderot-Paris 7, Bât Lamarck, 35 rue Hélène Brion, 75205 Paris Cedex 13, France
- Martin Zacharias
- Technische Universität München, Physik-Department, James-Franck-Str. 1, 85748 Garching, Germany
- Charles H. Robert
- Laboratoire de Biochimie Théorique, CNRS, UPR 9080, Univ Paris Diderot, Sorbonne Paris Cité, 13 rue Pierre et Marie Curie, 75005 Paris, France
- Chantal Prévost
- Laboratoire de Biochimie Théorique, CNRS, UPR 9080, Univ Paris Diderot, Sorbonne Paris Cité, 13 rue Pierre et Marie Curie, 75005 Paris, France
26
Abstract
The assembly of individual proteins into functional complexes is fundamental to nearly all biological processes. In recent decades, many thousands of homomeric and heteromeric protein complex structures have been determined, greatly improving our understanding of the fundamental principles that control symmetric and asymmetric quaternary structure organization. Furthermore, our conception of protein complexes has moved beyond static representations to include dynamic aspects of quaternary structure, including conformational changes upon binding, multistep ordered assembly pathways, and structural fluctuations occurring within fully assembled complexes. Finally, major advances have been made in our understanding of protein complex evolution, both in reconstructing evolutionary histories of specific complexes and in elucidating general mechanisms that explain how quaternary structure tends to evolve. The evolution of quaternary structure occurs via changes in self-assembly state or through the gain or loss of protein subunits, and these processes can be driven by both adaptive and nonadaptive influences.
Affiliation(s)
- Joseph A Marsh
- Medical Research Council Human Genetics Unit, Institute of Genetics and Molecular Medicine, University of Edinburgh, Western General Hospital, Edinburgh EH4 2XU, United Kingdom;
27
Kuzu G, Keskin O, Nussinov R, Gursoy A. Modeling protein assemblies in the proteome. Mol Cell Proteomics 2014; 13:887-96. [PMID: 24445405 PMCID: PMC3945916 DOI: 10.1074/mcp.m113.031294] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Received: 05/24/2013] [Revised: 12/13/2013] [Indexed: 11/06/2022] Open
Abstract
Most (if not all) proteins function when associated in multimolecular assemblies. Attaining the structures of protein assemblies at the atomic scale is an important aim of structural biology. Experimentally, structures are increasingly available, and computations can help bridge the resolution gap between high- and low-resolution scales. Existing computational methods have made substantial progress toward this aim; however, current approaches are still limited. Some involve manual adjustment of experimental data; some are automated docking methods, which are computationally expensive and not applicable to large-scale proteome studies; and still others exploit the symmetry of the complexes and thus are not applicable to nonsymmetrical complexes. Our study aims to take steps toward overcoming these limitations. We have developed a strategy for the construction of protein assemblies computationally based on binary interactions predicted by a motif-based protein interaction prediction tool, PRISM (Protein Interactions by Structural Matching). Previously, we have shown its power in predicting pairwise interactions. Here we take a step toward multimolecular assemblies, reflecting the more prevalent cellular scenarios. With this method we are able to construct homo-/hetero-complexes and symmetric/asymmetric complexes without a limitation on the number of components. The method considers conformational changes and is applicable to large-scale studies. We also exploit electron microscopy density maps to select a solution from among the predictions. Here we present the method, illustrate its results, and highlight its current limitations.
Affiliation(s)
- Guray Kuzu
- Center for Computational Biology and Bioinformatics and College of Engineering, Koc University, Rumelifeneri Yolu, 34450 Sariyer Istanbul, Turkey
- Ozlem Keskin
- Center for Computational Biology and Bioinformatics and College of Engineering, Koc University, Rumelifeneri Yolu, 34450 Sariyer Istanbul, Turkey
- Ruth Nussinov
- Cancer and Inflammation Program, Leidos Biomedical Research, Inc., National Cancer Institute, Frederick National Laboratory for Cancer Research, Frederick, Maryland 21702
- Sackler Institute of Molecular Medicine, Department of Human Genetics and Molecular Medicine, Sackler School of Medicine, Tel Aviv University, Tel Aviv 69978, Israel
- Attila Gursoy
- Center for Computational Biology and Bioinformatics and College of Engineering, Koc University, Rumelifeneri Yolu, 34450 Sariyer Istanbul, Turkey
28
Esquivel-Rodriguez J, Filos-Gonzalez V, Li B, Kihara D. Pairwise and multimeric protein-protein docking using the LZerD program suite. Methods Mol Biol 2014; 1137:209-34. [PMID: 24573484 DOI: 10.1007/978-1-4939-0366-5_15] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.7] [Indexed: 12/11/2022]
Abstract
Physical interactions between proteins are involved in many important cell functions and are key for understanding the mechanisms of biological processes. Protein-protein docking programs provide a means to computationally construct three-dimensional (3D) models of a protein complex structure from its component protein units. A protein docking program takes two or more individual 3D protein structures, which are either experimentally solved or computationally modeled, and outputs a series of probable complex structures. In this chapter we present the LZerD protein docking suite, which includes programs for pairwise docking, LZerD and PI-LZerD, and multiple protein docking, Multi-LZerD, developed by our group. PI-LZerD takes protein docking interface residues as additional input information. The methods use a combination of shape-based protein surface features as well as physics-based scoring terms to generate protein complex models. The programs are provided as stand-alone programs and can be downloaded from http://kiharalab.org/proteindocking.
29
Popov P, Ritchie DW, Grudinin S. DockTrina: docking triangular protein trimers. Proteins 2013; 82:34-44. [PMID: 23775700 DOI: 10.1002/prot.24344] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Received: 02/13/2013] [Revised: 05/30/2013] [Accepted: 05/31/2013] [Indexed: 11/06/2022]
Abstract
In spite of the abundance of oligomeric proteins within a cell, the structural characterization of protein-protein interactions is still a challenging task. In particular, many of these interactions involve heteromeric complexes, which are relatively difficult to determine experimentally. Hence there is growing interest in using computational techniques to model such complexes. However, assembling large heteromeric complexes computationally is a highly combinatorial problem. Nonetheless the problem can be simplified greatly by considering interactions between protein trimers. After dimers and monomers, triangular trimers (i.e. trimers with pair-wise contacts between all three pairs of proteins) are the most frequently observed quaternary structural motifs according to the three-dimensional (3D) complex database. This article presents DockTrina, a novel protein docking method for modeling the 3D structures of nonsymmetrical triangular trimers. The method takes as input pair-wise contact predictions from a rigid body docking program. It then scans and scores all possible combinations of pairs of monomers using a very fast root mean square deviation test. Finally, it ranks the predictions using a scoring function which combines triples of pair-wise contact terms and a geometric clash penalty term. The overall approach takes less than 2 min per complex on a modern desktop computer. The method is tested and validated using a benchmark set of 220 bound and seven unbound protein trimer structures. DockTrina will be made available at http://nano-d.inrialpes.fr/software/docktrina.
Affiliation(s)
- Petr Popov
- NANO-D, INRIA Grenoble-Rhone-Alpes, 38334 Saint Ismier Cedex, Montbonnot, France; Laboratoire Jean Kuntzmann, B.P. 53, 38041 Grenoble Cedex 9, France
30
Basin Hopping as a General and Versatile Optimization Framework for the Characterization of Biological Macromolecules. 2012. [DOI: 10.1155/2012/674832] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.7] [Indexed: 12/19/2022]
Abstract
Since its introduction, the basin hopping (BH) framework has proven useful for hard nonlinear optimization problems with multiple variables and modalities. Applications span a wide range, from packing problems in geometry to characterization of molecular states in statistical physics. BH is seeing a reemergence in computational structural biology due to its ability to obtain a coarse-grained representation of the protein energy surface in terms of local minima. In this paper, we show that the BH framework is general and versatile, allowing to address problems related to the characterization of protein structure, assembly, and motion due to its fundamental ability to sample minima in a high-dimensional variable space. We show how specific implementations of the main components in BH yield algorithmic realizations that attain state-of-the-art results in the context of ab initio protein structure prediction and rigid protein-protein docking. We also show that BH can map intermediate minima related with motions connecting diverse stable functionally relevant states in a protein molecule, thus serving as a first step towards the characterization of transition trajectories connecting these states.
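The basin hopping loop described in this abstract (perturb, locally minimize, accept or reject) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the one-dimensional test function `rugged`, the derivative-free `local_minimize` helper, the uniform perturbation, and the Metropolis acceptance rule with a fixed `temperature`.

```python
import math
import random


def local_minimize(f, x, step=0.5, tol=1e-6, max_iter=100000):
    """Derivative-free descent: move to a lower neighbour, else halve the step."""
    fx = f(x)
    for _ in range(max_iter):
        if step <= tol:
            break
        for cand in (x - step, x + step):
            fc = f(cand)
            if fc < fx:
                x, fx = cand, fc
                break
        else:
            # Neither neighbour improved, so refine the step size.
            step /= 2.0
    return x, fx


def basin_hopping(f, x0, n_hops=200, jump=2.0, temperature=1.0, seed=0):
    """Hop between local minima; return the lowest minimum that was visited."""
    rng = random.Random(seed)
    x, fx = local_minimize(f, x0)
    best_x, best_f = x, fx
    for _ in range(n_hops):
        # Perturb the current minimum, then relax into the nearby basin.
        trial = x + rng.uniform(-jump, jump)
        tx, tf = local_minimize(f, trial)
        # Metropolis criterion applied to the energies of the two minima.
        if tf < fx or rng.random() < math.exp(-(tf - fx) / temperature):
            x, fx = tx, tf
            if fx < best_f:
                best_x, best_f = x, fx
    return best_x, best_f


def rugged(x):
    """Hypothetical multi-minimum test surface (not a protein energy function)."""
    return 0.1 * x * x + math.sin(3.0 * x)


if __name__ == "__main__":
    print(basin_hopping(rugged, x0=4.0))
```

Because only minima are compared in the acceptance step, the search effectively walks over a coarse-grained version of the surface, which is the property the abstract highlights.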
31
Venkatraman V, Ritchie DW. Predicting Multi-Component Protein Assemblies Using an Ant Colony Approach. International Journal of Swarm Intelligence Research 2012. [DOI: 10.4018/jsir.2012070102] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/09/2022]
Abstract
Many biological processes are governed by large assemblies of protein molecules. However, it is often very difficult to determine the three-dimensional structures of these assemblies using experimental biophysical techniques. Hence there is a need to develop computational approaches to fill this gap. This article presents an ant colony optimization approach to predict the structure of large multi-component protein complexes. Starting from pair-wise docking predictions, a multi-graph consisting of vertices representing the component proteins and edges representing candidate interactions is constructed. This allows the assembly problem to be expressed in terms of searching for a minimum weight spanning tree. However, because the problem remains highly combinatorial, the search space cannot be enumerated exhaustively and therefore heuristic optimisation techniques must be used. The utility of the ant colony based approach is demonstrated by re-assembling known protein complexes from the Protein Data Bank. The algorithm is able to identify near-native solutions for five of the six cases tested. This demonstrates that the ant colony approach provides a useful way to deal with the highly combinatorial multi-component protein assembly problem.
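The graph formulation in this abstract can be illustrated with a small sketch of a minimum weight spanning tree over a multi-graph of candidate docking poses. This is not the ant colony method of the paper: it is plain Kruskal's algorithm with a union-find structure, and it ignores steric clashes between poses chosen for different pairs, which is one reason a simple greedy tree does not solve the real assembly problem. The edge tuples, scores, and pose labels below are hypothetical.

```python
def minimum_spanning_tree(n_vertices, edges):
    """Kruskal's algorithm on a multi-graph.

    edges: iterable of (weight, u, v, pose_label); several edges may join
    the same pair of vertices, one per candidate pairwise docking pose.
    Returns the chosen edges, lowest weight first.
    """
    parent = list(range(n_vertices))

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    tree = []
    for weight, u, v, pose in sorted(edges):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            # The edge joins two separate components, so keep it.
            parent[root_u] = root_v
            tree.append((weight, u, v, pose))
    return tree


# Hypothetical docking scores (lower is better) for proteins 0..3.
candidate_edges = [
    (-12.0, 0, 1, "A-B pose 1"),
    (-9.5, 0, 1, "A-B pose 2"),
    (-7.0, 1, 2, "B-C pose 1"),
    (-11.0, 0, 2, "A-C pose 1"),
    (-3.0, 2, 3, "C-D pose 1"),
]

if __name__ == "__main__":
    for edge in minimum_spanning_tree(4, candidate_edges):
        print(edge)
```

A spanning tree over N proteins fixes N-1 pairwise poses, which is enough to place every subunit relative to the others; checking the placed subunits for clashes would be a separate step.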
32
Hashmi I, Akbal-Delibas B, Haspel N, Shehu A. Guiding protein docking with geometric and evolutionary information. J Bioinform Comput Biol 2012; 10:1242008. [DOI: 10.1142/s0219720012420085] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 11/18/2022]
Abstract
Structural modeling of molecular assemblies promises to improve our understanding of molecular interactions and biological function. Even when focusing on modeling structures of protein dimers from knowledge of monomeric native structure, docking two rigid structures onto one another entails exploring a large configurational space. This paper presents a novel approach for docking protein molecules and elucidating native-like configurations of protein dimers. The approach makes use of geometric hashing to focus the docking of monomeric units on geometrically complementary regions through rigid-body transformations. This geometry-based approach improves the feasibility of searching the combined configurational space. The search space is narrowed even further by focusing the sought rigid-body transformations around molecular surface regions composed of amino acids with high evolutionary conservation. This condition is based on recent findings, where analysis of protein assemblies reveals that many functional interfaces are significantly conserved throughout evolution. Different search procedures are employed in this work to search the resulting narrowed configurational space. A proof-of-concept energy-guided probabilistic search procedure is also presented. Results are shown on a broad list of 18 protein dimers and additionally compared with data reported by other labs. Our analysis shows that focusing the search around evolutionary-conserved interfaces results in lower lRMSDs.
Affiliation(s)
- Irina Hashmi
- Department of Computer Science, George Mason University, Fairfax, VA, 22030, USA
- Bahar Akbal-Delibas
- Department of Computer Science, University of Massachusetts at Boston, Boston, MA, 02125, USA
- Nurit Haspel
- Department of Computer Science, University of Massachusetts at Boston, Boston, MA, 02125, USA
- Amarda Shehu
- Department of Computer Science, George Mason University, Fairfax, VA, 22030, USA
- Department of Bioinformatics and Computational Biology, George Mason University, Fairfax, VA, 22030, USA
- Department of Bioengineering, George Mason University, Fairfax, VA, 22030, USA
|
33
|
Esquivel-Rodríguez J, Yang YD, Kihara D. Multi-LZerD: multiple protein docking for asymmetric complexes. Proteins 2012; 80:1818-33. [PMID: 22488467 DOI: 10.1002/prot.24079] [Citation(s) in RCA: 64] [Impact Index Per Article: 5.3] [Received: 01/08/2012] [Revised: 03/08/2012] [Accepted: 03/23/2012] [Indexed: 11/06/2022]
Abstract
The tertiary structures of protein complexes provide a crucial insight about the molecular mechanisms that regulate their functions and assembly. However, solving protein complex structures by experimental methods is often more difficult than single protein structures. Here, we have developed a novel computational multiple protein docking algorithm, Multi-LZerD, that builds models of multimeric complexes by effectively reusing pairwise docking predictions of component proteins. A genetic algorithm is applied to explore the conformational space followed by a structure refinement procedure. Benchmark on eleven hetero-multimeric complexes resulted in near-native conformations for all but one of them (a root mean square deviation smaller than 2.5Å). We also show that our method copes with unbound docking cases well, outperforming the methodology that can be directly compared with our approach. Multi-LZerD was able to predict near-native structures for multimeric complexes of various topologies.
Affiliation(s)
- Juan Esquivel-Rodríguez
- Department of Computer Science, College of Science, Purdue University, West Lafayette, Indiana 47907, USA
|
34
|
Esquivel-Rodríguez J, Kihara D. Evaluation of multiple protein docking structures using correctly predicted pairwise subunits. BMC Bioinformatics 2012; 13 Suppl 2:S6. [PMID: 22536869 PMCID: PMC3377905 DOI: 10.1186/1471-2105-13-s2-s6] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 01/06/2023] Open
Abstract
Background: Many functionally important proteins in a cell form complexes with multiple chains. Therefore, computational prediction of multiple protein complexes is an important task in bioinformatics. In the development of multiple protein docking methods, it is important to establish a metric for evaluating prediction results in a reasonable and practical fashion. However, since there are only few works done in developing methods for multiple protein docking, there is no study that investigates how accurate structural models of multiple protein complexes should be to allow scientists to gain biological insights. Methods: We generated a series of predicted models (decoys) of various accuracies by our multiple protein docking pipeline, Multi-LZerD, for three multi-chain complexes with 3, 4, and 6 chains. We analyzed the decoys in terms of the number of correctly predicted pair conformations in the decoys. Results and conclusion: We found that pairs of chains with the correct mutual orientation exist even in the decoys with a large overall root mean square deviation (RMSD) to the native. Therefore, in addition to a global structure similarity measure, such as the global RMSD, the quality of models for multiple chain complexes can be better evaluated by using the local measurement, the number of chain pairs with correct mutual orientation. We termed the fraction of correctly predicted pairs (RMSD at the interface of less than 4.0Å) as fpair and propose to use it for evaluation of the accuracy of multiple protein docking.
Affiliation(s)
- Juan Esquivel-Rodríguez
- Department of Computer Science, College of Science, Purdue University, West Lafayette, IN 47907, USA
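The fpair measure defined in the abstract above (fraction of chain pairs with interface RMSD below 4.0 Å) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name and the dictionary input are hypothetical, and the sketch assumes the caller has already computed an interface RMSD for every chain pair that should be counted.

```python
def fpair(interface_rmsd, cutoff=4.0):
    """Fraction of chain pairs predicted with the correct mutual orientation.

    interface_rmsd: dict mapping a chain pair, e.g. ("A", "B"), to its
    interface RMSD in angstroms relative to the native structure.
    A pair counts as correct when its interface RMSD is below `cutoff`.
    """
    if not interface_rmsd:
        raise ValueError("at least one chain pair is required")
    correct = sum(1 for rmsd in interface_rmsd.values() if rmsd < cutoff)
    return correct / len(interface_rmsd)


# Hypothetical decoy of a three-chain complex: two of three pairs are correct.
decoy = {("A", "B"): 1.2, ("A", "C"): 6.5, ("B", "C"): 3.9}
score = fpair(decoy)
```

A decoy with a poor global RMSD can still score well here if most of its pairwise interfaces are placed correctly, which is the point the abstract makes.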
|
35
|
Clarke D, Bhardwaj N, Gerstein MB. Novel insights through the integration of structural and functional genomics data with protein networks. J Struct Biol 2012; 179:320-6. [PMID: 22343087 DOI: 10.1016/j.jsb.2012.02.001] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Received: 06/21/2011] [Revised: 02/02/2012] [Accepted: 02/02/2012] [Indexed: 12/13/2022]
Abstract
In recent years, major advances in genomics, proteomics, macromolecular structure determination, and the computational resources capable of processing and disseminating the large volumes of data generated by each have played major roles in advancing a more systems-oriented appreciation of biological organization. One product of systems biology has been the delineation of graph models for describing genome-wide protein-protein interaction networks. The network organization and topology which emerges in such models may be used to address fundamental questions in an array of cellular processes, as well as biological features intrinsic to the constituent proteins (or "nodes") themselves. However, graph models alone constitute an abstraction which neglects the underlying biological and physical reality that the network's nodes and edges are highly heterogeneous entities. Here, we explore some of the advantages of introducing a protein structural dimension to such models, as the marriage of conventional network representations with macromolecular structural data helps to place static node and edge constructs in a biologically more meaningful context. We emphasize that 3D protein structures constitute a valuable conceptual and predictive framework by discussing examples of the insights provided, such as enabling in silico predictions of protein-protein interactions, providing rational and compelling classification schemes for network elements, as well as revealing interesting intrinsic differences between distinct node types, such as disorder and evolutionary features, which may then be rationalized in light of their respective functions within networks.
Affiliation(s)
- Declan Clarke
- Department of Chemistry, Yale University, New Haven, CT 06520, USA
|
36
|
Melquiond AS, Karaca E, Kastritis PL, Bonvin AM. Next challenges in protein-protein docking: from proteome to interactome and beyond. Wiley Interdiscip Rev Comput Mol Sci 2011. [DOI: 10.1002/wcms.91] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.7] [Indexed: 11/11/2022]
|
37
|
Mashiach-Farkash E, Nussinov R, Wolfson HJ. SymmRef: a flexible refinement method for symmetric multimers. Proteins 2011; 79:2607-23. [PMID: 21721046 PMCID: PMC3155011 DOI: 10.1002/prot.23082] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.2] [Received: 12/13/2010] [Revised: 05/02/2011] [Accepted: 05/04/2011] [Indexed: 11/11/2022]
Abstract
Symmetric protein complexes are abundant in the living cell. Predicting their atomic structure can shed light on the mechanism of many important biological processes. Symmetric docking methods aim to predict the structure of these complexes given the unbound structure of a single monomer, or its model. Symmetry constraints reduce the search-space of these methods and make the prediction easier compared to asymmetric protein-protein docking. However, the challenge of modeling the conformational changes that the monomer might undergo is a major obstacle. In this article, we present SymmRef, a novel method for refinement and reranking of symmetric docking solutions. The method models backbone and side-chain movements and optimizes the rigid-body orientations of the monomers. The backbone movements are modeled by normal modes minimization and the conformations of the side-chains are modeled by selecting optimal rotamers. Since solved structures of symmetric multimers show asymmetric side-chain conformations, we do not use symmetry constraints in the side-chain optimization procedure. The refined models are re-ranked according to an energy score. We tested the method on a benchmark of unbound docking challenges. The results show that the method significantly improves the accuracy and the ranking of symmetric rigid docking solutions. SymmRef is available for download at http://bioinfo3d.cs.tau.ac.il/SymmRef/download.html.
Affiliation(s)
- Efrat Mashiach-Farkash
- Blavatnik School of Computer Science, Raymond and Beverly Sackler Faculty of Exact Sciences, Tel Aviv University, Tel Aviv 69978, Israel
- Ruth Nussinov
- Basic Research Program, SAIC-Frederick, Inc., Center for Cancer Research Nanobiology Program, NCI - Frederick, Frederick, MD 21702, USA
- Department of Human Genetics and Molecular Medicine, Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv 69978, Israel
- Haim J. Wolfson
- Blavatnik School of Computer Science, Raymond and Beverly Sackler Faculty of Exact Sciences, Tel Aviv University, Tel Aviv 69978, Israel
|
38
|
Lasker K, Sali A, Wolfson HJ. Determining macromolecular assembly structures by molecular docking and fitting into an electron density map. Proteins 2011; 78:3205-11. [PMID: 20827723 DOI: 10.1002/prot.22845] [Citation(s) in RCA: 60] [Impact Index Per Article: 4.6] [Indexed: 12/12/2022]
Abstract
Structural models of macromolecular assemblies are instrumental for gaining a mechanistic understanding of cellular processes. Determining these structures is a major challenge for experimental techniques, such as X-ray crystallography, NMR spectroscopy and electron microscopy (EM). Thus, computational modeling techniques, including molecular docking, are required. The development of most molecular docking methods has so far been focused on modeling of binary complexes. We have recently introduced the MultiFit method for modeling the structure of a multisubunit complex by simultaneously optimizing the fit of the model into an EM density map of the entire complex and the shape complementarity between interacting subunits. Here, we report algorithmic advances of the MultiFit method that result in an efficient and accurate assembly of the input subunits into their density map. The successful predictions and the increasing number of complexes being characterized by EM suggests that the CAPRI challenge could be extended to include docking-based modeling of macromolecular assemblies guided by EM.
Affiliation(s)
- Keren Lasker
- Raymond and Beverly Sackler Faculty of Exact Sciences, Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv 69978, Israel
|
39
|
Tuukkanen A, Huang B, Henschel A, Stewart F, Schroeder M. Structural modeling of histone methyltransferase complex Set1C from Saccharomyces cerevisiae using constraint-based docking. Proteomics 2011; 10:4186-95. [PMID: 21046623 DOI: 10.1002/pmic.201000283] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 01/20/2023]
Abstract
Set1C is a histone methyltransferase playing an important role in yeast gene regulation. Modeling the structure of this eight-subunit protein complex is an important open problem to further elucidate its functional mechanism. Recently, there has been progress in modeling of larger complexes using constraints to restrict the combinatorial explosion in binary docking of subunits. Here, we model the subunits of Set1C and develop a constraint-based docking approach, which uses high-quality protein interaction as well as functional data to guide and constrain the combinatorial assembly procedure. We obtained 22 final models. The core complex consisting of the subunits Set1, Bre2, Sdc1 and Swd2 is conformationally conserved in over half of the models, thus, giving high confidence. We characterize these high-confidence and the lower confidence interfaces and discuss implications for the function of Set1C.
Affiliation(s)
- Anne Tuukkanen
- Biotechnology Center (BIOTEC), Technische Universität Dresden, Dresden, Germany
|
40
|
de Vries SJ, Melquiond ASJ, Kastritis PL, Karaca E, Bordogna A, van Dijk M, Rodrigues JPGLM, Bonvin AMJJ. Strengths and weaknesses of data-driven docking in critical assessment of prediction of interactions. Proteins 2011; 78:3242-9. [PMID: 20718048 DOI: 10.1002/prot.22814] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.6] [Indexed: 11/08/2022]
Abstract
The recent CAPRI rounds have introduced new docking challenges in the form of protein-RNA complexes, multiple alternative interfaces, and an unprecedented number of targets for which homology modeling was required. We present here the performance of HADDOCK and its web server in the CAPRI experiment and discuss the strengths and weaknesses of data-driven docking. HADDOCK was successful for 6 out of 9 complexes (6 out of 11 targets) and accurately predicted the individual interfaces for two more complexes. The HADDOCK server, which is the first allowing the simultaneous docking of generic multi-body complexes, was successful in 4 out of 7 complexes for which it participated. In the scoring experiment, we predicted the highest number of targets of any group. The main weakness of data-driven docking revealed from these last CAPRI results is its vulnerability for incorrect experimental data related to the interface or the stoichiometry of the complex. At the same time, the use of experimental and/or predicted information is also the strength of our approach as evidenced for those targets for which accurate experimental information was available (e.g., the 10 three-stars predictions for T40!). Even when the models show a wrong orientation, the individual interfaces are generally well predicted with an average coverage of 60% ± 26% over all targets. This makes data-driven docking particularly valuable in a biological context to guide experimental studies like, for example, targeted mutagenesis.
Affiliation(s)
- Sjoerd J de Vries
- NMR Research Group, Bijvoet Center for Biomolecular Research, Utrecht University, 3584 CH Utrecht, The Netherlands
|
41
|
Bastard K, Saladin A, Prévost C. Accounting for large amplitude protein deformation during in silico macromolecular docking. Int J Mol Sci 2011; 12:1316-33. [PMID: 21541061 PMCID: PMC3083708 DOI: 10.3390/ijms12021316] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 12/10/2010] [Revised: 01/07/2011] [Accepted: 02/08/2011] [Indexed: 12/23/2022] Open
Abstract
Rapid progress of theoretical methods and computer calculation resources has turned in silico methods into a conceivable tool to predict the 3D structure of macromolecular assemblages, starting from the structure of their separate elements. Still, some classes of complexes represent a real challenge for macromolecular docking methods. In these complexes, protein parts like loops or domains undergo large amplitude deformations upon association, thus remodeling the surface accessible to the partner protein or DNA. We discuss the problems linked with managing such rearrangements in docking methods and we review strategies that are presently being explored, as well as their limitations and success.
Affiliation(s)
- Karine Bastard
- LABIS, Genoscope, CEA, 2 rue Gaston Cremieux, F-91057 Evry Cedex, France
- Adrien Saladin
- MTI, INSERM UMR-M 973, Paris Diderot-Paris 7 University, Bât Lamarck, 35 rue Hélène Brion, F-75205 Paris Cedex 13, France
- Chantal Prévost
- LBT-UPR 9080 CNRS, IBPC, 13 rue Pierre et Marie Curie, F-75005 Paris, France
- Author to whom correspondence should be addressed; Tel.: +33-(0)1 58 41 51 71, Fax: +33-(0)1 58 415 026
|
42
|
Stein A, Mosca R, Aloy P. Three-dimensional modeling of protein interactions and complexes is going 'omics. Curr Opin Struct Biol 2011; 21:200-8. [PMID: 21320770 DOI: 10.1016/j.sbi.2011.01.005] [Citation(s) in RCA: 68] [Impact Index Per Article: 5.2] [Received: 11/10/2010] [Revised: 01/11/2011] [Accepted: 01/13/2011] [Indexed: 10/18/2022]
Abstract
High-throughput interaction discovery initiatives have revealed the existence of hundreds of multiprotein complexes whose functions are regulated through thousands of protein-protein interactions (PPIs). However, the structural details of these interactions, often necessary to understand their function, are only available for a tiny fraction, and the experimental difficulties surrounding complex structure determination make computational modeling techniques paramount. In this manuscript, we critically review some of the most recent developments in the field of structural bioinformatics applied to the modeling of protein interactions and complexes, from large macromolecular machines to domain-domain and peptide-mediated interactions. In particular, we place a special emphasis on those methods that can be applied in a proteome-wide manner, and discuss how they will help in the ultimate objective of building 3D interactome networks.
Affiliation(s)
- Amelie Stein
- Institute for Research in Biomedicine (IRB Barcelona), Joint IRB-BSC Program in Computational Biology, c/Baldiri i Reixac 10-12, 08028 Barcelona, Spain
|
43
|
Macromolecular docking restrained by a small angle X-ray scattering profile. J Struct Biol 2010; 173:461-71. [PMID: 20920583 DOI: 10.1016/j.jsb.2010.09.023] [Citation(s) in RCA: 86] [Impact Index Per Article: 6.1] [Received: 06/03/2010] [Accepted: 09/26/2010] [Indexed: 11/24/2022]
Abstract
While many structures of single protein components are becoming available, structural characterization of their complexes remains challenging. Methods for modeling assembly structures from individual components frequently suffer from large errors, due to protein flexibility and inaccurate scoring functions. However, when additional information is available, it may be possible to reduce the errors and compute near-native complex structures. One such type of information is a small angle X-ray scattering (SAXS) profile that can be collected in a high-throughput fashion from a small amount of sample in solution. Here, we present an efficient method for protein-protein docking with a SAXS profile (FoXSDock): generation of complex models by rigid global docking with PatchDock, filtering of the models based on the SAXS profile, clustering of the models, and refining the interface by flexible docking with FireDock. FoXSDock is benchmarked on 124 protein complexes with simulated SAXS profiles, as well as on 6 complexes with experimentally determined SAXS profiles. When induced fit is less than 1.5Å interface C(α) RMSD and the fraction of residues missing from the component structures is less than 3%, FoXSDock can find a model close to the native structure within the top 10 predictions in 77% of the cases; in comparison, docking alone succeeds in only 34% of the cases. Thus, the integrative approach significantly improves on molecular docking alone. The improvement arises from an increased resolution of rigid docking sampling and more accurate scoring.
|
44
|
Lasker K, Phillips JL, Russel D, Velázquez-Muriel J, Schneidman-Duhovny D, Tjioe E, Webb B, Schlessinger A, Sali A. Integrative structure modeling of macromolecular assemblies from proteomics data. Mol Cell Proteomics 2010; 9:1689-702. [PMID: 20507923 DOI: 10.1074/mcp.r110.000067] [Citation(s) in RCA: 59] [Impact Index Per Article: 4.2] [Indexed: 11/06/2022] Open
Abstract
Proteomics techniques have been used to generate comprehensive lists of protein interactions in a number of species. However, relatively little is known about how these interactions result in functional multiprotein complexes. This gap can be bridged by combining data from proteomics experiments with data from established structure determination techniques. Correspondingly, integrative computational methods are being developed to provide descriptions of protein complexes at varying levels of accuracy and resolution, ranging from complex compositions to detailed atomic structures.
Affiliation(s)
- Keren Lasker
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, California 94158, USA.
|
45
|
Karaca E, Melquiond ASJ, de Vries SJ, Kastritis PL, Bonvin AMJJ. Building macromolecular assemblies by information-driven docking: introducing the HADDOCK multibody docking server. Mol Cell Proteomics 2010; 9:1784-94. [PMID: 20305088 PMCID: PMC2938057 DOI: 10.1074/mcp.m000051-mcp201] [Citation(s) in RCA: 107] [Impact Index Per Article: 7.6] [Indexed: 11/30/2022] Open
Abstract
Over the last years, large scale proteomics studies have generated a wealth of information of biomolecular complexes. Adding the structural dimension to the resulting interactomes represents a major challenge that classical structural experimental methods alone will have difficulties to confront. To meet this challenge, complementary modeling techniques such as docking are thus needed. Among the current docking methods, HADDOCK (High Ambiguity-Driven DOCKing) distinguishes itself from others by the use of experimental and/or bioinformatics data to drive the modeling process and has shown a strong performance in the critical assessment of prediction of interactions (CAPRI), a blind experiment for the prediction of interactions. Although most docking programs are limited to binary complexes, HADDOCK can deal with multiple molecules (up to six), a capability that will be required to build large macromolecular assemblies. We present here a novel web interface of HADDOCK that allows the user to dock up to six biomolecules simultaneously. This interface allows the inclusion of a large variety of both experimental and/or bioinformatics data and supports several types of cyclic and dihedral symmetries in the docking of multibody assemblies. The server was tested on a benchmark of six cases, containing five symmetric homo-oligomeric protein complexes and one symmetric protein-DNA complex. Our results reveal that, in the presence of either bioinformatics and/or experimental data, HADDOCK shows an excellent performance: in all cases, HADDOCK was able to generate good to high quality solutions and ranked them at the top, demonstrating its ability to model symmetric multicomponent assemblies. Docking methods can thus play an important role in adding the structural dimension to interactomes. 
However, although the current docking methodologies were successful for a vast range of cases, considering the variety and complexity of macromolecular assemblies, inclusion of some kind of experimental information (e.g. from mass spectrometry, nuclear magnetic resonance, cryoelectron microscopy, etc.) will remain highly desirable to obtain reliable results.
Affiliation(s)
- Ezgi Karaca
- Bijvoet Center for Biomolecular Research, Science Faculty, Utrecht University, Utrecht, The Netherlands
|
46
|
Abstract
The Protein Data Bank contains the description of approximately 27 000 protein-ligand binding sites. Most of the ligands at these sites are biologically active small molecules, affecting the biological function of the protein. The classification of their binding sites may lead to relevant results in drug discovery and design. Clusters of similar binding sites were created here by a hybrid, sequence and spatial structure-based approach, using the OPTICS clustering algorithm. A dissimilarity measure was defined: a distance function on the amino acid sequences of the binding sites. All the binding sites were clustered in the Protein Data Bank according to this distance function, and it was found that the clusters characterized well the Enzyme Commission numbers of the entries. The results, carefully color coded by the Enzyme Commission numbers of the proteins, containing the 20 967 binding sites clustered, are available as html files in three parts at http://pitgroup.org/seqclust/.
Affiliation(s)
- Gábor Iván
- Protein Information Technology Group, Department of Computer Science, Eötvös University, Budapest, Hungary
|
47
|
Abstract
The quaternary structure (QS) of a protein is determined by measuring its molecular weight in solution. The data have to be extracted from the literature, and they may be missing even for proteins that have a crystal structure reported in the Protein Data Bank (PDB). The PDB and other databases derived from it report QS information that either was obtained from the depositors or is based on an analysis of the contacts between polypeptide chains in the crystal, and this frequently differs from the QS determined in solution. The QS of a protein can be predicted from its sequence using either homology or threading methods. However, a majority of the proteins with less than 30% sequence identity have different QSs. A model of the QS can also be derived by docking the subunits when their 3D structure is independently known, but the model is likely to be incorrect if large conformation changes take place when the oligomer assembles.
Affiliation(s)
- Anne Poupon
- Yeast Structural Genomics, IBBMC UMR 8619 CNRS, Université Paris-Sud, Orsay, France
|
48
|
Saladin A, Fiorucci S, Poulain P, Prévost C, Zacharias M. PTools: an opensource molecular docking library. BMC Struct Biol 2009; 9:27. [PMID: 19409097 PMCID: PMC2685806 DOI: 10.1186/1472-6807-9-27] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.0] [Received: 12/29/2008] [Accepted: 05/01/2009] [Indexed: 11/18/2022]
Abstract
Background: Macromolecular docking is a challenging field of bioinformatics. Developing new algorithms is a slow process generally involving routine tasks that should be found in a robust library and not programmed from scratch for every new software application. Results: We present an object-oriented Python/C++ library to help the development of new docking methods. This library contains low-level routines like PDB-format manipulation functions as well as high-level tools for docking and analyzing results. We also illustrate the ease of use of this library with the detailed implementation of a 3-body docking procedure. Conclusion: The PTools library can handle molecules at coarse-grained or atomic resolution and allows users to rapidly develop new software. The library is already in use for protein-protein and protein-DNA docking with the ATTRACT program and for simulation analysis. This library is freely available under the GNU GPL license, together with detailed documentation.
Affiliation(s)
- Adrien Saladin
- Computational Biology, School of Engineering and Science, Jacobs University Bremen, 28759 Bremen, Germany.
|
49
|
Lasker K, Topf M, Sali A, Wolfson HJ. Inferential optimization for simultaneous fitting of multiple components into a CryoEM map of their assembly. J Mol Biol 2009; 388:180-94. [PMID: 19233204 DOI: 10.1016/j.jmb.2009.02.031] [Citation(s) in RCA: 96] [Impact Index Per Article: 6.4] [Received: 09/07/2008] [Revised: 12/29/2008] [Accepted: 02/12/2009] [Indexed: 11/24/2022]
Abstract
Models of macromolecular assemblies are essential for a mechanistic description of cellular processes. Such models are increasingly obtained by fitting atomic-resolution structures of components into a density map of the whole assembly. Yet, current density-fitting techniques are frequently insufficient for an unambiguous determination of the positions and orientations of all components. Here, we describe MultiFit, a method used for simultaneously fitting atomic structures of components into their assembly density map at resolutions as low as 25 Å. The component positions and orientations are optimized with respect to a scoring function that includes the quality-of-fit of components in the map, the protrusion of components from the map envelope, and the shape complementarity between pairs of components. The scoring function is optimized by our exact inference optimizer DOMINO (Discrete Optimization of Multiple INteracting Objects) that efficiently finds the global minimum in a discrete sampling space. MultiFit was benchmarked on seven assemblies of known structure, consisting of up to seven proteins each. The input atomic structures of the components were obtained from the Protein Data Bank, as well as by comparative modeling based on a 16-99% sequence identity to a template structure. A near-native configuration was usually found as the top-scoring model. Therefore, MultiFit can provide initial configurations for further refinement of many multicomponent assembly structures described by electron microscopy.
Affiliation(s)
- Keren Lasker
- Blavatnik School of Computer Science, Raymond and Beverly Sackler Faculty of Exact Sciences, Tel Aviv University, Tel-Aviv 69978, Israel.
|
50
|
Lauck F, Helms V, Geyer T. Graph Measures Reveal Fine Structure of Complexes Forming in Multiparticle Simulations. J Chem Theory Comput 2009; 5:641-8. [DOI: 10.1021/ct800396v] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/29/2022]
Affiliation(s)
- Florian Lauck
- Zentrum für Bioinformatik, Universität des Saarlandes, D-66041 Saarbrücken, Germany
- Volkhard Helms
- Zentrum für Bioinformatik, Universität des Saarlandes, D-66041 Saarbrücken, Germany
- Tihamér Geyer
- Zentrum für Bioinformatik, Universität des Saarlandes, D-66041 Saarbrücken, Germany
|