101
Qin S, Zhou HX. PI2PE: A Suite of Web Servers for Predictions Ranging From Protein Structure to Binding Kinetics. Biophys Rev 2012; 5:41-46. [PMID: 23526172 DOI: 10.1007/s12551-012-0086-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 01/26/2023]
Abstract
PI2PE (http://pipe.sc.fsu.edu) is a suite of four web servers for predicting a variety of folding- and binding-related properties of proteins. These include the solvent accessibility of amino acids upon protein folding, the amino acids forming the interfaces of protein-protein and protein-nucleic acid complexes, and the binding rate constants of these complexes. Three of the servers debuted in 2007, and have garnered ~2,500 unique users and finished over 30,000 jobs. The functionalities of these servers are now enhanced, and a new server, for predicting the binding rate constants, is added. Together, these web servers form a pipeline from protein sequence to tertiary structure, then to quaternary structure, and finally to binding kinetics.
Affiliation(s)
- Sanbo Qin
- Department of Physics and Institute of Molecular Biophysics, Florida State University, Tallahassee, Florida 32306, USA
102
Ma X, Gao L. Discovering protein complexes in protein interaction networks via exploring the weak ties effect. BMC Syst Biol 2012; 6 Suppl 1:S6. [PMID: 23046740 PMCID: PMC3403613 DOI: 10.1186/1752-0509-6-s1-s6] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.6] [Indexed: 02/02/2023]
Abstract
BACKGROUND Studying protein complexes is very important in biological processes since it helps reveal the structure-functionality relationships in biological networks, and much attention has been paid to accurately predicting protein complexes from the increasing amount of protein-protein interaction (PPI) data. Most of the available algorithms are based on the assumption that dense subgraphs correspond to complexes, failing to take into account the inherent organization within protein complexes and the roles of edges. Thus, there is a critical need to investigate the possibility of discovering protein complexes using the topological information hidden in edges. RESULTS To provide an investigation of the roles of edges in PPI networks, we show that the edges connecting less similar vertices in topology are more significant in maintaining the global connectivity, indicating the weak ties phenomenon in PPI networks. We further demonstrate that there is a negative relation between the weak tie strength and the topological similarity. By using the bridges, a reliable virtual network is constructed, in which each maximal clique corresponds to the core of a complex. By this notion, the detection of the protein complexes is transformed into a classic all-clique problem. A novel core-attachment based method is developed, which detects the cores and attachments, respectively. A comprehensive comparison among the existing algorithms and our algorithm has been made by comparing the predicted complexes against benchmark complexes. CONCLUSIONS We proved that the weak tie effect exists in the PPI network and demonstrated that the density is insufficient to characterize the topological structure of protein complexes. Furthermore, the experimental results on the yeast PPI network show that the proposed method outperforms the state-of-the-art algorithms. The analysis of the modules detected by the present algorithm suggests that most of these modules are biologically significant in the context of complexes, suggesting that the roles of edges are critical in discovering protein complexes.
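The two graph notions this abstract relies on, topological similarity of an edge's endpoints and enumeration of maximal cliques, can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not define its similarity measure or its virtual network, so the Jaccard index over closed neighborhoods and the plain Bron-Kerbosch enumeration below are assumptions chosen for illustration, and all function names are hypothetical.

```python
from typing import Dict, Iterable, Iterator, Set, Tuple

Graph = Dict[str, Set[str]]


def build_graph(edges: Iterable[Tuple[str, str]]) -> Graph:
    """Build an undirected adjacency map from an edge list."""
    graph: Graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def jaccard_similarity(graph: Graph, u: str, v: str) -> float:
    """Assumed stand-in for topological similarity: Jaccard index of the
    closed neighborhoods of u and v (each node counts as its own neighbor)."""
    nu = graph[u] | {u}
    nv = graph[v] | {v}
    return len(nu & nv) / len(nu | nv)


def maximal_cliques(graph: Graph) -> Iterator[Set[str]]:
    """Enumerate all maximal cliques with the basic Bron-Kerbosch recursion."""

    def expand(r: Set[str], p: Set[str], x: Set[str]) -> Iterator[Set[str]]:
        if not p and not x:
            yield r  # r cannot be extended, so it is maximal
            return
        for v in list(p):
            yield from expand(r | {v}, p & graph[v], x & graph[v])
            p = p - {v}  # v is fully explored
            x = x | {v}  # exclude v from later branches

    yield from expand(set(), set(graph), set())
```

On a toy network made of a triangle A-B-C with a pendant edge C-D, the edge C-D joins topologically dissimilar endpoints (lower similarity than A-B), and the maximal cliques are the triangle and the pendant edge.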
Affiliation(s)
- Xiaoke Ma
- School of Computer Science and Technology, Xidian University, 710071, PR China
- Lin Gao
- School of Computer Science and Technology, Xidian University, 710071, PR China
103
Wang J, Xie D, Lin H, Yang Z, Zhang Y. Filtering Gene Ontology semantic similarity for identifying protein complexes in large protein interaction networks. Proteome Sci 2012; 10 Suppl 1:S18. [PMID: 22759576 PMCID: PMC3380758 DOI: 10.1186/1477-5956-10-s1-s18] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Indexed: 11/10/2022]
Abstract
BACKGROUND Protein complexes are of particular importance in many biological processes, and various computational approaches have been developed to identify complexes from protein-protein interaction (PPI) networks. However, the high false-positive rate of PPIs makes identification challenging. RESULTS A protein semantic similarity measure is proposed in this study, based on the ontology structure of Gene Ontology (GO) terms and GO annotations, to estimate the reliability of interactions in PPI networks. Interaction pairs with low GO semantic similarity are removed from the network as unreliable interactions. Then, a cluster-expanding algorithm is used to detect complexes with core-attachment structure on the filtered network. Our method is applied to three different yeast PPI networks. The effectiveness of our method is examined on two benchmark complex datasets. Experimental results show that our method performed better than other state-of-the-art approaches in most evaluation metrics. CONCLUSIONS The method detects protein complexes from large-scale PPI networks by filtering on GO semantic similarity. Removing interactions with low GO similarity significantly improves the performance of complex identification. The expanding strategy is also effective in identifying attachment proteins of complexes.
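The filtering step described above (drop interactions whose annotation similarity is low, then cluster what remains) can be sketched as follows. The abstract does not give the semantic similarity formula, so this sketch substitutes a plain Jaccard overlap of annotation sets, which ignores the ontology structure the authors use; the function names, the threshold value, and the placeholder term labels are all assumptions.

```python
from typing import Dict, Iterable, List, Set, Tuple


def annotation_similarity(terms_a: Set[str], terms_b: Set[str]) -> float:
    """Assumed stand-in for GO semantic similarity: Jaccard overlap of the
    two proteins' annotation sets. Unannotated proteins score 0."""
    if not terms_a or not terms_b:
        return 0.0
    return len(terms_a & terms_b) / len(terms_a | terms_b)


def filter_interactions(
    edges: Iterable[Tuple[str, str]],
    annotations: Dict[str, Set[str]],
    threshold: float,
) -> List[Tuple[str, str]]:
    """Keep only interactions whose endpoint similarity reaches the threshold."""
    kept = []
    for u, v in edges:
        sim = annotation_similarity(
            annotations.get(u, set()), annotations.get(v, set())
        )
        if sim >= threshold:
            kept.append((u, v))
    return kept


# Hypothetical toy data: term labels are placeholders, not real GO identifiers.
toy_annotations = {
    "P1": {"term_a", "term_b"},
    "P2": {"term_a", "term_b"},
    "P3": {"term_c"},
}
toy_edges = [("P1", "P2"), ("P1", "P3"), ("P2", "P4")]
reliable = filter_interactions(toy_edges, toy_annotations, threshold=0.5)
```

Here only the P1-P2 interaction survives: P1 and P3 share no annotations, and P4 has none at all. A complex-detection step would then run on the surviving edges.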
Affiliation(s)
- Jian Wang
- School of Computer Science and Technology, Dalian University of Technology, Dalian, China
- Dong Xie
- School of Computer Science and Technology, Dalian University of Technology, Dalian, China
- Hongfei Lin
- School of Computer Science and Technology, Dalian University of Technology, Dalian, China
- Zhihao Yang
- School of Computer Science and Technology, Dalian University of Technology, Dalian, China
- Yijia Zhang
- School of Computer Science and Technology, Dalian University of Technology, Dalian, China
104
Garma L, Mukherjee S, Mitra P, Zhang Y. How many protein-protein interactions types exist in nature? PLoS One 2012; 7:e38913. [PMID: 22719985 PMCID: PMC3374795 DOI: 10.1371/journal.pone.0038913] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.2] [Received: 02/16/2012] [Accepted: 05/14/2012] [Indexed: 11/18/2022]
Abstract
“Protein quaternary structure universe” refers to the ensemble of all protein-protein complexes across all organisms in nature. The number of quaternary folds thus corresponds to the number of ways proteins physically interact with other proteins. This study focuses on answering two basic questions: whether the number of protein-protein interactions is limited and, if yes, how many different quaternary folds exist in nature. By all-to-all sequence and structure comparisons, we grouped the protein complexes in the Protein Data Bank (PDB) into 3,629 families and 1,761 folds. A statistical model was introduced to obtain the quantitative relation between the numbers of quaternary families and quaternary folds in nature. The total number of possible protein-protein interactions was estimated to be around 4,000, which indicates that the current protein repository contains only 42% of quaternary folds in nature and that full coverage needs approximately a quarter century of experimental effort. The results have important implications for protein complex structural modeling and the structural genomics of protein-protein interactions.
Affiliation(s)
- Leonardo Garma
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, United States of America
- Biocenter Oulu and Department of Biochemistry, University of Oulu, Oulu, Finland
- Srayanta Mukherjee
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, United States of America
- Pralay Mitra
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, United States of America
- Yang Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, United States of America
105
Talavera D, Williams SG, Norris MG, Robertson DL, Lovell SC. Evolvability of Yeast Protein–Protein Interaction Interfaces. J Mol Biol 2012; 419:387-96. [DOI: 10.1016/j.jmb.2012.03.021] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 12/07/2011] [Revised: 03/24/2012] [Accepted: 03/27/2012] [Indexed: 01/27/2023]
106
Kuzu G, Keskin O, Gursoy A, Nussinov R. Constructing structural networks of signaling pathways on the proteome scale. Curr Opin Struct Biol 2012; 22:367-77. [PMID: 22575757 DOI: 10.1016/j.sbi.2012.04.004] [Citation(s) in RCA: 57] [Impact Index Per Article: 4.8] [Received: 12/16/2011] [Revised: 03/20/2012] [Accepted: 04/18/2012] [Indexed: 11/30/2022]
Abstract
Proteins function through their interactions, and the availability of protein interaction networks could help in understanding cellular processes. However, the known structural data are limited, and the classical network node-and-edge representation, where proteins are nodes and interactions are edges, shows only which proteins interact, not how they interact. Structural networks provide this information. Protein-protein interface structures can also indicate which binding partners can interact simultaneously and which are competitive, and can help forecast potentially harmful drug side effects. Here, we use a powerful protein-protein interaction prediction tool, which is able to carry out accurate predictions on the proteome scale, to construct the structural network of the extracellular signal-regulated kinases (ERK) in the mitogen-activated protein kinase (MAPK) signaling pathway. This knowledge-based method, PRISM, is motif-based, and is combined with flexible refinement and energy scoring. PRISM predicts protein interactions based on structural and evolutionary similarity to known protein interfaces.
Affiliation(s)
- Guray Kuzu
- Center for Computational Biology and Bioinformatics and College of Engineering, Koc University Rumelifeneri Yolu, 34450 Sariyer Istanbul, Turkey
107
Johnston MA, Farrell D, Nielsen JE. A collaborative environment for developing and validating predictive tools for protein biophysical characteristics. J Comput Aided Mol Des 2012; 26:387-96. [DOI: 10.1007/s10822-012-9564-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 12/21/2011] [Accepted: 03/18/2012] [Indexed: 11/29/2022]
108
Ma X, Gao L. Predicting protein complexes in protein interaction networks using a core-attachment algorithm based on graph communicability. Inf Sci (N Y) 2012. [DOI: 10.1016/j.ins.2011.11.033] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Indexed: 02/02/2023]
109
Tuncbag N, Keskin O, Nussinov R, Gursoy A. Fast and accurate modeling of protein-protein interactions by combining template-interface-based docking with flexible refinement. Proteins 2012; 80:1239-49. [PMID: 22275112 PMCID: PMC7448677 DOI: 10.1002/prot.24022] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.4] [Received: 07/24/2011] [Revised: 11/29/2011] [Accepted: 12/13/2011] [Indexed: 11/06/2022]
Abstract
The similarity between folding and binding led us to posit the concept that the number of protein-protein interface motifs in nature is limited, and interacting protein pairs can use similar interface architectures repeatedly, even if their global folds differ completely. Thus, known protein-protein interface architectures can be used to model the complexes between two target proteins on the proteome scale, even if their global structures differ. This powerful concept is combined with a flexible refinement and global energy assessment tool. The accuracy of the method is highly dependent on the structural diversity of the interface architectures in the template dataset. Here, we validate this knowledge-based combinatorial method on the Docking Benchmark and show that it efficiently finds high-quality models for benchmark complexes and their binding regions even in the absence of template interfaces having sequence similarity to the targets. Compared to "classical" docking, it is computationally faster; as the number of target proteins increases, the difference becomes more dramatic. Further, it is able to distinguish binders from nonbinders. These features allow large-scale network modeling. The results on an independent target set (proteins in the p53 molecular interaction map) show that the current method can be used to predict whether a given protein pair interacts. Overall, while constrained by the diversity of the template set, this approach efficiently produces high-quality models of protein-protein complexes. We expect that with the growing number of known interface architectures, this type of knowledge-based method will be increasingly used by the broad proteomics community.
Affiliation(s)
- Nurcan Tuncbag
- Center for Computational Biology and Bioinformatics, College of Engineering, Koc University, 34450 Sariyer, Istanbul, Turkey
- Ozlem Keskin
- Center for Computational Biology and Bioinformatics, College of Engineering, Koc University, 34450 Sariyer, Istanbul, Turkey
- Ruth Nussinov
- Basic Science Program, SAIC-Frederick, Inc., Center for Cancer Research Nanobiology Program, NCI-Frederick, Frederick, Maryland 21702
- Department of Human Genetics and Molecular Medicine, Sackler Institute of Molecular Medicine, Sackler School of Medicine, Tel Aviv University, Tel Aviv 69978, Israel
- Attila Gursoy
- Center for Computational Biology and Bioinformatics, College of Engineering, Koc University, 34450 Sariyer, Istanbul, Turkey
110
The distribution of ligand-binding pockets around protein-protein interfaces suggests a general mechanism for pocket formation. Proc Natl Acad Sci U S A 2012; 109:3784-9. [PMID: 22355140 DOI: 10.1073/pnas.1117768109] [Citation(s) in RCA: 70] [Impact Index Per Article: 5.8] [Indexed: 11/18/2022]
Abstract
Protein-protein and protein-ligand interactions are ubiquitous in a biological cell. Here, we report a comprehensive study of the distribution of protein-ligand interaction sites, namely ligand-binding pockets, around protein-protein interfaces where protein-protein interactions occur. We inspected a set of 1,611 representative protein-protein complexes and identified pockets with a potential for binding small molecule ligands. The majority of these pockets are within a 6 Å distance from protein interfaces. Accordingly, in about half of ligand-bound protein-protein complexes, amino acids from both sides of a protein interface are involved in direct contacts with at least one ligand. Statistically, ligands are closer to a protein-protein interface than a random surface patch of the same solvent accessible surface area. Similar results are obtained in an analysis of the ligand distribution around domain-domain interfaces of 1,416 nonredundant, two-domain protein structures. Furthermore, pockets of comparable size to those observed in experimental structures are present in artificially generated protein complexes, suggesting that the prominent appearance of pockets around protein interfaces is mainly a structural consequence of protein packing and thus is an intrinsic geometric feature of protein structure. Nature may take advantage of such a structural feature by selecting and further optimizing for biological function. We propose that packing nearby protein-protein or domain-domain interfaces is a major route to the formation of ligand-binding pockets.
111
Clarke D, Bhardwaj N, Gerstein MB. Novel insights through the integration of structural and functional genomics data with protein networks. J Struct Biol 2012; 179:320-6. [PMID: 22343087 DOI: 10.1016/j.jsb.2012.02.001] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Received: 06/21/2011] [Revised: 02/02/2012] [Accepted: 02/02/2012] [Indexed: 12/13/2022]
Abstract
In recent years, major advances in genomics, proteomics, macromolecular structure determination, and the computational resources capable of processing and disseminating the large volumes of data generated by each have played major roles in advancing a more systems-oriented appreciation of biological organization. One product of systems biology has been the delineation of graph models for describing genome-wide protein-protein interaction networks. The network organization and topology which emerges in such models may be used to address fundamental questions in an array of cellular processes, as well as biological features intrinsic to the constituent proteins (or "nodes") themselves. However, graph models alone constitute an abstraction which neglects the underlying biological and physical reality that the network's nodes and edges are highly heterogeneous entities. Here, we explore some of the advantages of introducing a protein structural dimension to such models, as the marriage of conventional network representations with macromolecular structural data helps to place static node and edge constructs in a biologically more meaningful context. We emphasize that 3D protein structures constitute a valuable conceptual and predictive framework by discussing examples of the insights provided, such as enabling in silico predictions of protein-protein interactions, providing rational and compelling classification schemes for network elements, as well as revealing interesting intrinsic differences between distinct node types, such as disorder and evolutionary features, which may then be rationalized in light of their respective functions within networks.
Affiliation(s)
- Declan Clarke
- Department of Chemistry, Yale University, New Haven, CT 06520, USA
112
Abstract
In this chapter, we introduce interaction networks by describing how they are generated, where they are stored, and how they are shared. We focus on publicly available interaction networks and describe a simple way of utilizing these resources. As a case study, we used Cytoscape, an open source and easy-to-use network visualization and analysis tool to first gather and visualize a small network. We have analyzed this network's topological features and have looked at functional enrichment of the network nodes by integrating the gene ontology database. The methods described are applicable to larger networks that can be collected from various resources.
Affiliation(s)
- Gurkan Bebek
- Center for Proteomics and Bioinformatics, Case Comprehensive Cancer Center, Case Western Reserve University School of Medicine, Cleveland, OH, USA.
113
Kuzu G, Keskin O, Gursoy A, Nussinov R. Expanding the conformational selection paradigm in protein-ligand docking. Methods Mol Biol 2012; 819:59-74. [PMID: 22183530 PMCID: PMC7455014 DOI: 10.1007/978-1-61779-465-0_5] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 12/25/2022]
Abstract
Conformational selection emerges as a theme in macromolecular interactions. Data validate it as a prevailing mechanism in protein-protein, protein-DNA, protein-RNA, and protein-small molecule drug recognition. This raises the question of whether this fundamental biomolecular binding mechanism can be used to improve drug docking and discovery. Actually, in practice this has already been taking place for some years in increasing numbers. Essentially, it argues for using not a single conformer, but an ensemble. The paradigm of conformational selection holds that because the ensemble is heterogeneous, within it there will be states whose conformation matches that of the ligand. Even if the population of this state is low, since it is favorable for binding the ligand, it will bind to it with a subsequent population shift toward this conformer. Here we suggest expanding it by first modeling all protein interactions in the cell by using Prism, an efficient motif-based protein-protein interaction modeling strategy, followed by ensemble generation. Such a strategy could be particularly useful for signaling proteins, which are major targets in drug discovery and bind multiple partners through a shared binding site, each with some (minor or major) conformational change.
Affiliation(s)
- Guray Kuzu
- Center for Computational Biology and Bioinformatics and College of Engineering, Koc University Rumelifeneri Yolu, Istanbul, Turkey
114
Molecular systems biology of Sic1 in yeast cell cycle regulation through multiscale modeling. Adv Exp Med Biol 2012; 736:135-67. [PMID: 22161326 DOI: 10.1007/978-1-4419-7210-1_7] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 12/11/2022]
Abstract
Cell cycle control is highly regulated to guarantee the precise timing of events essential for cell growth, i.e., DNA replication onset and cell division. Failure of this control plays a role in cancer, and molecules called cyclin-dependent kinase (Cdk) inhibitors (Ckis) perform a critical function in cell cycle timing. Here we present a multiscale modeling approach in which experimental and computational studies have been employed to investigate the structure, function and temporal dynamics of the Cki Sic1, which regulates cell cycle progression in Saccharomyces cerevisiae. Structural analyses reveal molecular details of the interaction between Sic1 and Cdk/cyclin complexes, and biochemical investigation reveals Sic1 function in analogy to its human counterpart p27(Kip1), whose deregulation leads to failure in timing of kinase activation and, therefore, to cancer. Following these findings, a bottom-up systems biology approach has been developed to characterize modular networks addressing Sic1 regulatory function. Through complementary experimentation and modeling, we suggest a mechanism that underlies Sic1 function in controlling temporal waves of cyclins to ensure correct timing of the phase-specific Cdk activities.
115
Mukherjee S, Zhang Y. Protein-protein complex structure predictions by multimeric threading and template recombination. Structure 2011; 19:955-66. [PMID: 21742262 DOI: 10.1016/j.str.2011.04.006] [Citation(s) in RCA: 128] [Impact Index Per Article: 9.8] [Received: 11/13/2010] [Revised: 03/30/2011] [Accepted: 04/01/2011] [Indexed: 10/18/2022]
Abstract
The total number of protein-protein complex structures currently available in the Protein Data Bank (PDB) is six times smaller than the total number of tertiary structures in the PDB, which limits the power of homology-based approaches to complex structure modeling. We present a threading-recombination approach, COTH, to boost the protein complex structure library by combining tertiary structure templates with complex alignments. The query sequences are first aligned to complex templates using a modified dynamic programming algorithm, guided by ab initio binding-site predictions. The monomer alignments are then shifted to the multimeric template framework by structural alignments. COTH was tested on 500 nonhomologous dimeric proteins and successfully detected correct templates for 50% of the cases after homologous templates were excluded, significantly outperforming conventional homology modeling algorithms. It also shows a higher accuracy in interface modeling than rigid-body docking of unbound structures from ZDOCK, although with lower coverage. These data demonstrate new avenues to model complex structures from nonhomologous templates.
Affiliation(s)
- Srayanta Mukherjee
- Center for Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109-2218, USA
116
Abstract
BACKGROUND Protein complexes are important for understanding principles of cellular organization and functions. With the availability of large amounts of high-throughput protein-protein interactions (PPI), many algorithms have been proposed to discover protein complexes from PPI networks. However, existing algorithms generally do not take into consideration the fact that not all the interactions in a PPI network take place at the same time. As a result, predicted complexes often contain many spuriously included proteins, precluding them from matching true complexes. RESULTS We propose two methods to tackle this problem: (1) The localization GO term decomposition method: We utilize cellular component Gene Ontology (GO) terms to decompose PPI networks into several smaller networks such that the proteins in each decomposed network are annotated with the same cellular component GO term. (2) The hub removal method: This method is based on the observation that hub proteins are more likely to fuse clusters that correspond to different complexes. To avoid this, we remove hub proteins from PPI networks, and then apply a complex discovery algorithm on the remaining PPI network. The removed hub proteins are added back to the generated clusters afterwards. We tested the two methods on the yeast PPI network downloaded from BioGRID. Our results show that these methods can improve the performance of several complex discovery algorithms significantly. Further improvement in performance is achieved when we apply them in tandem. CONCLUSIONS The performance of complex discovery algorithms is hindered by the fact that not all the interactions in a PPI network take place at the same time. We tackle this problem by using localization GO terms or hubs to decompose a PPI network before complex discovery, which achieves considerable improvement.
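The hub removal method described above (remove hubs, cluster the remainder, add hubs back) can be sketched in a few lines. The abstract does not say how hubs are defined, which clustering algorithm is applied, or by what rule hubs are re-attached, so the degree cutoff, the use of connected components as a stand-in clustering step, and the `min_fraction` re-attachment rule are all assumptions; every name below is hypothetical.

```python
from typing import Dict, List, Set

Graph = Dict[str, Set[str]]


def connected_components(graph: Graph) -> List[Set[str]]:
    """Stand-in clustering step: connected components via depth-first search."""
    seen: Set[str] = set()
    components = []
    for start in graph:
        if start in seen:
            continue
        component: Set[str] = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(graph[node] - seen)
        components.append(component)
    return components


def cluster_without_hubs(
    graph: Graph, degree_cutoff: int, min_fraction: float = 0.5
) -> List[Set[str]]:
    """Remove hubs, cluster the reduced network, then add hubs back.

    Assumptions: a hub is any node with degree >= degree_cutoff, and a hub
    rejoins a cluster when it interacts with at least min_fraction of the
    cluster's non-hub members.
    """
    hubs = {n for n, nbrs in graph.items() if len(nbrs) >= degree_cutoff}
    reduced = {n: nbrs - hubs for n, nbrs in graph.items() if n not in hubs}
    clusters = connected_components(reduced)
    for cluster in clusters:
        core = set(cluster)  # snapshot of non-hub members
        for hub in hubs:
            if len(graph[hub] & core) >= min_fraction * len(core):
                cluster.add(hub)
    return clusters


# Hypothetical toy network: two triangles fused by a hub "h".
toy: Graph = {
    "a": {"b", "c", "h"}, "b": {"a", "c", "h"}, "c": {"a", "b", "h"},
    "d": {"e", "f", "h"}, "e": {"d", "f"}, "f": {"d", "e"},
    "h": {"a", "b", "c", "d"},
}
clusters = cluster_without_hubs(toy, degree_cutoff=4)
```

Without hub removal, the stand-in clustering step returns one fused cluster of all seven proteins; with it, the two triangles separate and `h` rejoins only the triangle it mostly interacts with.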
Affiliation(s)
- Guimei Liu
- School of Computing, National University of Singapore, Singapore
- Chern Han Yong
- NUS Graduate School for Integrative Sciences and Engineering, National University of Singapore, Singapore
- Hon Nian Chua
- Data Mining Department, Institute for Infocomm Research, Singapore
- Limsoon Wong
- School of Computing, National University of Singapore, Singapore
117
Abstract
BACKGROUND Identifying biologically relevant protein complexes from a large protein-protein interaction (PPI) network is essential to understand the organization of biological systems. However, high-throughput experimental techniques that can produce a large amount of PPIs are known to yield non-negligible rates of false positives and false negatives, making the protein complexes difficult to identify. RESULTS We propose a binary matrix factorization (BMF) algorithm under the Bayesian Ying-Yang (BYY) harmony learning to detect protein complexes by clustering the proteins which share similar interactions through factorizing the binary adjacency matrix of a PPI network. The proposed BYY-BMF algorithm automatically determines the cluster number, whereas this number must be given in advance for most existing BMF algorithms. Also, BYY-BMF's clustering results do not depend on any parameters or thresholds, unlike the Markov Cluster Algorithm (MCL), which relies on a so-called inflation parameter. On synthetic PPI networks, the predictions evaluated by the known annotated complexes indicate that BYY-BMF is more robust than MCL for most cases. On real PPI networks from the MIPS and DIP databases, BYY-BMF obtains better-balanced prediction accuracies than MCL and a spectral analysis method, while MCL has its own advantages, e.g., good separation values.
Affiliation(s)
- Shikui Tu
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong
- Runsheng Chen
- Bioinformatics Laboratory and National Laboratory of Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101
- Lei Xu
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong
118
Dwane S, Kiely PA. Tools used to study how protein complexes are assembled in signaling cascades. Bioeng Bugs 2011; 2:247-59. [PMID: 22002082 PMCID: PMC3225741 DOI: 10.4161/bbug.2.5.17844] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.7] [Received: 08/06/2011] [Revised: 08/19/2011] [Accepted: 08/24/2011] [Indexed: 01/08/2023]
Abstract
Most proteins do not function on their own but as part of large signaling complexes that are arranged in every living cell in response to specific environmental cues. Proteins interact with each other either constitutively or transiently and do so with different affinities. When identifying the role played by a protein inside a cell, it is essential to define its particular cohort of binding partners so that the researcher can predict which signaling pathways the protein is engaged in. Once identified and confirmed, the information might allow the interaction to be manipulated by pharmacological inhibitors to help fight disease. In this review, we discuss protein-protein interactions and how they are essential to propagate signals in signaling pathways. We examine some of the high-throughput screening methods and focus on the methods used to confirm specific protein-protein interactions, including affinity tagging, co-immunoprecipitation, peptide array technology and fluorescence microscopy.
Affiliation(s)
- Susan Dwane
- Department of Life Sciences, and Materials and Surface Science Institute, University of Limerick, Limerick, Ireland
119
Feliu E, Aloy P, Oliva B. On the analysis of protein-protein interactions via knowledge-based potentials for the prediction of protein-protein docking. Protein Sci 2011; 20:529-41. [PMID: 21432933 DOI: 10.1002/pro.585] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.2] [Indexed: 01/08/2023]
Abstract
Development of effective methods to screen binary interactions obtained by rigid-body protein-protein docking is key for structure prediction of complexes and for elucidating physicochemical principles of protein-protein binding. We have derived empirical knowledge-based potential functions for selecting rigid-body docking poses. These potentials include the energetic component that provides the residues with a particular secondary structure and surface accessibility. These scoring functions have been tested on a state-of-the-art benchmark dataset and on a decoy dataset of permanent interactions. Our results were compared with a residue-pair potential scoring function (RPScore) and an atomic-detailed scoring function (Zrank). We have combined knowledge-based potentials to score protein-protein poses of decoys of complexes classified either as transient or as permanent protein-protein interactions. Being defined from residue-pair statistical potentials and not requiring an atomic-level description, our method surpassed Zrank for scoring rigid-docking decoys where the unbound partners of an interaction have to undergo conformational changes upon binding. However, when only moderate conformational changes are required (in rigid docking) or when the right conformational changes are ensured (in flexible docking), Zrank is the most successful scoring function. Finally, our study suggests that the physicochemical properties necessary for binding are present on the proteins prior to binding and independently of the partner. This information is encoded at the residue level and could be easily incorporated in the initial grid scoring for Fast Fourier Transform rigid-body docking methods.
Affiliation(s)
- Elisenda Feliu
- Algebra and Geometry Department, Mathematics Faculty, Universitat de Barcelona, Spain
|
120
|
Xing C, Dunson DB. Bayesian inference for genomic data integration reduces misclassification rate in predicting protein-protein interactions. PLoS Comput Biol 2011; 7:e1002110. [PMID: 21829334 PMCID: PMC3145649 DOI: 10.1371/journal.pcbi.1002110] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/03/2011] [Accepted: 05/17/2011] [Indexed: 12/02/2022] Open
Abstract
Protein-protein interactions (PPIs) are essential to most fundamental cellular processes. There has been increasing interest in reconstructing PPI networks. However, several critical difficulties exist in obtaining reliable predictions. Notably, false positive rates can exceed 80%. Error correction from each generating source can be both time-consuming and inefficient due to the difficulty of covering the errors from multiple levels of data processing procedures within a single test. We propose a novel Bayesian integration method, termed nonparametric Bayes ensemble learning (NBEL), to lower the misclassification rate (both false positives and negatives) through automatically up-weighting data sources that are most informative, while down-weighting less informative and biased sources. Extensive studies indicate that NBEL is significantly more robust than the classic naïve Bayes to unreliable, error-prone and contaminated data. On a large human data set our NBEL approach predicts many more PPIs than naïve Bayes. This suggests that previous studies may have large numbers of not only false positives but also false negatives. The validation on two high-quality human PPI datasets supports our observations. Our experiments demonstrate that it is feasible to predict high-throughput PPIs computationally with substantially reduced false positives and false negatives. The ability to predict large numbers of PPIs both reliably and automatically may inspire people to use computational approaches to correct data errors in general, and may speed up high-quality PPI prediction. Such a reliable prediction may provide a solid platform for other studies such as protein function prediction and the roles of PPIs in disease susceptibility.
Affiliation(s)
- Chuanhua Xing
- Department of Biostatistics and Bioinformatics, Duke University, Durham, North Carolina, United States of America.
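The abstract above uses the classic naïve Bayes integrator as its baseline. As a rough illustration of that baseline only (not of NBEL, whose details are not given here), the sketch below combines independent evidence sources by multiplying likelihood ratios into the prior odds. The function name, the prior, and the likelihood-ratio values are hypothetical.

```python
import math

def naive_bayes_posterior(prior, likelihood_ratios):
    """Posterior probability of an interaction under the naive Bayes assumption.

    prior:             assumed prior probability that a protein pair interacts.
    likelihood_ratios: one ratio P(evidence | interact) / P(evidence | not interact)
                       per data source, treated as conditionally independent.
    """
    # Work in log-odds so that many sources do not overflow or underflow.
    log_odds = math.log(prior / (1.0 - prior))
    log_odds += sum(math.log(lr) for lr in likelihood_ratios)
    return 1.0 / (1.0 + math.exp(-log_odds))

# Hypothetical pair supported by two sources with likelihood ratios 10 and 5.
print(naive_bayes_posterior(0.01, [10.0, 5.0]))
```

Because every source enters with equal trust, a single biased source can dominate the product, which is the weakness the abstract says source re-weighting is meant to address.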
|
121
|
Structural principles within the human-virus protein-protein interaction network. Proc Natl Acad Sci U S A 2011; 108:10538-43. [PMID: 21680884 DOI: 10.1073/pnas.1101440108] [Citation(s) in RCA: 99] [Impact Index Per Article: 7.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/18/2022] Open
Abstract
General properties of the antagonistic biomolecular interactions between viruses and their hosts (exogenous interactions) remain poorly understood, and may differ significantly from known principles governing the cooperative interactions within the host (endogenous interactions). Systems biology approaches have been applied to study the combined interaction networks of virus and human proteins, but such efforts have so far revealed only low-resolution patterns of host-virus interaction. Here, we layer curated and predicted 3D structural models of human-virus and human-human protein complexes on top of traditional interaction networks to reconstruct the human-virus structural interaction network. This approach reveals atomic resolution, mechanistic patterns of host-virus interaction, and facilitates systematic comparison with the host's endogenous interactions. We find that exogenous interfaces tend to overlap with and mimic endogenous interfaces, thereby competing with endogenous binding partners. The endogenous interfaces mimicked by viral proteins tend to participate in multiple endogenous interactions which are transient and regulatory in nature. While interface overlap in the endogenous network results largely from gene duplication followed by divergent evolution, viral proteins frequently achieve interface mimicry without any sequence or structural similarity to an endogenous binding partner. Finally, while endogenous interfaces tend to evolve more slowly than the rest of the protein surface, exogenous interfaces--including many sites of endogenous-exogenous overlap--tend to evolve faster, consistent with an evolutionary "arms race" between host and pathogen. These significant biophysical, functional, and evolutionary differences between host-pathogen and within-host protein-protein interactions highlight the distinct consequences of antagonism versus cooperation in biological networks.
|
122
|
Emig D, Sander O, Mayr G, Albrecht M. Structure collisions between interacting proteins. PLoS One 2011; 6:e19581. [PMID: 21655095 PMCID: PMC3107212 DOI: 10.1371/journal.pone.0019581] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/10/2010] [Accepted: 04/12/2011] [Indexed: 11/24/2022] Open
Abstract
Protein-protein interactions take place at defined binding interfaces. One protein may bind two or more proteins at different interfaces at the same time. So far it has been commonly accepted that non-overlapping interfaces allow a given protein to bind other proteins simultaneously while no collisions occur between the binding protein structures. To test this assumption, we performed a comprehensive analysis of structural protein interactions to detect potential collisions. Our results did not indicate cases of biologically relevant collisions in the Protein Data Bank of protein structures. However, we discovered a number of collisions that originate from alternative protein conformations or quaternary structures due to different experimental conditions.
Affiliation(s)
- Dorothea Emig
- Department of Computational Biology and Applied Algorithmics, Max Planck Institute for Informatics, Saarbrücken, Germany
- Oliver Sander
- Department of Computational Biology and Applied Algorithmics, Max Planck Institute for Informatics, Saarbrücken, Germany
- Gabriele Mayr
- Department of Computational Biology and Applied Algorithmics, Max Planck Institute for Informatics, Saarbrücken, Germany
- Mario Albrecht
- Department of Computational Biology and Applied Algorithmics, Max Planck Institute for Informatics, Saarbrücken, Germany
|
123
|
Terradot L, Noirot-Gros MF. Bacterial protein interaction networks: puzzle stones from solved complex structures add to a clearer picture. Integr Biol (Camb) 2011; 3:645-52. [PMID: 21584322 DOI: 10.1039/c0ib00023j] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/18/2022]
Abstract
Global scale studies of protein-protein interaction (PPI) networks have considerably expanded our view of how proteins act in the cell. In particular, bacterial "interactome" surveys have revealed that proteins can sometimes interact with a large number of protein partners and connect different cellular processes. More targeted, pathway-orientated PPI studies have also helped to propose functions for unknown proteins based on the "guilty by association" principle. However, given the immense repertoire of PPIs generated and the variability of PPI networks, more studies are required to understand the role(s) of these interactions in the cell. With the availability of bioinformatic analysis tools, transcriptomics and co-expression experiments for a given interaction, interactomes are being deciphered. More recently, functional and structural studies have been derived from these PPI networks. In this review, we will give a number of examples of how combining functional and structural studies into PPI networks has contributed to understanding the functions of some of these interactions. We discuss how interactomes now represent a unique opportunity to determine the structures of bacterial protein complexes on a large scale by the integration of multiple technologies.
Affiliation(s)
- Laurent Terradot
- Institut de Biologie et Chimie des Protéines, UMR 5086 CNRS Université de Lyon, IFR128, Biologie Structurale des Complexes Macromoléculaires Bactériens, 7 Passage du Vercors, F-69367, Lyon Cedex 07, France.
|
124
|
Wang TY, He F, Hu QW, Zhang Z. A predicted protein-protein interaction network of the filamentous fungus Neurospora crassa. MOLECULAR BIOSYSTEMS 2011; 7:2278-85. [PMID: 21584303 DOI: 10.1039/c1mb05028a] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/29/2023]
Abstract
The filamentous fungus Neurospora crassa is a leading model organism for circadian clock studies. Computational identification of a protein-protein interaction (PPI) network (also known as an interactome) in N. crassa can provide new insights into the cellular functions of proteins. Using two well-established bioinformatics methods (the interolog method and the domain interaction-based method), we predicted 27,588 PPIs among 3006 N. crassa proteins. To the best of our knowledge, this is the first identified interactome for N. crassa, although it remains problematic because of incomplete interactions and false positives. In particular, the established PPI network has provided clues to further decipher the molecular mechanism of circadian rhythmicity. For instance, we found that clock-controlled genes (ccgs) are more likely to act as bottlenecks in the established PPI network. We also identified an important module related to circadian oscillators, and some proteins of unknown function in this module may serve as potential candidates for new oscillators. Finally, all predicted PPIs were compiled into a user-friendly database server (NCPI), which is freely available at .
Affiliation(s)
- Ting-You Wang
- State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
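The interolog method named in the abstract above transfers known interactions from one species to another through orthology. The sketch below shows that general idea under simple assumptions; the data structures, protein identifiers, and the choice to skip self-pairs are hypothetical and are not taken from the NCPI pipeline.

```python
from itertools import product

def predict_interologs(source_ppis, orthologs):
    """Transfer interactions from a source species through an ortholog map.

    source_ppis: iterable of (protein_a, protein_b) pairs known in the source species.
    orthologs:   dict mapping a source protein to a set of target-species proteins.
    Returns a set of frozensets, each holding one predicted target-species pair.
    """
    predicted = set()
    for a, b in source_ppis:
        # Every combination of orthologs of an interacting pair is a candidate.
        for ta, tb in product(orthologs.get(a, ()), orthologs.get(b, ())):
            if ta != tb:  # self-pairs are skipped in this sketch
                predicted.add(frozenset((ta, tb)))
    return predicted

# Toy input: Y3 has no ortholog, so the Y2-Y3 interaction cannot be transferred.
source_ppis = [("Y1", "Y2"), ("Y2", "Y3")]
ortholog_map = {"Y1": {"N1"}, "Y2": {"N2", "N2b"}}
for pair in sorted(tuple(sorted(p)) for p in predict_interologs(source_ppis, ortholog_map)):
    print(pair)
```

Coverage is limited by the ortholog map: interactions involving proteins without a mapped ortholog are silently lost, which is one reason such predicted networks are described as incomplete.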
|
125
|
Tuncbag N, Gursoy A, Keskin O. Prediction of protein-protein interactions: unifying evolution and structure at protein interfaces. Phys Biol 2011; 8:035006. [PMID: 21572173 DOI: 10.1088/1478-3975/8/3/035006] [Citation(s) in RCA: 49] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/02/2023]
Abstract
The vast majority of the chores in the living cell involve protein-protein interactions. Providing details of protein interactions at the residue level and incorporating them into protein interaction networks are crucial toward the elucidation of a dynamic picture of cells. Despite the rapid increase in the number of structurally known protein complexes, we are still far away from a complete network. Given experimental limitations, computational modeling of protein interactions is a prerequisite to proceed on the way to complete structural networks. In this work, we focus on the question 'how do proteins interact?' rather than 'which proteins interact?' and we review structure-based protein-protein interaction prediction approaches. As a sample approach for modeling protein interactions, PRISM is detailed which combines structural similarity and evolutionary conservation in protein interfaces to infer structures of complexes in the protein interaction network. This will ultimately help us to understand the role of protein interfaces in predicting bound conformations.
Affiliation(s)
- Nurcan Tuncbag
- Koc University, Center for Computational Biology and Bioinformatics, and College of Engineering, Rumelifeneri Yolu, 34450 Sariyer Istanbul, Turkey
|
126
|
Ardejani MS, Li NX, Orner BP. Stabilization of a protein nanocage through the plugging of a protein-protein interfacial water pocket. Biochemistry 2011; 50:4029-37. [PMID: 21488690 DOI: 10.1021/bi200207w] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/09/2023]
Abstract
The unique structural properties of the ferritin protein cages have provided impetus to focus on the methodical study of these self-assembling nanosystems. Among these proteins, Escherichia coli bacterioferritin (EcBfr), although architecturally very similar to other members of the family, shows structural instability and an incomplete self-assembly behavior by populating two oligomerization states. Through computational analysis and comparison to its homologues, we have found that this protein has a smaller than average dimeric interface on its 2-fold symmetry axis mainly because of the existence of an interfacial water pocket centered around two water-bridged asparagine residues. To investigate the possibility of engineering EcBfr for modified structural stability, we have used a semiempirical computational method to virtually explore the energy differences of the 480 possible mutants at the dimeric interface relative to that of wild-type EcBfr. This computational study also converged on the water-bridged asparagines. Replacing these two asparagines with hydrophobic amino acids resulted in proteins that folded into α-helical monomers and assembled into cages as evidenced by circular dichroism and transmission electron microscopy. Both thermal and chemical denaturation confirmed that, in all cases, these proteins, in agreement with the calculations, possessed increased stability. One of the three mutations shifts the population in favor of the higher-order oligomerization state in solution as evidenced by both size exclusion chromatography and native gel electrophoresis. These results taken together suggest that our low-level design was successful and that it may be possible to apply the strategy of targeting water pockets at protein--protein interfaces to other protein cage and self-assembling systems. More generally, this study further demonstrates the power of jointly employing in silico and in vitro techniques to understand and enhance biostructural energetics.
Affiliation(s)
- Maziar S Ardejani
- Division of Chemistry and Biological Chemistry, Nanyang Technological University, 21 Nanyang Link, Singapore 637371
|
127
|
Schelhorn SE, Mestre J, Albrecht M, Zotenko E. Inferring physical protein contacts from large-scale purification data of protein complexes. Mol Cell Proteomics 2011; 10:M110.004929. [PMID: 21451165 PMCID: PMC3108834 DOI: 10.1074/mcp.m110.004929] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022] Open
Abstract
Recent large-scale data sets of protein complex purifications have provided unprecedented insights into the organization of cellular protein complexes. Several computational methods have been developed to detect co-complexed proteins in these data sets. Their common aim is the identification of biologically relevant protein complexes. However, much less is known about the network of direct physical protein contacts within the detected protein complexes. Therefore, our work investigates whether direct physical contacts can be computationally derived by combining raw data of large-scale protein complex purifications. We assess four established scoring schemes and introduce a new scoring approach that is specifically devised to infer direct physical protein contacts from protein complex purifications. The physical contacts identified by the five methods are comprehensively benchmarked against different reference sets that provide evidence for true physical contacts. Our results show that raw purification data can indeed be exploited to determine high-confidence physical protein contacts within protein complexes. In particular, our new method outperforms competing approaches at discovering physical contacts involving proteins that have been screened multiple times in purification experiments. It also excels in the analysis of recent protein purification screens of molecular chaperones and protein kinases. In contrast to previous findings, we observe that physical contacts inferred from purification experiments of protein complexes can be qualitatively comparable to binary protein interactions measured by experimental high-throughput assays such as yeast two-hybrid. This suggests that computationally derived physical contacts might complement binary protein interaction assays and guide large-scale interactome mapping projects by prioritizing putative physical contacts for further experimental screens.
|
128
|
Krishnan S, Gaspari M, Corte AD, Bianchi P, Crescente M, Cerletti C, Torella D, Indolfi C, de Gaetano G, Donati MB, Rotilio D, Cuda G. OFFgel-based multidimensional LC-MS/MS approach to the cataloguing of the human platelet proteome for an interactomic profile. Electrophoresis 2011; 32:686-95. [DOI: 10.1002/elps.201000592] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/03/2010] [Revised: 12/13/2010] [Accepted: 12/26/2010] [Indexed: 11/05/2022]
|
129
|
Stein A, Mosca R, Aloy P. Three-dimensional modeling of protein interactions and complexes is going 'omics. Curr Opin Struct Biol 2011; 21:200-8. [PMID: 21320770 DOI: 10.1016/j.sbi.2011.01.005] [Citation(s) in RCA: 68] [Impact Index Per Article: 5.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/10/2010] [Revised: 01/11/2011] [Accepted: 01/13/2011] [Indexed: 10/18/2022]
Abstract
High-throughput interaction discovery initiatives have revealed the existence of hundreds of multiprotein complexes whose functions are regulated through thousands of protein-protein interactions (PPIs). However, the structural details of these interactions, often necessary to understand their function, are only available for a tiny fraction, and the experimental difficulties surrounding complex structure determination make computational modeling techniques paramount. In this manuscript, we critically review some of the most recent developments in the field of structural bioinformatics applied to the modeling of protein interactions and complexes, from large macromolecular machines to domain-domain and peptide-mediated interactions. In particular, we place a special emphasis on those methods that can be applied in a proteome-wide manner, and discuss how they will help in the ultimate objective of building 3D interactome networks.
Affiliation(s)
- Amelie Stein
- Institute for Research in Biomedicine (IRB Barcelona), Joint IRB-BSC Program in Computational Biology, c/Baldiri i Reixac 10-12, 08028 Barcelona, Spain
|
130
|
Kar G, Keskin O, Gursoy A, Nussinov R. Allostery and population shift in drug discovery. Curr Opin Pharmacol 2010; 10:715-22. [PMID: 20884293 PMCID: PMC7316380 DOI: 10.1016/j.coph.2010.09.002] [Citation(s) in RCA: 151] [Impact Index Per Article: 10.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/30/2010] [Revised: 09/03/2010] [Accepted: 09/03/2010] [Indexed: 02/07/2023]
Abstract
Proteins can exist in a large number of conformations around their native states that can be characterized by an energy landscape. The landscape illustrates individual valleys, which are the conformational substates. From the functional standpoint, there are two key points: first, all functionally relevant substates pre-exist; and second, the landscape is dynamic and the relative populations of the substates will change following allosteric events. Allosteric events perturb the structure, and the energetic strain propagates and shifts the population. This can lead to changes in the shapes and properties of target binding sites. Here we present an overview of dynamic conformational ensembles focusing on allosteric events in signaling. We propose that combining equilibrium fluctuation concepts with genomic screens could help drug discovery.
Affiliation(s)
- Gozde Kar
- Center for Computational Biology and Bioinformatics and College of Engineering, Koc University Rumelifeneri Yolu, 34450 Sariyer Istanbul, Turkey
|
131
|
Launay G, Simonson T. A large decoy set of protein-protein complexes produced by flexible docking. J Comput Chem 2010; 32:106-20. [DOI: 10.1002/jcc.21604] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022]
|
132
|
Srihari S, Ning K, Leong HW. MCL-CAw: a refinement of MCL for detecting yeast complexes from weighted PPI networks by incorporating core-attachment structure. BMC Bioinformatics 2010; 11:504. [PMID: 20939868 PMCID: PMC2965181 DOI: 10.1186/1471-2105-11-504] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/03/2010] [Accepted: 10/12/2010] [Indexed: 01/23/2023] Open
Abstract
BACKGROUND The reconstruction of protein complexes from the physical interactome of organisms serves as a building block towards understanding the higher level organization of the cell. Over the past few years, several independent high-throughput experiments have helped to catalogue an enormous amount of physical protein interaction data from organisms such as yeast. However, these individual datasets show a lack of correlation with each other and also contain a substantial number of false positives (noise). Over these years, several affinity scoring schemes have also been devised to improve the quality of these datasets. Therefore, the challenge now is to detect meaningful as well as novel complexes from protein interaction (PPI) networks derived by combining datasets from multiple sources and by making use of these affinity scoring schemes. In attempting to tackle this challenge, the Markov Clustering algorithm (MCL) has proved to be a popular and reasonably successful method, mainly due to its scalability, robustness, and ability to work on scored (weighted) networks. However, MCL produces many noisy clusters, which either do not match known complexes or have additional proteins that reduce the accuracies of correctly predicted complexes. RESULTS Inspired by recent experimental observations by Gavin and colleagues on the modularity structure in yeast complexes and the distinctive properties of "core" and "attachment" proteins, we develop a core-attachment based refinement method coupled to MCL for reconstruction of yeast complexes from scored (weighted) PPI networks. We combine physical interactions from two recent "pull-down" experiments to generate an unscored PPI network. We then score this network using available affinity scoring schemes to generate multiple scored PPI networks. 
The evaluation of our method (called MCL-CAw) on these networks shows that: (i) MCL-CAw derives a larger number of yeast complexes, with better accuracy, than MCL, particularly in the presence of natural noise; (ii) Affinity scoring can effectively reduce the impact of noise on MCL-CAw and thereby improve the quality (precision and recall) of its predicted complexes; (iii) MCL-CAw responds well to most available scoring schemes. We discuss several instances where MCL-CAw was successful in deriving meaningful complexes, and where it missed a few proteins or whole complexes due to affinity scoring of the networks. We compare MCL-CAw with several recent complex detection algorithms on unscored and scored networks, and assess the relative performance of the algorithms on these networks. Further, we study the impact of augmenting physical datasets with computationally inferred interactions for complex detection. Finally, we analyse the essentiality of proteins within predicted complexes to understand a possible correlation between protein essentiality and their ability to form complexes. CONCLUSIONS We demonstrate that core-attachment based refinement in MCL-CAw improves the predictions of MCL on yeast PPI networks. We show that affinity scoring improves the performance of MCL-CAw.
Affiliation(s)
- Sriganesh Srihari
- Department of Computer Science, National University of Singapore, 117590, Singapore
- Kang Ning
- Department of Pathology, University of Michigan, Ann Arbor, MI 48109, USA
- Qingdao Institute of Bioenergy and Bioprocess Technology, Qingdao 266101, China
- Hon Wai Leong
- Department of Computer Science, National University of Singapore, 117590, Singapore
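For readers unfamiliar with the Markov Clustering step that MCL-CAw refines, the following is a minimal sketch of plain MCL (expansion followed by inflation) on a small weighted network. It does not implement the core-attachment refinement described in the abstract. The self-loop weight, inflation value, iteration count, read-out threshold, and toy network are all assumptions chosen for illustration.

```python
def mcl(adjacency, inflation=2.0, iterations=50):
    """Minimal Markov Clustering on a symmetric weighted adjacency matrix.

    adjacency: list of lists of non-negative edge weights.
    Returns a set of frozensets of node indices, one per cluster.
    """
    n = len(adjacency)
    # Add self-loops (a common MCL convention) before normalizing.
    m = [[adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]

    def normalize(mat):
        # Make every column sum to 1 so columns are transition probabilities.
        for j in range(n):
            col_sum = sum(mat[i][j] for i in range(n))
            for i in range(n):
                mat[i][j] /= col_sum
        return mat

    m = normalize(m)
    for _ in range(iterations):
        # Expansion: square the matrix, i.e. take two-step random walks.
        m = [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
             for i in range(n)]
        # Inflation: raise entries to a power, then renormalize the columns.
        m = normalize([[x ** inflation for x in row] for row in m])
    # Rows that keep non-negligible mass act as attractors; the columns they
    # reach form one cluster.
    clusters = {frozenset(j for j in range(n) if m[i][j] > 1e-6) for i in range(n)}
    clusters.discard(frozenset())
    return clusters

# Toy weighted network: two triangles joined by one weak edge (weight 0.1).
two_triangles = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0.1, 0, 0],
    [0, 0, 0.1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
print(sorted(sorted(c) for c in mcl(two_triangles)))
```

Higher inflation values tend to produce more, smaller clusters. A production implementation would use sparse matrices and a convergence check rather than a fixed iteration count.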
|
133
|
Habibi M, Eslahchi C, Wong L. Protein complex prediction based on k-connected subgraphs in protein interaction network. BMC SYSTEMS BIOLOGY 2010; 4:129. [PMID: 20846398 PMCID: PMC2949670 DOI: 10.1186/1752-0509-4-129] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 02/06/2010] [Accepted: 09/16/2010] [Indexed: 11/10/2022]
Abstract
Background Protein complexes play an important role in cellular mechanisms. Recently, several methods have been presented to predict protein complexes in a protein interaction network. In these methods, a protein complex is predicted as a dense subgraph of protein interactions. However, interactions data are incomplete and a protein complex does not have to be a complete or dense subgraph. Results We propose a more appropriate protein complex prediction method, CFA, that is based on connectivity number on subgraphs. We evaluate CFA using several protein interaction networks on reference protein complexes in two benchmark data sets (MIPS and Aloy), containing 1142 and 61 known complexes respectively. We compare CFA to some existing protein complex prediction methods (CMC, MCL, PCP and RNSC) in terms of recall and precision. We show that CFA predicts more complexes correctly at a competitive level of precision. Conclusions Many real complexes with different connectivity level in protein interaction network can be predicted based on connectivity number. Our CFA program and results are freely available from http://www.bioinf.cs.ipm.ir/softwares/cfa/CFA.rar.
Affiliation(s)
- Mahnaz Habibi
- Faculty of Mathematics, Shahid-Beheshti University, gc, Tehran, Iran
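The abstract above relies on the connectivity number of subgraphs rather than on density. As a rough illustration of what k-connectivity means (and not of the CFA algorithm itself), the brute-force sketch below tests whether a small subgraph stays connected after removing any set of fewer than k vertices. The function names and toy graphs are hypothetical, and the exhaustive search is only practical for small candidate subgraphs.

```python
from itertools import combinations

def is_connected(nodes, edges):
    """Return True if the subgraph induced on `nodes` is connected."""
    nodes = set(nodes)
    if not nodes:
        return False
    adj = {v: set() for v in nodes}
    for a, b in edges:
        if a in nodes and b in nodes:
            adj[a].add(b)
            adj[b].add(a)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:  # depth-first search from an arbitrary start node
        v = stack.pop()
        for w in adj[v] - seen:
            seen.add(w)
            stack.append(w)
    return seen == nodes

def is_k_connected(nodes, edges, k):
    """Vertex k-connectivity: more than k vertices, and removing any
    set of fewer than k vertices leaves the graph connected."""
    nodes = set(nodes)
    if len(nodes) <= k:
        return False
    for r in range(k):
        for removed in combinations(nodes, r):
            if not is_connected(nodes - set(removed), edges):
                return False
    return True

# A 4-cycle is sparse (4 of 6 possible edges) yet 2-connected.
cycle_nodes = [1, 2, 3, 4]
cycle_edges = [(1, 2), (2, 3), (3, 4), (4, 1)]
print(is_k_connected(cycle_nodes, cycle_edges, 2))
print(is_k_connected(cycle_nodes, cycle_edges, 3))
```

The 4-cycle example shows why connectivity and density can disagree: a subgraph may survive the loss of any single protein without being close to a clique.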
|
134
|
Ozawa Y, Saito R, Fujimori S, Kashima H, Ishizaka M, Yanagawa H, Miyamoto-Sato E, Tomita M. Protein complex prediction via verifying and reconstructing the topology of domain-domain interactions. BMC Bioinformatics 2010; 11:350. [PMID: 20584269 PMCID: PMC2905371 DOI: 10.1186/1471-2105-11-350] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/09/2009] [Accepted: 06/28/2010] [Indexed: 11/10/2022] Open
Abstract
Background High-throughput methods for detecting protein-protein interactions enable us to obtain large interaction networks, and also allow us to computationally identify the associations of proteins as protein complexes. Although there are methods to extract protein complexes as sets of proteins from interaction networks, the extracted complexes may include false positives because they do not account for the structural limitations of the proteins and thus do not check that the proteins in the extracted complex can simultaneously bind to each other. In addition, there have been few searches for deeper insights into the protein complexes, such as of the topology of the protein-protein interactions or into the domain-domain interactions that mediate the protein interactions. Results Here, we introduce a combinatorial approach for prediction of protein complexes focusing not only on determining member proteins in complexes but also on the DDI/PPI organization of the complexes. Our method analyzes complex candidates predicted by the existing methods. It searches for optimal combinations of domain-domain interactions in the candidates based on an assumption that the proteins in a candidate can form a true protein complex if each of the domains is used by a single protein interaction. This optimization problem was mathematically formulated and solved using binary integer linear programming. By using publicly available sets of yeast protein-protein interactions and domain-domain interactions, we succeeded in extracting protein complex candidates with an accuracy that is twice the average accuracy of the existing methods, MCL, MCODE, or clustering coefficient. Although the configuring parameters for each algorithm resulted in slightly improved precisions, our method always showed better precision for most values of the parameters. 
Conclusions Our combinatorial approach can provide better accuracy for prediction of protein complexes and also enables the identification of both direct PPIs and the DDIs that mediate them in complexes.
Affiliation(s)
- Yosuke Ozawa
- Institute for Advanced Biosciences, Keio University, 403-1, Daihoji, Tsuruoka, Yamagata 997-0017, Japan
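The constraint described in the abstract above, that each domain is used by a single protein interaction, can be illustrated on a toy complex. The authors solve their formulation with binary integer linear programming; the sketch below swaps in exhaustive search, and its objective (support as many PPIs as possible) is an assumption made for illustration. The protein and domain labels are hypothetical.

```python
from itertools import product

def best_domain_assignment(ppi_options):
    """Pick at most one domain-domain interaction (DDI) per PPI so that no
    domain is used twice, maximizing the number of PPIs that get a DDI.

    ppi_options: dict mapping a PPI label to a list of candidate DDIs,
                 each a (domain_a, domain_b) tuple of domain identifiers.
    Returns a dict mapping each supported PPI to its chosen DDI.
    """
    ppis = list(ppi_options)
    best = {}
    # None means "leave this PPI unsupported"; try every combination.
    for choice in product(*[[None] + list(ppi_options[p]) for p in ppis]):
        used = [d for ddi in choice if ddi is not None for d in ddi]
        if len(used) != len(set(used)):
            continue  # some domain would serve two interactions
        assignment = {p: ddi for p, ddi in zip(ppis, choice) if ddi is not None}
        if len(assignment) > len(best):
            best = assignment
    return best

# Toy candidate complex: protein B has two domains, A and C have one each.
options = {
    "A-B": [("A:d1", "B:d1")],
    "B-C": [("B:d1", "C:d1"), ("B:d2", "C:d1")],
    "A-C": [("A:d1", "C:d1")],
}
print(best_domain_assignment(options))
```

In this toy case A-B and B-C can coexist only if B-C uses the second domain of B, and A-C is left unsupported because both of its domains are already occupied. Exhaustive search grows exponentially with the number of PPIs, which is why an integer programming solver is the practical choice for real candidates.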
|
135
|
Abstract
With the advent of Systems Biology, the prediction of whether two proteins form a complex has become a problem of increased importance. A variety of experimental techniques have been applied to the problem, but three-dimensional structural information has not been widely exploited. Here we explore the range of applicability of such information by analyzing the extent to which the location of binding sites on protein surfaces is conserved among structural neighbors. We find, as expected, that interface conservation is most significant among proteins that have a clear evolutionary relationship, but that there is a significant level of conservation even among remote structural neighbors. This finding is consistent with recent evidence that information available from structural neighbors, independent of classification, should be exploited in the search for functional insights. The value of such structural information is highlighted through the development of a new protein interface prediction method, PredUs, that identifies what residues on protein surfaces are likely to participate in complexes with other proteins. The performance of PredUs, as measured through comparisons with other methods, suggests that relationships across protein structure space can be successfully exploited in the prediction of protein-protein interactions.
|
136
|
Lo YS, Lin CY, Yang JM. PCFamily: a web server for searching homologous protein complexes. Nucleic Acids Res 2010; 38:W516-22. [PMID: 20511590 PMCID: PMC2896147 DOI: 10.1093/nar/gkq464] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/02/2023] Open
Abstract
The proteins in a cell often assemble into complexes to carry out their functions and play an essential role in biological processes. The PCFamily server identifies template-based homologous protein complexes [called protein complex family (PCF)] and infers functional modules of the query proteins. This server first finds homologous structure complexes of the query using BLASTP to search the structural template database (11 263 complexes). PCFamily then searches the homologous complexes of the templates (query) from a complete genomic database (Integr8 with 6 352 363 protein sequences in 2274 species). According to these homologous complexes across multiple species, this server infers binding models (e.g. hydrogen-bonds and conserved amino acids in the interfaces), functional modules, and the conserved interacting domains and Gene Ontology annotations of the PCF. Experimental results demonstrate that the PCFamily server can be useful for binding model visualizations and annotating the query proteins. We believe that the server is able to provide valuable insights for determining functional modules of biological networks across multiple species. The PCFamily server is available at http://pcfamily.life.nctu.edu.tw.
Collapse
Affiliation(s)
- Yu-Shu Lo
- Institute of Bioinformatics and Systems Biology, Department of Biological Science and Technology and Core Facility for Structural Bioinformatics, National Chiao Tung University, Hsinchu 30050, Taiwan
|
137
|
Vendruscolo M, Dobson CM. Quantitative approaches to defining normal and aberrant protein homeostasis. Faraday Discuss 2010; 143:277-91; discussion 359-72. [PMID: 20334107 DOI: 10.1039/b905825g] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 11/21/2022]
Abstract
Protein homeostasis refers to the ability of cells to generate and regulate the levels of their constituent proteins in terms of conformations, interactions, concentrations and cellular localisation. We discuss here an approach in which physico-chemical properties of proteins and their environments are used to understand the underlying principles governing this process, which is crucial in all living systems. By adopting the strategy of characterising the origins of specific diseases to inform us about normal biology, we are bringing together methods and concepts from chemistry, physics, engineering, genetics and medicine. In particular, we are using a combination of in vitro, in silico and in vivo approaches to study protein homeostasis through the analysis of the effects that result from its perturbation in a select group of specific proteins, from either amino acid mutations, or changes in concentration and solubility, or interactions with other molecules. By developing a coherent and quantitative description of such phenomena, we are finding that it is possible to shed new light on how the physical and chemical properties of the cellular components can provide an understanding of the normal and aberrant behaviour of living systems. Through such an approach it is possible to provide new insights into the origin and consequences of the failure to maintain homeostasis that is associated with neurodegenerative diseases, in particular, and the phenomenon of ageing, in general, and hence provide a framework for the rational design of therapeutic approaches.
Affiliation(s)
- Michele Vendruscolo
- Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge, UK CB2 1EW.
|
138
|
Kaake RM, Wang X, Huang L. Profiling of protein interaction networks of protein complexes using affinity purification and quantitative mass spectrometry. Mol Cell Proteomics 2010; 9:1650-65. [PMID: 20445003 DOI: 10.1074/mcp.r110.000265] [Citation(s) in RCA: 83] [Impact Index Per Article: 5.9] [Indexed: 11/06/2022] Open
Abstract
Protein-protein interactions are important for nearly all biological processes, and it is known that aberrant protein-protein interactions can lead to human disease and cancer. Recent evidence has suggested that protein interaction interfaces describe a new class of attractive targets for drug development. Full characterization of protein interaction networks of protein complexes and their dynamics in response to various cellular cues will provide essential information for us to understand how protein complexes work together in cells to maintain cell viability and normal homeostasis. Affinity purification coupled with quantitative mass spectrometry has become the primary method for studying in vivo protein interactions of protein complexes and whole organism proteomes. Recent developments in sample preparation and affinity purification strategies allow the capture, identification, and quantification of protein interactions of protein complexes that are stable, dynamic, transient, and/or weak. Current efforts have mainly focused on generating reliable, reproducible, and high confidence protein interaction data sets for functional characterization. The availability of increasing amounts of information on protein interactions in eukaryotic systems and new bioinformatics tools allow functional analysis of quantitative protein interaction data to unravel the biological significance of the identified protein interactions. Existing studies in this area have laid a solid foundation toward generating a complete map of in vivo protein interaction networks of protein complexes in cells or tissues.
Affiliation(s)
- Robyn M Kaake
- Department of Physiology and Biophysics, University of California, Irvine, California 92697-4560, USA
|
139
|
Li X, Wu M, Kwoh CK, Ng SK. Computational approaches for detecting protein complexes from protein interaction networks: a survey. BMC Genomics 2010; 11 Suppl 1:S3. [PMID: 20158874 PMCID: PMC2822531 DOI: 10.1186/1471-2164-11-s1-s3] [Citation(s) in RCA: 173] [Impact Index Per Article: 12.4] [Indexed: 11/10/2022] Open
Abstract
Background Most proteins form macromolecular complexes to perform their biological functions. However, experimentally determined protein complex data, especially of those involving more than two protein partners, are relatively limited in the current state-of-the-art high-throughput experimental techniques. Nevertheless, many techniques (such as yeast-two-hybrid) have enabled systematic screening of pairwise protein-protein interactions en masse. Thus computational approaches for detecting protein complexes from protein interaction data are useful complements to the limited experimental methods. They can be used together with the experimental methods for mapping the interactions of proteins to understand how different proteins are organized into higher-level substructures to perform various cellular functions. Results Given the abundance of pairwise protein interaction data from high-throughput genome-wide experimental screenings, a protein interaction network can be constructed from protein interaction data by considering individual proteins as the nodes, and the existence of a physical interaction between a pair of proteins as a link. This binary protein interaction graph can then be used for detecting protein complexes using graph clustering techniques. In this paper, we review and evaluate the state-of-the-art techniques for computational detection of protein complexes, and discuss some promising research directions in this field. Conclusions Experimental results with yeast protein interaction data show that the interaction subgraphs discovered by various computational methods matched well with actual protein complexes. In addition, the computational approaches have also improved in performance over the years. 
Further improvements could be achieved if the quality of the underlying protein interaction data can be considered adequately to minimize the undesirable effects from the irrelevant and noisy sources, and the various biological evidences can be better incorporated into the detection process to maximize the exploitation of the increasing wealth of biological knowledge available.
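The graph construction described in this abstract (proteins as nodes, physical interactions as links) can be sketched as follows. This is a minimal illustration, not a method taken from the survey: the function names `build_ppi_graph` and `density`, the protein labels, and the choice of edge density as the score for a candidate complex are all assumptions made for the example.

```python
from collections import defaultdict
from itertools import combinations


def build_ppi_graph(interactions):
    """Build an undirected binary interaction graph from pairwise interactions.

    interactions: iterable of (protein_a, protein_b) pairs (hypothetical labels).
    Returns a dict mapping each protein to the set of its interaction partners.
    """
    graph = defaultdict(set)
    for a, b in interactions:
        if a != b:  # ignore self-interactions in this sketch
            graph[a].add(b)
            graph[b].add(a)
    return graph


def density(graph, nodes):
    """Fraction of possible links that are present among the given nodes."""
    nodes = set(nodes)
    n = len(nodes)
    if n < 2:
        return 0.0
    present = sum(
        1 for a, b in combinations(nodes, 2) if b in graph.get(a, ())
    )
    return present / (n * (n - 1) / 2)


# Hypothetical pairwise interaction data, e.g. from a two-hybrid screen.
pairs = [("P1", "P2"), ("P2", "P3"), ("P1", "P3"), ("P3", "P4")]
ppi = build_ppi_graph(pairs)
triangle_density = density(ppi, ["P1", "P2", "P3"])
whole_density = density(ppi, ["P1", "P2", "P3", "P4"])
```

A densely connected node set such as the P1, P2, P3 triangle scores higher than the full set that includes the loosely attached P4, which is the intuition behind density-based complex detection.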
Affiliation(s)
- Xiaoli Li
- Institute for Infocomm Research, 1 Fusionopolis Way, Singapore.
|
140
|
Chin CH, Chen SH, Ho CW, Ko MT, Lin CY. A hub-attachment based method to detect functional modules from confidence-scored protein interactions and expression profiles. BMC Bioinformatics 2010; 11 Suppl 1:S25. [PMID: 20122197 PMCID: PMC3009496 DOI: 10.1186/1471-2105-11-s1-s25] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.5] [Indexed: 12/11/2022] Open
Abstract
Background Many research results show that biological systems are composed of functional modules. Members of the same module usually have common functions. This is useful information for understanding how biological systems work. Therefore, detecting functional modules is an important research topic in the post-genome era. One approach to detecting functional modules is to find dense regions in Protein-Protein Interaction (PPI) networks. Most current methods neglect the confidence scores of interactions and pay little attention to using gene expression data to improve their results. Results In this paper, we propose a novel hub-attachment based method to detect functional modules from confidence-scored protein interactions and expression profiles, and we name it HUNTER. Our method not only extracts functional modules from a weighted PPI network, but can also use gene expression data as optional input to increase the quality of the outcomes. Using HUNTER on yeast data, we found that it can discover more novel components related to the RNA polymerase complex from the yeast interactome than existing methods can. Functional analysis with Gene Ontology shows that these new components are closely related to the polymerase. Conclusion A C++ implementation of our prediction method, dataset and supplementary material are available at http://hub.iis.sinica.edu.tw/Hunter/. Our proposed HUNTER method has been applied to yeast data, and the empirical results show that it can accurately identify functional modules. Such applications of our algorithm can help reconstruct the biological machinery, identify undiscovered components and decipher common sub-modules inside complexes such as RNA polymerases I, II and III.
Affiliation(s)
- Chia-Hao Chin
- Institute of Information Science, Academia Sinica, No, 128 Yan-Chiu-Yuan Rd, Sec, 2, Taipei 115, Taiwan.
|
141
|
Abstract
The quaternary structure (QS) of a protein is determined by measuring its molecular weight in solution. The data have to be extracted from the literature, and they may be missing even for proteins that have a crystal structure reported in the Protein Data Bank (PDB). The PDB and other databases derived from it report QS information that either was obtained from the depositors or is based on an analysis of the contacts between polypeptide chains in the crystal, and this frequently differs from the QS determined in solution. The QS of a protein can be predicted from its sequence using either homology or threading methods. However, a majority of the proteins with less than 30% sequence identity have different QSs. A model of the QS can also be derived by docking the subunits when their 3D structure is independently known, but the model is likely to be incorrect if large conformation changes take place when the oligomer assembles.
Affiliation(s)
- Anne Poupon
- Yeast Structural Genomics, IBBMC UMR 8619 CNRS, Université Paris-Sud, Orsay, France
|
142
|
Böttcher B, Hipp K. Single-particle applications at intermediate resolution. ADVANCES IN PROTEIN CHEMISTRY AND STRUCTURAL BIOLOGY 2010; 81:61-88. [PMID: 21115173 DOI: 10.1016/b978-0-12-381357-2.00003-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 12/30/2022]
Abstract
Electron microscopy together with single-particle image processing is an excellent method for structure determination of biological assemblies that exist in multiple identical copies. Typical assemblies contain several proteins and/or nucleic acids in a defined and reproducible arrangement. Coherent averaging of electron microscopic images of 5000-100,000 copies of these assemblies allows the determination of three-dimensional structures at ca. 1-3-nm resolution. At this intermediate resolution, it is possible to map individual subunits and thus to understand the architecture and quaternary structure of the assemblies. The intermediate resolution structural information gives a solid basis on which pseudo-atomic models of the assemblies can be modeled provided that high-resolution structures of smaller entities are known. The architecture of the assemblies, their pseudo-atomic models, and knowledge on their plasticity during function give a comprehensive understanding of large-scale structural dynamics of multicopy biological complexes. In this review, we will introduce the experimental pipeline and discuss selected examples.
|
143
|
|
144
|
Ding B, LeJeune D, Li S. The C-terminal repeat domain of Spt5 plays an important role in suppression of Rad26-independent transcription coupled repair. J Biol Chem 2009; 285:5317-26. [PMID: 20042611 DOI: 10.1074/jbc.m109.082818] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.7] [Indexed: 01/19/2023] Open
Abstract
In eukaryotic cells, transcription coupled nucleotide excision repair (TCR) is believed to be initiated by RNA polymerase II (Pol II) stalled at a lesion in the transcribed strand of a gene. Rad26, the yeast homolog of the human Cockayne syndrome group B (CSB) protein, plays an important role in TCR. Spt4, a transcription elongation factor that forms a complex with Spt5, has been shown to suppress TCR in rad26Δ cells. Here we present evidence that Spt4 indirectly suppresses Rad26-independent TCR by protecting Spt5 from degradation and stabilizing the interaction of Spt5 with Pol II. We further found that the C-terminal repeat (CTR) domain of Spt5, which is dispensable for cell viability and is not involved in interactions with Spt4 and Pol II, plays an important role in the suppression. The Spt5 CTR is phosphorylated by the Bur kinase. Inactivation of the Bur kinase partially alleviates TCR in rad26Δ cells. We propose that the Spt5 CTR suppresses Rad26-independent TCR by serving as a platform for assembly of a multiple protein suppressor complex that is associated with Pol II. Phosphorylation of the Spt5 CTR by the Bur kinase may facilitate the assembly of the suppressor complex.
Affiliation(s)
- Baojin Ding
- Department of Comparative Biomedical Sciences, Louisiana State University, Baton Rouge, Louisiana 70803, USA
|
145
|
Kar G, Gursoy A, Keskin O. Human cancer protein-protein interaction network: a structural perspective. PLoS Comput Biol 2009; 5:e1000601. [PMID: 20011507 PMCID: PMC2785480 DOI: 10.1371/journal.pcbi.1000601] [Citation(s) in RCA: 144] [Impact Index Per Article: 9.6] [Received: 06/03/2009] [Accepted: 11/05/2009] [Indexed: 01/12/2023] Open
Abstract
Protein-protein interaction networks provide a global picture of cellular function and biological processes. Some proteins act as hub proteins, highly connected to others, whereas some others have few interactions. The dysfunction of some interactions causes many diseases, including cancer. Proteins interact through their interfaces. Therefore, studying the interface properties of cancer-related proteins will help explain their role in the interaction networks. Similar or overlapping binding sites should be used repeatedly in single interface hub proteins, making them promiscuous. Alternatively, multi-interface hub proteins make use of several distinct binding sites to bind to different partners. We propose a methodology to integrate protein interfaces into cancer interaction networks (ciSPIN, cancer structural protein interface network). The interactions in the human protein interaction network are replaced by interfaces, coming from either known or predicted complexes. We provide a detailed analysis of cancer related human protein-protein interfaces and the topological properties of the cancer network. The results reveal that cancer-related proteins have smaller, more planar, more charged and less hydrophobic binding sites than non-cancer proteins, which may indicate low affinity and high specificity of the cancer-related interactions. We also classified the genes in ciSPIN according to phenotypes. Within phenotypes, for breast cancer, colorectal cancer and leukemia, interface properties were found to be discriminating from non-cancer interfaces with an accuracy of 71%, 67%, 61%, respectively. In addition, cancer-related proteins tend to interact with their partners through distinct interfaces, corresponding mostly to multi-interface hubs, which comprise 56% of cancer-related proteins, and constituting the nodes with higher essentiality in the network (76%). 
We illustrate the interface related affinity properties of two cancer-related hub proteins: Erbb3, a multi interface, and Raf1, a single interface hub. The results reveal that affinity of interactions of the multi-interface hub tends to be higher than that of the single-interface hub. These findings might be important in obtaining new targets in cancer as well as finding the details of specific binding regions of putative cancer drug candidates. Protein-protein interaction networks provide a global picture of cellular function and biological processes. The dysfunction of some interactions causes many diseases, including cancer. Proteins interact through their interfaces. Therefore, studying the interface properties of cancer-related proteins will help explain their role in the interaction networks. The structural details of interfaces are immensely useful in efforts to answer some fundamental questions such as: (i) what features of cancer-related protein interfaces make them act as hubs; (ii) how hub protein interfaces can interact with tens of other proteins with varying affinities; and (iii) which interactions can occur simultaneously and which are mutually exclusive. Addressing these questions, we propose a method to characterize interactions in a human protein-protein interaction network using three-dimensional protein structures and interfaces. Protein interface analysis shows that the strength and specificity of the interactions of hub proteins and cancer proteins are different than the interactions of non-hub and non-cancer proteins, respectively. In addition, distinguishing overlapping from non-overlapping interfaces, we illustrate how a fourth dimension, that of the sequence of processes, is integrated into the network with case studies. We believe that such an approach should be useful in structural systems biology.
Affiliation(s)
- Gozde Kar
- Center for Computational Biology and Bioinformatics and College of Engineering, Koc University, Rumeli Feneri Yolu, Sariyer Istanbul, Turkey
|
146
|
|
147
|
Tyagi M, Shoemaker BA, Bryant SH, Panchenko AR. Exploring functional roles of multibinding protein interfaces. Protein Sci 2009; 18:1674-83. [PMID: 19591200 DOI: 10.1002/pro.181] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.7] [Indexed: 01/21/2023]
Abstract
Cellular processes are highly interconnected and many proteins are shared in different pathways. Some of these shared proteins or protein families may interact with diverse partners using the same interface regions; such multibinding proteins are the subject of our study. The main goal of our study is to attempt to decipher the mechanisms of specific molecular recognition of multiple diverse partners by promiscuous protein regions. To address this, we attempt to analyze the physicochemical properties of multibinding interfaces and highlight the major mechanisms of functional switches realized through multibinding. We find that only 5% of protein families in the structure database have multibinding interfaces, and multibinding interfaces do not show any higher sequence conservation compared with the background interface sites. We highlight several important functional mechanisms utilized by multibinding families. (a) Overlap between different functional pathways can be prevented by the switches involving nearby residues of the same interfacial region. (b) Interfaces can be reused in pathways where the substrate should be passed from one protein to another sequentially. (c) The same protein family can develop different specificities toward different binding partners reusing the same interface; and finally, (d) inhibitors can attach to substrate binding sites as substrate mimicry and thereby prevent substrate binding.
Affiliation(s)
- Manoj Tyagi
- Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA
|
148
|
Survey of large protein complexes in D. vulgaris reveals great structural diversity. Proc Natl Acad Sci U S A 2009; 106:16580-5. [PMID: 19805340 DOI: 10.1073/pnas.0813068106] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.7] [Indexed: 11/18/2022] Open
Abstract
An unbiased survey has been made of the stable, most abundant multi-protein complexes in Desulfovibrio vulgaris Hildenborough (DvH) that are larger than Mr approximately 400 k. The quaternary structures for 8 of the 16 complexes purified during this work were determined by single-particle reconstruction of negatively stained specimens, a success rate approximately 10 times greater than that of previous "proteomic" screens. In addition, the subunit compositions and stoichiometries of the remaining complexes were determined by biochemical methods. Our data show that the structures of only two of these large complexes, out of the 13 in this set that have recognizable functions, can be modeled with confidence based on the structures of known homologs. These results indicate that there is significantly greater variability in the way that homologous prokaryotic macromolecular complexes are assembled than has generally been appreciated. As a consequence, we suggest that relying solely on previously determined quaternary structures for homologous proteins may not be sufficient to properly understand their role in another cell of interest.
|
149
|
Pushing structural information into the yeast interactome by high-throughput protein docking experiments. PLoS Comput Biol 2009; 5:e1000490. [PMID: 19714207 PMCID: PMC2722787 DOI: 10.1371/journal.pcbi.1000490] [Citation(s) in RCA: 60] [Impact Index Per Article: 4.0] [Received: 04/23/2009] [Accepted: 07/28/2009] [Indexed: 11/19/2022] Open
Abstract
The last several years have seen the consolidation of high-throughput proteomics initiatives to identify and characterize protein interactions and macromolecular complexes in model organisms. In particular, more than 10,000 high-confidence protein-protein interactions have been described between the roughly 6,000 proteins encoded in the budding yeast genome (Saccharomyces cerevisiae). However, unfortunately, high-resolution three-dimensional structures are only available for less than one hundred of these interacting pairs. Here, we expand this structural information on yeast protein interactions by running the first-ever high-throughput docking experiment with some of the best state-of-the-art methodologies, according to our benchmarks. To increase the coverage of the interaction space, we also explore the possibility of using homology models of varying quality in the docking experiments, instead of experimental structures, and assess how it would affect the global performance of the methods. In total, we have applied the docking procedure to 217 experimental structures and 1,023 homology models, providing putative structural models for over 3,000 protein-protein interactions in the yeast interactome. Finally, we analyze in detail the structural models obtained for the interaction between SAM1-anthranilate synthase complex and the MET30-RNA polymerase III to illustrate how our predictions can be straightforwardly used by the scientific community. The results of our experiment will be integrated into the general 3D-Repertoire pipeline, a European initiative to solve the structures of as many protein complexes in yeast as possible at the best possible resolution. All docking results are available at http://gatealoy.pcb.ub.es/HT_docking/. Proteins are the main perpetrators of most biological processes. 
However, they seldom act alone, and most cellular functions are, in fact, carried out by large macromolecular complexes and regulated through intricate protein-protein interaction networks. Consequently, large efforts have been devoted to unveil protein interrelationships in a high-throughput manner, and the last several years have seen the consecution of the first interactome drafts for several model organisms. Unfortunately, these studies only reveal whether two proteins interact, but not the molecular bases of these interactions. A full comprehension of how proteins bind and form complexes can only come from high-resolution, three-dimensional (3D) structures, since they provide the key quasi-atomic details necessary to understand how the individual components in a complex or pathway are assembled and coordinated to function as a molecular unit. Here, we use protein docking experiments, in a high-throughput manner, to predict the 3D structure of over 3,000 interactions in yeast, which will be used to complement the complex structures obtained within the 3D-Repertoire pan-European initiative (http://www.3drepertoire.org).
|
150
|
Friedel CC, Zimmer R. Identifying the topology of protein complexes from affinity purification assays. Bioinformatics 2009; 25:2140-6. [PMID: 19505940 PMCID: PMC2723003 DOI: 10.1093/bioinformatics/btp353] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.0] [Received: 10/21/2008] [Revised: 04/20/2009] [Accepted: 06/01/2009] [Indexed: 01/10/2023] Open
Abstract
MOTIVATION Recent advances in high-throughput technologies have made it possible to investigate not only individual protein interactions, but also the association of these proteins in complexes. So far the focus has been on the prediction of complexes as sets of proteins from the experimental results. The modular substructure and the physical interactions within the protein complexes have been mostly ignored. RESULTS We present an approach for identifying the direct physical interactions and the subcomponent structure of protein complexes predicted from affinity purification assays. Our algorithm calculates the union of all maximum spanning trees from scoring networks for each protein complex to extract relevant interactions. In a subsequent step this network is extended to interactions which are not accounted for by alternative indirect paths. We show that the interactions identified with this approach are more accurate in predicting experimentally derived physical interactions than baseline approaches. Based on these networks, the subcomponent structure of the complexes can be resolved more satisfactorily and subcomplexes can be identified. The usefulness of our method is illustrated on the RNA polymerases for which the modular substructure can be successfully reconstructed. AVAILABILITY A Java implementation of the prediction methods and supplementary material are available at http://www.bio.ifi.lmu.de/Complexes/Substructures/.
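The core step named in this abstract, taking the union of all maximum spanning trees of a scored network, can be sketched as below. This is an illustrative reconstruction and not the authors' Java implementation: the function name, the edge format `(u, v, weight)`, and the example scores are assumptions. It relies on the standard property that an edge lies in some maximum spanning tree exactly when its endpoints are not already connected by strictly heavier edges.

```python
from itertools import groupby


def union_of_maximum_spanning_trees(edges):
    """Return every edge that occurs in at least one maximum spanning tree.

    edges: iterable of (u, v, weight) tuples for an undirected scored network.
    Works per connected component, so the result is a union of spanning forests
    if the network is disconnected.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    result = set()
    ordered = sorted(edges, key=lambda e: e[2], reverse=True)
    # Process edges in groups of equal weight, heaviest first.
    for _, group in groupby(ordered, key=lambda e: e[2]):
        group = list(group)
        # Keep an edge if strictly heavier edges do not yet connect its ends.
        for u, v, w in group:
            if find(u) != find(v):
                result.add((u, v, w))
        # Only now merge components, so ties are all kept.
        for u, v, _ in group:
            parent[find(u)] = find(v)
    return result


# Hypothetical interaction scores within one predicted complex.
scores = [("A", "B", 0.9), ("B", "C", 0.8), ("A", "C", 0.8), ("C", "D", 0.3)]
backbone = union_of_maximum_spanning_trees(scores)
```

Because B-C and A-C tie at 0.8, each belongs to a different maximum spanning tree, and both are kept in the union. The later extension step mentioned in the abstract (adding interactions not explained by indirect paths) is not covered by this sketch.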
Affiliation(s)
- Caroline C Friedel
- Institut für Informatik, Ludwig-Maximilians-Universität München, Amalienstrasse 17, 80333 München, Germany.
|