51
Kirys T, Ruvinsky AM, Singla D, Tuzikov AV, Kundrotas PJ, Vakser IA. Simulated unbound structures for benchmarking of protein docking in the DOCKGROUND resource. BMC Bioinformatics 2015; 16:243. [PMID: 26227548 PMCID: PMC4521349 DOI: 10.1186/s12859-015-0672-3] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 03/23/2015] [Accepted: 07/10/2015] [Indexed: 11/10/2022] Open
Abstract
Background: Proteins play an important role in biological processes in living organisms. Many protein functions are based on interaction with other proteins. The structural information is important for adequate description of these interactions. Sets of protein structures determined in both bound and unbound states are essential for benchmarking of the docking procedures. However, the number of such proteins in PDB is relatively small. A radical expansion of such sets is possible if the unbound structures are computationally simulated.
Results: The Dockground public resource provides data to improve our understanding of protein–protein interactions and to assist in the development of better tools for structural modeling of protein complexes, such as docking algorithms and scoring functions. A large set of simulated unbound protein structures was generated from the bound structures. The modeling protocol was based on 1 ns Langevin dynamics simulation. The simulated structures were validated on the ensemble of experimentally determined unbound and bound structures. The set is intended for large scale benchmarking of docking algorithms and scoring functions.
Conclusions: A radical expansion of the unbound protein docking benchmark set was achieved by simulating the unbound structures. The simulated unbound structures were selected according to criteria from systematic comparison of experimentally determined bound and unbound structures. The set is publicly available at http://dockground.compbio.ku.edu.
Affiliation(s)
- Tatsiana Kirys
  - Center for Computational Biology, The University of Kansas, Lawrence, KS, 66047, USA.
  - United Institute of Informatics Problems, National Academy of Sciences, 220012, Minsk, Belarus.
- Anatoly M Ruvinsky
  - Center for Computational Biology, The University of Kansas, Lawrence, KS, 66047, USA.
  - Schrödinger, Inc., Cambridge, MA, 02142, USA.
- Deepak Singla
  - Center for Computational Biology, The University of Kansas, Lawrence, KS, 66047, USA.
- Alexander V Tuzikov
  - United Institute of Informatics Problems, National Academy of Sciences, 220012, Minsk, Belarus.
- Petras J Kundrotas
  - Center for Computational Biology, The University of Kansas, Lawrence, KS, 66047, USA.
- Ilya A Vakser
  - Center for Computational Biology, The University of Kansas, Lawrence, KS, 66047, USA.
  - Department of Molecular Biosciences, The University of Kansas, Lawrence, KS, 66045, USA.
52
Anishchenko I, Kundrotas PJ, Tuzikov AV, Vakser IA. Structural templates for comparative protein docking. Proteins 2015; 83:1563-70. [PMID: 25488330 DOI: 10.1002/prot.24736] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Received: 09/23/2014] [Revised: 11/15/2014] [Accepted: 11/26/2014] [Indexed: 11/07/2022]
Abstract
Structural characterization of protein-protein interactions is important for understanding life processes. Because of the inherent limitations of experimental techniques, such characterization requires computational approaches. Along with the traditional protein-protein docking (free search for a match between two proteins), comparative (template-based) modeling of protein-protein complexes has been gaining popularity. Its development puts an emphasis on full and partial structural similarity between the target protein monomers and the protein-protein complexes previously determined by experimental techniques (templates). The template-based docking relies on the quality and diversity of the template set. We present a carefully curated, nonredundant library of templates containing 4950 full structures of binary complexes and 5936 protein-protein interfaces extracted from the full structures at 12 Å distance cut-off. Redundancy in the libraries was removed by clustering the PDB structures based on structural similarity. The value of the clustering threshold was determined from the analysis of the clusters and the docking performance on a benchmark set. High structural quality of the interfaces in the template and validation sets was achieved by automated procedures and manual curation. The library is included in the Dockground resource for molecular recognition studies at http://dockground.bioinformatics.ku.edu.
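The idea of extracting an interface at a distance cut-off, mentioned in the abstract above, can be pictured with a small Python sketch. This is an illustration under stated assumptions, not the Dockground procedure: the input format (lists of `(residue_id, (x, y, z))` atom records), the function name `interface_residues`, and the any-atom-pair criterion are hypothetical choices. Only the 12 Å default value comes from the abstract.

```python
from math import dist


def interface_residues(chain_a, chain_b, cutoff=12.0):
    """Return the residues of each chain that lie within `cutoff` of the other chain.

    chain_a, chain_b: lists of (residue_id, (x, y, z)) atom records (assumed format).
    A residue counts as interfacial if any of its atoms is within `cutoff`
    angstroms of any atom of the partner chain (assumed criterion).
    """
    hits_a, hits_b = set(), set()
    for res_a, xyz_a in chain_a:
        for res_b, xyz_b in chain_b:
            if dist(xyz_a, xyz_b) <= cutoff:
                hits_a.add(res_a)
                hits_b.add(res_b)
    return hits_a, hits_b


# Toy coordinates, invented for the example.
chain_a = [(1, (0.0, 0.0, 0.0)), (2, (30.0, 0.0, 0.0))]
chain_b = [(10, (5.0, 0.0, 0.0)), (11, (100.0, 0.0, 0.0))]
print(interface_residues(chain_a, chain_b))
```

The brute-force double loop is quadratic in the number of atoms; a real pipeline would more likely use a spatial index, but the simple form keeps the cut-off logic visible.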
Affiliation(s)
- Ivan Anishchenko
  - Center for Bioinformatics, The University of Kansas, Lawrence, Kansas, 66047
  - United Institute of Informatics Problems, National Academy of Sciences, Minsk, 220012, Belarus
- Petras J Kundrotas
  - Center for Bioinformatics, The University of Kansas, Lawrence, Kansas, 66047
- Alexander V Tuzikov
  - United Institute of Informatics Problems, National Academy of Sciences, Minsk, 220012, Belarus
- Ilya A Vakser
  - Center for Bioinformatics, The University of Kansas, Lawrence, Kansas, 66047
  - Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas, 66045
53
Aumentado-Armstrong TT, Istrate B, Murgita RA. Algorithmic approaches to protein-protein interaction site prediction. Algorithms Mol Biol 2015; 10:7. [PMID: 25713596 PMCID: PMC4338852 DOI: 10.1186/s13015-015-0033-9] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.9] [Received: 08/23/2014] [Accepted: 01/07/2015] [Indexed: 12/19/2022] Open
Abstract
Interaction sites on protein surfaces mediate virtually all biological activities, and their identification holds promise for disease treatment and drug design. Novel algorithmic approaches for the prediction of these sites have been produced at a rapid rate, and the field has seen significant advancement over the past decade. However, the most current methods have not yet been reviewed in a systematic and comprehensive fashion. Herein, we describe the intricacies of the biological theory, datasets, and features required for modern protein-protein interaction site (PPIS) prediction, and present an integrative analysis of the state-of-the-art algorithms and their performance. First, the major sources of data used by predictors are reviewed, including training sets, evaluation sets, and methods for their procurement. Then, the features employed and their importance in the biological characterization of PPISs are explored. This is followed by a discussion of the methodologies adopted in contemporary prediction programs, as well as their relative performance on the datasets most recently used for evaluation. In addition, the potential utility that PPIS identification holds for rational drug design, hotspot prediction, and computational molecular docking is described. Finally, an analysis of the most promising areas for future development of the field is presented.
54
Scrima N, Lepault J, Boulard Y, Pasdeloup D, Bressanelli S, Roche S. Insights into herpesvirus tegument organization from structural analyses of the 970 central residues of HSV-1 UL36 protein. J Biol Chem 2015; 290:8820-33. [PMID: 25678705 DOI: 10.1074/jbc.m114.612838] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 09/22/2014] [Indexed: 11/06/2022] Open
Abstract
The tegument of all herpesviruses contains a capsid-bound large protein that is essential for multiple viral processes, including capsid transport, decapsidation at the nuclear pore complex, particle assembly, and secondary envelopment, through mechanisms that are still incompletely understood. We report here a structural characterization of the central 970 residues of this protein for herpes simplex virus type 1 (HSV-1 UL36, 3164 residues). This large fragment is essentially a 34-nm-long monomeric fiber. The crystal structure of its C terminus shows an elongated domain-swapped dimer. Modeling and molecular dynamics simulations give a likely molecular organization for the monomeric form and extend our findings to alphaherpesvirinae. Hence, we propose that an essential feature of UL36 is the existence in its central region of a stalk capable of connecting capsid and membrane across the tegument and that the ability to switch between monomeric and dimeric forms may help UL36 fulfill its multiple functions.
Affiliation(s)
- Nathalie Scrima
  - Institute for Integrative Biology of the Cell (I2BC), 1 Avenue de la Terrasse, 91198 Gif-sur-Yvette
- Jean Lepault
  - Institute for Integrative Biology of the Cell (I2BC), 1 Avenue de la Terrasse, 91198 Gif-sur-Yvette
- Yves Boulard
  - Institute for Integrative Biology of the Cell (I2BC), 1 Avenue de la Terrasse, 91198 Gif-sur-Yvette
  - Institute of Biology and Technologies of Saclay, Commissariat à l'Energie Atomique, 91191 Gif-sur-Yvette
- David Pasdeloup
  - Faculté de Pharmacie, INSERM UMR 984, 5 Rue J. B. Clément, 92290 Châtenay-Malabry, France
- Stéphane Bressanelli
  - Institute for Integrative Biology of the Cell (I2BC), 1 Avenue de la Terrasse, 91198 Gif-sur-Yvette
- Stéphane Roche
  - Institute for Integrative Biology of the Cell (I2BC), 1 Avenue de la Terrasse, 91198 Gif-sur-Yvette
55
Use B-factor related features for accurate classification between protein binding interfaces and crystal packing contacts. BMC Bioinformatics 2014; 15 Suppl 16:S3. [PMID: 25522196 PMCID: PMC4290652 DOI: 10.1186/1471-2105-15-s16-s3] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.9] [Indexed: 11/30/2022] Open
Abstract
Background: Distinguishing true protein interactions from crystal packing contacts is important for structural bioinformatics studies, which need accurate classification of the rapidly increasing number of protein structures. There are many unannotated crystal contacts, and false annotations also exist in this rapidly expanding volume of data. Previous tools have been proposed to address this problem. However, challenging issues remain, such as low performance when the training and test data contain mixed interfaces with diverse contact-area sizes.
Methods and results: The B factor quantifies the vibrational motion of an atom and is a more relevant feature than interface size for characterizing protein binding. We propose to use three features related to B factor for the classification between biological interfaces and crystal packing contacts. The first feature is the sum of the normalized B factors of the interfacial atoms in the contact area, the second is the average of the interfacial B factor per residue in the chain, and the third is the average number of interfacial atoms with a negative normalized B factor per residue in the chain. We investigate the distribution properties of these basic features and a compound feature on four datasets of biological binding and crystal packing, and on a protein binding-only dataset with known binding affinity. We also compare the cross-dataset classification performance of these features with existing methods and with interface area, a widely used feature and the most effective existing one. The results demonstrate that our features outperform the interface area approach and the existing prediction methods remarkably for many tests on all of these datasets.
Conclusions: The proposed B factor related features are more effective than interface area to distinguish crystal packing from biological binding interfaces. Our computational methods have a potential for large-scale and accurate identification of biological interactions from the experimentally determined structural data stored at PDB which may have diverse interface sizes.
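The three B factor features listed in the abstract above can be sketched in a few lines of Python. This is one possible reading, not the authors' implementation: the abstract does not say how B factors are normalized, so the sketch assumes a z-score over all atoms of the chain, and the function name `b_factor_features` and the `(b_factor, is_interfacial)` input format are hypothetical.

```python
from statistics import mean, pstdev


def b_factor_features(atoms, n_chain_residues):
    """Compute three interface features from per-atom B factors of one chain.

    atoms: list of (b_factor, is_interfacial) tuples (assumed format).
    n_chain_residues: number of residues in the chain.
    Normalization is assumed to be a z-score over all atoms of the chain.
    """
    b_values = [b for b, _ in atoms]
    mu, sigma = mean(b_values), pstdev(b_values)
    if sigma == 0:
        raise ValueError("all B factors are identical; cannot normalize")
    # Normalized B factors of the interfacial atoms only.
    iface_z = [(b - mu) / sigma for b, is_iface in atoms if is_iface]
    f1 = sum(iface_z)                                   # sum over interfacial atoms
    f2 = f1 / n_chain_residues                          # interfacial B factor per residue
    f3 = sum(1 for z in iface_z if z < 0) / n_chain_residues  # negative-z atoms per residue
    return f1, f2, f3


# Toy values, invented for the example.
atoms = [(10.0, True), (20.0, True), (30.0, False), (40.0, False)]
print(b_factor_features(atoms, n_chain_residues=2))
```

A negative normalized B factor marks an atom that is more rigid than the chain average, which is the intuition behind the third feature.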
56
Luo J, Guo Y, Fu Y, Wang Y, Li W, Li M. Effective discrimination between biologically relevant contacts and crystal packing contacts using new determinants. Proteins 2014; 82:3090-100. [PMID: 25142782 DOI: 10.1002/prot.24670] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 04/16/2014] [Revised: 06/20/2014] [Accepted: 08/11/2014] [Indexed: 12/24/2022]
Abstract
In the structural models determined by X-ray crystallography, contacts between molecules can be divided into two categories: biologically relevant contacts and crystal packing contacts. With the growth in the number and quality of available large crystal structures, distinguishing crystal packing contacts from biologically relevant contacts remains a difficult task, which can lead to wrong interpretation of structural models. In this study, we performed a systematic analysis of the biologically relevant contacts and crystal packing contacts. The analysis results reveal that biologically relevant contacts are more tightly packed than crystal packing contacts. This property of biologically relevant contacts may contribute to the formation of their interfacial core region. Meanwhile, the differences between the core and surface regions of biologically relevant contacts in amino acid composition and evolutionary measure are more dramatic than those of crystal packing contacts, and these differences appear to be useful in distinguishing these two categories of contacts. On the basis of the features derived from our analysis, we developed a random forest model to classify biologically relevant contacts and crystal packing contacts. Our method achieves a high area under the receiver operating characteristic curve of 0.923 in 5-fold cross-validation and accuracies of 91.4% and 91.7% for two different test sets. Moreover, in a comparison study, our model outperforms other existing methods, such as DiMoVo, Pita, Pisa, and Eppic. We believe that this study will provide useful help in the validation of oligomeric proteins and protein complexes. The model and all data used in this paper are freely available at http://cic.scu.edu.cn/bioinformatics/bio-cry.zip.
Affiliation(s)
- Jiesi Luo
  - College of Chemistry and State Key Laboratory of Biotherapy, Sichuan University, Chengdu, Sichuan, 610064, People's Republic of China
57
Sudha G, Nussinov R, Srinivasan N. An overview of recent advances in structural bioinformatics of protein-protein interactions and a guide to their principles. Progress in Biophysics and Molecular Biology 2014; 116:141-50. [PMID: 25077409 DOI: 10.1016/j.pbiomolbio.2014.07.004] [Citation(s) in RCA: 50] [Impact Index Per Article: 5.0] [Received: 05/16/2014] [Accepted: 07/13/2014] [Indexed: 12/20/2022]
Abstract
Rich data bearing on the structural and evolutionary principles of protein-protein interactions are paving the way to a better understanding of the regulation of function in the cell. This is particularly the case when these interactions are considered in the framework of key pathways. Knowledge of the interactions may provide insights into the mechanisms of crucial 'driver' mutations in oncogenesis. They also provide the foundation toward the design of protein-protein interfaces and inhibitors that can abrogate their formation or enhance them. The main features to learn from known 3-D structures of protein-protein complexes and the extensive literature which analyzes them computationally and experimentally include the interaction details which permit undertaking structure-based drug discovery, the evolution of complexes and their interactions, the consequences of alterations such as post-translational modifications, ligand binding, disease causing mutations, host pathogen interactions, oligomerization, aggregation and the roles of disorder, dynamics, allostery and more to the protein and the cell. This review highlights some of the recent advances in these areas, including design, inhibition and prediction of protein-protein complexes. The field is broad, and much work has been carried out in these areas, making it challenging to cover it in its entirety. Much of this is due to the fast increase in the number of molecules whose structures have been determined experimentally and the vast increase in computational power. Here we provide a concise overview.
Affiliation(s)
- Govindarajan Sudha
  - Molecular Biophysics Unit, Indian Institute of Science, Bangalore 560012, India.
- Ruth Nussinov
  - Cancer and Inflammation Program, Frederick National Laboratory for Cancer Research, Leidos Biomedical Research, Inc., National Cancer Institute, Frederick, MD 21702, USA.
  - Sackler Institute of Molecular Medicine, Department of Human Genetics and Molecular Medicine, Sackler School of Medicine, Tel Aviv University, Tel Aviv 69978, Israel.
58
Sudarshan S, Kodathala SB, Mahadik AC, Mehta I, Beck BW. Protein-protein interface detection using the energy centrality relationship (ECR) characteristic of proteins. PLoS One 2014; 9:e97115. [PMID: 24830938 PMCID: PMC4022497 DOI: 10.1371/journal.pone.0097115] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 01/23/2014] [Accepted: 04/14/2014] [Indexed: 01/17/2023] Open
Abstract
Specific protein interactions are responsible for most biological functions. Distinguishing Functionally Linked Interfaces of Proteins (FLIPs), from Functionally uncorrelated Contacts (FunCs), is therefore important to characterizing these interactions. To achieve this goal, we have created a database of protein structures called FLIPdb, containing proteins belonging to various functional sub-categories. Here, we use geometric features coupled with Kortemme and Baker's computational alanine scanning method to calculate the energetic sensitivity of each amino acid at the interface to substitution, identify hotspots, and identify other factors that may contribute towards an interface being FLIP or FunC. Using Principal Component Analysis and K-means clustering on a training set of 160 interfaces, we could distinguish FLIPs from FunCs with an accuracy of 76%. When these methods were applied to two test sets of 18 and 170 interfaces, we achieved similar accuracies of 78% and 80%. We have identified that FLIP interfaces have a stronger central organizing tendency than FunCs, due, we suggest, to greater specificity. We also observe that certain functional sub-categories, such as enzymes, antibody-heavy-light, antibody-antigen, and enzyme-inhibitors form distinct sub-clusters. The antibody-antigen and enzyme-inhibitors interfaces have patterns of physical characteristics similar to those of FunCs, which is in agreement with the fact that the selection pressures of these interfaces is differently evolutionarily driven. As such, our ECR model also successfully describes the impact of evolution and natural selection on protein-protein interfaces. Finally, we indicate how our ECR method may be of use in reducing the false positive rate of docking calculations.
Affiliation(s)
- Sanjana Sudarshan
  - Department of Biology, Texas Woman's University, Denton, Texas, United States of America
- Sasi B. Kodathala
  - Department of Biology, Texas Woman's University, Denton, Texas, United States of America
- Amruta C. Mahadik
  - Department of Biology, Texas Woman's University, Denton, Texas, United States of America
- Isha Mehta
  - Department of Biology, Texas Woman's University, Denton, Texas, United States of America
- Brian W. Beck
  - Department of Biology, Texas Woman's University, Denton, Texas, United States of America
  - Department of Mathematics and Computer Science, Texas Woman's University, Denton, Texas, United States of America
  - Department of Chemistry and Biochemistry, Texas Woman's University, Denton, Texas, United States of America
59
A functional feature analysis on diverse protein–protein interactions: application for the prediction of binding affinity. J Comput Aided Mol Des 2014; 28:619-29. [DOI: 10.1007/s10822-014-9746-y] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Received: 01/20/2014] [Accepted: 04/26/2014] [Indexed: 11/25/2022]
60
Silberberg Y, Kupiec M, Sharan R. A method for predicting protein-protein interaction types. PLoS One 2014; 9:e90904. [PMID: 24625764 PMCID: PMC3953217 DOI: 10.1371/journal.pone.0090904] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 12/05/2013] [Accepted: 02/07/2014] [Indexed: 11/19/2022] Open
Abstract
Protein-protein interactions (PPIs) govern basic cellular processes through signal transduction and complex formation. The diversity of those processes gives rise to a remarkable diversity of interaction types, ranging from transient phosphorylation interactions to stable covalent bonding. Despite our increasing knowledge of PPIs in humans and other species, their types remain relatively unexplored and few annotations of types exist in public databases. Here, we propose the first method for systematic prediction of PPI type based solely on the techniques by which the interaction was detected. We show that different detection methods are better suited for detecting specific types. We apply our method to ten interaction types on a large-scale human PPI dataset. We evaluate the performance of the method using both internal cross validation and external data sources. In cross validation, we obtain an area under the receiver operating characteristic (ROC) curve ranging from 0.65 to 0.97 with an average of 0.84 across the predicted types. Comparing the predicted interaction types to external data sources, we obtained significant agreements for phosphorylation and ubiquitination interactions, with hypergeometric p-values of 2.3e-54 and 5.6e-28, respectively. We examine the biological relevance of our predictions using known signaling pathways and chart the abundance of interaction types in cell processes. Finally, we investigate the cross-relations between different interaction types within the network and characterize the discovered patterns, or motifs. We expect the resulting annotated network to facilitate the reconstruction of process-specific subnetworks and assist in predicting protein function or interaction.
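The abstract above reports hypergeometric p-values for the agreement between predicted interaction types and external annotations. A minimal Python sketch of such an overlap test follows, assuming a standard upper-tail hypergeometric test; the paper's exact setup is not given here, and the function name `hypergeom_upper_tail` and the toy numbers are illustrative only.

```python
from math import comb


def hypergeom_upper_tail(k, population, successes, draws):
    """P(X >= k) for a hypergeometric variable X.

    population: total number of items (e.g. all interactions considered)
    successes:  items carrying the external annotation
    draws:      items predicted to have the type
    k:          observed overlap between predictions and annotations
    """
    total = comb(population, draws)
    upper = min(successes, draws)
    favourable = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return favourable / total


# Toy example: 10 interactions, 4 annotated, 3 predicted, overlap of 2.
print(hypergeom_upper_tail(2, population=10, successes=4, draws=3))
```

Exact integer arithmetic with `math.comb` avoids rounding in the counts; for very large datasets a library routine working in log space would be the more practical choice.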
Affiliation(s)
- Yael Silberberg
  - Department of Molecular Microbiology and Biotechnology, Tel-Aviv University, Tel Aviv, Israel
- Martin Kupiec
  - Department of Molecular Microbiology and Biotechnology, Tel-Aviv University, Tel Aviv, Israel
- Roded Sharan
  - The Blavatnik School of Computer Science, Tel-Aviv University, Tel-Aviv, Israel
61
Zahiri J, Bozorgmehr JH, Masoudi-Nejad A. Computational Prediction of Protein-Protein Interaction Networks: Algorithms and Resources. Curr Genomics 2014; 14:397-414. [PMID: 24396273 PMCID: PMC3861891 DOI: 10.2174/1389202911314060004] [Citation(s) in RCA: 68] [Impact Index Per Article: 6.8] [Received: 06/12/2013] [Revised: 08/07/2013] [Accepted: 08/26/2013] [Indexed: 01/15/2023] Open
Abstract
Protein interactions play an important role in the discovery of protein functions and pathways in biological processes. This is especially true in the case of diseases caused by the loss of specific protein-protein interactions in the organism. The accuracy of experimental results in finding protein-protein interactions, however, is rather dubious, and high-throughput experiments have shown high rates of both false positives and false negatives for protein interactions. Computational methods have attracted tremendous attention among biologists because of their ability to predict protein-protein interactions and validate the obtained experimental results. In this study, we have reviewed several computational methods for protein-protein interaction prediction and described the major databases, which store both predicted and detected protein-protein interactions, and the tools used for analyzing protein interaction networks and improving protein-protein interaction reliability.
Affiliation(s)
- Javad Zahiri
  - Laboratory of Systems Biology and Bioinformatics (LBB), Institute of Biochemistry and Biophysics, University of Tehran, Iran
- Joseph Hannon Bozorgmehr
  - Laboratory of Systems Biology and Bioinformatics (LBB), Institute of Biochemistry and Biophysics, University of Tehran, Iran
- Ali Masoudi-Nejad
  - Laboratory of Systems Biology and Bioinformatics (LBB), Institute of Biochemistry and Biophysics, University of Tehran, Iran
62
Goebels F, Frishman D. Prediction of protein interaction types based on sequence and network features. BMC Systems Biology 2013; 7 Suppl 6:S5. [PMID: 24564924 PMCID: PMC4029746 DOI: 10.1186/1752-0509-7-s6-s5] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 11/10/2022]
Abstract
Background: Protein interactions mediate a wide spectrum of functions in various cellular contexts. Functional versatility of protein complexes is due to a broad range of structural adaptations that determine their binding affinity, the number of interaction sites, and the lifetime. In terms of stability it has become customary to distinguish between obligate and non-obligate interactions dependent on whether or not the protomers can exist independently. In terms of spatio-temporal control protein interactions can be either simultaneously possible (SP) or mutually exclusive (ME). In the former case a network hub interacts with several proteins at the same time, offering each of them a separate interface, while in the latter case the hub interacts with its partners one at a time via the same binding site. So far different types of interactions were distinguished based on the properties of the corresponding binding interfaces derived from known three-dimensional structures of protein complexes.
Results: Here we present PiType, an accurate 3D structure-independent computational method for classifying protein interactions into simultaneously possible (SP) and mutually exclusive (ME) as well as into obligate and non-obligate. Our classifier exploits features of the binding partners predicted from amino acid sequence, their functional similarity, and network topology. We find that the constituents of non-obligate complexes possess a higher degree of structural disorder, more short linear motifs, and lower functional similarity compared to obligate interaction partners while SP and ME interactions are characterized by significant differences in network topology. Each interaction type is associated with a distinct set of biological functions. Moreover, interactions within multi-protein complexes tend to be enriched in one type of interactions.
Conclusion: PiType does not rely on atomic structures and is thus suitable for characterizing proteome-wide interaction datasets. It can also be used to identify sub-modules within protein complexes. PiType is available for download as a self-installing package from http://webclu.bio.wzw.tum.de/PiType/PiType.zip.
63
Ahmed MH, Habtemariam M, Safo MK, Scarsdale JN, Spyrakis F, Cozzini P, Mozzarelli A, Kellogg GE. Unintended consequences? Water molecules at biological and crystallographic protein–protein interfaces. Comput Biol Chem 2013; 47:126-41. [DOI: 10.1016/j.compbiolchem.2013.08.009] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 06/21/2013] [Revised: 08/27/2013] [Accepted: 08/27/2013] [Indexed: 01/31/2023]
64
Maleki M, Vasudev G, Rueda L. The role of electrostatic energy in prediction of obligate protein-protein interactions. Proteome Sci 2013; 11:S11. [PMID: 24564955 PMCID: PMC3907787 DOI: 10.1186/1477-5956-11-s1-s11] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Indexed: 11/10/2022] Open
Abstract
Background: Prediction and analysis of protein-protein interactions (PPI) and specifically types of PPIs is an important problem in life science research because of the fundamental roles of PPIs in many biological processes in living cells. In addition, electrostatic interactions are important in understanding inter-molecular interactions, since they are long-range, and because of their influence in charged molecules. This is the main motivation for using electrostatic energy for prediction of PPI types.
Results: We propose a prediction model to analyze protein interaction types, namely obligate and non-obligate, using electrostatic energy values as properties. The prediction approach uses electrostatic energy values for pairs of atoms and amino acids present in interfaces where the interaction occurs. The main features of the complexes are found and then the prediction is performed via several state-of-the-art classification techniques, including linear dimensionality reduction (LDR), support vector machine (SVM), naive Bayes (NB) and k-nearest neighbor (k-NN). For an in-depth analysis of classification results, some other experiments were performed by varying the distance cutoffs between atom pairs of interacting chains, ranging from 5 Å to 13 Å. Moreover, several feature selection algorithms including gain ratio (GR), information gain (IG), chi-square (Chi2) and minimum redundancy maximum relevance (mRMR) are applied on the available datasets to obtain more discriminative pairs of atom types and amino acid types as features for prediction.
Conclusions: Our results on two well-known datasets of obligate and non-obligate complexes confirm that electrostatic energy is an important property to predict obligate and non-obligate protein interaction types on the basis of all the experimental results, achieving accuracies of over 98%. Furthermore, a comparison performed by changing the distance cutoff demonstrates that the best values for prediction of PPI types using electrostatic energy range from 9 Å to 12 Å, which shows that electrostatic interactions are long-range and cover a broader area in the interface. In addition, the results on using feature selection before prediction confirm that (a) a few pairs of atoms and amino acids are appropriate for prediction, and (b) prediction performance can be improved by eliminating irrelevant and noisy features and selecting the most discriminative ones.
Affiliation(s)
- Mina Maleki
  - School of Computer Science, University of Windsor, 401 Sunset Avenue, Windsor, Ontario, N9B 3P4, Canada
- Gokul Vasudev
  - School of Computer Science, University of Windsor, 401 Sunset Avenue, Windsor, Ontario, N9B 3P4, Canada
- Luis Rueda
  - School of Computer Science, University of Windsor, 401 Sunset Avenue, Windsor, Ontario, N9B 3P4, Canada
65
|
Li Z, He Y, Liu Q, Zhao L, Wong L, Kwoh CK, Nguyen H, Li J. Structural analysis on mutation residues and interfacial water molecules for human TIM disease understanding. BMC Bioinformatics 2013; 14 Suppl 16:S11. [PMID: 24564410 PMCID: PMC3853089 DOI: 10.1186/1471-2105-14-s16-s11] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Indexed: 12/03/2022] Open
Abstract
Background Human triosephosphate isomerase (HsTIM) deficiency is a genetic disease often caused by the pathogenic mutation E104D. This mutation, located at the side of an abnormally large cluster of water in the inter-subunit interface, reduces the thermostability of the enzyme. Why and how these water molecules are directly related to the excessive thermolability of the mutant has not been investigated in structural biology. Results This work compares the structure of the E104D mutant with its wild type counterparts. It is found that the water topology in the dimer interface of HsTIM is atypical, having a "wet-core-dry-rim" distribution with 16 water molecules tightly packed in a small deep region surrounded by 22 residues including GLU104. These water molecules are co-conserved with their surrounding residues in non-archaeal TIMs (dimers) but not conserved across archaeal TIMs (tetramers), indicating their importance in preserving the overall quaternary structure. As the structural permutation induced by the mutation is not significant, we hypothesize that the excessive thermolability of the E104D mutant is attributed to the easy propagation of atoms' flexibility from the surface into the core via the large cluster of water. It is indeed found that the B factor increment in the wet region is higher than in other regions, and, more importantly, the B factor increment in the wet region is maintained in the deeply buried core. Molecular dynamics simulations revealed that for the mutant structure at normal temperature, a clear increase of the root-mean-square deviation is observed for the wet region in contact with the large cluster of interfacial water. Such an increase is not observed for other interfacial regions or the whole protein. This clearly suggests that, in the E104D mutant, the large water cluster is responsible for the subunit interface flexibility and overall thermolability, and it ultimately leads to the deficiency of this enzyme. 
Conclusions Our study reveals that a large cluster of water buried in protein interfaces is fragile and high-maintenance, closely related to the structure, function and evolution of the whole protein.
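The results above compare the root-mean-square deviation of one interface region with that of the whole protein. As a rough illustration of that comparison, the sketch below computes an RMSD over a chosen subset of atoms; it assumes the two coordinate sets are already superposed, and the function names and index lists are hypothetical rather than the authors' protocol.

```python
import math


def rmsd(reference, current):
    """Root-mean-square deviation between two equally long coordinate lists.

    Assumes both structures are already superposed; no fitting is done here.
    """
    if len(reference) != len(current):
        raise ValueError("coordinate lists must have the same length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(reference, current))
    return math.sqrt(squared / len(reference))


def region_rmsd(reference, current, indices):
    """RMSD restricted to the atoms listed in indices (e.g. a 'wet' region)."""
    return rmsd([reference[i] for i in indices], [current[i] for i in indices])


reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
snapshot = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 3.0, 0.0)]
print(rmsd(reference, snapshot))              # whole-structure value
print(region_rmsd(reference, snapshot, [2]))  # value for the displaced atom only
```

Evaluating `region_rmsd` per trajectory frame for several index sets would give the region-by-region curves that the abstract contrasts.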
|
66
|
Kundrotas PJ, Vakser IA, Janin J. Structural templates for modeling homodimers. Protein Sci 2013; 22:1655-63. [PMID: 23996787 DOI: 10.1002/pro.2361] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Received: 06/21/2013] [Revised: 08/23/2013] [Accepted: 08/23/2013] [Indexed: 12/17/2022]
Abstract
Oligomeric proteins are more abundant in nature than monomeric proteins, and are involved in all biological processes. In the absence of an experimental structure, their subunits can be modeled from their sequence like monomeric proteins, but reliable procedures to build the oligomeric assembly are scarce. Template-based methods, which start from known protein structures, are commonly applied to model subunits. We present a method to model homodimers that relies on a structural alignment of the subunits, and test it on a set of 511 target structures recently released by the Protein Data Bank, taking as templates the earlier released structures of 3108 homodimeric proteins (H-set), and 2691 monomeric proteins that form dimer-like assemblies in crystals (M-set). The structural alignment identifies an H-set template for 97% of the targets, and in half of the cases, it yields a correct model of the dimer geometry and residue-residue contacts in the target. It also identifies an M-set template for most of the targets, and some of the crystal dimers are very similar to the target homodimers. The procedure efficiently detects homology at low levels of sequence identity, and points to erroneous quaternary structures in the Protein Data Bank. The high coverage of the target set suggests that the content of the Protein Data Bank already approaches the structural diversity of protein assemblies in nature, and that template-based methods should become the method of choice for modeling oligomeric as well as monomeric proteins.
Affiliation(s)
- Petras J Kundrotas
- Center for Bioinformatics, The University of Kansas, 2030 Becker Dr., Lawrence, Kansas, 66047
|
67
|
Kuzu G, Gursoy A, Nussinov R, Keskin O. Exploiting conformational ensembles in modeling protein-protein interactions on the proteome scale. J Proteome Res 2013; 12:2641-53. [PMID: 23590674 PMCID: PMC3685852 DOI: 10.1021/pr400006k] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.5] [Indexed: 12/11/2022]
Abstract
Cellular functions are performed through protein-protein interactions; therefore, identification of these interactions is crucial for understanding biological processes. Recent studies suggest that knowledge-based approaches are more useful than "blind" docking for modeling at large scales. However, a caveat of knowledge-based approaches is that they treat molecules as rigid structures. The Protein Data Bank (PDB) offers a wealth of conformations. Here, we exploited an ensemble of the conformations in predictions by a knowledge-based method, PRISM. We tested "difficult" cases in a docking-benchmark data set, where the unbound and bound protein forms are structurally different. Considering alternative conformations for each protein, the percentage of successfully predicted interactions increased from ~26 to 66%, and 57% of the interactions were successfully predicted in an "unbiased" scenario, in which data related to the bound forms were not utilized. If the appropriate conformation, or relevant template interface, is unavailable in the PDB, PRISM could not predict the interaction successfully. The pace of the growth of the PDB promises a rapid increase of ensemble conformations emphasizing the merit of such knowledge-based ensemble strategies for higher success rates in protein-protein interaction predictions on an interactome scale. We constructed the structural network of ERK interacting proteins as a case study.
Affiliation(s)
- Guray Kuzu
- Center for Computational Biology and Bioinformatics and College of Engineering, Koc University Rumelifeneri Yolu, 34450 Sariyer Istanbul, Turkey
- Attila Gursoy
- Center for Computational Biology and Bioinformatics and College of Engineering, Koc University Rumelifeneri Yolu, 34450 Sariyer Istanbul, Turkey
- Ruth Nussinov
- Basic Science Program, SAIC-Frederick, Inc. National Cancer Institute, Center for Cancer Research Nanobiology Program, Frederick National Laboratory for Cancer Research, Frederick, MD 21702
- Sackler Inst. of Molecular Medicine Department of Human Genetics and Molecular Medicine Sackler School of Medicine, Tel Aviv University, Tel Aviv 69978, Israel
- Ozlem Keskin
- Center for Computational Biology and Bioinformatics and College of Engineering, Koc University Rumelifeneri Yolu, 34450 Sariyer Istanbul, Turkey
|
68
|
Andreani J, Faure G, Guerois R. InterEvScore: a novel coarse-grained interface scoring function using a multi-body statistical potential coupled to evolution. Bioinformatics 2013; 29:1742-9. [PMID: 23652426 DOI: 10.1093/bioinformatics/btt260] [Citation(s) in RCA: 69] [Impact Index Per Article: 6.3] [Indexed: 12/22/2022]
Abstract
MOTIVATION Structural prediction of protein interactions currently remains a challenging but fundamental goal. In particular, progress in scoring functions is critical for the efficient discrimination of near-native interfaces among large sets of decoys. Many functions have been developed using knowledge-based potentials, but few make use of multi-body interactions or evolutionary information, although multi-residue interactions are crucial for protein-protein binding and protein interfaces undergo significant selection pressure to maintain their interactions. RESULTS This article presents InterEvScore, a novel scoring function using a coarse-grained statistical potential including two- and three-body interactions, which provides each residue with the opportunity to contribute in its most favorable local structural environment. Combination of this potential with evolutionary information considerably improves scoring results on the 54 test cases from the widely used protein docking benchmark for which evolutionary information can be collected. We analyze how our way to include evolutionary information gradually increases the discriminative power of InterEvScore. Comparison with several previously published scoring functions (ZDOCK, ZRANK and SPIDER) shows the significant progress brought by InterEvScore. AVAILABILITY http://biodev.cea.fr/interevol/interevscore CONTACT guerois@cea.fr SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Jessica Andreani
- CEA, iBiTecS, Service de Bioenergetique Biologie Structurale et Mecanismes SB2SM, Laboratoire de Biologie Structurale et Radiobiologie LBSR, F-91191 Gif sur Yvette, France
|
69
|
Beta atomic contacts: identifying critical specific contacts in protein binding interfaces. PLoS One 2013; 8:e59737. [PMID: 23630569 PMCID: PMC3632532 DOI: 10.1371/journal.pone.0059737] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 01/09/2012] [Accepted: 02/21/2013] [Indexed: 11/19/2022] Open
Abstract
Specific binding between proteins plays a crucial role in molecular functions and biological processes. Protein binding interfaces and their atomic contacts are typically defined by simple criteria, such as distance-based definitions that only use some threshold of spatial distance in previous studies. These definitions neglect the nearby atomic organization of contact atoms, and thus detect predominant contacts which are interrupted by other atoms. It is questionable whether such kinds of interrupted contacts are as important as other contacts in protein binding. To tackle this challenge, we propose a new definition called beta (β) atomic contacts. Our definition, founded on the β-skeletons in computational geometry, requires that there is no other atom in the contact spheres defined by two contact atoms; this sphere is similar to the van der Waals spheres of atoms. The statistical analysis on a large dataset shows that β contacts are only a small fraction of conventional distance-based contacts. To empirically quantify the importance of β contacts, we design βACV, an SVM classifier with β contacts as input, to classify homodimers from crystal packing. We found that our βACV is able to achieve the state-of-the-art classification performance superior to SVM classifiers with distance-based contacts as input. Our βACV also outperforms several existing methods when being evaluated on several datasets in previous works. The promising empirical performance suggests that β contacts can truly identify critical specific contacts in protein binding interfaces. β contacts thus provide a new model for more precise description of atomic organization in protein quaternary structures than distance-based contacts.
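The definition above rejects a distance-based contact when another atom sits inside the sphere defined by the two contact atoms. The sketch below illustrates that filter under a simplifying assumption: it uses the sphere whose diameter is the segment joining the two atoms (the Gabriel-graph case of a β-skeleton), which may differ from the exact construction and β value used in the paper. The function names and the 4.5 Å cutoff are hypothetical.

```python
import math


def is_uninterrupted(i, j, coords):
    """True if no third atom centre lies strictly inside the sphere whose
    diameter is the segment between atoms i and j (assumed construction)."""
    a, b = coords[i], coords[j]
    center = tuple((p + q) / 2.0 for p, q in zip(a, b))
    radius = math.dist(a, b) / 2.0
    return all(
        math.dist(center, c) >= radius
        for k, c in enumerate(coords)
        if k not in (i, j)
    )


def filtered_contacts(coords, cutoff=4.5):
    """Distance-based contacts that also pass the empty-sphere test."""
    n = len(coords)
    return [
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if math.dist(coords[i], coords[j]) <= cutoff and is_uninterrupted(i, j, coords)
    ]


# Atom 2 sits between atoms 0 and 1, so the 0-1 contact is interrupted.
atoms = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (2.0, 0.5, 0.0)]
print(filtered_contacts(atoms))
```

For an interface analysis one would restrict `i` and `j` to atoms from different chains while still testing against all atoms; that restriction is left out here for brevity.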
|
70
|
Fan CY, Bai YH, Huang CY, Yao TJ, Chiang WH, Chang DTH. PRASA: an integrated web server that analyzes protein interaction types. Gene 2013; 518:78-83. [PMID: 23276706 DOI: 10.1016/j.gene.2012.11.083] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 09/27/2012] [Accepted: 11/27/2012] [Indexed: 11/16/2022]
Abstract
This work presents the Protein Association Analyzer (PRASA) (http://zoro.ee.ncku.edu.tw/prasa/) that predicts protein interactions as well as interaction types. Protein interactions are essential to most biological functions. The existence of diverse interaction types, such as physically contacted or functionally related interactions, makes protein interactions complex. Different interaction types are distinct and should not be confused. However, most existing tools focus on a specific interaction type or mix different interaction types. This work collected 7234058 associations with experimentally verified interaction types from five databases and compiled individual probabilistic models for different interaction types. The PRASA result page shows predicted associations and their related references by interaction type. Experimental results demonstrate the performance difference when distinguishing between different interaction types. The PRASA provides a centralized and organized platform for easy browsing, downloading and comparing of interaction types, which helps reveal insights into the complex roles that proteins play in organisms.
Affiliation(s)
- Chen-Yu Fan
- Department of Electrical Engineering, National Cheng Kung University, Tainan 70101, Taiwan
|
71
|
Levy ED, Teichmann S. Structural, evolutionary, and assembly principles of protein oligomerization. Progress in Molecular Biology and Translational Science 2013; 117:25-51. [PMID: 23663964 DOI: 10.1016/b978-0-12-386931-9.00002-7] [Citation(s) in RCA: 89] [Impact Index Per Article: 8.1] [Indexed: 12/02/2022]
Abstract
In the protein universe, 30-50% of proteins self-assemble to form symmetrical complexes consisting of multiple copies of themselves, called homomers. The prevalence of homomers motivates us to review many of their properties. In Section 1, we describe the methods and challenges associated with quaternary structure inference; these methods are indeed the basis of any analysis of homomers. In Section 2, we describe the morphological properties of homomers, as well as the database 3DComplex, which provides a taxonomy for both homomeric and heteromeric protein complexes. In Section 3, we review interface properties of homomeric complexes. In Section 4, we then present recent findings on the evolution of homomer interfaces, which we link in Section 5 to the evolution of homomers as entire entities. In Section 6, we discuss mechanisms involved in their assembly and how these mechanisms can be linked to evolution.
Affiliation(s)
- Emmanuel D Levy
- Department of Structural Biology, Weizmann Institute of Science, Rehovot, Israel.
|
72
|
Nishi H, Hashimoto K, Madej T, Panchenko AR. Evolutionary, physicochemical, and functional mechanisms of protein homooligomerization. Progress in Molecular Biology and Translational Science 2013; 117:3-24. [PMID: 23663963 PMCID: PMC3786560 DOI: 10.1016/b978-0-12-386931-9.00001-5] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.6] [Indexed: 01/26/2023]
Abstract
Protein homooligomers afford several important benefits for the cell; they mediate and regulate gene expression, activity of many enzymes, ion channels, receptors, and cell-cell adhesion processes. The evolutionary and physical mechanisms of oligomer formation are very diverse and are not well understood. Certain homooligomeric states may be conserved within protein subfamilies and between different subfamilies, therefore providing the specificity to particular substrates while minimizing interactions with unwanted partners. In addition, transitions between different oligomeric states may regulate protein activity and support the switch between different pathways. In this chapter, we summarize the biological importance of homooligomeric assemblies, physicochemical properties of their interfaces, experimental methods for their identification, their evolution, and role in human diseases.
Affiliation(s)
- Hafumi Nishi
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, USA
|
73
|
Structural basis for the activation mechanism of the PlcR virulence regulator by the quorum-sensing signal peptide PapR. Proc Natl Acad Sci U S A 2012; 110:1047-52. [PMID: 23277548 DOI: 10.1073/pnas.1213770110] [Citation(s) in RCA: 80] [Impact Index Per Article: 6.7] [Indexed: 11/18/2022] Open
Abstract
The quorum-sensing regulator PlcR is the master regulator of most known virulence factors in Bacillus cereus. It is a helix-turn-helix (HTH)-type transcription factor activated upon binding of its cognate signaling peptide PapR on a tetratricopeptide repeat-type regulatory domain. The structural and functional properties of PlcR have defined a new family of sensor regulators, called the RNPP family (for Rap, NprR, PrgX, and PlcR), in Gram-positive bacteria. To fully understand the activation mechanism of PlcR, we took a closer look at the conformation changes induced upon binding of PapR and of its target DNA, known as PlcR-box. For that purpose we have determined the structures of the apoform of PlcR (Apo PlcR) and of the ternary complex of PlcR with PapR and the PlcR-box from the plcA promoter. Comparison of the apoform of PlcR with the previously published structure of the PlcR-PapR binary complex shows how a small conformational change induced in the C-terminal region of the tetratricopeptide repeat (TPR) domain upon peptide binding propagates via the linker helix to the N-terminal HTH DNA-binding domain. Further comparison with the PlcR-PapR-DNA ternary complex shows how the activation of the PlcR dimer allows the linker helix to undergo a drastic conformational change and subsequent proper positioning of the HTH domains in the major groove of the two half sites of the pseudopalindromic PlcR-box. Together with random mutagenesis experiments and interaction measurements using peptides from distinct pherogroups, this structural analysis allows us to propose a molecular mechanism for this functional switch.
|
74
|
Duarte JM, Srebniak A, Schärer MA, Capitani G. Protein interface classification by evolutionary analysis. BMC Bioinformatics 2012; 13:334. [PMID: 23259833 PMCID: PMC3556496 DOI: 10.1186/1471-2105-13-334] [Citation(s) in RCA: 112] [Impact Index Per Article: 9.3] [Received: 08/02/2012] [Accepted: 12/15/2012] [Indexed: 01/01/2023] Open
Abstract
Background Distinguishing biologically relevant interfaces from lattice contacts in protein crystals is a fundamental problem in structural biology. Despite efforts towards the computational prediction of interface character, many issues are still unresolved. Results We present here a protein-protein interface classifier that relies on evolutionary data to detect the biological character of interfaces. The classifier uses a simple geometric measure, number of core residues, and two evolutionary indicators based on the sequence entropy of homolog sequences. Both aim at detecting differential selection pressure between interface core and rim or rest of surface. The core residues, defined as fully buried residues (>95% burial), appear to be fundamental determinants of biological interfaces: their number is in itself a powerful discriminator of interface character and together with the evolutionary measures it is able to clearly distinguish evolved biological contacts from crystal ones. We demonstrate that this definition of core residues leads to distinctively better results than earlier definitions from the literature. The stringent selection and quality filtering of structural and sequence data was key to the success of the method. Most importantly we demonstrate that a more conservative selection of homolog sequences - with relatively high sequence identities to the query - is able to produce a clearer signal than previous attempts. Conclusions An evolutionary approach like the one presented here is key to the advancement of the field, which so far was missing an effective method exploiting the evolutionary character of protein interfaces. Its coverage and performance will only improve over time thanks to the incessant growth of sequence databases. Currently our method reaches an accuracy of 89% in classifying interfaces of the Ponstingl 2003 datasets and it lends itself to a variety of useful applications in structural biology and bioinformatics. 
We made the corresponding software implementation available to the community as an easy-to-use graphical web interface at http://www.eppic-web.org.
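The classifier described above relies on sequence entropy of homolog sequences to detect differential selection pressure between interface core and rim. A minimal sketch of such an indicator follows: Shannon entropy per alignment column, then a core-to-rim ratio of mean entropies. The exact scoring in the published method is not given here, so the ratio form, the function names and the toy alignment are assumptions for illustration.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (bits) of the residues observed in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def core_rim_ratio(alignment, core_positions, rim_positions):
    """Mean entropy of core columns divided by mean entropy of rim columns.

    A ratio below 1 would indicate a more conserved core. Raises
    ZeroDivisionError if every rim column is fully conserved.
    """
    def mean_entropy(positions):
        values = [column_entropy([seq[p] for seq in alignment]) for p in positions]
        return sum(values) / len(values)

    return mean_entropy(core_positions) / mean_entropy(rim_positions)


# Toy alignment of four homologs; columns 0 and 2 play the core, column 3 the rim.
alignment = ["ACDE", "ACDF", "ACEG", "ACFH"]
print(core_rim_ratio(alignment, core_positions=[0, 2], rim_positions=[3]))
```

The abstract's point about homolog selection maps onto the `alignment` argument: restricting it to sequences with relatively high identity to the query changes the entropies and hence the signal.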
Affiliation(s)
- Jose M Duarte
- Paul Scherrer Institut, Villigen, CH-5232, Switzerland
|
75
|
Acuner Ozbabacan SE, Keskin O, Nussinov R, Gursoy A. Enriching the human apoptosis pathway by predicting the structures of protein-protein complexes. J Struct Biol 2012; 179:338-46. [PMID: 22349545 PMCID: PMC3378801 DOI: 10.1016/j.jsb.2012.02.002] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.0] [Received: 09/12/2011] [Revised: 12/23/2011] [Accepted: 02/04/2012] [Indexed: 10/14/2022]
Abstract
Apoptosis is a matter of life and death for cells and both inhibited and enhanced apoptosis may be involved in the pathogenesis of human diseases. The structures of protein-protein complexes in the apoptosis signaling pathway are important as the structural pathway helps in understanding the mechanism of the regulation and information transfer, and in identifying targets for drug design. Here, we aim to predict the structures toward a more informative pathway than currently available. Based on the 3D structures of complexes in the target pathway and a protein-protein interaction modeling tool which allows accurate and proteome-scale applications, we modeled the structures of 29 interactions, 21 of which were previously unknown. Next, 27 interactions which were not listed in the KEGG apoptosis pathway were predicted and subsequently validated by the experimental data in the literature. Additional interactions are also predicted. The multi-partner hub proteins are analyzed and interactions that can and cannot co-exist are identified. Overall, our results enrich the understanding of the pathway with interactions and provide structural details for the human apoptosis pathway. They also illustrate that computational modeling of protein-protein interactions on a large scale can help validate experimental data and provide accurate, structural atom-level detail of signaling pathways in the human cell.
Affiliation(s)
- Saliha Ece Acuner Ozbabacan
- Center for Computational Biology and Bioinformatics and College of Engineering, Koc University, Rumelifeneri Yolu, 34450 Sariyer Istanbul, Turkey
- Ozlem Keskin
- Center for Computational Biology and Bioinformatics and College of Engineering, Koc University, Rumelifeneri Yolu, 34450 Sariyer Istanbul, Turkey
- Ruth Nussinov
- Basic Science Program, SAIC-Frederick, Inc. Center for Cancer Research Nanobiology Program NCI-Frederick, Frederick, MD 21702
- Sackler Inst. of Molecular Medicine, Department of Human Genetics and Molecular Medicine, Sackler School of Medicine, Tel Aviv University, Tel Aviv 69978, Israel
- Attila Gursoy
- Center for Computational Biology and Bioinformatics and College of Engineering, Koc University, Rumelifeneri Yolu, 34450 Sariyer Istanbul, Turkey
|
76
|
Andreani J, Faure G, Guerois R. Versatility and invariance in the evolution of homologous heteromeric interfaces. PLoS Comput Biol 2012; 8:e1002677. [PMID: 22952442 PMCID: PMC3431345 DOI: 10.1371/journal.pcbi.1002677] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.3] [Received: 03/14/2012] [Accepted: 07/24/2012] [Indexed: 11/18/2022] Open
Abstract
Evolutionary pressures act on protein complex interfaces so that they preserve their complementarity. Nonetheless, the elementary interactions which compose the interface are highly versatile throughout evolution. Understanding and characterizing interface plasticity across evolution is a fundamental issue which could provide new insights into protein-protein interaction prediction. Using a database of 1,024 couples of close and remote heteromeric structural interologs, we studied protein-protein interactions from a structural and evolutionary point of view. We systematically and quantitatively analyzed the conservation of different types of interface contacts. Our study highlights astonishing plasticity regarding polar contacts at complex interfaces. It also reveals that up to a quarter of the residues switch out of the interface when comparing two homologous complexes. Despite such versatility, we identify two important interface descriptors which correlate with an increased conservation in the evolution of interfaces: apolar patches and contacts surrounding anchor residues. These observations hold true even when restricting the dataset to transiently formed complexes. We show that a combination of six features related either to sequence or to geometric properties of interfaces can be used to rank positions likely to share similar contacts between two interologs. Altogether, our analysis provides important tracks for extracting meaningful information from multiple sequence alignments of conserved binding partners and for discriminating near-native interfaces using evolutionary information. Unraveling how interfaces of protein complexes coevolved is of major importance to improve our ability to predict their structures and design novel binders. Proteins whose interaction was maintained throughout evolution generally have their homologs binding in a similar manner while their sequences can have significantly diverged. 
Constraints holding proteins together should be captured from the growing body of available multiple sequence alignments. However, it remains unclear which features of the interfaces provide most tolerance to mutations and it is unknown whether any invariant properties may help to extract meaningful signals from sequence alignments. To solve this issue, we tackled an unprecedented large scale analysis of more than 1000 non-redundant couples of structural interologs. Structural interologs are pairs of complexes of known structure whose chains are homologs. We quantitatively measured how the networks of contacts varied between two interfaces. Although highly versatile, we found that contact networks were more conserved for residues acting as anchors and for apolar contacts when they are clustered into surface patches. Altogether, our results provide major guidelines for exploiting the wealth of evolutionary information contained in the sequences of binding partners. On those bases we developed a method to predict which residues most likely conserve their contacts.
Affiliation(s)
- Jessica Andreani
- CEA, iBiTecS, Service de Bioenergetique Biologie Structurale et Mecanismes (SB2SM), Laboratoire de Biologie Structurale et Radiobiologie (LBSR), Gif sur Yvette, France
- CNRS, UMR 8221, Gif sur Yvette, France
- Université Paris Sud, UMR 8221, Orsay, France
- Guilhem Faure
- CEA, iBiTecS, Service de Bioenergetique Biologie Structurale et Mecanismes (SB2SM), Laboratoire de Biologie Structurale et Radiobiologie (LBSR), Gif sur Yvette, France
- CNRS, UMR 8221, Gif sur Yvette, France
- Université Paris Sud, UMR 8221, Orsay, France
- Raphaël Guerois
- CEA, iBiTecS, Service de Bioenergetique Biologie Structurale et Mecanismes (SB2SM), Laboratoire de Biologie Structurale et Radiobiologie (LBSR), Gif sur Yvette, France
- CNRS, UMR 8221, Gif sur Yvette, France
- Université Paris Sud, UMR 8221, Orsay, France
|
77
|
Sampathkumar P, Kim SJ, Manglicmot D, Bain KT, Gilmore J, Gheyi T, Phillips J, Pieper U, Fernandez-Martinez J, Franke JD, Matsui T, Tsuruta H, Atwell S, Thompson DA, Emtage JS, Wasserman SR, Rout MP, Sali A, Sauder JM, Almo SC, Burley SK. Atomic structure of the nuclear pore complex targeting domain of a Nup116 homologue from the yeast, Candida glabrata. Proteins 2012; 80:2110-6. [PMID: 22544723 PMCID: PMC3686472 DOI: 10.1002/prot.24102] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 01/27/2012] [Revised: 04/05/2012] [Accepted: 04/11/2012] [Indexed: 01/07/2023]
Abstract
The nuclear pore complex (NPC), embedded in the nuclear envelope, is a large, dynamic molecular assembly that facilitates exchange of macromolecules between the nucleus and the cytoplasm. The yeast NPC is an eightfold symmetric annular structure composed of ~456 polypeptide chains contributed by ~30 distinct proteins termed nucleoporins. Nup116, identified only in fungi, plays a central role in both protein import and mRNA export through the NPC. Nup116 is a modular protein with N-terminal "FG" repeats containing a Gle2p-binding sequence motif and a NPC targeting domain at its C-terminus. We report the crystal structure of the NPC targeting domain of Candida glabrata Nup116, consisting of residues 882-1034 [CgNup116(882-1034)], at 1.94 Å resolution. The X-ray structure of CgNup116(882-1034) is consistent with the molecular envelope determined in solution by small-angle X-ray scattering. Structural similarities of CgNup116(882-1034) with homologous domains from Saccharomyces cerevisiae Nup116, S. cerevisiae Nup145N, and human Nup98 are discussed.
|
78
|
Liu Q, Wong L, Li J. Z-score biological significance of binding hot spots of protein interfaces by using crystal packing as the reference state. Biochimica et Biophysica Acta - Proteins and Proteomics 2012; 1824:1457-67. [PMID: 22728649 DOI: 10.1016/j.bbapap.2012.05.014] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 02/16/2012] [Revised: 05/12/2012] [Accepted: 05/31/2012] [Indexed: 11/19/2022]
Abstract
Characterization of binding hot spots of protein interfaces is a fundamental study in molecular biology. Many computational methods have been proposed to identify binding hot spots. However, few studies assess the biological significance of binding hot spots. We introduce the notion of biological significance of a contact residue for capturing the probability of the residue occurring in or contributing to protein binding interfaces. We take a statistical Z-score approach to the assessment of the biological significance. The method has three main steps. First, the potential score of a residue is defined by using a knowledge-based potential function with relative accessible surface area calculations. A null distribution of this potential score is then generated from artifact crystal packing contacts. Finally, the Z-score significance of a contact residue with a specific potential score is determined according to this null distribution. We hypothesize that residues at binding hot spots have large absolute Z-score values as they contribute greatly to binding free energy. Thus, we propose to use the Z-score to predict whether a contact residue is a hot spot residue. Comparison with previously reported methods on two benchmark datasets shows that this Z-score method is mostly superior to earlier methods. This article is part of a Special Issue entitled: Computational Methods for Protein Interaction and Structural Prediction.
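The three steps above (score a residue, build a null distribution from crystal-packing contacts, convert the score to a Z-score) can be sketched as follows. The potential scores are taken as given numbers, and the threshold of 2.0, the residue labels and the function names are hypothetical choices for illustration rather than values from the paper.

```python
import statistics


def z_score(score, null_scores):
    """Standardise a residue's potential score against the null distribution."""
    mean = statistics.mean(null_scores)
    spread = statistics.stdev(null_scores)  # sample standard deviation
    return (score - mean) / spread


def predict_hot_spots(residue_scores, null_scores, threshold=2.0):
    """Residues whose absolute Z-score reaches the (assumed) threshold."""
    return [
        name
        for name, score in residue_scores.items()
        if abs(z_score(score, null_scores)) >= threshold
    ]


# Null scores would come from crystal-packing contacts; these are toy numbers.
null_scores = [1.0, 2.0, 3.0, 4.0, 5.0]
residue_scores = {"R45": 7.0, "L12": 3.5}
print(predict_hot_spots(residue_scores, null_scores))
```

Using the absolute value follows the abstract's hypothesis that hot-spot residues show large Z-score magnitudes in either direction.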
Affiliation(s)
- Qian Liu
- BIRC, SCE, Nanyang Technological University, Singapore 639798, Singapore
|
79
|
Garma L, Mukherjee S, Mitra P, Zhang Y. How many protein-protein interactions types exist in nature? PLoS One 2012; 7:e38913. [PMID: 22719985 PMCID: PMC3374795 DOI: 10.1371/journal.pone.0038913] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.2] [Received: 02/16/2012] [Accepted: 05/14/2012] [Indexed: 11/18/2022]
Abstract
“Protein quaternary structure universe” refers to the ensemble of all protein-protein complexes across all organisms in nature. The number of quaternary folds thus corresponds to the number of ways proteins physically interact with other proteins. This study focuses on answering two basic questions: Whether the number of protein-protein interactions is limited and, if yes, how many different quaternary folds exist in nature. By all-to-all sequence and structure comparisons, we grouped the protein complexes in the protein data bank (PDB) into 3,629 families and 1,761 folds. A statistical model was introduced to obtain the quantitative relation between the numbers of quaternary families and quaternary folds in nature. The total number of possible protein-protein interactions was estimated around 4,000, which indicates that the current protein repository contains only 42% of quaternary folds in nature and a full coverage needs approximately a quarter century of experimental effort. The results have important implications to the protein complex structural modeling and the structure genomics of protein-protein interactions.
Affiliation(s)
- Leonardo Garma
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, United States of America
- Biocenter Oulu and Department of Biochemistry, University of Oulu, Oulu, Finland
- Srayanta Mukherjee
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, United States of America
- Pralay Mitra
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, United States of America
- Yang Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, United States of America
|
80
|
Li Z, He Y, Wong L, Li J. Progressive dry-core-wet-rim hydration trend in a nested-ring topology of protein binding interfaces. BMC Bioinformatics 2012; 13:51. [PMID: 22452998 PMCID: PMC3373366 DOI: 10.1186/1471-2105-13-51] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 10/20/2011] [Accepted: 03/27/2012] [Indexed: 11/10/2022]
Abstract
BACKGROUND Water is an integral part of protein complexes. It shapes protein binding sites by filling cavities and it bridges local contacts by hydrogen bonds. However, water molecules have usually not been included in protein interface models in the past, and few distribution profiles of water molecules in protein binding interfaces are known. RESULTS In this work, we use a tripartite protein-water-protein interface model and a nested-ring atom re-organization method to detect hydration trends and patterns from an interface data set which involves immobilized interfacial water molecules. This data set consists of 206 obligate interfaces, 160 non-obligate interfaces, and 522 crystal packing contacts. The two types of biological interfaces are found to be drier than the crystal packing interfaces in our data, in agreement with a hydration pattern reported earlier, although the previous definition of immobilized water was purely distance-based. The biological interfaces in our data set are also found to be subject to stronger water exclusion in their formation. To study the overall hydration trend in protein binding interfaces, atoms at the same burial level in each tripartite protein-water-protein interface are organized into a ring. The rings of an interface are then ordered with the core atoms placed at the middle of the structure to form a nested-ring topology. We find that water molecules on the rings of an interface are generally configured in a dry-core-wet-rim pattern with a progressive level-wise solvation toward the rim of the interface. This solvation trend becomes even sharper when counterexamples are separated. CONCLUSIONS Immobilized water molecules are regularly organized in protein binding interfaces and they should be carefully considered in the studies of protein hydration mechanisms.
Affiliation(s)
- Zhenhua Li
- Bioinformatics Research Center at the School of Computer Engineering, Nanyang Technological University, Singapore 639798, Singapore
|
81
|
Swapna LS, Bhaskara RM, Sharma J, Srinivasan N. Roles of residues in the interface of transient protein-protein complexes before complexation. Sci Rep 2012; 2:334. [PMID: 22451863 PMCID: PMC3312204 DOI: 10.1038/srep00334] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.1] [Received: 12/08/2011] [Accepted: 03/07/2012] [Indexed: 12/26/2022]
Abstract
Transient protein-protein interactions play crucial roles in all facets of cellular physiology. Here, using an analysis on known 3-D structures of transient protein-protein complexes, their corresponding uncomplexed forms and energy calculations we seek to understand the roles of protein-protein interfacial residues in the unbound forms. We show that there are conformationally near invariant and evolutionarily conserved interfacial residues which are rigid and they account for ∼65% of the core interface. Interestingly, some of these residues contribute significantly to the stabilization of the interface structure in the uncomplexed form. Such residues have strong energetic basis to perform dual roles of stabilizing the structure of the uncomplexed form as well as the complex once formed while they maintain their rigid nature throughout. This feature is evolutionarily well conserved at both the structural and sequence levels. We believe this analysis has general bearing in the prediction of interfaces and understanding molecular recognition.
|
82
|
Tyagi M, Thangudu RR, Zhang D, Bryant SH, Madej T, Panchenko AR. Homology inference of protein-protein interactions via conserved binding sites. PLoS One 2012; 7:e28896. [PMID: 22303436 PMCID: PMC3269416 DOI: 10.1371/journal.pone.0028896] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Received: 08/15/2011] [Accepted: 11/16/2011] [Indexed: 11/18/2022]
Abstract
The coverage and reliability of protein-protein interactions determined by high-throughput experiments still need to be improved, especially for higher organisms; the question therefore persists of how interactions can be verified and predicted by computational approaches using available data on protein structural complexes. Recently we developed an approach called IBIS (Inferred Biomolecular Interaction Server) to predict and annotate protein-protein binding sites and interaction partners, which is based on the assumption that the structural location and sequence patterns of protein-protein binding sites are conserved between close homologs. In this study, we first confirmed the high accuracy of our method and found that its accuracy depends critically on the usage of all available data on structures of homologous complexes, compared to the approaches where only a non-redundant set of complexes is employed. Second, we showed that there exists a trade-off between specificity and sensitivity if we employ in the prediction only evolutionarily conserved binding site clusters or clusters supported by only one observation (singletons). Finally, we addressed the question of identifying the biologically relevant interactions using the homology inference approach and demonstrated that a large majority of crystal packing interactions can be correctly identified and filtered by our algorithm. At the same time, about half of biological interfaces that are not present in the protein crystallographic asymmetric unit can be reconstructed by IBIS from homologous complexes without the prior knowledge of crystal parameters of the query protein.
Affiliation(s)
- Manoj Tyagi
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America
- Ratna R. Thangudu
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America
- Dachuan Zhang
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America
- Stephen H. Bryant
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America
- Thomas Madej
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America
- Anna R. Panchenko
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America
|
83
|
Faure G, Andreani J, Guerois R. InterEvol database: exploring the structure and evolution of protein complex interfaces. Nucleic Acids Res 2012; 40:D847-56. [PMID: 22053089 PMCID: PMC3245184 DOI: 10.1093/nar/gkr845] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.8] [Received: 07/27/2011] [Revised: 09/15/2011] [Accepted: 09/21/2011] [Indexed: 11/12/2022]
Abstract
Capturing how the structures of interacting partners evolved at their binding interfaces is a fundamental issue for understanding interactome evolution. In that scope, the InterEvol database was designed for exploring 3D structures of homologous interfaces of protein complexes. For every chain forming a complex in the protein data bank (PDB), close and remote structural interologs were identified, providing essential snapshots for studying interface evolution. The database provides tools to retrieve and visualize these structures. In addition, pre-computed multiple sequence alignments of most likely interologs retrieved from a wide range of species can be downloaded to enrich the analysis. The database can be queried either directly by pdb code or keyword but also from the sequence of one or two partners. Multiple sequence alignments of interologs can also be recomputed online with tailored parameters using the InterEvolAlign facility. Last, an InterEvol PyMol plugin was developed to improve interactive exploration of structures versus sequence alignments at the interfaces of complexes. Based on a series of automatic methods to extract structural and sequence data, the database will be updated monthly. Structure coordinates and sequence alignments can be queried and downloaded from the InterEvol web interface at http://biodev.cea.fr/interevol/.
Affiliation(s)
- Guilhem Faure
- CEA, iBiTecS, F-91191 Gif sur Yvette and CNRS, F-91191 Gif sur Yvette, France
- Jessica Andreani
- CEA, iBiTecS, F-91191 Gif sur Yvette and CNRS, F-91191 Gif sur Yvette, France
- Raphaël Guerois
- CEA, iBiTecS, F-91191 Gif sur Yvette and CNRS, F-91191 Gif sur Yvette, France
|
84
|
Zhao N, Pang B, Shyu CR, Korkin D. Feature-based classification of native and non-native protein-protein interactions: Comparing supervised and semi-supervised learning approaches. Proteomics 2011; 11:4321-30. [PMID: 22002942 DOI: 10.1002/pmic.201100217] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Received: 04/25/2011] [Revised: 07/26/2011] [Accepted: 08/18/2011] [Indexed: 12/12/2022]
Abstract
Structural knowledge about protein-protein interactions can provide insights to the basic processes underlying cell function. Recent progress in experimental and computational structural biology has led to a rapid growth of experimentally resolved structures and computationally determined near-native models of protein-protein interactions. However, determining whether a protein-protein interaction is physiological or it is the artifact of an experimental or computational method remains a challenging problem. In this work, we have addressed two related problems. The first problem is distinguishing between the experimentally obtained physiological and crystal-packing protein-protein interactions. The second problem is concerned with the classification of near-native and inaccurate docking models. We first defined a universal set of interface features and employed a support vector machines (SVM)-based approach to classify the interactions for both problems, with the accuracy, precision, and recall for the first problem classifier reaching 93%. To improve the classification, we next developed a semi-supervised learning approach for the second problem, using transductive SVM (TSVM). We applied both classifiers to a commonly used protein docking benchmark of 124 complexes. We found that while we reached the classification accuracies of 78.9% for the SVM classifier and 80.3% for the TSVM classifier, improving protein-docking methods by model re-ranking remains a challenging problem.
Affiliation(s)
- Nan Zhao
- Informatics Institute and Department of Computer Science, University of Missouri, Columbia, MO, USA
|
85
|
Teyra J, Samsonov SA, Schreiber S, Pisabarro MT. SCOWLP update: 3D classification of protein-protein, -peptide, -saccharide and -nucleic acid interactions, and structure-based binding inferences across folds. BMC Bioinformatics 2011; 12:398. [PMID: 21992011 PMCID: PMC3210135 DOI: 10.1186/1471-2105-12-398] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.8] [Received: 03/09/2011] [Accepted: 10/13/2011] [Indexed: 11/10/2022]
Abstract
Background Protein interactions are essential for coordinating cellular functions. Proteomic studies have already elucidated a huge number of protein-protein interactions that require detailed functional analysis. Understanding the structural basis of each individual interaction through its structural determination is necessary, yet an unfeasible task. Therefore, computational tools able to predict protein binding regions and recognition modes are required to rationalize putative molecular functions for proteins. With this aim, we previously created SCOWLP, a structural classification of protein binding regions at protein family level, based on the information obtained from high-resolution 3D protein-protein and protein-peptide complexes. Description We present here a new version of SCOWLP that has been enhanced by the inclusion of protein-nucleic acid and protein-saccharide interactions. SCOWLP takes interfacial solvent into account for a detailed characterization of protein interactions. In addition, the binding regions obtained per protein family have been enriched by the inclusion of predicted binding regions, which have been inferred from structurally related proteins across all existing folds. These inferences might become very useful to suggest novel recognition regions and compare structurally similar interfaces from different families. Conclusions The updated SCOWLP has new functionalities that allow both detection and comparison of protein regions recognizing different types of ligands, which include other proteins, peptides, nucleic acids and saccharides, within a solvated environment. Currently, SCOWLP allows the analysis of predicted protein binding regions based on structure-based inferences across fold space. These predictions may have a unique potential in assisting protein docking, in providing insights into protein interaction networks, and in guiding rational engineering of protein ligands. The newly designed SCOWLP web application has an improved user-friendly interface that facilitates its usage, and is available at http://www.scowlp.org.
Affiliation(s)
- Joan Teyra
- Structural Bioinformatics BIOTEC TU Dresden, Tatzberg 47-51 01037 Dresden, Germany.
|
86
|
Bound water at protein-protein interfaces: partners, roles and hydrophobic bubbles as a conserved motif. PLoS One 2011; 6:e24712. [PMID: 21961043 PMCID: PMC3178540 DOI: 10.1371/journal.pone.0024712] [Citation(s) in RCA: 48] [Impact Index Per Article: 3.7] [Received: 06/03/2011] [Accepted: 08/17/2011] [Indexed: 12/18/2022]
Abstract
Background There is a great interest in understanding and exploiting protein-protein associations as new routes for treating human disease. However, these associations are difficult to structurally characterize or model although the number of X-ray structures for protein-protein complexes is expanding. One feature of these complexes that has received little attention is the role of water molecules in the interfacial region. Methodology A data set of 4741 water molecules abstracted from 179 high-resolution (≤ 2.30 Å) X-ray crystal structures of protein-protein complexes was analyzed with a suite of modeling tools based on the HINT forcefield and hydrogen-bonding geometry. A metric termed Relevance was used to classify the general roles of the water molecules. Results The water molecules were found to be involved in: a) (bridging) interactions with both proteins (21%), b) favorable interactions with only one protein (53%), and c) no interactions with either protein (26%). This trend is shown to be independent of the crystallographic resolution. Interactions with residue backbones are consistent for all classes and account for 21.5% of all interactions. Interactions with polar residues are significantly more common for the first group and interactions with non-polar residues dominate the last group. Waters interacting with both proteins stabilize on average the proteins' interaction (−0.46 kcal mol⁻¹), but the overall average contribution of a single water to the protein-protein interaction energy is unfavorable (+0.03 kcal mol⁻¹). Analysis of the waters without favorable interactions with either protein suggests that this is a conserved phenomenon: 42% of these waters have SASA ≤ 10 Å² and are thus largely buried, and 69% of these are within predominantly hydrophobic environments or “hydrophobic bubbles”. Such water molecules may have an important biological purpose in mediating protein-protein interactions.
|
87
|
Aziz MM, Maleki M, Rueda L, Raza M, Banerjee S. Prediction of biological protein-protein interactions using atom-type and amino acid properties. Proteomics 2011; 11:3802-10. [DOI: 10.1002/pmic.201100186] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.3] [Received: 04/12/2011] [Revised: 05/25/2011] [Accepted: 05/30/2011] [Indexed: 11/10/2022]
|
88
|
Mitra P, Pal D. Combining Bayes classification and point group symmetry under Boolean framework for enhanced protein quaternary structure inference. Structure 2011; 19:304-12. [PMID: 21397182 DOI: 10.1016/j.str.2011.01.009] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.2] [Received: 07/08/2010] [Revised: 01/10/2011] [Accepted: 01/10/2011] [Indexed: 11/30/2022]
Abstract
Our ability to infer the protein quaternary structure automatically from atom and lattice information is inadequate, especially for weak complexes and heteromeric quaternary structures. Several approaches exist, but they have limited performance. Here, we present a new scheme to infer protein quaternary structure from lattice and protein information, with all-around coverage for strong, weak and very weak affinity homomeric and heteromeric complexes. The scheme combines a naive Bayes classifier and point group symmetry under a Boolean framework to detect quaternary structures in the crystal lattice. It consistently produces ≥90% coverage across diverse benchmarking data sets, including a notably superior 95% coverage for recognition of heteromeric complexes, compared with 53% on the same data set by the current state-of-the-art method. The detailed study of a limited number of prediction-failed cases offers interesting insights into the intriguing nature of protein contacts in the lattice. The findings have implications for accurate inference of quaternary states of proteins, especially weak affinity complexes.
Affiliation(s)
- Pralay Mitra
- Bioinformatics Centre, Supercomputer Education Research Centre, Indian Institute of Science, Bangalore 560 012, India
|
89
|
Acuner Ozbabacan SE, Engin HB, Gursoy A, Keskin O. Transient protein-protein interactions. Protein Eng Des Sel 2011; 24:635-48. [DOI: 10.1093/protein/gzr025] [Citation(s) in RCA: 170] [Impact Index Per Article: 13.1] [Indexed: 01/05/2023]
|
90
|
Teyra J, Hawkins J, Zhu H, Pisabarro MT. Studies on the inference of protein binding regions across fold space based on structural similarities. Proteins 2011; 79:499-508. [PMID: 21069715 DOI: 10.1002/prot.22897] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Indexed: 01/27/2023]
Abstract
The emerging picture of a continuous protein fold space highlights the existence of non-obvious structural similarities between proteins with apparently different topologies. The identification of structural resemblances across fold space and the analysis of similar recognition regions may be a valuable source of information towards protein structure-based functional characterization. In this work, we use non-sequential structural alignment methods (ns-SAs) to identify structural similarities between protein pairs independently of their SCOP hierarchy, and we calculate the significance of binding region conservation using the interacting residues overlap in the ns-SA. We cluster the binding inferences for each family to distinguish already known family binding regions from putative new ones. Our methodology exploits the enormous amount of data available in the PDB to identify binding region similarities within protein families and to propose putative binding regions. Our results indicate that there is a plethora of structurally common binding regions among proteins, independently of current fold classifications. We obtain a 6- to 8-fold enrichment of novel binding regions, and identify binding inferences for 728 protein families that so far lack binding information in the PDB. We explore binding mode analogies between ligands from commonly clustered binding regions to investigate the utility of our methodology. A comprehensive analysis of the obtained binding inferences may help in the functional characterization of protein recognition and assist rational engineering. The data obtained in this work is available in the download link at www.scowlp.org.
Affiliation(s)
- Joan Teyra
- Structural Bioinformatics, BIOTEC, Technical University of Dresden, Tatzberg 47-51, 01307 Dresden, Germany.
|
91
|
Hashimoto K, Nishi H, Bryant S, Panchenko AR. Caught in self-interaction: evolutionary and functional mechanisms of protein homooligomerization. Phys Biol 2011; 8:035007. [PMID: 21572178 DOI: 10.1088/1478-3975/8/3/035007] [Citation(s) in RCA: 87] [Impact Index Per Article: 6.7] [Indexed: 11/11/2022]
Abstract
Many soluble and membrane proteins form homooligomeric complexes in a cell which are responsible for the diversity and specificity of many pathways, may mediate and regulate gene expression, activity of enzymes, ion channels, receptors, and cell adhesion processes. The evolutionary and physical mechanisms of oligomerization are very diverse and its general principles have not yet been formulated. Homooligomeric states may be conserved within certain protein subfamilies and might be important in providing specificity to certain substrates while minimizing interactions with other unwanted partners. Moreover, recent studies have led to a greater awareness that transitions between different oligomeric states may regulate protein activity and provide the switch between different pathways. In this paper we summarize the biological importance of homooligomeric assemblies, physico-chemical properties of their interfaces, experimental and computational methods for their identification and prediction. We particularly focus on homooligomer evolution and describe the mechanisms to develop new specificities through the formation of different homooligomeric complexes. Finally, we discuss the possible role of oligomeric transitions in the regulation of protein activity and compile a set of experimental examples with such regulatory mechanisms.
Affiliation(s)
- Kosuke Hashimoto
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
|
92
|
Tuncbag N, Gursoy A, Keskin O. Prediction of protein-protein interactions: unifying evolution and structure at protein interfaces. Phys Biol 2011; 8:035006. [PMID: 21572173 DOI: 10.1088/1478-3975/8/3/035006] [Citation(s) in RCA: 49] [Impact Index Per Article: 3.8] [Indexed: 01/02/2023]
Abstract
The vast majority of the chores in the living cell involve protein-protein interactions. Providing details of protein interactions at the residue level and incorporating them into protein interaction networks are crucial toward the elucidation of a dynamic picture of cells. Despite the rapid increase in the number of structurally known protein complexes, we are still far away from a complete network. Given experimental limitations, computational modeling of protein interactions is a prerequisite to proceed on the way to complete structural networks. In this work, we focus on the question 'how do proteins interact?' rather than 'which proteins interact?' and we review structure-based protein-protein interaction prediction approaches. As a sample approach for modeling protein interactions, PRISM is detailed which combines structural similarity and evolutionary conservation in protein interfaces to infer structures of complexes in the protein interaction network. This will ultimately help us to understand the role of protein interfaces in predicting bound conformations.
Affiliation(s)
- Nurcan Tuncbag
- Koc University, Center for Computational Biology and Bioinformatics, and College of Engineering, Rumelifeneri Yolu, 34450 Sariyer Istanbul, Turkey
|
93
|
Fernández‐Recio J. Prediction of protein binding sites and hot spots. Wiley Interdisciplinary Reviews: Computational Molecular Science 2011. [DOI: 10.1002/wcms.45] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.2] [Indexed: 11/10/2022]
|
94
|
Zheng M, Cierpicki T, Burdette AJ, Utepbergenov D, Janczyk PŁ, Derewenda U, Stukenberg PT, Caldwell KA, Derewenda ZS. Structural features and chaperone activity of the NudC protein family. J Mol Biol 2011; 409:722-41. [PMID: 21530541 DOI: 10.1016/j.jmb.2011.04.018] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.8] [Received: 02/17/2011] [Revised: 04/07/2011] [Accepted: 04/07/2011] [Indexed: 11/19/2022]
Abstract
The NudC family consists of four conserved proteins with representatives in all eukaryotes. The archetypal nudC gene from Aspergillus nidulans is a member of the nud gene family that is involved in the maintenance of nuclear migration. This family also includes nudF, whose human orthologue, Lis1, codes for a protein essential for brain cortex development. Three paralogues of NudC are known in vertebrates: NudC, NudC-like (NudCL), and NudC-like 2 (NudCL2). The fourth distantly related member of the family, CML66, contains a NudC-like domain. The three principal NudC proteins have no catalytic activity but appear to play as yet poorly defined roles in proliferating and dividing cells. We present crystallographic and NMR studies of the human NudC protein and discuss the results in the context of structures recently deposited by structural genomics centers (i.e., NudCL and mouse NudCL2). All proteins share the same core CS domain characteristic of proteins acting either as cochaperones of Hsp90 or as independent small heat shock proteins. However, while NudC and NudCL dimerize via an N-terminally located coiled coil, the smaller NudCL2 lacks this motif and instead dimerizes as a result of unique domain swapping. We show that NudC and NudCL, but not NudCL2, inhibit the aggregation of several target proteins, consistent with an Hsp90-independent heat shock protein function. Importantly, and in contrast to several previous reports, none of the three proteins is able to form binary complexes with Lis1. The availability of structural information will be of help in further studies on the cellular functions of the NudC family.
Affiliation(s)
- Meiying Zheng
- Department of Molecular Physiology and Biological Physics, University of Virginia School of Medicine, Charlottesville, VA 22908, USA
|
95
|
Prediction of protein–protein interaction types using the decision templates based on multiple classifier fusion. Mathematical and Computer Modelling 2010. [DOI: 10.1016/j.mcm.2010.01.025] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Indexed: 12/22/2022]
|
96
|
Launay G, Simonson T. A large decoy set of protein-protein complexes produced by flexible docking. J Comput Chem 2010; 32:106-20. [DOI: 10.1002/jcc.21604] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/10/2022]
|
97
|
Mechanisms of protein oligomerization, the critical role of insertions and deletions in maintaining different oligomeric states. Proc Natl Acad Sci U S A 2010; 107:20352-7. [PMID: 21048085 DOI: 10.1073/pnas.1012999107] [Citation(s) in RCA: 140] [Impact Index Per Article: 10.0] [Indexed: 11/18/2022]
Abstract
The main principles of protein-protein recognition are elucidated by studies of homooligomers, which in turn mediate and regulate gene expression, activity of enzymes, ion channels, receptors, and cell-cell adhesion processes. Here we explore oligomeric states of homologous proteins in various organisms to better understand the functional roles and evolutionary mechanisms of homooligomerization. We observe a great diversity in mechanisms controlling oligomerization and focus in our study on insertions and deletions in homologous proteins and how they enable or disable complex formation. We show that insertions and deletions which differentiate monomers and dimers have a significant tendency to be located on the interaction interfaces, and that about a quarter of all proteins studied and forty percent of enzymes have regions which mediate or disrupt the formation of oligomers. We suggest that relatively small insertions or deletions may have a profound effect on complex stability and/or specificity. Indeed, removal of complex-enabling regions from protein structures in many cases resulted in the complete or partial loss of stability. Moreover, we find that insertions and deletions modulating oligomerization have a lower aggregation propensity and contain a larger fraction of polar, charged residues, glycine and proline compared to conventional interfaces and protein surface. Most likely, these regions mediate specific interactions, prevent nonspecific dysfunctional aggregation, and preclude undesired interactions between close paralogs, thereby separating their functional pathways. Last, we show how the presence or absence of insertions and deletions on interfaces might be of practical value in annotating protein oligomeric states.
|
98
|
Xu Q, Dunbrack RL. The protein common interface database (ProtCID)--a comprehensive database of interactions of homologous proteins in multiple crystal forms. Nucleic Acids Res 2010; 39:D761-70. [PMID: 21036862 PMCID: PMC3013667 DOI: 10.1093/nar/gkq1059] [Citation(s) in RCA: 74] [Impact Index Per Article: 5.3] [Indexed: 11/15/2022]
Abstract
The protein common interface database (ProtCID) is a database that contains clusters of similar homodimeric and heterodimeric interfaces observed in multiple crystal forms (CFs). Such interfaces, especially of homologous but non-identical proteins, have been associated with biologically relevant interactions. In ProtCID, protein chains in the Protein Data Bank (PDB) are grouped based on their PFAM domain architectures. For a single PFAM architecture, all the dimers present in each CF are constructed and compared with those in other CFs that contain the same domain architecture. Interfaces occurring in two or more CFs comprise an interface cluster in the database. The same process is used to compare heterodimers of chains with different domain architectures. By examining interfaces that are shared by many homologous proteins in different CFs, we find that the PDB and the Protein Interfaces, Surfaces, and Assemblies (PISA) are not always consistent in their annotations of biological assemblies in a homologous family. Our data therefore provide an independent check on publicly available annotations of the structures of biological interactions for PDB entries. Common interfaces may also be useful in studies of protein evolution. Coordinates for all interfaces in a cluster are downloadable for further analysis. ProtCID is available at http://dunbrack2.fccc.edu/protcid.
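The grouping step described in this abstract (compare dimers across crystal forms within one domain architecture, then keep interfaces seen in two or more CFs) can be illustrated with a minimal Python sketch. It is not the ProtCID implementation: the dictionary keys, the Jaccard similarity on residue-pair contact sets, the 0.5 threshold, and the single-linkage grouping are all assumptions chosen for illustration.

```python
from collections import defaultdict
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two contact sets (an assumed stand-in measure)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_interfaces(interfaces, threshold=0.5):
    """Group similar interfaces and keep groups seen in >= 2 crystal forms.

    interfaces: list of dicts with hypothetical keys
        'id', 'architecture', 'crystal_form', 'contacts' (a set).
    """
    # Only interfaces with the same domain architecture are compared.
    by_arch = defaultdict(list)
    for iface in interfaces:
        by_arch[iface["architecture"]].append(iface)

    clusters = []
    for members in by_arch.values():
        # Union-find over interface ids (single-linkage grouping).
        parent = {m["id"]: m["id"] for m in members}

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x

        for a, b in combinations(members, 2):
            # Compare only interfaces from different crystal forms.
            if a["crystal_form"] == b["crystal_form"]:
                continue
            if jaccard(a["contacts"], b["contacts"]) >= threshold:
                parent[find(a["id"])] = find(b["id"])

        groups = defaultdict(list)
        for m in members:
            groups[find(m["id"])].append(m)

        # A cluster must span at least two distinct crystal forms.
        for group in groups.values():
            forms = {m["crystal_form"] for m in group}
            if len(forms) >= 2:
                clusters.append(sorted(m["id"] for m in group))
    return sorted(clusters)


# Toy data: ids, architectures, and contacts are invented for illustration.
example = [
    {"id": "A1", "architecture": "(Pkinase)", "crystal_form": "CF1",
     "contacts": {(10, 55), (12, 57), (14, 60)}},
    {"id": "A2", "architecture": "(Pkinase)", "crystal_form": "CF2",
     "contacts": {(10, 55), (12, 57), (15, 61)}},
    {"id": "A3", "architecture": "(Pkinase)", "crystal_form": "CF1",
     "contacts": {(80, 90)}},
    {"id": "B1", "architecture": "(SH2)", "crystal_form": "CF3",
     "contacts": {(10, 55), (12, 57), (14, 60)}},
]
print(cluster_interfaces(example))
```

In this toy input only A1 and A2 form a cluster: they share an architecture, come from different crystal forms, and overlap in half of their combined contacts. B1 has identical contacts to A1 but a different architecture, so it is never compared with it.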
Affiliation(s)
- Qifang Xu
- Institute for Cancer Research, Fox Chase Cancer Center, 333 Cottman Avenue, Philadelphia, PA 19111, USA
|
99
|
Abstract
M-ORBIS is a Molecular Cartography approach that performs integrative high-throughput analysis of structural data to localize all types of binding sites and associated partners by homology and to characterize their properties and behaviors in a systemic way. The robustness of our binding site inferences was compared to four curated datasets corresponding to protein heterodimers and homodimers and protein–DNA/RNA assemblies. The Molecular Cartographies of structurally well-detailed proteins show that 44% of their surfaces interact with non-solvent partners. Residue contact frequencies with water suggest that ∼86% of their surfaces are transiently solvated, whereas only 15% are specifically solvated. Our analysis also reveals the existence of two major binding site families: specific binding sites, which can only bind one type of molecule (protein, DNA, RNA, etc.), and polyvalent binding sites, which can bind several distinct types of molecule. Specific homodimer binding sites are, for instance, nearly twice as hydrophobic as previously described and more closely resemble the protein core, while polyvalent binding sites able to form homo- and heterodimers more closely resemble the surfaces involved in crystal packing. Similarly, the regions able to bind DNA and to alternatively form homodimers are more hydrophobic and less polar than previously described DNA binding sites.
|
100
|
Zen A, Micheletti C, Keskin O, Nussinov R. Comparing interfacial dynamics in protein-protein complexes: an elastic network approach. BMC Struct Biol 2010; 10:26. [PMID: 20691107 PMCID: PMC2927602 DOI: 10.1186/1472-6807-10-26] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.2] [Received: 02/05/2010] [Accepted: 08/08/2010] [Indexed: 01/12/2023]
Abstract
Background: The transient, or permanent, association of proteins to form organized complexes is one of the most common mechanisms of regulation of biological processes. Systematic physico-chemical studies of the binding interfaces have previously shown that a key mechanism for the formation/stabilization of dimers is the steric and chemical complementarity of the two semi-interfaces. The role of the fluctuation dynamics at the interface of the interacting subunits, although expectedly important, has proved more elusive to characterize. The aim of the present computational study is to gain insight into salient dynamics-based aspects of protein-protein interfaces. Results: The interface dynamics was characterized by means of an elastic network model for 22 representative dimers covering three main interface types. The three groups gather dimers sharing the same interface but with good (type I) or poor (type II) similarity of the overall fold, or dimers sharing only one of the semi-interfaces (type III). The set comprises obligate dimers, which are complexes for which no structural representative of the free form(s) is available. Considerations were accordingly limited to bound and unbound forms of the monomeric subunits of the dimers. We proceeded by first computing the mobility of amino acids at the interface of the bound forms and comparing it with the mobility of (i) other surface amino acids and (ii) interface amino acids in the unbound forms. In both cases different dynamic patterns were observed across interface types and depending on whether the interface belongs to an obligate or non-obligate complex. Conclusions: The comparative investigation indicated that the mobility of amino acids at the dimeric interface is generally lower than for other amino acids at the protein surface. The change in interfacial mobility upon removing "in silico" the partner monomer (unbound form) was next found to be correlated with the interface type, size and obligate nature of the complex. In particular, going from the unbound to the bound forms, the interfacial mobility is noticeably reduced for dimers with type I interfaces, while it is largely unchanged for type II ones. The results suggest that these structurally and biologically different types of interfaces are stabilized by different balancing mechanisms between enthalpy and conformational entropy.
Affiliation(s)
- Andrea Zen
- SISSA, Democritos CNR-IOM and Italian Institute of Technology, Via Bonomea 265, 34136 Trieste, Italy
|