101
|
Wang J, Cao Z, Yu J. Protein Structures-based Neighborhood Analysis vs Preferential Interactions Between the Special Pairs of Amino acids? J Biomol Struct Dyn 2011; 28:629-32; discussion 669-674. [DOI: 10.1080/073911011010524968] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 10/28/2022]
|
102
|
Abdin MZ, Kiran U, Alam A. Analysis of osmotin, a PR protein as metabolic modulator in plants. Bioinformation 2011; 5:336-40. [PMID: 21383921 PMCID: PMC3046038 DOI: 10.6026/97320630005336] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.9] [Received: 11/01/2010] [Accepted: 12/04/2010] [Indexed: 11/23/2022] [Open Access]
Abstract
Osmotin is an abundant cationic multifunctional protein discovered in cells of tobacco (Nicotiana tabacum L. var Wisconsin 38) adapted to an environment of low osmotic potential. Besides its role as an osmoregulator, it protects plants from pathogens and is therefore also placed in the PRP family of proteins. Osmotin-induced proline accumulation has been reported to confer tolerance against both biotic and abiotic stresses in plants, including transgenic tomato and strawberry overexpressing the osmotin gene. The exact mechanism by which osmotin induces proline is, however, not yet known. These observations led us to hypothesize that osmotin could regulate these plant responses by acting as a transcription factor, a cell signal pathway modulator, or both. We therefore undertook the present investigation to analyze the osmotin protein as a transcription factor using bioinformatics tools. Available online DNA-binding motif search programs revealed that osmotin does not contain DNA-binding motifs. Alignment of the osmotin protein with protein sequences from DATF showed homology in the range of 0-20%, suggesting that it might not contain a DNA-binding motif. Further, to find a unique DNA-binding domain, superimposition of the osmotin 3D structure on modeled Arabidopsis transcription factors using Chimera also suggested its absence. However, evidence implicating osmotin in cell signaling was found during the study. From these results, we concluded that osmotin is not a transcription factor but regulates plant responses to biotic and abiotic stresses through cell signaling.
Affiliation(s)
- Malik Zainul Abdin
- Department of Biotechnology, Faculty of Science, Jamia Hamdard, New Delhi-110062, India
- Malik Zainul Abdin: Tel: +91-11-26059688, Extn: 5583
- Usha Kiran
- Faculty of Engineering and Interdisciplinary Sciences, Jamia Hamdard, New Delhi-110062, India
- Afshar Alam
- Department of Computer Science, Jamia Hamdard, New Delhi-110062, India
|
103
|
Xue LC, Jordan RA, El-Manzalawy Y, Dobbs D, Honavar V. Ranking Docked Models of Protein-Protein Complexes Using Predicted Partner-Specific Protein-Protein Interfaces: A Preliminary Study. ACM-BCB ... ... : THE ... ACM CONFERENCE ON BIOINFORMATICS, COMPUTATIONAL BIOLOGY AND BIOMEDICINE 2011; 2011:441-445. [PMID: 25905110 DOI: 10.1145/2147805.2147866] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 12/13/2022]
Abstract
Computational protein-protein docking is a valuable tool for determining the conformation of complexes formed by interacting proteins. Selecting near-native conformations from the large number of possible models generated by docking software presents a significant challenge in practice. We introduce a novel method for ranking docked conformations based on the degree of overlap between the interface residues of a docked conformation formed by a pair of proteins with the set of predicted interface residues between them. Our approach relies on a method, called PS-HomPPI, for reliably predicting protein-protein interface residues by taking into account information derived from both interacting proteins. PS-HomPPI infers the residues of a query protein that are likely to interact with a partner protein based on known interface residues of the homo-interologs of the query-partner protein pair, i.e., pairs of interacting proteins that are homologous to the query protein and partner protein. Our results on Docking Benchmark 3.0 show that the quality of the ranking of docked conformations using our method is consistently superior to that produced using ClusPro cluster-size-based and energy-based criteria for 61 out of the 64 docking complexes for which PS-HomPPI produces interface predictions. An implementation of our method for ranking docked models is freely available at: http://einstein.cs.iastate.edu/DockRank/.
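The ranking idea in this abstract can be illustrated with a minimal sketch. It is not the DockRank implementation: the abstract only says "degree of overlap", so the Jaccard index used here is an assumption, and the names `overlap_score`, `rank_models`, and the toy residue sets are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# A residue is identified here by (chain ID, residue number); this encoding is an assumption.
Residue = Tuple[str, int]


def overlap_score(docked: Set[Residue], predicted: Set[Residue]) -> float:
    """Jaccard overlap between a docked model's interface and the predicted interface."""
    union = docked | predicted
    if not union:
        return 0.0
    return len(docked & predicted) / len(union)


def rank_models(
    models: Dict[str, Set[Residue]], predicted: Set[Residue]
) -> List[Tuple[str, float]]:
    """Sort docked models from highest to lowest overlap with the predicted interface."""
    scored = [(name, overlap_score(iface, predicted)) for name, iface in models.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy data: predicted interface residues and the interfaces of three docked models.
predicted = {("A", 10), ("A", 11), ("B", 42)}
models = {
    "model_1": {("A", 10), ("B", 7)},
    "model_2": {("A", 10), ("A", 11), ("B", 42), ("B", 43)},
    "model_3": {("A", 90)},
}
ranking = rank_models(models, predicted)
```

In this toy case the model whose interface shares the most residues with the prediction is ranked first; any other set-overlap measure could be substituted for the Jaccard index.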
Affiliation(s)
- Li C Xue
- Bioinformatics and Computational Biology Program, Iowa State University, Ames, IA, 50011, USA
- Rafael A Jordan
- Department of Genetics, Development and Cell Biology, Iowa State University, Ames, 50011, USA
- Drena Dobbs
- Department of Computer Science, Pontificia Universidad Javeriana, Cali, Colombia
- Vasant Honavar
- Department of Systems and Computer Engineering, Al-Azhar University, Cairo, Egypt
|
104
|
Yokoyama KD, Thorne JL, Wray GA. Coordinated genome-wide modifications within proximal promoter cis-regulatory elements during vertebrate evolution. Genome Biol Evol 2010; 3:66-74. [PMID: 21118975 PMCID: PMC3021792 DOI: 10.1093/gbe/evq078] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Indexed: 01/05/2023] [Open Access]
Abstract
There often exists a "one-to-many" relationship between a transcription factor and a multitude of binding sites throughout the genome. It is commonly assumed that transcription factor binding motifs remain largely static over the course of evolution because changes in binding specificity can alter the interactions with potentially hundreds of sites across the genome. Focusing on regulatory motifs overrepresented at specific locations within or near the promoter, we find that a surprisingly large number of cis-regulatory elements have been subject to coordinated genome-wide modifications during vertebrate evolution, such that the motif frequency changes on a single branch of vertebrate phylogeny. This was found to be the case even between closely related mammal species, with nearly a third of all location-specific consensus motifs exhibiting significant modifications within the human or mouse lineage since their divergence. Many of these modifications are likely to be compensatory changes throughout the genome following changes in protein factor binding affinities, whereas others may be due to changes in mutation rates or effective population size. The likelihood that this happened many times during vertebrate evolution highlights the need to examine additional taxa and to understand the evolutionary and molecular mechanisms underlying the evolution of protein-DNA interactions.
|
105
|
Lensink MF, Wodak SJ. Blind predictions of protein interfaces by docking calculations in CAPRI. Proteins 2010; 78:3085-95. [DOI: 10.1002/prot.22850] [Citation(s) in RCA: 73] [Impact Index Per Article: 5.2] [Indexed: 12/30/2022]
|
106
|
Chen P, Li J. Sequence-based identification of interface residues by an integrative profile combining hydrophobic and evolutionary information. BMC Bioinformatics 2010; 11:402. [PMID: 20667087 PMCID: PMC2921408 DOI: 10.1186/1471-2105-11-402] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.6] [Received: 03/25/2010] [Accepted: 07/28/2010] [Indexed: 01/09/2023] [Open Access]
Abstract
BACKGROUND: Protein-protein interactions play essential roles in protein function determination and drug design. Numerous methods have been proposed to recognize their interaction sites; however, only a small proportion of protein complexes have been successfully resolved due to the high cost. Therefore, it is important to improve the performance of predicting protein interaction sites from primary sequence alone. RESULTS: We propose a new idea to construct an integrative profile for each residue in a protein by combining its hydrophobic and evolutionary information. A support vector machine (SVM) ensemble is then developed, where SVMs train on different pairs of positive (interface sites) and negative (non-interface sites) subsets. The subsets, which have roughly the same sizes, are grouped in the order of accessible surface area change before and after complexation. A self-organizing map (SOM) technique is applied to group similar input vectors, making the identification of interface residues more accurate. An ensemble of ten SVMs achieves an MCC improvement of around 8% and an F1 improvement of around 9% over an ensemble of three SVMs. As expected, SVM ensembles consistently perform better than individual SVMs. In addition, the model based on the integrative profiles outperforms those based on the sequence profile or the hydropathy scale alone. As our method uses a small number of features to encode the input vectors, our model is simpler, faster and more accurate than the existing methods. CONCLUSIONS: The integrative profile combining hydrophobic and evolutionary information contributes most to the protein-protein interaction prediction. Results show that the evolutionary context of a residue with respect to hydrophobicity improves the identification of protein interface residues. In addition, the ensemble of SVM classifiers improves the prediction performance. AVAILABILITY: Datasets and software are available at http://mail.ustc.edu.cn/~bigeagle/BMCBioinfo2010/index.htm.
Affiliation(s)
- Peng Chen
- Bioinformatics Research Center, School of Computer Engineering, Nanyang Technological University, 639798 Singapore
|
107
|
Murakami Y, Mizuguchi K. Applying the Naïve Bayes classifier with kernel density estimation to the prediction of protein-protein interaction sites. Bioinformatics 2010; 26:1841-8. [PMID: 20529890 DOI: 10.1093/bioinformatics/btq302] [Citation(s) in RCA: 161] [Impact Index Per Article: 11.5] [Indexed: 11/14/2022]
Abstract
MOTIVATION: The limited availability of protein structures often restricts the functional annotation of proteins and the identification of their protein-protein interaction sites. Computational methods to identify interaction sites from protein sequences alone are, therefore, required for unraveling the functions of many proteins. This article describes a new method (PSIVER) to predict interaction sites, i.e. residues binding to other proteins, in protein sequences. Only sequence features (position-specific scoring matrix and predicted accessibility) are used for training a Naïve Bayes classifier (NBC), and conditional probabilities of each sequence feature are estimated using a kernel density estimation method (KDE). RESULTS: The leave-one-out cross-validation of PSIVER achieved a Matthews correlation coefficient (MCC) of 0.151, an F-measure of 35.3%, a precision of 30.6% and a recall of 41.6% on a non-redundant set of 186 protein sequences extracted from 105 heterodimers in the Protein Data Bank (consisting of 36 219 residues, of which 15.2% were known interface residues). Even though the dataset used for training was highly imbalanced, a randomization test demonstrated that the proposed method managed to avoid overfitting. PSIVER was also tested on 72 sequences not used in training (consisting of 18 140 residues, of which 10.6% were known interface residues), and achieved an MCC of 0.135, an F-measure of 31.5%, a precision of 25.0% and a recall of 46.5%, outperforming other publicly available servers tested on the same dataset. PSIVER enables experimental biologists to identify potential interface residues in unknown proteins from sequence information alone, and to mutate those residues selectively in order to unravel protein functions. AVAILABILITY: Freely available on the web at http://tardis.nibio.go.jp/PSIVER/
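The evaluation measures named in this abstract follow standard definitions, sketched below as a reading aid. The function names are hypothetical and this is not code from PSIVER; the only figures taken from the abstract are the cross-validation precision (30.6%) and recall (41.6%).

```python
import math


def precision(tp: int, fp: int) -> float:
    """Fraction of predicted interface residues that are true interface residues."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    """Fraction of true interface residues that were predicted."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f_measure(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


def mcc(tp: int, fp: int, tn: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Harmonic mean of the cross-validation precision and recall quoted in the abstract.
f_cv = f_measure(0.306, 0.416)
```

The harmonic mean of the quoted cross-validation precision and recall comes out at about 0.353, in line with the reported F-measure of 35.3%. MCC is useful on imbalanced data such as this because it uses all four confusion-matrix cells.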
|
108
|
Konc J, Janezic D. ProBiS algorithm for detection of structurally similar protein binding sites by local structural alignment. Bioinformatics 2010; 26:1160-8. [PMID: 20305268 PMCID: PMC2859123 DOI: 10.1093/bioinformatics/btq100] [Citation(s) in RCA: 184] [Impact Index Per Article: 13.1] [Indexed: 11/30/2022]
Abstract
Motivation: Exploitation of locally similar 3D patterns of physicochemical properties on the surface of a protein for detection of binding sites that may lack sequence and global structural conservation. Results: An algorithm, ProBiS, is described that detects structurally similar sites on protein surfaces by local surface structure alignment. It compares the query protein to members of a database of protein 3D structures and detects, with sub-residue precision, structurally similar sites as patterns of physicochemical properties on the protein surface. Using an efficient maximum clique algorithm, the program identifies proteins that share local structural similarities with the query protein and generates structure-based alignments of these proteins with the query. Structural similarity scores are calculated for the query protein's surface residues, and are expressed as different colors on the query protein surface. The algorithm has been used successfully for the detection of protein–protein, protein–small ligand and protein–DNA binding sites. Availability: The software is available, as a web tool, free of charge for academic users at http://probis.cmm.ki.si Contact: dusa@cmm.ki.si Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Janez Konc
- National Institute of Chemistry, Ljubljana, Slovenia
|
109
|
López G, Ezkurdia I, Tress ML. Assessment of ligand binding residue predictions in CASP8. Proteins 2010; 77 Suppl 9:138-46. [PMID: 19714771 DOI: 10.1002/prot.22557] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.5] [Indexed: 11/07/2022]
Abstract
Here we detail the assessment process for the binding site prediction category of the eighth Critical Assessment of Protein Structure Prediction experiment (CASP8). Predictions were only evaluated for those targets that bound biologically relevant ligands and were assessed using the Matthews Correlation Coefficient. The results of the analysis clearly demonstrate that three predictors from two groups (Lee and Sternberg) stand out from the rest. A further two groups perform well over subsets of metal binding or nonmetal ligand binding targets. The best methods were able to make consistently reliable predictions based on model structures, though it was noticeable that the two targets that were not well predicted were also the hardest targets. The number of predictors that submitted new methods in this category was highly encouraging and suggests that current technology is at the level that experimental biochemists and structural biologists could benefit from what is clearly a growing field.
Affiliation(s)
- Gonzalo López
- Structural Biology and Biocomputing Programme, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
|
110
|
Development of a Novel Bioinformatics Tool for In Silico Validation of Protein Interactions. J Biomed Biotechnol 2010; 2010:670125. [PMID: 20625507 PMCID: PMC2896714 DOI: 10.1155/2010/670125] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 10/19/2009] [Revised: 03/10/2010] [Accepted: 03/30/2010] [Indexed: 11/17/2022] [Open Access]
Abstract
Protein interactions are crucial in most biological processes. Several in silico methods have been recently developed to predict them. This paper describes a bioinformatics method that combines sequence similarity and structural information to support experimental studies on protein interactions. Given a target protein, the approach selects the most likely interactors among the candidates revealed by experimental techniques, but not yet in vivo validated. The sequence and the structural information of the in vivo confirmed proteins and complexes are exploited to evaluate the candidate interactors. Finally, a score is calculated to suggest the most likely interactors of the target protein. As an example, we searched for GRB2 interactors. We ranked a set of 46 candidate interactors by the presented method. These candidates were then reduced to 21, through a score threshold chosen by means of a cross-validation strategy. Among them, the isoform 1 of MAPK14 was in silico confirmed as a GRB2 interactor. Finally, given a set of already confirmed interactors of GRB2, the accuracy and the precision of the approach were 75% and 86%, respectively. In conclusion, the proposed method can be conveniently exploited to select the proteins to be experimentally investigated within a set of potential interactors.
|
111
|
Liu B, Wang X, Lin L, Tang B, Dong Q, Wang X. Prediction of protein binding sites in protein structures using hidden Markov support vector machine. BMC Bioinformatics 2009; 10:381. [PMID: 19925685 PMCID: PMC2785799 DOI: 10.1186/1471-2105-10-381] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.5] [Received: 03/08/2009] [Accepted: 11/20/2009] [Indexed: 01/08/2023] [Open Access]
Abstract
Background: Predicting the binding sites between two interacting proteins provides important clues to the function of a protein. Recent research on protein binding site prediction has been mainly based on widely known machine learning techniques, such as artificial neural networks, support vector machines, conditional random field, etc. However, the prediction performance is still too low to be used in practice. It is necessary to explore new algorithms, theories and features to further improve the performance. Results: In this study, we introduce a novel machine learning model, the hidden Markov support vector machine, for protein binding site prediction. The model treats protein binding site prediction as a sequential labelling task based on the maximum margin criterion. Common features derived from protein sequences and structures, including protein sequence profile and residue accessible surface area, are used to train the hidden Markov support vector machine. When tested on six data sets, the method based on the hidden Markov support vector machine shows better performance than some state-of-the-art methods, including artificial neural networks, support vector machines and conditional random field. Furthermore, its running time is several orders of magnitude shorter than that of the compared methods. Conclusion: The improved prediction performance and computational efficiency of the method based on the hidden Markov support vector machine can be attributed to the following three factors. Firstly, the relation between labels of neighbouring residues is useful for protein binding site prediction. Secondly, the kernel trick is very advantageous to this field. Thirdly, the complexity of the training step for the hidden Markov support vector machine is linear in the number of training samples when the cutting-plane algorithm is used.
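The point about neighbouring labels can be illustrated with a small Viterbi decoder for sequential labelling. This is a generic sketch under stated assumptions, not the authors' hidden Markov support vector machine: the per-residue scores and transition scores below are made-up numbers, whereas in a trained model they would come from learned weights.

```python
from typing import List, Sequence


def viterbi(
    emission: Sequence[Sequence[float]], transition: Sequence[Sequence[float]]
) -> List[int]:
    """Return the highest-scoring label sequence.

    emission[i][k]   -- score of assigning label k to residue i
    transition[j][k] -- score of label j being followed by label k
    """
    if not emission:
        return []
    n_labels = len(emission[0])
    score = list(emission[0])      # best score of any path ending in each label
    back: List[List[int]] = []     # back-pointers, one list per residue after the first
    for i in range(1, len(emission)):
        new_score = []
        pointers = []
        for k in range(n_labels):
            best_prev = max(range(n_labels), key=lambda j: score[j] + transition[j][k])
            pointers.append(best_prev)
            new_score.append(score[best_prev] + transition[best_prev][k] + emission[i][k])
        back.append(pointers)
        score = new_score
    # Trace back from the best final label.
    label = max(range(n_labels), key=lambda k: score[k])
    path = [label]
    for pointers in reversed(back):
        label = pointers[label]
        path.append(label)
    path.reverse()
    return path


# Labels: 0 = non-binding, 1 = binding (hypothetical scores).
emission = [[0.0, 1.0], [0.6, 0.4], [0.0, 1.0]]
transition = [[0.5, 0.0], [0.0, 0.5]]  # rewards neighbouring residues sharing a label
labels = viterbi(emission, transition)
```

Labelling each residue independently would mark the middle residue as non-binding, since 0.6 exceeds 0.4. With the transition scores included, the decoder labels all three residues as binding, which is the kind of neighbour effect the abstract credits for the improved performance.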
Affiliation(s)
- Bin Liu
- Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, PR China.
|