Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Bordner AJ. Predicting small ligand binding sites in proteins using backbone structure. ACTA ACUST UNITED AC 2008;24:2865-71. [PMID: 18940825 PMCID: PMC2639300 DOI: 10.1093/bioinformatics/btn543] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022]

For:	Bordner AJ. Predicting small ligand binding sites in proteins using backbone structure. ACTA ACUST UNITED AC 2008;24:2865-71. [PMID: 18940825 PMCID: PMC2639300 DOI: 10.1093/bioinformatics/btn543] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022]

Number

Cited by Other Article(s)

Zhang Y, Li S, Meng K, Sun S. Machine Learning for Sequence and Structure-Based Protein-Ligand Interaction Prediction. J Chem Inf Model 2024;64:1456-1472. [PMID: 38385768 DOI: 10.1021/acs.jcim.3c01841] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/23/2024]

Dhakal A, McKay C, Tanner JJ, Cheng J. Artificial intelligence in the prediction of protein-ligand interactions: recent advances and future directions. Brief Bioinform 2022;23:bbab476. [PMID: 34849575 PMCID: PMC8690157 DOI: 10.1093/bib/bbab476] [Citation(s) in RCA: 72] [Impact Index Per Article: 36.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/05/2021] [Revised: 09/28/2021] [Accepted: 10/15/2021] [Indexed: 12/13/2022] Open

Hu X, Feng Z, Zhang X, Liu L, Wang S. The Identification of Metal Ion Ligand-Binding Residues by Adding the Reclassified Relative Solvent Accessibility. Front Genet 2020;11:214. [PMID: 32265982 PMCID: PMC7096583 DOI: 10.3389/fgene.2020.00214] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/09/2019] [Accepted: 02/24/2020] [Indexed: 11/13/2022] Open

Pirzada RH, Javaid N, Choi S. The Roles of the NLRP3 Inflammasome in Neurodegenerative and Metabolic Diseases and in Relevant Advanced Therapeutic Interventions. Genes (Basel) 2020;11:E131. [PMID: 32012695 PMCID: PMC7074480 DOI: 10.3390/genes11020131] [Citation(s) in RCA: 37] [Impact Index Per Article: 9.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/25/2019] [Revised: 01/22/2020] [Accepted: 01/22/2020] [Indexed: 02/07/2023] Open

Yan R, Wang X, Tian Y, Xu J, Xu X, Lin J. Prediction of zinc-binding sites using multiple sequence profiles and machine learning methods. Mol Omics 2019;15:205-215. [PMID: 31046040 DOI: 10.1039/c9mo00043g] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/11/2023]

Abstract

The zinc (Zn²⁺) cofactor has been proven to be involved in numerous biological mechanisms and the zinc-binding site is recognized as one of the most important post-translation modifications in proteins. Therefore, accurate knowledge of zinc ions in protein structures can provide potential clues for elucidation of protein folding and functions. However, determining zinc-binding residues by experimental means is usually lab-intensive and associated with high cost in most cases. In this context, the development of computational tools for identifying zinc-binding sites is highly desired, especially in the current post-genomic era. In this work, we developed a novel zinc-binding site prediction method by combining several intensively-trained machine learning models. To establish an accurate and generative method, we downloaded all zinc-binding proteins from the Protein Data Bank and prepared a non-redundant dataset. Meanwhile, a well-prepared dataset by other groups was also used. Then, effective and complementary features were extracted from sequences and three-dimensional structures of these proteins. Moreover, several well-designed machine learning models were intensively trained to construct accurate models. To assess the performance, the obtained predictors were stringently benchmarked using the diverse zinc-binding sites. Furthermore, several state-of-the-art in silico methods developed specifically for zinc-binding sites were also evaluated and compared. The results confirmed that our method is very competitive in real world applications and could become a complementary tool to wet lab experiments. To facilitate research in the community, a web server and stand-alone program implementing our method were constructed and are publicly available at . The downloadable program of our method can be easily used for the high-throughput screening of potential zinc-binding sites across proteomes.

Collapse

Srivastava A, Kumar M. Prediction of zinc binding sites in proteins using sequence derived information. J Biomol Struct Dyn 2018;36:4413-4423. [PMID: 29241411 DOI: 10.1080/07391102.2017.1417910] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/30/2023]

Zhang J, Chai H, Gao B, Yang G, Ma Z. HEMEsPred: Structure-Based Ligand-Specific Heme Binding Residues Prediction by Using Fast-Adaptive Ensemble Learning Scheme. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2018;15:147-156. [PMID: 28029626 DOI: 10.1109/tcbb.2016.2615010] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/06/2023]

Lima AN, Philot EA, Trossini GHG, Scott LPB, Maltarollo VG, Honorio KM. Use of machine learning approaches for novel drug discovery. Expert Opin Drug Discov 2016;11:225-39. [PMID: 26814169 DOI: 10.1517/17460441.2016.1146250] [Citation(s) in RCA: 138] [Impact Index Per Article: 17.3] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/17/2022]

Zhou W, Tang GW, Altman RB. High Resolution Prediction of Calcium-Binding Sites in 3D Protein Structures Using FEATURE. J Chem Inf Model 2015;55:1663-72. [PMID: 26226489 PMCID: PMC4731830 DOI: 10.1021/acs.jcim.5b00367] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]

Morshed N, Echols N, Adams PD. Using support vector machines to improve elemental ion identification in macromolecular crystal structures. ACTA CRYSTALLOGRAPHICA. SECTION D, BIOLOGICAL CRYSTALLOGRAPHY 2015;71:1147-58. [PMID: 25945580 PMCID: PMC4427199 DOI: 10.1107/s1399004715004241] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 09/29/2014] [Accepted: 03/01/2015] [Indexed: 11/11/2022]

Krivák R, Hoksza D. Improving protein-ligand binding site prediction accuracy by classification of inner pocket points using local features. J Cheminform 2015;7:12. [PMID: 25932051 PMCID: PMC4414931 DOI: 10.1186/s13321-015-0059-5] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/29/2014] [Accepted: 02/24/2015] [Indexed: 11/10/2022] Open

He W, Liang Z, Teng M, Niu L. mFASD: a structure-based algorithm for discriminating different types of metal-binding sites. Bioinformatics 2015;31:1938-44. [DOI: 10.1093/bioinformatics/btv044] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/15/2014] [Accepted: 01/17/2015] [Indexed: 11/13/2022] Open

Zhou Y, Xue S, Yang JJ. Calciomics: integrative studies of Ca2+-binding proteins and their interactomes in biological systems. Metallomics 2013;5:29-42. [PMID: 23235533 DOI: 10.1039/c2mt20009k] [Citation(s) in RCA: 59] [Impact Index Per Article: 5.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/15/2022]

Liu Z, Wang Y, Zhou C, Xue Y, Zhao W, Liu H. Computationally characterizing and comprehensive analysis of zinc-binding sites in proteins. BIOCHIMICA ET BIOPHYSICA ACTA-PROTEINS AND PROTEOMICS 2013;1844:171-80. [PMID: 23499845 DOI: 10.1016/j.bbapap.2013.03.001] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/03/2012] [Revised: 03/02/2013] [Accepted: 03/04/2013] [Indexed: 10/27/2022]

Abstract

Zinc is one of the most essential metals utilized by organisms, and zinc-binding proteins play an important role in a variety of biological processes such as transcription regulation, cell metabolism and apoptosis. Thus, characterizing the precise zinc-binding sites is fundamental to an elucidation of the biological functions and molecular mechanisms of zinc-binding proteins. Using systematic analyses of structural characteristics, we observed that 4-residue and 3-residue zinc-binding sites have distinctly specific geometric features. Based on the results, we developed the novel computational program Geometric REstriction for Zinc-binding (GRE4Zn) to characterize the zinc-binding sites in protein structures, by restricting the distances between zinc and its coordinating atoms. The comparison between GRE4Zn and analogous tools revealed that it achieved a superior performance. A large-scale prediction for structurally characterized proteins was performed with this powerful predictor, and statistical analyses for the results indicated zinc-binding proteins have come to be significantly involved in more complicated biological processes in higher species than simpler species during the course of evolution. Further analyses suggested that zinc-binding proteins are preferentially implicated in a variety of diseases and highly enriched in known drug targets, and the prediction of zinc-binding sites can be helpful for the investigation of molecular mechanisms. In this regard, these prediction and analysis results should prove to be highly useful be helpful for further biomedical study and drug design. The online service of GRE4Zn is freely available at: http://biocomp.ustc.edu.cn/gre4zn/. This article is part of a Special Issue entitled: Computational Proteomics, Systems Biology & Clinical Implications. Guest Editor: Yudong Cai.

Collapse

Chen Z, Wang Y, Zhai YF, Song J, Zhang Z. ZincExplorer: an accurate hybrid method to improve the prediction of zinc-binding sites from protein sequences. MOLECULAR BIOSYSTEMS 2013;9:2213-22. [DOI: 10.1039/c3mb70100j] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/21/2022]

An integrative computational framework based on a two-step random forest algorithm improves prediction of zinc-binding sites in proteins. PLoS One 2012;7:e49716. [PMID: 23166753 PMCID: PMC3499040 DOI: 10.1371/journal.pone.0049716] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/22/2012] [Accepted: 10/12/2012] [Indexed: 11/30/2022] Open

Abstract

Zinc-binding proteins are the most abundant metalloproteins in the Protein Data Bank where the zinc ions usually have catalytic, regulatory or structural roles critical for the function of the protein. Accurate prediction of zinc-binding sites is not only useful for the inference of protein function but also important for the prediction of 3D structure. Here, we present a new integrative framework that combines multiple sequence and structural properties and graph-theoretic network features, followed by an efficient feature selection to improve prediction of zinc-binding sites. We investigate what information can be retrieved from the sequence, structure and network levels that is relevant to zinc-binding site prediction. We perform a two-step feature selection using random forest to remove redundant features and quantify the relative importance of the retrieved features. Benchmarking on a high-quality structural dataset containing 1,103 protein chains and 484 zinc-binding residues, our method achieved >80% recall at a precision of 75% for the zinc-binding residues Cys, His, Glu and Asp on 5-fold cross-validation tests, which is a 10%-28% higher recall at the 75% equal precision compared to SitePredict and zincfinder at residue level using the same dataset. The independent test also indicates that our method has achieved recall of 0.790 and 0.759 at residue and protein levels, respectively, which is a performance better than the other two methods. Moreover, AUC (the Area Under the Curve) and AURPC (the Area Under the Recall-Precision Curve) by our method are also respectively better than those of the other two methods. Our method can not only be applied to large-scale identification of zinc-binding sites when structural information of the target is available, but also give valuable insights into important features arising from different levels that collectively characterize the zinc-binding sites. The scripts and datasets are available at http://protein.cau.edu.cn/zincidentifier/.

Collapse

Zhao K, Wang X, Wong HC, Wohlhueter R, Kirberger MP, Chen G, Yang JJ. Predicting Ca2+ -binding sites using refined carbon clusters. Proteins 2012;80:2666-79. [PMID: 22821762 DOI: 10.1002/prot.24149] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/08/2012] [Revised: 06/14/2012] [Accepted: 07/11/2012] [Indexed: 12/13/2022]

Passerini A, Lippi M, Frasconi P. Predicting metal-binding sites from protein sequence. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2012;9:203-213. [PMID: 21606549 DOI: 10.1109/tcbb.2011.94] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/30/2023]

Tang Y, Sheng X, Stumpf MPH. The roles of contact residue disorder and domain composition in characterizing protein-ligand binding specificity and promiscuity. MOLECULAR BIOSYSTEMS 2011;7:3280-6. [PMID: 22002096 DOI: 10.1039/c1mb05325f] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/21/2022]

Kochańczyk M. Prediction of functionally important residues in globular proteins from unusual central distances of amino acids. BMC STRUCTURAL BIOLOGY 2011;11:34. [PMID: 21923943 PMCID: PMC3188475 DOI: 10.1186/1472-6807-11-34] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 05/22/2011] [Accepted: 09/18/2011] [Indexed: 12/12/2022]

Abstract

BACKGROUND

Well-performing automated protein function recognition approaches usually comprise several complementary techniques. Beside constructing better consensus, their predictive power can be improved by either adding or refining independent modules that explore orthogonal features of proteins. In this work, we demonstrated how the exploration of global atomic distributions can be used to indicate functionally important residues.

RESULTS

Using a set of carefully selected globular proteins, we parametrized continuous probability density functions describing preferred central distances of individual protein atoms. Relative preferred burials were estimated using mixture models of radial density functions dependent on the amino acid composition of a protein under consideration. The unexpectedness of extraordinary locations of atoms was evaluated in the information-theoretic manner and used directly for the identification of key amino acids. In the validation study, we tested capabilities of a tool built upon our approach, called SurpResi, by searching for binding sites interacting with ligands. The tool indicated multiple candidate sites achieving success rates comparable to several geometric methods. We also showed that the unexpectedness is a property of regions involved in protein-protein interactions, and thus can be used for the ranking of protein docking predictions. The computational approach implemented in this work is freely available via a Web interface at http://www.bioinformatics.org/surpresi.

CONCLUSIONS

Probabilistic analysis of atomic central distances in globular proteins is capable of capturing distinct orientational preferences of amino acids as resulting from different sizes, charges and hydrophobic characters of their side chains. When idealized spatial preferences can be inferred from the sole amino acid composition of a protein, residues located in hydrophobically unfavorable environments can be easily detected. Such residues turn out to be often directly involved in binding ligands or interfacing with other proteins.

Collapse

Regad L, Martin J, Camproux AC. Dissecting protein loops with a statistical scalpel suggests a functional implication of some structural motifs. BMC Bioinformatics 2011;12:247. [PMID: 21689388 PMCID: PMC3158783 DOI: 10.1186/1471-2105-12-247] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/07/2010] [Accepted: 06/20/2011] [Indexed: 12/24/2022] Open

Regad L, Saladin A, Maupetit J, Geneix C, Camproux AC. SA-Mot: a web server for the identification of motifs of interest extracted from protein loops. Nucleic Acids Res 2011;39:W203-9. [PMID: 21665924 PMCID: PMC3125790 DOI: 10.1093/nar/gkr410] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/29/2023] Open

Liu R, Hu J. HemeBIND: a novel method for heme binding residue prediction by combining structural and sequence information. BMC Bioinformatics 2011;12:207. [PMID: 21612668 PMCID: PMC3124436 DOI: 10.1186/1471-2105-12-207] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/06/2010] [Accepted: 05/26/2011] [Indexed: 02/03/2023] Open

Abstract

BACKGROUND

Accurate prediction of binding residues involved in the interactions between proteins and small ligands is one of the major challenges in structural bioinformatics. Heme is an essential and commonly used ligand that plays critical roles in electron transfer, catalysis, signal transduction and gene expression. Although much effort has been devoted to the development of various generic algorithms for ligand binding site prediction over the last decade, no algorithm has been specifically designed to complement experimental techniques for identification of heme binding residues. Consequently, an urgent need is to develop a computational method for recognizing these important residues.

RESULTS

Here we introduced an efficient algorithm HemeBIND for predicting heme binding residues by integrating structural and sequence information. We systematically investigated the characteristics of binding interfaces based on a non-redundant dataset of heme-protein complexes. It was found that several sequence and structural attributes such as evolutionary conservation, solvent accessibility, depth and protrusion clearly illustrate the differences between heme binding and non-binding residues. These features can then be separately used or combined to build the structure-based classifiers using support vector machine (SVM). The results showed that the information contained in these features is largely complementary and their combination achieved the best performance. To further improve the performance, an attempt has been made to develop a post-processing procedure to reduce the number of false positives. In addition, we built a sequence-based classifier based on SVM and sequence profile as an alternative when only sequence information can be used. Finally, we employed a voting method to combine the outputs of structure-based and sequence-based classifiers, which demonstrated remarkably better performance than the individual classifier alone.

CONCLUSIONS

HemeBIND is the first specialized algorithm used to predict binding residues in protein structures for heme ligands. Extensive experiments indicated that both the structure-based and sequence-based methods have effectively identified heme binding residues while the complementary relationship between them can result in a significant improvement in prediction performance. The value of our method is highlighted through the development of HemeBIND web server that is freely accessible at http://mleg.cse.sc.edu/hemeBIND/.

Collapse

Zhao W, Xu M, Liang Z, Ding B, Niu L, Liu H, Teng M. Structure-based de novo prediction of zinc-binding sites in proteins of unknown function. ACTA ACUST UNITED AC 2011;27:1262-8. [PMID: 21414989 DOI: 10.1093/bioinformatics/btr133] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022]

Harding MM, Nowicki MW, Walkinshaw MD. Metals in protein structures: a review of their principal features. CRYSTALLOGR REV 2010. [DOI: 10.1080/0889311x.2010.485616] [Citation(s) in RCA: 56] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/19/2022]

Wang X, Zhao K, Kirberger M, Wong H, Chen G, Yang JJ. Analysis and prediction of calcium-binding pockets from apo-protein structures exhibiting calcium-induced localized conformational changes. Protein Sci 2010;19:1180-90. [PMID: 20512971 DOI: 10.1002/pro.394] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/31/2022]

López G, Ezkurdia I, Tress ML. Assessment of ligand binding residue predictions in CASP8. Proteins 2010;77 Suppl 9:138-46. [PMID: 19714771 DOI: 10.1002/prot.22557] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2022]

Capra JA, Laskowski RA, Thornton JM, Singh M, Funkhouser TA. Predicting protein ligand binding sites by combining evolutionary sequence conservation and 3D structure. PLoS Comput Biol 2009;5:e1000585. [PMID: 19997483 PMCID: PMC2777313 DOI: 10.1371/journal.pcbi.1000585] [Citation(s) in RCA: 285] [Impact Index Per Article: 19.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/11/2009] [Accepted: 10/30/2009] [Indexed: 11/20/2022] Open

Abstract

Identifying a protein's functional sites is an important step towards characterizing its molecular function. Numerous structure- and sequence-based methods have been developed for this problem. Here we introduce ConCavity, a small molecule binding site prediction algorithm that integrates evolutionary sequence conservation estimates with structure-based methods for identifying protein surface cavities. In large-scale testing on a diverse set of single- and multi-chain protein structures, we show that ConCavity substantially outperforms existing methods for identifying both 3D ligand binding pockets and individual ligand binding residues. As part of our testing, we perform one of the first direct comparisons of conservation-based and structure-based methods. We find that the two approaches provide largely complementary information, which can be combined to improve upon either approach alone. We also demonstrate that ConCavity has state-of-the-art performance in predicting catalytic sites and drug binding pockets. Overall, the algorithms and analysis presented here significantly improve our ability to identify ligand binding sites and further advance our understanding of the relationship between evolutionary sequence conservation and structural and functional attributes of proteins. Data, source code, and prediction visualizations are available on the ConCavity web site (http://compbio.cs.princeton.edu/concavity/).

Protein molecules are ubiquitous in the cell; they perform thousands of functions crucial for life. Proteins accomplish nearly all of these functions by interacting with other molecules. These interactions are mediated by specific amino acid positions in the proteins. Knowledge of these “functional sites” is crucial for understanding the molecular mechanisms by which proteins carry out their functions; however, functional sites have not been identified in the vast majority of proteins. Here, we present ConCavity, a computational method that predicts small molecule binding sites in proteins by combining analysis of evolutionary sequence conservation and protein 3D structure. ConCavity provides significant improvement over previous approaches, especially on large, multi-chain proteins. In contrast to earlier methods which only predict entire binding sites, ConCavity makes specific predictions of positions in space that are likely to overlap ligand atoms and of residues that are likely to contact bound ligands. These predictions can be used to aid computational function prediction, to guide experimental protein analysis, and to focus computationally intensive techniques used in drug discovery.

Collapse

Geurts P, Irrthum A, Wehenkel L. Supervised learning with decision tree-based methods in computational and systems biology. MOLECULAR BIOSYSTEMS 2009;5:1593-605. [PMID: 20023720 DOI: 10.1039/b907946g] [Citation(s) in RCA: 124] [Impact Index Per Article: 8.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/17/2022]

Bordner AJ. Predicting protein-protein binding sites in membrane proteins. BMC Bioinformatics 2009;10:312. [PMID: 19778442 PMCID: PMC2761413 DOI: 10.1186/1471-2105-10-312] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/12/2009] [Accepted: 09/24/2009] [Indexed: 01/15/2023] Open

Abstract

Background

Many integral membrane proteins, like their non-membrane counterparts, form either transient or permanent multi-subunit complexes in order to carry out their biochemical function. Computational methods that provide structural details of these interactions are needed since, despite their importance, relatively few structures of membrane protein complexes are available.

Results

We present a method for predicting which residues are in protein-protein binding sites within the transmembrane regions of membrane proteins. The method uses a Random Forest classifier trained on residue type distributions and evolutionary conservation for individual surface residues, followed by spatial averaging of the residue scores. The prediction accuracy achieved for membrane proteins is comparable to that for non-membrane proteins. Also, like previous results for non-membrane proteins, the accuracy is significantly higher for residues distant from the binding site boundary. Furthermore, a predictor trained on non-membrane proteins was found to yield poor accuracy on membrane proteins, as expected from the different distribution of surface residue types between the two classes of proteins. Thus, although the same procedure can be used to predict binding sites in membrane and non-membrane proteins, separate predictors trained on each class of proteins are required. Finally, the contribution of each residue property to the overall prediction accuracy is analyzed and prediction examples are discussed.

Conclusion

Given a membrane protein structure and a multiple alignment of related sequences, the presented method gives a prioritized list of which surface residues participate in intramembrane protein-protein interactions. The method has potential applications in guiding the experimental verification of membrane protein interactions, structure-based drug discovery, and also in constraining the search space for computational methods, such as protein docking or threading, that predict membrane protein complex structures.

Collapse

Masso M, Mathe E, Parvez N, Hijazi K, Vaisman II. Modeling the functional consequences of single residue replacements in bacteriophage f1 gene V protein. Protein Eng Des Sel 2009;22:665-71. [PMID: 19690089 DOI: 10.1093/protein/gzp050] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open