For: Kristensen DM, Chen BY, Fofanov VY, Ward RM, Lisewski AM, Kimmel M, Kavraki LE, Lichtarge O. Recurrent use of evolutionary importance for functional annotation of proteins based on local structural similarity. Protein Sci 2006;15:1530-6. [PMID: 16672239 PMCID: PMC2242527 DOI: 10.1110/ps.062152706] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.1] [Indexed: 10/24/2022]

Cited by Other Article(s)

Roche DB, Brüls T. The enzymatic nature of an anonymous protein sequence cannot reliably be inferred from superfamily level structural information alone. Protein Sci 2015;24:643-50. [PMID: 25559918 DOI: 10.1002/pro.2635] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 08/18/2014] [Revised: 12/13/2014] [Accepted: 12/29/2014] [Indexed: 11/12/2022]

Prediction and experimental validation of enzyme substrate specificity in protein structures. Proc Natl Acad Sci U S A 2013;110:E4195-202. [PMID: 24145433 DOI: 10.1073/pnas.1305162110] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.9] [Indexed: 11/18/2022]

Erdin S, Venner E, Lisewski AM, Lichtarge O. Function prediction from networks of local evolutionary similarity in protein structure. BMC Bioinformatics 2013;14 Suppl 3:S6. [PMID: 23514548 PMCID: PMC3584919 DOI: 10.1186/1471-2105-14-s3-s6] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 11/15/2022]

Ellingson L, Zhang J. Protein surface matching by combining local and global geometric information. PLoS One 2012;7:e40540. [PMID: 22815760 PMCID: PMC3398928 DOI: 10.1371/journal.pone.0040540] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Received: 11/30/2011] [Accepted: 06/12/2012] [Indexed: 01/01/2023]

Evolutionary trace for prediction and redesign of protein functional sites. Methods Mol Biol 2012;819:29-42. [PMID: 22183528 DOI: 10.1007/978-1-61779-465-0_3] [Citation(s) in RCA: 54] [Impact Index Per Article: 4.5] [Indexed: 12/12/2022]

Erdin S, Lisewski AM, Lichtarge O. Protein function prediction: towards integration of similarity metrics. Curr Opin Struct Biol 2011;21:180-8. [PMID: 21353529 DOI: 10.1016/j.sbi.2011.02.001] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.7] [Received: 02/03/2011] [Accepted: 02/03/2011] [Indexed: 11/16/2022]

Venner E, Lisewski AM, Erdin S, Ward RM, Amin SR, Lichtarge O. Accurate protein structure annotation through competitive diffusion of enzymatic functions over a network of local evolutionary similarities. PLoS One 2010;5:e14286. [PMID: 21179190 PMCID: PMC3001439 DOI: 10.1371/journal.pone.0014286] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Received: 06/29/2010] [Accepted: 11/10/2010] [Indexed: 12/24/2022]

Wu CY, Chen YC, Lim C. A structural-alphabet-based strategy for finding structural motifs across protein families. Nucleic Acids Res 2010;38:e150. [PMID: 20525797 PMCID: PMC2919736 DOI: 10.1093/nar/gkq478] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.6] [Indexed: 01/08/2023]

Evolution: a guide to perturb protein function and networks. Curr Opin Struct Biol 2010;20:351-9. [PMID: 20444593 DOI: 10.1016/j.sbi.2010.04.002] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.8] [Received: 04/06/2010] [Accepted: 04/08/2010] [Indexed: 12/11/2022]

Erdin S, Ward RM, Venner E, Lichtarge O. Evolutionary trace annotation of protein function in the structural proteome. J Mol Biol 2009;396:1451-73. [PMID: 20036248 DOI: 10.1016/j.jmb.2009.12.037] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.1] [Received: 05/06/2009] [Revised: 12/05/2009] [Accepted: 12/18/2009] [Indexed: 11/16/2022]

Abstract

By design, structural genomics (SG) solves many structures that cannot be assigned function based on homology to known proteins. Alternative function annotation methods are therefore needed and this study focuses on function prediction with three-dimensional (3D) templates: small structural motifs built of just a few functionally critical residues. Although experimentally proven functional residues are scarce, we show here that Evolutionary Trace (ET) rankings of residue importance are sufficient to build 3D templates, match them, and then assign Gene Ontology (GO) functions in enzymes and non-enzymes alike. In a high-specificity mode, this Evolutionary Trace Annotation (ETA) method covered half (53%) of the 2384 annotated SG protein controls. Three-quarters (76%) of predictions were both correct and complete. The positive predictive value for all GO depths (all-depth PPV) was 84%, and it rose to 94% over GO depths 1-3 (depth 3 PPV). In a high-sensitivity mode, coverage rose significantly (to 84%), while accuracy fell moderately: 68% of predictions were both correct and complete, all-depth PPV was 75%, and depth 3 PPV was 86%. These data concur with prior mutational experiments showing that ET rank information identifies key functional determinants in proteins. In practice, ETA predicted functions in 42% of 3461 unannotated SG proteins. In 529 cases (including 280 non-enzymes and 21 for metal ion ligands) the expected accuracy is 84% at any GO depth and 94% down to GO depth 3, while for the remaining 931 the expected accuracies are 60% and 71%, respectively. Thus, local structural comparisons of evolutionarily important residues can help decipher protein functions to known reliability levels and without prior assumption on functional mechanisms. ETA is available at http://mammoth.bcm.tmc.edu/eta.
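The coverage and PPV figures quoted in this abstract can be illustrated with a minimal sketch. It assumes one common reading of the two metrics: coverage is the fraction of proteins that receive any prediction, and PPV is the fraction of predicted GO terms that are correct. The function names and the toy data are hypothetical and are not taken from the paper, which may define the metrics with additional detail (for example, per GO depth).

```python
def coverage(predictions):
    """Fraction of proteins for which at least one term was predicted."""
    made = sum(1 for terms in predictions.values() if terms)
    return made / len(predictions)


def positive_predictive_value(predictions, truth):
    """Correct predicted terms divided by all predicted terms."""
    true_pos = 0
    false_pos = 0
    for protein, terms in predictions.items():
        known = truth.get(protein, set())
        for term in terms:
            if term in known:
                true_pos += 1
            else:
                false_pos += 1
    total = true_pos + false_pos
    return true_pos / total if total else float("nan")


# Toy data (hypothetical, for illustration only).
predictions = {
    "p1": {"GO:0003824", "GO:0016787"},
    "p2": set(),                 # no prediction made
    "p3": {"GO:0005515"},        # wrong prediction
    "p4": {"GO:0046872"},
}
truth = {
    "p1": {"GO:0003824", "GO:0016787"},
    "p2": {"GO:0005515"},
    "p3": {"GO:0003677"},
    "p4": {"GO:0046872"},
}

print(coverage(predictions))                          # 0.75
print(positive_predictive_value(predictions, truth))  # 0.75
```

Raising the match threshold typically trades coverage for PPV, which is the contrast the abstract draws between its high-specificity and high-sensitivity modes.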

Bray T, Chan P, Bougouffa S, Greaves R, Doig AJ, Warwicker J. SitesIdentify: a protein functional site prediction tool. BMC Bioinformatics 2009;10:379. [PMID: 19922660 PMCID: PMC2783165 DOI: 10.1186/1471-2105-10-379] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.2] [Received: 06/05/2009] [Accepted: 11/18/2009] [Indexed: 01/31/2023]

Ward RM, Venner E, Daines B, Murray S, Erdin S, Kristensen DM, Lichtarge O. Evolutionary Trace Annotation Server: automated enzyme function prediction in protein structures using 3D templates. Bioinformatics 2009;25:1426-7. [PMID: 19307237 PMCID: PMC2682511 DOI: 10.1093/bioinformatics/btp160] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.7] [Indexed: 11/13/2022]

Ward RM, Erdin S, Tran TA, Kristensen DM, Lisewski AM, Lichtarge O. De-orphaning the structural proteome through reciprocal comparison of evolutionarily important structural features. PLoS One 2008;3:e2136. [PMID: 18461181 PMCID: PMC2362850 DOI: 10.1371/journal.pone.0002136] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.3] [Received: 03/13/2008] [Accepted: 03/25/2008] [Indexed: 12/01/2022]

Kristensen DM, Ward RM, Lisewski AM, Erdin S, Chen BY, Fofanov VY, Kimmel M, Kavraki LE, Lichtarge O. Prediction of enzyme function based on 3D templates of evolutionarily important amino acids. BMC Bioinformatics 2008;9:17. [PMID: 18190718 PMCID: PMC2219985 DOI: 10.1186/1471-2105-9-17] [Citation(s) in RCA: 62] [Impact Index Per Article: 3.9] [Received: 06/11/2007] [Accepted: 01/11/2008] [Indexed: 11/10/2022]

Abstract

BACKGROUND

Structural genomics projects such as the Protein Structure Initiative (PSI) yield many new structures, but often these have no known molecular functions. One approach to recover this information is to use 3D templates - structure-function motifs that consist of a few functionally critical amino acids and may suggest functional similarity when geometrically matched to other structures. Since experimentally determined functional sites are not common enough to define 3D templates on a large scale, this work tests a computational strategy to select relevant residues for 3D templates.

RESULTS

Based on evolutionary information and heuristics, an Evolutionary Trace Annotation (ETA) pipeline built templates for 98 enzymes, half taken from the PSI, and sought matches in a non-redundant structure database. On average each template matched 2.7 distinct proteins, of which 2.0 share the first three Enzyme Commission digits as the template's enzyme of origin. In many cases (61%) a single most likely function could be predicted as the annotation with the most matches, and in these cases such a plurality vote identified the correct function with 87% accuracy. ETA was also found to be complementary to sequence homology-based annotations. When matches are required to both geometrically match the 3D template and to be sequence homologs found by BLAST or PSI-BLAST, the annotation accuracy is greater than either method alone, especially in the region of lower sequence identity where homology-based annotations are least reliable.

CONCLUSION

These data suggest that knowledge of evolutionarily important residues improves functional annotation among distant enzyme homologs. Since, unlike other 3D template approaches, the ETA method bypasses the need for experimental knowledge of the catalytic mechanism, it should prove a useful, large scale, and general adjunct to combine with other methods to decipher protein function in the structural proteome.
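The plurality vote described in the RESULTS paragraph above (predicting the annotation carried by the most template matches) can be sketched as follows. This is an illustration under assumptions, not the ETA pipeline itself: the function name is hypothetical, annotations are represented as three-digit Enzyme Commission strings, and a tie for first place is treated as "no single most likely function", which is one plausible reading of the abstract.

```python
from collections import Counter


def plurality_vote(match_annotations):
    """Return the annotation with the most matches, or None when there
    are no matches or the top count is shared by several annotations."""
    if not match_annotations:
        return None
    ranked = Counter(match_annotations).most_common()
    # A tie for first place means no single most likely function.
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


# Hypothetical matches for one template, labeled by three EC digits.
print(plurality_vote(["3.4.21", "3.4.21", "2.7.1"]))  # 3.4.21
print(plurality_vote(["3.4.21", "2.7.1"]))            # None
```

Returning None on ties keeps ambiguous cases out of the prediction set, which matches the abstract's statement that a single most likely function could be predicted in only some cases (61%).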

Shin H, Lisewski AM, Lichtarge O. Graph sharpening plus graph integration: a synergy that improves protein functional classification. Bioinformatics 2007;23:3217-24. [PMID: 17977886 DOI: 10.1093/bioinformatics/btm511] [Citation(s) in RCA: 47] [Impact Index Per Article: 2.8] [Indexed: 11/14/2022]

Abstract

MOTIVATION

Predicting protein function is a central problem in bioinformatics, and many approaches use partially or fully automated methods based on various combinations of sequence, structure and other information on proteins or genes. Such information establishes relationships between proteins that can be modelled most naturally as edges in graphs. A priori, however, it is often unclear which edges from which graph may contribute most to accurate predictions. For that reason, one established strategy is to integrate all available sources, or graphs as in graph integration, in the hope that the positive signals will add to each other. However, in the problem of functional prediction, noise, i.e. the presence of inaccurate or false edges, can still be large enough that integration alone has little effect on prediction accuracy. In order to reduce noise levels and to improve integration efficiency, we present here a recent method in graph-based learning, graph sharpening, which provides a theoretically firm yet intuitive and practical approach for disconnecting undesirable edges from protein similarity graphs. This approach has several attractive features: it is quick, scalable in the number of proteins, robust with respect to errors and tolerant of very diverse types of protein similarity measures.

RESULTS

We tested the classification accuracy in a test set of 599 proteins with remote sequence homology spread over 20 Gene Ontology (GO) functional classes. When compared to integration alone, graph sharpening plus integration of four vastly different molecular similarity measures improved the overall classification by nearly 30% [0.17 average increase in the area under the ROC curve (AUC)]. Moreover, and partially through the increased sparsity of the graphs induced by sharpening, this gain in accuracy came at negligible computational cost: sharpening and integration took on average 4.66 (+/-4.44) CPU seconds.

AVAILABILITY

Software and Supplementary data will be available on http://mammoth.bcm.tmc.edu/

Chen BY, Fofanov VY, Bryant DH, Dodson BD, Kristensen DM, Lisewski AM, Kimmel M, Lichtarge O, Kavraki LE. The MASH pipeline for protein function prediction and an algorithm for the geometric refinement of 3D motifs. J Comput Biol 2007;14:791-816. [PMID: 17691895 DOI: 10.1089/cmb.2007.r017] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.2] [Indexed: 11/12/2022]

Abstract

The development of new and effective drugs is strongly affected by the need to identify drug targets and to reduce side effects. Resolving these issues depends partially on a thorough understanding of the biological function of proteins. Unfortunately, the experimental determination of protein function is expensive and time consuming. To support and accelerate the determination of protein functions, algorithms for function prediction are designed to gather evidence indicating functional similarity with well studied proteins. One such approach is the MASH pipeline, described in the first half of this paper. MASH identifies matches of geometric and chemical similarity between motifs, representing known functional sites, and substructures of functionally uncharacterized proteins (targets). Observations from several research groups concur that statistically significant matches can indicate functionally related active sites. One major subproblem is the design of effective motifs, which have many matches to functionally related targets (sensitive motifs), and few matches to functionally unrelated targets (specific motifs). Current techniques select and combine structural, physical, and evolutionary properties to generate motifs that mirror functional characteristics in active sites. This approach ignores incidental similarities that may occur with functionally unrelated proteins. To address this problem, we have developed Geometric Sieving (GS), a parallel distributed algorithm that efficiently refines motifs, designed by existing methods, into optimized motifs with maximal geometric and chemical dissimilarity from all known protein structures. In exhaustive comparison of all possible motifs based on the active sites of 10 well-studied proteins, we observed that optimized motifs were among the most sensitive and specific.

Chen BY, Bryant DH, Fofanov VY, Kristensen DM, Cruess AE, Kimmel M, Lichtarge O, Kavraki LE. Cavity scaling: automated refinement of cavity-aware motifs in protein function prediction. J Bioinform Comput Biol 2007;5:353-82. [PMID: 17589966 DOI: 10.1142/s021972000700276x] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Received: 10/02/2006] [Revised: 12/12/2006] [Indexed: 11/18/2022]

Discovering structural motifs using a structural alphabet: application to magnesium-binding sites. BMC Bioinformatics 2007;8:106. [PMID: 17389049 PMCID: PMC1851716 DOI: 10.1186/1471-2105-8-106] [Citation(s) in RCA: 52] [Impact Index Per Article: 3.1] [Received: 10/27/2006] [Accepted: 03/28/2007] [Indexed: 11/10/2022]

Abstract

BACKGROUND

For many metalloproteins, sequence motifs characteristic of metal-binding sites have not been found or are so short that they would not be expected to be metal-specific. Striking examples of such metalloproteins are those containing Mg2+, one of the most versatile metal cofactors in cellular biochemistry. Even when Mg2+-proteins share insufficient sequence homology to identify Mg2+-specific sequence motifs, they may still share similarity in the Mg2+-binding site structure. However, no structural motifs characteristic of Mg2+-binding sites have been reported. Thus, our aims are (i) to develop a general method for discovering structural patterns/motifs characteristic of ligand-binding sites, given the 3D protein structures, and (ii) to apply it to Mg2+-proteins sharing <30% sequence identity. Our motif discovery method employs structural alphabet encoding to convert 3D structures to the corresponding 1D structural letter sequences, where the Mg2+-structural motifs are identified as recurring structural patterns.

RESULTS

The structural alphabet-based motif discovery method has revealed the structural preference of Mg2+-binding sites for certain local/secondary structures: compared to all residues in the Mg2+-proteins, both first and second-shell Mg2+-ligands prefer loops to helices. Even when the Mg2+-proteins share no significant sequence homology, some of them share a similar Mg2+-binding site structure: 4 Mg2+-structural motifs, comprising 21% of the binding sites, were found. In particular, one of the Mg2+-structural motifs found maps to a specific functional group, namely, hydrolases. Furthermore, 2 of the motifs were not found in non-metalloproteins or in Ca2+-binding proteins. The structural motifs discovered thus capture some essential biochemical and/or evolutionary properties, and hence may be useful for discovering proteins where Mg2+ plays an important biological role.

CONCLUSION

The structural motif discovery method presented herein is general and can be applied to any set of proteins with known 3D structures. This new method is timely considering the increasing number of structures for proteins with unknown function that are being solved from structural genomics incentives. For such proteins, which share no significant sequence homology to proteins of known function, the presence of a structural motif that maps to a specific protein function in the structure would suggest likely active/binding sites and a particular biological function.
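The idea of finding motifs as recurring patterns in 1D structural letter sequences can be illustrated with a simple sketch. The abstract does not give the authors' discovery algorithm, so this example substitutes plain fixed-length substring (k-mer) counting: it reports every length-k pattern that occurs in at least a minimum number of different proteins. The function name, the parameters `k` and `min_proteins`, and the toy letter strings are all hypothetical.

```python
from collections import defaultdict


def recurring_patterns(encoded, k, min_proteins=2):
    """Map each length-k pattern to the sorted names of the proteins that
    contain it, keeping only patterns found in >= min_proteins proteins."""
    seen = defaultdict(set)
    for name, letters in encoded.items():
        for i in range(len(letters) - k + 1):
            seen[letters[i:i + k]].add(name)
    return {
        pattern: sorted(names)
        for pattern, names in seen.items()
        if len(names) >= min_proteins
    }


# Toy structural-letter strings (hypothetical, not a real alphabet).
encoded = {
    "protA": "ABCDE",
    "protB": "XBCDY",
    "protC": "QQQQQ",
}

print(recurring_patterns(encoded, k=3))  # {'BCD': ['protA', 'protB']}
```

Counting proteins rather than raw occurrences avoids reporting a pattern that merely repeats within one structure, as the string for `protC` shows.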

Lisewski AM, Lichtarge O. Rapid detection of similarity in protein structure and function through contact metric distances. Nucleic Acids Res 2006;34:e152. [PMID: 17130161 PMCID: PMC1702494 DOI: 10.1093/nar/gkl788] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.6] [Indexed: 11/14/2022]

Friedberg I, Jambon M, Godzik A. New avenues in protein function prediction. Protein Sci 2006;15:1527-9. [PMID: 16731984 PMCID: PMC2242544 DOI: 10.1110/ps.062158406] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.5] [Indexed: 10/24/2022]