Reference Citation Analysis

For: Cheek S, Qi Y, Krishna SS, Kinch LN, Grishin NV. SCOPmap: automated assignment of protein structures to evolutionary superfamilies. BMC Bioinformatics 2004;5:197. [PMID: 15598351 PMCID: PMC544345 DOI: 10.1186/1471-2105-5-197] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.7] [Received: 08/23/2004] [Accepted: 12/14/2004] [Indexed: 11/24/2022] Open

Cited by Other Article(s)

Najibi SM, Maadooliat M, Zhou L, Huang JZ, Gao X. Protein Structure Classification and Loop Modeling Using Multiple Ramachandran Distributions. Comput Struct Biotechnol J 2017;15:243-254. [PMID: 28280526 PMCID: PMC5331158 DOI: 10.1016/j.csbj.2017.01.011] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 10/30/2016] [Revised: 01/26/2017] [Accepted: 01/28/2017] [Indexed: 11/19/2022] Open

Chandonia JM, Fox NK, Brenner SE. SCOPe: Manual Curation and Artifact Removal in the Structural Classification of Proteins - extended Database. J Mol Biol 2016;429:348-355. [PMID: 27914894 DOI: 10.1016/j.jmb.2016.11.023] [Citation(s) in RCA: 53] [Impact Index Per Article: 6.6] [Received: 07/30/2016] [Revised: 11/23/2016] [Accepted: 11/24/2016] [Indexed: 12/23/2022]

Xu J, Zhang J. Impact of structure space continuity on protein fold classification. Sci Rep 2016;6:23263. [PMID: 27006112 PMCID: PMC4804218 DOI: 10.1038/srep23263] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 11/24/2015] [Accepted: 03/03/2016] [Indexed: 11/09/2022] Open

Cheng H, Schaeffer RD, Liao Y, Kinch LN, Pei J, Shi S, Kim BH, Grishin NV. ECOD: an evolutionary classification of protein domains. PLoS Comput Biol 2014;10:e1003926. [PMID: 25474468 PMCID: PMC4256011 DOI: 10.1371/journal.pcbi.1003926] [Citation(s) in RCA: 225] [Impact Index Per Article: 22.5] [Received: 03/25/2014] [Accepted: 09/22/2014] [Indexed: 01/02/2023] Open

Abstract

Understanding the evolution of a protein, including both close and distant relationships, often reveals insight into its structure and function. Fast and easy access to such up-to-date information facilitates research. We have developed a hierarchical evolutionary classification of all proteins with experimentally determined spatial structures, and presented it as an interactive and updatable online database. ECOD (Evolutionary Classification of protein Domains) is distinct from other structural classifications in that it groups domains primarily by evolutionary relationships (homology), rather than topology (or "fold"). This distinction highlights cases of homology between domains of differing topology to aid in understanding of protein structure evolution. ECOD uniquely emphasizes distantly related homologs that are difficult to detect, and thus catalogs the largest number of evolutionary links among structural domain classifications. Placing distant homologs together underscores the ancestral similarities of these proteins and draws attention to the most important regions of sequence and structure, as well as conserved functional sites. ECOD also recognizes closer sequence-based relationships between protein domains. Currently, approximately 100,000 protein structures are classified in ECOD into 9,000 sequence families clustered into close to 2,000 evolutionary groups. The classification is assisted by an automated pipeline that quickly and consistently classifies weekly releases of PDB structures and allows for continual updates. This synchronization with PDB uniquely distinguishes ECOD among all protein classifications. Finally, we present several case studies of homologous proteins not recorded in other classifications, illustrating the potential of how ECOD can be used to further biological and evolutionary studies.

Fox NK, Brenner SE, Chandonia JM. SCOPe: Structural Classification of Proteins--extended, integrating SCOP and ASTRAL data and classification of new structures. Nucleic Acids Res 2013;42:D304-9. [PMID: 24304899 PMCID: PMC3965108 DOI: 10.1093/nar/gkt1240] [Citation(s) in RCA: 478] [Impact Index Per Article: 43.5] [Indexed: 11/25/2022] Open

Andreeva A, Howorth D, Chothia C, Kulesha E, Murzin AG. SCOP2 prototype: a new approach to protein structure mining. Nucleic Acids Res 2013;42:D310-4. [PMID: 24293656 PMCID: PMC3964979 DOI: 10.1093/nar/gkt1242] [Citation(s) in RCA: 198] [Impact Index Per Article: 18.0] [Indexed: 01/06/2023] Open

Daniels NM, Kumar A, Cowen LJ, Menke M. Touring protein space with Matt. IEEE/ACM Trans Comput Biol Bioinform 2012;9:286-93. [PMID: 21464511 PMCID: PMC3355523 DOI: 10.1109/tcbb.2011.70] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 05/30/2023]

Angadi UB, Venkatesulu M. Structural SCOP superfamily level classification using unsupervised machine learning. IEEE/ACM Trans Comput Biol Bioinform 2011;9:601-608. [PMID: 21844638 DOI: 10.1109/tcbb.2011.114] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 05/31/2023]

Hamp T, Birzele F, Buchwald F, Kramer S. Improving structure alignment-based prediction of SCOP families using Vorolign kernels. Bioinformatics 2010;27:204-10. [PMID: 21098432 DOI: 10.1093/bioinformatics/btq618] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/13/2022]

Angadi UB, Venkatesulu M. FuzzyART neural network for protein classification. J Bioinform Comput Biol 2010;8:825-41. [PMID: 20981890 DOI: 10.1142/s0219720010004951] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 01/11/2010] [Revised: 05/13/2010] [Accepted: 05/13/2010] [Indexed: 11/18/2022]

Jain P, Garibaldi JM, Hirst JD. Supervised machine learning algorithms for protein structure classification. Comput Biol Chem 2009;33:216-23. [PMID: 19473879 DOI: 10.1016/j.compbiolchem.2009.04.004] [Citation(s) in RCA: 50] [Impact Index Per Article: 3.3] [Received: 10/13/2008] [Revised: 03/25/2009] [Accepted: 04/23/2009] [Indexed: 10/20/2022]

Fast Structural Alignment of Biomolecules Using a Hash Table, N-Grams and String Descriptors. Algorithms 2009. [DOI: 10.3390/a2020692] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.4] [Indexed: 12/11/2022]

A feature vector integration approach for a generalized support vector machine pairwise homology algorithm. Comput Biol Chem 2008;32:458-61. [DOI: 10.1016/j.compbiolchem.2008.07.017] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Received: 04/01/2008] [Revised: 06/23/2008] [Accepted: 07/02/2008] [Indexed: 11/30/2022]

Zemla A, Geisbrecht B, Smith J, Lam M, Kirkpatrick B, Wagner M, Slezak T, Zhou CE. STRALCP--structure alignment-based clustering of proteins. Nucleic Acids Res 2007;35:e150. [PMID: 18039711 PMCID: PMC2190701 DOI: 10.1093/nar/gkm1049] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.3] [Indexed: 01/30/2023] Open

Qi Y, Sadreyev RI, Wang Y, Kim BH, Grishin NV. A comprehensive system for evaluation of remote sequence similarity detection. BMC Bioinformatics 2007;8:314. [PMID: 17725841 PMCID: PMC2031906 DOI: 10.1186/1471-2105-8-314] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.1] [Received: 04/04/2007] [Accepted: 08/28/2007] [Indexed: 11/25/2022] Open

Abstract

Background

Accurate and sensitive performance evaluation is crucial both for the effective development of better structure prediction methods based on sequence similarity and for the comparative analysis of existing methods. To date, there has been no satisfactory comprehensive evaluation method that (i) is based on a large and statistically unbiased set of proteins with clearly defined relationships; and (ii) covers all performance aspects of sequence-based structure predictors, such as sensitivity and specificity, alignment accuracy and coverage, and structure template quality.

Results

With the aim of designing such a method, we (i) select a statistically balanced set of divergent protein domains from SCOP, and define similarity relationships for the majority of these domains by complementing the best of information available in SCOP with a rigorous SVM-based algorithm; and (ii) develop protocols for the assessment of similarity detection and alignment quality from several complementary perspectives. The evaluation of similarity detection is based on ROC-like curves and includes several complementary approaches to the definition of true/false positives. Reference-dependent approaches use the 'gold standard' of pre-defined domain relationships and structure-based alignments. Reference-independent approaches assess the quality of structural match predicted by the sequence alignment, with respect to the whole domain length (global mode) or to the aligned region only (local mode). Similarly, the evaluation of alignment quality includes several reference-dependent and -independent measures, in global and local modes. As an illustration, we use our benchmark to compare the performance of several methods for the detection of remote sequence similarities, and show that different aspects of evaluation reveal different properties of the evaluated methods, highlighting their advantages, weaknesses, and potential for further development.

Conclusion

The presented benchmark provides a new tool for a statistically unbiased assessment of methods for remote sequence similarity detection, from various complementary perspectives. This tool should be useful both for users choosing the best method for a given purpose, and for developers designing new, more powerful methods. The benchmark set, reference alignments, and evaluation codes can be downloaded from .

Tung CH, Yang JM. fastSCOP: a fast web server for recognizing protein structural domains and SCOP superfamilies. Nucleic Acids Res 2007;35:W438-43. [PMID: 17485476 PMCID: PMC1933144 DOI: 10.1093/nar/gkm288] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.1] [Indexed: 01/01/2023] Open

Gewehr JE, Hintermair V, Zimmer R. AutoSCOP: automated prediction of SCOP classifications using unique pattern-class mappings. Bioinformatics 2007;23:1203-10. [PMID: 17379694 DOI: 10.1093/bioinformatics/btm089] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.1] [Indexed: 11/13/2022] Open

Abstract

MOTIVATION

The sequence patterns contained in the available motif and hidden Markov model (HMM) databases are a valuable source of information for protein sequence annotation. For structure prediction and fold recognition purposes, we computed mappings from such pattern databases to the protein domain hierarchy given by the ASTRAL compendium and applied them to the prediction of SCOP classifications. Our aim is to make highly confident predictions also for non-trivial cases if possible and abstain from a prediction otherwise, and thus to provide a method that can be used as a first step in a pipeline of prediction methods. We describe two successful examples for such pipelines. With the AutoSCOP approach, it is possible to make predictions in a large-scale manner for many domains of the available sequences in the well-known protein sequence databases.

RESULTS

AutoSCOP computes unique sequence patterns and pattern combinations for SCOP classifications. For instance, we assign a SCOP superfamily to a pattern found in its members whenever the pattern does not occur in any other SCOP superfamily. Especially on the fold and superfamily level, our method achieves both high sensitivity (above 93%) and high specificity (above 98%) on the difference set between two ASTRAL versions, due to being able to abstain from unreliable predictions. Further, on a harder test set filtered at low sequence identity, the combination with profile-profile alignments improves accuracy and performs comparably even to structure alignment methods. Integrating our method with structure alignment, we are able to achieve an accuracy of 99% on SCOP fold classifications on this set. In an analysis of false assignments of domains from new folds/superfamilies/families to existing SCOP classifications, AutoSCOP correctly abstains for more than 70% of the domains belonging to new folds and superfamilies, and more than 80% of the domains belonging to new families. These findings show that our approach is a useful additional filter for SCOP classification prediction of protein domains in combination with well-known methods such as profile-profile alignment.

AVAILABILITY

A web server where users can input their domain sequences is available at http://www.bio.ifi.lmu.de/autoscop.
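The unique pattern-class mapping described in the AutoSCOP abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy pattern identifiers (P1 to P4), and the superfamily labels (sf_A, sf_B) are all hypothetical, and the rule of abstaining when unique patterns disagree is an assumption made for the example.

```python
from collections import defaultdict


def build_unique_pattern_map(members):
    """Map each pattern to a superfamily only if it occurs in exactly one.

    members: dict of superfamily label -> list of pattern sets,
             one set per member domain (toy, hypothetical data layout).
    """
    pattern_to_superfamilies = defaultdict(set)
    for superfamily, domains in members.items():
        for patterns in domains:
            for pattern in patterns:
                pattern_to_superfamilies[pattern].add(superfamily)
    # Keep only patterns seen in a single superfamily.
    return {
        pattern: next(iter(sfs))
        for pattern, sfs in pattern_to_superfamilies.items()
        if len(sfs) == 1
    }


def predict(patterns, unique_map):
    """Return a superfamily label, or None to abstain.

    Abstains when no unique pattern matches or when the matching
    unique patterns point to different superfamilies (an assumption).
    """
    hits = {unique_map[p] for p in patterns if p in unique_map}
    return hits.pop() if len(hits) == 1 else None


# Toy training data: P3 occurs in both superfamilies, so it is not unique.
training = {
    "sf_A": [{"P1", "P2"}, {"P1", "P3"}],
    "sf_B": [{"P3", "P4"}],
}
unique_map = build_unique_pattern_map(training)

print(predict({"P1", "P3"}, unique_map))  # sf_A: P1 is unique to sf_A
print(predict({"P3"}, unique_map))        # None: only a shared pattern
print(predict({"P1", "P4"}, unique_map))  # None: unique patterns conflict
```

Returning None rather than a best guess mirrors the abstract's point that abstaining from unreliable predictions lets the method serve as a first step in a pipeline, with later methods handling the undecided cases.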

Kim YJ, Patel JM. A framework for protein structure classification and identification of novel protein structures. BMC Bioinformatics 2006;7:456. [PMID: 17042958 PMCID: PMC1622760 DOI: 10.1186/1471-2105-7-456] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Received: 05/17/2006] [Accepted: 10/16/2006] [Indexed: 11/10/2022] Open

Chi PH, Shyu CR, Xu D. A fast SCOP fold classification system using content-based E-Predict algorithm. BMC Bioinformatics 2006;7:362. [PMID: 16872501 PMCID: PMC1579235 DOI: 10.1186/1471-2105-7-362] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Received: 12/29/2005] [Accepted: 07/26/2006] [Indexed: 11/10/2022] Open

Daras P, Zarpalas D, Axenopoulos A, Tzovaras D, Strintzis MG. Three-dimensional shape-structure comparison method for protein classification. IEEE/ACM Trans Comput Biol Bioinform 2006;3:193-207. [PMID: 17048458 DOI: 10.1109/tcbb.2006.43] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Indexed: 05/12/2023]

Shin DH, Lou Y, Jancarik J, Yokota H, Kim R, Kim SH. Crystal structure of TM1457 from Thermotoga maritima. J Struct Biol 2005;152:113-7. [PMID: 16242963 DOI: 10.1016/j.jsb.2005.08.008] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.6] [Received: 04/12/2005] [Revised: 08/19/2005] [Accepted: 08/23/2005] [Indexed: 11/24/2022]

Kinch LN, Cheek S, Grishin NV. EDD, a novel phosphotransferase domain common to mannose transporter EIIA, dihydroxyacetone kinase, and DegV. Protein Sci 2005;14:360-7. [PMID: 15632288 PMCID: PMC2253402 DOI: 10.1110/ps.041114805] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.6] [Indexed: 10/26/2022]