Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Eyrich VA, Przybylski D, Koh IYY, Grana O, Pazos F, Valencia A, Rost B. CAFASP3 in the spotlight of EVA. Proteins 2004;53 Suppl 6:548-60. [PMID: 14579345 DOI: 10.1002/prot.10534] [Citation(s) in RCA: 18] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/11/2022]

For:	Eyrich VA, Przybylski D, Koh IYY, Grana O, Pazos F, Valencia A, Rost B. CAFASP3 in the spotlight of EVA. Proteins 2004;53 Suppl 6:548-60. [PMID: 14579345 DOI: 10.1002/prot.10534] [Citation(s) in RCA: 18] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/11/2022]

Number

Cited by Other Article(s)

Li W, Kinch LN, Karplus PA, Grishin NV. ChSeq: A database of chameleon sequences. Protein Sci 2015;24:1075-86. [PMID: 25970262 DOI: 10.1002/pro.2689] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/20/2015] [Revised: 04/15/2015] [Accepted: 04/24/2015] [Indexed: 11/11/2022]

Goldberg T, Hamp T, Rost B. LocTree2 predicts localization for all domains of life. ACTA ACUST UNITED AC 2013;28:i458-i465. [PMID: 22962467 PMCID: PMC3436817 DOI: 10.1093/bioinformatics/bts390] [Citation(s) in RCA: 78] [Impact Index Per Article: 7.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022]

Ding W, Xie J, Dai D, Zhang H, Xie H, Zhang W. CNNcon: improved protein contact maps prediction using cascaded neural networks. PLoS One 2013;8:e61533. [PMID: 23626696 PMCID: PMC3634008 DOI: 10.1371/journal.pone.0061533] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/17/2012] [Accepted: 03/11/2013] [Indexed: 11/18/2022] Open

Abstract

BACKGROUNDS

Despite continuing progress in X-ray crystallography and high-field NMR spectroscopy for determination of three-dimensional protein structures, the number of unsolved and newly discovered sequences grows much faster than that of determined structures. Protein modeling methods can possibly bridge this huge sequence-structure gap with the development of computational science. A grand challenging problem is to predict three-dimensional protein structure from its primary structure (residues sequence) alone. However, predicting residue contact maps is a crucial and promising intermediate step towards final three-dimensional structure prediction. Better predictions of local and non-local contacts between residues can transform protein sequence alignment to structure alignment, which can finally improve template based three-dimensional protein structure predictors greatly.

METHODS

CNNcon, an improved multiple neural networks based contact map predictor using six sub-networks and one final cascade-network, was developed in this paper. Both the sub-networks and the final cascade-network were trained and tested with their corresponding data sets. While for testing, the target protein was first coded and then input to its corresponding sub-networks for prediction. After that, the intermediate results were input to the cascade-network to finish the final prediction.

RESULTS

The CNNcon can accurately predict 58.86% in average of contacts at a distance cutoff of 8 Å for proteins with lengths ranging from 51 to 450. The comparison results show that the present method performs better than the compared state-of-the-art predictors. Particularly, the prediction accuracy keeps steady with the increase of protein sequence length. It indicates that the CNNcon overcomes the thin density problem, with which other current predictors have trouble. This advantage makes the method valuable to the prediction of long length proteins. As a result, the effective prediction of long length proteins could be possible by the CNNcon.

Collapse

Hamp T, Kassner R, Seemayer S, Vicedo E, Schaefer C, Achten D, Auer F, Boehm A, Braun T, Hecht M, Heron M, Hönigschmid P, Hopf TA, Kaufmann S, Kiening M, Krompass D, Landerer C, Mahlich Y, Roos M, Rost B. Homology-based inference sets the bar high for protein function prediction. BMC Bioinformatics 2013;14 Suppl 3:S7. [PMID: 23514582 PMCID: PMC3584931 DOI: 10.1186/1471-2105-14-s3-s7] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022] Open

Yan J, Marcus M, Kurgan L. Comprehensively designed consensus of standalone secondary structure predictors improves Q3 by over 3%. J Biomol Struct Dyn 2013;32:36-51. [PMID: 23298369 DOI: 10.1080/07391102.2012.746945] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/20/2022]

PSS-3D1D: an improved 3D1D profile method of protein fold recognition for the annotation of twilight zone sequences. ACTA ACUST UNITED AC 2011;12:181-9. [DOI: 10.1007/s10969-011-9119-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/15/2011] [Accepted: 11/24/2011] [Indexed: 10/14/2022]

Wrzeszczynski KO, Rost B. Cell cycle kinases predicted from conserved biophysical properties. Proteins 2009;74:655-68. [PMID: 18704950 DOI: 10.1002/prot.22181] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2022]

Swanson R, Kagiampakis I, Tsai JW. An information measure of the quality of protein secondary structure prediction. J Comput Biol 2008;15:65-79. [PMID: 18199024 DOI: 10.1089/cmb.2007.0199] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open

Abstract

We describe an information-theory-based measure of the quality of secondary structure prediction (RELINFO). RELINFO has a simple yet intuitive interpretation: it represents the factor by which secondary structure choice at a residue has been restricted by a prediction scheme. As an alternative interpretation of secondary structure prediction, RELINFO complements currently used methods by providing an information-based view as to why a prediction succeeds and fails. To demonstrate this score's capabilities, we applied RELINFO to an analysis of a large set of secondary structure predictions obtained from the first five rounds of the Critical Assessment of Structure Prediction (CASP) experiment. RELINFO is compared with two other common measures: percent correct (Q3) and secondary structure overlap (SOV). While the correlation between Q3 and RELINFO is approximately 0.85, RELINFO avoids certain disadvantages of Q3, including overestimating the quality of a prediction. The correlation between SOV and RELINFO is approximately 0.75. The valuable SOV measure unfortunately suffers from a saturation problem, and perhaps has unfairly given the general impression that secondary structure prediction has reached its limit since SOV hasn't improved much over the recent rounds of CASP. Although not a replacement for SOV, RELINFO has greater dispersion. Over the five rounds of CASP assessed here, RELINFO shows that predictions targets have been more difficult in successive CASP experiments, yet the predictions quality has continued to improve measurably over each round. In terms of information, the secondary structure prediction quality has almost doubled from CASP1 to CASP5. Therefore, as a different perspective of accuracy, RELINFO can help to improve prediction of protein secondary structure by providing a measure of difficulty as well as final quality of a prediction.

Collapse

The pros and cons of predicting protein contact maps. Methods Mol Biol 2008;413:199-217. [PMID: 18075167 DOI: 10.1007/978-1-59745-574-9_8] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/23/2022]

Homaeian L, Kurgan LA, Ruan J, Cios KJ, Chen K. Prediction of protein secondary structure content for the twilight zone sequences. Proteins 2007;69:486-98. [PMID: 17623861 DOI: 10.1002/prot.21527] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]

Abstract

Secondary protein structure carries information about local structural arrangements, which include three major conformations: alpha-helices, beta-strands, and coils. Significant majority of successful methods for prediction of the secondary structure is based on multiple sequence alignment. However, multiple alignment fails to provide accurate results when a sequence comes from the twilight zone, that is, it is characterized by low (<30%) homology. To this end, we propose a novel method for prediction of secondary structure content through comprehensive sequence representation, called PSSC-core. The method uses a multiple linear regression model and introduces a comprehensive feature-based sequence representation to predict amount of helices and strands for sequences from the twilight zone. The PSSC-core method was tested and compared with two other state-of-the-art prediction methods on a set of 2187 twilight zone sequences. The results indicate that our method provides better predictions for both helix and strand content. The PSSC-core is shown to provide statistically significantly better results when compared with the competing methods, reducing the prediction error by 5-7% for helix and 7-9% for strand content predictions. The proposed feature-based sequence representation uses a comprehensive set of physicochemical properties that are custom-designed for each of the helix and strand content predictions. It includes composition and composition moment vectors, frequency of tetra-peptides associated with helical and strand conformations, various property-based groups like exchange groups, chemical groups of the side chains and hydrophobic group, auto-correlations based on hydrophobicity, side-chain masses, hydropathy, and conformational patterns for beta-sheets. The PSSC-core method provides an alternative for predicting the secondary structure content that can be used to validate and constrain results of other structure prediction methods. At the same time, it also provides useful insight into design of successful protein sequence representations that can be used in developing new methods related to prediction of different aspects of the secondary protein structure.

Collapse

Graña O, Baker D, MacCallum RM, Meiler J, Punta M, Rost B, Tress ML, Valencia A. CASP6 assessment of contact prediction. Proteins 2006;61 Suppl 7:214-224. [PMID: 16187364 DOI: 10.1002/prot.20739] [Citation(s) in RCA: 73] [Impact Index Per Article: 4.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/11/2022]

Ferré S, King RD. Finding Motifs in Protein Secondary Structure for Use in Function Prediction. J Comput Biol 2006;13:719-31. [PMID: 16706721 DOI: 10.1089/cmb.2006.13.719] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open

Bodén M, Yuan Z, Bailey TL. Prediction of protein continuum secondary structure with probabilistic models based on NMR solved structures. BMC Bioinformatics 2006;7:68. [PMID: 16478545 PMCID: PMC1386714 DOI: 10.1186/1471-2105-7-68] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/22/2005] [Accepted: 02/14/2006] [Indexed: 11/10/2022] Open

Graña O, Eyrich VA, Pazos F, Rost B, Valencia A. EVAcon: a protein contact prediction evaluation service. Nucleic Acids Res 2005;33:W347-51. [PMID: 15980486 PMCID: PMC1160172 DOI: 10.1093/nar/gki411] [Citation(s) in RCA: 18] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/27/2022] Open

Przybylski D, Rost B. Improving Fold Recognition Without Folds. J Mol Biol 2004;341:255-69. [PMID: 15312777 DOI: 10.1016/j.jmb.2004.05.041] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/03/2004] [Revised: 05/18/2004] [Accepted: 05/18/2004] [Indexed: 11/21/2022]

Lett D, Hsing M, Pio F. Interaction profile-based protein classification of death domain. BMC Bioinformatics 2004;5:75. [PMID: 15189571 PMCID: PMC459208 DOI: 10.1186/1471-2105-5-75] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/07/2004] [Accepted: 06/09/2004] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

The increasing number of protein sequences and 3D structure obtained from genomic initiatives is leading many of us to focus on proteomics, and to dedicate our experimental and computational efforts on the creation and analysis of information derived from 3D structure. In particular, the high-throughput generation of protein-protein interaction data from a few organisms makes such an approach very important towards understanding the molecular recognition that make-up the entire protein-protein interaction network. Since the generation of sequences, and experimental protein-protein interactions increases faster than the 3D structure determination of protein complexes, there is tremendous interest in developing in silico methods that generate such structure for prediction and classification purposes. In this study we focused on classifying protein family members based on their protein-protein interaction distinctiveness. Structure-based classification of protein-protein interfaces has been described initially by Ponstingl et al. 1 and more recently by Valdar et al. 2 and Mintseris et al. 3, from complex structures that have been solved experimentally. However, little has been done on protein classification based on the prediction of protein-protein complexes obtained from homology modeling and docking simulation.

RESULTS

We have developed an in silico classification system entitled HODOCO (Homology modeling, Docking and Classification Oracle), in which protein Residue Potential Interaction Profiles (RPIPS) are used to summarize protein-protein interaction characteristics. This system applied to a dataset of 64 proteins of the death domain superfamily was used to classify each member into its proper subfamily. Two classification methods were attempted, heuristic and support vector machine learning. Both methods were tested with a 5-fold cross-validation. The heuristic approach yielded a 61% average accuracy, while the machine learning approach yielded an 89% average accuracy.

CONCLUSION

We have confirmed the reliability and potential value of classifying proteins via their predicted interactions. Our results are in the same range of accuracy as other studies that classify protein-protein interactions from 3D complex structure obtained experimentally. While our classification scheme does not take directly into account sequence information our results are in agreement with functional and sequence based classification of death domain family members.

Collapse