Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For:	Zhang Y, Skolnick J. Scoring function for automated assessment of protein structure template quality. Proteins 2006;57:702-10. [PMID: 15476259 DOI: 10.1002/prot.20264] [Citation(s) in RCA: 1332] [Impact Index Per Article: 74.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]

Number

Cited by Other Article(s)

1251

Pandit SB, Skolnick J. Fr-TM-align: a new protein structural alignment method based on fragment alignments and the TM-score. BMC Bioinformatics 2008;9:531. [PMID: 19077267 PMCID: PMC2628391 DOI: 10.1186/1471-2105-9-531] [Citation(s) in RCA: 106] [Impact Index Per Article: 6.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/16/2008] [Accepted: 12/12/2008] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

Protein tertiary structure comparisons are employed in various fields of contemporary structural biology. Most structure comparison methods involve generation of an initial seed alignment, which is extended and/or refined to provide the best structural superposition between a pair of protein structures as assessed by a structure comparison metric. One such metric, the TM-score, was recently introduced to provide a combined structure quality measure of the coordinate root mean square deviation between a pair of structures and coverage. Using the TM-score, the TM-align structure alignment algorithm was developed that was often found to have better accuracy and coverage than the most commonly used structural alignment programs; however, there were a number of situations when this was not true.

RESULTS

To further improve structure alignment quality, the Fr-TM-align algorithm has been developed where aligned fragment pairs are used to generate the initial seed alignments that are then refined using dynamic programming to maximize the TM-score. For the assessment of the structural alignment quality from Fr-TM-align in comparison to other programs such as CE and TM-align, we examined various alignment quality assessment scores such as PSI and TM-score. The assessment showed that the structural alignment quality from Fr-TM-align is better in comparison to both CE and TM-align. On average, the structural alignments generated using Fr-TM-align have a higher TM-score (~9%) and coverage (~7%) in comparison to those generated by TM-align. Fr-TM-align uses an exhaustive procedure to generate initial seed alignments. Hence, the algorithm is computationally more expensive than TM-align.

CONCLUSION

Fr-TM-align, a new algorithm that employs fragment alignment and assembly provides better structural alignments in comparison to TM-align. The source code and executables of Fr-TM-align are freely downloadable at: http://cssb.biology.gatech.edu/skolnick/files/FrTMalign/.

Collapse

1252

Carrillo-Tripp M, Brooks CL, Reddy VS. A novel method to map and compare protein-protein interactions in spherical viral capsids. Proteins 2008;73:644-55. [PMID: 18491385 DOI: 10.1002/prot.22088] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2022]

Abstract

Viral capsids are composed of multiple copies of one or a few chemically distinct capsid proteins and are mostly stabilized by inter subunit protein-protein interactions. There have been efforts to identify and analyze these protein-protein interactions, in terms of their extent and similarity, between the subunit interfaces related by quasi- and icosahedral symmetry. Here, we describe a new method to map quaternary interactions in spherical virus capsids onto polar angle space with respect to the icosahedral symmetry axes using azimuthal orthographic diagrams. This approach enables one to map the nonredundant interactions in a spherical virus capsid, irrespective of its size or triangulation number (T), onto the reference icosahedral asymmetric unit space. The resultant diagrams represent characteristic fingerprints of quaternary interactions of the respective capsids. Hence, they can be used as road maps of the protein-protein interactions to visualize the distribution and the density of the interactions. In addition, unlike the previous studies, the fingerprints of different capsids, when represented in a matrix form, can be compared with one another to quantitatively evaluate the similarity (S-score) in the subunit environments and the associated protein-protein interactions. The S-score selectively distinguishes the similarity, or lack of it, in the locations of the quaternary interactions as opposed to other well-known structural similarity metrics (e.g., RMSD, TM-score). Application of this method on a subset of T = 1 and T = 3 capsids suggests that S-score values range between 1 and 0.6 for capsids that belong to the same virus family/genus; 0.6-0.3 for capsids from different families with the same T-number and similar subunit fold; and <0.3 for comparisons of the dissimilar capsids that display different quaternary architectures (T-numbers). Finally, the sequence conserved interface residues within a virus family, whose spatial locations were also conserved have been hypothesized as the essential residues for self-assembly of the member virus capsids.

Collapse

1253

Lu Y, Sze SH. Improving accuracy of multiple sequence alignment algorithms based on alignment of neighboring residues. Nucleic Acids Res 2008;37:463-72. [PMID: 19056820 PMCID: PMC2632924 DOI: 10.1093/nar/gkn945] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/04/2022] Open

1254

Lee J, Joo K, Kim SY, Lee J. Re-examination of structure optimization of off-lattice protein AB models by conformational space annealing. J Comput Chem 2008;29:2479-84. [PMID: 18470971 DOI: 10.1002/jcc.20995] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/05/2022]

1255

Nadler W, Meinke JH, Hansmann UHE. Folding proteins by first-passage-times-optimized replica exchange. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2008;78:061905. [PMID: 19256866 DOI: 10.1103/physreve.78.061905] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/06/2008] [Indexed: 05/27/2023]

1256

Nicosia G, Stracquadanio G. Generalized pattern search algorithm for Peptide structure prediction. Biophys J 2008;95:4988-99. [PMID: 18487293 PMCID: PMC2576383 DOI: 10.1529/biophysj.107.124016] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/18/2007] [Accepted: 03/20/2008] [Indexed: 11/18/2022] Open

1257

Sacan A, Toroslu IH, Ferhatosmanoglu H. Integrated search and alignment of protein structures. Bioinformatics 2008;24:2872-9. [PMID: 18945684 DOI: 10.1093/bioinformatics/btn545] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open

1258

Wu S, Zhang Y. ANGLOR: a composite machine-learning algorithm for protein backbone torsion angle prediction. PLoS One 2008;3:e3400. [PMID: 18923703 PMCID: PMC2559866 DOI: 10.1371/journal.pone.0003400] [Citation(s) in RCA: 59] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/01/2008] [Accepted: 09/18/2008] [Indexed: 11/20/2022] Open

1259

Eramian D, Eswar N, Shen MY, Sali A. How well can the accuracy of comparative protein structure models be predicted? Protein Sci 2008;17:1881-93. [PMID: 18832340 DOI: 10.1110/ps.036061.108] [Citation(s) in RCA: 114] [Impact Index Per Article: 7.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/21/2022]

1260

Wu S, Zhang Y. MUSTER: Improving protein sequence profile-profile alignments by using multiple sources of structure information. Proteins 2008;72:547-56. [PMID: 18247410 DOI: 10.1002/prot.21945] [Citation(s) in RCA: 276] [Impact Index Per Article: 17.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]

1261

Vallat BK, Pillardy J, Elber R. A template-finding algorithm and a comprehensive benchmark for homology modeling of proteins. Proteins 2008;72:910-28. [PMID: 18300226 PMCID: PMC2907141 DOI: 10.1002/prot.21976] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2022]

1262

Zhou H, Skolnick J. Protein model quality assessment prediction by combining fragment comparisons and a consensus C(alpha) contact potential. Proteins 2008;71:1211-8. [PMID: 18004783 DOI: 10.1002/prot.21813] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022]

1263

Bernsel A, Viklund H, Elofsson A. Remote homology detection of integral membrane proteins using conserved sequence features. Proteins 2008;71:1387-99. [PMID: 18076048 DOI: 10.1002/prot.21825] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/05/2022]

1264

Latek D, Kolinski A. Contact prediction in protein modeling: scoring, folding and refinement of coarse-grained models. BMC STRUCTURAL BIOLOGY 2008;8:36. [PMID: 18694501 PMCID: PMC2527566 DOI: 10.1186/1472-6807-8-36] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 02/16/2008] [Accepted: 08/11/2008] [Indexed: 11/10/2022]

Abstract

BACKGROUND

Several different methods for contact prediction succeeded within the Sixth Critical Assessment of Techniques for Protein Structure Prediction (CASP6). The most relevant were non-local contact predictions for targets from the most difficult categories: fold recognition-analogy and new fold. Such contacts could provide valuable structural information in case a template structure cannot be found in the PDB.

RESULTS

We described comprehensive tests of the effectiveness of contact data in various aspects of de novo modeling with CABS, an algorithm which was used successfully in CASP6 by the Kolinski-Bujnicki group. We used the predicted contacts in a simple scoring function for the post-simulation ranking of protein models and as a soft bias in the folding simulations and in the fold-refinement procedure. The latter approach turned out to be the most successful. The CABS force field used in the Replica Exchange Monte Carlo simulations cooperated with the true contacts and discriminated the false ones, which resulted in an improvement of the majority of Kolinski-Bujnicki's protein models. In the modeling we tested different sets of predicted contact data submitted to the CASP6 server. According to our results, the best performing were the contacts with the accuracy balanced with the coverage, obtained either from the best two predictors only or by a consensus from as many predictors as possible.

CONCLUSION

Our tests have shown that theoretically predicted contacts can be very beneficial for protein structure prediction. Depending on the protein modeling method, a contact data set applied should be prepared with differently balanced coverage and accuracy of predicted contacts. Namely, high coverage of contact data is important for the model ranking and high accuracy for the folding simulations.

Collapse

1265

Csaba G, Birzele F, Zimmer R. Protein structure alignment considering phenotypic plasticity. Bioinformatics 2008;24:i98-104. [DOI: 10.1093/bioinformatics/btn271] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open

1266

Alternating evolutionary pressure in a genetic algorithm facilitates protein model selection. BMC STRUCTURAL BIOLOGY 2008;8:34. [PMID: 18673557 PMCID: PMC2527322 DOI: 10.1186/1472-6807-8-34] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 06/04/2008] [Accepted: 08/01/2008] [Indexed: 11/12/2022]

1267

McGuffin LJ. Intrinsic disorder prediction from the analysis of multiple protein fold recognition models. Bioinformatics 2008;24:1798-804. [DOI: 10.1093/bioinformatics/btn326] [Citation(s) in RCA: 98] [Impact Index Per Article: 6.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open

1268

Protein model refinement using an optimized physics-based all-atom force field. Proc Natl Acad Sci U S A 2008;105:8268-73. [PMID: 18550813 DOI: 10.1073/pnas.0800054105] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open

1269

Benchmarking of TASSER_2.0: an improved protein structure prediction algorithm with more accurate predicted contact restraints. Biophys J 2008;95:1956-64. [PMID: 18487301 DOI: 10.1529/biophysj.108.129759] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open

1270

Pei J. Multiple protein sequence alignment. Curr Opin Struct Biol 2008;18:382-6. [PMID: 18485694 DOI: 10.1016/j.sbi.2008.03.007] [Citation(s) in RCA: 49] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/13/2008] [Accepted: 03/18/2008] [Indexed: 11/16/2022]

1271

Larsson P, Wallner B, Lindahl E, Elofsson A. Using multiple templates to improve quality of homology models in automated homology modeling. Protein Sci 2008;17:990-1002. [PMID: 18441233 DOI: 10.1110/ps.073344908] [Citation(s) in RCA: 109] [Impact Index Per Article: 6.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/22/2022]

1272

Zhang Y. Progress and challenges in protein structure prediction. Curr Opin Struct Biol 2008;18:342-8. [PMID: 18436442 DOI: 10.1016/j.sbi.2008.02.004] [Citation(s) in RCA: 304] [Impact Index Per Article: 19.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/04/2007] [Accepted: 02/14/2008] [Indexed: 10/22/2022]

1273

Bennett-Lovsey RM, Herbert AD, Sternberg MJE, Kelley LA. Exploring the extremes of sequence/structure space with ensemble fold recognition in the program Phyre. Proteins 2008;70:611-25. [PMID: 17876813 DOI: 10.1002/prot.21688] [Citation(s) in RCA: 348] [Impact Index Per Article: 21.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022]

Abstract

Structural and functional annotation of the large and growing database of genomic sequences is a major problem in modern biology. Protein structure prediction by detecting remote homology to known structures is a well-established and successful annotation technique. However, the broad spectrum of evolutionary change that accompanies the divergence of close homologues to become remote homologues cannot easily be captured with a single algorithm. Recent advances to tackle this problem have involved the use of multiple predictive algorithms available on the Internet. Here we demonstrate how such ensembles of predictors can be designed in-house under controlled conditions and permit significant improvements in recognition by using a concept taken from protein loop energetics and applying it to the general problem of 3D clustering. We have developed a stringent test that simulates the situation where a protein sequence of interest is submitted to multiple different algorithms and not one of these algorithms can make a confident (95%) correct assignment. A method of meta-server prediction (Phyre) that exploits the benefits of a controlled environment for the component methods was implemented. At 95% precision or higher, Phyre identified 64.0% of all correct homologous query-template relationships, and 84.0% of the individual test query proteins could be accurately annotated. In comparison to the improvement that the single best fold recognition algorithm (according to training) has over PSI-Blast, this represents a 29.6% increase in the number of correct homologous query-template relationships, and a 46.2% increase in the number of accurately annotated queries. It has been well recognised in fold prediction, other bioinformatics applications, and in many other areas, that ensemble predictions generally are superior in accuracy to any of the component individual methods. However there is a paucity of information as to why the ensemble methods are superior and indeed this has never been systematically addressed in fold recognition. Here we show that the source of ensemble power stems from noise reduction in filtering out false positive matches. The results indicate greater coverage of sequence space and improved model quality, which can consequently lead to a reduction in the experimental workload of structural genomics initiatives.

Collapse

1274

Benkert P, Tosatto SCE, Schomburg D. QMEAN: A comprehensive scoring function for model quality assessment. Proteins 2008;71:261-77. [PMID: 17932912 DOI: 10.1002/prot.21715] [Citation(s) in RCA: 737] [Impact Index Per Article: 46.1] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]

1275

Wrabl JO, Grishin NV. Statistics of Random Protein Superpositions: p-Values for Pairwise Structure Alignment. J Comput Biol 2008;15:317-55. [DOI: 10.1089/cmb.2007.0161] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open

1276

Silva PJ. Assessing the reliability of sequence similarities detected through hydrophobic cluster analysis. Proteins 2008;70:1588-94. [PMID: 17918727 DOI: 10.1002/prot.21803] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]

1277

Wu S, Zhang Y. A comprehensive assessment of sequence-based and template-based methods for protein contact prediction. ACTA ACUST UNITED AC 2008;24:924-31. [PMID: 18296462 DOI: 10.1093/bioinformatics/btn069] [Citation(s) in RCA: 151] [Impact Index Per Article: 9.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022]

Abstract

MOTIVATION

Pair-wise residue-residue contacts in proteins can be predicted from both threading templates and sequence-based machine learning. However, most structure modeling approaches only use the template-based contact predictions in guiding the simulations; this is partly because the sequence-based contact predictions are usually considered to be less accurate than that by threading. With the rapid progress in sequence databases and machine-learning techniques, it is necessary to have a detailed and comprehensive assessment of the contact-prediction methods in different template conditions.

RESULTS

We develop two methods for protein-contact predictions: SVM-SEQ is a sequence-based machine learning approach which trains a variety of sequence-derived features on contact maps; SVM-LOMETS collects consensus contact predictions from multiple threading templates. We test both methods on the same set of 554 proteins which are categorized into 'Easy', 'Medium', 'Hard' and 'Very Hard' targets based on the evolutionary and structural distance between templates and targets. For the Easy and Medium targets, SVM-LOMETS obviously outperforms SVM-SEQ; but for the Hard and Very Hard targets, the accuracy of the SVM-SEQ predictions is higher than that of SVM-LOMETS by 12-25%. If we combine the SVM-SEQ and SVM-LOMETS predictions together, the total number of correctly predicted contacts in the Hard proteins will increase by more than 60% (or 70% for the long-range contact with a sequence separation > or =24), compared with SVM-LOMETS alone. The advantage of SVM-SEQ is also shown in the CASP7 free modeling targets where the SVM-SEQ is around four times more accurate than SVM-LOMETS in the long-range contact prediction. These data demonstrate that the state-of-the-art sequence-based contact prediction has reached a level which may be helpful in assisting tertiary structure modeling for the targets which do not have close structure templates. The maximum yield should be obtained by the combination of both sequence- and template-based predictions.

Collapse

1278

Tan CW, Jones DT. Using neural networks and evolutionary information in decoy discrimination for protein tertiary structure prediction. BMC Bioinformatics 2008;9:94. [PMID: 18267018 PMCID: PMC2267779 DOI: 10.1186/1471-2105-9-94] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/01/2007] [Accepted: 02/11/2008] [Indexed: 11/13/2022] Open

1279

Biegert A, Söding J. De novo identification of highly diverged protein repeats by probabilistic consistency. Bioinformatics 2008;24:807-14. [DOI: 10.1093/bioinformatics/btn039] [Citation(s) in RCA: 123] [Impact Index Per Article: 7.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open

1280

Identification of Quaternary Structure and Functional Domains of the CI Repressor from Bacteriophage TP901-1. J Mol Biol 2008;376:983-96. [DOI: 10.1016/j.jmb.2007.12.022] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/07/2007] [Revised: 12/10/2007] [Accepted: 12/11/2007] [Indexed: 11/21/2022]

1281

Furnham N, de Bakker PI, Gore S, Burke DF, Blundell TL. Comparative modelling by restraint-based conformational sampling. BMC STRUCTURAL BIOLOGY 2008;8:7. [PMID: 18237407 PMCID: PMC2275734 DOI: 10.1186/1472-6807-8-7] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 10/10/2007] [Accepted: 01/31/2008] [Indexed: 11/10/2022]

Abstract

BACKGROUND

Although comparative modelling is routinely used to produce three-dimensional models of proteins, very few automated approaches are formulated in a way that allows inclusion of restraints derived from experimental data as well as those from the structures of homologues. Furthermore, proteins are usually described as a single conformer, rather than an ensemble that represents the heterogeneity and inaccuracy of experimentally determined protein structures. Here we address these issues by exploring the application of the restraint-based conformational space search engine, RAPPER, which has previously been developed for rebuilding experimentally defined protein structures and for fitting models to electron density derived from X-ray diffraction analyses.

RESULTS

A new application of RAPPER for comparative modelling uses positional restraints and knowledge-based sampling to generate models with accuracies comparable to other leading modelling tools. Knowledge-based predictions are based on geometrical features of the homologous templates and rules concerning main-chain and side-chain conformations. By directly changing the restraints derived from available templates we estimate the accuracy limits of the method in comparative modelling.

CONCLUSION

The application of RAPPER to comparative modelling provides an effective means of exploring the conformational space available to a target sequence. Enhanced methods for generating positional restraints can greatly improve structure prediction. Generation of an ensemble of solutions that are consistent with both target sequence and knowledge derived from the template structures provides a more appropriate representation of a structural prediction than a single model. By formulating homologous structural information as sets of restraints we can begin to consider how comparative models might be used to inform conformer generation from sparse experimental data.

Collapse

1282

Mereghetti P, Ganadu ML, Papaleo E, Fantucci P, De Gioia L. Validation of protein models by a neural network approach. BMC Bioinformatics 2008;9:66. [PMID: 18230168 PMCID: PMC2276493 DOI: 10.1186/1471-2105-9-66] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/24/2007] [Accepted: 01/29/2008] [Indexed: 11/30/2022] Open

1283

Wallner B, Elofsson A. Prediction of global and local model quality in CASP7 using Pcons and ProQ. Proteins 2008;69 Suppl 8:184-93. [PMID: 17894353 DOI: 10.1002/prot.21774] [Citation(s) in RCA: 64] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]

1284

Tress M, Cheng J, Baldi P, Joo K, Lee J, Seo JH, Lee J, Baker D, Chivian D, Kim D, Ezkurdia I. Assessment of predictions submitted for the CASP7 domain prediction category. Proteins 2008;69 Suppl 8:137-51. [PMID: 17680686 DOI: 10.1002/prot.21675] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2022]

1285

Zhou H, Pandit SB, Lee SY, Borreguero J, Chen H, Wroblewska L, Skolnick J. Analysis of TASSER-based CASP7 protein structure prediction results. Proteins 2008;69 Suppl 8:90-7. [PMID: 17705276 DOI: 10.1002/prot.21649] [Citation(s) in RCA: 58] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]

1286

Zhang Y. Template-based modeling and free modeling by I-TASSER in CASP7. Proteins 2008;69 Suppl 8:108-17. [PMID: 17894355 DOI: 10.1002/prot.21702] [Citation(s) in RCA: 338] [Impact Index Per Article: 21.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2022]

1287

Zhang Y. I-TASSER server for protein 3D structure prediction. BMC Bioinformatics 2008;9:40. [PMID: 18215316 PMCID: PMC2245901 DOI: 10.1186/1471-2105-9-40] [Citation(s) in RCA: 3801] [Impact Index Per Article: 237.6] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/19/2007] [Accepted: 01/23/2008] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

Prediction of 3-dimensional protein structures from amino acid sequences represents one of the most important problems in computational structural biology. The community-wide Critical Assessment of Structure Prediction (CASP) experiments have been designed to obtain an objective assessment of the state-of-the-art of the field, where I-TASSER was ranked as the best method in the server section of the recent 7th CASP experiment. Our laboratory has since then received numerous requests about the public availability of the I-TASSER algorithm and the usage of the I-TASSER predictions.

RESULTS

An on-line version of I-TASSER is developed at the KU Center for Bioinformatics which has generated protein structure predictions for thousands of modeling requests from more than 35 countries. A scoring function (C-score) based on the relative clustering structural density and the consensus significance score of multiple threading templates is introduced to estimate the accuracy of the I-TASSER predictions. A large-scale benchmark test demonstrates a strong correlation between the C-score and the TM-score (a structural similarity measurement with values in [0, 1]) of the first models with a correlation coefficient of 0.91. Using a C-score cutoff > -1.5 for the models of correct topology, both false positive and false negative rates are below 0.1. Combining C-score and protein length, the accuracy of the I-TASSER models can be predicted with an average error of 0.08 for TM-score and 2 A for RMSD.

CONCLUSION

The I-TASSER server has been developed to generate automated full-length 3D protein structural predictions where the benchmarked scoring system helps users to obtain quantitative assessments of the I-TASSER models. The output of the I-TASSER server for each query includes up to five full-length models, the confidence score, the estimated TM-score and RMSD, and the standard deviation of the estimations. The I-TASSER server is freely available to the academic community at http://zhang.bioinformatics.ku.edu/I-TASSER.

Collapse

1288

McGuffin LJ. The ModFOLD server for the quality assessment of protein structural models. Bioinformatics 2008;24:586-7. [DOI: 10.1093/bioinformatics/btn014] [Citation(s) in RCA: 107] [Impact Index Per Article: 6.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open

1289

Taly JF, Marin A, Gibrat JF. Can molecular dynamics simulations help in discriminating correct from erroneous protein 3D models? BMC Bioinformatics 2008;9:6. [PMID: 18179702 PMCID: PMC2245900 DOI: 10.1186/1471-2105-9-6] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/18/2007] [Accepted: 01/07/2008] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

Recent approaches for predicting the three-dimensional (3D) structure of proteins such as de novo or fold recognition methods mostly rely on simplified energy potential functions and a reduced representation of the polypeptide chain. These simplifications facilitate the exploration of the protein conformational space but do not permit to capture entirely the subtle relationship that exists between the amino acid sequence and its native structure. It has been proposed that physics-based energy functions together with techniques for sampling the conformational space, e.g., Monte Carlo or molecular dynamics (MD) simulations, are better suited to the task of modelling proteins at higher resolutions than those of models obtained with the former type of methods. In this study we monitor different protein structural properties along MD trajectories to discriminate correct from erroneous models. These models are based on the sequence-structure alignments provided by our fold recognition method, FROST. We define correct models as being built from alignments of sequences with structures similar to their native structures and erroneous models from alignments of sequences with structures unrelated to their native structures.

RESULTS

For three test sequences whose native structures belong to the all-alpha, all-beta and alphabeta classes we built a set of models intended to cover the whole spectrum: from a perfect model, i.e., the native structure, to a very poor model, i.e., a random alignment of the test sequence with a structure belonging to another structural class, including several intermediate models based on fold recognition alignments. We submitted these models to 11 ns of MD simulations at three different temperatures. We monitored along the corresponding trajectories the mean of the Root-Mean-Square deviations (RMSd) with respect to the initial conformation, the RMSd fluctuations, the number of conformation clusters, the evolution of secondary structures and the surface area of residues. None of these criteria alone is 100% efficient in discriminating correct from erroneous models. The mean RMSd, RMSd fluctuations, secondary structure and clustering of conformations show some false positives whereas the residue surface area criterion shows false negatives. However if we consider these criteria in combination it is straightforward to discriminate the two types of models.

CONCLUSION

The ability of discriminating correct from erroneous models allows us to improve the specificity and sensitivity of our fold recognition method for a number of ambiguous cases.

Collapse

1290

Development of a physics-based force field for the scoring and refinement of protein models. Biophys J 2008;94:3227-40. [PMID: 18178653 DOI: 10.1529/biophysj.107.121947] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open

1291

Lorenzen S, Zhang Y. Identification of near-native structures by clustering protein docking conformations. Proteins 2007;68:187-94. [PMID: 17397057 DOI: 10.1002/prot.21442] [Citation(s) in RCA: 53] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022]

1292

Borreguero JM, Skolnick J. Benchmarking of TASSER in the ab initio limit. Proteins 2007;68:48-56. [PMID: 17444524 DOI: 10.1002/prot.21392] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]

Abstract

A significant number of protein sequences in a given proteome have no obvious evolutionarily related protein in the database of solved protein structures, the PDB. Under these conditions, ab initio or template-free modeling methods are the sole means of predicting protein structure. To assess its expected performance on proteomes, the TASSER structure prediction algorithm is benchmarked in the ab initio limit on a representative set of 1129 nonhomologous sequences ranging from 40 to 200 residues that cover the PDB at 30% sequence identity and which adopt alpha, alpha + beta, and beta secondary structures. For sequences in the 40-100 (100-200) residue range, as assessed by their root mean square deviation from native, RMSD, the best of the top five ranked models of TASSER has a global fold that is significantly close to the native structure for 25% (16%) of the sequences, and with a correct identification of the structure of the protein core for 59% (36%). In the absence of a native structure, the structural similarity among the top five ranked models is a moderately reliable predictor of folding accuracy. If we classify the sequences according to their secondary structure content, then 64% (36%) of alpha, 43% (24%) of alpha + beta, and 20% (12%) of beta sequences in the 40-100 (100-200) residue range have a significant TM-score (TM-score > or = 0.4). TASSER performs best on helical proteins because there are less secondary structural elements to arrange in a helical protein than in a beta protein of equal length, since the average length of a helix is longer than that of a strand. In addition, helical proteins have shorter loops and dangling tails. If we exclude these flexible fragments, then TASSER has similar accuracy for sequences containing the same number of secondary structural elements, irrespective of whether they are helices and/or strands. Thus, it is the effective configurational entropy of the protein that dictates the average likelihood of correctly arranging the secondary structure elements.

Collapse

1293

Sadowski MI, Jones DT. Benchmarking template selection and model quality assessment for high-resolution comparative modeling. Proteins 2007;69:476-85. [PMID: 17623860 DOI: 10.1002/prot.21531] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/01/2023]

1294

Lee M, Jeong CS, Kim D. Predicting and improving the protein sequence alignment quality by support vector regression. BMC Bioinformatics 2007;8:471. [PMID: 18053160 PMCID: PMC2222655 DOI: 10.1186/1471-2105-8-471] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/25/2007] [Accepted: 12/03/2007] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

For successful protein structure prediction by comparative modeling, in addition to identifying a good template protein with known structure, obtaining an accurate sequence alignment between a query protein and a template protein is critical. It has been known that the alignment accuracy can vary significantly depending on our choice of various alignment parameters such as gap opening penalty and gap extension penalty. Because the accuracy of sequence alignment is typically measured by comparing it with its corresponding structure alignment, there is no good way of evaluating alignment accuracy without knowing the structure of a query protein, which is obviously not available at the time of structure prediction. Moreover, there is no universal alignment parameter option that would always yield the optimal alignment.

RESULTS

In this work, we develop a method to predict the quality of the alignment between a query and a template. We train the support vector regression (SVR) models to predict the MaxSub scores as a measure of alignment quality. The alignment between a query protein and a template of length n is transformed into a (n + 1)-dimensional feature vector, then it is used as an input to predict the alignment quality by the trained SVR model. Performance of our work is evaluated by various measures including Pearson correlation coefficient between the observed and predicted MaxSub scores. Result shows high correlation coefficient of 0.945. For a pair of query and template, 48 alignments are generated by changing alignment options. Trained SVR models are then applied to predict the MaxSub scores of those and to select the best alignment option which is chosen specifically to the query-template pair. This adaptive selection procedure results in 7.4% improvement of MaxSub scores, compared to those when the single best parameter option is used for all query-template pairs.

CONCLUSION

The present work demonstrates that the alignment quality can be predicted with reasonable accuracy. Our method is useful not only for selecting the optimal alignment parameters for a chosen template based on predicted alignment quality, but also for filtering out problematic templates that are not suitable for structure prediction due to poor alignment accuracy. This is implemented as a part in FORECAST, the server for fold-recognition and is freely available on the web at http://pbil.kaist.ac.kr/forecast.

Collapse

1295

Birzele F, Csaba G, Zimmer R. Alternative splicing and protein structure evolution. Nucleic Acids Res 2007;36:550-8. [PMID: 18055499 PMCID: PMC2241867 DOI: 10.1093/nar/gkm1054] [Citation(s) in RCA: 78] [Impact Index Per Article: 4.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/16/2022] Open

1296

Qiu J, Sheffler W, Baker D, Noble WS. Ranking predicted protein structures with support vector regression. Proteins 2007;71:1175-82. [PMID: 18004754 DOI: 10.1002/prot.21809] [Citation(s) in RCA: 65] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/05/2022]

1297

Reddy BVB, Kaznessis YN. Use of secondary structural information and C alpha-C alpha distance restraints to model protein structures with MODELLER. J Biosci 2007;32:929-36. [PMID: 17914235 DOI: 10.1007/s12038-007-0093-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/22/2022]

1298

Chen H, Skolnick J. M-TASSER: an algorithm for protein quaternary structure prediction. Biophys J 2007;94:918-28. [PMID: 17905848 PMCID: PMC2186260 DOI: 10.1529/biophysj.107.114280] [Citation(s) in RCA: 47] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/01/2023] Open

1299

McGuffin LJ. Benchmarking consensus model quality assessment for protein fold recognition. BMC Bioinformatics 2007;8:345. [PMID: 17877795 PMCID: PMC2048972 DOI: 10.1186/1471-2105-8-345] [Citation(s) in RCA: 55] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/14/2007] [Accepted: 09/18/2007] [Indexed: 11/25/2022] Open

Abstract

Background

Selecting the highest quality 3D model of a protein structure from a number of alternatives remains an important challenge in the field of structural bioinformatics. Many Model Quality Assessment Programs (MQAPs) have been developed which adopt various strategies in order to tackle this problem, ranging from the so called "true" MQAPs capable of producing a single energy score based on a single model, to methods which rely on structural comparisons of multiple models or additional information from meta-servers. However, it is clear that no current method can separate the highest accuracy models from the lowest consistently. In this paper, a number of the top performing MQAP methods are benchmarked in the context of the potential value that they add to protein fold recognition. Two novel methods are also described: ModSSEA, which based on the alignment of predicted secondary structure elements and ModFOLD which combines several true MQAP methods using an artificial neural network.

Results

The ModSSEA method is found to be an effective model quality assessment program for ranking multiple models from many servers, however further accuracy can be gained by using the consensus approach of ModFOLD. The ModFOLD method is shown to significantly outperform the true MQAPs tested and is competitive with methods which make use of clustering or additional information from multiple servers. Several of the true MQAPs are also shown to add value to most individual fold recognition servers by improving model selection, when applied as a post filter in order to re-rank models.

Conclusion

MQAPs should be benchmarked appropriately for the practical context in which they are intended to be used. Clustering based methods are the top performing MQAPs where many models are available from many servers; however, they often do not add value to individual fold recognition servers when limited models are available. Conversely, the true MQAP methods tested can often be used as effective post filters for re-ranking few models from individual fold recognition servers and further improvements can be achieved using a consensus of these methods.

Collapse

1300

Wroblewska L, Skolnick J. Can a physics-based, all-atom potential find a protein's native structure among misfolded structures? I. Large scale AMBER benchmarking. J Comput Chem 2007;28:2059-66. [PMID: 17407093 DOI: 10.1002/jcc.20720] [Citation(s) in RCA: 52] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]