Reference Citation Analysis

For: Wang G, Jin Y, Dunbrack RL. Assessment of fold recognition predictions in CASP6. Proteins 2006;61 Suppl 7:46-66. [PMID: 16187346 DOI: 10.1002/prot.20721] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.2] [Indexed: 11/07/2022]

Cited by Other Article(s)

Achimba F, Faezov B, Cohen B, Dunbrack R, Holford M. Targeting Dysregulated Ion Channels in Liver Tumors with Venom Peptides. Mol Cancer Ther 2024;23:139-147. [PMID: 38015557 PMCID: PMC10831335 DOI: 10.1158/1535-7163.mct-23-0256] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2023] [Revised: 10/04/2023] [Accepted: 11/14/2023] [Indexed: 11/29/2023]

Modi V, Xu Q, Adhikari S, Dunbrack RL. Assessment of template-based modeling of protein structure in CASP11. Proteins 2016;84 Suppl 1:200-20. [PMID: 27081927 DOI: 10.1002/prot.25049] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Received: 01/24/2016] [Revised: 04/04/2016] [Accepted: 04/11/2016] [Indexed: 12/27/2022]

Wabik J, Kurcinski M, Kolinski A. Coarse-Grained Modeling of Peptide Docking Associated with Large Conformation Transitions of the Binding Protein: Troponin I Fragment-Troponin C System. Molecules 2015;20:10763-80. [PMID: 26111167 PMCID: PMC6272278 DOI: 10.3390/molecules200610763] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 03/31/2015] [Revised: 05/14/2015] [Accepted: 05/21/2015] [Indexed: 11/25/2022]

Ligand heterogeneity of the cysteine protease binding protein family in the parasitic protist Entamoeba histolytica. Int J Parasitol 2014;44:625-35. [DOI: 10.1016/j.ijpara.2014.04.008] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Received: 02/27/2014] [Revised: 04/11/2014] [Accepted: 04/15/2014] [Indexed: 01/08/2023]

Feng Y, Lin H, Luo L. Prediction of protein secondary structure using feature selection and analysis approach. Acta Biotheor 2014;62:1-14. [PMID: 24052343 DOI: 10.1007/s10441-013-9203-7] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 12/29/2012] [Accepted: 08/24/2013] [Indexed: 01/09/2023]

Kaushik S, Mutt E, Chellappan A, Sankaran S, Srinivasan N, Sowdhamini R. Improved detection of remote homologues using cascade PSI-BLAST: influence of neighbouring protein families on sequence coverage. PLoS One 2013;8:e56449. [PMID: 23437136 PMCID: PMC3577913 DOI: 10.1371/journal.pone.0056449] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 07/16/2012] [Accepted: 01/13/2013] [Indexed: 12/31/2022]

Abstract

Background

Development of sensitive sequence search procedures for the detection of distant relationships between proteins at the superfamily/fold level is still a big challenge. The intermediate sequence search approach is the most frequently employed way of identifying remote homologues effectively. In this study, serine proteases of the prolyl oligopeptidase, rhomboid and subtilisin protein families were examined using plant serine proteases from two genomes, A. thaliana and O. sativa, as queries, along with 13 other families of unrelated folds, to identify distant homologues that could not be obtained using PSI-BLAST.

Methodology/Principal Findings

We propose starting with multiple queries of classical serine protease members to identify remote homologues in families, using a rigorous approach such as Cascade PSI-BLAST. We found that classical sequence-based approaches, such as PSI-BLAST, showed very low sequence coverage in identifying plant serine proteases. The algorithm was applied to an enriched sequence database of homologous domains, and we obtained an overall average coverage of 88% at the family level and 77% at the superfamily or fold level, along with a specificity of ∼100% and a Matthews correlation coefficient of 0.91. A similar approach was also implemented on 13 other protein families representing every structural class in the SCOP database. Further investigation with statistical tests, such as jackknifing, helped us to better understand the influence of neighbouring protein families.

Conclusions/Significance

Our study suggests that using multiple queries from a family in Cascade PSI-BLAST searches is useful for predicting distant relationships effectively, even at the superfamily level. We have proposed a generalized strategy to cover all the distant members of a particular family using multiple query sequences. Our findings reveal that prior selection of sequences as queries and the presence of neighbouring families can be important for covering the search space effectively in minimal computational time. This study also provides an understanding of the ‘bridging’ role of related families.

González J, Gálvez A, Morales L, Barreto GE, Capani F, Sierra O, Torres Y. Integrative Approach for Computationally Inferring Interactions between the Alpha and Beta Subunits of the Calcium-Activated Potassium Channel (BK): A Docking Study. Bioinform Biol Insights 2013;7:73-82. [PMID: 23492851 PMCID: PMC3588595 DOI: 10.4137/bbi.s10077] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/05/2022]

3D profile-based approach to proteome-wide discovery of novel human chemokines. PLoS One 2012;7:e36151. [PMID: 22586462 PMCID: PMC3346806 DOI: 10.1371/journal.pone.0036151] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 02/01/2012] [Accepted: 03/27/2012] [Indexed: 12/29/2022]

Abstract

Chemokines are small secreted proteins with important roles in immune responses. They consist of a conserved three-dimensional (3D) structure, the so-called IL8-like chemokine fold, which is supported by disulfide bridges characteristic of this protein family. Sequence- and profile-based computational methods have been proficient in discovering novel chemokines by making use of their sequence-conserved cysteine patterns. However, it has been recently shown that some chemokines escaped annotation by these methods due to low sequence similarity to known chemokines and to different arrangement of cysteines in sequence and in 3D. Innovative methods overcoming the limitations of current techniques may allow the discovery of new remote homologs in the still functionally uncharacterized fraction of the human genome. We report a novel computational approach for proteome-wide identification of remote homologs of the chemokine family that uses fold recognition techniques in combination with a scaffold-based automatic mapping of disulfide bonds to define a 3D profile of the chemokine protein family. By applying our methodology to all currently uncharacterized human protein sequences, we have discovered two novel proteins that, without having significant sequence similarity to known chemokines or characteristic cysteine patterns, show strong structural resemblance to known anti-HIV chemokines. Detailed computational analysis and experimental structural investigations based on mass spectrometry and circular dichroism support our structural predictions and highlight several other chemokine-like features. The results obtained support their functional annotation as putative novel chemokines and encourage further experimental characterization. The identification of remote homologs of human chemokines may provide new insights into the molecular mechanisms causing pathologies such as cancer or AIDS, and may contribute to the development of novel treatments. In addition, the genome-wide applicability of our methodology based on 3D protein family profiles may open up new possibilities for improving and accelerating protein function annotation processes.

Poleksic A, Fienup M, Danzer JF, Debe DA. A different look at the quality of modeled three-dimensional protein structures. J Bioinform Comput Biol 2011;6:335-45. [DOI: 10.1142/s0219720008003424] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 08/18/2007] [Revised: 11/14/2007] [Accepted: 12/05/2007] [Indexed: 11/18/2022]

Wei Y, Thompson J, Floudas CA. CONCORD: a consensus method for protein secondary structure prediction via mixed integer linear optimization. Proc Math Phys Eng Sci 2011. [DOI: 10.1098/rspa.2011.0514] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.1] [Indexed: 11/12/2022]

Wei Y, Floudas CA. Enhanced Inter-helical Residue Contact Prediction in Transmembrane Proteins. Chem Eng Sci 2011;66:4356-4369. [PMID: 21892227 PMCID: PMC3164537 DOI: 10.1016/j.ces.2011.04.033] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Indexed: 12/31/2022]

Zhou H, Skolnick J. Improving threading algorithms for remote homology modeling by combining fragment and template comparisons. Proteins 2010;78:2041-8. [PMID: 20455261 DOI: 10.1002/prot.22717] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Indexed: 01/01/2023]

Wu S, Zhang Y. Recognizing protein substructure similarity using segmental threading. Structure 2010;18:858-67. [PMID: 20637422 DOI: 10.1016/j.str.2010.04.007] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.7] [Received: 02/09/2010] [Revised: 04/02/2010] [Accepted: 04/03/2010] [Indexed: 11/15/2022]

Krivov GG, Shapovalov MV, Dunbrack RL. Improved prediction of protein side-chain conformations with SCWRL4. Proteins 2010;77:778-95. [PMID: 19603484 DOI: 10.1002/prot.22488] [Citation(s) in RCA: 984] [Impact Index Per Article: 70.3] [Indexed: 11/06/2022]

Cozzetto D, Kryshtafovych A, Fidelis K, Moult J, Rost B, Tramontano A. Evaluation of template-based models in CASP8 with standard measures. Proteins 2010;77 Suppl 9:18-28. [PMID: 19731382 DOI: 10.1002/prot.22561] [Citation(s) in RCA: 108] [Impact Index Per Article: 7.7] [Indexed: 12/27/2022]

Keedy DA, Williams CJ, Headd JJ, Arendall WB, Chen VB, Kapral GJ, Gillespie RA, Block JN, Zemla A, Richardson DC, Richardson JS. The other 90% of the protein: assessment beyond the Calphas for CASP8 template-based and high-accuracy models. Proteins 2010;77 Suppl 9:29-49. [PMID: 19731372 DOI: 10.1002/prot.22551] [Citation(s) in RCA: 71] [Impact Index Per Article: 5.1] [Indexed: 12/27/2022]

Hvidsten TR, Kryshtafovych A, Fidelis K. Local descriptors of protein structure: a systematic analysis of the sequence-structure relationship in proteins using short- and long-range interactions. Proteins 2009;75:870-84. [PMID: 19025980 DOI: 10.1002/prot.22296] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Indexed: 11/10/2022]

Abstract

Local protein structure representations that incorporate long-range contacts between residues are often considered in protein structure comparison but have found relatively little use in structure prediction where assembly from single backbone fragments dominates. Here, we introduce the concept of local descriptors of protein structure to characterize local neighborhoods of amino acids including short- and long-range interactions. We build a library of recurring local descriptors and show that this library is general enough to allow assembly of unseen protein structures. The library could on average re-assemble 83% of 119 unseen structures, and showed little or no performance decrease between homologous targets and targets with folds not represented among domains used to build it. We then systematically evaluate the descriptor library to establish the level of the sequence signal in sets of protein fragments of similar geometrical conformation. In particular, we test whether that signal is strong enough to facilitate correct assignment and alignment of these local geometries to new sequences. We use the signal to assign descriptors to a test set of 479 sequences with less than 40% sequence identity to any domain used to build the library, and show that on average more than 50% of the backbone fragments constituting descriptors can be correctly aligned. We also use the assigned descriptors to infer SCOP folds, and show that correct predictions can be made in many of the 151 cases where PSI-BLAST was unable to detect significant sequence similarity to proteins in the library. Although the combinatorial problem of simultaneously aligning several fragments to sequence is a major bottleneck compared with single fragment methods, the advantage of the current approach is that correct alignments imply correct long range distance constraints. The lack of these constraints is most likely the major reason why structure prediction methods fail to consistently produce adequate models when good templates are unavailable or undetectable. Thus, we believe that the current study offers new and valuable insight into the prediction of sequence-structure relationships in proteins.

Tkaczuk KL. Trm13p, the tRNA:Xm4 modification enzyme from Saccharomyces cerevisiae is a member of the Rossmann-fold MTase superfamily: prediction of structure and active site. J Mol Model 2009;16:599-606. [PMID: 19697067 DOI: 10.1007/s00894-009-0570-6] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 03/31/2009] [Accepted: 07/28/2009] [Indexed: 01/09/2023]

Sadreyev RI, Shi S, Baker D, Grishin NV. Structure similarity measure with penalty for close non-equivalent residues. Bioinformatics 2009;25:1259-63. [PMID: 19321733 PMCID: PMC2677741 DOI: 10.1093/bioinformatics/btp148] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/17/2022]

Orlowski J, Mebrhatu MT, Michiels CW, Bujnicki JM, Aertsen A. Mutational analysis and a structural model of methyl-directed restriction enzyme Mrr. Biochem Biophys Res Commun 2008;377:862-6. [DOI: 10.1016/j.bbrc.2008.10.064] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Received: 10/14/2008] [Accepted: 10/15/2008] [Indexed: 11/29/2022]

Wu S, Zhang Y. MUSTER: Improving protein sequence profile-profile alignments by using multiple sources of structure information. Proteins 2008;72:547-56. [PMID: 18247410 DOI: 10.1002/prot.21945] [Citation(s) in RCA: 310] [Impact Index Per Article: 19.4] [Indexed: 11/09/2022]

SCPRED: accurate prediction of protein structural class for sequences of twilight-zone similarity with predicting sequences. BMC Bioinformatics 2008;9:226. [PMID: 18452616 PMCID: PMC2391167 DOI: 10.1186/1471-2105-9-226] [Citation(s) in RCA: 119] [Impact Index Per Article: 7.4] [Received: 10/31/2007] [Accepted: 05/01/2008] [Indexed: 11/16/2022]

Abstract

Background

Protein structure prediction methods provide accurate results when a homologous protein is predicted, while poorer predictions are obtained in the absence of homologous templates. However, some protein chains that share twilight-zone pairwise identity can form similar folds, and thus determining structural similarity without sequence similarity would be desirable for structure prediction. The folding type of a protein or its domain is defined as the structural class. Current structural class prediction methods that predict the four structural classes defined in SCOP provide up to 63% accuracy for datasets in which the sequence identity of any pair of sequences belongs to the twilight zone. We propose the SCPRED method, which improves prediction accuracy for sequences that share twilight-zone pairwise similarity with the sequences used for the prediction.

Results

SCPRED uses a support vector machine classifier that takes several custom-designed features as its input to predict the structural classes. Based on extensive design that considers over 2300 index-, composition- and physicochemical properties-based features along with features based on the predicted secondary structure and content, the classifier's input includes 8 features based on information extracted from the secondary structure predicted with PSI-PRED and one feature computed from the sequence. Tests performed with datasets of 1673 protein chains, in which any pair of sequences shares twilight-zone similarity, show that SCPRED obtains 80.3% accuracy when predicting the four SCOP-defined structural classes, which is superior when compared with over a dozen recent competing methods that are based on support vector machine, logistic regression, and ensemble of classifiers predictors.

Conclusion

SCPRED can accurately find similar structures for sequences that share low identity with the sequences used for the prediction. The high predictive accuracy achieved by SCPRED is attributed to the design of the features, which are capable of separating the structural classes in spite of their low dimensionality. We also demonstrate that SCPRED's predictions can be successfully used as a post-processing filter to improve the performance of modern fold classification methods.

Poleksic A, Fienup M. Optimizing the size of the sequence profiles to increase the accuracy of protein sequence alignments generated by profile-profile algorithms. Bioinformatics 2008;24:1145-53. [PMID: 18337259 DOI: 10.1093/bioinformatics/btn097] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/12/2022]

Cozzetto D, Kryshtafovych A, Ceriani M, Tramontano A. Assessment of predictions in the model quality assessment category. Proteins 2008;69 Suppl 8:175-83. [PMID: 17680695 DOI: 10.1002/prot.21669] [Citation(s) in RCA: 78] [Impact Index Per Article: 4.9] [Indexed: 01/23/2023]

Kopp J, Bordoli L, Battey JND, Kiefer F, Schwede T. Assessment of CASP7 predictions for template-based modeling targets. Proteins 2008;69 Suppl 8:38-56. [PMID: 17894352 DOI: 10.1002/prot.21753] [Citation(s) in RCA: 119] [Impact Index Per Article: 7.4] [Indexed: 11/11/2022]

Yang J. Comprehensive description of protein structures using protein folding shape code. Proteins 2008;71:1497-518. [DOI: 10.1002/prot.21932] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.3] [Indexed: 11/09/2022]

A historical perspective of template-based protein structure prediction. Methods Mol Biol 2008;413:3-42. [PMID: 18075160 DOI: 10.1007/978-1-59745-574-9_1] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 12/12/2022]

Protein Structure Prediction. Bioinformatics 2008;453:33-85. [DOI: 10.1007/978-1-60327-429-6_2] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 12/18/2022]

Kosinski J, Kubareva E, Bujnicki JM. A model of restriction endonuclease MvaI in complex with DNA: a template for interpretation of experimental data and a guide for specificity engineering. Proteins 2007;68:324-36. [PMID: 17407166 DOI: 10.1002/prot.21460] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Indexed: 01/07/2023]

Yan A, Kloczkowski A, Hofmann H, Jernigan RL. Prediction of side chain orientations in proteins by statistical machine learning methods. J Biomol Struct Dyn 2007;25:275-88. [PMID: 17937489 DOI: 10.1080/07391102.2007.10507176] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/28/2022]

Chen K, Kurgan L. PFRES: protein fold classification by using evolutionary information and predicted secondary structure. Bioinformatics 2007;23:2843-50. [DOI: 10.1093/bioinformatics/btm475] [Citation(s) in RCA: 91] [Impact Index Per Article: 5.4] [Indexed: 11/14/2022]

Maravić Vlahovicek G, Cubrilo S, Tkaczuk KL, Bujnicki JM. Modeling and experimental analyses reveal a two-domain structure and amino acids important for the activity of aminoglycoside resistance methyltransferase Sgm. Biochim Biophys Acta Proteins Proteom 2007;1784:582-90. [PMID: 18343347 DOI: 10.1016/j.bbapap.2007.09.009] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Received: 07/31/2007] [Revised: 09/18/2007] [Accepted: 09/19/2007] [Indexed: 12/19/2022]

Floudas CA. Computational methods in protein structure prediction. Biotechnol Bioeng 2007;97:207-13. [PMID: 17455371 DOI: 10.1002/bit.21411] [Citation(s) in RCA: 85] [Impact Index Per Article: 5.0] [Indexed: 01/01/2023]

Structural and evolutionary bioinformatics of the SPOUT superfamily of methyltransferases. BMC Bioinformatics 2007;8:73. [PMID: 17338813 PMCID: PMC1829167 DOI: 10.1186/1471-2105-8-73] [Citation(s) in RCA: 128] [Impact Index Per Article: 7.5] [Received: 11/20/2006] [Accepted: 03/05/2007] [Indexed: 11/29/2022]

Abstract

Background

SPOUT methyltransferases (MTases) are a large class of S-adenosyl-L-methionine-dependent enzymes that exhibit an unusual alpha/beta fold with a very deep topological knot. In 2001, when no crystal structures were available for any of these proteins, Anantharaman, Koonin, and Aravind identified homology between SpoU and TrmD MTases and defined the SPOUT superfamily. Since then, multiple crystal structures of knotted MTases have been solved and numerous new homologous sequences appeared in the databases. However, no comprehensive comparative analysis of these proteins has been carried out to classify them based on structural and evolutionary criteria and to guide functional predictions.

Results

We carried out extensive searches of databases of protein structures and sequences to collect all members of previously identified SPOUT MTases, and to identify previously unknown homologs. Based on sequence clustering, characterization of domain architecture, structure predictions and sequence/structure comparisons, we re-defined families within the SPOUT superfamily and predicted putative active sites and biochemical functions for the so far uncharacterized members. We have also delineated the common core of SPOUT MTases and inferred a multiple sequence alignment for the conserved knot region, from which we calculated the phylogenetic tree of the superfamily. We have also studied phylogenetic distribution of different families, and used this information to infer the evolutionary history of the SPOUT superfamily.

Conclusion

We present the first phylogenetic tree of the SPOUT superfamily since it was defined, together with a new scheme for its classification, and discussion about conservation of sequence and structure in different families, and their functional implications. We identified four protein families as new members of the SPOUT superfamily. Three of these families are functionally uncharacterized (COG1772, COG1901, and COG4080), and one (COG1756 represented by Nep1p) has been already implicated in RNA metabolism, but its biochemical function has been unknown. Based on the inference of orthologous and paralogous relationships between all SPOUT families we propose that the Last Universal Common Ancestor (LUCA) of all extant organisms contained at least three SPOUT members, ancestors of contemporary RNA MTases that carry out m¹G, m³U, and 2'-O-ribose methylation, respectively. In this work we also speculate on the origin of the knot and propose possible 'unknotted' ancestors. The results of our analysis provide a comprehensive 'roadmap' for experimental characterization of SPOUT MTases and interpretation of functional studies in the light of sequence-structure relationships.

Sickmeier M, Hamilton JA, LeGall T, Vacic V, Cortese MS, Tantos A, Szabo B, Tompa P, Chen J, Uversky VN, Obradovic Z, Dunker AK. DisProt: the Database of Disordered Proteins. Nucleic Acids Res 2006;35:D786-93. [PMID: 17145717 PMCID: PMC1751543 DOI: 10.1093/nar/gkl893] [Citation(s) in RCA: 616] [Impact Index Per Article: 34.2] [Indexed: 11/13/2022]

Affiliation(s)

Megan Sickmeier Department of Biochemistry and Molecular Biology, Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, IN 46202, USA
Justin A. Hamilton Department of Biochemistry and Molecular Biology, Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, IN 46202, USA
Tanguy LeGall Department of Biochemistry and Molecular Biology, Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, IN 46202, USA
Vladimir Vacic Computer Science and Engineering, University of California Riverside, Riverside, CA 92521, USA
Marc S. Cortese Department of Biochemistry and Molecular Biology, Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, IN 46202, USA
Agnes Tantos Institute of Enzymology, Biological Research Center, Hungarian Academy of Sciences, Budapest, Hungary
Beata Szabo Institute of Enzymology, Biological Research Center, Hungarian Academy of Sciences, Budapest, Hungary
Peter Tompa Institute of Enzymology, Biological Research Center, Hungarian Academy of Sciences, Budapest, Hungary
Jake Chen School of Informatics, Indiana University, Indianapolis, IN 46202, USA
Vladimir N. Uversky Department of Biochemistry and Molecular Biology, Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, IN 46202, USA; Institute for Biological Instrumentation, Russian Academy of Sciences, 142292 Pushchino, Moscow Region, Russia
Zoran Obradovic Center for Information Science and Technology, Temple University, Philadelphia, PA 19122, USA
A. Keith Dunker Department of Biochemistry and Molecular Biology, Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, IN 46202, USA; School of Informatics, Indiana University, Indianapolis, IN 46202, USA. To whom correspondence should be addressed. Tel: +1 317 278 9650; Fax: +1 317 278 9217

Zhi D, Krishna SS, Cao H, Pevzner P, Godzik A. Representing and comparing protein structures as paths in three-dimensional space. BMC Bioinformatics 2006;7:460. [PMID: 17052359 PMCID: PMC1626488 DOI: 10.1186/1471-2105-7-460] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Received: 05/18/2006] [Accepted: 10/20/2006] [Indexed: 11/10/2022]

Chivian D, Baker D. Homology modeling using parametric alignment ensemble generation with consensus and energy-based model selection. Nucleic Acids Res 2006;34:e112. [PMID: 16971460 PMCID: PMC1635247 DOI: 10.1093/nar/gkl480] [Citation(s) in RCA: 89] [Impact Index Per Article: 4.9] [Indexed: 11/12/2022]

Debe DA, Danzer JF, Goddard WA, Poleksic A. STRUCTFAST: Protein sequence remote homology detection and alignment using novel dynamic programming and profile-profile scoring. Proteins 2006;64:960-7. [PMID: 16786595 DOI: 10.1002/prot.21049] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.3] [Indexed: 11/12/2022]

Dunbrack RL. Sequence comparison and protein structure prediction. Curr Opin Struct Biol 2006;16:374-84. [PMID: 16713709 DOI: 10.1016/j.sbi.2006.05.006] [Citation(s) in RCA: 119] [Impact Index Per Article: 6.6] [Received: 02/23/2006] [Revised: 03/22/2006] [Accepted: 05/08/2006] [Indexed: 10/24/2022]