1
Distinctive attributes for predicted secondary structures at terminal sequences of non-classically secreted proteins from proteobacteria. Open Life Sci 2008. [DOI: 10.2478/s11535-008-0026-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/21/2022]
Abstract
C- and N-terminal sequences (64 amino acid residues each) of 89 non-classically secreted type I, type III and type IV proteins (Swiss-Prot/TrEMBL) from proteobacteria were converted into predicted secondary structures. Multivariate analysis of variance (MANOVA) confirmed that location (C- or N-terminus) and secretion type are significant factors for the quantitative representation of structured (α-helices, β-strands) and unstructured (coils) elements. The secondary-structure profiles were encoded using unequal property values for helices, strands and coils, and the corresponding numerical vectors (independent variables) were subjected to multiple discriminant analysis with the type of secreted protein as the dependent variable. The set of strong predictor variables (21 property values located in the region 2–49 residues from the C-terminus) classified all three types of non-classically secreted proteins with an accuracy of 93.3% for original grouped cases and 89.9% for cross-validated (leave-one-out) grouped cases. The average error rate (0.137 ± 0.015) of k-fold cross-validation (k = 3, 4, 6, 8, 10, 89) confirmed an acceptable prediction accuracy of the defined discriminant functions with regard to the types of non-classically secreted proteins. The proposed prediction tool could be used to identify secretome proteins from genomic sequences as well as to assess the compatibility between secretion pathways and secretion substrates of proteobacteria.
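The encoding and cross-validation steps described in this abstract can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration: the property values in `PROPERTY` (the abstract states only that they are unequal), the toy profiles, and the nearest-centroid rule, which is a simple stand-in for the multiple discriminant analysis the authors used, not a reproduction of it. The folds are taken by interleaving rather than by random division.

```python
# Hypothetical property values; the abstract says only that they are unequal.
PROPERTY = {"H": 1.0, "E": 0.5, "C": 0.0}  # helix, strand, coil


def encode(profile):
    """Map a predicted secondary-structure string (H/E/C) to a numeric vector."""
    return [PROPERTY[state] for state in profile]


def centroid(vectors):
    """Component-wise mean of equally long vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def predict(train, x):
    """Nearest-centroid rule: a simple stand-in for a discriminant function."""
    groups = {}
    for vector, label in train:
        groups.setdefault(label, []).append(vector)

    def squared_distance(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))

    return min(groups, key=lambda label: squared_distance(centroid(groups[label])))


def kfold_error(data, k):
    """Error rate of k-fold cross-validation; k == len(data) is leave-one-out."""
    errors = 0
    for fold in range(k):
        test = data[fold::k]  # every k-th case, starting at index `fold`
        train = [case for i, case in enumerate(data) if i % k != fold]
        errors += sum(predict(train, vector) != label for vector, label in test)
    return errors / len(data)


# Toy profiles (invented for illustration), two per secretion type.
data = [
    (encode("HHHHHHCC"), "type I"),
    (encode("HHHHHCCC"), "type I"),
    (encode("EEEECCCC"), "type III"),
    (encode("EEEEECCC"), "type III"),
    (encode("CCCCCCCC"), "type IV"),
    (encode("CCCCCCCH"), "type IV"),
]

error_3fold = kfold_error(data, 3)
error_loo = kfold_error(data, len(data))
```

Averaging `kfold_error` over several values of k, as the abstract does for k = 3, 4, 6, 8, 10 and 89, gives a single summary error rate; with 89 cases, k = 89 corresponds to the leave-one-out procedure.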
2
Distinctive amino acid residue periodicities in terminal sequences of type III and type I secreted proteins from proteobacteria. Open Life Sci 2007. [DOI: 10.2478/s11535-007-0017-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/20/2022]
Abstract
The Fourier transform (FT) method was applied to characterize the distribution of 14 predefined groups of amino acids within the 64 residues at both termini of annotated type III and type I secreted proteins from proteobacteria. Type I proteins displayed a higher occurrence of significant periodicities at both C- and N-termini, indicating features that can discriminate between secretion types, particularly when variables are selected from the full periodicity profiles at 19 orders of FT. Fisher's linear discriminant analysis, together with stepwise selection of variables across equal pairs of combinations for all predefined groups of residues, revealed the C-terminal harmonics of aromatic (HFWY) and aliphatic (VLIA) residues as a set of strong predictor variables that classify both types of secreted proteins with an accuracy of 100% for original grouped cases and 96.4% for cross-validated grouped cases. The prediction accuracy of the proposed discriminant function was estimated by repeated k-fold cross-validation procedures in which the original data set was randomly divided into k subsets, with one of the k subsets serving as the test set and the remaining data forming the training set. The average error rate computed across all k trials and repeats did not exceed that of the leave-one-out procedure. The proposed set of predictor variables could be used to assess the compatibility between secretion pathways and secretion substrates of proteobacteria by means of discriminant analysis.
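The idea of detecting residue periodicities with a Fourier transform can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' procedure: the binary indicator encoding, the mean-centering, the power normalization and the toy sequence are all assumptions, and only the two residue groups (HFWY and VLIA) are taken from the abstract.

```python
import cmath

AROMATIC = set("HFWY")   # residue group named in the abstract
ALIPHATIC = set("VLIA")  # residue group named in the abstract


def indicator(sequence, group):
    """Binary signal: 1.0 where the residue belongs to the group, else 0.0."""
    return [1.0 if residue in group else 0.0 for residue in sequence]


def power_spectrum(signal, max_order):
    """Power of the discrete Fourier transform at orders 1..max_order.

    The signal is mean-centered first, so group composition alone
    (the zero-order term) does not contribute to the profile.
    """
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]
    powers = []
    for order in range(1, max_order + 1):
        coefficient = sum(
            x * cmath.exp(-2j * cmath.pi * order * t / n)
            for t, x in enumerate(centered)
        )
        powers.append(abs(coefficient) ** 2 / n)
    return powers


def dominant_order(powers):
    """1-based order with the highest power."""
    return max(range(len(powers)), key=powers.__getitem__) + 1


# Toy 16-residue sequence (invented) with an aromatic residue every 4th position.
toy_sequence = "FGGGFGGGFGGGFGGG"
powers = power_spectrum(indicator(toy_sequence, AROMATIC), max_order=7)
peak = dominant_order(powers)  # order n / period = 16 / 4
```

An order k in a window of n residues corresponds to a repeat every n / k residues, so a periodicity profile of this kind yields one candidate predictor variable per residue group and order, from which a discriminant analysis could then select.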