1
Fang C, Shang Y, Xu D. A deep dense inception network for protein beta-turn prediction. Proteins 2019; 88:143-151. [PMID: 31294886 DOI: 10.1002/prot.25780] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 09/12/2018] [Revised: 06/17/2019] [Accepted: 07/06/2019] [Indexed: 12/13/2022]
Abstract
Beta-turn prediction is useful in protein function studies and experimental design. Although recent approaches using machine-learning techniques such as support vector machine (SVM), neural networks, and K nearest neighbor have achieved good results for beta-turn prediction, there is still significant room for improvement. As previous predictors utilized features in a sliding window of 4-20 residues to capture interactions among sequentially neighboring residues, such feature engineering may result in incomplete or biased features and neglect interactions among long-range residues. Deep neural networks provide a new opportunity to address these issues. Here, we proposed a deep dense inception network (DeepDIN) for beta-turn prediction, which takes advantage of the state-of-the-art deep neural network design of dense networks and inception networks. A test on a recent BT6376 benchmark data set shows that DeepDIN outperformed the previous best tool BetaTPred3 significantly in both the overall prediction accuracy and the nine-type beta-turn classification accuracy. A tool, called MUFold-BetaTurn, was developed, which is the first beta-turn prediction tool utilizing deep neural networks. The tool can be downloaded at http://dslsrv8.cs.missouri.edu/~cf797/MUFoldBetaTurn/download.html.
Affiliation(s)
- Chao Fang
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri
- Yi Shang
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri
- Dong Xu
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri; Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, Missouri
2
Singh H, Singh S, Raghava GPS. In silico platform for predicting and initiating β-turns in a protein at desired locations. Proteins 2015; 83:910-21. [DOI: 10.1002/prot.24783] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 11/09/2014] [Revised: 02/09/2015] [Accepted: 02/14/2015] [Indexed: 11/09/2022]
Affiliation(s)
- Harinder Singh
- Bioinformatics Center, Institute of Microbial Technology; Chandigarh India
- Sandeep Singh
- Bioinformatics Center, Institute of Microbial Technology; Chandigarh India
3
Wang CC, Lai WC, Chuang WJ. Type I and II β-turns prediction using NMR chemical shifts. JOURNAL OF BIOMOLECULAR NMR 2014; 59:175-184. [PMID: 24838372 DOI: 10.1007/s10858-014-9837-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 02/26/2014] [Accepted: 05/02/2014] [Indexed: 06/03/2023]
Abstract
A method for predicting type I and II β-turns using nuclear magnetic resonance (NMR) chemical shifts is proposed. Isolated β-turn chemical-shift data were collected from 1,798 protein chains. One-dimensional statistical analyses of the chemical-shift data for three classes of β-turn (types I, II, and VIII) showed different distributions at the four positions (i) to (i + 3). Considering the central two residues of type I β-turns, the mean values of the Cο, Cα, H(N), and N(H) chemical shifts were generally (i + 1) > (i + 2). The mean values of the Cβ and Hα chemical shifts were (i + 1) < (i + 2). The distributions of the central two residues in type II and VIII β-turns were also distinguishable by trends in chemical-shift values. Two-dimensional cluster analyses of the chemical-shift data show the positional distributions more clearly. Based on these position-dependent chemical-shift propensities, rules were derived using scoring matrices for four consecutive residues to predict type I and II β-turns. The proposed method achieves overall prediction accuracies of 83.2% and 84.2%, with Matthews correlation coefficient values of 0.317 and 0.632, for type I and II β-turns, respectively, indicating higher accuracy for type II turn prediction. The results show that it is feasible to use NMR chemical shifts to predict β-turn types in proteins. The proposed method can be incorporated into other chemical-shift-based protein secondary structure prediction methods.
Affiliation(s)
- Ching-Cheng Wang
- Institute of Manufacturing Information and Systems, National Cheng Kung University College of Electrical Engineering and Computer Science, Tainan, 701, Taiwan
4
WXG100 protein superfamily consists of three subfamilies and exhibits an α-helical C-terminal conserved residue pattern. PLoS One 2014; 9:e89313. [PMID: 24586681 PMCID: PMC3935865 DOI: 10.1371/journal.pone.0089313] [Citation(s) in RCA: 75] [Impact Index Per Article: 7.5] [Received: 05/14/2013] [Accepted: 01/21/2014] [Indexed: 11/20/2022]
Abstract
Members of the WXG100 protein superfamily form homo- or heterodimeric complexes. The most studied proteins among them are the secreted T-cell antigens CFP-10 (10 kDa culture filtrate protein, EsxB) and ESAT-6 (6 kDa early secreted antigen target, EsxA) from Mycobacterium tuberculosis. They are encoded on an operon within a gene cluster, named as ESX-1, that encodes for the Type VII secretion system (T7SS). WXG100 proteins are secreted in a full-length form and it is known that they adopt a four-helix bundle structure. In the current work we discuss the evolutionary relationship between the homo- and heterodimeric WXG100 proteins, the basis of the oligomeric state and the key structural features of the conserved sequence pattern of WXG100 proteins. We performed an iterative bioinformatics analysis of the WXG100 protein superfamily and correlated this with the atomic structures of the representative WXG100 proteins. We find, firstly, that the WXG100 protein superfamily consists of three subfamilies: CFP-10-, ESAT-6- and sagEsxA-like proteins (EsxA proteins similar to that of Streptococcus agalactiae). Secondly, that the heterodimeric complexes probably evolved from a homodimeric precursor. Thirdly, that the genes of hetero-dimeric WXG100 proteins are always encoded in bi-cistronic operons and finally, by combining the sequence alignments with the X-ray data we identify a conserved C-terminal sequence pattern. The side chains of these conserved residues decorate the same side of the C-terminal α-helix and therefore form a distinct surface. Our results lead to a putatively extended T7SS secretion signal which combines two reported T7SS recognition characteristics: Firstly that the T7SS secretion signal is localized at the C-terminus of T7SS substrates and secondly that the conserved residues YxxxD/E are essential for T7SS activity. 
Furthermore, we propose that the specific α-helical surface formed by the conserved sequence pattern including YxxxD/E motif is a key component of T7SS-substrate recognition.
5
Abstract
BACKGROUND β-turns are a secondary structure type that plays an essential role in molecular recognition, protein folding, and stability. They are the most common type of non-repetitive structure, since 25% of the amino acids in protein structures are located in them. Their prediction is considered one of the crucial problems in bioinformatics and molecular biology, and can provide valuable insights and inputs for fold recognition and drug design. RESULTS We propose an approach that combines support vector machines (SVMs) and logistic regression (LR) in a hybrid prediction method, which we call H-SVM-LR, to predict β-turns in proteins. Fractional polynomials are used for LR modeling. We utilize position-specific scoring matrices (PSSMs) and predicted secondary structure (PSS) as features. Our simulation studies show that H-SVM-LR achieves Qtotal of 82.87%, 82.84%, and 82.32% on the BT426, BT547, and BT823 datasets, respectively. These values are the highest among β-turn prediction methods that are based on PSSMs and secondary structure information. H-SVM-LR also achieves favorable performance in predicting β-turns as measured by the Matthews correlation coefficient (MCC) on these datasets. Furthermore, H-SVM-LR shows good performance when shape strings are considered as additional features. CONCLUSIONS In this paper, we present a comprehensive approach for β-turn prediction. Experiments show that our proposed approach achieves better performance than competing prediction methods.
6
Song Q, Li T, Cong P, Sun J, Li D, Tang S. Predicting turns in proteins with a unified model. PLoS One 2012; 7:e48389. [PMID: 23144872 PMCID: PMC3492357 DOI: 10.1371/journal.pone.0048389] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 06/22/2012] [Accepted: 09/24/2012] [Indexed: 11/18/2022]
Abstract
MOTIVATION Turns are a critical element of protein structure; they play a crucial role in loops, folds, and interactions. Current methods are well developed for predicting individual turn types, including α-turns, β-turns, and γ-turns. However, for further protein structure and function prediction it is necessary to develop a unified model that can accurately predict all types of turns simultaneously. RESULTS In this study, we present a novel approach, TurnP, which offers the ability to investigate all the turns in a protein based on a unified model. The main characteristics of TurnP are: (i) using newly exploited features of structural evolution information (secondary structure and shape string of protein) based on structure homologies, (ii) considering all types of turns in a unified model, and (iii) practical capability of accurate prediction of all turns simultaneously for a query. TurnP utilizes predicted secondary structures and predicted shape strings, both predicted with improved accuracy by methods developed by our group. Sequence and structural evolution features, namely the sequence profile, the secondary structure profile, and the shape string profile, are then generated by sequence and structure alignment. When TurnP was validated on a non-redundant dataset (4,107 entries) by five-fold cross-validation, we achieved an accuracy of 88.8% and a sensitivity of 71.8%, which exceeded most state-of-the-art predictors of individual turn types. Newly determined sequences and the EVA and CASP9 datasets were used as independent tests; the results confirmed the good performance of TurnP for practical applications.
Affiliation(s)
- Qi Song
- Department of Chemistry, Tongji University, Shanghai, China
- Tonghua Li
- Department of Chemistry, Tongji University, Shanghai, China
- Peisheng Cong
- Department of Chemistry, Tongji University, Shanghai, China
- Jiangming Sun
- Department of Chemistry, Tongji University, Shanghai, China
- Dapeng Li
- Department of Chemistry, Tongji University, Shanghai, China
- Shengnan Tang
- Department of Chemistry, Tongji University, Shanghai, China
7
Using Homology Information From PDB to Improve The Accuracy of Protein β-turn Prediction by NetTurnP*. PROG BIOCHEM BIOPHYS 2012. [DOI: 10.3724/sp.j.1206.2011.00370] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]
8
Koch O. Advances in the Prediction of Turn Structures in Peptides and Proteins. Mol Inform 2012; 31:624-30. [PMID: 27477811 DOI: 10.1002/minf.201200021] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 03/01/2012] [Accepted: 05/28/2012] [Indexed: 11/07/2022]
Abstract
Turns are essential for protein structure as they allow the polypeptide chain to fold back upon itself. They also occur within protein binding sites, at protein-protein interfaces and in small bioactive peptides, where they can play a crucial role in molecular recognition. Turn structures are an important class of protein secondary structure, although relatively little attention is paid to them compared with helices and β-sheets. Protein structure prediction, functional analysis of proteins and peptides, and computer-aided drug design could all benefit from turn structures accurately predicted from the amino acid sequence. Here, recent advances in turn structure prediction and the underlying turn classification are discussed together with their applications.
Affiliation(s)
- Oliver Koch
- Intervet Innovation GmbH, Molecular Discovery Sciences, Zur Propstei, 55270 Schwabenheim, Germany; phone: +49 (6130) 948 396; fax: +49 (6130) 948 517; Molisa GmbH, Brenneckestrasse 20, 39118 Magdeburg, Germany
9
Shen Y, Bax A. Identification of helix capping and β-turn motifs from NMR chemical shifts. JOURNAL OF BIOMOLECULAR NMR 2012; 52:211-32. [PMID: 22314702 PMCID: PMC3357447 DOI: 10.1007/s10858-012-9602-0] [Citation(s) in RCA: 68] [Impact Index Per Article: 5.7] [Received: 11/20/2011] [Accepted: 01/02/2012] [Indexed: 05/11/2023]
Abstract
We present an empirical method for identification of distinct structural motifs in proteins on the basis of experimentally determined backbone and ¹³Cβ chemical shifts. Elements identified include the N-terminal and C-terminal helix capping motifs and five types of β-turns: I, II, I', II' and VIII. Using a database of proteins of known structure, the NMR chemical shifts, together with the PDB-extracted amino acid preference of the helix capping and β-turn motifs are used as input data for training an artificial neural network algorithm, which outputs the statistical probability of finding each motif at any given position in the protein. The trained neural networks, contained in the MICS (motif identification from chemical shifts) program, also provide a confidence level for each of their predictions, and values ranging from ca. 0.7-0.9 for the Matthews correlation coefficient of its predictions far exceed those attainable by sequence analysis. MICS is anticipated to be useful both in the conventional NMR structure determination process and for enhancing on-going efforts to determine protein structures solely on the basis of chemical shift information, where it can aid in identifying protein database fragments suitable for use in building such structures.
Affiliation(s)
- Yang Shen
- Laboratory of Chemical Physics, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, MD 20892-0520, USA
10
Shi X, Hu X, Li S, Liu X. Prediction of β-turn types in protein by using composite vector. J Theor Biol 2011; 286:24-30. [PMID: 21781975 DOI: 10.1016/j.jtbi.2011.07.001] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 01/29/2011] [Revised: 06/23/2011] [Accepted: 07/05/2011] [Indexed: 11/29/2022]
Abstract
Protein secondary structure prediction is an intermediate step in the overall process of tertiary structure prediction. β-turns are important components of the secondary structure of a protein. Development of an accurate method for predicting β-turn types would be helpful for predicting the overall tertiary structure of proteins. In this work, we constructed a database of 2805 protein chains. Our work improved the previous input parameters and used the support vector machine algorithm to predict the β-turn types; we obtained overall prediction accuracies of 98.1%, 96.0%, 96.1%, 98.7%, 99.1%, 86.8%, 99.2% and 73.2%, with Matthews Correlation Coefficient values of 0.398, 0.460, 0.043, 0.463, 0.355, 0.172, 0.109 and 0.247, for types I, II, VIII, I', II', IV, VI and non-β-turn, respectively. In addition, we used the same method to predict the β-turn types in three databases of 426, 547 and 823 protein chains and found that our prediction results were better than those of other predictors.
Affiliation(s)
- Xiaobo Shi
- College of Sciences, Inner Mongolia University of Technology, Hohhot 010051, PR China
11
Tang Z, Li T, Liu R, Xiong W, Sun J, Zhu Y, Chen G. Improving the performance of β-turn prediction using predicted shape strings and a two-layer support vector machine model. BMC Bioinformatics 2011; 12:283. [PMID: 21749732 PMCID: PMC3155507 DOI: 10.1186/1471-2105-12-283] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.2] [Received: 12/30/2010] [Accepted: 07/13/2011] [Indexed: 11/10/2022]
Abstract
BACKGROUND The β-turn is a secondary protein structure type that plays an important role in protein configuration and function. Development of accurate prediction methods to identify β-turns in protein sequences is valuable. Several methods for β-turn prediction have been developed; however, the prediction quality is still a challenge and there is substantial room for improvement. Innovations of the proposed method focus on discovering effective features, and constructing a new architectural model. RESULTS We utilized predicted secondary structures, predicted shape strings and the position-specific scoring matrix (PSSM) as input features, and proposed a novel two-layer model to enhance the prediction. We achieved the highest values according to four evaluation measures, i.e. Q(total) = 87.2%, MCC = 0.66, Q(observed) = 75.9%, and Q(predicted) = 73.8% on the BT426 dataset. The results show that our proposed two-layer model discriminates better between β-turns and non-β-turns than the single model due to obtaining higher Q(predicted). Moreover, the predicted shape strings based on the structural alignment approach greatly improve the performance, and the same improvements were observed on BT547 and BT823 datasets as well. CONCLUSION In this article, we present a comprehensive method for the prediction of β-turns. Experiments show that the proposed method constitutes a great improvement over the competing prediction methods.
Affiliation(s)
- Zehui Tang
- Department of Chemistry, Tongji University, Shanghai, 200092, China
- Tonghua Li
- Department of Chemistry, Tongji University, Shanghai, 200092, China
- Rida Liu
- Department of Chemistry, Tongji University, Shanghai, 200092, China
- Wenwei Xiong
- Department of Chemistry, Tongji University, Shanghai, 200092, China
- Jiangming Sun
- Department of Chemistry, Tongji University, Shanghai, 200092, China
- Yaojuan Zhu
- Department of Chemistry, Tongji University, Shanghai, 200092, China
- Guanyan Chen
- Department of Chemistry, Tongji University, Shanghai, 200092, China
12
Petersen B, Lundegaard C, Petersen TN. NetTurnP--neural network prediction of beta-turns by use of evolutionary information and predicted protein sequence features. PLoS One 2010; 5:e15079. [PMID: 21152409 PMCID: PMC2994801 DOI: 10.1371/journal.pone.0015079] [Citation(s) in RCA: 78] [Impact Index Per Article: 5.6] [Received: 08/02/2010] [Accepted: 10/19/2010] [Indexed: 11/30/2022]
Abstract
β-turns are the most common type of non-repetitive structures, and constitute on average 25% of the amino acids in proteins. The formation of β-turns plays an important role in protein folding, protein stability and molecular recognition processes. In this work we present the neural network method NetTurnP, for prediction of two-class β-turns and prediction of the individual β-turn types, by use of evolutionary information and predicted protein sequence features. It has been evaluated against a commonly used dataset BT426, and achieves a Matthews correlation coefficient of 0.50, which is the highest reported performance on a two-class prediction of β-turn and not-β-turn. Furthermore NetTurnP shows improved performance on some of the specific β-turn types. In the present work, neural network methods have been trained to predict β-turn or not and individual β-turn types from the primary amino acid sequence. The individual β-turn types I, I', II, II', VIII, VIa1, VIa2, VIba and IV have been predicted based on classifications by PROMOTIF, and the two-class prediction of β-turn or not is a superset comprised of all β-turn types. The performance is evaluated using a golden set of non-homologous sequences known as BT426. Our two-class prediction method achieves a performance of: MCC = 0.50, Qtotal = 82.1%, sensitivity = 75.6%, PPV = 68.8% and AUC = 0.864. We have compared our performance to eleven other prediction methods that obtain Matthews correlation coefficients in the range of 0.17 – 0.47. For the type specific β-turn predictions, only type I and II can be predicted with reasonable Matthews correlation coefficients, where we obtain performance values of 0.36 and 0.31, respectively. Conclusion The NetTurnP method has been implemented as a webserver, which is freely available at http://www.cbs.dtu.dk/services/NetTurnP/. NetTurnP is the only available webserver that allows submission of multiple sequences.
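The two-class measures quoted in these abstracts (Qtotal, sensitivity or Qobserved, PPV or Qpredicted, and MCC) all derive from the same four confusion-matrix counts. The following is a minimal sketch using the standard textbook definitions; the function name and the example counts are illustrative assumptions and are not taken from any of the cited papers.

```python
import math


def two_class_metrics(tp, tn, fp, fn):
    """Standard two-class measures for turn / non-turn residue predictions.

    tp, tn, fp, fn are residue counts (true/false positives/negatives).
    Names and layout are illustrative, not any cited tool's API.
    """
    total = tp + tn + fp + fn
    # Qtotal: percentage of all residues classified correctly.
    q_total = 100.0 * (tp + tn) / total
    # Sensitivity (Qobserved): percentage of observed turns that were predicted.
    sensitivity = 100.0 * tp / (tp + fn) if (tp + fn) else 0.0
    # PPV (Qpredicted): percentage of predicted turns that are real turns.
    ppv = 100.0 * tp / (tp + fp) if (tp + fp) else 0.0
    # Matthews correlation coefficient; defined as 0 when the denominator vanishes.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"Qtotal": q_total, "sensitivity": sensitivity, "PPV": ppv, "MCC": mcc}


if __name__ == "__main__":
    # Hypothetical counts chosen only to exercise the function.
    print(two_class_metrics(tp=40, tn=40, fp=10, fn=10))
```

Because β-turn residues are a minority class, Qtotal can look high even when few turns are found, which is why the abstracts report MCC alongside it.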
Affiliation(s)
- Bent Petersen
- Department of Systems Biology, Center for Biological Sequence Analysis (CBS), Technical University of Denmark, Lyngby, Denmark
- Claus Lundegaard
- Department of Systems Biology, Center for Biological Sequence Analysis (CBS), Technical University of Denmark, Lyngby, Denmark
- Thomas Nordahl Petersen
- Department of Systems Biology, Center for Biological Sequence Analysis (CBS), Technical University of Denmark, Lyngby, Denmark
13
Abstract
With single-particle electron cryomicroscopy (cryo-EM), it is possible to visualize large, macromolecular assemblies in near-native states. Although subnanometer resolutions have been routinely achieved for many specimens, state of the art cryo-EM has pushed to near-atomic (3.3-4.6 Å) resolutions. At these resolutions, it is now possible to construct reliable atomic models directly from the cryo-EM density map. In this study, we describe our recently developed protocols for performing the three-dimensional reconstruction and modeling of Mm-cpn, a group II chaperonin, determined to 4.3 Å resolution. This protocol, utilizing the software tools EMAN, Gorgon and Coot, can be adapted for use with nearly all specimens imaged with cryo-EM that target beyond 5 Å resolution. Additionally, the feature recognition and computational modeling tools can be applied to any near-atomic resolution density maps, including those from X-ray crystallography.
14
Kountouris P, Hirst JD. Predicting beta-turns and their types using predicted backbone dihedral angles and secondary structures. BMC Bioinformatics 2010; 11:407. [PMID: 20673368 PMCID: PMC2920885 DOI: 10.1186/1471-2105-11-407] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.3] [Received: 05/12/2010] [Accepted: 07/31/2010] [Indexed: 11/29/2022]
Abstract
Background β-turns are secondary structure elements usually classified as coil. Their prediction is important, because of their role in protein folding and their frequent occurrence in protein chains. Results We have developed a novel method that predicts β-turns and their types using information from multiple sequence alignments, predicted secondary structures and, for the first time, predicted dihedral angles. Our method uses support vector machines, a supervised classification technique, and is trained and tested on three established datasets of 426, 547 and 823 protein chains. We achieve a Matthews correlation coefficient of up to 0.49, when predicting the location of β-turns, the highest reported value to date. Moreover, the additional dihedral information improves the prediction of β-turn types I, II, IV, VIII and "non-specific", achieving correlation coefficients up to 0.39, 0.33, 0.27, 0.14 and 0.38, respectively. Our results are more accurate than other methods. Conclusions We have created an accurate predictor of β-turns and their types. Our method, called DEBT, is available online at http://comp.chem.nottingham.ac.uk/debt/.
Affiliation(s)
- Petros Kountouris
- School of Chemistry, University of Nottingham, University Park, Nottingham NG7 2RD, UK
15
Liang G, Zhao W. Using factor analysis scales of generalized amino acid information for prediction and characteristic analysis of β-turns in proteins based on a support vector machine model. Sci China Chem 2010. [DOI: 10.1007/s11426-010-0165-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
16
Abstract
The beta-turn is a secondary protein structure type that plays an important role in protein configuration and function. Here, we introduce an approach to beta-turn prediction that uses the support vector machine (SVM) algorithm combined with predicted secondary structure information. The secondary structure information was obtained using E-SSpred, a new protein secondary structure prediction method. A 7-fold cross-validation based on the benchmark dataset of 426 non-homologous protein chains was used to evaluate the performance of our method. The prediction results broke the 80% Q(total) barrier and achieved Q(total) = 80.9%, MCC = 0.44, and a Q(predicted) 0.9% higher than that of the best competing method. Our results are consistent with the conclusion that beta-turn prediction accuracy can be improved by including secondary structure information.
17
Gamma-turn types prediction in proteins using the two-stage hybrid neural discriminant model. J Theor Biol 2009; 259:517-22. [PMID: 19409396 DOI: 10.1016/j.jtbi.2009.04.016] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 11/11/2008] [Revised: 02/19/2009] [Accepted: 04/21/2009] [Indexed: 11/20/2022]
Abstract
Owing to the modest success of protein secondary structure prediction using various algorithmic and non-algorithmic techniques, similar techniques have been developed for predicting gamma-turns in proteins by Kaur and Raghava [2003. A neural-network based method for prediction of gamma-turns in proteins from multiple sequence alignment. Protein Sci. 12, 923-929]. However, the major limitation of previous methods was their inability to predict gamma-turn types. In a recent investigation we introduced a sequence-based predictor model for predicting gamma-turn types in proteins [Jahandideh, S., Sabet Sarvestani, A., Abdolmaleki, P., Jahandideh, M., Barfeie, M, 2007a. gamma-turn types prediction in proteins using the support vector machines. J. Theor. Biol. 249, 785-790]. In the present work, in order to analyze the effect of sequence and structure on the formation of gamma-turn types and to predict gamma-turn types in proteins, we applied a novel hybrid neural discriminant modeling procedure. This study clarified the efficiency of using statistical model preprocessors to determine the effective parameters. Moreover, the optimal structure of the neural network can be simplified by a preprocessor in the first stage of the hybrid approach, which reduces the time needed for neural network training in the second stage, lowers the probability of overfitting, and yields high precision and reliability.
18
Zheng C, Kurgan L. Prediction of beta-turns at over 80% accuracy based on an ensemble of predicted secondary structures and multiple alignments. BMC Bioinformatics 2008; 9:430. [PMID: 18847492 PMCID: PMC2613158 DOI: 10.1186/1471-2105-9-430] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.4] [Received: 06/23/2008] [Accepted: 10/10/2008] [Indexed: 11/10/2022]
Abstract
Background The β-turn is a secondary protein structure type that plays a significant role in protein folding, stability, and molecular recognition. To date, several methods for prediction of β-turns from protein sequences have been developed, but they are characterized by relatively poor prediction quality. The novelty of the proposed sequence-based β-turn predictor stems from the use of window-based information extracted from four predicted three-state secondary structures, which together with a selected set of position-specific scoring matrix (PSSM) values serve as input to the support vector machine (SVM) predictor. Results We show that (1) all four predicted secondary structures are useful; (2) the most useful information extracted from the predicted secondary structure includes the structure of the predicted residue, secondary structure content in a window around the predicted residue, and features that indicate whether the predicted residue is inside a secondary structure segment; (3) the PSSM values of Asn, Asp, Gly, Ile, Leu, Met, Pro, and Val were among the top-ranked features, which corroborates recent studies. Asn, Asp, Gly, and Pro indicate potential β-turns, while the remaining four amino acids are useful for predicting non-β-turns. Empirical evaluation using three nonredundant datasets shows favorable Qtotal, Qpredicted and MCC values when compared with over a dozen modern competing methods. Our method is the first to break the 80% Qtotal barrier and achieves Qtotal = 80.9%, MCC = 0.47, and Qpredicted higher by over 6% when compared with the second-best method. We use feature selection to reduce the dimensionality of the feature vector used as the input for the proposed prediction method. The applied feature set is smaller by 86, 62 and 37% when compared with the second and two third-best (with respect to MCC) competing methods, respectively. 
Conclusion Experiments show that the proposed method constitutes an improvement over the competing prediction methods. The proposed prediction model can better discriminate between β-turns and non-β-turns due to obtaining lower numbers of false positive predictions. The prediction model and datasets are freely available at .
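The window-based secondary structure features listed in this abstract (content in a window around the residue, and whether the residue lies inside a segment) can be sketched roughly as follows. This is an illustration only: the function names, the three-state alphabet H/E/C, the half-width of 3, and the "both neighbours share the state" definition of being inside a segment are all assumptions, not the authors' actual feature definitions.

```python
def window_ss_content(ss, center, half_width=3):
    """Fraction of H, E and C states in a window around `center`.

    `ss` is a predicted three-state secondary structure string.
    The window is truncated at the chain ends (an assumed convention).
    """
    start = max(0, center - half_width)
    end = min(len(ss), center + half_width + 1)
    window = ss[start:end]
    return {state: window.count(state) / len(window) for state in "HEC"}


def inside_segment(ss, i):
    """True if residue i has the same state as both sequence neighbours.

    Terminal residues are treated as not inside a segment (assumption).
    """
    if i <= 0 or i >= len(ss) - 1:
        return False
    return ss[i - 1] == ss[i] == ss[i + 1]


if __name__ == "__main__":
    predicted = "CCHHHHCCEEC"  # hypothetical predicted secondary structure
    print(window_ss_content(predicted, center=3, half_width=2))
    print(inside_segment(predicted, 3))
```

In a predictor of this kind, such per-residue values would be concatenated with the selected PSSM columns to form the input vector for the classifier.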
Affiliation(s)
- Ce Zheng
- Department of Electrical and Computer Engineering, University of Alberta, Edmonton, AB, Canada.