1
|
PupStruct: Prediction of Pupylated Lysine Residues Using Structural Properties of Amino Acids. Genes (Basel) 2020; 11:genes11121431. [PMID: 33260770 PMCID: PMC7761138 DOI: 10.3390/genes11121431] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 10/22/2020] [Revised: 11/23/2020] [Accepted: 11/23/2020] [Indexed: 12/23/2022] Open
Abstract
Post-translational modification (PTM) is a critical biological reaction that adds to the diversification of the proteome. Among the many known modifications, pupylation has gained attention in the scientific community because of its significant role in regulating biological processes. Traditional experimental detection of pupylation sites is expensive and requires considerable time and resources. Many computational predictors have therefore been developed to address this issue, but their performance is still limited. In this study, we propose another computational method, named PupStruct, which uses the structural information of amino acids with a radial basis function (RBF) kernel Support Vector Machine (SVM) to predict pupylated lysine residues. We compared PupStruct with three state-of-the-art predictors from the literature; PupStruct showed a significant improvement in performance over them on a benchmark dataset, with sensitivity (0.9234), specificity (0.9359), accuracy (0.9296), precision (0.9349), and Matthews correlation coefficient (0.8616).
|
2
|
Huang G, Zheng Y, Wu YQ, Han GS, Yu ZG. An Information Entropy-Based Approach for Computationally Identifying Histone Lysine Butyrylation. Front Genet 2020; 10:1325. [PMID: 32117407 PMCID: PMC7033570 DOI: 10.3389/fgene.2019.01325] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 09/12/2019] [Accepted: 12/05/2019] [Indexed: 12/14/2022] Open
Abstract
Butyrylation plays a crucial role in cellular processes. Owing to technical limitations, identifying histone butyrylation sites on a large scale is challenging. To fill this gap, we propose an approach based on information entropy and machine learning for computationally identifying histone butyrylation sites. The proposed method achieves an area under the receiver operating characteristic (ROC) curve of 0.92 on the training set by 3-fold cross-validation and 0.80 on the testing set in an independent test. Feature analysis implies that amino acid residues downstream and upstream of butyrylation sites exhibit a specific sequence motif to a certain extent. Functional analysis suggests that histone butyrylation is most likely associated with four pathways (systemic lupus erythematosus, alcoholism, viral carcinogenesis and transcriptional misregulation in cancer), is involved in binding with other molecules and in processes of biosynthesis, assembly, arrangement or disassembly, and is located in complexes consisting of DNA, RNA, protein, etc. The proposed method is useful for predicting histone butyrylation sites. The feature and functional analyses improve understanding of histone butyrylation and increase knowledge of the functions of butyrylated histones.
Affiliation(s)
- Guohua Huang: Provincial Key Laboratory of Informational Service for Rural Area of Southwestern Hunan, Shaoyang University, Shaoyang, China
- Yang Zheng: Provincial Key Laboratory of Informational Service for Rural Area of Southwestern Hunan, Shaoyang University, Shaoyang, China
- Yao-Qun Wu: Provincial Key Laboratory of Informational Service for Rural Area of Southwestern Hunan, Shaoyang University, Shaoyang, China; Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education and Hunan Key Laboratory for Computation and Simulation in Science and Engineering, Xiangtan University, Xiangtan, China
- Guo-Sheng Han: Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education and Hunan Key Laboratory for Computation and Simulation in Science and Engineering, Xiangtan University, Xiangtan, China
- Zu-Guo Yu: Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education and Hunan Key Laboratory for Computation and Simulation in Science and Engineering, Xiangtan University, Xiangtan, China; School of Electrical Engineering and Computer Science, Queensland University of Technology, Brisbane, QLD, Australia
|
3
|
Li T, Chen Y, Li T, Jia C. Recognition of Protein Pupylation Sites by Adopting Resampling Approach. Molecules 2018; 23:molecules23123097. [PMID: 30486421 PMCID: PMC6321382 DOI: 10.3390/molecules23123097] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 10/13/2018] [Revised: 11/21/2018] [Accepted: 11/22/2018] [Indexed: 12/28/2022] Open
Abstract
With the in-depth study of posttranslational modification sites, protein ubiquitination has become a key problem in studying the molecular mechanism of posttranslational modification. Pupylation is a widely used process in which a prokaryotic ubiquitin-like protein (Pup) is attached to a substrate through a series of biochemical reactions. However, the experimental methods for identifying pupylation sites are often time-consuming and laborious. This study aims to propose an improved approach for predicting pupylation sites. First, the Pearson correlation coefficient was used to reflect the correlation among different amino acid pairs, calculated from the frequency of each amino acid. Second, the multiple types of features were ranked in descending order and filtered separately by their Pearson correlation coefficient values. Third, to obtain a balanced dataset, the K-means principal component analysis (KPCA) oversampling technique was employed to synthesize new positive samples and a fuzzy undersampling method was employed to reduce the number of negative samples. Finally, the performance of our method was verified by jackknife and 10-fold cross-validation tests. The average results of 10-fold cross-validation showed that the sensitivity (Sn) was 90.53%, specificity (Sp) was 99.8%, accuracy (Acc) was 95.09%, and Matthews correlation coefficient (MCC) was 0.91. Moreover, an independent test dataset was used to further measure performance, and the prediction results achieved an Acc of 83.75% and an MCC of 0.49, which was superior to previous predictors. The better performance and stability of our proposed method show that it is an effective way to predict pupylation sites.
Affiliation(s)
- Tao Li: School of Transportation Management, Dalian Maritime University, Dalian 116026, China; China Waterborne Transport Research Institute, Beijing 100088, China
- Yan Chen: School of Transportation Management, Dalian Maritime University, Dalian 116026, China
- Taoying Li: School of Transportation Management, Dalian Maritime University, Dalian 116026, China
- Cangzhi Jia: College of Science, Dalian Maritime University, Dalian 116026, China
|
4
|
Zhao B, Xue B. Decision-Tree Based Meta-Strategy Improved Accuracy of Disorder Prediction and Identified Novel Disordered Residues Inside Binding Motifs. Int J Mol Sci 2018; 19:E3052. [PMID: 30301243 PMCID: PMC6213717 DOI: 10.3390/ijms19103052] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 08/20/2018] [Revised: 09/24/2018] [Accepted: 10/04/2018] [Indexed: 02/06/2023] Open
Abstract
Using computational techniques to identify intrinsically disordered residues is practical and effective in biological studies. Therefore, designing novel high-accuracy strategies is always preferable when existing strategies have a lot of room for improvement. Among many possibilities, a meta-strategy that integrates the results of multiple individual predictors has been broadly used to improve the overall performance of predictors. Nonetheless, a simple and direct integration of individual predictors may not effectively improve the performance. In this project, dual-threshold two-step significance voting and neural networks were used to integrate the predictive results of four individual predictors, including: DisEMBL, IUPred, VSL2, and ESpritz. The new meta-strategy has improved the prediction performance of intrinsically disordered residues significantly, compared to all four individual predictors and another four recently-designed predictors. The improvement was validated using five-fold cross-validation and in independent test datasets.
Affiliation(s)
- Bi Zhao: Department of Cell Biology, Microbiology and Molecular Biology, School of Natural Sciences and Mathematics, College of Arts and Sciences, University of South Florida, Tampa, FL 33620, USA
- Bin Xue: Department of Cell Biology, Microbiology and Molecular Biology, School of Natural Sciences and Mathematics, College of Arts and Sciences, University of South Florida, Tampa, FL 33620, USA
|
5
|
SVM-SulfoSite: A support vector machine based predictor for sulfenylation sites. Sci Rep 2018; 8:11288. [PMID: 30050050 PMCID: PMC6062547 DOI: 10.1038/s41598-018-29126-x] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 04/23/2018] [Accepted: 07/02/2018] [Indexed: 12/15/2022] Open
Abstract
Protein S-sulfenylation, which results from oxidation of free thiols on cysteine residues, has recently emerged as an important post-translational modification that regulates the structure and function of proteins involved in a variety of physiological and pathological processes. By altering the size and physicochemical properties of modified cysteine residues, sulfenylation can impact the cellular function of proteins in several different ways. Thus, the ability to rapidly and accurately identify putative sulfenylation sites in proteins will provide important insights into redox-dependent regulation of protein function in a variety of cellular contexts. Though bottom-up proteomic approaches, such as tandem mass spectrometry (MS/MS), provide a wealth of information about global changes in the sulfenylation state of proteins, MS/MS-based experiments are often labor-intensive, costly and technically challenging. Therefore, to complement existing proteomic approaches, researchers have developed a series of computational tools to identify putative sulfenylation sites on proteins. However, existing methods often suffer from low accuracy, specificity, and/or sensitivity. In this study, we developed SVM-SulfoSite, a novel sulfenylation prediction tool that uses support vector machines (SVM) to identify key determinants of sulfenylation among five feature classes: binary code, physicochemical properties, k-space amino acid pairs, amino acid composition and high-quality physicochemical indices. Using 10-fold cross-validation, SVM-SulfoSite achieved 95% sensitivity and 83% specificity, with an overall accuracy of 89% and Matthews correlation coefficient (MCC) of 0.79. Likewise, using an independent test set of experimentally identified sulfenylation sites, our method achieved scores of 74%, 62%, 80% and 0.42 for accuracy, sensitivity, specificity and MCC, respectively, with an area under the receiver operating characteristic (ROC) curve of 0.81. Moreover, in side-by-side comparisons, SVM-SulfoSite performed as well as or better than existing sulfenylation prediction tools. Together, these results suggest that our method represents a robust and complementary technique for advanced exploration of protein S-sulfenylation.
|
6
|
Bao W, You ZH, Huang DS. CIPPN: computational identification of protein pupylation sites by using neural network. Oncotarget 2017; 8:108867-108879. [PMID: 29312575 PMCID: PMC5752488 DOI: 10.18632/oncotarget.22335] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 07/14/2017] [Accepted: 09/03/2017] [Indexed: 11/25/2022] Open
Abstract
Recently, experiments revealed pupylation to be a signal for the selective regulation of proteins in several serious human diseases. As one of the most significant post-translational modifications in biology and disease, pupylation can play a key role in regulating the biological processes of various diseases. Effective identification of this type of modification will help in understanding how proteins perform their biological functions and the underlying molecular mechanism, which is the foundation of drug design. Existing algorithms for identifying such modified sites often have defects, such as low accuracy and being time-consuming. In this research, the pupylation site identification model, CIPPN, demonstrates better performance than other existing approaches in this field. The proposed predictor achieves an Acc value of 89.12 and an MCC value of 0.7949 in 10-fold cross-validation tests on the PupDB database (http://cwtung.kmu.edu.tw/pupdb). Notably, the algorithm not only investigates the sequential, structural and evolutionary hallmarks around pupylation sites but also compares the differences in pupylation in terms of the environmental, conservation and functional characterization of substrates. Therefore, the proposed feature description approach and algorithm results should prove useful for further experimental investigation of the identification of this modification.
Affiliation(s)
- Wenzheng Bao: Institute of Machine Learning and Systems Biology, School of Electronics and Information Engineering, Tongji University, Shanghai, China
- Zhu-Hong You: Xinjiang Technical Institutes of Physics and Chemistry, Chinese Academy of Science, Urumqi 830011, China
- De-Shuang Huang: Institute of Machine Learning and Systems Biology, School of Electronics and Information Engineering, Tongji University, Shanghai, China
|
7
|
Gur E, Korman M, Hecht N, Regev O, Schlussel S, Silberberg N, Elharar Y. How to control an intracellular proteolytic system: Coordinated regulatory switches in the mycobacterial Pup-proteasome system. Biochimica et Biophysica Acta - Molecular Cell Research 2017; 1864:2253-2260. [PMID: 28887055 DOI: 10.1016/j.bbamcr.2017.08.012] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 07/19/2017] [Revised: 08/26/2017] [Accepted: 08/31/2017] [Indexed: 10/18/2022]
Abstract
Intracellular proteolysis is critical for the proper functioning of all cells, owing to its involvement in a wide range of processes. Because of the destructive nature of protein degradation, intracellular proteolysis is restricted by control mechanisms at almost every step of the proteolytic process. Understanding the coordination of such mechanisms is a challenging task, especially in systems as complex as the eukaryotic ubiquitin-proteasome system (UPS). In comparison, the bacterial analog of the UPS, the Pup-proteasome system (PPS) is much simpler and, therefore, allows for insight into the control of a proteolytic system. This review integrates available information to present a coherent picture of what is known of PPS regulatory switches and describes how these switches act in concert to enforce regulation at the system level. Finally, open questions regarding PPS regulation are discussed, providing readers with a sense of what lies ahead in the field.
Affiliation(s)
- Eyal Gur: Department of Life Sciences, Ben-Gurion University of the Negev, Beer-Sheva 84105, Israel; The National Institute for Biotechnology in the Negev, Ben-Gurion University of the Negev, Beer-Sheva 84105, Israel
- Maayan Korman: Department of Life Sciences, Ben-Gurion University of the Negev, Beer-Sheva 84105, Israel
- Nir Hecht: Department of Life Sciences, Ben-Gurion University of the Negev, Beer-Sheva 84105, Israel
- Ofir Regev: Department of Life Sciences, Ben-Gurion University of the Negev, Beer-Sheva 84105, Israel
- Shai Schlussel: Department of Life Sciences, Ben-Gurion University of the Negev, Beer-Sheva 84105, Israel
- Nimrod Silberberg: Department of Life Sciences, Ben-Gurion University of the Negev, Beer-Sheva 84105, Israel; The National Institute for Biotechnology in the Negev, Ben-Gurion University of the Negev, Beer-Sheva 84105, Israel
- Yifat Elharar: Department of Life Sciences, Ben-Gurion University of the Negev, Beer-Sheva 84105, Israel
|
8
|
Nan X, Bao L, Zhao X, Zhao X, Sangaiah AK, Wang GG, Ma Z. EPuL: An Enhanced Positive-Unlabeled Learning Algorithm for the Prediction of Pupylation Sites. Molecules 2017; 22:molecules22091463. [PMID: 28872627 PMCID: PMC6151806 DOI: 10.3390/molecules22091463] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.8] [Received: 07/23/2017] [Revised: 08/29/2017] [Accepted: 08/30/2017] [Indexed: 01/20/2023] Open
Abstract
Protein pupylation is a type of post-translational modification that plays a crucial role in the cellular function of bacteria. To gain better insight into the mechanisms underlying pupylation, an initial but important step is to identify pupylation sites. To date, several computational methods have been established for the prediction of pupylation sites; these usually design the negative samples artificially from verified pupylated proteins to train the classifiers. However, if this process is not done properly, it can dramatically affect the performance of the final predictor. In this work, unlike previous computational methods, we propose an enhanced positive-unlabeled learning algorithm (EPuL) for the pupylation site prediction problem, which uses only positive and unlabeled samples. First, we separate the training dataset into the positive dataset and the unlabeled dataset, which contains the remaining non-annotated lysine residues. Then, the EPuL algorithm is used to select an initial set of reliable negatives and then iteratively pick out the non-pupylation sites. The proposed method achieved an accuracy of 90.24%, an area under the curve (AUC) of 0.93 and an MCC of 0.81 in 10-fold cross-validation. A user-friendly web server for predicting pupylation sites was developed and made freely available at http://59.73.198.144:8080/EPuL.
Affiliation(s)
- Xuanguo Nan: School of Information Science and Technology, Northeast Normal University, Changchun 130117, China
- Lingling Bao: School of Information Science and Technology, Northeast Normal University, Changchun 130117, China
- Xiaosa Zhao: School of Information Science and Technology, Northeast Normal University, Changchun 130117, China
- Xiaowei Zhao: School of Information Science and Technology, Northeast Normal University, Changchun 130117, China
- Arun Kumar Sangaiah: School of Computing Science and Engineering, VIT University, Vellore 632014, Tamil Nadu, India
- Gai-Ge Wang: School of Computer Science and Technology, Jiangsu Normal University, Xuzhou 221116, China
- Zhiqiang Ma: School of Information Science and Technology, Northeast Normal University, Changchun 130117, China
|
9
|
Akhter Y, Thakur S. Targets of ubiquitin like system in mycobacteria and related actinobacterial species. Microbiol Res 2017; 204:9-29. [PMID: 28870295 DOI: 10.1016/j.micres.2017.07.002] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 05/25/2017] [Revised: 06/22/2017] [Accepted: 07/05/2017] [Indexed: 12/22/2022]
Abstract
Protein turnover and recycling is a prerequisite in all living organisms for maintaining normal cellular physiology. Many bacteria are proteasome deficient, but they possess typical protease enzymes for carrying out protein turnover. However, several groups of actinobacteria, such as mycobacteria, harbor both a proteasome and proteases. In these bacteria, proteins targeted for turnover undergo a post-translational modification referred to as pupylation, in which a small protein, Pup (prokaryotic ubiquitin-like protein), is attached to specific lysine residues of the target proteins, after which those proteins undergo proteasomal degradation. Thus, Pup serves as a degradation signal that helps direct proteins toward the bacterial proteasome for turnover. Although the Pup-proteasome system has a multifaceted role in environmental stresses, pathogenicity and regulation of cellular signaling, the fate of all types of pupylation on proteins, such as mono- and polypupylation, is still not completely understood. In this review, we present the mechanisms involved in the activation and conjugation of Pup to the target proteins, describing the structural sketch of pupylation and the fundamental differences between the eukaryotic ubiquitin-proteasome and bacterial Pup-proteasome systems. We also present a concise classification and catalog of the complete battery of experimentally identified Pup substrates from various species of actinobacteria.
Affiliation(s)
- Yusuf Akhter: School of Life Sciences, Central University of Himachal Pradesh, Shahpur, District-Kangra, Himachal Pradesh, 176206, India
- Shweta Thakur: School of Life Sciences, Central University of Himachal Pradesh, Shahpur, District-Kangra, Himachal Pradesh, 176206, India
|
10
|
Mal-Lys: prediction of lysine malonylation sites in proteins integrated sequence-based features with mRMR feature selection. Sci Rep 2016; 6:38318. [PMID: 27910954 PMCID: PMC5133563 DOI: 10.1038/srep38318] [Citation(s) in RCA: 41] [Impact Index Per Article: 4.6] [Received: 08/18/2015] [Accepted: 11/08/2016] [Indexed: 12/25/2022] Open
Abstract
Lysine malonylation is an important post-translational modification (PTM) in proteins and has been shown to be associated with diseases. However, identifying malonyllysine sites remains a great challenge because the experiments are labor-intensive and time-consuming. In view of this situation, the establishment of a useful computational method and the development of an efficient predictor are highly desired. In this study, we propose a predictor, Mal-Lys, which incorporates residue sequence order information, position-specific amino acid propensity and physicochemical properties. The minimum Redundancy Maximum Relevance (mRMR) feature selection method was used to select the optimal features from the full feature set. With leave-one-out validation, the area under the curve (AUC) was 0.8143, and 6-, 8- and 10-fold cross-validations gave similar AUC values, which shows the robustness of Mal-Lys. The predictor also showed satisfactory performance on experimental data from the UniProt database. A user-friendly web server for Mal-Lys is accessible at http://app.aporc.org/Mal-Lys/.
|
11
|
Xu Y, Ding J, Wu LY. iSulf-Cys: Prediction of S-sulfenylation Sites in Proteins with Physicochemical Properties of Amino Acids. PLoS One 2016; 11:e0154237. [PMID: 27104833 PMCID: PMC4841585 DOI: 10.1371/journal.pone.0154237] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.7] [Received: 12/28/2015] [Accepted: 04/10/2016] [Indexed: 02/07/2023] Open
Abstract
Cysteine S-sulfenylation is an important post-translational modification (PTM) in proteins and provides redox regulation of protein functions. Bioinformatics and structural analyses have indicated that S-sulfenylation could impact many biological and functional categories and has distinct structural features. However, the major limitations of methods for identifying cysteine S-sulfenylation are that they are expensive and low-throughput. In view of this situation, the establishment of a useful computational method and the development of an efficient predictor are highly desired. In this study, we propose a predictor, iSulf-Cys, which incorporates 14 kinds of physicochemical properties of amino acids. With 10-fold cross-validation repeated 20 times on the training dataset, the area under the curve (AUC) was 0.7155 ± 0.0085 and the MCC was 0.3122 ± 0.0144. iSulf-Cys also showed satisfactory performance on the independent testing dataset, with an AUC of 0.7343 and an MCC of 0.3315. Features constructed from physicochemical properties and position were carefully analyzed. A user-friendly web server for iSulf-Cys is accessible at http://app.aporc.org/iSulf-Cys/.
Affiliation(s)
- Yan Xu: Department of Information and Computer Science, University of Science and Technology Beijing, Beijing 100083, China
- Jun Ding: Department of Information and Computer Science, University of Science and Technology Beijing, Beijing 100083, China
- Ling-Yun Wu: Institute of Applied Mathematics, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China
|
12
|
Prediction of sumoylation sites in proteins using linear discriminant analysis. Gene 2016; 576:99-104. [DOI: 10.1016/j.gene.2015.09.072] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 04/09/2015] [Revised: 08/24/2015] [Accepted: 09/28/2015] [Indexed: 01/05/2023]
|
13
|
JPPRED: Prediction of Types of J-Proteins from Imbalanced Data Using an Ensemble Learning Method. BioMed Research International 2015; 2015:705156. [PMID: 26587542 PMCID: PMC4637456 DOI: 10.1155/2015/705156] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 08/03/2015] [Revised: 10/05/2015] [Accepted: 10/11/2015] [Indexed: 11/17/2022]
Abstract
Different types of J-proteins perform distinct functions in chaperone processes and diseases development. Accurate identification of types of J-proteins will provide significant clues to reveal the mechanism of J-proteins and contribute to developing drugs for diseases. In this study, an ensemble predictor called JPPRED for J-protein prediction is proposed with hybrid features, including split amino acid composition (SAAC), pseudo amino acid composition (PseAAC), and position specific scoring matrix (PSSM). To deal with the imbalanced benchmark dataset, the synthetic minority oversampling technique (SMOTE) and undersampling technique are applied. The average sensitivity of JPPRED based on above-mentioned individual feature spaces lies in the range of 0.744–0.851, indicating the discriminative power of these features. In addition, JPPRED yields the highest average sensitivity of 0.875 using the hybrid feature spaces of SAAC, PseAAC, and PSSM. Compared to individual base classifiers, JPPRED obtains more balanced and better performance for each type of J-proteins. To evaluate the prediction performance objectively, JPPRED is compared with previous study. Encouragingly, JPPRED obtains balanced performance for each type of J-proteins, which is significantly superior to that of the existing method. It is anticipated that JPPRED can be a potential candidate for J-protein prediction.
|
14
|
Hasan MM, Zhou Y, Lu X, Li J, Song J, Zhang Z. Computational Identification of Protein Pupylation Sites by Using Profile-Based Composition of k-Spaced Amino Acid Pairs. PLoS One 2015; 10:e0129635. [PMID: 26080082 PMCID: PMC4469302 DOI: 10.1371/journal.pone.0129635] [Citation(s) in RCA: 53] [Impact Index Per Article: 5.3] [Received: 02/05/2015] [Accepted: 05/10/2015] [Indexed: 11/20/2022] Open
Abstract
Prokaryotic proteins are regulated by pupylation, a type of post-translational modification that contributes to cellular function in bacterial organisms. In the pupylation process, tagging with the prokaryotic ubiquitin-like protein (Pup) is functionally analogous to ubiquitination, marking target proteins for proteasomal degradation. To date, several experimental methods have been developed to identify pupylated proteins and their pupylation sites, but these experimental methods are generally laborious and costly. Therefore, computational methods that can accurately predict potential pupylation sites based on protein sequence information are highly desirable. In this paper, a novel predictor termed pbPUP has been developed for accurate prediction of pupylation sites. In particular, a sophisticated sequence encoding scheme [i.e. the profile-based composition of k-spaced amino acid pairs (pbCKSAAP)] is used to represent the sequence patterns and evolutionary information of the sequence fragments surrounding pupylation sites. Then, a Support Vector Machine (SVM) classifier is trained using the pbCKSAAP encoding scheme. The final pbPUP predictor achieves an AUC value of 0.849 in 10-fold cross-validation tests and outperforms other existing predictors on a comprehensive independent test dataset. The proposed method is anticipated to be a helpful computational resource for the prediction of pupylation sites. The web server and curated datasets in this study are freely available at http://protein.cau.edu.cn/pbPUP/.
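To make the encoding idea concrete, the sketch below computes the plain, sequence-only composition of k-spaced amino acid pairs (CKSAAP) for a fragment. It is not the profile-based pbCKSAAP variant named in the abstract, which also draws on evolutionary profile information; the function name `cksaap`, the default `k_max` and the normalisation by window count are assumptions made for illustration.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered residue pairs, in a fixed order shared by every k.
PAIRS = [a + b for a, b in product(AMINO_ACIDS, repeat=2)]


def cksaap(fragment, k_max=2):
    """Sequence-only CKSAAP features for k = 0..k_max.

    For each gap k, counts pairs of residues separated by exactly k
    positions and divides by the number of such windows in the fragment.
    Pairs containing non-standard symbols (e.g. padding) are ignored.
    Returns a flat list of 400 * (k_max + 1) frequencies.
    """
    features = []
    for k in range(k_max + 1):
        counts = dict.fromkeys(PAIRS, 0)
        n_windows = len(fragment) - k - 1
        for i in range(max(n_windows, 0)):
            pair = fragment[i] + fragment[i + k + 1]
            if pair in counts:
                counts[pair] += 1
        total = max(n_windows, 1)  # avoid division by zero on short input
        features.extend(counts[p] / total for p in PAIRS)
    return features


# Toy fragment, for illustration only.
vector = cksaap("AKAKA", k_max=1)
```

A vector of this kind could then be passed to any standard classifier, such as the SVM mentioned in the abstract.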
Affiliation(s)
- Md. Mehedi Hasan: State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing, 100193, China
- Yuan Zhou: State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing, 100193, China
- Xiaotian Lu: State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing, 100193, China
- Jinyan Li: Advanced Analytics Institute and Centre for Health Technologies, University of Technology, Sydney, 81 Broadway, NSW 2007, Australia
- Jiangning Song: National Engineering Laboratory for Industrial Enzymes and Key Laboratory of Systems Microbial Biotechnology, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin, 300308, China; Monash Bioinformatics Platform and Department of Biochemistry and Molecular Biology, Faculty of Medicine, Monash University, Melbourne, VIC 3800, Australia
- Ziding Zhang: State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing, 100193, China
|
15
|
Küberl A, Fränzel B, Eggeling L, Polen T, Wolters DA, Bott M. Pupylated proteins in Corynebacterium glutamicum revealed by MudPIT analysis. Proteomics 2014; 14:1531-42. [PMID: 24737727 DOI: 10.1002/pmic.201300531] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.6] [Received: 11/28/2013] [Revised: 03/23/2014] [Accepted: 04/07/2014] [Indexed: 12/31/2022]
Abstract
In a manner similar to ubiquitin, the prokaryotic ubiquitin-like protein (Pup) has been shown to target proteins for degradation via the proteasome in mycobacteria. However, not all actinobacteria possessing the Pup protein also contain a proteasome. In this study, we set out to study pupylation in the proteasome-lacking non-pathogenic model organism Corynebacterium glutamicum. A defined pup deletion mutant of C. glutamicum ATCC 13032 grew aerobically as the parent strain in standard glucose minimal medium, indicating that pupylation is dispensable under these conditions. After expression of a Pup derivative carrying an aminoterminal polyhistidine tag in the Δpup mutant and Ni(2+)-chelate affinity chromatography, pupylated proteins were isolated. Multidimensional protein identification technology (MudPIT) and MALDI-TOF-MS/MS of the elution fraction unraveled 55 proteins being pupylated in C. glutamicum and 66 pupylation sites. Similar to mycobacteria, the majority of pupylated proteins are involved in metabolism or translation. Our results define the first pupylome of an actinobacterial species lacking a proteasome, confirming that other fates besides proteasomal degradation are possible for pupylated proteins.
Affiliation(s)
- Andreas Küberl: IBG-1: Biotechnology, Institute of Bio- and Geosciences, Forschungszentrum Jülich, Jülich, Germany
|