1
Bashyal A, Brodbelt JS. Uncommon posttranslational modifications in proteomics: ADP-ribosylation, tyrosine nitration, and tyrosine sulfation. MASS SPECTROMETRY REVIEWS 2024; 43:289-326. [PMID: 36165040 PMCID: PMC10040477 DOI: 10.1002/mas.21811] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 05/30/2022] [Revised: 09/06/2022] [Accepted: 09/07/2022] [Indexed: 06/16/2023]
Abstract
Posttranslational modifications (PTMs) are covalent modifications of proteins that modulate the structure and functions of proteins and regulate biological processes. The development of various mass spectrometry-based proteomics workflows has facilitated the identification of hundreds of PTMs and aided the understanding of biological significance in a high throughput manner. Improvements in sample preparation and PTM enrichment techniques, instrumentation for liquid chromatography-tandem mass spectrometry (LC-MS/MS), and advanced data analysis tools enhance the specificity and sensitivity of PTM identification. Highly prevalent PTMs like phosphorylation, glycosylation, acetylation, ubiquitinylation, and methylation are extensively studied. However, the functions and impact of less abundant PTMs are not as well understood and underscore the need for analytical methods that aim to characterize these PTMs. This review focuses on the advancement and analytical challenges associated with the characterization of three less common but biologically relevant PTMs, specifically, adenosine diphosphate-ribosylation, tyrosine sulfation, and tyrosine nitration. The advantages and disadvantages of various enrichment, separation, and MS/MS techniques utilized to identify and localize these PTMs are described.
Affiliation(s)
- Aarti Bashyal
  - Department of Chemistry, The University of Texas at Austin, Austin, Texas, USA
- Jennifer S Brodbelt
  - Department of Chemistry, The University of Texas at Austin, Austin, Texas, USA
2
Kweon HK, Kong AT, Hersberger KE, Huang S, Nesvizhskii AI, Wang Y, Hakansson K, Andrews PC. Sulfoproteomics Workflow with Precursor Ion Accurate Mass Shift Analysis Reveals Novel Tyrosine Sulfoproteins in the Golgi. J Proteome Res 2024; 23:71-83. [PMID: 38112105 PMCID: PMC11218929 DOI: 10.1021/acs.jproteome.3c00323] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/20/2023]
Abstract
Tyrosine sulfation in the Golgi of secreted and membrane proteins is an important post-translational modification (PTM). However, its labile nature has limited analysis by mass spectrometry (MS), a major reason why no sulfoproteome studies have been previously reported. Here, we show that a phosphoproteomics experimental workflow, which includes serial enrichment followed by high resolution, high mass accuracy MS, and tandem MS (MS/MS) analysis, enables sulfopeptide coenrichment and identification via accurate precursor ion mass shift open MSFragger database search. This approach, supported by manual validation, allows the confident identification of sulfotyrosine-containing peptides in the presence of high levels of phosphorylated peptides, thus enabling these two sterically and ionically similar isobaric PTMs to be distinguished and annotated in a single proteomic analysis. We applied this approach to isolated interphase and mitotic rat liver Golgi membranes and identified 67 tyrosine sulfopeptides, corresponding to 26 different proteins. This work discovered 23 new sulfoproteins with functions related to, for example, Ca2+-binding, glycan biosynthesis, and exocytosis. In addition, we report the first preliminary evidence for crosstalk between sulfation and phosphorylation in the Golgi, with implications for functional control.
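The precursor mass-shift idea in this abstract can be illustrated with a minimal sketch. It is not the authors' workflow or MSFragger's API: the function names, the ppm convention, and the 3 ppm default tolerance are hypothetical, and the monoisotopic shifts for +SO3 and +HPO3 are standard values rather than figures from the abstract. The two shifts differ by only about 9.5 mDa, which is why high mass accuracy is needed to tell them apart.

```python
# Monoisotopic mass shifts in Da (standard values, not taken from the abstract).
SULFO_SHIFT = 79.95682    # +SO3 (sulfation)
PHOSPHO_SHIFT = 79.96633  # +HPO3 (phosphorylation)


def ppm_error(observed_shift, theoretical_shift, peptide_mass):
    """Error of an observed shift, in ppm of the modified peptide mass."""
    modified_mass = peptide_mass + theoretical_shift
    return (observed_shift - theoretical_shift) / modified_mass * 1e6


def classify_shift(observed_shift, peptide_mass, tol_ppm=3.0):
    """Label a precursor mass shift (hypothetical helper, assumed tolerance)."""
    sulfo_ok = abs(ppm_error(observed_shift, SULFO_SHIFT, peptide_mass)) <= tol_ppm
    phospho_ok = abs(ppm_error(observed_shift, PHOSPHO_SHIFT, peptide_mass)) <= tol_ppm
    if sulfo_ok and phospho_ok:
        return "ambiguous"   # tolerance window too wide to separate the two
    if sulfo_ok:
        return "sulfo"
    if phospho_ok:
        return "phospho"
    return "unassigned"
```

Under these assumptions, a 1500 Da peptide separates the two shifts at 3 ppm, whereas for a 5000 Da peptide the same tolerance covers both, so larger peptides would need tighter mass accuracy or fragment-level evidence.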
Affiliation(s)
- Hye Kyong Kweon
  - Department of Chemistry, University of Michigan, Ann Arbor, Michigan 48109-1055, United States
  - Department of Biological Chemistry, University of Michigan Medical School, Ann Arbor, Michigan 48109-0600, United States
- Andy T Kong
  - Department of Pathology, University of Michigan, Ann Arbor, Michigan 48109-5602, United States
- Katherine E Hersberger
  - Department of Chemistry, University of Michigan, Ann Arbor, Michigan 48109-1055, United States
- Shijiao Huang
  - Department of Molecular, Cellular, and Developmental Biology, University of Michigan, Ann Arbor, Michigan 48109-1085, United States
- Alexey I Nesvizhskii
  - Department of Pathology, University of Michigan, Ann Arbor, Michigan 48109-5602, United States
  - Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, Michigan 48109-2218, United States
- Yanzhuang Wang
  - Department of Molecular, Cellular, and Developmental Biology, University of Michigan, Ann Arbor, Michigan 48109-1085, United States
- Kristina Hakansson
  - Department of Chemistry, University of Michigan, Ann Arbor, Michigan 48109-1055, United States
- Philip C Andrews
  - Department of Chemistry, University of Michigan, Ann Arbor, Michigan 48109-1055, United States
  - Department of Biological Chemistry, University of Michigan Medical School, Ann Arbor, Michigan 48109-0600, United States
  - Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, Michigan 48109-2218, United States
3
Dai D, Zhu Z, Han H, Xu T, Feng S, Zhang W, Ding F, Zhang R, Zhu J. Enhanced tyrosine sulfation is associated with chronic kidney disease-related atherosclerosis. BMC Biol 2023; 21:151. [PMID: 37424015 DOI: 10.1186/s12915-023-01641-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/29/2022] [Accepted: 06/02/2023] [Indexed: 07/11/2023] Open
Abstract
BACKGROUND: Chronic kidney disease (CKD) accelerates atherosclerosis, but the mechanisms remain unclear. Tyrosine sulfation has been recognized as a key post-translational modification (PTM) in the regulation of various cellular processes, and sulfated adhesion molecules and chemokine receptors have been shown to participate in the pathogenesis of atherosclerosis via enhancement of monocyte/macrophage function. The levels of inorganic sulfate, the essential substrate for the sulfation reaction, are dramatically increased in patients with CKD, which indicates a change of sulfation status in CKD patients. Thus, in the present study, we assessed the sulfation status in CKD patients and examined the impact of sulfation on CKD-related atherosclerosis by targeting tyrosine sulfation function.
RESULTS: PBMCs from individuals with CKD showed higher amounts of total sulfotyrosine and higher protein levels of tyrosylprotein sulfotransferase (TPST) types 1 and 2. The plasma level of O-sulfotyrosine, the metabolic end product of tyrosine sulfation, increased significantly in CKD patients. Plasma O-sulfotyrosine correlated positively with the SYNTAX score of coronary atherosclerosis severity. Mechanistically, more sulfate-positive nucleated cells in peripheral blood and more abundant infiltration of sulfated macrophages in deteriorated vascular plaques were noted in CKD ApoE null mice. Knockout of TPST1 and TPST2 decreased atherosclerosis and peritoneal macrophage adherence and migration under CKD conditions. The sulfation of the chemokine receptors CCR2 and CCR5 was increased in PBMCs from CKD patients.
CONCLUSIONS: CKD is associated with increased sulfation status. Increased sulfation contributes to monocyte/macrophage activation and might be involved in CKD-related atherosclerosis. Inhibition of sulfation may suppress CKD-related atherosclerosis and is worthy of further study.
Affiliation(s)
- Daopeng Dai
  - Department of Vascular & Cardiology, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, 197 Ruijin Road II, Shanghai, 200025, China
  - Institute of Cardiovascular Diseases, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Zhengbin Zhu
  - Department of Vascular & Cardiology, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, 197 Ruijin Road II, Shanghai, 200025, China
- Hui Han
  - Department of Vascular & Cardiology, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, 197 Ruijin Road II, Shanghai, 200025, China
- Tian Xu
  - Department of Nephrology, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Shuo Feng
  - Institute of Cardiovascular Diseases, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Wenli Zhang
  - Department of Vascular & Cardiology, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, 197 Ruijin Road II, Shanghai, 200025, China
- Fenghua Ding
  - Department of Vascular & Cardiology, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, 197 Ruijin Road II, Shanghai, 200025, China
- Ruiyan Zhang
  - Department of Vascular & Cardiology, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, 197 Ruijin Road II, Shanghai, 200025, China
  - Institute of Cardiovascular Diseases, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Jinzhou Zhu
  - Department of Vascular & Cardiology, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, 197 Ruijin Road II, Shanghai, 200025, China
  - Institute of Cardiovascular Diseases, Shanghai Jiao Tong University School of Medicine, Shanghai, China
4
Siraj A, Lim DY, Tayara H, Chong KT. UbiComb: A Hybrid Deep Learning Model for Predicting Plant-Specific Protein Ubiquitylation Sites. Genes (Basel) 2021; 12:genes12050717. [PMID: 34064731 PMCID: PMC8151217 DOI: 10.3390/genes12050717] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 04/15/2021] [Revised: 05/06/2021] [Accepted: 05/07/2021] [Indexed: 12/11/2022] Open
Abstract
Protein ubiquitylation is an essential post-translational modification process that performs a critical role in a wide range of biological functions, even a degenerative role in certain diseases, and is consequently used as a promising target for the treatment of various diseases. Owing to the significant role of protein ubiquitylation, these sites can be identified by enzymatic approaches, mass spectrometry analysis, and combinations of multidimensional liquid chromatography and tandem mass spectrometry. However, these large-scale experimental screening techniques are time consuming, expensive, and laborious. To overcome the drawbacks of experimental methods, machine learning and deep learning-based predictors were considered for prediction in a timely and cost-effective manner. In the literature, several computational predictors have been published across species; however, predictors are species-specific because of the unclear patterns in different species. In this study, we proposed a novel approach for predicting plant ubiquitylation sites using a hybrid deep learning model by utilizing convolutional neural network and long short-term memory. The proposed method uses the actual protein sequence and physicochemical properties as inputs to the model and provides more robust predictions. The proposed predictor achieved the best result with accuracy values of 80% and 81% and F-scores of 79% and 82% on the 10-fold cross-validation and an independent dataset, respectively. Moreover, we also compared the testing of the independent dataset with popular ubiquitylation predictors; the results demonstrate that our model significantly outperforms the other methods in prediction classification results.
Affiliation(s)
- Arslan Siraj
  - Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju 54896, Korea; (A.S.); (D.Y.L.)
- Dae Yeong Lim
  - Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju 54896, Korea; (A.S.); (D.Y.L.)
- Hilal Tayara
  - School of International Engineering and Science, Jeonbuk National University, Jeonju 54896, Korea
  - Correspondence: (H.T.); (K.T.C.)
- Kil To Chong
  - Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju 54896, Korea; (A.S.); (D.Y.L.)
  - Advanced Electronics and Information Research Center, Jeonbuk National University, Jeonju 54896, Korea
  - Correspondence: (H.T.); (K.T.C.)
5
Ao C, Yu L, Zou Q. Prediction of bio-sequence modifications and the associations with diseases. Brief Funct Genomics 2020; 20:1-18. [PMID: 33313647 DOI: 10.1093/bfgp/elaa023] [Citation(s) in RCA: 52] [Impact Index Per Article: 13.0] [Received: 10/25/2020] [Revised: 11/09/2020] [Accepted: 11/10/2020] [Indexed: 12/22/2022] Open
Abstract
Modifications of protein, RNA and DNA play an important role in many biological processes and are related to some diseases. Therefore, accurate identification and comprehensive understanding of protein, RNA and DNA modification sites can promote research on disease treatment and prevention. With the development of sequencing technology, the number of known sequences has continued to increase. In the past decade, many computational tools that can be used to predict protein, RNA and DNA modification sites have been developed. In this review, we comprehensively summarized the modification site predictors for three different biological sequences and their association with diseases. The relevant web server is accessible at http://lab.malab.cn/~acy/PTM_data/; some sample data on protein, RNA and DNA modification can be downloaded from that website.
6
3-Nitrotyrosine and related derivatives in proteins: precursors, radical intermediates and impact in function. Essays Biochem 2020; 64:111-133. [PMID: 32016371 DOI: 10.1042/ebc20190052] [Citation(s) in RCA: 45] [Impact Index Per Article: 11.3] [Received: 11/20/2019] [Revised: 12/30/2019] [Accepted: 01/03/2020] [Indexed: 12/22/2022]
Abstract
Oxidative post-translational modification of proteins by molecular oxygen (O2)- and nitric oxide (•NO)-derived reactive species is a common process that occurs in mammalian tissues under both physiological and pathological conditions and can exert either regulatory or cytotoxic effects. Although the side chains of several amino acids are prone to oxidative modification, tyrosine residues are one of the preferred targets of one-electron oxidants, given the ability of their phenolic side chain to undergo reversible one-electron oxidation to the relatively stable tyrosyl radical. Naturally occurring as reversible catalytic intermediates at the active site of a variety of enzymes, tyrosyl radicals can also lead to the formation of several stable oxidative products through radical-radical reactions, as is the case for 3-nitrotyrosine (NO2Tyr). The formation of NO2Tyr mainly occurs through the fast reaction between the tyrosyl radical and nitrogen dioxide (•NO2). One of the key endogenous nitrating agents is peroxynitrite (ONOO⁻), the product of the reaction of superoxide radical (O2•⁻) with •NO, but ONOO⁻-independent mechanisms of nitration have also been disclosed. This chemical modification notably affects the physicochemical properties of tyrosine residues and, because of this, it can have a remarkable impact on protein structure and function, both in vitro and in vivo. Although low amounts of NO2Tyr are detected under basal conditions, significantly increased levels are found in pathological states associated with an overproduction of reactive species, such as cardiovascular and neurodegenerative diseases, inflammation and aging. While NO2Tyr is a well-established stable oxidative stress biomarker and a good predictor of disease progression, its role as a pathogenic mediator has been laboriously defined for just a small number of nitrated proteins and awaits further studies.
7
Ahmed S, Kabir M, Arif M, Khan ZU, Yu DJ. DeepPPSite: A deep learning-based model for analysis and prediction of phosphorylation sites using efficient sequence information. Anal Biochem 2020; 612:113955. [PMID: 32949607 DOI: 10.1016/j.ab.2020.113955] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 02/20/2020] [Revised: 08/30/2020] [Accepted: 09/11/2020] [Indexed: 12/29/2022]
Abstract
Phosphorylation is a ubiquitous type of post-translational modification (PTM) that occurs in both eukaryotic and prokaryotic cells, wherein a phosphate group binds to amino acid residues. These specific residues, i.e., serine (S), threonine (T), and tyrosine (Y), exhibit diverse functions at the molecular level. Recent studies have determined that some diseases such as cancer, diabetes, and neurodegenerative diseases are caused by abnormal phosphorylation. Based on its potential applications in biological research and drug development, the large-scale identification of phosphorylation sites has attracted interest. Existing wet-lab technologies for targeting phosphorylation sites are expensive and time consuming. Thus, computational algorithms that can efficiently accelerate the annotation of phosphorylation sites from massive protein sequences are needed. Numerous machine learning-based methods have been implemented for phosphorylation site prediction. However, despite extensive efforts, existing computational approaches continue to have inadequate performance, particularly in terms of overall accuracy (ACC), Matthews correlation coefficient (MCC), and area under the curve (AUC). In this paper, we report a novel deep learning-based predictor to overcome these performance hurdles, DeepPPSite, which was constructed using a stacked long short-term memory recurrent network for predicting phosphorylation sites. The proposed technique learns the protein representations from conjoint protein descriptors. The experimental results indicated that our model achieved superior performance on the training dataset for S, T and Y, with MCC values of 0.608, 0.602, and 0.558, respectively, using a 10-fold cross-validation test. We further determined the generalization efficacy of the proposed predictor DeepPPSite by conducting a rigorous independent test. The predictive MCC values were 0.358, 0.356, and 0.350 for the S, T, and Y phosphorylation sites, respectively. Rigorous cross-validation and independent validation tests for the three types of phosphorylation sites demonstrated that the designed DeepPPSite tool significantly outperforms state-of-the-art methods.
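For readers unfamiliar with the MCC values quoted in this abstract, the following is a minimal sketch of how a Matthews correlation coefficient is computed from binary confusion-matrix counts. The function name and the convention of returning 0.0 when the denominator vanishes are illustrative assumptions; nothing here reproduces DeepPPSite itself or its data.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts.

    Returns a value in [-1, 1]; 0.0 is returned when any marginal total is
    zero (a common convention, assumed here for illustration).
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, which is one reason it is often reported for site-prediction tasks where positive and negative sites are imbalanced.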
Affiliation(s)
- Saeed Ahmed
  - School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, 210094, China.
- Muhammad Kabir
  - School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, 210094, China.
- Muhammad Arif
  - School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, 210094, China.
- Zaheer Ullah Khan
  - School of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, China.
- Dong-Jun Yu
  - School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, 210094, China.
8
Savage SR, Zhang B. Using phosphoproteomics data to understand cellular signaling: a comprehensive guide to bioinformatics resources. Clin Proteomics 2020; 17:27. [PMID: 32676006 PMCID: PMC7353784 DOI: 10.1186/s12014-020-09290-x] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Received: 12/01/2019] [Accepted: 07/04/2020] [Indexed: 12/19/2022] Open
Abstract
Mass spectrometry-based phosphoproteomics is becoming an essential methodology for the study of global cellular signaling. Numerous bioinformatics resources are available to facilitate the translation of phosphopeptide identification and quantification results into novel biological and clinical insights, a critical step in phosphoproteomics data analysis. These resources include knowledge bases of kinases and phosphatases, phosphorylation sites, kinase inhibitors, and sequence variants affecting kinase function, and bioinformatics tools that can predict phosphorylation sites in addition to the kinase that phosphorylates them, infer kinase activity, and predict the effect of mutations on kinase signaling. However, these resources exist in silos and it is challenging to select among multiple resources with similar functions. Therefore, we put together a comprehensive collection of resources related to phosphoproteomics data interpretation, compared the use of tools with similar functions, and assessed the usability from the standpoint of typical biologists or clinicians. Overall, tools could be improved by standardization of enzyme names, flexibility of data input and output format, consistent maintenance, and detailed manuals.
Affiliation(s)
- Sara R. Savage
  - Department of Biomedical Informatics, Vanderbilt University, Nashville, TN USA
  - Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX USA
- Bing Zhang
  - Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX USA
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX USA
9
Chen G, Cao M, Yu J, Guo X, Shi S. Prediction and functional analysis of prokaryote lysine acetylation site by incorporating six types of features into Chou's general PseAAC. J Theor Biol 2019; 461:92-101. [DOI: 10.1016/j.jtbi.2018.10.047] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 08/16/2018] [Revised: 10/09/2018] [Accepted: 10/22/2018] [Indexed: 12/12/2022]
10
Cao M, Chen G, Yu J, Shi S. Computational prediction and analysis of species-specific fungi phosphorylation via feature optimization strategy. Brief Bioinform 2018; 21:595-608. [DOI: 10.1093/bib/bby122] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 08/02/2018] [Revised: 11/16/2018] [Accepted: 11/22/2018] [Indexed: 11/12/2022] Open
Abstract
Protein phosphorylation is a reversible and ubiquitous post-translational modification that primarily occurs at serine, threonine and tyrosine residues and regulates a variety of biological processes. In this paper, we first briefly summarize current progress in the computational prediction of eukaryotic protein phosphorylation sites, which has mainly focused on animals and plants, especially humans, and to a lesser extent on fungi. Since the number of identified fungi phosphorylation sites has greatly increased in a wide variety of organisms and their roles in pathological physiology remain largely unknown, more attention has been paid to the identification of fungi-specific phosphorylation. Here, experimental fungi phosphorylation site data were collected, and most of the sites were classified into different types to be encoded with various features and trained via a two-step feature optimization method. A novel method for the prediction of species-specific fungi phosphorylation, PreSSFP, was developed, which can identify fungi phosphorylation in seven species for specific serine, threonine and tyrosine residues (http://computbiol.ncu.edu.cn/PreSSFP). We also critically evaluated the performance of PreSSFP and compared it with other existing tools. The results showed that PreSSFP is a robust predictor. Feature analyses showed that there are some significant differences among the seven species. Species-specific prediction via the two-step feature optimization method, which mines important features for training, could considerably improve the prediction performance. We anticipate that our study provides a new lead for future computational analysis of fungi phosphorylation.
Affiliation(s)
- Man Cao
  - Department of Mathematics and Numerical Simulation and High-Performance Computing Laboratory, School of Sciences, Nanchang University, Nanchang, China
- Guodong Chen
  - Department of Mathematics and Numerical Simulation and High-Performance Computing Laboratory, School of Sciences, Nanchang University, Nanchang, China
- Jialin Yu
  - Department of Mathematics and Numerical Simulation and High-Performance Computing Laboratory, School of Sciences, Nanchang University, Nanchang, China
- Shaoping Shi
  - Department of Mathematics and Numerical Simulation and High-Performance Computing Laboratory, School of Sciences, Nanchang University, Nanchang, China
11
Yu J, Shi S, Zhang F, Chen G, Cao M. PredGly: predicting lysine glycation sites for Homo sapiens based on XGboost feature optimization. Bioinformatics 2018; 35:2749-2756. [DOI: 10.1093/bioinformatics/bty1043] [Citation(s) in RCA: 39] [Impact Index Per Article: 6.5] [Received: 09/26/2018] [Revised: 12/13/2018] [Accepted: 12/20/2018] [Indexed: 01/22/2023] Open
Abstract
Motivation
Protein glycation is a common post-translational modification (PTM) that proceeds as a two-step non-enzymatic reaction. Glycation not only impairs the function of proteins but also changes their characteristics, so it is related to many human diseases. It remains difficult to detect glycation sites systematically because glycated residues lack clear sequence patterns. Computational approaches, which can filter putative sites prior to experimental verification, can greatly increase the efficiency of experimental work. However, the previous lysine glycation prediction method was trained on a small dataset. Hence, the model does not generalize well.
Results
By searching a new database, we collected a large dataset for Homo sapiens. PredGly, a novel software tool developed by combining multiple features, can predict lysine glycation sites for H. sapiens. In addition, XGboost was adopted to optimize feature vectors and to improve the model performance. In a comparison of various classifiers, the support vector machine achieved the best performance. On a new independent test set, PredGly outperformed other glycation tools. This suggests that PredGly can provide more instructive guidance for further experimental research on lysine glycation.
Availability and implementation
https://github.com/yujialinncu/PredGly
Supplementary information
Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Jialin Yu
  - Department of Mathematics and Numerical Simulation and High-Performance Computing Laboratory, School of Sciences, Nanchang University, Nanchang, China
- Shaoping Shi
  - Department of Mathematics and Numerical Simulation and High-Performance Computing Laboratory, School of Sciences, Nanchang University, Nanchang, China
- Fang Zhang
  - Department of Mathematics and Numerical Simulation and High-Performance Computing Laboratory, School of Sciences, Nanchang University, Nanchang, China
- Guodong Chen
  - Department of Mathematics and Numerical Simulation and High-Performance Computing Laboratory, School of Sciences, Nanchang University, Nanchang, China
- Man Cao
  - Department of Mathematics and Numerical Simulation and High-Performance Computing Laboratory, School of Sciences, Nanchang University, Nanchang, China