1
Qin Z, Ren H, Zhao P, Wang K, Liu H, Miao C, Du Y, Li J, Wu L, Chen Z. Current computational tools for protein lysine acylation site prediction. Brief Bioinform 2024; 25:bbae469. [PMID: 39316944 PMCID: PMC11421846 DOI: 10.1093/bib/bbae469] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2024] [Revised: 08/20/2024] [Accepted: 09/07/2024] [Indexed: 09/26/2024] Open
Abstract
As a main subtype of post-translational modification (PTM), protein lysine acylations (PLAs) play crucial roles in regulating diverse protein functions. With recent advances in proteomics technology, PTM identification is becoming a data-rich field, and the large amount of experimentally verified data urgently needs to be translated into valuable biological insights. With computational approaches, PLA sites can be accurately detected across the whole proteome, even for organisms with small-scale datasets. Herein, a comprehensive summary of 166 in silico PLA prediction methods is presented, covering predictors for a single type of PLA site and predictors for multiple types of PLA sites. This summary addresses aspects that are critical for developing a robust predictor, including data collection and preparation, sample selection, feature representation, classification algorithm design, model evaluation, and method availability. Notably, we discuss the application of protein language models and transfer learning to the small-sample learning problem. We also highlight the prediction methods developed for functionally relevant PLA sites and for species-, substrate-, and cell-type-specific PLA sites. In conclusion, this systematic review could facilitate the development of novel PLA predictors and offer useful insights to researchers from various disciplines.
Affiliation(s)
- Zhaohui Qin
  - Collaborative Innovation Center of Henan Grain Crops, Henan Key Laboratory of Rice Molecular Breeding and High Efficiency Production, College of Agronomy, Henan Agricultural University, Zhengzhou 450046, China
- Haoran Ren
  - Collaborative Innovation Center of Henan Grain Crops, Henan Key Laboratory of Rice Molecular Breeding and High Efficiency Production, College of Agronomy, Henan Agricultural University, Zhengzhou 450046, China
- Pei Zhao
  - State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Sciences (CAAS), Anyang 455000, China
- Kaiyuan Wang
  - Collaborative Innovation Center of Henan Grain Crops, Henan Key Laboratory of Rice Molecular Breeding and High Efficiency Production, College of Agronomy, Henan Agricultural University, Zhengzhou 450046, China
- Huixia Liu
  - Collaborative Innovation Center of Henan Grain Crops, Henan Key Laboratory of Rice Molecular Breeding and High Efficiency Production, College of Agronomy, Henan Agricultural University, Zhengzhou 450046, China
- Chunbo Miao
  - Collaborative Innovation Center of Henan Grain Crops, Henan Key Laboratory of Rice Molecular Breeding and High Efficiency Production, College of Agronomy, Henan Agricultural University, Zhengzhou 450046, China
- Yanxiu Du
  - Collaborative Innovation Center of Henan Grain Crops, Henan Key Laboratory of Rice Molecular Breeding and High Efficiency Production, College of Agronomy, Henan Agricultural University, Zhengzhou 450046, China
- Junzhou Li
  - Collaborative Innovation Center of Henan Grain Crops, Henan Key Laboratory of Rice Molecular Breeding and High Efficiency Production, College of Agronomy, Henan Agricultural University, Zhengzhou 450046, China
- Liuji Wu
  - National Key Laboratory of Wheat and Maize Crop Science, College of Agronomy, Henan Agricultural University, Zhengzhou 450046, China
- Zhen Chen
  - Collaborative Innovation Center of Henan Grain Crops, Henan Key Laboratory of Rice Molecular Breeding and High Efficiency Production, College of Agronomy, Henan Agricultural University, Zhengzhou 450046, China
2
Wang GA, Yan X, Li X, Liu Y, Xia J, Zhu X. MSTL-Kace: Prediction of Prokaryotic Lysine Acetylation Sites Based on Multistage Transfer Learning Strategy. ACS OMEGA 2023; 8:41930-41942. [PMID: 37969991 PMCID: PMC10634282 DOI: 10.1021/acsomega.3c07086] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/16/2023] [Revised: 10/11/2023] [Accepted: 10/13/2023] [Indexed: 11/17/2023]
Abstract
As one of the most important post-translational modifications (PTMs), lysine acetylation (Kace) plays an important role in various biological activities. Traditional experimental methods for identifying Kace sites are inefficient and expensive. As an alternative, several machine learning methods have been developed for Kace site prediction, with hand-crafted features used to encode the protein sequences. However, two challenges remain: the complex biological information may be under-represented by these hand-crafted features, and the small-sample issue for some species needs to be addressed. We propose a novel model, MSTL-Kace, which was developed using a transfer learning strategy with a pretrained bidirectional encoder representations from transformers (BERT) model. In this model, high-level embeddings were extracted from species-specific BERT models, and a two-stage fine-tuning strategy was used to deal with the small-sample issue. Specifically, a domain-specific BERT model was pretrained using all of the sequences in our data sets and was then fine-tuned, in one or two stages, on the training data set of each species to obtain the species-specific BERT models. Afterward, the residue embeddings were extracted from the fine-tuned model and fed to different downstream learning algorithms. After comparison, the best model for the six prokaryotic species was built by using a random forest. The results for the independent test sets show that our model outperforms the state-of-the-art methods on all six species. The source codes and data for MSTL-Kace are available at https://github.com/leo97king/MSTL-Kace.
Affiliation(s)
- Gang-Ao Wang
  - School of Sciences, Anhui Agricultural University, Hefei 230036, Anhui, China
- Xiaodi Yan
  - School of Sciences, Anhui Agricultural University, Hefei 230036, Anhui, China
- Xiang Li
  - School of Sciences, Anhui Agricultural University, Hefei 230036, Anhui, China
- Yinbo Liu
  - School of Sciences, Anhui Agricultural University, Hefei 230036, Anhui, China
- Junfeng Xia
  - Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Institutes of Physical Science and Information Technology, Anhui University, Hefei 230601, Anhui, China
- Xiaolei Zhu
  - School of Sciences, Anhui Agricultural University, Hefei 230036, Anhui, China
3
Stacpoole PW. Clinical physiology and pharmacology of GSTZ1/MAAI. Biochem Pharmacol 2023; 217:115818. [PMID: 37742772 DOI: 10.1016/j.bcp.2023.115818] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/06/2023] [Revised: 09/05/2023] [Accepted: 09/21/2023] [Indexed: 09/26/2023]
Abstract
Herein I summarize the physiological chemistry and pharmacology of the bifunctional enzyme glutathione transferase zeta 1 (GSTZ1)/maleylacetoacetate isomerase (MAAI) relevant to human physiology, drug metabolism and disease. MAAI is integral to the catabolism of the amino acids phenylalanine and tyrosine. Genetic or pharmacological inhibition of MAAI can be pathological in animals. However, to date, no clinical disease consequences are unequivocally attributable to inborn errors of this enzyme. MAAI is identical to the zeta 1 family isoform of GST, which biotransforms the investigational drug dichloroacetate (DCA) to the endogenous compound glyoxylate. DCA is a mechanism-based inhibitor of GSTZ1 that significantly reduces its own rate of metabolism and increases accumulation of potentially harmful tyrosine intermediates and of the heme precursor δ-aminolevulinic acid (δ-ALA). GSTZ1 is most abundant in rodent and human liver, with its concentration severalfold higher in cytoplasm than in mitochondria. Its activity and protein expression are dependent on the age of the host and the intracellular level of chloride ions. Gene association studies have linked GSTZ1 or its protein product to various physiological traits and pathologies. Haplotype variations in GSTZ1 influence the rate of DCA metabolism, enabling a genotyping strategy to allow potentially safe, precision-based drug dosing in clinical trials.
Affiliation(s)
- Peter W Stacpoole
  - Departments of Medicine and Biochemistry and Molecular Biology, University of Florida, College of Medicine, Gainesville, FL 32601, USA.
4
Yu K, Zhang Q, Liu Z, Du Y, Gao X, Zhao Q, Cheng H, Li X, Liu ZX. Deep learning based prediction of reversible HAT/HDAC-specific lysine acetylation. Brief Bioinform 2021; 21:1798-1805. [PMID: 32978618 DOI: 10.1093/bib/bbz107] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 06/03/2019] [Revised: 07/18/2019] [Accepted: 07/30/2019] [Indexed: 11/14/2022] Open
Abstract
Protein lysine acetylation is an important molecular mechanism for regulating cellular processes and plays critical physiological and pathological roles in cancers and other diseases. Although a massive number of acetylation sites have been identified through experimental identification and high-throughput proteomics techniques, their enzyme-specific regulation remains largely unknown. Here, we developed the deep learning-based protein lysine acetylation modification prediction (Deep-PLA) software for histone acetyltransferase (HAT)/histone deacetylase (HDAC)-specific acetylation prediction. Experimentally identified substrates and sites of several HATs and HDACs were curated from the literature to generate enzyme-specific data sets. We integrated various protein sequence features with a deep neural network and optimized the hyperparameters with particle swarm optimization, which achieved satisfactory performance. In comparisons based on cross-validations and testing data sets, the model outperformed previous studies. Meanwhile, we found that protein-protein interactions could enrich enzyme-specific acetylation regulatory relations and visualized this information in the Deep-PLA web server. Furthermore, a cross-cancer analysis of acetylation-associated mutations revealed that acetylation regulation was intensively disrupted by mutations in cancers and heavily implicated in the regulation of cancer signaling. These prediction and analysis results might help reveal the regulatory mechanism of protein acetylation in various biological processes and promote research on the prognosis and treatment of cancers. Therefore, the Deep-PLA predictor and protein acetylation interaction networks could provide helpful information for studying the regulation of protein acetylation. The web server of Deep-PLA can be accessed at http://deeppla.cancerbio.info.
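The hyperparameter search mentioned in this abstract can be illustrated with a minimal particle swarm optimization (PSO) sketch. This is not the Deep-PLA implementation: the function names, the swarm coefficients, and the toy quadratic objective that stands in for a validation loss are all assumptions made for illustration.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=60,
                 inertia=0.7, cognitive=1.5, social=1.5, seed=0):
    """Minimize objective(position) inside box bounds with a basic PSO.

    All coefficient defaults are illustrative assumptions, not values
    taken from the cited paper.
    """
    rng = random.Random(seed)
    dim = len(bounds)
    # Random starting positions inside the box, zero starting velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]              # per-particle best position
    best_val = [objective(p) for p in pos]      # per-particle best value
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]  # swarm-wide best

    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Pull toward the particle's own best and the swarm's best.
                vel[i][d] = (inertia * vel[i][d]
                             + cognitive * r1 * (best_pos[i][d] - pos[i][d])
                             + social * r2 * (g_pos[d] - pos[i][d]))
                # Move, then clamp to the allowed range.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val


def toy_validation_loss(params):
    """Hypothetical stand-in for a model's validation loss."""
    log_lr, dropout = params
    return (log_lr + 3.0) ** 2 + (dropout - 0.5) ** 2


# Search over (log10 learning rate, dropout rate); both ranges are assumptions.
best_params, best_loss = pso_minimize(toy_validation_loss,
                                      [(-6.0, 0.0), (0.0, 0.9)])
```

In a real setting the objective would train a network with the candidate hyperparameters and return a cross-validated loss, which makes each evaluation expensive; that cost is why swarm size and iteration count are usually kept small.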
Affiliation(s)
- Kai Yu
  - State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou 510060, China
- Qingfeng Zhang
  - State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou 510060, China
- Zekun Liu
  - State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou 510060, China
- Yimeng Du
  - School of Life Sciences, Zhengzhou University, Zhengzhou 450001, China
- Xinjiao Gao
  - Division of Molecular and Cell Biophysics, Hefei National Science Center for Physical Sciences at the Microscale, Anhui Key Laboratory of Cellular Dynamics and Chemical Biology, School of Life Sciences, University of Science and Technology of China, Hefei 230027, China
- Qi Zhao
  - State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou 510060, China
- Han Cheng
  - School of Life Sciences, Zhengzhou University, Zhengzhou 450001, China
- Xiaoxing Li
  - State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou 510060, China
- Ze-Xian Liu
  - State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou 510060, China
5
Basith S, Lee G, Manavalan B. STALLION: a stacking-based ensemble learning framework for prokaryotic lysine acetylation site prediction. Brief Bioinform 2021; 23:6370848. [PMID: 34532736 PMCID: PMC8769686 DOI: 10.1093/bib/bbab376] [Citation(s) in RCA: 44] [Impact Index Per Article: 14.7] [Received: 06/04/2021] [Revised: 08/22/2021] [Accepted: 08/24/2021] [Indexed: 12/13/2022] Open
Abstract
Protein post-translational modification (PTM) is an important regulatory mechanism that plays a key role in both normal and disease states. Acetylation on lysine residues is one of the most potent PTMs owing to its critical role in cellular metabolism and regulatory processes. Identifying protein lysine acetylation (Kace) sites is a challenging task in bioinformatics. To date, several machine learning-based methods for the in silico identification of Kace sites have been developed. Of those, a few are prokaryotic species-specific. Despite their attractive advantages and performance, these methods have certain limitations. Therefore, this study proposes a novel predictor, STALLION (STacking-based Predictor for ProkAryotic Lysine AcetyLatION), containing six prokaryotic species-specific models to identify Kace sites accurately. To extract crucial patterns around Kace sites, we employed 11 different encodings representing three different characteristics. Subsequently, a systematic and rigorous feature selection approach was employed to identify the optimal feature set independently for each of five tree-based ensemble algorithms, and a baseline model was built with each algorithm for each species. Finally, the predicted values from the baseline models were used to train an appropriate classifier with the stacking strategy to develop STALLION. Comparative benchmarking experiments showed that STALLION significantly outperformed existing predictors on independent tests. To expedite direct accessibility to the STALLION models, a user-friendly online predictor was implemented, which is available at: http://thegleelab.org/STALLION.
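The idea of extracting patterns around candidate Kace sites can be sketched as a sequence window centered on each lysine, followed by one simple encoding. This is an illustrative assumption, not the STALLION pipeline: the flank size, the `X` padding character, and the amino acid composition encoding are chosen here only to show the general shape of the step, and the abstract does not list which 11 encodings were used.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order for the feature vector.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def lysine_windows(sequence, flank=3, pad="X"):
    """Return (1-based position, window) for every lysine (K) in the sequence.

    Each window holds `flank` residues on either side of the lysine;
    positions past the sequence ends are filled with the pad character.
    """
    padded = pad * flank + sequence + pad * flank
    windows = []
    for i, residue in enumerate(sequence):
        if residue == "K":
            # Index i in the sequence is index i + flank in the padded string,
            # so this slice is centered on the lysine.
            windows.append((i + 1, padded[i:i + 2 * flank + 1]))
    return windows


def composition(window):
    """Encode a window as the fraction of each standard amino acid.

    Pad characters count toward the window length but not toward any residue.
    """
    counts = Counter(window)
    return [counts[aa] / len(window) for aa in AMINO_ACIDS]


# Hypothetical toy peptide with lysines at positions 2 and 5.
sites = lysine_windows("MKAAK", flank=2)
features = [composition(window) for _, window in sites]
```

Each feature vector would then be paired with a label (acetylated or not) and passed to a classifier; in a stacking setup, the outputs of several such classifiers become the inputs of a final model.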
Affiliation(s)
- Shaherin Basith
  - Department of Physiology, Ajou University School of Medicine, Republic of Korea
- Gwang Lee
  - Department of Molecular Science and Technology, Ajou University, Suwon 16499, Republic of Korea
6
Ning Q, Yu M, Ji J, Ma Z, Zhao X. Analysis and prediction of human acetylation using a cascade classifier based on support vector machine. BMC Bioinformatics 2019; 20:346. [PMID: 31208321 PMCID: PMC6580503 DOI: 10.1186/s12859-019-2938-7] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 03/31/2019] [Accepted: 06/06/2019] [Indexed: 12/24/2022] Open
Abstract
Background: Acetylation on lysine is a widespread, reversible post-translational modification that plays a crucial role in some biological activities. To better understand the mechanism, it is necessary to identify acetylation sites in proteins accurately. Computational methods are popular because they are more convenient and faster than experimental methods. In this study, we proposed a new computational method to predict acetylation sites in humans by combining sequence features and structural features, including physicochemical property (PCP), position-specific scoring matrix (PSSM), auto covariation (AC), residue composition (RC), secondary structure (SS) and accessible surface area (ASA), which together characterize the information of acetylated lysine sites well. In addition, a two-step feature selection was applied, combining minimum redundancy maximum relevance (mRMR) and incremental feature selection (IFS). Finally, a cascade classifier based on support vector machines (SVM) was trained, which addressed the imbalance between positive and negative samples while covering all negative sample information. Results: On an independent dataset, the method achieved a specificity of 72.19% and a sensitivity of 76.71%, which shows that a cascade SVM classifier outperforms a single SVM classifier. Conclusions: In addition to the analysis of experimental results, we also made a systematic and comprehensive analysis of the acetylation data.
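The two performance measures reported in this abstract follow directly from the confusion matrix: sensitivity is the fraction of true acetylation sites that are recovered, and specificity is the fraction of non-sites that are correctly rejected. The sketch below shows the arithmetic on a small made-up label set; the function names and the toy labels are assumptions and do not reproduce the paper's data.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = site)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp, fn, tn, fp


def sensitivity(tp, fn):
    """Fraction of actual positives predicted as positive: TP / (TP + FN)."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of actual negatives predicted as negative: TN / (TN + FP)."""
    return tn / (tn + fp)


# Toy, imbalanced example: 3 positive sites and 5 negative sites.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 0]

tp, fn, tn, fp = confusion_counts(y_true, y_pred)
sn = sensitivity(tp, fn)
sp = specificity(tn, fp)
```

Reporting both numbers matters on imbalanced data such as acetylation sites, because a classifier that labels everything negative would reach high specificity and overall accuracy while having zero sensitivity.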
Affiliation(s)
- Qiao Ning
  - School of Information Science and Technology, Northeast Normal University, Changchun, 130117, China
- Miao Yu
  - School of Information Science and Technology, Northeast Normal University, Changchun, 130117, China
- Jinchao Ji
  - School of Information Science and Technology, Northeast Normal University, Changchun, 130117, China
- Zhiqiang Ma
  - School of Information Science and Technology, Northeast Normal University, Changchun, 130117, China.
- Xiaowei Zhao
  - School of Information Science and Technology, Northeast Normal University, Changchun, 130117, China.
7
Abbasi Y, Jabbari J, Jabbari R, Glinge C, Izadyar S, Spiekerkoetter E, Zamanian RT, Carlsen J, Tfelt‐Hansen J. Exome data clouds the pathogenicity of genetic variants in Pulmonary Arterial Hypertension. Mol Genet Genomic Med 2018; 6:835-844. [PMID: 30084161 PMCID: PMC6160702 DOI: 10.1002/mgg3.452] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 10/05/2017] [Revised: 04/25/2018] [Accepted: 06/03/2018] [Indexed: 12/16/2022] Open
Abstract
BACKGROUND: We aimed to provide a set of previously reported PAH-associated missense and nonsense variants and to evaluate the pathogenicity of those variants. METHODS: The Human Gene Mutation Database, PubMed, and Google Scholar were searched for previously reported PAH-associated genes and variants. Thereafter, both the Exome Sequencing Project and the Exome Aggregation Consortium were searched, as background populations, for the previously reported PAH-associated missense and nonsense variants. The pathogenicity of the previously reported PAH-associated missense variants was evaluated using four in silico prediction tools. RESULTS: In total, 14 PAH-associated genes and 180 missense and nonsense variants were gathered. BMPR2, the most frequently reported gene, accounts for 135 of the 180 missense and nonsense variants. The Exome Sequencing Project contained 9, and the Exome Aggregation Consortium 25, of the 180 PAH-associated missense and nonsense variants. The TOPBP1 and ENG genes are unlikely to be monogenic causes of PAH pathogenesis based on allele frequency in the background population and on the prediction analysis. CONCLUSION: This is the first evaluation of previously reported PAH-associated missense and nonsense variants. BMPR2 was identified as the major gene among the 14 PAH-associated genes. Based on these findings, the ENG and TOPBP1 genes are not likely to be monogenic causes of PAH.
Affiliation(s)
- Yeganeh Abbasi
  - Heart Centre, Department of Cardiology, Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark
  - Department of Cardiology, Section for Pulmonary Hypertension and Right Heart Failure, Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark
- Reza Jabbari
  - Heart Centre, Department of Cardiology, Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark
  - Department of Cardiology, Section for Pulmonary Hypertension and Right Heart Failure, Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark
- Charlotte Glinge
  - Heart Centre, Department of Cardiology, Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark
- Seyed Bahador Izadyar
  - Heart Centre, Department of Cardiology, Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark
- Edda Spiekerkoetter
  - Division of Pulmonary and Critical Care, Stanford University School of Medicine, California
- Roham T. Zamanian
  - Division of Pulmonary and Critical Care, Stanford University School of Medicine, California
- Jørn Carlsen
  - Heart Centre, Department of Cardiology, Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark
  - Department of Cardiology, Section for Pulmonary Hypertension and Right Heart Failure, Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark
- Jacob Tfelt‐Hansen
  - Heart Centre, Department of Cardiology, Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark
  - Department of Cardiology, Section for Pulmonary Hypertension and Right Heart Failure, Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark
  - Department of Forensic Medicine, Faculty of Medical Sciences, University of Copenhagen, Denmark
8
Shi S, Wang L, Cao M, Chen G, Yu J. Proteomic analysis and prediction of amino acid variations that influence protein posttranslational modifications. Brief Bioinform 2018; 20:1597-1606. [DOI: 10.1093/bib/bby036] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 01/15/2018] [Revised: 03/07/2018] [Indexed: 12/18/2022] Open
Abstract
Accumulating studies indicate that amino acid variations, by changing the residue type at target sites or at key flanking residues, can directly or indirectly influence protein posttranslational modifications (PTMs) and have a detrimental effect on protein function. Computational mutation analysis can greatly reduce the experimental effort required. To increase the utilization of current computational resources, we first provide an overview of the computational prediction of amino acid variations that influence protein PTMs and of their functional analysis. We also discuss the challenges that novel in silico approaches will face in the future. The development of better methods for mutation analysis of protein PTMs will help to facilitate the development of personalized precision medicine.
Affiliation(s)
- Shaoping Shi
  - Department of Mathematics and Numerical Simulation and High-Performance Computing Laboratory, School of Sciences, Nanchang University, Nanchang, Jiangxi 330031, China
- Lina Wang
  - Department of Science, Nanchang Institute of Technology, Nanchang, Jiangxi 330031, China
- Man Cao
  - Department of Mathematics, School of Sciences, Nanchang University, Nanchang, Jiangxi 330031, China
- Guodong Chen
  - Department of Mathematics, School of Sciences, Nanchang University, Nanchang, Jiangxi 330031, China
- Jialin Yu
  - Department of Mathematics, School of Sciences, Nanchang University, Nanchang, Jiangxi 330031, China
9
Xu HD, Wang LN, Wen PP, Shi SP, Qiu JD. Site-Specific Systematic Analysis of Lysine Modification Crosstalk. Proteomics 2018. [DOI: 10.1002/pmic.201700292] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 12/22/2022]
Affiliation(s)
- Hao-Dong Xu
  - Department of Chemistry; Nanchang University; No. 999 Xuefu Road Nanchang Honggutan New District Jiangxi Province 330031 P. R. China
- Li-Na Wang
  - Department of Chemistry; Nanchang University; No. 999 Xuefu Road Nanchang Honggutan New District Jiangxi Province 330031 P. R. China
- Ping-Ping Wen
  - Department of Chemistry; Nanchang University; No. 999 Xuefu Road Nanchang Honggutan New District Jiangxi Province 330031 P. R. China
- Shao-Ping Shi
  - Department of Chemistry; Nanchang University; No. 999 Xuefu Road Nanchang Honggutan New District Jiangxi Province 330031 P. R. China
- Jian-Ding Qiu
  - Department of Chemistry; Nanchang University; No. 999 Xuefu Road Nanchang Honggutan New District Jiangxi Province 330031 P. R. China
  - Department of Materials and Chemical Engineering; Pingxiang University; Pingxiang P. R. China
10
GPS-PAIL: prediction of lysine acetyltransferase-specific modification sites from protein sequences. Sci Rep 2016; 6:39787. [PMID: 28004786 PMCID: PMC5177928 DOI: 10.1038/srep39787] [Citation(s) in RCA: 76] [Impact Index Per Article: 9.5] [Received: 09/22/2016] [Accepted: 11/28/2016] [Indexed: 01/02/2023] Open
Abstract
Protein acetylation catalyzed by specific histone acetyltransferases (HATs) is an essential post-translational modification (PTM) and is involved in the regulation of a broad spectrum of biological processes in eukaryotes. Although tens of thousands of acetylation sites have been experimentally identified, the upstream HATs for most of the sites are unclear. Thus, the identification of HAT-specific acetylation sites is fundamental for understanding the regulatory mechanisms of protein acetylation. In this work, we first collected 702 known HAT-specific acetylation sites of 205 proteins from the literature and public data resources, and a motif-based analysis demonstrated that different types of HATs exhibit similar but considerably distinct sequence preferences for substrate recognition. Using 544 human HAT-specific sites for training, we constructed GPS-PAIL, a tool for the prediction of HAT-specific sites for up to seven HATs, including CREBBP, EP300, HAT1, KAT2A, KAT2B, KAT5 and KAT8. The prediction accuracy of GPS-PAIL was critically evaluated and found to be satisfactory. Using GPS-PAIL, we also performed a large-scale prediction of potential HATs for known acetylation sites identified from high-throughput experiments in nine eukaryotes. Both an online service and local packages were implemented, and GPS-PAIL is freely available at: http://pail.biocuckoo.org.
11
Gianazza E, Parravicini C, Primi R, Miller I, Eberini I. In silico prediction and characterization of protein post-translational modifications. J Proteomics 2015; 134:65-75. [PMID: 26436211 DOI: 10.1016/j.jprot.2015.09.026] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 05/02/2015] [Revised: 07/17/2015] [Accepted: 09/23/2015] [Indexed: 01/06/2023]
Abstract
This review outlines the computational approaches and procedures for predicting post-translational modification (PTM)-induced changes in protein conformation and their influence on protein function(s), the latter being assessed as differential affinity in interaction with either low-molecular-mass molecules (ligands for receptors or transporters, substrates for enzymes) or high-molecular-mass molecules (proteins or nucleic acids in supramolecular assemblies). The scope for an in silico approach is discussed against a summary of the in vitro evidence on the structural and functional outcome of protein PTM.
Affiliation(s)
- Elisabetta Gianazza
  - Dipartimento di Scienze Farmacologiche e Biomolecolari, Università degli Studi di Milano, Gruppo di Studio per la Proteomica e la Struttura delle Proteine, Sezione di Scienze Farmacologiche, Via Balzaretti 9, I-20133 Milan, Italy.
- Chiara Parravicini
  - Dipartimento di Scienze Farmacologiche e Biomolecolari, Università degli Studi di Milano, Laboratorio di Biochimica e Biofisica Computazionale, Sezione di Biochimica, Biofisica, Fisiologia ed Immunopatologia, Via Trentacoste, 2, I-20134 Milan, Italy
- Roberto Primi
  - Dipartimento di Scienze Farmacologiche e Biomolecolari, Università degli Studi di Milano, Laboratorio di Biochimica e Biofisica Computazionale, Sezione di Biochimica, Biofisica, Fisiologia ed Immunopatologia, Via Trentacoste, 2, I-20134 Milan, Italy
- Ingrid Miller
  - Institut für Medizinische Biochemie, Veterinärmedizinische Universität Wien, Veterinärplatz 1, A-1210 Vienna, Austria
- Ivano Eberini
  - Dipartimento di Scienze Farmacologiche e Biomolecolari, Università degli Studi di Milano, Laboratorio di Biochimica e Biofisica Computazionale, Sezione di Biochimica, Biofisica, Fisiologia ed Immunopatologia, Via Trentacoste, 2, I-20134 Milan, Italy
12
Timucin AC, Bodur C, Basaga H. SIRT1 contributes to aldose reductase expression through modulating NFAT5 under osmotic stress: In vitro and in silico insights. Cell Signal 2015; 27:2160-72. [PMID: 26297866 DOI: 10.1016/j.cellsig.2015.08.013] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 07/15/2015] [Accepted: 08/18/2015] [Indexed: 12/13/2022]
Abstract
So far, a myriad of molecules have been characterized as modulators of NFAT5 and its downstream targets. Among these NFAT5 modifiers, SIRT1 was proposed to have a promising role in NFAT5-dependent events, yet the exact underlying mechanism remains obscure. Hence, this study aimed to delineate the link between SIRT1 and the NFAT5-aldose reductase (AR) axis under osmotic stress. A unique osmotic stress model was generated and its mechanistic components were deciphered in U937 monocytes. In this model, the use of pharmacological modulators revealed that AR expression and nuclear NFAT5 stabilization are positively regulated by SIRT1. Overexpression and co-transfection studies of NFAT5 and SIRT1 further validated the contribution of SIRT1 to AR and NFAT5. The involvement of SIRT1 activity in these events was mediated via modification of the DNA binding of NFAT5 to the AR ORE region. In addition, NFAT5 and SIRT1 were shown to co-immunoprecipitate under isosmotic conditions, and this interaction was disrupted by osmotic stress. Further in silico experiments were conducted to investigate whether SIRT1 directly targets NFAT5. In this regard, certain lysine residues of NFAT5, when kept deacetylated, were found to contribute to its DNA binding, and SIRT1 was shown to directly bind K282 of NFAT5. Based on these in vitro and in silico findings, SIRT1 was identified, for the first time, as a novel positive regulator of NFAT5-dependent AR expression under osmotic stress in U937 monocytes.
Affiliation(s)
- Ahmet Can Timucin
  - Molecular Biology, Genetics and Bioengineering Program, Faculty of Engineering and Natural Sciences, Sabanci University, 34956, Orhanli, Tuzla, Istanbul, Turkey.
- Cagri Bodur
  - Molecular Biology, Genetics and Bioengineering Program, Faculty of Engineering and Natural Sciences, Sabanci University, 34956, Orhanli, Tuzla, Istanbul, Turkey.
- Huveyda Basaga
  - Molecular Biology, Genetics and Bioengineering Program, Faculty of Engineering and Natural Sciences, Sabanci University, 34956, Orhanli, Tuzla, Istanbul, Turkey.
13
Xu HD, Shi SP, Chen X, Qiu JD. Systematic Analysis of the Genetic Variability That Impacts SUMO Conjugation and Their Involvement in Human Diseases. Sci Rep 2015; 5:10900. [PMID: 26154679 PMCID: PMC4495600 DOI: 10.1038/srep10900] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 11/06/2014] [Accepted: 05/05/2015] [Indexed: 12/12/2022] Open
Abstract
Protein function has been observed to rely on select essential sites instead of requiring all sites to be indispensable. Small ubiquitin-related modifier (SUMO) conjugation, or sumoylation, is a highly dynamic, reversible process whose outcomes are extremely diverse, ranging from changes in localization to altered activity and, in some cases, stability of the modified protein, and it has been shown to be especially valuable in cellular biology. Motivated by the significance of SUMO conjugation in biological processes, we report here on the first exploratory assessment of whether sumoylation-related genetic variability impacts protein functions as well as the occurrence of diseases related to SUMO. Here, we defined the SUMOAMVR as sumoylation-related amino acid variations that affect sumoylation sites or enzymes involved in the process of connectivity, and categorized four types of potential SUMOAMVRs. We detected that 17.13% of amino acid variations are potential SUMOAMVRs and 4.83% of disease mutations could lead to SUMOAMVR with our system. More interestingly, the statistical analysis demonstrates that the amino acid variations that directly create new potential lysine sumoylation sites are more likely to cause diseases. It can be anticipated that our method can provide more instructive guidance to identify the mechanisms of genetic diseases.
Affiliation(s)
- Hao-Dong Xu
- Department of Chemistry, Nanchang University, Nanchang 330031, P.R. China
- Shao-Ping Shi
- Department of Mathematics, Nanchang University, Nanchang 330031, P.R. China
- Xiang Chen
- Department of Chemistry, Nanchang University, Nanchang 330031, P.R. China
- Jian-Ding Qiu
- [1] Department of Chemistry, Nanchang University, Nanchang 330031, P.R. China [2] Department of Materials and Chemical Engineering, Pingxiang College, Pingxiang 337055, P.R. China
|
14
|
Tissue-specific sequence and structural environments of lysine acetylation sites. J Struct Biol 2015; 191:39-48. [DOI: 10.1016/j.jsb.2015.06.001] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 01/28/2015] [Revised: 05/29/2015] [Accepted: 06/01/2015] [Indexed: 11/22/2022]
|
15
|
Cesaro L, Pinna LA, Salvi M. A Comparative Analysis and Review of lysyl Residues Affected by Posttranslational Modifications. Curr Genomics 2015; 16:128-38. [PMID: 26085811 PMCID: PMC4467303 DOI: 10.2174/1389202916666150216221038] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Key Words] [Received: 12/23/2014] [Revised: 02/09/2015] [Accepted: 02/10/2015] [Indexed: 11/22/2022] Open
Abstract
Post-translational modification is the most common mechanism of regulating protein function. If phosphorylation is considered a key event in many signal transduction pathways, other modifications must be considered as well. In particular, the side chain of lysine residues is a target of different modifications, notably acetylation, methylation, ubiquitylation, sumoylation, neddylation, etc. Mass spectrometry approaches combining highly sensitive instruments and specific enrichment strategies have enabled the identification of modified sites on a large scale. Here we make a comparative analysis of the most representative lysine modifications (ubiquitylation, acetylation, sumoylation and methylation) identified in the human proteome. This review focuses on conserved amino acids, secondary structure preference, subcellular localization of modified proteins, and signaling pathways where these modifications are implicated. We discuss specific differences and similarities between these modifications, characteristics of the crosstalk among lysine post-translational modifications, and single nucleotide polymorphisms that could influence lysine post-translational modifications in humans.
Affiliation(s)
- Luca Cesaro
- Department of Biomedical Sciences, University of Padova, Via U. Bassi 58/B, Padova, Italy
- Lorenzo A Pinna
- Department of Biomedical Sciences, University of Padova, Via U. Bassi 58/B, Padova, Italy; Institute of Neurosciences, V.le G. Colombo 3, Padova, Italy
- Mauro Salvi
- Department of Biomedical Sciences, University of Padova, Via U. Bassi 58/B, Padova, Italy
|
16
|
Shi SP, Xu HD, Wen PP, Qiu JD. Progress and challenges in predicting protein methylation sites. Mol Biosyst 2015; 11:2610-9. [DOI: 10.1039/c5mb00259a] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Indexed: 11/21/2022]
Abstract
We review the progress in the prediction of protein methylation sites in the past 10 years and discuss the challenges that are faced while developing novel predictors in the future.
Affiliation(s)
- Shao-Ping Shi
- Department of Chemistry, Nanchang University, Nanchang, China; Department of Mathematics
- Hao-Dong Xu
- Department of Chemistry, Nanchang University, Nanchang, China
- Ping-Ping Wen
- Department of Chemistry, Nanchang University, Nanchang, China
- Jian-Ding Qiu
- Department of Chemistry, Nanchang University, Nanchang, China
|
17
|
Accurate in silico identification of species-specific acetylation sites by integrating protein sequence-derived and functional features. Sci Rep 2014; 4:5765. [PMID: 25042424 PMCID: PMC4104576 DOI: 10.1038/srep05765] [Citation(s) in RCA: 66] [Impact Index Per Article: 6.6] [Received: 02/24/2014] [Accepted: 07/03/2014] [Indexed: 11/08/2022] Open
Abstract
Lysine acetylation is a reversible post-translational modification that plays an important role in cytokine signaling, transcriptional regulation, and apoptosis. To fully understand acetylation mechanisms, identification of substrates and specific acetylation sites is crucial. Experimental identification is often time-consuming and expensive. Alternative bioinformatics methods are cost-effective and can be used in a high-throughput manner to generate relatively precise predictions. Here we develop a method termed SSPKA for species-specific lysine acetylation prediction, using random forest classifiers that combine sequence-derived and functional features with two-step feature selection. Feature importance analysis indicates that functional features, applied to lysine acetylation site prediction for the first time, significantly improve the predictive performance. We apply the SSPKA model to screen the entire human proteome and identify many high-confidence putative substrates that were not previously identified. The results, along with the implemented Java tool, serve as useful resources to elucidate the mechanism of lysine acetylation and facilitate hypothesis-driven experimental design and validation.
|