1
Yang Q, Xu S, Jiang W, Meng F, Wang S, Sun Z, Chen N, Peng D, Liu J, Xing S. Systematic qualitative proteome-wide analysis of lysine malonylation profiling in Platycodon grandiflorus. Amino Acids 2025; 57:9. [PMID: 39812870 PMCID: PMC11735498 DOI: 10.1007/s00726-024-03432-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/11/2024] [Accepted: 11/25/2024] [Indexed: 01/16/2025]
Abstract
In recent years, lysine malonylation has been found to affect biological metabolism and to play an important role in plant life activities. For Platycodon grandiflorus, an economic crop and medicinal plant, malonylation had not previously been reported in the literature. This study qualitatively characterizes lysine malonylation in P. grandiflorus. A total of 888 lysine-malonylated proteins, carrying 1755 modification sites, were identified. According to the functional annotation, malonylated proteins were closely related to catalysis, binding, and other reactions. Subcellular localization showed that the modified proteins were enriched in chloroplasts, cytoplasm, and nuclei, indicating that this modification could regulate various metabolic processes. Motif analysis showed enrichment of alanine (A), cysteine (C), glycine (G), and valine (V) residues surrounding malonylated lysine residues. Metabolic pathway and protein-protein interaction network analyses suggested that these modifications are mainly involved in plant photosynthesis. Moreover, malonylated proteins are also involved in stress and defense responses. This study shows that lysine malonylation can affect a variety of biological processes and metabolic pathways. These findings are reported for the first time in P. grandiflorus and provide important information for further research on P. grandiflorus and on the role of lysine malonylation in environmental stress, photosynthesis, and secondary metabolite enrichment.
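The motif analysis mentioned in this abstract starts from sequence windows centered on each modified lysine. The following is a minimal sketch of that first step only, not the authors' pipeline: the function names, the toy sequence, the padding character, and the default flank width of 10 are all assumptions made for illustration.

```python
from collections import Counter

def flanking_window(seq, pos, flank=10, pad="-"):
    """Return the window seq[pos-flank .. pos+flank] (0-based pos), padded at the ends."""
    left = seq[max(0, pos - flank):pos]
    right = seq[pos + 1:pos + 1 + flank]
    return pad * (flank - len(left)) + left + seq[pos] + right + pad * (flank - len(right))

def flank_counts(seq, sites, flank=10):
    """Count residues flanking the given sites, skipping the central residue and padding."""
    counts = Counter()
    for pos in sites:
        window = flanking_window(seq, pos, flank)
        counts.update(c for i, c in enumerate(window) if i != flank and c != "-")
    return counts

# Toy sequence (invented): the lysine at index 3 is treated as the modified site.
seq = "MAGKVCA"
window = flanking_window(seq, 3, flank=2)
counts = flank_counts(seq, [3], flank=2)
```

A real motif analysis would compare these counts against a background of unmodified lysines; this sketch stops at collecting the windows.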
Affiliation(s)
- Qingshan Yang
- College of Pharmacy, Anhui University of Chinese Medicine, Hefei, 230012, China
- Shaowei Xu
- College of Pharmacy, Anhui University of Chinese Medicine, Hefei, 230012, China
- Weimin Jiang
- Hunan Key Laboratory for Conservation and Utilization of Biological Resources in the Nanyue Mountainous Region, College of Life Sciences and Environment, Hengyang Normal University, Hengyang, 421008, Hunan, China
- Fei Meng
- College of Pharmacy, Anhui University of Chinese Medicine, Hefei, 230012, China
- Institute of Traditional Chinese Medicine Resources Protection and Development, Anhui Academy of Chinese Medicine, Hefei, 230012, China
- Shuting Wang
- College of Pharmacy, Anhui University of Chinese Medicine, Hefei, 230012, China
- Zongping Sun
- Engineering Technology Research Center of Anti-Aging, Chinese Herbal Medicine, Fuyang Normal University, Fuyang, 236037, China
- Na Chen
- Joint Research Center for Chinese Herbal Medicine of Anhui of IHM, Hefei Comprehensive National Science Center, Bozhou, 236814, China
- Daiyin Peng
- College of Pharmacy, Anhui University of Chinese Medicine, Hefei, 230012, China
- Institute of Traditional Chinese Medicine Resources Protection and Development, Anhui Academy of Chinese Medicine, Hefei, 230012, China
- MOE-Anhui Joint Collaborative Innovation Center for Quality Improvement of Anhui Genuine Chinese Medicinal Materials, Hefei, 230038, China
- Juan Liu
- College of Pharmacy, Anhui University of Chinese Medicine, Hefei, 230012, China.
- Shihai Xing
- College of Pharmacy, Anhui University of Chinese Medicine, Hefei, 230012, China.
- Institute of Traditional Chinese Medicine Resources Protection and Development, Anhui Academy of Chinese Medicine, Hefei, 230012, China.
- Joint Research Center for Chinese Herbal Medicine of Anhui of IHM, Hefei Comprehensive National Science Center, Bozhou, 236814, China.
2
Guo C, Chen Y, Ma C, Hao S, Song J. A Survey on AI-Driven Mouse Behavior Analysis Applications and Solutions. Bioengineering (Basel) 2024; 11:1121. [PMID: 39593781 PMCID: PMC11591614 DOI: 10.3390/bioengineering11111121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2024] [Revised: 10/30/2024] [Accepted: 11/04/2024] [Indexed: 11/28/2024]
Abstract
The physiological similarities between mice and humans make them vital animal models in biological and medical research. This paper explores the application of artificial intelligence (AI) in analyzing mice behavior, emphasizing AI's potential to identify and classify these behaviors. Traditional methods struggle to capture subtle behavioral features, whereas AI can automatically extract quantitative features from large datasets. Consequently, this study aims to leverage AI to enhance the efficiency and accuracy of mice behavior analysis. The paper reviews various applications of mice behavior analysis, categorizes deep learning tasks based on an AI pyramid, and summarizes AI methods for addressing these tasks. The findings indicate that AI technologies are increasingly applied in mice behavior analysis, including disease detection, assessment of external stimuli effects, social behavior analysis, and neurobehavioral assessment. The selection of AI methods is crucial and must align with specific applications. Despite AI's promising potential in mice behavior analysis, challenges such as insufficient datasets and benchmarks remain. Furthermore, there is a need for a more integrated AI platform, along with standardized datasets and benchmarks, to support these analyses and further advance AI-driven mice behavior analysis.
Affiliation(s)
- Chaopeng Guo
- Software College, Northeastern University, Shenyang 110169, China
- Yuming Chen
- Software College, Northeastern University, Shenyang 110169, China
- Chengxia Ma
- College of Life and Health Sciences, Northeastern University, Shenyang 110169, China
- Shuang Hao
- College of Life and Health Sciences, Northeastern University, Shenyang 110169, China
- Jie Song
- Software College, Northeastern University, Shenyang 110169, China
3
Ramazi S, Tabatabaei SAH, Khalili E, Nia AG, Motarjem K. Analysis and review of techniques and tools based on machine learning and deep learning for prediction of lysine malonylation sites in protein sequences. Database (Oxford) 2024; 2024:baad094. [PMID: 38245002 PMCID: PMC10799748 DOI: 10.1093/database/baad094] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/17/2023] [Revised: 11/30/2023] [Accepted: 12/20/2023] [Indexed: 01/22/2024]
Abstract
Post-translational modifications are crucial molecular mechanisms that regulate diverse cellular processes. Malonylation of proteins, a reversible post-translational modification of lysine (K) residues, is linked to a variety of biological functions, such as cellular regulation and pathogenesis. This modification plays a crucial role in metabolic pathways, mitochondrial functions, fatty acid oxidation and other life processes. Accurately identifying malonylation sites is crucial to understanding the molecular mechanism of malonylation, but experimental identification can be a challenging and costly task. Recently, approaches based on machine learning (ML) have been suggested to address this issue. It has been demonstrated that these procedures improve accuracy while lowering costs and time constraints. However, these approaches also have specific shortcomings, including inappropriate feature extraction from protein sequences, high-dimensional features and inefficient underlying classifiers. As a result, there is an urgent need for effective predictors and calculation methods. In this study, we provide a comprehensive analysis and review of existing prediction models, tools and benchmark datasets for predicting malonylation sites in protein sequences, followed by a comparison study. The review covers the specifications of benchmark datasets, an explanation of features and encoding methods, descriptions of the prediction approaches and their embedded ML or deep learning models, and a description and comparison of the existing tools in this domain. To evaluate and compare the prediction capability of the tools, a new dataset was extracted from the most recently updated database and the tools were assessed on it. Finally, a hybrid architecture consisting of several classifiers, including classical ML models and a deep learning model, is proposed to ensemble the prediction results. This approach demonstrates better performance than all prediction tools included in this study (the source codes of the models presented in this manuscript are available in https://github.com/Malonylation). Database URL: https://github.com/A-Golshan/Malonylation.
Affiliation(s)
- Seyed Amir Hossein Tabatabaei
- Department of Computer Science, Faculty of Mathematical Sciences, University of Guilan, Namjoo St. Postal, Rasht 41938-33697, Iran
- Department of Biophysics, Faculty of Biological Sciences, Tarbiat Modares University, Jalal AleAhmad, Tehran 14117-13116, Iran
- Elham Khalili
- Department of Plant Sciences, Faculty of Science, Tarbiat Modares University, Jalal AleAhmad, Tehran 14117-13116, Iran
- Amirhossein Golshan Nia
- Department of Mathematics and Computer Science, Amirkabir University of Technology, No. 350, Hafez Ave, Tehran 15916-34311, Iran
- Kiomars Motarjem
- Department of Statistics, Faculty of Mathematical Sciences, Tarbiat Modares University, Jalal AleAhmad, Tehran 14117-13116, Iran
4
Smith BJ, Brandão-Teles C, Zuccoli GS, Reis-de-Oliveira G, Fioramonte M, Saia-Cereda VM, Martins-de-Souza D. Protein Succinylation and Malonylation as Potential Biomarkers in Schizophrenia. J Pers Med 2022; 12:jpm12091408. [PMID: 36143193 PMCID: PMC9500613 DOI: 10.3390/jpm12091408] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2022] [Revised: 08/24/2022] [Accepted: 08/27/2022] [Indexed: 11/16/2022]
Abstract
Two protein post-translational modifications, lysine succinylation and malonylation, are implicated in protein regulation, glycolysis, and energy metabolism. The precursors of these modifications, succinyl-CoA and malonyl-CoA, are key players in central metabolic processes. Both modification profiles have been proven to be responsive to metabolic stimuli, such as hypoxia. As mitochondrial dysfunction and metabolic dysregulation are implicated in schizophrenia and other psychiatric illnesses, these modification profiles have the potential to reveal yet another layer of protein regulation and can furthermore represent targets for biomarkers that are indicative of disease as well as its progression and treatment. In this work, data from shotgun mass spectrometry-based quantitative proteomics were compiled and analyzed to probe the succinylome and malonylome of postmortem brain tissue from patients with schizophrenia against controls and the human oligodendrocyte precursor cell line MO3.13 with the dizocilpine chemical model for schizophrenia, three antipsychotics, and co-treatments. Several changes in the succinylome and malonylome were seen in these comparisons, revealing these modifications to be a largely under-studied yet important form of protein regulation with broad potential applications.
Affiliation(s)
- Bradley Joseph Smith
- Laboratory of Neuroproteomics, Institute of Biology, Department of Biochemistry and Tissue Biology, University of Campinas, Campinas 13083-862, Brazil
- Correspondence: (B.J.S.); (D.M.-d.-S.); Tel.: +55-(19)-3521-6129 (D.M.-d.-S.)
- Caroline Brandão-Teles
- Laboratory of Neuroproteomics, Institute of Biology, Department of Biochemistry and Tissue Biology, University of Campinas, Campinas 13083-862, Brazil
- Giuliana S. Zuccoli
- Laboratory of Neuroproteomics, Institute of Biology, Department of Biochemistry and Tissue Biology, University of Campinas, Campinas 13083-862, Brazil
- Guilherme Reis-de-Oliveira
- Laboratory of Neuroproteomics, Institute of Biology, Department of Biochemistry and Tissue Biology, University of Campinas, Campinas 13083-862, Brazil
- Mariana Fioramonte
- Laboratory of Neuroproteomics, Institute of Biology, Department of Biochemistry and Tissue Biology, University of Campinas, Campinas 13083-862, Brazil
- Verônica M. Saia-Cereda
- Laboratory of Neuroproteomics, Institute of Biology, Department of Biochemistry and Tissue Biology, University of Campinas, Campinas 13083-862, Brazil
- Daniel Martins-de-Souza
- Laboratory of Neuroproteomics, Institute of Biology, Department of Biochemistry and Tissue Biology, University of Campinas, Campinas 13083-862, Brazil
- Instituto Nacional de Biomarcadores em Neuropsiquiatria (INBION), Conselho Nacional de Desenvolvimento Científico e Tecnológico, São Paulo 05403-000, Brazil
- Experimental Medicine Research Cluster (EMRC), University of Campinas, Campinas 13083-862, Brazil
- D’Or Institute for Research and Education (IDOR), São Paulo 04501-000, Brazil
- Correspondence: (B.J.S.); (D.M.-d.-S.); Tel.: +55-(19)-3521-6129 (D.M.-d.-S.)
5
Wu LF, Wang DP, Shen J, Gao LJ, Zhou Y, Liu QH, Cao JM. Global profiling of protein lysine malonylation in mouse cardiac hypertrophy. J Proteomics 2022; 266:104667. [PMID: 35788409 DOI: 10.1016/j.jprot.2022.104667] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 03/25/2022] [Revised: 06/06/2022] [Accepted: 06/19/2022] [Indexed: 10/17/2022]
Abstract
Lysine malonylation, a newly identified protein posttranslational modification (PTM), is conserved and present in both eukaryotic and prokaryotic cells. Previous studies have reported that malonylation plays an important role in inflammation, angiogenesis, and diabetes. However, its potential role in cardiac remodeling remains unknown. Here, we observed reduced lysine malonylation in hypertrophic mouse hearts created by transverse aortic constriction (TAC) for 8 weeks. We also detected decreased lysine malonylation in hypertrophic H9C2 cardiomyocytes induced by angiotensin II for 48 h. Using a proteomic method based on affinity purification and LC-MS/MS, we identified a total of 679 malonylated sites in 330 proteins in the hearts of sham mice and TAC mice. Bioinformatic analysis of the proteomic data revealed enrichment of malonylated proteins involved in cardiac structure and contraction, the cGMP-PKG pathway, and metabolism. Specifically, we detected decreased lysine malonylation in myocardial isocitrate dehydrogenase 2 (IDH2) by immunoprecipitation coupled with Western blotting both in vivo and in vitro. Together, our work suggests an important role and implication of protein lysine malonylation in cardiac hypertrophy, especially for IDH2. SIGNIFICANCE: Heart failure is the terminal stage of cardiac hypertrophy, which imposes an enormous clinical and economic burden worldwide. Despite our knowledge of the pathophysiology of the disease, current therapeutic approaches are still largely limited. Cardiac hypertrophy can be regulated by post-translational modifications (PTMs), and several PTMs have been reported in cardiac hypertrophy and heart failure. In our study, we report for the first time a novel PTM, lysine malonylation, in cardiac hypertrophy. We found reduced lysine malonylation in hypertrophic mouse hearts in vivo and in H9C2 cardiomyocytes after stimulation with angiotensin II for 48 h in vitro. Using affinity purification and LC-MS/MS, we identified 679 malonylated sites in 330 proteins in the hearts of sham and TAC mice. Compared to the sham group, 5 sites in 2 proteins were quantified as downregulated targets using a 2-fold threshold (downregulation <0.5-fold, P < 0.05). Functional analysis showed significant enrichment in cardiac structure and contraction, the cGMP-PKG pathway and metabolism. Notably, we identified a decreased Kmal level in isocitrate dehydrogenase 2 (IDH2), whereas the protein level of IDH2 was unchanged in cardiac hypertrophy. These results highlight that lysine malonylation is associated with cardiac hypertrophy and may be a new therapeutic target for the disease.
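The downregulation criterion quoted in this abstract (ratio below 0.5-fold with P < 0.05) can be written as a simple filter. The sketch below is illustrative only: the record layout, the site identifiers, and all numbers are invented, and it does not reproduce the authors' quantification workflow.

```python
def downregulated_sites(sites, ratio_cutoff=0.5, p_cutoff=0.05):
    """Keep sites whose TAC/sham ratio is below ratio_cutoff and whose p-value is below p_cutoff."""
    return [s["site"] for s in sites if s["ratio"] < ratio_cutoff and s["p"] < p_cutoff]

# Hypothetical quantification records; identifiers and values are invented.
example = [
    {"site": "protA_K10", "ratio": 0.42, "p": 0.01},  # passes both cutoffs
    {"site": "protA_K55", "ratio": 0.48, "p": 0.20},  # ratio passes, p does not
    {"site": "protB_K7", "ratio": 0.90, "p": 0.03},   # p passes, ratio does not
    {"site": "protB_K31", "ratio": 0.30, "p": 0.04},  # passes both cutoffs
]
hits = downregulated_sites(example)
```

Both conditions must hold at once, which is why a site with a large fold change but a non-significant p-value is excluded.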
Affiliation(s)
- Li-Fei Wu
- Key Laboratory of Cellular Physiology at Shanxi Medical University, Ministry of Education, and the Department of Physiology, Shanxi Medical University, Taiyuan, China; Department of Pathophysiology, Shanxi Medical University, Taiyuan, China
- De-Ping Wang
- Key Laboratory of Cellular Physiology at Shanxi Medical University, Ministry of Education, and the Department of Physiology, Shanxi Medical University, Taiyuan, China
- Jing Shen
- Key Laboratory of Cellular Physiology at Shanxi Medical University, Ministry of Education, and the Department of Physiology, Shanxi Medical University, Taiyuan, China
- Li-Juan Gao
- Key Laboratory of Cellular Physiology at Shanxi Medical University, Ministry of Education, and the Department of Physiology, Shanxi Medical University, Taiyuan, China
- Ying Zhou
- Key Laboratory of Cellular Physiology at Shanxi Medical University, Ministry of Education, and the Department of Physiology, Shanxi Medical University, Taiyuan, China
- Qing-Hua Liu
- Key Laboratory of Cellular Physiology at Shanxi Medical University, Ministry of Education, and the Department of Physiology, Shanxi Medical University, Taiyuan, China; Department of Pathophysiology, Shanxi Medical University, Taiyuan, China
- Ji-Min Cao
- Key Laboratory of Cellular Physiology at Shanxi Medical University, Ministry of Education, and the Department of Physiology, Shanxi Medical University, Taiyuan, China.
6
Deep Learning-Based Advances In Protein Posttranslational Modification Site and Protein Cleavage Prediction. METHODS IN MOLECULAR BIOLOGY (CLIFTON, N.J.) 2022; 2499:285-322. [PMID: 35696087 DOI: 10.1007/978-1-0716-2317-6_15] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 10/18/2022]
Abstract
Posttranslational modification (PTM) is a ubiquitous phenomenon in both eukaryotes and prokaryotes which gives rise to enormous proteomic diversity. PTM mostly comes in two flavors: covalent modification of the polypeptide chain and proteolytic cleavage. Understanding and characterizing PTMs is a fundamental step toward understanding the underpinnings of biology. Recent advances in experimental approaches, mainly mass-spectrometry-based approaches, have immensely helped in obtaining and characterizing PTMs. However, experimental approaches alone are not enough to understand and characterize more than 450 different types of PTMs, and complementary computational approaches are becoming popular. Recently, due to the various advancements in the field of deep learning (DL), along with the explosion of applications of DL to various fields, the field of computational prediction of PTMs has also witnessed the development of a plethora of DL-based approaches. In this book chapter, we first review some recent DL-based approaches in the field of PTM site prediction. In addition, we also review the recent advances in a less-studied PTM, that is, proteolytic cleavage prediction. We describe advances in PTM prediction by highlighting the deep learning architecture, feature encoding, novelty of the approaches, and availability of the tools/approaches. Finally, we provide an outlook and possible future research directions for DL-based approaches for PTM prediction.
7
Hu B, Li D, Zeng Z, Zhang Z, Cao R, Dong X, Yun C, Li L, Krämer B, Morgera S, Hocher B, Tang D, Yin L, Dai Y. Integrated proteome and malonylome analyses reveal the neutrophil extracellular trap formation pathway in rheumatoid arthritis. J Proteomics 2022; 262:104597. [PMID: 35489682 DOI: 10.1016/j.jprot.2022.104597] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/21/2021] [Revised: 04/11/2022] [Accepted: 04/11/2022] [Indexed: 12/09/2022]
Abstract
Rheumatoid arthritis (RA) is an autoimmune inflammatory disease of unknown etiology in which the posttranslational modifications (PTMs) of proteins play an important role. PTMs, such as those involved in the formation of neutrophil extracellular traps (NETs), have been well studied. The excessive formation and release of NETs can mediate inflammation and joint destruction in RA. It has been gradually recognized that lysine malonylation (Kmal) can regulate some biological processes in some prokaryotes and eukaryotes. However, less is known about the role of Kmal in RA. We therefore performed proteome and malonylome analyses to explore the proteomic characteristics of the peripheral blood mononuclear cells from 36 RA patients and 82 healthy subjects. In total, 938 differentially expressed proteins (DEPs) and 42 differentially malonylated proteins (DMPs) with 55 Kmal sites were detected through a liquid chromatography-tandem mass spectrometry (LC-MS/MS)-based analysis. Functional analysis showed that two DEPs with four malonylated sites and one DMP with a malonylated site were identified in the neutrophil extracellular trap formation (NETosis) pathway. Altogether, this study not only describes the characteristics of the malonylome in RA for the first time, but it also reveals that malonylation may be involved in the NETosis pathway. SIGNIFICANCE: This is the first report that reveals the proteomic features of Kmal in RA through a LC-MS/MS-based method. In this study, we found that several key DMPs were associated with the NETosis pathway, which contributes to the development of RA. The present results provide an informative dataset for the future exploration of Kmal in RA.
Affiliation(s)
- Biying Hu
- The First Affiliated Hospital, Jinan University, Guangzhou, Guangdong 510632, China
- Dandan Li
- The First Affiliated Hospital, Jinan University, Guangzhou, Guangdong 510632, China; Department of Clinical Medical Research Center, The Second Clinical Medical College of Jinan University (Shenzhen People's Hospital), Shenzhen, Guangdong 518020, China
- Zhipeng Zeng
- Department of Clinical Medical Research Center, The Second Clinical Medical College of Jinan University (Shenzhen People's Hospital), Shenzhen, Guangdong 518020, China
- Zeyu Zhang
- The First Affiliated Hospital, Jinan University, Guangzhou, Guangdong 510632, China
- Rui Cao
- The First Affiliated Hospital, Jinan University, Guangzhou, Guangdong 510632, China
- XiangNan Dong
- The First Affiliated Hospital, Jinan University, Guangzhou, Guangdong 510632, China
- Chen Yun
- Guangzhou Enttxs Medical Products Co., Ltd. P.R. Guangzhou, Guangdong, 510663, China; Charité-Universitätsmedizin Berlin, Campus Mitte, Berlin, Germany
- Ling Li
- Hospital of South China Agricultural University, Guangzhou, Guangdong 510642, China
- Bernhard Krämer
- Department of Medicine Nephrology, Medical Faculty Mannheim Heidelberg University, 68167 Mannheim, Germany
- Berthold Hocher
- Department of Medicine Nephrology, Medical Faculty Mannheim Heidelberg University, 68167 Mannheim, Germany; Charité-Universitätsmedizin Berlin, Campus Mitte, Berlin, Germany
- Donge Tang
- Department of Clinical Medical Research Center, The Second Clinical Medical College of Jinan University (Shenzhen People's Hospital), Shenzhen, Guangdong 518020, China.
- Lianghong Yin
- The First Affiliated Hospital, Jinan University, Guangzhou, Guangdong 510632, China; Huangpu Institute of Materials, Guangzhou, Guangdong, 510663, China.
- Yong Dai
- Department of Clinical Medical Research Center, The Second Clinical Medical College of Jinan University (Shenzhen People's Hospital), Shenzhen, Guangdong 518020, China.
8
Sorkhi AG, Pirgazi J, Ghasemi V. A hybrid feature extraction scheme for efficient malonylation site prediction. Sci Rep 2022; 12:5756. [PMID: 35388017 PMCID: PMC8987080 DOI: 10.1038/s41598-022-08555-9] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/19/2021] [Accepted: 03/07/2022] [Indexed: 11/09/2022]
Abstract
Lysine malonylation is one of the most important post-translational modifications (PTMs). It affects the functionality of cells. Malonylation site prediction in proteins can help uncover the mechanisms of cellular functions. Experimental methods are one way to identify these sites, but they are typically costly and time-consuming to implement. Recently, methods based on machine-learning solutions have been proposed to tackle this problem. Such practices have been shown to reduce costs and time complexities and increase accuracy. However, these approaches also have specific shortcomings, including inappropriate feature extraction from protein sequences, high-dimensional features, and inefficient underlying classifiers. A machine learning-based method is proposed in this paper to cope with these problems. In the proposed approach, seven different features are extracted. Then, the extracted features are combined, ranked based on the Fisher's score (F-score), and the most efficient ones are selected. Afterward, malonylation sites are predicted using various classifiers. Simulation results show that the proposed method has acceptable performance compared with some state-of-the-art approaches. In addition, the XGBOOST classifier, built on extracted features such as TFCRF, has a higher prediction rate than the other methods. The codes are publicly available at: https://github.com/jimy2020/Malonylation-site-prediction.
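The ranking step described in this abstract can be illustrated with a small Fisher-score calculation. This is a sketch under stated assumptions, not the code from the linked repository: it uses one common definition of the Fisher score (between-class scatter divided by within-class scatter, per feature), and the function names and toy data are invented.

```python
from statistics import mean, pvariance

def fisher_score(values, labels):
    """Fisher score of one feature: between-class scatter over within-class scatter."""
    overall = mean(values)
    between = 0.0
    within = 0.0
    for cls in set(labels):
        group = [v for v, y in zip(values, labels) if y == cls]
        between += len(group) * (mean(group) - overall) ** 2
        within += len(group) * pvariance(group)
    # A feature with zero within-class scatter separates the classes perfectly.
    return between / within if within > 0 else float("inf")

def rank_features(rows, labels):
    """Return (feature indices sorted from highest to lowest score, per-feature scores)."""
    n_features = len(rows[0])
    scores = [fisher_score([r[j] for r in rows], labels) for j in range(n_features)]
    order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return order, scores

# Toy data (invented): feature 0 separates the two classes, feature 1 does not.
rows = [[1.0, 5.0], [1.2, 3.0], [0.8, 4.0], [3.0, 4.0], [3.2, 5.0], [2.8, 3.0]]
labels = [0, 0, 0, 1, 1, 1]
order, scores = rank_features(rows, labels)
```

After ranking, a selection step would keep only the top-scoring features before training a classifier; how many to keep is a tuning choice not specified here.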
Affiliation(s)
- Ali Ghanbari Sorkhi
- Department of Computer Engineering, University of Science and Technology of Mazandaran, Behshahr, Iran
- Jamshid Pirgazi
- Department of Computer Engineering, University of Science and Technology of Mazandaran, Behshahr, Iran.
- Vahid Ghasemi
- Department of Computer Engineering, Faculty of Information Technology, Kermanshah University of Technology, Kermanshah, Iran
9
Wang M, Song L, Zhang Y, Gao H, Yan L, Yu B. Malsite-Deep: Prediction of protein malonylation sites through deep learning and multi-information fusion based on NearMiss-2 strategy. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.108191] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 01/04/2023]
10
Taherzadeh G, Campbell M, Zhou Y. Computational Prediction of N- and O-Linked Glycosylation Sites for Human and Mouse Proteins. Methods Mol Biol 2022; 2499:177-186. [PMID: 35696081 DOI: 10.1007/978-1-0716-2317-6_9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
Protein glycosylation is one of the most complex posttranslational modifications (PTMs) and plays a fundamental role in protein function. Identification and annotation of glycosylation sites using experimental approaches are challenging and time consuming. Hence, there is a demand to build fast and efficient computational methods to address this problem. Here, we present the SPRINT-Gly framework containing the largest dataset and a prediction model of glycosylation sites for a given protein sequence. In this framework, we construct a large dataset containing N- and O-linked glycosylation sites of human and mouse proteins, collected from different sources. We then introduce the SPRINT-Gly method to predict putative N- and O-linked sites. SPRINT-Gly is a machine learning-based approach consisting of a number of trained predictive models for glycosylation sites in human and mouse proteins, separately. The method is built by incorporating sequence-based, predicted structural, and physicochemical information of the neighboring residues of each N- and O-linked glycosylation site and by training deep learning neural network and support vector machine classifiers. SPRINT-Gly outperformed other existing methods by achieving 18% and 50% higher Matthews correlation coefficient for N- and O-linked glycosylation site prediction, respectively. SPRINT-Gly is publicly available as an online and stand-alone predictor at https://sparks-lab.org/server/sprint-gly/.
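The Matthews correlation coefficient (MCC) used as the comparison metric in this abstract is computed from the four cells of a binary confusion matrix. The following minimal sketch shows the standard formula; the function name and the example counts are invented, and returning 0.0 when the denominator is zero is a common convention assumed here, not something stated in the source.

```python
import math

def matthews_corrcoef(tp, fp, tn, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention assumed here: report 0.0 when any marginal total is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

perfect = matthews_corrcoef(tp=50, fp=0, tn=50, fn=0)
inverted = matthews_corrcoef(tp=0, fp=50, tn=0, fn=50)
chance = matthews_corrcoef(tp=25, fp=25, tn=25, fn=25)
```

MCC ranges from -1 to 1 and remains informative on imbalanced data, which is typical for site prediction because unmodified residues far outnumber modified ones.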
Affiliation(s)
- Ghazaleh Taherzadeh
- Department of Mathematics and Computer Science, Wilkes University, Wilkes-Barre, PA, USA.
- Matthew Campbell
- Institute for Glycomics, Griffith University, Southport, QLD, Australia
- Yaoqi Zhou
- Institute for Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen, China
11
Lv H, Zhang Y, Wang JS, Yuan SS, Sun ZJ, Dao FY, Guan ZX, Lin H, Deng KJ. iRice-MS: An integrated XGBoost model for detecting multitype post-translational modification sites in rice. Brief Bioinform 2021; 23:6447435. [PMID: 34864888 DOI: 10.1093/bib/bbab486] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 08/12/2021] [Revised: 10/05/2021] [Accepted: 10/23/2021] [Indexed: 12/13/2022]
Abstract
Post-translational modification (PTM) refers to the covalent and enzymatic modification of proteins after protein biosynthesis, which orchestrates a variety of biological processes. Detecting PTM sites at proteome scale is a key step toward an in-depth understanding of their regulatory mechanisms. In this study, we present an integrated method based on eXtreme Gradient Boosting (XGBoost), called iRice-MS, to identify 2-hydroxyisobutyrylation, crotonylation, malonylation, ubiquitination, succinylation and acetylation in rice. For each PTM-specific model, we adopted eight feature encoding schemes, including sequence-based features, physicochemical property-based features and spatial mapping information-based features. The optimal feature set was identified from each encoding, and their respective models were established. Extensive experimental results show that iRice-MS consistently displays excellent performance on 5-fold cross-validation and independent dataset tests. In addition, our approach outperforms other existing tools in terms of AUC value. Based on the proposed model, a web server named iRice-MS was established and is freely accessible at http://lin-group.cn/server/iRice-MS.
Affiliation(s)
- Hao Lv
- Center for Informational Biology at University of Electronic Science and Technology of China, China
- Yang Zhang
- Innovative Institute of Chinese Medicine and Pharmacy, Chengdu University of Traditional Chinese Medicine, China
- Jia-Shu Wang
- Center for Informational Biology at University of Electronic Science and Technology of China, China
- Shi-Shi Yuan
- Center for Informational Biology at University of Electronic Science and Technology of China, China
- Zi-Jie Sun
- Center for Informational Biology at University of Electronic Science and Technology of China, China
- Fu-Ying Dao
- Center for Informational Biology at University of Electronic Science and Technology of China, China
- Zheng-Xing Guan
- Center for Informational Biology at University of Electronic Science and Technology of China, China
- Hao Lin
- Center for Informational Biology at University of Electronic Science and Technology of China, China
- Ke-Jun Deng
- Center for Informational Biology at University of Electronic Science and Technology of China, China
12
|
Xu M, Tian X, Ku T, Wang G, Zhang E. Global Identification and Systematic Analysis of Lysine Malonylation in Maize (Zea mays L.). Front Plant Sci 2021; 12:728338. [PMID: 34490025 PMCID: PMC8417889 DOI: 10.3389/fpls.2021.728338] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/21/2021] [Accepted: 08/02/2021] [Indexed: 05/27/2023]
Abstract
Lysine malonylation is a type of post-translational modification (PTM) discovered in recent years that plays an important regulatory role in plants. Maize (Zea mays L.) is a major global cereal crop. Immunoblotting revealed that maize was rich in malonylated proteins. We therefore performed a qualitative malonylome analysis to globally identify malonylated proteins in maize. In total, 1,722 uniquely malonylated lysine residues were obtained in 810 proteins. The modified proteins were involved in various biological processes such as photosynthesis, ribosome function and oxidative phosphorylation. Notably, a large proportion of the modified proteins (45%) were located in the chloroplast. Further functional analysis revealed that 30 proteins in photosynthesis and 15 key enzymes in the Calvin cycle were malonylated, suggesting an indispensable regulatory role of malonylation in photosynthesis and carbon fixation. This work represents the first comprehensive survey of the malonylome in maize and provides an important resource for exploring the function of lysine malonylation in the physiological regulation of maize.
Affiliation(s)
- Min Xu
  - College of Agronomy, Qingdao Agricultural University, Qingdao, China
- Xiaomin Tian
  - College of Agronomy, Qingdao Agricultural University, Qingdao, China
- Tingting Ku
  - College of Agronomy, Qingdao Agricultural University, Qingdao, China
- Guangyuan Wang
  - Shandong Province Key Laboratory of Applied Mycology, College of Life Sciences, Qingdao Agricultural University, Qingdao, China
- Enying Zhang
  - College of Agronomy, Qingdao Agricultural University, Qingdao, China
|
13
|
Elmassry MM, Bisht K, Colmer-Hamood JA, Wakeman CA, San Francisco MJ, Hamood AN. Malonate utilization by Pseudomonas aeruginosa affects quorum-sensing and virulence and leads to formation of mineralized biofilm-like structures. Mol Microbiol 2021; 116:516-537. [PMID: 33892520 DOI: 10.1111/mmi.14729] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 03/05/2021] [Revised: 04/06/2021] [Accepted: 04/16/2021] [Indexed: 01/02/2023]
Abstract
Pseudomonas aeruginosa is an opportunistic pathogen that uses malonate among its many carbon sources. We recently reported that, when grown in blood from trauma patients, P. aeruginosa expression of malonate utilization genes was upregulated. In this study, we explored the role of malonate utilization and its contribution to P. aeruginosa virulence. We grew P. aeruginosa strain PA14 in M9 minimal medium containing malonate (MM9) or glycerol (GM9) as a sole carbon source and assessed the effect of the growth on quorum sensing, virulence factors, and antibiotic resistance. Growth of PA14 in MM9, compared to GM9, reduced the production of elastases, rhamnolipids, and pyoverdine; enhanced the production of pyocyanin and catalase; and increased its sensitivity to norfloxacin. Growth in MM9 decreased extracellular levels of N-acylhomoserine lactone autoinducers, an effect likely associated with increased pH of the culture medium; but had little effect on extracellular levels of PQS. At 18 hr of growth in MM9, PA14 formed biofilm-like structures or aggregates that were associated with biomineralization, which was related to increased pH of the culture medium. These results suggest that malonate significantly impacts P. aeruginosa pathogenesis by influencing the quorum sensing systems, the production of virulence factors, biofilm formation, and antibiotic resistance.
Affiliation(s)
- Moamen M Elmassry
  - Department of Biological Sciences, Texas Tech University, Lubbock, TX, USA
- Karishma Bisht
  - Department of Biological Sciences, Texas Tech University, Lubbock, TX, USA
- Jane A Colmer-Hamood
  - Department of Immunology and Molecular Microbiology, Texas Tech University Health Sciences Center, Lubbock, TX, USA; Department of Medical Education, Texas Tech University Health Sciences Center, Lubbock, TX, USA
- Michael J San Francisco
  - Department of Biological Sciences, Texas Tech University, Lubbock, TX, USA; Honors College, Texas Tech University, Lubbock, TX, USA
- Abdul N Hamood
  - Department of Immunology and Molecular Microbiology, Texas Tech University Health Sciences Center, Lubbock, TX, USA; Department of Surgery, Texas Tech University Health Sciences Center, Lubbock, TX, USA
|
14
|
Liu X, Wang L, Li J, Hu J, Zhang X. Mal-Prec: computational prediction of protein Malonylation sites via machine learning based feature integration: Malonylation site prediction. BMC Genomics 2020; 21:812. [PMID: 33225896 PMCID: PMC7682087 DOI: 10.1186/s12864-020-07166-w] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 04/23/2020] [Accepted: 10/20/2020] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Malonylation is a recently discovered post-translational modification that is associated with a variety of diseases such as Type 2 Diabetes Mellitus and different types of cancers. Compared with experimental identification of malonylation sites, computational methods are time-efficient and comparatively low in cost. RESULTS: In this study, we proposed a novel computational model called Mal-Prec (Malonylation Prediction) for malonylation site prediction through the combination of Principal Component Analysis (PCA) and Support Vector Machine (SVM). One-hot encoding, physicochemical properties, and composition of k-spaced amino acid pairs were first used to extract sequence features. PCA was then applied to select optimal feature subsets while SVM was adopted to predict malonylation sites. Five-fold cross-validation results showed that Mal-Prec can achieve better prediction performance compared with other approaches. The AUC (area under the receiver operating characteristic curve) reached 96.47% and 90.72% on 5-fold cross-validation and independent data sets, respectively. CONCLUSION: Mal-Prec is a computationally reliable method for identifying malonylation sites in protein sequences. It outperforms existing prediction tools and can serve as a useful tool for identifying and discovering novel malonylation sites in human proteins. Mal-Prec is coded in MATLAB and is publicly available at https://github.com/flyinsky6/Mal-Prec, together with the data sets used in this study.
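Two of the sequence encodings named in this abstract can be illustrated with a minimal Python sketch. It is not the Mal-Prec implementation (which is written in MATLAB); the window length, the example peptide, the gap size and the function names `one_hot` and `cksaap` are all assumptions made for illustration.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def one_hot(window):
    """Encode each residue as a 20-dimensional indicator vector.

    Symbols outside the 20 standard residues (e.g. padding) become all zeros.
    """
    vector = []
    for residue in window:
        vector.extend(1 if residue == aa else 0 for aa in AMINO_ACIDS)
    return vector


def cksaap(window, k):
    """Composition of k-spaced amino acid pairs for a single gap size k.

    Counts every pair (window[i], window[i + k + 1]) and divides by the
    number of pair positions, giving a 400-dimensional frequency vector.
    """
    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    counts = dict.fromkeys(pairs, 0)
    positions = len(window) - k - 1
    if positions <= 0:
        return [0.0] * len(pairs)
    for i in range(positions):
        pair = window[i] + window[i + k + 1]
        if pair in counts:  # skip pairs that contain non-standard symbols
            counts[pair] += 1
    return [counts[p] / positions for p in pairs]


# Hypothetical 7-residue window centred on a lysine (K); not real data.
window = "AGKVKAC"
features = one_hot(window) + cksaap(window, k=1)
```

In a pipeline of the kind the abstract describes, a feature vector like `features` would then be reduced by PCA and passed to an SVM classifier; those steps are omitted here.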
Affiliation(s)
- Xin Liu
  - Department of Bioinformatics, School of Medical Informatics and Engineering, Xuzhou Medical University, Xuzhou, 221004 Jiangsu China
- Liang Wang
  - Department of Bioinformatics, School of Medical Informatics and Engineering, Xuzhou Medical University, Xuzhou, 221004 Jiangsu China
  - Jiangsu Key Laboratory of New Drug Research and Clinical Pharmacy, School of Pharmacy, Xuzhou Medical University, Xuzhou, 221000 Jiangsu China
- Jian Li
  - School of Public Health and Tropical Medicine, Tulane University, New Orleans, LA 70118 USA
- Junfeng Hu
  - Department of Bioinformatics, School of Medical Informatics and Engineering, Xuzhou Medical University, Xuzhou, 221004 Jiangsu China
- Xiao Zhang
  - Department of Bioinformatics, School of Medical Informatics and Engineering, Xuzhou Medical University, Xuzhou, 221004 Jiangsu China
|
15
|
Dipta SR, Taherzadeh G, Ahmad MW, Arafat ME, Shatabda S, Dehzangi A. SEMal: Accurate protein malonylation site predictor using structural and evolutionary information. Comput Biol Med 2020; 125:104022. [PMID: 33022522 DOI: 10.1016/j.compbiomed.2020.104022] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 06/01/2020] [Revised: 09/24/2020] [Accepted: 09/25/2020] [Indexed: 10/23/2022]
Abstract
Post-translational modification (PTM) is a vital process that plays an important role in a wide range of biological interactions. One of the most recently identified PTMs is malonylation. It has been shown that malonylation has an important impact on different biological pathways, including glucose and fatty acid metabolism. Malonylation can be detected experimentally using mass spectrometry. However, this process is both costly and time-consuming, which has inspired research into faster and more efficient computational methods. This paper proposes a novel approach, called SEMal, to identify malonylation sites in protein sequences. It uses both structural and evolutionary-based features, and it uses Rotation Forest (RoF) as its classification technique to predict malonylation sites. To the best of our knowledge, our extracted features as well as our employed classifier have never been used for this problem. SEMal outperforms the previously proposed methods in all metrics, such as sensitivity (0.94 and 0.89), accuracy (0.94 and 0.91), and Matthews correlation coefficient (0.88 and 0.82), for Homo sapiens and Mus musculus, respectively. SEMal is publicly available as an online predictor at: http://brl.uiu.ac.bd/SEMal/.
Affiliation(s)
- Shubhashis Roy Dipta
  - Department of Computer Science and Engineering, United International University, Dhaka, Bangladesh
- Ghazaleh Taherzadeh
  - Institute for Bioscience and Biotechnology Research, University of Maryland, College Park, MD, 20742, USA
- Md Wakil Ahmad
  - Department of Computer Science and Engineering, United International University, Dhaka, Bangladesh
- Md Easin Arafat
  - Institute of Information Technology, Jahangirnagar University, Savar, Dhaka, Bangladesh
- Swakkhar Shatabda
  - Department of Computer Science and Engineering, United International University, Dhaka, Bangladesh
- Abdollah Dehzangi
  - Department of Computer Science, Rutgers University, Camden, NJ, 08102, USA; Center for Computational and Integrative Biology, Rutgers University, Camden, NJ, 08102, USA
|
16
|
Chung CR, Chang YP, Hsu YL, Chen S, Wu LC, Horng JT, Lee TY. Incorporating hybrid models into lysine malonylation sites prediction on mammalian and plant proteins. Sci Rep 2020; 10:10541. [PMID: 32601280 PMCID: PMC7324624 DOI: 10.1038/s41598-020-67384-w] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 01/06/2020] [Accepted: 06/03/2020] [Indexed: 12/22/2022] Open
Abstract
Protein malonylation, a reversible post-translational modification of lysine residues, is associated with various biological functions, such as cellular regulation and pathogenesis. In proteomics, to improve our understanding of the mechanisms of malonylation at the molecular level, the identification of malonylation sites via an efficient methodology is essential. However, experimental identification of malonylated substrates via mass spectrometry is time-consuming, labor-intensive, and expensive. Although numerous methods have been developed to predict malonylation sites in mammalian proteins, computational resources for identifying plant malonylation sites are very limited. In this study, a hybrid model incorporating multiple convolutional neural networks (CNNs) with physicochemical properties, evolutionary information, and sequence-based features was developed for identifying protein malonylation sites in mammals. For plant malonylation, multiple CNNs and random forests were integrated into a secondary modeling phase using a support vector machine. Independent testing demonstrated that the mammalian and plant malonylation models yield areas under the receiver operating characteristic curve (AUC) of 0.943 and 0.772, respectively. The proposed scheme has been implemented as a web-based tool, Kmalo (https://fdblab.csie.ncu.edu.tw/kmalo/home.html), which can help facilitate the functional investigation of protein malonylation in mammals and plants.
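As a reminder of what the AUC figures reported here measure, the following Python sketch computes the area under the ROC curve as the fraction of positive/negative pairs that a classifier ranks correctly. It is a generic illustration, not code from Kmalo; the labels, scores and the function name `roc_auc` are made up for the example.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical predictions for five candidate lysine sites (1 = malonylated).
labels = [1, 0, 1, 0, 1]
scores = [0.9, 0.8, 0.7, 0.3, 0.6]
auc = roc_auc(labels, scores)  # 4 of the 6 positive/negative pairs are ranked correctly
```

The pairwise loop is quadratic in the number of sites, which is fine for a sketch; rank-based formulas give the same value more efficiently on large test sets.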
Affiliation(s)
- Chia-Ru Chung
  - Department of Computer Science and Information Engineering, National Central University, Taoyuan, 32001, Taiwan
- Ya-Ping Chang
  - Department of Computer Science and Information Engineering, National Central University, Taoyuan, 32001, Taiwan
- Yu-Lin Hsu
  - Department of Computer Science and Information Engineering, National Central University, Taoyuan, 32001, Taiwan
- Siyu Chen
  - School of Life and Health Sciences, The Chinese University of Hong Kong, Shenzhen, 518172, China
- Li-Ching Wu
  - Department of Biomedical Sciences and Engineering, National Central University, Taoyuan, 32001, Taiwan
- Jorng-Tzong Horng
  - Department of Computer Science and Information Engineering, National Central University, Taoyuan, 32001, Taiwan; Department of Bioinformatics and Medical Engineering, Asia University, Taichung, 41359, Taiwan
- Tzong-Yi Lee
  - School of Life and Health Sciences, The Chinese University of Hong Kong, Shenzhen, 518172, China; Warshel Institute for Computational Biology, The Chinese University of Hong Kong, Shenzhen, 518172, China
|
17
|
Taherzadeh G, Dehzangi A, Golchin M, Zhou Y, Campbell MP. SPRINT-Gly: predicting N- and O-linked glycosylation sites of human and mouse proteins by using sequence and predicted structural properties. Bioinformatics 2020; 35:4140-4146. [PMID: 30903686 DOI: 10.1093/bioinformatics/btz215] [Citation(s) in RCA: 44] [Impact Index Per Article: 8.8] [Received: 10/23/2018] [Revised: 03/03/2019] [Accepted: 03/21/2019] [Indexed: 12/19/2022] Open
Abstract
MOTIVATION: Protein glycosylation is one of the most abundant post-translational modifications and plays an important role in immune responses, intercellular signaling, inflammation and host-pathogen interactions. However, due to the poor ionization efficiency and microheterogeneity of glycopeptides, identifying glycosylation sites is a challenging task, and there is a demand for computational methods. Here, we constructed the largest dataset of human and mouse glycosylation sites to train deep learning neural networks and support vector machine classifiers to predict N-/O-linked glycosylation sites, respectively. RESULTS: The method, called SPRINT-Gly, achieved consistent results between ten-fold cross-validation and independent test for predicting human and mouse glycosylation sites. For N-glycosylation, a mouse-trained model performs equally well on human glycoproteins and vice versa; however, due to significant differences in O-linked sites, separate models were generated. Overall, SPRINT-Gly is 18% and 50% higher in Matthews correlation coefficient than the next best method compared in N-linked and O-linked sites, respectively. This improved performance is due to the inclusion of novel structure and sequence-based features. AVAILABILITY AND IMPLEMENTATION: http://sparks-lab.org/server/SPRINT-Gly/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Ghazaleh Taherzadeh
  - School of Information and Communication Technology, Griffith University, Gold Coast, QLD, Australia
- Abdollah Dehzangi
  - Department of Computer Science, Morgan State University, Baltimore, MD, USA
- Maryam Golchin
  - School of Information and Communication Technology, Griffith University, Gold Coast, QLD, Australia
- Yaoqi Zhou
  - School of Information and Communication Technology, Griffith University, Gold Coast, QLD, Australia; Institute for Glycomics, Griffith University, Parklands Drive, Gold Coast, QLD, Australia
- Matthew P Campbell
  - Institute for Glycomics, Griffith University, Parklands Drive, Gold Coast, QLD, Australia
|
18
|
Li F, Fan C, Marquez-Lago TT, Leier A, Revote J, Jia C, Zhu Y, Smith AI, Webb GI, Liu Q, Wei L, Li J, Song J. PRISMOID: a comprehensive 3D structure database for post-translational modifications and mutations with functional impact. Brief Bioinform 2020; 21:1069-1079. [PMID: 31161204 PMCID: PMC7299293 DOI: 10.1093/bib/bbz050] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Received: 02/21/2019] [Revised: 03/26/2019] [Accepted: 03/29/2019] [Indexed: 12/26/2022] Open
Abstract
Post-translational modifications (PTMs) play very important roles in various cell signaling pathways and biological processes. Because of these roles, many major PTMs have been studied, and their functional and mechanical characterization is well documented in several databases. However, most currently available databases mainly focus on protein sequences, while the real 3D structures of PTMs have been largely ignored. Therefore, studies of PTM 3D structural signatures have been severely limited by the deficiency of the data. Here, we develop PRISMOID, a novel publicly available and free 3D structure database for a wide range of PTMs. PRISMOID represents an up-to-date and interactive online knowledge base with a specific focus on the 3D structural contexts of PTM sites and on mutations with functional impact that occur on PTMs and in the close proximity of PTM sites. The first version of PRISMOID encompasses 17,145 non-redundant modification sites on 3919 related protein 3D structure entries pertaining to 37 different types of PTMs. Our entry web page is organized in a comprehensive manner, including detailed PTM annotation on the 3D structure and biological information in terms of mutations affecting PTMs, secondary structure features and per-residue solvent accessibility features of PTM sites, domain context, predicted natively disordered regions and sequence alignments. In addition, high-definition JavaScript packages are employed to enhance information visualization in PRISMOID. PRISMOID provides a variety of interactive and customizable search options and data browsing functions; these capabilities allow users to access data via keyword, ID and advanced options combination search in an efficient and user-friendly way. A download page is also provided to enable users to download the SQL file, computational structural features and PTM sites' data.
We anticipate PRISMOID will swiftly become an invaluable online resource, assisting both biologists and bioinformaticians to conduct experiments and develop applications supporting discovery efforts in the sequence-structural-functional relationship of PTMs and providing important insight into mutations and PTM sites interaction mechanisms. The PRISMOID database is freely accessible at http://prismoid.erc.monash.edu/. The database and web interface are implemented in MySQL, JSP, JavaScript and HTML with all major browsers supported.
Affiliation(s)
- Fuyi Li
  - Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, Victoria, Australia
  - Monash Centre for Data Science, Faculty of Information Technology, Monash University, Melbourne, VIC, Australia
- Cunshuo Fan
  - College of Information Engineering, Northwest A&F University, Yangling, China
- Tatiana T Marquez-Lago
  - Department of Genetics and Department of Cell, Developmental and Integrative Biology, School of Medicine, University of Alabama at Birmingham, AL, USA
- André Leier
  - Department of Genetics and Department of Cell, Developmental and Integrative Biology, School of Medicine, University of Alabama at Birmingham, AL, USA
- Jerico Revote
  - Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, Victoria, Australia
- Cangzhi Jia
  - College of Science, Dalian Maritime University, Dalian, China
  - School of Computer Science and Engineering, Nanyang Technological University, Singapore, Singapore
- Yan Zhu
  - Biomedicine Discovery Institute and Department of Microbiology, Monash University, Melbourne, Victoria, Australia
- A Ian Smith
  - Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, Victoria, Australia
- Geoffrey I Webb
  - Monash Centre for Data Science, Faculty of Information Technology, Monash University, Melbourne, VIC, Australia
- Quanzhong Liu
  - College of Information Engineering, Northwest A&F University, Yangling, China
- Leyi Wei
  - School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China
- Jian Li
  - Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, Victoria, Australia
- Jiangning Song
  - Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, Victoria, Australia
  - Monash Centre for Data Science, Faculty of Information Technology, Monash University, Melbourne, VIC, Australia
|
19
|
Xu HD, Liang RP, Wang YG, Qiu JD. mUSP: a high-accuracy map of the in situ crosstalk of ubiquitylation and SUMOylation proteome predicted via the feature enhancement approach. Brief Bioinform 2020; 22:5831925. [PMID: 32382739 DOI: 10.1093/bib/bbaa050] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 12/18/2019] [Revised: 02/19/2020] [Indexed: 01/02/2023] Open
Abstract
Reversible post-translational modification (PTM) orchestrates various biological processes by changing the properties of proteins. Since many proteins are multiply modified by PTMs, the identification of PTM crosstalk sites has emerged as an intriguing topic and attracted much attention. In this study, we systematically deciphered the in situ crosstalk of ubiquitylation and SUMOylation that co-occurs on the same lysine residue. We first collected 3363 ubiquitylation-SUMOylation (UBS) crosstalk sites on 1302 proteins and then investigated the prime sequence motifs, the local evolutionary degree and the distribution of structural annotations at the residue and sequence levels between the UBS crosstalk and the single modification sites. Given the properties of UBS crosstalk sites, we developed the mUSP classifier to predict UBS crosstalk sites by integrating different types of features with a two-step feature optimization based on recursive feature elimination. Across various cross-validations, the mUSP model achieved an average area under the curve (AUC) value of 0.8416, indicating its promising accuracy and robustness. By comparison, mUSP performs significantly better, improving AUC values by 38.41% and 51.48% over the cross-results of the previous single predictors. mUSP was implemented as a web server available at http://bioinfo.ncu.edu.cn/mUSP/index.html to facilitate the query of our high-accuracy UBS crosstalk results for experimental design and validation.
Affiliation(s)
- Hao-Dong Xu
  - Department of Chemistry, Nanchang University, 999 Xuefu Road, Nanchang, Jiangxi, China
- Ru-Ping Liang
  - Department of Chemistry, Nanchang University, 999 Xuefu Road, Nanchang, Jiangxi, China
- You-Gan Wang
  - Department of Chemistry, Nanchang University, 999 Xuefu Road, Nanchang, Jiangxi, China
- Jian-Ding Qiu
  - Department of Chemistry, Nanchang University, 999 Xuefu Road, Nanchang, Jiangxi, China
|
20
|
Ahmad W, Arafat E, Taherzadeh G, Sharma A, Dipta SR, Dehzangi A, Shatabda S. Mal-Light: Enhancing Lysine Malonylation Sites Prediction Problem Using Evolutionary-based Features. IEEE Access 2020; 8:77888-77902. [PMID: 33354488 PMCID: PMC7751949 DOI: 10.1109/access.2020.2989713] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 05/11/2023]
Abstract
Post-translational modification (PTM) is considered an important biological process with a tremendous impact on the function of proteins in both eukaryotic and prokaryotic cells. During the past decades, a wide range of PTMs has been identified. Among them, malonylation is a recently identified PTM that plays a vital role in a wide range of biological interactions. This modification also plays a potential role in energy metabolism in different species, including Homo sapiens. The identification of PTM sites using experimental methods is time-consuming and costly. Hence, there is a demand for fast and cost-effective computational methods. In this study, we propose a new machine learning method, called Mal-Light, to address this problem. To build this model, we extract local evolutionary-based information according to the interaction of neighboring amino acids using a bi-peptide based method. We then use Light Gradient Boosting Machine (LightGBM) as our classifier to predict malonylation sites. Our results demonstrate that Mal-Light significantly improves malonylation site prediction performance compared to previous studies found in the literature. Using Mal-Light, we achieve a Matthews correlation coefficient (MCC) of 0.74 and 0.60, accuracy of 86.66% and 79.51%, sensitivity of 78.26% and 67.27%, and specificity of 95.05% and 91.75% for Homo sapiens and Mus musculus proteins, respectively. Mal-Light is implemented as an online predictor, which is publicly available at http://brl.uiu.ac.bd/MalLight/.
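The four evaluation measures quoted in this abstract all derive from a binary confusion matrix. The Python sketch below shows the standard definitions; the counts in the example are hypothetical and are not taken from the Mal-Light experiments, and the function name `binary_metrics` is an assumption.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, accuracy and MCC from confusion-matrix counts.

    Assumes at least one actual positive and one actual negative, so the
    sensitivity and specificity denominators are non-zero.
    """
    sensitivity = tp / (tp + fn)  # fraction of true sites that were recovered
    specificity = tn / (tn + fp)  # fraction of non-sites correctly rejected
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is reported as 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts: 100 malonylated and 100 non-malonylated test sites.
metrics = binary_metrics(tp=80, fp=10, tn=90, fn=20)
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, which is why it is commonly reported for imbalanced site-prediction data sets.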
Affiliation(s)
- Wakil Ahmad
  - Department of Computer Science and Engineering, United International University, United City, Madani Avenue, Dhaka 1212, Bangladesh
- Easin Arafat
  - Department of Computer Science and Engineering, United International University, United City, Madani Avenue, Dhaka 1212, Bangladesh
- Ghazaleh Taherzadeh
  - Institute for Bioscience and Biotechnology Research, University of Maryland, College Park, MD, 20742, USA
- Alok Sharma
  - Institute for Integrated and Intelligent Systems, Griffith University, Brisbane, QLD-4111, Australia
  - Department of Medical Science Mathematics, Medical Research Institute, Tokyo Medical and Dental University (TMDU), Tokyo, 113-8510, Japan
  - Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama, 230-0045, Kanagawa, Japan
  - School of Engineering and Physics, Faculty of Science Technology and Environment, University of the South Pacific, Suva, Fiji
  - CREST, JST, Tokyo, 102-8666, Japan
- Shubhashis Roy Dipta
  - Department of Computer Science and Engineering, United International University, United City, Madani Avenue, Dhaka 1212, Bangladesh
- Abdollah Dehzangi
  - Department of Computer Science, Morgan State University, Baltimore, MD, 21251, USA
- Swakkhar Shatabda
  - Department of Computer Science and Engineering, United International University, United City, Madani Avenue, Dhaka 1212, Bangladesh
|
21
|
RF-MaloSite and DL-Malosite: Methods based on random forest and deep learning to identify malonylation sites. Comput Struct Biotechnol J 2020; 18:852-860. [PMID: 32322367 PMCID: PMC7160427 DOI: 10.1016/j.csbj.2020.02.012] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 10/16/2019] [Revised: 01/27/2020] [Accepted: 02/19/2020] [Indexed: 12/19/2022] Open
Abstract
Malonylation, which has recently emerged as an important lysine modification, regulates diverse biological activities and has been implicated in several pervasive disorders, including cardiovascular disease and cancer. However, conventional global proteomics analysis using tandem mass spectrometry can be time-consuming, expensive and technically challenging. Therefore, to complement and extend existing experimental methods for malonylation site identification, we developed two novel computational methods for malonylation site prediction based on random forest and deep learning algorithms, RF-MaloSite and DL-MaloSite, respectively. DL-MaloSite requires the primary amino acid sequence as an input and RF-MaloSite utilizes a diverse set of biochemical, physicochemical and sequence-based features. While systematic assessment of performance metrics suggests that both RF-MaloSite and DL-MaloSite perform well in all metrics tested, our methods perform particularly well in the areas of accuracy, sensitivity and overall method performance (assessed by the Matthews correlation coefficient, MCC). For instance, RF-MaloSite exhibited MCC scores of 0.42 and 0.40 using 10-fold cross-validation and an independent test set, respectively. Meanwhile, DL-MaloSite was characterized by MCC scores of 0.51 and 0.49 based on 10-fold cross-validation and an independent set, respectively. Importantly, both methods exhibited efficiency scores that were on par with or better than those achieved by existing malonylation site prediction methods. The identification of these sites may also provide important insights into the mechanisms of crosstalk between malonylation and other lysine modifications, such as acetylation, glutarylation and succinylation. To facilitate their use, both methods have been made freely available to the research community at https://github.com/dukkakc/DL-MaloSite-and-RF-MaloSite.
|
22
|
Zhang Y, Xie R, Wang J, Leier A, Marquez-Lago TT, Akutsu T, Webb GI, Chou KC, Song J. Computational analysis and prediction of lysine malonylation sites by exploiting informative features in an integrative machine-learning framework. Brief Bioinform 2019; 20:2185-2199. [PMID: 30351377 PMCID: PMC6954445 DOI: 10.1093/bib/bby079] [Citation(s) in RCA: 63] [Impact Index Per Article: 10.5] [Received: 06/29/2018] [Revised: 07/28/2018] [Accepted: 08/01/2018] [Indexed: 11/15/2022] Open
Abstract
As a newly discovered post-translational modification (PTM), lysine malonylation (Kmal) regulates a myriad of cellular processes from prokaryotes to eukaryotes and has important implications in human diseases. Despite its functional significance, computational methods to accurately identify malonylation sites are still lacking and urgently needed. In particular, there is currently no comprehensive analysis and assessment of different features and machine learning (ML) methods that are required for constructing the necessary prediction models. Here, we review, analyze and compare 11 different feature encoding methods, with the goal of extracting key patterns and characteristics from residue sequences of Kmal sites. We identify optimized feature sets, with which four commonly used ML methods (random forest, support vector machines, K-nearest neighbor and logistic regression) and one recently proposed [Light Gradient Boosting Machine (LightGBM)] are trained on data from three species, namely, Escherichia coli, Mus musculus and Homo sapiens, and compared using randomized 10-fold cross-validation tests. We show that integration of the single method-based models through ensemble learning further improves the prediction performance and model robustness on the independent test. When compared to the existing state-of-the-art predictor, MaloPred, the optimal ensemble models were more accurate for all three species (AUC: 0.930, 0.923 and 0.944 for E. coli, M. musculus and H. sapiens, respectively). Using the ensemble models, we developed an accessible online predictor, kmal-sp, available at http://kmalsp.erc.monash.edu/. 
We hope that this comprehensive survey and the proposed strategy for building more accurate models can serve as a useful guide for inspiring future developments of computational methods for PTM site prediction, expedite the discovery of new malonylation and other PTM types and facilitate hypothesis-driven experimental validation of novel malonylated substrates and malonylation sites.
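The randomized 10-fold cross-validation mentioned in this abstract can be illustrated with a minimal sketch. The function names `k_fold_indices` and `train_test_splits`, the seed, and the sample count are illustrative assumptions; nothing here is taken from the paper or from the kmal-sp implementation.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    if not 1 < k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Striding by k keeps fold sizes within one sample of each other.
    return [indices[i::k] for i in range(k)]


def train_test_splits(n_samples, k=10, seed=0):
    """Yield (train, test) index lists, holding out each fold exactly once."""
    folds = k_fold_indices(n_samples, k, seed)
    for held_out in range(k):
        test = folds[held_out]
        train = [i for j, fold in enumerate(folds) if j != held_out for i in fold]
        yield train, test


if __name__ == "__main__":
    # Hypothetical dataset of 25 candidate sites, for illustration only.
    for train, test in train_test_splits(25, k=10, seed=1):
        print(len(train), len(test))
```

Each sample is used for testing once and for training in the remaining folds, so a model's scores can be averaged over the 10 held-out folds.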
Affiliation(s)
- Yanju Zhang
  - School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin 541004, China
- Ruopeng Xie
  - School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin 541004, China
- Jiawei Wang
  - Infection and Immunity Program, Biomedicine Discovery Institute and Department of Microbiology, Monash University, VIC 3800, Australia
- André Leier
  - Department of Genetics, School of Medicine, University of Alabama at Birmingham, AL, USA
  - Department of Cell, Developmental and Integrative Biology, School of Medicine, University of Alabama at Birmingham, AL, USA
- Tatiana T Marquez-Lago
  - Department of Genetics, School of Medicine, University of Alabama at Birmingham, AL, USA
  - Department of Cell, Developmental and Integrative Biology, School of Medicine, University of Alabama at Birmingham, AL, USA
- Tatsuya Akutsu
  - Bioinformatics Center, Institute for Chemical Research, Kyoto University, Kyoto 611-0011, Japan
- Geoffrey I Webb
  - Monash Centre for Data Science, Faculty of Information Technology, Monash University, VIC 3800, Australia
- Kuo-Chen Chou
  - Gordon Life Science Institute, Boston, MA 02478, USA
  - Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Jiangning Song
  - Monash Centre for Data Science, Faculty of Information Technology, Monash University, VIC 3800, Australia
  - Infection and Immunity Program, Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, VIC 3800, Australia
  - ARC Centre of Excellence in Advanced Molecular Imaging, Monash University, VIC 3800, Australia
|
23
|
He W, Wei L, Zou Q. Research progress in protein posttranslational modification site prediction. Brief Funct Genomics 2018; 18:220-229. [DOI: 10.1093/bfgp/ely039] [Citation(s) in RCA: 31] [Impact Index Per Article: 4.4] [Received: 10/18/2018] [Revised: 11/15/2018] [Accepted: 11/22/2018] [Indexed: 01/24/2023]
Abstract
Posttranslational modifications (PTMs) play an important role in regulating protein folding, activity and function and are involved in almost all cellular processes. Identification of PTMs of proteins is the basis for elucidating the mechanisms of cell biology and disease treatments. Compared with the laboriousness of equivalent experimental work, PTM prediction using various machine-learning methods can provide accurate, simple and rapid research solutions and generate valuable information for further laboratory studies. In this review, we manually curate most of the bioinformatics tools published since 2008. We also summarize the approaches for predicting ubiquitination sites and glycosylation sites. Moreover, we discuss the challenges of current PTM bioinformatics tools and look forward to future research possibilities.
Affiliation(s)
- Wenying He
  - School of Computer Science and Technology, Tianjin University, Tianjin, China
- Leyi Wei
  - School of Computer Science and Technology, Tianjin University, Tianjin, China
- Quan Zou
  - School of Computer Science and Technology, Tianjin University, Tianjin, China
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
|
24
|
Dehzangi A, López Y, Taherzadeh G, Sharma A, Tsunoda T. SumSec: Accurate Prediction of Sumoylation Sites Using Predicted Secondary Structure. Molecules 2018; 23:E3260. [PMID: 30544729 PMCID: PMC6320791 DOI: 10.3390/molecules23123260] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 10/28/2018] [Revised: 11/30/2018] [Accepted: 12/05/2018] [Indexed: 12/13/2022]
Abstract
Post Translational Modification (PTM) is defined as the modification of amino acids along the protein sequences after the translation process. These modifications significantly impact on the functioning of proteins. Therefore, having a comprehensive understanding of the underlying mechanism of PTMs turns out to be critical in studying the biological roles of proteins. Among a wide range of PTMs, sumoylation is one of the most important modifications due to its known cellular functions which include transcriptional regulation, protein stability, and protein subcellular localization. Despite its importance, determining sumoylation sites via experimental methods is time-consuming and costly. This has led to a great demand for the development of fast computational methods able to accurately determine sumoylation sites in proteins. In this study, we present a new machine learning-based method for predicting sumoylation sites called SumSec. To do this, we employed the predicted secondary structure of amino acids to extract two types of structural features from neighboring amino acids along the protein sequence which has never been used for this task. As a result, our proposed method is able to enhance the sumoylation site prediction task, outperforming previously proposed methods in the literature. SumSec demonstrated high sensitivity (0.91), accuracy (0.94) and MCC (0.88). The prediction accuracy achieved in this study is 21% better than those reported in previous studies. The script and extracted features are publicly available at: https://github.com/YosvanyLopez/SumSec.
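For readers unfamiliar with the metrics quoted in this abstract (sensitivity, accuracy and MCC), the following minimal sketch shows their standard definitions from a binary confusion matrix. The function name `binary_metrics` and the counts in the usage example are hypothetical and are not results from SumSec.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return sensitivity, accuracy and Matthews correlation coefficient."""
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    # Sensitivity (recall): fraction of true sites that were predicted.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Accuracy: fraction of all predictions that were correct.
    accuracy = (tp + tn) / total
    # MCC: correlation between predicted and true labels, in [-1, 1].
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"sensitivity": sensitivity, "accuracy": accuracy, "mcc": mcc}


if __name__ == "__main__":
    # Hypothetical counts, chosen only to exercise the function.
    print(binary_metrics(tp=90, fp=5, tn=95, fn=10))
```

MCC is often reported alongside accuracy for site prediction because modified and unmodified residues are usually imbalanced, and MCC accounts for all four cells of the confusion matrix.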
Affiliation(s)
- Abdollah Dehzangi
  - Department of Computer Science, Morgan State University, Baltimore, MD 21251, USA.
- Yosvany López
  - Genesis Institute of Genetic Research, Genesis Healthcare Co., Tokyo 150-6015, Japan.
- Ghazaleh Taherzadeh
  - School of Information and Communication Technology, Griffith University, Gold Coast 4222, Australia.
- Alok Sharma
  - Institute for Integrated and Intelligent Systems, Griffith University, Brisbane 4111, Australia.
  - School of Engineering & Physics, University of the South Pacific, Suva, Fiji.
  - Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa 230-0045, Japan.
  - CREST, JST, Tokyo 102-0076, Japan.
  - Department of Medical Science Mathematics, Medical Research Institute, Tokyo Medical and Dental University, Tokyo 113-8510, Japan.
- Tatsuhiko Tsunoda
  - Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa 230-0045, Japan.
  - CREST, JST, Tokyo 102-0076, Japan.
  - Department of Medical Science Mathematics, Medical Research Institute, Tokyo Medical and Dental University, Tokyo 113-8510, Japan.
|