1
Ke J, Zhao J, Li H, Yuan L, Dong G, Wang G. Prediction of protein N-terminal acetylation modification sites based on CNN-BiLSTM-attention model. Comput Biol Med 2024; 174:108330. [PMID: 38588617 DOI: 10.1016/j.compbiomed.2024.108330] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/29/2024] [Revised: 03/06/2024] [Accepted: 03/17/2024] [Indexed: 04/10/2024]
Abstract
N-terminal acetylation is one of the most common and important post-translational modifications (PTMs) of eukaryotic proteins. PTMs play a crucial role in various cellular processes and disease pathogenesis. Thus, the accurate identification of N-terminal acetylation modifications is important to gain insight into cellular processes and other possible functional mechanisms. Although some algorithmic models have been proposed, most were developed with traditional machine learning algorithms and small training datasets, which limits their practical application. Deep learning models, by contrast, are better at handling high-throughput and complex data. In this study, DeepCBA, a model based on a hybrid deep learning framework of a convolutional neural network (CNN), a bidirectional long short-term memory network (BiLSTM), and an attention mechanism, was constructed to detect N-terminal acetylation sites. DeepCBA was built as follows: First, a benchmark dataset was generated by selecting low-redundancy protein sequences from the UniProt database and further reducing the redundancy of the protein sequences using the CD-HIT tool. Subsequently, based on the skip-gram model in the word2vec algorithm, tripeptide word vector features were generated on the benchmark dataset. Finally, the CNN, BiLSTM, and attention mechanism were combined, and the tripeptide word vector features were fed into the stacked model for multiple rounds of training. The model performed excellently on the independent test dataset, with accuracy and area under the curve of 80.51% and 87.36%, respectively. Altogether, DeepCBA achieved superior performance compared with the baseline model, and significantly outperformed most existing predictors. Additionally, our model can be used to identify disease loci and drug targets.
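The tripeptide "word" step described in the abstract can be illustrated with a minimal sketch. It only shows how a sequence might be cut into overlapping tripeptides and how skip-gram (center, context) training pairs could be enumerated; it is not the authors' code. The function names, the stride of 1, and the window size are assumptions chosen for illustration, since the abstract does not state them.

```python
from typing import List, Tuple


def tripeptide_tokens(sequence: str, stride: int = 1) -> List[str]:
    """Split a protein sequence into 3-residue 'words' (stride is an assumption)."""
    return [sequence[i:i + 3] for i in range(0, len(sequence) - 2, stride)]


def skipgram_pairs(tokens: List[str], window: int = 2) -> List[Tuple[str, str]]:
    """Enumerate (center, context) pairs as a skip-gram model would see them."""
    pairs = []
    for center_idx, center in enumerate(tokens):
        lo = max(0, center_idx - window)
        hi = min(len(tokens), center_idx + window + 1)
        for ctx_idx in range(lo, hi):
            if ctx_idx != center_idx:
                pairs.append((center, tokens[ctx_idx]))
    return pairs


# Hypothetical 7-residue peptide, used only to exercise the functions.
tokens = tripeptide_tokens("MSDKPLT")
pairs = skipgram_pairs(tokens, window=1)
```

In practice, the pairs would be consumed by a word2vec implementation that learns one embedding vector per tripeptide; the embedding dimension and training settings are likewise not given in the abstract.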
Affiliation(s)
- Jinsong Ke
- College of Computer and Control Engineering, Northeast Forestry University, Harbin, 150040, China
- Jianmei Zhao
- College of Computer and Control Engineering, Northeast Forestry University, Harbin, 150040, China; College of Life Science, Northeast Forestry University, Harbin, 150040, China
- Hongfei Li
- College of Computer and Control Engineering, Northeast Forestry University, Harbin, 150040, China; College of Life Science, Northeast Forestry University, Harbin, 150040, China
- Lei Yuan
- Department of Hepatobiliary Surgery, Quzhou People's Hospital, Quzhou, 324000, China
- Guanghui Dong
- College of Computer and Control Engineering, Northeast Forestry University, Harbin, 150040, China
- Guohua Wang
- College of Computer and Control Engineering, Northeast Forestry University, Harbin, 150040, China
2
Zhu R, Chen M, Luo Y, Cheng H, Zhao Z, Zhang M. The role of N-acetyltransferases in cancers. Gene 2024; 892:147866. [PMID: 37783298 DOI: 10.1016/j.gene.2023.147866] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/15/2023] [Revised: 09/25/2023] [Accepted: 09/29/2023] [Indexed: 10/04/2023]
Abstract
Cancer is a major global health problem that disrupts the balance of normal cellular growth and behavior. Mounting evidence has shown that epigenetic modification, specifically N-terminal acetylation, plays a crucial role in the regulation of cell growth and function. Acetylation is a co- or post-translational modification that regulates important cellular processes such as cell proliferation, cell cycle progression, and energy metabolism. Recently, N-acetyltransferases (NATs), the enzymes responsible for acetylation, have been shown to regulate signal transduction pathways in various cancers including hepatocellular carcinoma, breast cancer, lung cancer, colorectal cancer and prostate cancer. In this review, we clarify the regulatory role of NATs in cancer progression, such as cell proliferation, metastasis, cell apoptosis, autophagy, cell cycle arrest and energy metabolism. Furthermore, the mechanism of NATs in cancer remains to be further studied, and few drugs have been developed. This suggests that targeting acetylation, especially NAT-mediated acetylation, may be an attractive way to inhibit cancer progression.
Affiliation(s)
- Rongrong Zhu
- Institute of Cardiovascular Disease, Key Laboratory for Arteriosclerology of Hunan Province, Hunan International Scientific and Technological Cooperation Base of Arteriosclerotic Disease, Department of Bioinformatics and Medical Big Data, Hengyang Medical School, University of South China, Hengyang, Hunan 421001, PR China
- Mengjiao Chen
- Institute of Cardiovascular Disease, Key Laboratory for Arteriosclerology of Hunan Province, Hunan International Scientific and Technological Cooperation Base of Arteriosclerotic Disease, Department of Bioinformatics and Medical Big Data, Hengyang Medical School, University of South China, Hengyang, Hunan 421001, PR China
- Yongjia Luo
- Institute of Cardiovascular Disease, Key Laboratory for Arteriosclerology of Hunan Province, Hunan International Scientific and Technological Cooperation Base of Arteriosclerotic Disease, Department of Bioinformatics and Medical Big Data, Hengyang Medical School, University of South China, Hengyang, Hunan 421001, PR China; Department of Medicine, Hengyang Medical School, University of South China, Hengyang, Hunan 421001, PR China
- Haipeng Cheng
- Department of Pathology, The Second Xiangya Hospital, Central South University, Changsha, Hunan 410008, PR China
- Zhenwang Zhao
- Department of Pathology and Pathophysiology, School of Basic Medicine, Health Science Center, Hubei University of Arts and Science, Xiangyang, Hubei 441053, PR China.
- Min Zhang
- Institute of Cardiovascular Disease, Key Laboratory for Arteriosclerology of Hunan Province, Hunan International Scientific and Technological Cooperation Base of Arteriosclerotic Disease, Department of Bioinformatics and Medical Big Data, Hengyang Medical School, University of South China, Hengyang, Hunan 421001, PR China.
3
Sharma A, Garg A, Ramana J, Gupta D. VirulentPred 2.0: An improved method for prediction of virulent proteins in bacterial pathogens. Protein Sci 2023; 32:e4808. [PMID: 37872744 PMCID: PMC10659933 DOI: 10.1002/pro.4808] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/12/2023] [Revised: 09/27/2023] [Accepted: 10/15/2023] [Indexed: 10/25/2023]
Abstract
Virulence proteins in pathogens are essential for causing disease in a host. They enable the pathogen to invade, survive and multiply within the host, thus enhancing its potential to cause disease while also evading host defense mechanisms. Identifying these factors, especially potential vaccine candidates or drug targets, is critical for vaccine or drug development research. In this context, we present an improved version of VirulentPred 1.0 for rapidly identifying virulent proteins. VirulentPred 2.0 is based on training machine learning models with experimentally validated virulent protein sequences. VirulentPred 2.0 achieved 84.71% accuracy with the validation dataset and 85.18% on an independent test dataset. The models are trained and evaluated with the latest sequence datasets of virulent proteins, which are three times greater in number than the proteins used in the earlier version of VirulentPred. Moreover, a significant improvement of 11% in the prediction accuracy over the earlier version is achieved with the best position-specific scoring matrix (PSSM)-based model for the latest test dataset. VirulentPred 2.0 is available as a user-friendly web interface at https://bioinfo.icgeb.res.in/virulent2/ and a standalone application suitable for bulk predictions. With higher efficiency and availability as a standalone tool, VirulentPred 2.0 holds immense potential for high throughput yet efficient identification of virulent proteins in bacterial pathogens.
Affiliation(s)
- Arun Sharma
- Translational Bioinformatics Group, International Centre for Genetic Engineering and Biotechnology (ICGEB), New Delhi, India
- Aarti Garg
- Translational Bioinformatics Group, International Centre for Genetic Engineering and Biotechnology (ICGEB), New Delhi, India
- Jayashree Ramana
- Translational Bioinformatics Group, International Centre for Genetic Engineering and Biotechnology (ICGEB), New Delhi, India
- Dinesh Gupta
- Translational Bioinformatics Group, International Centre for Genetic Engineering and Biotechnology (ICGEB), New Delhi, India
4
Sugaya N, Tanaka S, Keyamura K, Noda S, Akanuma G, Hishida T. N-terminal acetyltransferase NatB regulates Rad51-dependent repair of double-strand breaks in Saccharomyces cerevisiae. Genes Genet Syst 2023; 98:61-72. [PMID: 37331807 DOI: 10.1266/ggs.23-00013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/20/2023] Open
Abstract
Homologous recombination (HR) is a highly accurate mechanism for repairing DNA double-strand breaks (DSBs) that arise from various genotoxic insults and blocked replication forks. Defects in HR and unscheduled HR can interfere with other cellular processes such as DNA replication and chromosome segregation, leading to genome instability and cell death. Therefore, the HR process has to be tightly controlled. Protein N-terminal acetylation is one of the most common modifications in eukaryotic organisms. Studies in budding yeast implicate a role for NatB acetyltransferase in HR repair, but precisely how this modification regulates HR repair and genome integrity is unknown. In this study, we show that cells lacking NatB, a dimeric complex composed of Nat3 and Mdm2, are sensitive to the DNA alkylating agent methyl methanesulfonate (MMS), and that overexpression of Rad51 suppresses the MMS sensitivity of nat3Δ cells. Nat3-deficient cells have increased levels of Rad52-yellow fluorescent protein foci and fail to repair DSBs after release from MMS exposure. We also found that Nat3 is required for HR-dependent gene conversion and gene targeting. Importantly, we observed that nat3Δ mutation partially suppressed MMS sensitivity in srs2Δ cells and the synthetic sickness of srs2Δ sgs1Δ cells. Altogether, our results indicate that NatB functions upstream of Srs2 to activate the Rad51-dependent HR pathway for DSB repair.
Affiliation(s)
- Natsuki Sugaya
- Department of Molecular Biology, Graduate School of Science, Gakushuin University
- Shion Tanaka
- Department of Molecular Biology, Graduate School of Science, Gakushuin University
- Kenji Keyamura
- Department of Molecular Biology, Graduate School of Science, Gakushuin University
- Shunsuke Noda
- Department of Molecular Biology, Graduate School of Science, Gakushuin University
- Genki Akanuma
- Department of Molecular Biology, Graduate School of Science, Gakushuin University
- Takashi Hishida
- Department of Molecular Biology, Graduate School of Science, Gakushuin University
5
Donnarumma F, Tucci V, Ambrosino C, Altucci L, Carafa V. NAA60 (HAT4): the newly discovered bi-functional Golgi member of the acetyltransferase family. Clin Epigenetics 2022; 14:182. [PMID: 36539894 PMCID: PMC9769039 DOI: 10.1186/s13148-022-01402-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/04/2022] [Accepted: 12/07/2022] [Indexed: 12/24/2022] Open
Abstract
Chromatin structural organization, gene expression and proteostasis are intricately regulated in a wide range of biological processes, both physiological and pathological. Protein acetylation, a major post-translational modification, is tightly involved in interconnected biological networks, modulating the activation of gene transcription and protein action in cells. A very large number of studies describe the pivotal role of the so-called acetylome (accounting for more than 80% of the human proteome) in orchestrating different pathways in response to stimuli and triggering severe diseases, including cancer. NAA60/NatF (N-terminal acetyltransferase F), also named HAT4 (histone acetyltransferase type B protein 4), is a newly discovered acetyltransferase in humans modifying N-termini of transmembrane proteins starting with M-K/M-A/M-V/M-M residues and is also thought to modify lysine residues of histone H4. Because of its enzymatic features and unusual cell localization on the Golgi membrane, NAA60 is an intriguing acetyltransferase that warrants biochemical and clinical investigation. Although it is still poorly studied, this review summarizes current findings concerning the structural hallmarks and biological role of this novel targetable epigenetic enzyme.
Affiliation(s)
- Federica Donnarumma
- Biogem, Molecular Biology and Genetics Research Institute, Ariano Irpino, Italy
- Valeria Tucci
- Biogem, Molecular Biology and Genetics Research Institute, Ariano Irpino, Italy; Department of Precision Medicine, University of Campania “Luigi Vanvitelli”, Vico De Crecchio 7, 80138 Naples, Italy
- Concetta Ambrosino
- Biogem, Molecular Biology and Genetics Research Institute, Ariano Irpino, Italy; Department of Science and Technology, University of Sannio, Benevento, Italy
- Lucia Altucci
- Biogem, Molecular Biology and Genetics Research Institute, Ariano Irpino, Italy; Department of Precision Medicine, University of Campania “Luigi Vanvitelli”, Vico De Crecchio 7, 80138 Naples, Italy
- Vincenzo Carafa
- Department of Precision Medicine, University of Campania “Luigi Vanvitelli”, Vico De Crecchio 7, 80138 Naples, Italy
6
Kaushal P, Lee C. N-terminomics - its past and recent advancements. J Proteomics 2020; 233:104089. [PMID: 33359939 DOI: 10.1016/j.jprot.2020.104089] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 01/31/2020] [Revised: 07/22/2020] [Accepted: 12/20/2020] [Indexed: 02/06/2023]
Abstract
N-terminomics is a rapidly evolving branch of proteomics that encompasses the study of protein N-terminal sequences. A proteome-wide collection of such sequences has been widely used to understand proteolytic cascades and to annotate the genome. Over the last two decades, various N-terminomic strategies have been developed to achieve high sensitivity, greater depth of coverage, and high throughput. In this review, we cover how the field of N-terminomics has evolved to date, including discussion of various sample preparation and N-terminal peptide enrichment strategies. We also compare different N-terminomic methods and highlight their relative benefits and shortcomings in implementation. In addition, an overview of the currently available bioinformatics tools and data analysis pipelines for the annotation of N-terminomic datasets is included. SIGNIFICANCE: It has been recognized that proteins undergo several post-translational modifications (PTMs), and a number of perturbed biological pathways are directly associated with modifications at the terminal sites of a protein. In this regard, N-terminomics can be applied to generate a proteome-wide landscape of mature N-terminal sequences, annotate their source of generation, and recognize their significance in biological pathways. Besides, a system-wide study can be used to study complicated proteolytic machinery and protease cleavage patterns for potential therapeutic targets. Moreover, due to unprecedented improvements in analytical methods and mass spectrometry instrumentation in recent times, N-terminomic methodologies now offer an unparalleled ability to study proteoforms and their implications in clinical conditions. Such approaches can further be applied for the detection of low-abundance proteoforms, annotation of non-canonical protein coding sites, identification of candidate disease biomarkers, and, last but not least, the discovery of novel drug targets.
Affiliation(s)
- Prashant Kaushal
- Center for Theragnosis, Korea Institute of Science and Technology, Seoul 02792, Republic of Korea; Division of Bio-Medical Science & Technology, KIST School, Korea University of Science and Technology, Seoul 02792, Republic of Korea
- Cheolju Lee
- Center for Theragnosis, Korea Institute of Science and Technology, Seoul 02792, Republic of Korea; Division of Bio-Medical Science & Technology, KIST School, Korea University of Science and Technology, Seoul 02792, Republic of Korea; KHU-KIST Department of Converging Science and Technology, Kyung Hee University, 26 Kyunghee-daero, Dongdaemun-gu, Seoul 02447, Republic of Korea
7
Gottard A, Vannucci G, Marchetti GM. A note on the interpretation of tree-based regression models. Biom J 2020; 62:1564-1573. [PMID: 32449821 DOI: 10.1002/bimj.201900195] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 07/08/2019] [Revised: 02/21/2020] [Accepted: 03/12/2020] [Indexed: 02/02/2023]
Abstract
Tree-based models are a popular tool for predicting a response given a set of explanatory variables when the regression function is characterized by a certain degree of complexity. Sometimes, they are also used to identify important variables and for variable selection. We show that if the generating model contains chains of direct and indirect effects, then the typical variable importance measures suggest selecting as important mainly the background variables, which have a strong indirect effect, disregarding the variables that directly influence the response. This is attributable mainly to the variable choice in the first steps of the algorithm selecting the splitting variable and to the greedy nature of such search. This pitfall could be relevant when using tree-based algorithms for understanding the underlying generating process, for population segmentation and for causal inference.
Affiliation(s)
- Anna Gottard
- Department of Statistics, Computer Science, Applications, University of Florence, Florence, Italy; Florence Center for Data Science, University of Florence, Florence, Italy
- Giulia Vannucci
- Department of Statistics, Computer Science, Applications, University of Florence, Florence, Italy
- Giovanni Maria Marchetti
- Department of Statistics, Computer Science, Applications, University of Florence, Florence, Italy; Florence Center for Data Science, University of Florence, Florence, Italy
8
Lapteva YS, Vologzhannikova AA, Sokolov AS, Ismailov RG, Uversky VN, Permyakov SE. In Vitro N-Terminal Acetylation of Bacterially Expressed Parvalbumins by N-Terminal Acetyltransferases from Escherichia coli. Appl Biochem Biotechnol 2020; 193:1365-1378. [PMID: 32394317 DOI: 10.1007/s12010-020-03324-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/21/2020] [Accepted: 04/23/2020] [Indexed: 11/28/2022]
Abstract
Most eukaryotic proteins are N-terminally acetylated (Nt-acetylated) by specific N-terminal acetyltransferases (NATs). Although this co-/post-translational protein modification may affect different aspects of protein functioning, it is typically neglected in studies of bacterially expressed eukaryotic proteins, lacking this modification. To overcome this limitation of bacterial expression, we have probed the efficiency of recombinant Escherichia coli NATs (RimI, RimJ, and RimL) with regard to in vitro Nt-acetylation of several parvalbumins (PAs) expressed in E. coli. PA is a calcium-binding protein of vertebrates, which is sensitive to Nt-acetylation. Our analyses revealed that only metal-free PAs were prone to Nt-acetylation (up to 100%), whereas Ca2+ binding abolished this modification, thereby indicating that Ca2+-induced structural stabilization of PAs impedes their Nt-acetylation. RimJ and RimL were active towards all PAs with N-terminal serine. Their activity towards PAs beginning with alanine was PA-specific, suggesting the importance of the subsequent residues. RimI showed the least activity regardless of the PA studied. Overall, NATs from E. coli are suited for post-translational Nt-acetylation of bacterially expressed eukaryotic proteins with decreased structural stability.
Affiliation(s)
- Yulia S Lapteva
- Institute for Biological Instrumentation of the Russian Academy of Sciences, Federal Research Center "Pushchino Scientific Center for Biological Research of the Russian Academy of Sciences", Pushchino, Moscow Region, 142290, Russia.
- Alisa A Vologzhannikova
- Institute for Biological Instrumentation of the Russian Academy of Sciences, Federal Research Center "Pushchino Scientific Center for Biological Research of the Russian Academy of Sciences", Pushchino, Moscow Region, 142290, Russia
- Andrey S Sokolov
- Institute for Biological Instrumentation of the Russian Academy of Sciences, Federal Research Center "Pushchino Scientific Center for Biological Research of the Russian Academy of Sciences", Pushchino, Moscow Region, 142290, Russia
- Ramis G Ismailov
- Institute for Biological Instrumentation of the Russian Academy of Sciences, Federal Research Center "Pushchino Scientific Center for Biological Research of the Russian Academy of Sciences", Pushchino, Moscow Region, 142290, Russia
- Vladimir N Uversky
- Institute for Biological Instrumentation of the Russian Academy of Sciences, Federal Research Center "Pushchino Scientific Center for Biological Research of the Russian Academy of Sciences", Pushchino, Moscow Region, 142290, Russia; Department of Molecular Medicine and USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL, 33612, USA
- Sergei E Permyakov
- Institute for Biological Instrumentation of the Russian Academy of Sciences, Federal Research Center "Pushchino Scientific Center for Biological Research of the Russian Academy of Sciences", Pushchino, Moscow Region, 142290, Russia
9
Prediction of Extracellular Matrix Proteins by Fusing Multiple Feature Information, Elastic Net, and Random Forest Algorithm. MATHEMATICS 2020. [DOI: 10.3390/math8020169] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Indexed: 12/26/2022]
Abstract
Extracellular matrix (ECM) proteins play an important role in a series of biological processes of cells. The study of ECM proteins helps to further elucidate their biological functions. We propose ECMP-RF (extracellular matrix proteins prediction by random forest) to predict ECM proteins. Firstly, the features of the protein sequence are extracted by combining encoding based on grouped weight, pseudo amino-acid composition, pseudo position-specific scoring matrix, a local descriptor, and an autocorrelation descriptor. Secondly, the synthetic minority oversampling technique (SMOTE) algorithm is employed to handle the class-imbalanced data, and the elastic net (EN) is used to reduce the dimension of the feature vectors. Finally, the random forest (RF) classifier is used to predict the ECM proteins. Leave-one-out cross-validation shows that the balanced accuracy on the training and testing datasets is 97.3% and 97.9%, respectively. ECMP-RF performs significantly better than other state-of-the-art predictors.
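The SMOTE step mentioned in the abstract can be sketched in a few lines. The idea is to create synthetic minority-class samples by interpolating between a minority point and one of its nearest minority neighbours. This is a simplified, standard-library illustration under stated assumptions, not the ECMP-RF implementation: the function name, the neighbour count `k`, the seed, and the toy data are all hypothetical.

```python
import math
import random
from typing import List, Sequence

Point = Sequence[float]


def smote(minority: List[Point], n_new: int, k: int = 2, seed: int = 0) -> List[List[float]]:
    """Generate n_new synthetic points from minority-class samples (simplified SMOTE)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point, excluding the point itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy minority class: the four corners of the unit square (illustrative data only).
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote(minority, n_new=6)
```

With `k=2` on this toy data, each synthetic point falls on an edge of the square, between a corner and one of its two adjacent corners. A real pipeline would apply oversampling to the training folds only, before feature selection and classification.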
10
Lin X, Quan Z, Wang ZJ, Huang H, Zeng X. A novel molecular representation with BiGRU neural networks for learning atom. Brief Bioinform 2019; 21:2099-2111. [DOI: 10.1093/bib/bbz125] [Citation(s) in RCA: 35] [Impact Index Per Article: 7.0] [Received: 06/21/2019] [Revised: 08/15/2019] [Accepted: 08/31/2019] [Indexed: 12/20/2022] Open
Abstract
Molecular representations play critical roles in research on drug design and properties, and effective representation methods help with molecular computation and with solving related problems in drug discovery. Until recently, most traditional molecular representations were based on hand-crafted features and relied heavily on biological experiments, which are often costly and time consuming. However, recent research has achieved promising results using machine learning in various domains. In this article, we present a novel method named Smi2Vec-BiGRU that is designed for learning atoms and solving the single- and multitask binary classification problems in the field of drug discovery, which are the basic and also key problems in this field. Specifically, our approach transforms the molecule data in the SMILES format into a set of sample vectors and then feeds them into bidirectional gated recurrent unit neural networks for training, which learn low-dimensional vector representations for molecular drugs. We conduct extensive experiments on several widely used benchmarks including Tox21, SIDER and ClinTox. The experimental results show that our approach can achieve state-of-the-art performance on these benchmarking datasets, demonstrating the feasibility and competitiveness of our proposed approach.
Affiliation(s)
- Xuan Lin
- College of Computer Science and Technology, Hunan University, Changsha, 410082, China
- Zhe Quan
- College of Computer Science and Technology, Hunan University, Changsha, 410082, China
- Zhi-Jie Wang
- College of Computer Science and Technology, Hunan University, Changsha, 410082, China
- Huang Huang
- College of Computer, National University of Defense Technology, Changsha, 410073, China
- Xiangxiang Zeng
- College of Computer Science and Technology, Hunan University, Changsha, 410082, China
- School of Data and Computer Science, Sun Yat-sen University, Guangzhou, 510275, China
11
Zhang P, Liu P, Xu Y, Liang Y, Wang PG, Cheng J. N-acetyltransferases from three different organisms displaying distinct selectivity toward hexosamines and N-terminal amine of peptides. Carbohydr Res 2018; 472:72-75. [PMID: 30500476 DOI: 10.1016/j.carres.2018.11.011] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 10/13/2018] [Revised: 11/18/2018] [Accepted: 11/21/2018] [Indexed: 12/31/2022]
Abstract
N-acetyltransferases are a family of enzymes that catalyze the transfer of the acetyl moiety (COCH3) from acetyl coenzyme A (Acetyl-CoA) to a primary amine of acceptor substrates, which range from small molecules such as aminoglycosides to macromolecules such as various proteins. In this study, the substrate selectivity of three N-acetyltransferases falling into different phylogenetic groups was probed against a series of hexosamines and synthetic peptides. GlmA from Clostridium acetobutylicum and RmNag from Rhizomucor miehei, which have been defined as glucosamine N-acetyltransferases, were herein demonstrated to also be capable of acetylating the free amino group on the very first glycine residue of peptides, albeit with varied catalytic efficiency. The human recombinant N-acetyltransferase Naa10p, however, prefers primary amine groups in the peptides as opposed to glucosamine. The varied preference of GlmA, RmNag and Naa10p probably arose from the divergent evolution of these N-acetyltransferases. The expanded knowledge of acceptor specificity should also facilitate the application of these N-acetyltransferases in the acetylation of hexosamines or peptides.
Affiliation(s)
- Peiru Zhang
- College of Pharmacy and Tianjin Key Laboratory of Molecular Drug Research, Nankai University, Haihe Education Park, 38 Tongyan Road, Tianjin, 300353, PR China
- Pei Liu
- College of Pharmacy and Tianjin Key Laboratory of Molecular Drug Research, Nankai University, Haihe Education Park, 38 Tongyan Road, Tianjin, 300353, PR China
- Yangyang Xu
- College of Pharmacy and Tianjin Key Laboratory of Molecular Drug Research, Nankai University, Haihe Education Park, 38 Tongyan Road, Tianjin, 300353, PR China
- Yulu Liang
- College of Chemistry, Nankai University, Tianjin, 300071, PR China
- Peng George Wang
- College of Pharmacy and Tianjin Key Laboratory of Molecular Drug Research, Nankai University, Haihe Education Park, 38 Tongyan Road, Tianjin, 300353, PR China
- Jiansong Cheng
- College of Pharmacy and Tianjin Key Laboratory of Molecular Drug Research, Nankai University, Haihe Education Park, 38 Tongyan Road, Tianjin, 300353, PR China