Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Wang D, Liang Y, Xu D. Capsule network for protein post-translational modification site prediction. Bioinformatics 2020;35:2386-2394. [PMID: 30520972 DOI: 10.1093/bioinformatics/bty977] [Citation(s) in RCA: 67] [Impact Index Per Article: 16.8] [Received: 05/07/2018] [Revised: 10/13/2018] [Accepted: 12/05/2018] [Indexed: 11/12/2022]



Cited by Other Article(s)

Chen Y, Sheng G, Wang G. CapsNet-TIS: Predicting translation initiation site based on multi-feature fusion and improved capsule network. Gene 2024;924:148598. [PMID: 38782224 DOI: 10.1016/j.gene.2024.148598] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2024] [Revised: 04/22/2024] [Accepted: 05/20/2024] [Indexed: 05/25/2024]

Yao L, Xie P, Guan J, Chung CR, Huang Y, Pang Y, Wu H, Chiang YC, Lee TY. CapsEnhancer: An Effective Computational Framework for Identifying Enhancers Based on Chaos Game Representation and Capsule Network. J Chem Inf Model 2024;64:5725-5736. [PMID: 38946113 PMCID: PMC11267569 DOI: 10.1021/acs.jcim.4c00546] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2024] [Revised: 06/21/2024] [Accepted: 06/21/2024] [Indexed: 07/02/2024]

Hu F, Gao J, Zheng J, Kwoh C, Jia C. N-GlycoPred: A hybrid deep learning model for accurate identification of N-glycosylation sites. Methods 2024;227:48-57. [PMID: 38734394 DOI: 10.1016/j.ymeth.2024.05.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2024] [Revised: 04/16/2024] [Accepted: 05/03/2024] [Indexed: 05/13/2024]

Gutierrez CS, Kassim AA, Gutierrez BD, Raines RT. Sitetack: A Deep Learning Model that Improves PTM Prediction by Using Known PTMs. bioRxiv 2024:2024.06.03.596298. [PMID: 38895359 PMCID: PMC11185516 DOI: 10.1101/2024.06.03.596298] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/21/2024]

Ke J, Zhao J, Li H, Yuan L, Dong G, Wang G. Prediction of protein N-terminal acetylation modification sites based on CNN-BiLSTM-attention model. Comput Biol Med 2024;174:108330. [PMID: 38588617 DOI: 10.1016/j.compbiomed.2024.108330] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/29/2024] [Revised: 03/06/2024] [Accepted: 03/17/2024] [Indexed: 04/10/2024]

Abstract

N-terminal acetylation is one of the most common and important post-translational modifications (PTMs) of eukaryotic proteins. PTMs play a crucial role in various cellular processes and disease pathogenesis. Thus, the accurate identification of N-terminal acetylation modifications is important to gain insight into cellular processes and other possible functional mechanisms. Although some algorithmic models have been proposed, most have been developed based on traditional machine learning algorithms and small training datasets, which limits their practical application. Deep learning models, by contrast, are better at handling high-throughput and complex data. In this study, DeepCBA, a deep learning model based on a hybrid framework of convolutional neural network (CNN), bidirectional long short-term memory network (BiLSTM), and attention mechanism, was constructed to detect N-terminal acetylation sites. DeepCBA was built as follows: First, a benchmark dataset was generated by selecting low-redundancy protein sequences from the UniProt database and further reducing the redundancy of the protein sequences using the CD-HIT tool. Subsequently, based on the skip-gram model in the word2vec algorithm, tripeptide word vector features were generated on the benchmark dataset. Finally, the CNN, BiLSTM, and attention mechanism were combined, and the tripeptide word vector features were fed into the stacked model for multiple rounds of training. The model performed well on the independent test dataset, with an accuracy and area under the curve of 80.51% and 87.36%, respectively. Altogether, DeepCBA achieved superior performance compared with the baseline model, and significantly outperformed most existing predictors. Additionally, our model can be used to identify disease loci and drug targets.
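The tripeptide "word" step described in this abstract can be illustrated with a minimal sketch. It is not the DeepCBA code: the function names, the overlapping stride of one residue, the window size, and the example sequence are all assumptions made for illustration. The abstract only states that tripeptide word vectors were learned with the skip-gram model of word2vec.

```python
def tripeptide_tokens(sequence, k=3):
    """Split a protein sequence into overlapping k-residue words.

    Assumption: a stride of one residue; the abstract does not say
    whether the tripeptides overlap.
    """
    sequence = sequence.upper()
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def skipgram_pairs(tokens, window=2):
    """List the (center, context) pairs a skip-gram model would train on."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


# Hypothetical 8-residue peptide, used only to show the data shapes.
tokens = tripeptide_tokens("MSTNPKPQ")
pairs = skipgram_pairs(tokens, window=1)
```

In practice the token lists would be passed to a word2vec implementation configured for skip-gram training, and the resulting vectors would form the input to the CNN, BiLSTM, and attention layers.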


Gao J, Zhao Y, Chen C, Ning Q. MVNN-HNHC: A multi-view neural network for identification of human non-histone crotonylation sites. Anal Biochem 2024;687:115426. [PMID: 38141798 DOI: 10.1016/j.ab.2023.115426] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/28/2023] [Revised: 11/21/2023] [Accepted: 12/06/2023] [Indexed: 12/25/2023]

Zahiri Z, Mehrshad N, Mehrshad M. DF-Phos: Prediction of Protein Phosphorylation Sites by Deep Forest. J Biochem 2024;175:447-456. [PMID: 38153271 DOI: 10.1093/jb/mvad116] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2023] [Revised: 12/10/2023] [Accepted: 12/12/2023] [Indexed: 12/29/2023]

León-García F, García-Laynes F, Estrada-Tapia G, Monforte-González M, Martínez-Estevez M, Echevarría-Machado I. In Silico Analysis of Glutamate Receptors in Capsicum chinense: Structure, Evolution, and Molecular Interactions. Plants (Basel) 2024;13:812. [PMID: 38592787 PMCID: PMC10975470 DOI: 10.3390/plants13060812] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2024] [Revised: 02/27/2024] [Accepted: 03/06/2024] [Indexed: 04/11/2024]

Behairy MY, Tawfik NZ, Eid RA, Nasser Binjawhar D, Alshaya DS, Fayad E, Elkhatib WF, Abdallah HY. Mannose-binding lectin gene polymorphism in psoriasis and vitiligo: an observational study and computational analysis. Front Med (Lausanne) 2024;10:1340703. [PMID: 38404462 PMCID: PMC10885344 DOI: 10.3389/fmed.2023.1340703] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/18/2023] [Accepted: 12/28/2023] [Indexed: 02/27/2024]

Abstract

Introduction

Psoriasis and vitiligo are inflammatory autoimmune skin disorders with remarkable genetic involvement. Mannose-binding lectin (MBL) represents a significant immune molecule with one of its gene variants strongly linked to autoimmune diseases. Therefore, in this study, we investigated the role of the MBL variant, rs1800450, in psoriasis and vitiligo disease susceptibility.

Methods

The study comprised an in silico analysis, an observational study of psoriasis patients, and an observational study of vitiligo patients. Various in silico tools were used to investigate the impact of the selected mutation on the function, stability, post-translational modifications (PTMs), and secondary structures of the protein. In addition, a total of 489 subjects were enrolled in this study, and their demographic and clinicopathological data were collected. Genotyping analysis was performed using real-time PCR for the single nucleotide polymorphism (SNP) rs1800450 on codon 54 of the MBL gene, utilizing TaqMan genotyping technology. In addition, implications of the studied variant on disease susceptibility and various clinicopathological data were analyzed.

Results

Computational analysis demonstrated the anticipated effects of the mutation on MBL protein. Furthermore, regarding the observational studies, rs1800450 SNP on codon 54 displayed comparable results in our population relative to global frequencies reported via the 1000 Genomes Project. This SNP showed no significant association with either psoriasis or vitiligo disease risk in all genetic association models. Furthermore, rs1800450 SNP did not significantly correlate with any of the demographic or clinicopathological features of either psoriasis or vitiligo.

Discussion

Our findings highlighted that the rs1800450 SNP on the MBL2 gene has no role in the disease susceptibility to autoimmune skin diseases, such as psoriasis and vitiligo, among Egyptian patients. In addition, our analysis advocated the notion of the redundancy of MBL and revealed the lack of significant impact on both psoriasis and vitiligo disorders.


Jia J, Lv P, Wei X, Qiu W. SNO-DCA: A model for predicting S-nitrosylation sites based on densely connected convolutional networks and attention mechanism. Heliyon 2024;10:e23187. [PMID: 38148797 PMCID: PMC10750070 DOI: 10.1016/j.heliyon.2023.e23187] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2023] [Revised: 11/22/2023] [Accepted: 11/29/2023] [Indexed: 12/28/2023]

Hu F, Li W, Li Y, Hou C, Ma J, Jia C. O-GlcNAcPRED-DL: Prediction of Protein O-GlcNAcylation Sites Based on an Ensemble Model of Deep Learning. J Proteome Res 2024;23:95-106. [PMID: 38054441 DOI: 10.1021/acs.jproteome.3c00458] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/07/2023]

Waury K, Gogishvili D, Nieuwland R, Chatterjee M, Teunissen CE, Abeln S. Proteome encoded determinants of protein sorting into extracellular vesicles. J Extracell Biol 2024;3:e120. [PMID: 38938677 PMCID: PMC11080751 DOI: 10.1002/jex2.120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/21/2023] [Revised: 09/13/2023] [Accepted: 10/05/2023] [Indexed: 06/29/2024]

Abstract

Extracellular vesicles (EVs) are membranous structures released by cells into the extracellular space and are thought to be involved in cell-to-cell communication. While EVs and their cargo are promising biomarker candidates, sorting mechanisms of proteins to EVs remain unclear. In this study, we ask if it is possible to determine EV association based on the protein sequence. Additionally, we ask what the most important determinants are for EV association. We answer these questions with explainable AI models, using human proteome data from EV databases to train and validate the model. It is essential to correct the datasets for contaminants introduced by coarse EV isolation workflows and for experimental bias caused by mass spectrometry. In this study, we show that it is indeed possible to predict EV association from the protein sequence: a simple sequence-based model for predicting EV proteins achieved an area under the curve of 0.77 ± 0.01, which increased further to 0.84 ± 0.00 when incorporating curated post-translational modification (PTM) annotations. Feature analysis shows that EV-associated proteins are stable, polar, and structured with low isoelectric point compared to non-EV proteins. PTM annotations emerged as the most important features for correct classification; specifically, palmitoylation is one of the most prevalent EV sorting mechanisms for unique proteins. Palmitoylation and nitrosylation sites are especially prevalent in EV proteins that are determined by very strict isolation protocols, indicating they could potentially serve as quality control criteria for future studies. This computational study offers an effective sequence-based predictor of EV associated proteins with extensive characterisation of the human EV proteome that can explain for individual proteins which factors contribute to their EV association.


Xie J, Quan L, Wang X, Wu H, Jin Z, Pan D, Chen T, Wu T, Lyu Q. DeepMPSF: A Deep Learning Network for Predicting General Protein Phosphorylation Sites Based on Multiple Protein Sequence Features. J Chem Inf Model 2023;63:7258-7271. [PMID: 37931253 DOI: 10.1021/acs.jcim.3c00996] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/08/2023]

Abstract

Phosphorylation, as one of the most important post-translational modifications, plays a key role in various cellular physiological processes and disease occurrences. In recent years, computer technology has been gradually applied to the prediction of protein phosphorylation sites. However, most existing methods rely on simple protein sequence features that provide limited contextual information. To overcome this limitation, we propose DeepMPSF, a phosphorylation site prediction model based on multiple protein sequence features. There are two types of features: sequence semantic features, which comprise protein residue type information and relative position information within the protein sequence, and protein background biophysical features, which include global semantic information containing more comprehensive protein background information obtained from pretrained models. To extract these features, DeepMPSF employs two separate subnetworks: the SSFE module and the BBFE module, which automatically extract high-level semantic features. Our model incorporates a learning strategy for handling imbalanced datasets through ensemble learning during training and prediction. DeepMPSF is trained and evaluated on a well-established dataset of human proteins. Comparative analysis with other benchmark methods shows that DeepMPSF outperforms them in predicting both S/T residues and Y residues. In particular, DeepMPSF showed excellent generalization in cross-species blind tests, with average improvements of 5.63%/5.72%, 22.28%/25.94%, 20.11%/17.49%, and 26.40%/28.33% on the Mus musculus/Rattus norvegicus test sets in the area under the ROC curve (AUC), the AUC of the PR curve, F1-score, and MCC, respectively. Furthermore, it also shows excellent performance in the latest updated case of natural proteins with functional phosphorylation sites. Through an ablation study and visual analysis, we show that the design of the different feature modules contributes significantly to the classification accuracy of DeepMPSF, which provides valuable insights for predicting phosphorylation sites and offers effective support for future downstream research.


Shen Z, Liu W, Zhao S, Zhang Q, Wang S, Yuan L. Nucleotide-level prediction of CircRNA-protein binding based on fully convolutional neural network. Front Genet 2023;14:1283404. [PMID: 37867600 PMCID: PMC10587422 DOI: 10.3389/fgene.2023.1283404] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/26/2023] [Accepted: 09/21/2023] [Indexed: 10/24/2023]

Liang Z, Liu T, Li Q, Zhang G, Zhang B, Du X, Liu J, Chen Z, Ding H, Hu G, Lin H, Zhu F, Luo C. Deciphering the functional landscape of phosphosites with deep neural network. Cell Rep 2023;42:113048. [PMID: 37659078 DOI: 10.1016/j.celrep.2023.113048] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 05/19/2023] [Revised: 07/11/2023] [Accepted: 08/11/2023] [Indexed: 09/04/2023]

Affiliation(s)

Zhongjie Liang Center for Systems Biology, Department of Bioinformatics, School of Biology and Basic Medical Sciences, Soochow University, Suzhou 215123, China; Jiangsu Province Engineering Research Center of Precision Diagnostics and Therapeutics Development, Soochow University, Suzhou 215123, China
Tonghai Liu Zhongshan Institute for Drug Discovery, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Zhongshan 528437, China; State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
Qi Li Zhongshan Institute for Drug Discovery, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Zhongshan 528437, China; State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
Guangyu Zhang School of Computer Science and Technology, Soochow University, Suzhou 215006, China
Bei Zhang State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
Xikun Du Center for Systems Biology, Department of Bioinformatics, School of Biology and Basic Medical Sciences, Soochow University, Suzhou 215123, China
Jingqiu Liu State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
Zhifeng Chen State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
Hong Ding State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
Guang Hu Center for Systems Biology, Department of Bioinformatics, School of Biology and Basic Medical Sciences, Soochow University, Suzhou 215123, China; Jiangsu Province Engineering Research Center of Precision Diagnostics and Therapeutics Development, Soochow University, Suzhou 215123, China
Hao Lin School of Computer Science and Technology, Soochow University, Suzhou 215006, China
Fei Zhu School of Computer Science and Technology, Soochow University, Suzhou 215006, China.
Cheng Luo Zhongshan Institute for Drug Discovery, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Zhongshan 528437, China; State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China; School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, UCAS, Hangzhou 310024, China; School of Life Science and Technology, Shanghai Tech University, 100 Haike Road, Shanghai 201210, China; School of Pharmacy, Fujian Medical University, Fuzhou 350122, China.


Shoombuatong W, Schaduangrat N, Nikom J. Empirical comparison and analysis of machine learning-based approaches for druggable protein identification. EXCLI J 2023;22:915-927. [PMID: 37780939 PMCID: PMC10539545 DOI: 10.17179/excli2023-6410] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2023] [Accepted: 08/15/2023] [Indexed: 10/03/2023]

Abstract

Efficiently and precisely identifying drug targets is crucial for developing and discovering potential medications. While conventional experimental approaches can accurately pinpoint these targets, they suffer from time constraints and are not easily adaptable to high-throughput processes. On the other hand, computational approaches, particularly those utilizing machine learning (ML), offer an efficient means to accelerate the prediction of druggable proteins based solely on their primary sequences. Recently, several state-of-the-art computational methods have been developed for predicting and analyzing druggable proteins. These computational methods showed high diversity in terms of benchmark datasets, feature extraction schemes, ML algorithms, evaluation strategies and webserver/software usability. Thus, our objective is to reexamine these computational approaches and conduct a comprehensive assessment of their strengths and weaknesses across multiple aspects. In this study, we deliver the first comprehensive survey regarding the state-of-the-art computational approaches for in silico prediction of druggable proteins. First, we provided information regarding the existing benchmark datasets and the types of ML methods employed. Second, we investigated the effectiveness of these computational methods in druggable protein identification for each benchmark dataset. Third, we summarized the important features used in this field and the existing webserver/software. Finally, we addressed the present constraints of the existing methods and offer valuable guidance to the scientific community in designing and developing novel prediction models. We anticipate that this comprehensive review will provide crucial information for the development of more accurate and efficient druggable protein predictors.


Zhang Z, Li F, Zhao J, Zheng C. CapsNetYY1: identifying YY1-mediated chromatin loops based on a capsule network architecture. BMC Genomics 2023;24:448. [PMID: 37559017 PMCID: PMC10410878 DOI: 10.1186/s12864-023-09217-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2022] [Accepted: 02/28/2023] [Indexed: 08/11/2023]

Pakhrin SC, Pokharel S, Pratyush P, Chaudhari M, Ismail HD, Kc DB. LMPhosSite: A Deep Learning-Based Approach for General Protein Phosphorylation Site Prediction Using Embeddings from the Local Window Sequence and Pretrained Protein Language Model. J Proteome Res 2023;22:2548-2557. [PMID: 37459437 DOI: 10.1021/acs.jproteome.2c00667] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 08/05/2023]

Ibtehaz N, Sourav SMSH, Bayzid MS, Rahman MS. Align-gram: Rethinking the Skip-gram Model for Protein Sequence Analysis. Protein J 2023;42:135-146. [PMID: 36977849 DOI: 10.1007/s10930-023-10096-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Accepted: 02/13/2023] [Indexed: 03/29/2023]

Liver CT Image Recognition Method Based on Capsule Network. Information 2023. [DOI: 10.3390/info14030183] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/17/2023]

In Silico Examination of Single Nucleotide Missense Mutations in NHLH2, a Gene Linked to Infertility and Obesity. Int J Mol Sci 2023;24:ijms24043193. [PMID: 36834605 PMCID: PMC9968165 DOI: 10.3390/ijms24043193] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/29/2022] [Revised: 01/25/2023] [Accepted: 01/30/2023] [Indexed: 02/09/2023]

Abstract

Continual advances in our understanding of the human genome have led to exponential increases in known single nucleotide variants. The characterization of each of the variants lags behind. For researchers needing to study a single gene, or multiple genes in a pathway, there must be ways to narrow down pathogenic variants from those that are silent or pose less pathogenicity. In this study, we use the NHLH2 gene which encodes the nescient helix-loop-helix 2 (Nhlh2) transcription factor in a systematic analysis of all missense mutations to date in the gene. The NHLH2 gene was first described in 1992. Knockout mice created in 1997 indicated a role for this protein in body weight control, puberty, and fertility, as well as the motivation for sex and exercise. Only recently have human carriers of NHLH2 missense variants been characterized. Over 300 missense variants for the NHLH2 gene are listed in the NCBI single nucleotide polymorphism database (dbSNP). Using in silico tools, predicted pathogenicity of the variants narrowed the missense variants to 37 which were predicted to affect NHLH2 function. These 37 variants cluster around the basic-helix-loop-helix and DNA binding domains of the transcription factor, and further analysis using in silico tools provided 21 SNV resulting in 22 amino acid changes for future wet lab analysis. The tools used, findings, and predictions for the variants are discussed considering the known function of the NHLH2 transcription factor. Overall use of these in silico tools and analysis of these data contribute to our knowledge of a protein which is both involved in the human genetic syndrome, Prader-Willi syndrome, and in controlling genes involved in body weight control, fertility, puberty, and behavior in the general population, and may provide a systematic methodology for others to characterize variants for their gene of interest.


Zhou Z, Yeung W, Gravel N, Salcedo M, Soleymani S, Li S, Kannan N. Phosformer: an explainable transformer model for protein kinase-specific phosphorylation predictions. Bioinformatics 2023;39:7000331. [PMID: 36692152 PMCID: PMC9900213 DOI: 10.1093/bioinformatics/btad046] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 07/12/2022] [Revised: 01/16/2023] [Accepted: 01/23/2023] [Indexed: 01/25/2023]

Abstract

MOTIVATION

The human genome encodes over 500 distinct protein kinases which regulate nearly all cellular processes by the specific phosphorylation of protein substrates. While advances in mass spectrometry and proteomics studies have identified thousands of phosphorylation sites across species, information on the specific kinases that phosphorylate these sites is currently lacking for the vast majority of phosphosites. Recently, there has been a major focus on the development of computational models for predicting kinase-substrate associations. However, most current models only allow predictions on a subset of well-studied kinases. Furthermore, the utilization of hand-curated features and imbalances in training and testing datasets pose unique challenges in the development of accurate predictive models for kinase-specific phosphorylation prediction. Motivated by the recent development of universal protein language models which automatically generate context-aware features from primary sequence information, we sought to develop a unified framework for kinase-specific phosphosite prediction, allowing for greater investigative utility and enabling substrate predictions at the whole kinome level.

RESULTS

We present a deep learning model for kinase-specific phosphosite prediction, termed Phosformer, which predicts the probability of phosphorylation given an arbitrary pair of unaligned kinase and substrate peptide sequences. We demonstrate that Phosformer implicitly learns evolutionary and functional features during training, removing the need for feature curation and engineering. Further analyses reveal that Phosformer also learns substrate specificity motifs and is able to distinguish between functionally distinct kinase families. Benchmarks indicate that Phosformer exhibits significant improvements compared to the state-of-the-art models, while also presenting a more generalized, unified, and interpretable predictive framework.

AVAILABILITY AND IMPLEMENTATION

Code and data are available at https://github.com/esbgkannan/phosformer.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.


Patton BK, Madadi S, Briley SM, Ahmed AA, Pangas SA. Sumoylation regulates functional properties of the oocyte transcription factors SOHLH1 and NOBOX. FASEB J 2023;37:e22747. [PMID: 36607631 PMCID: PMC10129296 DOI: 10.1096/fj.202201481r] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2022] [Revised: 12/02/2022] [Accepted: 12/19/2022] [Indexed: 01/07/2023]

Li Z, Gao E, Zhou J, Han W, Xu X, Gao X. Applications of deep learning in understanding gene regulation. Cell Rep Methods 2023;3:100384. [PMID: 36814848 PMCID: PMC9939384 DOI: 10.1016/j.crmeth.2022.100384] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Indexed: 05/22/2023]

Affiliation(s)

Zhongxiao Li Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia; KAUST Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
Elva Gao The KAUST School, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
Juexiao Zhou Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia; KAUST Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
Wenkai Han Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia; KAUST Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
Xiaopeng Xu Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia; KAUST Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
Xin Gao Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia; KAUST Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia

A Novel Capsule Network with Attention Routing to Identify Prokaryote Phosphorylation Sites. Biomolecules 2022;12:biom12121854. [PMID: 36551282 PMCID: PMC9775645 DOI: 10.3390/biom12121854] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2022] [Revised: 12/07/2022] [Accepted: 12/09/2022] [Indexed: 12/14/2022] Open

Abstract

By denaturing proteins and promoting the formation of multiprotein complexes, protein phosphorylation has important effects on the activity of protein functional molecules and cell signaling. The regulation of protein phosphorylation allows microbes to respond rapidly and reversibly to specific environmental stimuli or niches, which is closely related to the molecular mechanisms of bacterial drug resistance. Accurate prediction of phosphorylation sites (p-site) of prokaryotes can contribute to addressing bacterial resistance and providing new perspectives for developing novel antibacterial drugs. Most existing studies focus on human phosphorylation sites, while tools targeting phosphorylation site identification of prokaryotic proteins are still relatively scarce. This study designs a capsule network-based prediction technique for p-site in prokaryotes. To address the poor scalability and unreliability of dynamic routing processes in the output space of capsule networks, a more reliable way is introduced to learn the consistency between capsules. We incorporate a self-attention mechanism into the routing algorithm to capture the global information of the capsule, reducing the computational effort while enriching the representation capability of the capsule. Aiming at the weak robustness of the model, EcapsP improves the prediction accuracy and stability by introducing shortcuts and unconditional reconfiguration. In addition, the study compares and analyzes the prediction performance based on word vectors, physicochemical properties, and mixing characteristics in predicting serine (Ser/S), threonine (Thr/T), and tyrosine (Tyr/Y) p-site. The comprehensive experimental results show that the accuracy of the developed technique is close to 70% for the identification of the three phosphorylation sites in prokaryotes. Importantly, in side-by-side comparisons with other state-of-the-art predictors, our method improves the Matthews correlation coefficient (MCC) by approximately 7%. The results demonstrate the superiority of EcapsP in terms of high performance and reliability.

Li W, Wang J, Luo Y, Bezabih TT. Multi-dimensional feature recognition model based on capsule network for ubiquitination site prediction. PeerJ 2022;10:e14427. [PMID: 36523471 PMCID: PMC9745908 DOI: 10.7717/peerj.14427] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/23/2022] [Accepted: 10/30/2022] [Indexed: 12/12/2022] Open

Khanal J, Kandel J, Tayara H, Chong KT. CapsNh-Kcr: Capsule network-based prediction of lysine crotonylation sites in human non-histone proteins. Comput Struct Biotechnol J 2022;21:120-127. [PMID: 36544479 PMCID: PMC9735261 DOI: 10.1016/j.csbj.2022.11.056] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 07/20/2022] [Revised: 11/10/2022] [Accepted: 11/26/2022] [Indexed: 12/04/2022] Open

Jia J, Wu G, Li M, Qiu W. pSuc-EDBAM: Predicting lysine succinylation sites in proteins based on ensemble dense blocks and an attention module. BMC Bioinformatics 2022;23:450. [PMCID: PMC9620660 DOI: 10.1186/s12859-022-05001-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 08/16/2022] [Accepted: 10/25/2022] [Indexed: 11/10/2022] Open

Zhao J, Jiang H, Zou G, Lin Q, Wang Q, Liu J, Ma L. CNNArginineMe: A CNN structure for training models for predicting arginine methylation sites based on the One-Hot encoding of peptide sequence. Front Genet 2022;13:1036862. [PMID: 36324513 PMCID: PMC9618650 DOI: 10.3389/fgene.2022.1036862] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 09/06/2022] [Accepted: 10/04/2022] [Indexed: 11/30/2022] Open

Behairy MY, Soltan MA, Adam MS, Refaat AM, Ezz EM, Albogami S, Fayad E, Althobaiti F, Gouda AM, Sileem AE, Elfaky MA, Darwish KM, Alaa Eldeen M. Computational Analysis of Deleterious SNPs in NRAS to Assess Their Potential Correlation With Carcinogenesis. Front Genet 2022;13:872845. [PMID: 36051694 PMCID: PMC9424727 DOI: 10.3389/fgene.2022.872845] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 02/11/2022] [Accepted: 06/03/2022] [Indexed: 12/12/2022] Open

Abstract

The NRAS gene is a well-known oncogene that acts as a major player in carcinogenesis. Mutations in the NRAS gene have been linked to multiple types of human tumors. Therefore, the identification of the most deleterious single nucleotide polymorphisms (SNPs) in the NRAS gene is necessary to understand the key factors of tumor pathogenesis and therapy. We aimed to retrieve NRAS missense SNPs and analyze them comprehensively using sequence and structure approaches to determine the most deleterious SNPs that could increase the risk of carcinogenesis. We also adopted structural biology methods and docking tools to investigate the behavior of the filtered SNPs. After retrieving missense SNPs and analyzing them using six in silico tools, 17 mutations were found to be the most deleterious mutations in NRAS. All SNPs except S145L were found to decrease NRAS stability, and all SNPs were found on highly conserved residues and important functional domains, except R164C. In addition, all mutations except G60E and S145L showed a higher binding affinity to GTP, implicating an increase in malignancy tendency. As a consequence, all other 14 mutations were expected to increase the risk of carcinogenesis, with 5 mutations (G13R, G13C, G13V, P34R, and V152F) expected to have the highest risk. Thermodynamic stability was ensured for these SNP models through molecular dynamics simulation based on trajectory analysis. Free binding affinity toward the natural substrate, GTP, was higher for these models as compared to the native NRAS protein. The Gly13 SNP proteins depict a differential conformational state that could favor nucleotide exchange and catalytic potentiality. A further application of experimental methods with all these 14 mutations could reveal new insights into the pathogenesis and management of different types of tumors.

Mini-review: Recent advances in post-translational modification site prediction based on deep learning. Comput Struct Biotechnol J 2022;20:3522-3532. [PMID: 35860402 PMCID: PMC9284371 DOI: 10.1016/j.csbj.2022.06.045] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 03/11/2022] [Revised: 06/21/2022] [Accepted: 06/21/2022] [Indexed: 11/23/2022] Open

DeepDA-Ace: A Novel Domain Adaptation Method for Species-Specific Acetylation Site Prediction. Mathematics 2022. [DOI: 10.3390/math10142364] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/27/2023]

Kang W, Liu L, Yu P, Zhang T, Lei C, Nie Z. A switchable Cas12a enabling CRISPR-based direct histone deacetylase activity detection. Biosens Bioelectron 2022;213:114468. [PMID: 35700604 DOI: 10.1016/j.bios.2022.114468] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 03/22/2022] [Revised: 05/30/2022] [Accepted: 06/06/2022] [Indexed: 11/02/2022]

Classification of Blood Cells Using Optimized Capsule Networks. Neural Process Lett 2022. [DOI: 10.1007/s11063-022-10833-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/25/2022]

A Comprehensive Review of Computation-Based Metal-Binding Prediction Approaches at the Residue Level. BioMed Research International 2022;2022:8965712. [PMID: 35402609 PMCID: PMC8989566 DOI: 10.1155/2022/8965712] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/02/2022] [Accepted: 03/04/2022] [Indexed: 12/29/2022]

Wang H, Zhao H, Zhang J, Han J, Liu Z. A parallel model of DenseCNN and ordered-neuron LSTM for generic and species-specific succinylation site prediction. Biotechnol Bioeng 2022;119:1755-1767. [PMID: 35320585 DOI: 10.1002/bit.28091] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2021] [Revised: 03/12/2022] [Accepted: 03/19/2022] [Indexed: 11/07/2022]

Dou L, Zhang Z, Xu L, Zou Q. iKcr_CNN: A novel computational tool for imbalance classification of human nonhistone crotonylation sites based on convolutional neural networks with focal loss. Comput Struct Biotechnol J 2022;20:3268-3279. [PMID: 35832615 PMCID: PMC9251780 DOI: 10.1016/j.csbj.2022.06.032] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 03/20/2022] [Revised: 06/13/2022] [Accepted: 06/13/2022] [Indexed: 11/26/2022] Open

Iannetta AA, Hicks LM. Maximizing Depth of PTM Coverage: Generating Robust MS Datasets for Computational Prediction Modeling. Methods Mol Biol 2022;2499:1-41. [PMID: 35696073 DOI: 10.1007/978-1-0716-2317-6_1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]

Arico DS, Beati P, Wengier DL, Mazzella MA. A novel strategy to uncover specific GO terms/phosphorylation pathways in phosphoproteomic data in Arabidopsis thaliana. BMC Plant Biology 2021;21:592. [PMID: 34906086 PMCID: PMC8670200 DOI: 10.1186/s12870-021-03377-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2021] [Accepted: 11/29/2021] [Indexed: 06/14/2023]

Abstract

BACKGROUND

Proteins are the workforce of the cell and their phosphorylation status tailors specific responses efficiently. One of the main challenges of phosphoproteomic approaches is to deconvolute biological processes that specifically respond to an experimental query from a list of phosphoproteins. Comparison of the frequency distribution of GO (Gene Ontology) terms in a given phosphoproteome set with that observed in the genome reference set (GenRS) is the most widely used tool to infer biological significance. Yet, this comparison assumes that the GO term distributions of the phosphoproteome and the genome are identical. This hypothesis has not been tested, however, due to the lack of a comprehensive phosphoproteome database.

RESULTS

In this study, we test this hypothesis by constructing three phosphoproteome databases in Arabidopsis thaliana: one based on experimental data (ExpRS), another based on in silico phosphorylation protein prediction (PredRS) and a third that is the union of both (UnRS). Our results show that the three phosphoproteome reference sets show default enrichment of several GO terms compared to GenRS, indicating that GO term distribution in the phosphoproteomes does not match that of the genome. Moreover, these differences overshadow the identification of GO terms that are specifically enriched in a particular condition. To overcome this limitation, we present an additional comparison of the sample of interest with UnRS to uncover GO terms specifically enriched in a particular phosphoproteome experiment. Using this strategy, we found that mRNA splicing and cytoplasmic microtubule compounds are important processes specifically enriched in the phosphoproteome of dark-grown Arabidopsis seedlings.

CONCLUSIONS

This study provides a novel strategy to uncover specific GO terms in phosphoproteome data of Arabidopsis that could be applied to any other organism. We also highlight the importance of specific phosphorylation pathways that take place during dark-grown Arabidopsis development.

Khanal J, Tayara H, Zou Q, To Chong K. DeepCap-Kcr: accurate identification and investigation of protein lysine crotonylation sites based on capsule network. Brief Bioinform 2021;23:6457166. [PMID: 34882222 DOI: 10.1093/bib/bbab492] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 09/03/2021] [Revised: 10/13/2021] [Accepted: 10/25/2021] [Indexed: 12/22/2022] Open

Gong Y, Xue D, Chuai G, Yu J, Liu Q. DeepReac+: deep active learning for quantitative modeling of organic chemical reactions. Chem Sci 2021;12:14459-14472. [PMID: 34880997 PMCID: PMC8580052 DOI: 10.1039/d1sc02087k] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 04/14/2021] [Accepted: 10/08/2021] [Indexed: 11/21/2022] Open

He F, Li J, Wang R, Zhao X, Han Y. An Ensemble Deep Learning based Predictor for Simultaneously Identifying Protein Ubiquitylation and SUMOylation Sites. BMC Bioinformatics 2021;22:519. [PMID: 34689734 PMCID: PMC8543953 DOI: 10.1186/s12859-021-04445-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/12/2021] [Accepted: 10/15/2021] [Indexed: 11/23/2022] Open

Wang H, Zhao J, Zhao H, Li H, Wang J. CL-ACP: a parallel combination of CNN and LSTM anticancer peptide recognition model. BMC Bioinformatics 2021;22:512. [PMID: 34670488 PMCID: PMC8527680 DOI: 10.1186/s12859-021-04433-9] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/10/2021] [Accepted: 10/05/2021] [Indexed: 01/10/2023] Open

Wang Y, Wang B, Jiang J, Guo J, Lai J, Lian XY, Wu J. Multitask CapsNet: An Imbalanced Data Deep Learning Method for Predicting Toxicants. ACS Omega 2021;6:26545-26555. [PMID: 34661009 PMCID: PMC8515573 DOI: 10.1021/acsomega.1c03842] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/20/2021] [Accepted: 09/14/2021] [Indexed: 05/17/2023]

Zheng J, Xiao X, Qiu WR. iCDI-W2vCom: Identifying the Ion Channel-Drug Interaction in Cellular Networking Based on word2vec and node2vec. Front Genet 2021;12:738274. [PMID: 34567088 PMCID: PMC8458815 DOI: 10.3389/fgene.2021.738274] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 07/08/2021] [Accepted: 08/02/2021] [Indexed: 12/04/2022] Open

Qiu W, Lv Z, Xiao X, Shao S, Lin H. EMCBOW-GPCR: A method for identifying G-protein coupled receptors based on word embedding and wordbooks. Comput Struct Biotechnol J 2021;19:4961-4969. [PMID: 34527200 PMCID: PMC8437786 DOI: 10.1016/j.csbj.2021.08.044] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 04/27/2021] [Revised: 08/07/2021] [Accepted: 08/27/2021] [Indexed: 11/15/2022] Open

Bandyopadhyay SS, Halder AK, Zaręba-Kozioł M, Bartkowiak-Kaczmarek A, Dutta A, Chatterjee P, Nasipuri M, Wójtowicz T, Wlodarczyk J, Basu S. RFCM-PALM: In-Silico Prediction of S-Palmitoylation Sites in the Synaptic Proteins for Male/Female Mouse Data. Int J Mol Sci 2021;22:ijms22189901. [PMID: 34576064 PMCID: PMC8467992 DOI: 10.3390/ijms22189901] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/19/2021] [Revised: 09/07/2021] [Accepted: 09/09/2021] [Indexed: 11/16/2022] Open

Affiliation(s)

Soumyendu Sekhar Bandyopadhyay Department of Computer Science and Engineering, Jadavpur University, Kolkata 700032, India; Department of Computer Science and Engineering, School of Engineering and Technology, Adamas University, Barasat, Kolkata 700126, India
Anup Kumar Halder Department of Computer Science and Engineering, Jadavpur University, Kolkata 700032, India; Department of Computer Science and Engineering, University of Engineering & Management, Kolkata 700156, India
Monika Zaręba-Kozioł The Nencki Institute of Experimental Biology, Polish Academy of Sciences, 3 Pasteur Street, 02-093 Warsaw, Poland
Anna Bartkowiak-Kaczmarek The Nencki Institute of Experimental Biology, Polish Academy of Sciences, 3 Pasteur Street, 02-093 Warsaw, Poland
Aviinandaan Dutta Department of Computer Science and Engineering, Jadavpur University, Kolkata 700032, India
Piyali Chatterjee Department of Computer Science and Engineering, Netaji Subhash Engineering College, Kolkata 700152, India
Mita Nasipuri Department of Computer Science and Engineering, Jadavpur University, Kolkata 700032, India
Tomasz Wójtowicz The Nencki Institute of Experimental Biology, Polish Academy of Sciences, 3 Pasteur Street, 02-093 Warsaw, Poland
Jakub Wlodarczyk The Nencki Institute of Experimental Biology, Polish Academy of Sciences, 3 Pasteur Street, 02-093 Warsaw, Poland. Correspondence: J.W. and S.B.
Subhadip Basu Department of Computer Science and Engineering, Jadavpur University, Kolkata 700032, India. Correspondence: J.W. and S.B.

Li Y, Pu F, Wang J, Zhou Z, Zhang C, He F, Ma Z, Zhang J. Machine Learning Methods in Prediction of Protein Palmitoylation Sites: A Brief Review. Curr Pharm Des 2021;27:2189-2198. [PMID: 33183190 DOI: 10.2174/1381612826666201112142826] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 05/13/2020] [Accepted: 07/27/2020] [Indexed: 11/22/2022]

Perpetuo L, Klein J, Ferreira R, Guedes S, Amado F, Leite-Moreira A, Silva AMS, Thongboonkerd V, Vitorino R. How can artificial intelligence be used for peptidomics? Expert Rev Proteomics 2021;18:527-556. [PMID: 34343059 DOI: 10.1080/14789450.2021.1962303] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 01/01/2023]

Yang H, Wang M, Liu X, Zhao XM, Li A. PhosIDN: an integrated deep neural network for improving protein phosphorylation site prediction by combining sequence and protein-protein interaction information. Bioinformatics 2021;37:4668-4676. [PMID: 34320631 PMCID: PMC8665744 DOI: 10.1093/bioinformatics/btab551] [Citation(s) in RCA: 32] [Impact Index Per Article: 10.7] [Received: 11/23/2020] [Revised: 06/22/2021] [Accepted: 07/27/2021] [Indexed: 11/29/2022] Open

Abstract

Motivation

Phosphorylation is one of the most studied post-translational modifications and plays a pivotal role in various cellular processes. Recently, deep learning methods have achieved great success in the prediction of phosphorylation sites, but most of them are based on convolutional neural networks, which may not capture enough information about long-range dependencies between residues in a protein sequence. In addition, existing deep learning methods only make use of sequence information for predicting phosphorylation sites, and it is highly desirable to develop a deep learning architecture that can combine heterogeneous sequence and protein–protein interaction (PPI) information for more accurate phosphorylation site prediction.

Results

We present a novel integrated deep neural network named PhosIDN, for phosphorylation site prediction by extracting and combining sequence and PPI information. In PhosIDN, a sequence feature encoding sub-network is proposed to capture not only local patterns but also long-range dependencies from protein sequences. Meanwhile, useful PPI features are also extracted in PhosIDN by a PPI feature encoding sub-network adopting a multi-layer deep neural network. Moreover, to effectively combine sequence and PPI information, a heterogeneous feature combination sub-network is introduced to fully exploit the complex associations between sequence and PPI features, and their combined features are used for final prediction. Comprehensive experiment results demonstrate that the proposed PhosIDN significantly improves the prediction performance of phosphorylation sites and compares favorably with existing general and kinase-specific phosphorylation site prediction methods.

Availability and implementation

PhosIDN is freely available at https://github.com/ustchangyuanyang/PhosIDN.

Supplementary information

Supplementary data are available at Bioinformatics online.
