Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Anteghini M, Martins dos Santos V, Saccenti E. In-Pero: Exploiting Deep Learning Embeddings of Protein Sequences to Predict the Localisation of Peroxisomal Proteins. Int J Mol Sci 2021;22:6409. [PMID: 34203866 PMCID: PMC8232616 DOI: 10.3390/ijms22126409] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 05/13/2021] [Revised: 05/31/2021] [Accepted: 06/09/2021] [Indexed: 01/28/2023] Open

Cited by Other Article(s)

Gillani M, Pollastri G. Protein subcellular localization prediction tools. Comput Struct Biotechnol J 2024;23:1796-1807. [PMID: 38707539 PMCID: PMC11066471 DOI: 10.1016/j.csbj.2024.04.032] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2024] [Revised: 04/11/2024] [Accepted: 04/11/2024] [Indexed: 05/07/2024] Open

Xiao C, Zhou Z, She J, Yin J, Cui F, Zhang Z. PEL-PVP: Application of plant vacuolar protein discriminator based on PEFT ESM-2 and bilayer LSTM in an unbalanced dataset. Int J Biol Macromol 2024;277:134317. [PMID: 39094861 DOI: 10.1016/j.ijbiomac.2024.134317] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2024] [Revised: 07/10/2024] [Accepted: 07/28/2024] [Indexed: 08/04/2024]

García Sánchez N, Ugarte Carro E, Prieto-Santamaría L, Rodríguez-González A. Protein sequence analysis in the context of drug repurposing. BMC Med Inform Decis Mak 2024;24:122. [PMID: 38741115 DOI: 10.1186/s12911-024-02531-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2023] [Accepted: 05/08/2024] [Indexed: 05/16/2024] Open

Anteghini M, Santos VAMD, Saccenti E. PortPred: Exploiting deep learning embeddings of amino acid sequences for the identification of transporter proteins and their substrates. J Cell Biochem 2023;124:1803-1824. [PMID: 37877557 DOI: 10.1002/jcb.30490] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2023] [Revised: 09/29/2023] [Accepted: 10/03/2023] [Indexed: 10/26/2023]

Sui J, Chen J, Chen Y, Iwamori N, Sun J. Identification of plant vacuole proteins by using graph neural network and contact maps. BMC Bioinformatics 2023;24:357. [PMID: 37740195 PMCID: PMC10517492 DOI: 10.1186/s12859-023-05475-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/13/2023] [Accepted: 09/12/2023] [Indexed: 09/24/2023] Open

Savojardo C, Martelli PL, Casadio R. Finding functional motifs in protein sequences with deep learning and natural language models. Curr Opin Struct Biol 2023;81:102641. [PMID: 37385080 DOI: 10.1016/j.sbi.2023.102641] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/17/2023] [Revised: 04/17/2023] [Accepted: 05/24/2023] [Indexed: 07/01/2023]

Shi Z, Deng R, Yuan Q, Mao Z, Wang R, Li H, Liao X, Ma H. Enzyme Commission Number Prediction and Benchmarking with Hierarchical Dual-core Multitask Learning Framework. Research (Washington, D.C.) 2023;6:0153. [PMID: 37275124 PMCID: PMC10232324 DOI: 10.34133/research.0153] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 03/07/2023] [Accepted: 04/28/2023] [Indexed: 06/07/2023]

Abstract

Enzyme commission (EC) numbers, which associate a protein sequence with the biochemical reactions it catalyzes, are essential for the accurate understanding of enzyme functions and cellular metabolism. Many ab initio computational approaches have been proposed to predict EC numbers for given input protein sequences. However, the prediction performance (accuracy, recall, and precision), usability, and efficiency of existing methods decrease seriously when dealing with recently discovered proteins, leaving much room for improvement. Here, we report HDMLF, a hierarchical dual-core multitask learning framework for accurately predicting EC numbers based on novel deep learning techniques. HDMLF is composed of an embedding core and a learning core; the embedding core adopts the latest protein language model for protein sequence embedding, and the learning core conducts the EC number prediction. Specifically, HDMLF is designed on the basis of a gated recurrent unit framework to perform EC number prediction in a multi-objective, hierarchical, multitasking manner. Additionally, we introduced an attention layer to optimize the EC prediction and employed a greedy strategy to integrate and fine-tune the final model. Comparative analyses against 4 representative methods demonstrate that HDMLF stably delivers the highest performance, improving accuracy and F1 score by 60% and 40% over the state of the art, respectively. An additional case study of tyrB predicted to compensate for the loss of aspartate aminotransferase aspC, as reported in a previous experimental study, shows that our model can also be used to uncover enzyme promiscuity. Finally, we established a web platform, namely, ECRECer (https://ecrecer.biodesign.ac.cn), using an entirely cloud-based serverless architecture and provided an offline bundle to improve usability.

Affiliation(s)

Zhenkun Shi: Biodesign Center, Key Laboratory of Engineering Biology for Low-carbon Manufacturing, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, 300308, Tianjin, China; National Center of Technology Innovation for Synthetic Biology, 300308, Tianjin, China
Rui Deng: Biodesign Center, Key Laboratory of Engineering Biology for Low-carbon Manufacturing, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, 300308, Tianjin, China; National Center of Technology Innovation for Synthetic Biology, 300308, Tianjin, China; College of Biotechnology, Tianjin University of Science & Technology, Tianjin, China
Qianqian Yuan: Biodesign Center, Key Laboratory of Engineering Biology for Low-carbon Manufacturing, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, 300308, Tianjin, China; National Center of Technology Innovation for Synthetic Biology, 300308, Tianjin, China
Zhitao Mao: Biodesign Center, Key Laboratory of Engineering Biology for Low-carbon Manufacturing, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, 300308, Tianjin, China; National Center of Technology Innovation for Synthetic Biology, 300308, Tianjin, China
Ruoyu Wang: Biodesign Center, Key Laboratory of Engineering Biology for Low-carbon Manufacturing, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, 300308, Tianjin, China; National Center of Technology Innovation for Synthetic Biology, 300308, Tianjin, China
Haoran Li: Biodesign Center, Key Laboratory of Engineering Biology for Low-carbon Manufacturing, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, 300308, Tianjin, China; National Center of Technology Innovation for Synthetic Biology, 300308, Tianjin, China
Xiaoping Liao: Biodesign Center, Key Laboratory of Engineering Biology for Low-carbon Manufacturing, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, 300308, Tianjin, China; National Center of Technology Innovation for Synthetic Biology, 300308, Tianjin, China; Haihe Laboratory of Synthetic Biology, 300308, Tianjin, China
Hongwu Ma: Biodesign Center, Key Laboratory of Engineering Biology for Low-carbon Manufacturing, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, 300308, Tianjin, China; National Center of Technology Innovation for Synthetic Biology, 300308, Tianjin, China

Anteghini M, Martins Dos Santos VAP. Computational Approaches for Peroxisomal Protein Localization. Methods Mol Biol 2023;2643:405-411. [PMID: 36952202 DOI: 10.1007/978-1-0716-3048-8_29] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/24/2023]

Anteghini M, Haja A, Martins dos Santos VA, Schomaker L, Saccenti E. OrganelX web server for sub-peroxisomal and sub-mitochondrial protein localization and peroxisomal target signal detection. Comput Struct Biotechnol J 2022;21:128-133. [PMID: 36544474 PMCID: PMC9747352 DOI: 10.1016/j.csbj.2022.11.058] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/20/2022] [Revised: 11/28/2022] [Accepted: 11/28/2022] [Indexed: 12/12/2022] Open

Nakai K, Wei L. Recent Advances in the Prediction of Subcellular Localization of Proteins and Related Topics. Frontiers in Bioinformatics 2022;2:910531. [PMID: 36304291 PMCID: PMC9580943 DOI: 10.3389/fbinf.2022.910531] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/01/2022] [Accepted: 04/25/2022] [Indexed: 11/13/2022] Open

Kamoshita M, Kumar R, Anteghini M, Kunze M, Islinger M, Martins dos Santos V, Schrader M. Insights Into the Peroxisomal Protein Inventory of Zebrafish. Front Physiol 2022;13:822509. [PMID: 35295584 PMCID: PMC8919083 DOI: 10.3389/fphys.2022.822509] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 11/25/2021] [Accepted: 02/07/2022] [Indexed: 12/19/2022] Open

Jiao S, Zou Q. Identification of plant vacuole proteins by exploiting deep representation learning features. Comput Struct Biotechnol J 2022;20:2921-2927. [PMID: 35765653 PMCID: PMC9207291 DOI: 10.1016/j.csbj.2022.06.002] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 02/18/2022] [Revised: 05/30/2022] [Accepted: 06/01/2022] [Indexed: 12/04/2022] Open

Abstract

Plant vacuoles are the most important organelles for plant growth, development, and defense, and they play an important role in many types of stress responses. An important function of vacuole proteins is the transport of various classes of amino acids, ions, sugars, and other molecules. Accurate identification of vacuole proteins is crucial for revealing their biological functions. Several automatic and rapid computational tools have been proposed for the subcellular localization of proteins. Regrettably, they are not specific for the identification of plant vacuole proteins. To the best of our knowledge, there is only one computational software specifically trained for plant vacuolar proteins. Although its accuracy is acceptable, the prediction performance and stability of this method in practical applications can still be improved. Hence, in this study, a new predictor named iPVP-DRLF was developed to identify plant vacuole proteins specifically and effectively. This prediction software is designed using the light gradient boosting machine (LGBM) algorithm and hybrid features composed of classic sequence features and deep representation learning features. iPVP-DRLF achieved fivefold cross-validation and independent test accuracy values of 88.25% and 87.16%, respectively, both outperforming previous state-of-the-art predictors. Moreover, the blind dataset test results also showed that the performance of iPVP-DRLF was significantly better than the existing tools. The results of comparative experiments confirmed that deep representation learning features have an advantage over other classic sequence features in the identification of plant vacuole proteins. We believe that iPVP-DRLF would serve as an effective computational technique for plant vacuole protein prediction and facilitate related future research. The online server is freely accessible at https://lab.malab.cn/~acy/iPVP-DRLF. In addition, the source code and datasets are also accessible at https://github.com/jiaoshihu/iPVP-DRLF.
