Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Charoenkwan P, Nantasenamat C, Hasan MM, Manavalan B, Shoombuatong W. BERT4Bitter: a bidirectional encoder representations from transformers (BERT)-based model for improving the prediction of bitter peptides. Bioinformatics 2021;37:2556-2562. [PMID: 33638635 DOI: 10.1093/bioinformatics/btab133] [Citation(s) in RCA: 77] [Impact Index Per Article: 25.7] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/29/2020] [Revised: 02/08/2021] [Accepted: 02/24/2021] [Indexed: 12/11/2022] Open

Cited by Other Article(s)

Hu Y, Badar IH, Liu Y, Zhu Y, Yang L, Kong B, Xu B. Advancements in production, assessment, and food applications of salty and saltiness-enhancing peptides: A review. Food Chem 2024;453:139664. [PMID: 38761739 DOI: 10.1016/j.foodchem.2024.139664] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/19/2024] [Revised: 05/01/2024] [Accepted: 05/12/2024] [Indexed: 05/20/2024]

Richter P, Sebald K, Fischer K, Schnieke A, Jlilati M, Mittermeier-Klessinger V, Somoza V. Gastric digestion of the sweet-tasting plant protein thaumatin releases bitter peptides that reduce H. pylori induced pro-inflammatory IL-17A release via the TAS2R16 bitter taste receptor. Food Chem 2024;448:139157. [PMID: 38569411 DOI: 10.1016/j.foodchem.2024.139157] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/23/2023] [Revised: 03/08/2024] [Accepted: 03/25/2024] [Indexed: 04/05/2024]

Zhai S, Tan Y, Zhu C, Zhang C, Gao Y, Mao Q, Zhang Y, Duan H, Yin Y. PepExplainer: An explainable deep learning model for selection-based macrocyclic peptide bioactivity prediction and optimization. Eur J Med Chem 2024;275:116628. [PMID: 38944933 DOI: 10.1016/j.ejmech.2024.116628] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/17/2024] [Revised: 06/21/2024] [Accepted: 06/24/2024] [Indexed: 07/02/2024]

Zhang J, Zhao L, Wang W, Zhang Q, Wang XT, Xing DF, Ren NQ, Lee DJ, Chen C. Large language model for horizontal transfer of resistance gene: From resistance gene prevalence detection to plasmid conjugation rate evaluation. Sci Total Environ 2024;931:172466. [PMID: 38626826 DOI: 10.1016/j.scitotenv.2024.172466] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/17/2024] [Revised: 04/10/2024] [Accepted: 04/11/2024] [Indexed: 05/07/2024]

Lavecchia A. Advancing drug discovery with deep attention neural networks. Drug Discov Today 2024;29:104067. [PMID: 38925473 DOI: 10.1016/j.drudis.2024.104067] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/17/2024] [Revised: 06/10/2024] [Accepted: 06/19/2024] [Indexed: 06/28/2024]

Nguyen VN, Ho TT, Doan TD, Le NQK. Using a hybrid neural network architecture for DNA sequence representation: A study on N⁴-methylcytosine sites. Comput Biol Med 2024;178:108664. [PMID: 38875905 DOI: 10.1016/j.compbiomed.2024.108664] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/05/2024] [Revised: 05/11/2024] [Accepted: 05/26/2024] [Indexed: 06/16/2024]

Park JH, Prasad V, Newsom S, Najar F, Rajan R. IdMotif: An Interactive Motif Identification in Protein Sequences. IEEE Comput Graph Appl 2024;44:114-125. [PMID: 38127603 DOI: 10.1109/mcg.2023.3345742] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/23/2023]

Fang C, He J, Yamana H. MoRF_ESM: Prediction of MoRFs in disordered proteins based on a deep transformer protein language model. J Bioinform Comput Biol 2024;22:2450006. [PMID: 38812466 DOI: 10.1142/s0219720024500069] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 05/31/2024]

Abstract

Molecular recognition features (MoRFs) are functional segments of disordered proteins that play crucial roles in regulating the phase transition of membrane-less organelles and frequently serve as central sites in cellular interaction networks. As more associations between disordered proteins and severe diseases are discovered, identifying MoRFs has gained growing significance. Because few MoRFs have been experimentally validated, the performance of existing MoRF prediction algorithms still needs improvement. In this research, we present a model named MoRF_ESM, which utilizes deep-learning protein representations to predict MoRFs in disordered proteins. This approach employs a pretrained ESM-2 protein language model to generate embedding representations of residues in the form of attention map matrices. These representations are combined with a self-learned TextCNN model for feature extraction and prediction. In addition, an averaging step was incorporated at the end of the MoRF_ESM model to refine the output and generate final prediction results. Compared with other methods on benchmark datasets, the MoRF_ESM approach demonstrates state-of-the-art performance, achieving [Formula: see text] higher AUC than other methods when tested on TEST1 and [Formula: see text] higher AUC than other methods when tested on TEST2. These results imply that the combination of ESM-2 and TextCNN can effectively extract deep evolutionary features related to protein structure and function, while also capturing shallow pattern features in protein sequences, and is well suited to the MoRF prediction task. Given that ESM-2 is a highly versatile protein language model, the methodology proposed in this study can be readily applied to other tasks involving the classification of protein sequences.
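The abstract mentions an averaging step that refines the per-residue output but does not describe it. The sketch below shows one common way such a step is done, a centered moving average over per-residue scores. The function name `smooth_scores`, the window size, and the edge handling are all assumptions for illustration, not the MoRF_ESM implementation.

```python
def smooth_scores(scores, window=5):
    """Centered moving average over per-residue scores.

    Assumption: the window simply shrinks at the sequence ends,
    so each output is the mean of the scores that actually exist.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(scores)):
        lo = max(0, i - half)
        hi = min(len(scores), i + half + 1)
        smoothed.append(sum(scores[lo:hi]) / (hi - lo))
    return smoothed


# Hypothetical raw scores: a single isolated high-scoring residue.
raw = [0.0, 0.0, 1.0, 0.0, 0.0]
print(smooth_scores(raw, window=3))
```

Averaging of this kind spreads an isolated spike over its neighbours, which suits targets such as MoRFs that occur as contiguous segments rather than single residues.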

Hesamzadeh P, Seif A, Mahmoudzadeh K, Ganjali Koli M, Mostafazadeh A, Nayeri K, Mirjafary Z, Saeidian H. De novo antioxidant peptide design via machine learning and DFT studies. Sci Rep 2024;14:6473. [PMID: 38499731 PMCID: PMC10948870 DOI: 10.1038/s41598-024-57247-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/29/2023] [Accepted: 03/15/2024] [Indexed: 03/20/2024] Open

He Y, Liu K, Liu Y, Han W. Prediction of bitterness based on modular designed graph neural network. Bioinform Adv 2024;4:vbae041. [PMID: 38566918 PMCID: PMC10987211 DOI: 10.1093/bioadv/vbae041] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 12/13/2023] [Revised: 01/31/2024] [Accepted: 03/11/2024] [Indexed: 04/04/2024]

Banić M, Butorac K, Čuljak N, Butorac A, Novak J, Pavunc AL, Rušanac A, Stanečić Ž, Lovrić M, Šušković J, Kos B. An Integrated Comprehensive Peptidomics and In Silico Analysis of Bioactive Peptide-Rich Milk Fermented by Three Autochthonous Cocci Strains. Int J Mol Sci 2024;25:2431. [PMID: 38397111 PMCID: PMC10888711 DOI: 10.3390/ijms25042431] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/11/2024] [Revised: 02/12/2024] [Accepted: 02/17/2024] [Indexed: 02/25/2024] Open

Affiliation(s)

Martina Banić, Katarina Butorac, Nina Čuljak, Jasna Novak, Andreja Leboš Pavunc, Anamarija Rušanac, Jagoda Šušković, Blaženka Kos: Laboratory for Antibiotic, Enzyme, Probiotic and Starter Culture Technologies, Faculty of Food Technology and Biotechnology, University of Zagreb, Pierottijeva 6, 10000 Zagreb, Croatia
Ana Butorac, Željka Stanečić, Marija Lovrić: BICRO Biocentre Ltd., Borongajska cesta 83H, 10000 Zagreb, Croatia

Zou H. iDPPIV-SI: identifying dipeptidyl peptidase IV inhibitory peptides by using multiple sequence information. J Biomol Struct Dyn 2024;42:2144-2152. [PMID: 37125813 DOI: 10.1080/07391102.2023.2203257] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/25/2022] [Accepted: 04/10/2023] [Indexed: 05/02/2023]

Yu Y, Liu S, Zhang X, Yu W, Pei X, Liu L, Jin Y. Identification and prediction of milk-derived bitter taste peptides based on peptidomics technology and machine learning method. Food Chem 2024;433:137288. [PMID: 37683467 DOI: 10.1016/j.foodchem.2023.137288] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/26/2023] [Revised: 07/19/2023] [Accepted: 08/24/2023] [Indexed: 09/10/2023]

Zulfiqar H, Guo Z, Ahmad RM, Ahmed Z, Cai P, Chen X, Zhang Y, Lin H, Shi Z. Deep-STP: a deep learning-based approach to predict snake toxin proteins by using word embeddings. Front Med (Lausanne) 2024;10:1291352. [PMID: 38298505 PMCID: PMC10829051 DOI: 10.3389/fmed.2023.1291352] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/09/2023] [Accepted: 12/26/2023] [Indexed: 02/02/2024] Open

Li W, Li G, Sun Y, Zhang L, Cui X, Jia Y, Zhao T. Prediction of SARS-CoV-2 Infection Phosphorylation Sites and Associations of these Modifications with Lung Cancer Development. Curr Gene Ther 2024;24:239-248. [PMID: 37957848 DOI: 10.2174/0115665232268074231026111634] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/20/2023] [Revised: 10/05/2023] [Accepted: 10/07/2023] [Indexed: 11/15/2023]

Dhibar S, Jana B. Accurate Prediction of Antifreeze Protein from Sequences through Natural Language Text Processing and Interpretable Machine Learning Approaches. J Phys Chem Lett 2023;14:10727-10735. [PMID: 38009833 DOI: 10.1021/acs.jpclett.3c02817] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2023]

Dutta P, Jain D, Gupta R, Rai B. Classification of tastants: A deep learning based approach. Mol Inform 2023;42:e202300146. [PMID: 37885360 DOI: 10.1002/minf.202300146] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/14/2023] [Revised: 09/26/2023] [Accepted: 10/26/2023] [Indexed: 10/28/2023]

Le NQK. Leveraging transformers-based language models in proteome bioinformatics. Proteomics 2023;23:e2300011. [PMID: 37381841 DOI: 10.1002/pmic.202300011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/14/2023] [Revised: 06/13/2023] [Accepted: 06/13/2023] [Indexed: 06/30/2023]

Zhang J, Yan W, Zhang Q, Li Z, Liang L, Zuo M, Zhang Y. Umami-BERT: An interpretable BERT-based model for umami peptides prediction. Food Res Int 2023;172:113142. [PMID: 37689906 DOI: 10.1016/j.foodres.2023.113142] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/04/2023] [Revised: 06/12/2023] [Accepted: 06/13/2023] [Indexed: 09/11/2023]

He J, Zhang S, Fang C. AAindex-PPII: Predicting polyproline type II helix structure based on amino acid indexes with an improved BiGRU-TextCNN model. J Bioinform Comput Biol 2023;21:2350022. [PMID: 37899354 DOI: 10.1142/s0219720023500221] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/31/2023]

Zhang X, Guo H, Zhang F, Wang X, Wu K, Qiu S, Liu B, Wang Y, Hu Y, Li J. HNetGO: protein function prediction via heterogeneous network transformer. Brief Bioinform 2023;24:bbab556. [PMID: 37861172 PMCID: PMC10588005 DOI: 10.1093/bib/bbab556] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/05/2021] [Revised: 11/18/2021] [Accepted: 12/04/2021] [Indexed: 10/21/2023] Open

Rojas C, Ballabio D, Consonni V, Suárez-Estrella D, Todeschini R. Classification-based machine learning approaches to predict the taste of molecules: A review. Food Res Int 2023;171:113036. [PMID: 37330849 DOI: 10.1016/j.foodres.2023.113036] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/14/2023] [Revised: 05/02/2023] [Accepted: 05/22/2023] [Indexed: 06/19/2023]

Zhai S, Tan Y, Zhang C, Hipolito CJ, Song L, Zhu C, Zhang Y, Duan H, Yin Y. PepScaf: Harnessing Machine Learning with In Vitro Selection toward De Novo Macrocyclic Peptides against IL-17C/IL-17RE Interaction. J Med Chem 2023;66:11187-11200. [PMID: 37480587 DOI: 10.1021/acs.jmedchem.3c00627] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 07/24/2023]

Charoenkwan P, Schaduangrat N, Shoombuatong W. StackTTCA: a stacking ensemble learning-based framework for accurate and high-throughput identification of tumor T cell antigens. BMC Bioinformatics 2023;24:301. [PMID: 37507654 PMCID: PMC10386778 DOI: 10.1186/s12859-023-05421-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/11/2023] [Accepted: 07/19/2023] [Indexed: 07/30/2023] Open

Li F, Liu S, Li K, Zhang Y, Duan M, Yao Z, Zhu G, Guo Y, Wang Y, Huang L, Zhou F. EpiTEAmDNA: Sequence feature representation via transfer learning and ensemble learning for identifying multiple DNA epigenetic modification types across species. Comput Biol Med 2023;160:107030. [PMID: 37196456 DOI: 10.1016/j.compbiomed.2023.107030] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/02/2023] [Revised: 04/21/2023] [Accepted: 05/10/2023] [Indexed: 05/19/2023]

Affiliation(s)

Fei Li, Shuai Liu, Kewei Li, Yaqi Zhang, Meiyu Duan, Gancheng Zhu, Ying Wang, Lan Huang, Fengfeng Zhou: Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin, 130012, China; College of Computer Science and Technology, Jilin University, Changchun, Jilin, 130012, China
Zhaomin Yao: College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, Liaoning, 110167, China
Yutong Guo: College of Life Sciences, Jilin University, Changchun, Jilin, 130012, China

Schaduangrat N, Anuwongcharoen N, Charoenkwan P, Shoombuatong W. DeepAR: a novel deep learning-based hybrid framework for the interpretable prediction of androgen receptor antagonists. J Cheminform 2023;15:50. [PMID: 37149650 PMCID: PMC10163717 DOI: 10.1186/s13321-023-00721-z] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/26/2022] [Accepted: 04/08/2023] [Indexed: 05/08/2023] Open

Abstract

Drug resistance represents a major obstacle to therapeutic innovations and is a prevalent feature in prostate cancer (PCa). Androgen receptors (ARs) are the hallmark therapeutic target for prostate cancer modulation and AR antagonists have achieved great success. However, rapid emergence of resistance contributing to PCa progression is the ultimate burden of their long-term usage. Hence, the discovery and development of AR antagonists with the capability to combat this resistance remains an avenue for further exploration. Therefore, this study proposes a novel deep learning (DL)-based hybrid framework, named DeepAR, to accurately and rapidly identify AR antagonists by using only the SMILES notation. Specifically, DeepAR is capable of extracting and learning the key information embedded in AR antagonists. Firstly, we established a benchmark dataset by collecting active and inactive compounds against AR from the ChEMBL database. Based on this dataset, we developed and optimized a collection of baseline models by using a comprehensive set of well-known molecular descriptors and machine learning algorithms. Then, these baseline models were utilized for creating probabilistic features. Finally, these probabilistic features were combined and used for the construction of a meta-model based on a one-dimensional convolutional neural network. Experimental results indicated that DeepAR is a more accurate and stable approach for identifying AR antagonists in terms of the independent test dataset, achieving an accuracy of 0.911 and MCC of 0.823. In addition, our proposed framework is able to provide feature importance information by leveraging a popular computational approach, named SHapley Additive exPlanations (SHAP). Meanwhile, the characterization and analysis of potential AR antagonist candidates were achieved through the SHAP waterfall plot and molecular docking. The analysis inferred that N-heterocyclic moieties, halogenated substituents, and a cyano functional group were significant determinants of potential AR antagonists. Lastly, we implemented an online web server by using DeepAR (at http://pmlabstack.pythonanywhere.com/DeepAR). We anticipate that DeepAR could be a useful computational tool for community-wide facilitation of AR candidates from a large number of uncharacterized compounds.
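The abstract reports accuracy and the Matthews correlation coefficient (MCC) on an independent test set. As a reminder of how the two metrics relate to a binary confusion matrix, here is a minimal sketch using the standard definitions. The counts in the example are hypothetical and are not taken from the DeepAR paper.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix.

    Returns 0.0 when any marginal total is zero (a common convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical, balanced test set of 100 compounds.
print(accuracy(tp=45, tn=45, fp=5, fn=5))  # 0.9
print(mcc(tp=45, tn=45, fp=5, fn=5))       # 0.8
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when actives and inactives are unevenly represented.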

Thafar MA, Albaradei S, Uludag M, Alshahrani M, Gojobori T, Essack M, Gao X. OncoRTT: Predicting novel oncology-related therapeutic targets using BERT embeddings and omics features. Front Genet 2023;14:1139626. [PMID: 37091791 PMCID: PMC10117673 DOI: 10.3389/fgene.2023.1139626] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/07/2023] [Accepted: 03/24/2023] [Indexed: 04/08/2023] Open

Affiliation(s)

Maha A. Thafar: Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center, Computer (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia; College of Computers and Information Technology, Computer Science Department, Taif University, Taif, Saudi Arabia
Somayah Albaradei: Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center, Computer (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia; Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Saudi Arabia
Mahmut Uludag, Takashi Gojobori, Magbubah Essack, Xin Gao: Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center, Computer (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
Mona Alshahrani: National Center for Artificial Intelligence (NCAI), Saudi Data and Artificial Intelligence Authority (SDAIA), Riyadh, Saudi Arabia
Correspondence: Xin Gao; Magbubah Essack

Jiang J, Li J, Li J, Pei H, Li M, Zou Q, Lv Z. A Machine Learning Method to Identify Umami Peptide Sequences by Using Multiplicative LSTM Embedded Features. Foods 2023;12:1498. [PMID: 37048319 PMCID: PMC10094688 DOI: 10.3390/foods12071498] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/26/2023] [Revised: 03/24/2023] [Accepted: 03/30/2023] [Indexed: 04/05/2023] Open

Fan Y, Sun G, Pan X. ELMo4m6A: A Contextual Language Embedding-Based Predictor for Detecting RNA N6-Methyladenosine Sites. IEEE/ACM Trans Comput Biol Bioinform 2023;20:944-954. [PMID: 35536814 DOI: 10.1109/tcbb.2022.3173323] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/04/2023]

Charoenkwan P, Schaduangrat N, Pham NT, Manavalan B, Shoombuatong W. Pretoria: An effective computational approach for accurate and high-throughput identification of CD8+ t-cell epitopes of eukaryotic pathogens. Int J Biol Macromol 2023;238:124228. [PMID: 36996953 DOI: 10.1016/j.ijbiomac.2023.124228] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/23/2022] [Revised: 03/11/2023] [Accepted: 03/25/2023] [Indexed: 03/31/2023]

Abstract

T-cells recognize antigenic epitopes present on major histocompatibility complex (MHC) molecules, triggering an adaptive immune response in the host. T-cell epitope (TCE) identification is challenging because of the extensive number of undetermined proteins found in eukaryotic pathogens, as well as MHC polymorphisms. In addition, conventional experimental approaches for TCE identification are time-consuming and expensive. Thus, computational approaches that can accurately and rapidly identify CD8⁺ T-cell epitopes (TCEs) of eukaryotic pathogens based solely on sequence information may facilitate the discovery of novel CD8⁺ TCEs in a cost-effective manner. Here, Pretoria (Predictor of CD8⁺ TCEs of eukaryotic pathogens) is proposed as the first stack-based approach for accurate and large-scale identification of CD8⁺ TCEs of eukaryotic pathogens. In particular, Pretoria enabled the extraction and exploration of crucial information embedded in CD8⁺ TCEs by employing a comprehensive set of 12 well-known feature descriptors extracted from multiple groups, including physicochemical properties, composition-transition-distribution, pseudo-amino acid composition, and amino acid composition. These feature descriptors were then utilized to construct a pool of 144 different machine learning (ML)-based classifiers based on 12 popular ML algorithms. Finally, the feature selection method was used to effectively determine the important ML classifiers for the construction of our stacked model. The experimental results indicated that Pretoria is an accurate and effective computational approach for CD8⁺ TCE prediction; it was superior to several conventional ML classifiers and the existing method in terms of the independent test, with an accuracy of 0.866, MCC of 0.732, and AUC of 0.921. Additionally, to maximize user convenience for high-throughput identification of CD8⁺ TCEs of eukaryotic pathogens, a user-friendly web server of Pretoria (http://pmlabstack.pythonanywhere.com/Pretoria) was developed and made freely available.
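Two details in this abstract lend themselves to a small illustration: amino acid composition, one of the descriptor groups named, and the pool of 144 classifiers, which is consistent with pairing each of 12 descriptors with each of 12 algorithms. The sketch below is an assumption-laden illustration only. The function name, the handling of non-standard residues, and the placeholder descriptor and algorithm labels are hypothetical and do not come from the Pretoria paper.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(peptide):
    """Return the 20-dimensional fraction of each standard residue.

    Assumption: characters outside the 20 standard residues are ignored.
    """
    counts = {aa: 0 for aa in AMINO_ACIDS}
    valid = 0
    for residue in peptide.upper():
        if residue in counts:
            counts[residue] += 1
            valid += 1
    if valid == 0:
        raise ValueError("peptide contains no standard residues")
    return [counts[aa] / valid for aa in AMINO_ACIDS]


# Placeholder labels (hypothetical): one classifier per descriptor/algorithm pair.
descriptors = [f"descriptor_{i}" for i in range(1, 13)]
algorithms = [f"algorithm_{j}" for j in range(1, 13)]
classifier_pool = list(product(descriptors, algorithms))

print(len(classifier_pool))  # 144
print(amino_acid_composition("AAC")[:2])
```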

Deep learning drives efficient discovery of novel antihypertensive peptides from soybean protein isolate. Food Chem 2023;404:134690. [DOI: 10.1016/j.foodchem.2022.134690] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/25/2022] [Revised: 09/29/2022] [Accepted: 10/17/2022] [Indexed: 11/06/2022]

Zhang YF, Wang YH, Gu ZF, Pan XR, Li J, Ding H, Zhang Y, Deng KJ. Bitter-RF: A random forest machine model for recognizing bitter peptides. Front Med (Lausanne) 2023;10:1052923. [PMID: 36778738 PMCID: PMC9909039 DOI: 10.3389/fmed.2023.1052923] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/24/2022] [Accepted: 01/05/2023] [Indexed: 01/27/2023] Open

Yan J, Tong H. An overview of bitter compounds in foodstuffs: Classifications, evaluation methods for sensory contribution, separation and identification techniques, and mechanism of bitter taste transduction. Compr Rev Food Sci Food Saf 2023;22:187-232. [PMID: 36382875 DOI: 10.1111/1541-4337.13067] [Citation(s) in RCA: 12] [Impact Index Per Article: 12.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/15/2022] [Revised: 08/24/2022] [Accepted: 10/11/2022] [Indexed: 11/17/2022]

Gao W, Xu D, Li H, Du J, Wang G, Li D. Identification of adaptor proteins by incorporating deep learning and PSSM profiles. Methods 2023;209:10-17. [PMID: 36427763 DOI: 10.1016/j.ymeth.2022.11.001] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/13/2022] [Revised: 10/25/2022] [Accepted: 11/02/2022] [Indexed: 11/23/2022] Open

Abstract

Adaptor proteins, also known as signal transduction adaptor proteins, are important proteins in signal transduction pathways, and play a role in connecting signal proteins for signal transduction between cells. Studies have shown that adaptor proteins are closely related to some diseases, such as tumors and diabetes. Therefore, it is valuable to construct a model that accurately identifies adaptor proteins. In recent years, many studies have used a position-specific scoring matrix (PSSM) and neural network methods to identify adaptor proteins. However, ordinary neural network models cannot correlate the contextual information in PSSM profiles well, so these studies usually reduce the 20×N (N > 20) PSSM to 20×20 dimensions, which results in the loss of a large amount of protein information. This research proposes an efficient method that combines a one-dimensional convolutional neural network (1-D CNN) and a bidirectional long short-term memory network (biLSTM) to identify adaptor proteins. The complete PSSM profiles are the input of the model, and the complete information of the protein is retained during the training process. We perform cross-validation during model training and test the performance of the model on an independent test set. On a data set with 1224 adaptor proteins and 11,078 non-adaptor proteins, five indicators (specificity, sensitivity, accuracy, area under the receiver operating characteristic curve (AUC), and Matthews correlation coefficient (MCC)) were employed to evaluate model performance. On the independent test set, the specificity, sensitivity, accuracy and MCC were 0.817, 0.865, 0.823 and 0.465, respectively. These results show that our method outperforms state-of-the-art methods. This study aims to improve the accuracy of adaptor protein identification and lays a foundation for further research on diseases related to adaptor proteins. It also provides a new idea for the application of deep learning models in bioinformatics and computational biology.
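
As an illustrative aside (not taken from the cited paper), the four threshold-based indicators named in the abstract can be computed from confusion-matrix counts as sketched below. The function name `binary_metrics` and the counts are hypothetical; they were chosen only to show that, on an imbalanced data set, MCC can be modest even when sensitivity and specificity are both high.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, accuracy and MCC from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical, imbalanced counts (100 positives, 1000 negatives);
# these are NOT the counts from the cited study.
metrics = binary_metrics(tp=90, fp=200, tn=800, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts, sensitivity is 0.9 and specificity is 0.8, yet MCC stays below 0.5 because false positives outnumber true positives in the minority class.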

PSRTTCA: A new approach for improving the prediction and characterization of tumor T cell antigens using propensity score representation learning. Comput Biol Med 2023;152:106368. [PMID: 36481763 DOI: 10.1016/j.compbiomed.2022.106368] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/27/2022] [Revised: 10/19/2022] [Accepted: 11/25/2022] [Indexed: 11/27/2022]

Abstract

Despite the arsenal of existing cancer therapies, the ongoing recurrence and new cases of cancer pose a serious health concern that necessitates the development of new and effective treatments. Cancer immunotherapy, which uses the body's immune system to combat cancer, is a promising treatment option. As a result, in silico methods for identifying and characterizing tumor T cell antigens (TTCAs) would be useful for better understanding their functional mechanisms. Although a few computational methods for TTCA identification have been developed, their lack of model interpretability is a major drawback. Thus, developing computational methods for the effective identification and characterization of TTCAs is a critical endeavor. PSRTTCA, a new machine learning (ML)-based approach for improving the identification and characterization of TTCAs based on their primary sequences, is proposed in this study. Specifically, we introduce a new propensity score representation learning algorithm that allows one to generate various sets of propensity scores of amino acids, dipeptides, and g-gap dipeptides to be TTCAs. To enhance the predictive performance, optimal sets of variant propensity scores were determined and fed into the final meta-predictor (PSRTTCA). Benchmarking results revealed that PSRTTCA was a more precise and promising tool for the identification and characterization of TTCAs than conventional ML classifiers and existing methods. Furthermore, PSR-derived propensities of amino acids in becoming TTCAs are used to reveal the relationship between TTCAs and their informative physicochemical properties in order to provide insights into TTCA characteristics. Finally, a user-friendly online computational platform of PSRTTCA is publicly available at http://pmlabstack.pythonanywhere.com/PSRTTCA. The PSRTTCA predictor is anticipated to facilitate community-wide efforts in accelerating the discovery of novel TTCAs for cancer immunotherapy and other clinical applications.
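
For readers unfamiliar with the term, a g-gap dipeptide is commonly understood as a pair of residues separated by g intervening positions, so g = 0 gives ordinary adjacent dipeptides. The sketch below only counts such pairs in a sequence; it is not the propensity score learning algorithm of the cited paper, and the function name and example peptide are hypothetical.

```python
from collections import Counter


def g_gap_dipeptides(seq, g):
    """Count residue pairs separated by g intervening residues.

    g = 0 yields ordinary (adjacent) dipeptides.
    """
    if g < 0:
        raise ValueError("g must be non-negative")
    # Pair residue i with residue i + g + 1; an empty Counter results
    # when the sequence is too short for the requested gap.
    return Counter(seq[i] + seq[i + g + 1] for i in range(len(seq) - g - 1))


# Hypothetical four-residue peptide used purely for illustration.
peptide = "ACDA"
adjacent = g_gap_dipeptides(peptide, 0)   # pairs AC, CD, DA
one_gap = g_gap_dipeptides(peptide, 1)    # pairs AD, CA
print(dict(adjacent))
print(dict(one_gap))
```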

Naravane T, Tagkopoulos I. Machine learning models to predict micronutrient profile in food after processing. Curr Res Food Sci 2023;6:100500. [PMID: 37151381 PMCID: PMC10160345 DOI: 10.1016/j.crfs.2023.100500] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/26/2022] [Revised: 02/28/2023] [Accepted: 04/02/2023] [Indexed: 05/09/2023] Open

Pallante L, Korfiati A, Androutsos L, Stojceski F, Bompotas A, Giannikos I, Raftopoulos C, Malavolta M, Grasso G, Mavroudi S, Kalogeras A, Martos V, Amoroso D, Piga D, Theofilatos K, Deriu MA. Toward a general and interpretable umami taste predictor using a multi-objective machine learning approach. Sci Rep 2022;12:21735. [PMID: 36526644 PMCID: PMC9758219 DOI: 10.1038/s41598-022-25935-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/01/2022] [Accepted: 12/07/2022] [Indexed: 12/23/2022] Open

Affiliation(s)

Lorenzo Pallante, Department of Mechanical and Aerospace Engineering, Politecnico di Torino, PolitoBIOMedLab, 10129 Torino, Italy (grid.4800.c; ISNI 0000 0004 1937 0343)
Aigli Korfiati, InSyBio PC, 265 04 Patras, Greece
Lampros Androutsos, InSyBio PC, 265 04 Patras, Greece
Filip Stojceski, Department of Innovative Technologies, Dalle Molle Institute for Artificial Intelligence, 6962 Lugano-Viganello, Switzerland
Agorakis Bompotas, Industrial Systems Institute, Athena Research Center, 265 04 Patras, Greece (grid.435019.a; ISNI 0000 0004 0394 1287)
Ioannis Giannikos, Industrial Systems Institute, Athena Research Center, 265 04 Patras, Greece (grid.435019.a; ISNI 0000 0004 0394 1287)
Christos Raftopoulos, Industrial Systems Institute, Athena Research Center, 265 04 Patras, Greece (grid.435019.a; ISNI 0000 0004 0394 1287)
Marta Malavolta, Faculty of Computer and Information Science, University of Ljubljana, 1000 Ljubljana, Slovenia (grid.8954.0; ISNI 0000 0001 0721 6013)
Gianvito Grasso, Department of Innovative Technologies, Dalle Molle Institute for Artificial Intelligence, 6962 Lugano-Viganello, Switzerland
Seferina Mavroudi, InSyBio PC, 265 04 Patras, Greece; Department of Nursing, University of Patras, 265 04 Patras, Greece (grid.11047.33; ISNI 0000 0004 0576 5395)
Athanasios Kalogeras, Industrial Systems Institute, Athena Research Center, 265 04 Patras, Greece (grid.435019.a; ISNI 0000 0004 0394 1287)
Vanessa Martos, Department of Plant Physiology, Institute of Biotechnology, University of Granada, 18011 Granada, Spain (grid.4489.1; ISNI 0000 0001 2167 8994)
Daria Amoroso, 7hc srl, 00198 Rome, Italy
Dario Piga, Department of Innovative Technologies, Dalle Molle Institute for Artificial Intelligence, 6962 Lugano-Viganello, Switzerland
Konstantinos Theofilatos, InSyBio PC, 265 04 Patras, Greece
Marco A. Deriu, Department of Mechanical and Aerospace Engineering, Politecnico di Torino, PolitoBIOMedLab, 10129 Torino, Italy (grid.4800.c; ISNI 0000 0004 1937 0343)

IUP-BERT: Identification of Umami Peptides Based on BERT Features. Foods 2022;11:foods11223742. [PMID: 36429332 PMCID: PMC9689418 DOI: 10.3390/foods11223742] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/23/2022] [Revised: 11/14/2022] [Accepted: 11/16/2022] [Indexed: 11/23/2022] Open

Charoenkwan P, Schaduangrat N, Lio P, Moni MA, Chumnanpuen P, Shoombuatong W. iAMAP-SCM: A Novel Computational Tool for Large-Scale Identification of Antimalarial Peptides Using Estimated Propensity Scores of Dipeptides. ACS OMEGA 2022;7:41082-41095. [PMID: 36406571 PMCID: PMC9670693 DOI: 10.1021/acsomega.2c04465] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 07/15/2022] [Accepted: 10/20/2022] [Indexed: 06/16/2023]

Abstract

Antimalarial peptides (AMAPs) varying in length, amino acid composition, charge, conformational structure, hydrophobicity, and amphipathicity reflect their diversity in antimalarial mechanisms. Due to the worldwide major health problem concerning antimicrobial resistance, these peptides possess great therapeutic value owing to their low incidences of drug resistance as compared to conventional antibiotics. Although well-known experimental methods are able to precisely determine the antimalarial activity of peptides, these methods are still time-consuming and costly. Thus, machine learning (ML)-based methods that are capable of identifying AMAPs rapidly by using only sequence information would be beneficial for the high-throughput identification of AMAPs. In this study, we propose the first computational model (termed iAMAP-SCM) for the large-scale identification and characterization of peptides with antimalarial activity by using only sequence information. Specifically, we employed an interpretable scoring card method (SCM) to develop iAMAP-SCM and estimate propensities of 20 amino acids and 400 dipeptides to be AMAPs in a supervised manner. Experimental results showed that iAMAP-SCM could achieve a maximum accuracy and Matthews correlation coefficient of 0.957 and 0.834, respectively, on the independent test dataset. In addition, SCM-derived propensities of 20 amino acids and selected physicochemical properties were used to provide an understanding of the functional mechanisms of AMAPs. Finally, a user-friendly online computational platform of iAMAP-SCM is publicly available at http://pmlabstack.pythonanywhere.com/iAMAP-SCM. The iAMAP-SCM predictor is anticipated to assist experimental scientists in the high-throughput identification of potential AMAP candidates for the treatment of malaria and other clinical applications.

A TastePeptides-Meta system including an umami/bitter classification model Umami_YYDS, a TastePeptidesDB database and an open-source package Auto_Taste_ML. Food Chem 2022;405:134812. [DOI: 10.1016/j.foodchem.2022.134812] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/15/2022] [Revised: 10/25/2022] [Accepted: 10/28/2022] [Indexed: 11/11/2022]

Improved prediction and characterization of blood-brain barrier penetrating peptides using estimated propensity scores of dipeptides. J Comput Aided Mol Des 2022;36:781-796. [DOI: 10.1007/s10822-022-00476-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/03/2022] [Accepted: 09/15/2022] [Indexed: 11/27/2022]

Phirom K, Charoenkwan P, Shoombuatong W, Charoenkwan P, Sirichotiyakul S, Tongsong T. DeepThal: A Deep Learning-Based Framework for the Large-Scale Prediction of the α⁺-Thalassemia Trait Using Red Blood Cell Parameters. J Clin Med 2022;11:jcm11216305. [PMID: 36362531 PMCID: PMC9654007 DOI: 10.3390/jcm11216305] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/29/2022] [Revised: 10/16/2022] [Accepted: 10/23/2022] [Indexed: 11/16/2022] Open

Abstract

Objectives: To develop a machine learning (ML)-based framework using red blood cell (RBC) parameters for the prediction of the α+-thalassemia trait (α+-thal trait) and to compare the diagnostic performance with a conventional method using a single RBC parameter or a combination of RBC parameters. Methods: A retrospective study was conducted on couples possibly at risk of having a fetus with hemoglobin H (Hb H) disease. Subjects with molecularly confirmed normal status (not thalassemia), α+-thal trait, and two-allele α-thalassemia mutation were included. Clinical parameters (age and gender) and RBC parameters (Hb, Hct, MCV, MCH, MCHC, RDW, and RBC count) obtained from their antenatal thalassemia screen were retrieved and analyzed using a machine learning (ML)-based framework and a conventional method. The performance of α+-thal trait prediction was evaluated. Results: In total, 594 cases (female/male: 330/264, mean age: 29.7 ± 6.6 years) were included in the analysis. There were 229 normal controls, 160 cases with the α+-thalassemia trait, and 205 cases in the two-allele α-thalassemia mutation category. The ML-derived model improved the diagnostic performance, giving a sensitivity of 80% and specificity of 81%. The experimental results indicated that DeepThal achieved a better performance compared with other ML-based methods on the independent test dataset, with an accuracy of 80.77%, sensitivity of 70.59%, and a Matthews correlation coefficient (MCC) of 0.608. Of all the red blood cell parameters, MCH < 28.95 pg as a single parameter had the highest performance in predicting the α+-thal trait, with an AUC of 0.857 and 95% CI of 0.816−0.899. The combination model derived from the binary logistic regression analysis exhibited improved performance, with an AUC of 0.868 and 95% CI of 0.830−0.906, giving a sensitivity of 80.1% and specificity of 75.1%. Conclusions: The performance of DeepThal on the independent test dataset is sufficient to demonstrate that DeepThal is capable of accurately predicting the α+-thal trait. It is anticipated that DeepThal will be a useful tool for the scientific community in the large-scale prediction of the α+-thal trait.

PD-BertEDL: An Ensemble Deep Learning Method Using BERT and Multivariate Representation to Predict Peptide Detectability. Int J Mol Sci 2022;23:ijms232012385. [PMID: 36293242 PMCID: PMC9604182 DOI: 10.3390/ijms232012385] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/01/2022] [Revised: 10/11/2022] [Accepted: 10/12/2022] [Indexed: 12/03/2022] Open

Cross-attention PHV: Prediction of human and virus protein-protein interactions using cross-attention-based neural networks. Comput Struct Biotechnol J 2022;20:5564-5573. [PMID: 36249566 PMCID: PMC9546503 DOI: 10.1016/j.csbj.2022.10.012] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/09/2022] [Revised: 10/05/2022] [Accepted: 10/05/2022] [Indexed: 11/30/2022] Open

Abstract

• Cross-attention PHV implements two key technologies: cross-attention mechanism and 1D-CNN.
• It accurately predicts PPIs between human and unknown influenza viruses/SARS-CoV-2.
• It extracts critical taxonomic and evolutionary differences responsible for PPI prediction.

Viral infections represent a major health concern worldwide. The alarming rate at which SARS-CoV-2 spreads, for example, led to a worldwide pandemic. Viruses incorporate genetic material into the host genome to hijack host cell functions such as the cell cycle and apoptosis. In these viral processes, protein–protein interactions (PPIs) play critical roles. Therefore, the identification of PPIs between humans and viruses is crucial for understanding the infection mechanism and host immune responses to viral infections and for discovering effective drugs. Experimental methods including mass spectrometry-based proteomics and yeast two-hybrid assays are widely used to identify human-virus PPIs, but these experimental methods are time-consuming, expensive, and laborious. To overcome this problem, we developed a novel computational predictor, named cross-attention PHV, by implementing two key technologies of the cross-attention mechanism and a one-dimensional convolutional neural network (1D-CNN). The cross-attention mechanisms were very effective in enhancing prediction and generalization abilities. Application of 1D-CNN to the word2vec-generated feature matrices reduced computational costs, thus extending the allowable length of protein sequences to 9000 amino acid residues. Cross-attention PHV outperformed existing state-of-the-art models using a benchmark dataset and accurately predicted PPIs for unknown viruses. Cross-attention PHV also predicted human–SARS-CoV-2 PPIs with area under the curve values >0.95. The Cross-attention PHV web server and source codes are freely available at https://kurata35.bio.kyutech.ac.jp/Cross-attention_PHV/ and https://github.com/kuratahiroyuki/Cross-Attention_PHV, respectively.

Richter P, Sebald K, Fischer K, Behrens M, Schnieke A, Somoza V. Bitter Peptides YFYPEL, VAPFPEVF, and YQEPVLGPVRGPFPIIV, Released during Gastric Digestion of Casein, Stimulate Mechanisms of Gastric Acid Secretion via Bitter Taste Receptors TAS2R16 and TAS2R38. JOURNAL OF AGRICULTURAL AND FOOD CHEMISTRY 2022;70:11591-11602. [PMID: 36054030 PMCID: PMC9501810 DOI: 10.1021/acs.jafc.2c05228] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/24/2022] [Revised: 08/13/2022] [Accepted: 08/15/2022] [Indexed: 05/22/2023]

A Statistical Analysis of the Sequence and Structure of Thermophilic and Non-Thermophilic Proteins. Int J Mol Sci 2022;23:ijms231710116. [PMID: 36077513 PMCID: PMC9456548 DOI: 10.3390/ijms231710116] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/29/2022] [Revised: 08/29/2022] [Accepted: 08/31/2022] [Indexed: 11/17/2022] Open

Watanabe N, Yamamoto M, Murata M, Vavricka CJ, Ogino C, Kondo A, Araki M. Comprehensive Machine Learning Prediction of Extensive Enzymatic Reactions. J Phys Chem B 2022;126:6762-6770. [PMID: 36053051 DOI: 10.1021/acs.jpcb.2c03287] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/04/2023]

BERT-PPII: The Polyproline Type II Helix Structure Prediction Model Based on BERT and Multichannel CNN. BIOMED RESEARCH INTERNATIONAL 2022;2022:9015123. [PMID: 36060139 PMCID: PMC9433275 DOI: 10.1155/2022/9015123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 07/01/2022] [Revised: 08/01/2022] [Accepted: 08/03/2022] [Indexed: 11/26/2022]

Abstract

Predicting the polyproline type II (PPII) helix structure is crucially important in many research areas, such as protein folding mechanisms, drug targets, and protein functions. However, many existing PPII helix prediction algorithms encode the protein sequence information in a single way, which leads to insufficient learning of protein sequence features. To improve protein sequence encoding, this paper proposes a BERT-based PPII helix structure prediction algorithm (BERT-PPII), which learns the protein sequence information with the BERT model. The BERT model's CLS vector can fuse the information of every amino acid residue in a sample. Thus, we use the CLS vector as the global feature representing the sample's global contextual information. Because interactions among local amino acid residues of the protein chain have an important influence on the formation of the PPII helix, we use a CNN to extract local amino acid residue features, which further enhance the information expressed by protein sequence samples. In this paper, we fuse the CLS vectors with the CNN local features to improve the performance of PPII structure prediction. Compared with the state-of-the-art PPIIPRED method, the experimental results on the unbalanced dataset show that the proposed method improves accuracy by 1% on the strict dataset and 2% on the less strict dataset. Correspondingly, the results on the balanced dataset show that the AUCs of the proposed method are 0.826 on the strict dataset and 0.785 on the less strict dataset. For the independent test set, the proposed method has an AUC of 0.827 on the strict dataset and 0.783 on the less strict dataset. These experimental results show that the proposed BERT-PPII method achieves superior performance in predicting the PPII helix.

Dubovski N, Fierro F, Margulis E, Ben Shoshan-Galeczki Y, Peri L, Niv MY. Taste GPCRs and their ligands. PROGRESS IN MOLECULAR BIOLOGY AND TRANSLATIONAL SCIENCE 2022;193:177-193. [PMID: 36357077 DOI: 10.1016/bs.pmbts.2022.06.008] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/29/2022]

FRTpred: A novel approach for accurate prediction of protein folding rate and type. Comput Biol Med 2022;149:105911. [DOI: 10.1016/j.compbiomed.2022.105911] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/06/2022] [Revised: 07/08/2022] [Accepted: 07/23/2022] [Indexed: 11/20/2022]