Reference Citation Analysis

For: Elnaggar A, Heinzinger M, Dallago C, Rehawi G, Wang Y, Jones L, Gibbs T, Feher T, Angerer C, Steinegger M, Bhowmik D, Rost B. ProtTrans: Toward Understanding the Language of Life Through Self-Supervised Learning. IEEE Trans Pattern Anal Mach Intell 2022;44:7112-7127. [PMID: 34232869 DOI: 10.1109/tpami.2021.3095381] [Citation(s) in RCA: 549] [Impact Index Per Article: 183.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/10/2023]

Cited by Other Article(s)

Maier A, Cha M, Burgess S, Wang A, Cuellar C, Kim S, Rajan NS, Neyyan J, Sengupta R, O’Connor K, Ott N, Williams A. Predicting purification process fit of monoclonal antibodies using machine learning. MAbs 2025;17:2439988. [PMID: 39782766 PMCID: PMC11730362 DOI: 10.1080/19420862.2024.2439988] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/08/2024] [Revised: 12/03/2024] [Accepted: 12/04/2024] [Indexed: 01/12/2025] Open

Le VT, Malik MS, Lin YJ, Liu YC, Chang YY, Ou YY. ATP_mCNN: Predicting ATP binding sites through pretrained language models and multi-window neural networks. Comput Biol Med 2025;185:109541. [PMID: 39653625 DOI: 10.1016/j.compbiomed.2024.109541] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/18/2024] [Revised: 11/20/2024] [Accepted: 12/05/2024] [Indexed: 01/26/2025]

Lv Z, Wei M, Pei H, Peng S, Li M, Jiang L. PTSP-BERT: Predict the thermal stability of proteins using sequence-based bidirectional representations from transformer-embedded features. Comput Biol Med 2025;185:109598. [PMID: 39708499 DOI: 10.1016/j.compbiomed.2024.109598] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/16/2024] [Revised: 12/16/2024] [Accepted: 12/17/2024] [Indexed: 12/23/2024]

Abstract

Thermophilic, mesophilic, and psychrophilic proteins have wide industrial applications, as enzymes with different optimal temperatures are often needed for different purposes. Convenient methods are needed to determine the optimal temperatures for proteins; however, laboratory methods for this purpose are time-consuming and laborious, and existing machine learning methods can only perform binary classification of thermophilic and non-thermophilic proteins, or psychrophilic and non-psychrophilic proteins. Here, we developed a deep learning model, PTSP-BERT, based on protein sequences that can directly perform three-class identification of thermophilic, mesophilic, and psychrophilic proteins. By comparing BERT-bfd with other deep learning models using five-fold cross-validation, we found that BERT-bfd-extracted features achieved the highest accuracy under six classifiers. Furthermore, to improve the model's accuracy, we used SMOTE (synthetic minority oversampling technique) to balance the dataset and a light gradient-boosting machine to rank BERT-bfd-extracted features according to their weights. We obtained the best-performing model with a five-fold cross-validation accuracy of 89.59% and an independent test accuracy of 85.42%. The performance of PTSP-BERT is significantly better than that of existing models in the three-class identification task. To compare with previous binary classification models, we used PTSP-BERT to perform binary classification of thermophilic versus non-thermophilic proteins and of psychrophilic versus non-psychrophilic proteins on an independent test set. PTSP-BERT achieved the highest accuracy on both binary classification tasks, with an accuracy of 93.33% for thermophilic protein binary classification and 88.33% for psychrophilic protein binary classification. After training and optimization on training sets with different sequence similarities, the independent test accuracy of the model ranges from 89.8% to 92.9%, and the prediction accuracy on new data can exceed 97%. For the convenience of future researchers, we have uploaded the source code of PTSP-BERT to GitHub.
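The SMOTE step named in the abstract above can be illustrated with a minimal pure-Python sketch. It is not the authors' code: the function name `smote_like`, the neighbour count `k`, and the toy 2-D points are assumptions for illustration, and a real pipeline would normally rely on an established library implementation operating on the embedding vectors.

```python
import math
import random


def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic samples by interpolating between a minority-class
    sample and one of its k nearest minority-class neighbours (the core SMOTE idea)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest neighbours of x inside the minority class, excluding x itself
        others = [p for p in minority if p is not x]
        neighbours = sorted(others, key=lambda p: math.dist(x, p))[:k]
        nb = rng.choice(neighbours)
        u = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(a + u * (b - a) for a, b in zip(x, nb)))
    return synthetic


# Hypothetical 2-D feature vectors for an under-represented class
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_like(minority, n_new=6)
```

Each synthetic point lies on the segment between two existing minority samples, so the class is enlarged without duplicating any sample exactly.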

Ji S, Wu J, An F, Lou M, Zhang T, Guo J, Wu P, Zhu Y, Wu R. Umami-gcForest: Construction of a predictive model for umami peptides based on deep forest. Food Chem 2025;464:141826. [PMID: 39522377 DOI: 10.1016/j.foodchem.2024.141826] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/20/2024] [Revised: 10/07/2024] [Accepted: 10/27/2024] [Indexed: 11/16/2024]

Hu X, Li J, Liu T. Alg-MFDL: A multi-feature deep learning framework for allergenic proteins prediction. Anal Biochem 2025;697:115701. [PMID: 39481588 DOI: 10.1016/j.ab.2024.115701] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/12/2024] [Revised: 10/26/2024] [Accepted: 10/28/2024] [Indexed: 11/02/2024]

Kumar N, Du Z, Li Y. pLM4CPPs: Protein Language Model-Based Predictor for Cell Penetrating Peptides. J Chem Inf Model 2025. [PMID: 39878455 DOI: 10.1021/acs.jcim.4c01338] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/31/2025]

Abstract

Cell-penetrating peptides (CPPs) are short peptides capable of penetrating cell membranes, making them valuable for drug delivery and intracellular targeting. Accurate prediction of CPPs can streamline experimental validation in the lab. This study aims to assess pretrained protein language models (pLMs) for their effectiveness in representing CPPs and develop a reliable model for CPP classification. We evaluated peptide embeddings generated from BEPLER, CPCProt, SeqVec, various ESM variants (ESM, ESM-2 with expanded feature set, ESM-1b, and ESM-1v), ProtT5-XL UniRef50, ProtT5-XL BFD, and ProtBERT. We developed pLM4CPPs, a novel deep learning architecture using convolutional neural networks (CNNs) as the classifier for binary classification of CPPs. pLM4CPPs demonstrated superior performance over existing state-of-the-art CPP prediction models, achieving improvements in accuracy (ACC) by 4.9-5.5%, Matthews correlation coefficient (MCC) by 9.3-10.2%, and sensitivity (Sn) by 14.1-19.6%. Among all the tested models, ESM-1280 and ProtT5-XL BFD demonstrated the highest overall performance on the kelm data set. ESM-1280 achieved an ACC of 0.896, an MCC of 0.796, an Sn of 0.844, and a specificity (Sp) of 0.978. ProtT5-XL BFD exhibited superior performance with an ACC of 0.901, an MCC of 0.802, an Sn of 0.885, and an Sp of 0.917. pLM4CPPs combines predictions from multiple models to provide a consensus on whether a given peptide sequence is classified as a CPP or non-CPP. This approach will enhance prediction reliability by leveraging the strengths of each individual model. A user-friendly web server for bioactivity predictions, along with data sets, is available at https://ry2acnp6ep.us-east-1.awsapprunner.com. The source code and protocol for adapting pLM4CPPs can be accessed on GitHub at https://github.com/drkumarnandan/pLM4CPPs. This platform aims to advance CPP prediction and peptide functionality modeling, aiding researchers in exploring peptide functionality effectively.
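The four metrics quoted in the abstract above (ACC, MCC, Sn, Sp) and the idea of a multi-model consensus can be sketched as follows. The abstract does not say how the consensus is formed, so the simple majority vote here is an assumption, as are the function names `binary_metrics` and `consensus` and the toy labels.

```python
import math


def binary_metrics(y_true, y_pred):
    """Compute ACC, MCC, sensitivity (Sn) and specificity (Sp) for 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "acc": (tp + tn) / len(y_true),
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
        "sn": tp / (tp + fn) if tp + fn else 0.0,
        "sp": tn / (tn + fp) if tn + fp else 0.0,
    }


def consensus(votes_per_model):
    """Majority vote across models (assumed rule); a tie counts as non-CPP (0)."""
    n_models = len(votes_per_model)
    return [1 if 2 * sum(col) > n_models else 0 for col in zip(*votes_per_model)]


# Hypothetical labels (1 = CPP, 0 = non-CPP) and predictions from three models
y_true = [1, 1, 1, 0, 0, 0]
model_votes = [
    [1, 1, 0, 0, 0, 1],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 1, 0, 1, 0],
]
single_model = binary_metrics(y_true, model_votes[0])
consensus_pred = consensus(model_votes)
combined = binary_metrics(y_true, consensus_pred)
```

In this toy case each model makes at least one error, but the errors fall on different peptides, so the majority vote recovers every label.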

Yuan Y, Chen S, Hu R, Wang X. MutualDTA: An Interpretable Drug-Target Affinity Prediction Model Leveraging Pretrained Models and Mutual Attention. J Chem Inf Model 2025. [PMID: 39878060 DOI: 10.1021/acs.jcim.4c01893] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/31/2025]

Lytras S, Lamb KD, Ito J, Grove J, Yuan K, Sato K, Hughes J, Robertson DL. Pathogen genomic surveillance and the AI revolution. J Virol 2025:e0160124. [PMID: 39878472 DOI: 10.1128/jvi.01601-24] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/31/2025] Open

Affiliation(s)

Spyros Lytras: Division of Systems Virology, Department of Microbiology and Immunology, The Institute of Medical Science, The University of Tokyo, Tokyo, Japan; MRC-University of Glasgow Centre for Virus Research, Glasgow, Scotland, United Kingdom
Kieran D Lamb: MRC-University of Glasgow Centre for Virus Research, Glasgow, Scotland, United Kingdom; School of Computing Science, University of Glasgow, Glasgow, Scotland, United Kingdom
Jumpei Ito: Division of Systems Virology, Department of Microbiology and Immunology, The Institute of Medical Science, The University of Tokyo, Tokyo, Japan; International Research Center for Infectious Diseases, The Institute of Medical Science, The University of Tokyo, Tokyo, Japan
Joe Grove: MRC-University of Glasgow Centre for Virus Research, Glasgow, Scotland, United Kingdom
Ke Yuan: School of Computing Science, University of Glasgow, Glasgow, Scotland, United Kingdom; School of Cancer Sciences, University of Glasgow, Glasgow, Scotland, United Kingdom; Cancer Research UK Scotland Institute, Glasgow, Scotland, United Kingdom
Kei Sato: Division of Systems Virology, Department of Microbiology and Immunology, The Institute of Medical Science, The University of Tokyo, Tokyo, Japan; MRC-University of Glasgow Centre for Virus Research, Glasgow, Scotland, United Kingdom; International Research Center for Infectious Diseases, The Institute of Medical Science, The University of Tokyo, Tokyo, Japan; Graduate School of Medicine, The University of Tokyo, Tokyo, Japan; Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Japan; International Vaccine Design Center, The Institute of Medical Science, The University of Tokyo, Tokyo, Japan; Collaboration Unit for Infection, Joint Research Center for Human Retrovirus Infection, Kumamoto University, Kumamoto, Japan
Joseph Hughes: MRC-University of Glasgow Centre for Virus Research, Glasgow, Scotland, United Kingdom
David L Robertson: MRC-University of Glasgow Centre for Virus Research, Glasgow, Scotland, United Kingdom

Dosajh A, Agrawal P, Chatterjee P, Priyakumar UD. Modern machine learning methods for protein property prediction. Curr Opin Struct Biol 2025;90:102990. [PMID: 39881454 DOI: 10.1016/j.sbi.2025.102990] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/24/2024] [Revised: 12/06/2024] [Accepted: 01/04/2025] [Indexed: 01/31/2025]

Feller AL, Wilke CO. Peptide-Aware Chemical Language Model Successfully Predicts Membrane Diffusion of Cyclic Peptides. J Chem Inf Model 2025;65:571-579. [PMID: 39772542 DOI: 10.1021/acs.jcim.4c01441] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/11/2025]

Wu J, Liu Y, Zhang Y, Wang X, Yan H, Zhu Y, Song J, Yu DJ. Identifying Protein-Nucleotide Binding Residues via Grouped Multi-task Learning and Pre-trained Protein Language Models. J Chem Inf Model 2025;65:1040-1052. [PMID: 39788787 DOI: 10.1021/acs.jcim.4c02092] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/12/2025]

Creanza TM, Alberga D, Patruno C, Mangiatordi GF, Ancona N. Transformer Decoder Learns from a Pretrained Protein Language Model to Generate Ligands with High Affinity. J Chem Inf Model 2025. [PMID: 39871540 DOI: 10.1021/acs.jcim.4c02019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/29/2025]

Yue Y, Cheng Y, Marquet C, Xiao C, Guo J, Li S, He S. Meta-Learning Enables Complex Cluster-Specific Few-Shot Binding Affinity Prediction for Protein-Protein Interactions. J Chem Inf Model 2025;65:580-588. [PMID: 39772708 DOI: 10.1021/acs.jcim.4c01607] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/11/2025]

Bhat S, Palepu K, Hong L, Mao J, Ye T, Iyer R, Zhao L, Chen T, Vincoff S, Watson R, Wang TZ, Srijay D, Kavirayuni VS, Kholina K, Goel S, Vure P, Deshpande AJ, Soderling SH, DeLisa MP, Chatterjee P. De novo design of peptide binders to conformationally diverse targets with contrastive language modeling. Sci Adv 2025;11:eadr8638. [PMID: 39841846 PMCID: PMC11753435 DOI: 10.1126/sciadv.adr8638] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/18/2024] [Accepted: 12/20/2024] [Indexed: 01/24/2025]

Affiliation(s)

Suhaas Bhat: Department of Biomedical Engineering, Duke University, Durham, NC, USA
Kalyan Palepu: Department of Biomedical Engineering, Duke University, Durham, NC, USA
Lauren Hong: Department of Biomedical Engineering, Duke University, Durham, NC, USA
Joey Mao: Department of Cell Biology, Duke University, Durham, NC, USA
Tianzheng Ye: Robert F. Smith School of Chemical and Biomolecular Engineering, Cornell University, Ithaca, NY, USA
Rema Iyer: Cancer Genome and Epigenetics Program, Sanford Burnham Prebys Institute, San Diego, CA, USA
Lin Zhao: Department of Biomedical Engineering, Duke University, Durham, NC, USA
Tianlai Chen: Department of Biomedical Engineering, Duke University, Durham, NC, USA
Sophia Vincoff: Department of Biomedical Engineering, Duke University, Durham, NC, USA
Rio Watson: Department of Biomedical Engineering, Duke University, Durham, NC, USA
Tian Z. Wang: Department of Biomedical Engineering, Duke University, Durham, NC, USA
Divya Srijay: Department of Biomedical Engineering, Duke University, Durham, NC, USA
Venkata Srikar Kavirayuni: Department of Biomedical Engineering, Duke University, Durham, NC, USA
Kseniia Kholina: Department of Biomedical Engineering, Duke University, Durham, NC, USA
Shrey Goel: Department of Biomedical Engineering, Duke University, Durham, NC, USA
Pranay Vure: Department of Biomedical Engineering, Duke University, Durham, NC, USA
Aniruddha J. Deshpande: Cancer Genome and Epigenetics Program, Sanford Burnham Prebys Institute, San Diego, CA, USA
Scott H. Soderling: Department of Cell Biology, Duke University, Durham, NC, USA
Matthew P. DeLisa: Robert F. Smith School of Chemical and Biomolecular Engineering, Cornell University, Ithaca, NY, USA; Meinig School of Biomedical Engineering, Cornell University, Ithaca, NY, USA; Cornell Institute of Biotechnology, Cornell University, Ithaca, NY, USA
Pranam Chatterjee: Department of Biomedical Engineering, Duke University, Durham, NC, USA; Department of Computer Science, Duke University, Durham, NC, USA; Department of Biostatistics and Bioinformatics, Duke University, Durham, NC, USA

Jiang K, Yan Z, Di Bernardo M, Sgrizzi SR, Villiger L, Kayabolen A, Kim BJ, Carscadden JK, Hiraizumi M, Nishimasu H, Gootenberg JS, Abudayyeh OO. Rapid in silico directed evolution by a protein language model with EVOLVEpro. Science 2025;387:eadr6006. [PMID: 39571002 DOI: 10.1126/science.adr6006] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/09/2024] [Accepted: 11/12/2024] [Indexed: 01/25/2025]

Affiliation(s)

Kaiyi Jiang: Department of Medicine, Division of Engineering in Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Gene and Cell Therapy Institute, Mass General Brigham, Cambridge, MA, USA; Center for Virology and Vaccine Research, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA; Department of Bioengineering, Massachusetts Institute of Technology, Cambridge, MA, USA
Zhaoqing Yan: Department of Medicine, Division of Engineering in Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Gene and Cell Therapy Institute, Mass General Brigham, Cambridge, MA, USA; Center for Virology and Vaccine Research, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA
Matteo Di Bernardo: Whitehead Institute, Massachusetts Institute of Technology, Cambridge, MA, USA
Samantha R Sgrizzi: Department of Medicine, Division of Engineering in Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Gene and Cell Therapy Institute, Mass General Brigham, Cambridge, MA, USA; Center for Virology and Vaccine Research, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA
Lukas Villiger: Department of Dermatology and Allergology, Kantonspital St. Gallen, St. Gallen, Switzerland
Alisan Kayabolen: Department of Medicine, Division of Engineering in Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Gene and Cell Therapy Institute, Mass General Brigham, Cambridge, MA, USA; Center for Virology and Vaccine Research, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA
B J Kim: Koch Institute for Integrative Cancer Research at MIT, Massachusetts Institute of Technology, Cambridge, MA, USA
Josephine K Carscadden: Department of Medicine, Division of Engineering in Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Gene and Cell Therapy Institute, Mass General Brigham, Cambridge, MA, USA; Center for Virology and Vaccine Research, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA
Masahiro Hiraizumi: Department of Chemistry and Biotechnology, Graduate School of Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, Japan
Hiroshi Nishimasu: Department of Chemistry and Biotechnology, Graduate School of Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, Japan; Structural Biology Division, Research Center for Advanced Science and Technology, The University of Tokyo, 4-6-1 Komaba, Meguro-ku, Tokyo, Japan; Inamori Research Institute for Science, 620 Suiginya-cho, Shimogyo-ku, Kyoto, Japan
Jonathan S Gootenberg: Department of Medicine, Division of Engineering in Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Gene and Cell Therapy Institute, Mass General Brigham, Cambridge, MA, USA; Center for Virology and Vaccine Research, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA
Omar O Abudayyeh: Department of Medicine, Division of Engineering in Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Gene and Cell Therapy Institute, Mass General Brigham, Cambridge, MA, USA; Center for Virology and Vaccine Research, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA

De Waele G, Menschaert G, Vandamme P, Waegeman W. Pre-trained Maldi Transformers improve MALDI-TOF MS-based prediction. Comput Biol Med 2025;186:109695. [PMID: 39847945 DOI: 10.1016/j.compbiomed.2025.109695] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/18/2024] [Revised: 01/10/2025] [Accepted: 01/13/2025] [Indexed: 01/25/2025]

Majila K, Ullanat V, Viswanath S. A deep learning method for predicting interactions for intrinsically disordered regions of proteins. bioRxiv 2025:2024.12.19.629373. [PMID: 39763873 PMCID: PMC11702703 DOI: 10.1101/2024.12.19.629373] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 01/14/2025]

Elkin ME, Zhu X. Paying attention to the SARS-CoV-2 dialect: a deep neural network approach to predicting novel protein mutations. Commun Biol 2025;8:98. [PMID: 39838059 PMCID: PMC11751191 DOI: 10.1038/s42003-024-07262-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/19/2024] [Accepted: 11/13/2024] [Indexed: 01/23/2025] Open

Meng L, Wei L, Wu R. MVGNN-PPIS: A novel multi-view graph neural network for protein-protein interaction sites prediction based on Alphafold3-predicted structures and transfer learning. Int J Biol Macromol 2025;300:140096. [PMID: 39848362 DOI: 10.1016/j.ijbiomac.2025.140096] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/30/2024] [Revised: 01/04/2025] [Accepted: 01/17/2025] [Indexed: 01/25/2025]

Abstract

Protein-protein interactions (PPI) are crucial for understanding numerous biological processes and pathogenic mechanisms. Identifying interaction sites is essential for biomedical research and targeted drug development. Compared to experimental methods, accurate computational approaches for protein-protein interaction sites (PPIS) prediction can save significant time and costs. In this study, we propose a novel model named MVGNN-PPIS. To the best of our knowledge, it is the first model to combine predicted structures generated by AlphaFold3 with transfer learning techniques for predicting PPIS. This approach addresses the limitations of traditional methods that depend on native protein structures and multiple sequence alignments (MSA). Additionally, we introduced a multi-view graph framework based on two types of graph structures: the k-nearest neighbor graph and the adjacency matrix. By alternately employing a Graph Transformer and Graph Convolutional Networks (GCN) to aggregate node information, this framework effectively captures both local and global dependencies of each residue in the predicted structures, thereby significantly enhancing the model's sensitivity to binding sites. This framework further integrates direction, distance, and angular information between the 3D coordinates of side-chain atom centroids to construct a relative coordinate system, generating enhanced edge features that ensure the model's equivariance to molecular translations and rotations in space. During training, the Focal Loss function is employed to effectively address the class imbalance in the dataset. Experimental results demonstrate that MVGNN-PPIS outperforms the current state-of-the-art methods across multiple PPIS benchmark datasets. To further validate the model's generalization capability, we extended MVGNN-PPIS to the domain of predicting protein-nucleic acid interaction sites, where it also achieved superior performance.
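The Focal Loss mentioned in the abstract above down-weights well-classified examples so that training concentrates on hard ones, which helps when binding residues are rare. A minimal sketch of the standard binary form follows. The default values `gamma=2.0` and `alpha=0.25` are commonly used settings and are assumptions here, not values reported for this model.

```python
import math


def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for one prediction.

    p is the predicted probability of the positive class, y is the 0/1 label.
    FL = -alpha_t * (1 - p_t)**gamma * log(p_t), where p_t is the probability
    assigned to the true class.
    """
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    eps = 1e-12  # guard against log(0)
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(max(p_t, eps))


# A confidently correct positive contributes far less than a badly missed one
easy = focal_loss(0.9, 1)
hard = focal_loss(0.1, 1)

# With gamma = 0 the modulating factor vanishes and the loss reduces to
# alpha-weighted cross-entropy
plain = focal_loss(0.9, 1, gamma=0.0, alpha=0.5)
```

Raising `gamma` increases the suppression of easy examples, while `alpha` rebalances the contribution of the two classes.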

Howladar N, Kabir MWU, Hoque F, Katebi A, Hoque MT. PPILS: Protein-protein interaction prediction with language of biological coding. Comput Biol Med 2025;186:109678. [PMID: 39832439 DOI: 10.1016/j.compbiomed.2025.109678] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/26/2024] [Revised: 01/03/2025] [Accepted: 01/12/2025] [Indexed: 01/22/2025]

Mall R, Kaushik R, Martinez ZA, Thomson MW, Castiglione F. Benchmarking protein language models for protein crystallization. Sci Rep 2025;15:2381. [PMID: 39827171 PMCID: PMC11743144 DOI: 10.1038/s41598-025-86519-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/04/2024] [Accepted: 01/13/2025] [Indexed: 01/22/2025] Open

Hayes T, Rao R, Akin H, Sofroniew NJ, Oktay D, Lin Z, Verkuil R, Tran VQ, Deaton J, Wiggert M, Badkundri R, Shafkat I, Gong J, Derry A, Molina RS, Thomas N, Khan YA, Mishra C, Kim C, Bartie LJ, Nemeth M, Hsu PD, Sercu T, Candido S, Rives A. Simulating 500 million years of evolution with a language model. Science 2025:eads0018. [PMID: 39818825 DOI: 10.1126/science.ads0018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/24/2024] [Accepted: 01/07/2025] [Indexed: 01/19/2025]

Yang J, Lal RG, Bowden JC, Astudillo R, Hameedi MA, Kaur S, Hill M, Yue Y, Arnold FH. Active learning-assisted directed evolution. Nat Commun 2025;16:714. [PMID: 39821082 PMCID: PMC11739421 DOI: 10.1038/s41467-025-55987-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/14/2024] [Accepted: 01/02/2025] [Indexed: 01/19/2025] Open

Ovchinnikov V, Karplus M. Phenomenological Modeling of Antibody Response from Vaccine Strain Composition. Antibodies (Basel) 2025;14:6. [PMID: 39846614 PMCID: PMC11755667 DOI: 10.3390/antib14010006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/30/2024] [Revised: 01/11/2025] [Accepted: 01/14/2025] [Indexed: 01/24/2025] Open

Abstract

The elicitation of broadly neutralizing antibodies (bnAbs) is a major goal of vaccine design for highly mutable pathogens, such as influenza, HIV, and coronavirus. Although many rational vaccine design strategies for eliciting bnAbs have been devised, their efficacies need to be evaluated in preclinical animal models and in clinical trials. To improve outcomes for such vaccines, it would be useful to develop methods that can predict vaccine efficacies against arbitrary pathogen variants. As a step in this direction, here, we describe a simple biologically motivated model of antibody reactivity elicited by nanoparticle-based vaccines using only antigen amino acid sequences, parametrized with a small sample of experimental antibody binding data from influenza or SARS-CoV-2 nanoparticle vaccinations. Results: The model is able to recapitulate the experimental data to within experimental uncertainty, is relatively insensitive to the choice of the parametrization/training set, and provides qualitative predictions about the antigenic epitopes exploited by the vaccine, which are testable by experiment. For the mosaic nanoparticle vaccines considered here, model results suggest indirectly that the sera obtained from vaccinated mice contain bnAbs, rather than simply different strain-specific Abs. Although the present model was motivated by nanoparticle vaccines, we also apply it to a multivalent mRNA flu vaccination study, and demonstrate good recapitulation of experimental results. This suggests that the model formalism is, in principle, sufficiently flexible to accommodate different vaccination strategies. Finally, we show how the model could be used to rank the efficacies of vaccines with different antigen compositions. Conclusions: Overall, this study suggests that simple models of vaccine efficacy parametrized with modest amounts of experimental data could be used to compare the effectiveness of designed vaccines.

Nagano Y, Pyo AGT, Milighetti M, Henderson J, Shawe-Taylor J, Chain B, Tiffeau-Mayer A. Contrastive learning of T cell receptor representations. Cell Syst 2025;16:101165. [PMID: 39778580 DOI: 10.1016/j.cels.2024.12.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/20/2024] [Revised: 10/09/2024] [Accepted: 12/06/2024] [Indexed: 01/11/2025]

Gelman S, Johnson B, Freschlin C, Sharma A, D'Costa S, Peters J, Gitter A, Romero PA. Biophysics-based protein language models for protein engineering. bioRxiv 2025:2024.03.15.585128. [PMID: 38559182 PMCID: PMC10980077 DOI: 10.1101/2024.03.15.585128] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 04/04/2024]

Changiarath A, Arya A, Xenidis VA, Padeken J, Stelzl LS. Sequence determinants of protein phase separation and recognition by protein phase-separated condensates through molecular dynamics and active learning. Faraday Discuss 2025;256:235-254. [PMID: 39319382 DOI: 10.1039/d4fd00099d] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 09/26/2024]

Abstract

Elucidating how protein sequence determines the properties of disordered proteins and their phase-separated condensates is a great challenge in computational chemistry, biology, and biophysics. Quantitative molecular dynamics simulations and derived free energy values can in principle capture how a sequence encodes the chemical and biological properties of a protein. These calculations are, however, computationally demanding, even after reducing the representation by coarse-graining; exploring the large spaces of potentially relevant sequences remains a formidable task. We employ an "active learning" scheme introduced by Yang et al. (bioRxiv, 2022, https://doi.org/10.1101/2022.08.05.502972) to reduce the number of labelled examples needed from simulations, where a neural network-based model suggests the most useful examples for the next training cycle. Applying this Bayesian optimisation framework, we determine properties of protein sequences with coarse-grained molecular dynamics, which enables the network to establish sequence-property relationships for disordered proteins and their self-interactions and their interactions in phase-separated condensates. We show how iterative training with second virial coefficients derived from the simulations of disordered protein sequences leads to a rapid improvement in predicting peptide self-interactions. We employ this Bayesian approach to efficiently search for new sequences that bind to condensates of the disordered C-terminal domain (CTD) of RNA Polymerase II, by simulating molecular recognition of peptides to phase-separated condensates in coarse-grained molecular dynamics. By searching for protein sequences which prefer to self-interact rather than interact with another protein sequence we are able to shape the morphology of protein condensates and design multiphasic protein condensates.

Lee J, Bang D, Kim S. Residue-Level Multiview Deep Learning for ATP Binding Site Prediction and Applications in Kinase Inhibitors. J Chem Inf Model 2025;65:50-61. [PMID: 39690486 DOI: 10.1021/acs.jcim.4c01255] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/19/2024]

Huang H, Shi X, Lei H, Hu F, Cai Y. ProtChat: An AI Multi-Agent for Automated Protein Analysis Leveraging GPT-4 and Protein Language Model. J Chem Inf Model 2025;65:62-70. [PMID: 39690112 DOI: 10.1021/acs.jcim.4c01345] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/19/2024]

Subramanian AM, Martinez ZA, Lourenço AL, Liu S, Thomson M. Unexplored regions of the protein sequence-structure map revealed at scale by a library of foldtuned language models. bioRxiv 2025:2023.12.22.573145. [PMID: 38187750 PMCID: PMC10769378 DOI: 10.1101/2023.12.22.573145] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/09/2024]

Abstract

The combinatorial scale of amino-acid sequence-space has traditionally precluded substantive study of the full protein sequence-structure map. It remains unknown, for instance, how much of the vast uncharted landscape of far-from-natural sequences encodes the familiar ensemble of natural folds in a fashion consistent with the laws of biophysics but seemingly untouched by evolution on Earth. The scale of sequence perturbations required to access these spaces exceeds the reach of even gold-standard experimental approaches such as directed evolution. We surpass this limitation guided by the innate capacity of protein language models (pLMs) to explore sequences outside their natural training data through generation and self-feedback. We recast pLMs as probes that explore into regions of protein "deep space" that possess little-to-no detectable homology to natural examples, while enforcing core structural constraints, in a novel sequence design approach that we term "foldtuning." We build a library of foldtuned pLMs for >700 natural folds in the SCOP database, covering numerous high-priority targets for synthetic biology, including GPCRs and small GTPases, composable cell-surface-receptor and DNA-binding domains, and small signaling/regulatory domains. Candidate proteins generated by foldtuned pLMs reflect distinctive new "rules of language" for sequence innovation beyond detectable homology to any known protein and sample subtle structural alterations in a manner reminiscent of natural structural evolution and diversification. Experimental validation of two markedly different fold targets, the tyrosine-kinase- and small-GTPase-regulating SH3 domain and the bacterial RNase inhibitor barstar, demonstrates that foldtuning proposes protein variants that express and fold stably in vitro and function in vivo. Foldtuning reveals protein sequence-structure information at scale outside of the context of evolution and promises to push forward the redesign and reconstitution of novel-to-nature synthetic biological systems for applications in health and catalysis.

Johnson S, Weigele P, Fomenkov A, Ge A, Vincze A, Eaglesham J, Roberts R, Sun Z. Domainator, a flexible software suite for domain-based annotation and neighborhood analysis, identifies proteins involved in antiviral systems. Nucleic Acids Res 2025;53:gkae1175. [PMID: 39657740 PMCID: PMC11754643 DOI: 10.1093/nar/gkae1175] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/12/2024] [Revised: 11/07/2024] [Accepted: 11/15/2024] [Indexed: 12/12/2024] Open

Chen Z, Ji C, Xu W, Gao J, Huang J, Xu H, Qian G, Huang J. UniAMP: enhancing AMP prediction using deep neural networks with inferred information of peptides. BMC Bioinformatics 2025;26:10. [PMID: 39799358 PMCID: PMC11725221 DOI: 10.1186/s12859-025-06033-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/21/2024] [Accepted: 01/02/2025] [Indexed: 01/15/2025] Open

Wang R, Ji Y, Li Y, Lee ST. Applications of Transformers in Computational Chemistry: Recent Progress and Prospects. J Phys Chem Lett 2025;16:421-434. [PMID: 39737793 DOI: 10.1021/acs.jpclett.4c03128] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/01/2025]

Das S, Ghosh S, Jana ND. TransConv: convolution-infused transformer for protein secondary structure prediction. J Mol Model 2025;31:37. [PMID: 39776295 DOI: 10.1007/s00894-024-06259-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/03/2024] [Accepted: 12/15/2024] [Indexed: 01/11/2025]

Yan B, Nam Y, Li L, Deek RA, Li H, Ma S. Recent advances in deep learning and language models for studying the microbiome. Front Genet 2025;15:1494474. [PMID: 39840283 PMCID: PMC11747409 DOI: 10.3389/fgene.2024.1494474] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/10/2024] [Accepted: 12/13/2024] [Indexed: 01/23/2025] Open

Eom H, Park S, Cho K, Lee J, Kim H, Kim S, Yang J, Han YH, Lee J, Seok C, Lee M, Song W, Steinegger M. Discovery of highly active kynureninases for cancer immunotherapy through protein language model. Nucleic Acids Res 2025;53:gkae1245. [PMID: 39777462 PMCID: PMC11704957 DOI: 10.1093/nar/gkae1245] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/12/2024] [Revised: 11/16/2024] [Accepted: 12/05/2024] [Indexed: 01/11/2025] Open

Affiliation(s)

Hyunuk Eom Department of Chemistry, Seoul National University, 1 Gwanak-ro, Gwanak-gu, Seoul 08826, Republic of Korea
Sukhwan Park School of Biological Sciences, Seoul National University, 1 Gwanak-ro, Gwanak-gu, Seoul 08826, Republic of Korea
Kye Soo Cho Galux Inc, 1837 Nambusunhwan-ro, Gwanak-gu, Seoul 08738, Republic of Korea
Jihyeon Lee Department of Chemistry, Seoul National University, 1 Gwanak-ro, Gwanak-gu, Seoul 08826, Republic of Korea
Hyunbin Kim School of Biological Sciences, Seoul National University, 1 Gwanak-ro, Gwanak-gu, Seoul 08826, Republic of Korea
Stephanie Kim School of Biological Sciences, Seoul National University, 1 Gwanak-ro, Gwanak-gu, Seoul 08826, Republic of Korea
Jinsol Yang Galux Inc, 1837 Nambusunhwan-ro, Gwanak-gu, Seoul 08738, Republic of Korea
Young-Hyun Han Galux Inc, 1837 Nambusunhwan-ro, Gwanak-gu, Seoul 08738, Republic of Korea
Juyong Lee Molecular Medicine and Biopharmaceutical Sciences, Graduate School of Convergence Science and Technology, Seoul National University, 1 Gwanak-ro, Gwanak-gu, Seoul 08826, Republic of Korea School of Pharmacy, Seoul National University, 1 Gwanak-ro, Gwanak-gu, Seoul 08826, Republic of Korea Arontier Co., 241 Gangnam-daero, Seocho-gu, Seoul 06735, Republic of Korea
Chaok Seok Artificial Intelligence Institute, Seoul National University, 1 Gwanak-ro, Gwanak-gu, Seoul 08826, Republic of Korea Institute of Molecular Biology and Genetics, Seoul National University, 1 Gwanak-ro, Gwanak-gu, Seoul 08826, Republic of Korea Department of Chemistry, Seoul National University, 1 Gwanak-ro, Gwanak-gu, Seoul 08826, Republic of Korea Galux Inc, 1837 Nambusunhwan-ro, Gwanak-gu, Seoul 08738, Republic of Korea
Myeong Sup Lee Department of Biomedical Sciences, University of Ulsan College of Medicine, Asan Medical Center, 88 Olympic-ro 43-gil, Songpa-gu, Seoul 05505, Republic of Korea Galux Inc, 1837 Nambusunhwan-ro, Gwanak-gu, Seoul 08738, Republic of Korea
Woon Ju Song Department of Chemistry, Seoul National University, 1 Gwanak-ro, Gwanak-gu, Seoul 08826, Republic of Korea
Martin Steinegger School of Biological Sciences, Seoul National University, 1 Gwanak-ro, Gwanak-gu, Seoul 08826, Republic of Korea Artificial Intelligence Institute, Seoul National University, 1 Gwanak-ro, Gwanak-gu, Seoul 08826, Republic of Korea Institute of Molecular Biology and Genetics, Seoul National University, 1 Gwanak-ro, Gwanak-gu, Seoul 08826, Republic of Korea

Piovesan D, Del Conte A, Mehdiabadi M, Aspromonte M, Blum M, Tesei G, von Bülow S, Lindorff-Larsen K, Tosatto SE. MOBIDB in 2025: integrating ensemble properties and function annotations for intrinsically disordered proteins. Nucleic Acids Res 2025;53:D495-D503. [PMID: 39470701 PMCID: PMC11701742 DOI: 10.1093/nar/gkae969] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/13/2024] [Revised: 10/07/2024] [Accepted: 10/11/2024] [Indexed: 10/30/2024] Open

Szklarczyk D, Nastou K, Koutrouli M, Kirsch R, Mehryary F, Hachilif R, Hu D, Peluso ME, Huang Q, Fang T, Doncheva NT, Pyysalo S, Bork P, Jensen LJ, von Mering C. The STRING database in 2025: protein networks with directionality of regulation. Nucleic Acids Res 2025;53:D730-D737. [PMID: 39558183 PMCID: PMC11701646 DOI: 10.1093/nar/gkae1113] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/15/2024] [Revised: 10/18/2024] [Accepted: 10/29/2024] [Indexed: 11/20/2024] Open

Affiliation(s)

Damian Szklarczyk Department of Molecular Life Sciences, University of Zurich, Winterthurerstrasse 190, 8057 Zurich, Switzerland SIB Swiss Institute of Bioinformatics, Amphipôle, Quartier UNIL-Sorge, 1015 Lausanne, Switzerland
Katerina Nastou Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Blegdamsvej 3B, 2200 Copenhagen N, Denmark
Mikaela Koutrouli Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Blegdamsvej 3B, 2200 Copenhagen N, Denmark
Rebecca Kirsch Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Blegdamsvej 3B, 2200 Copenhagen N, Denmark
Farrokh Mehryary TurkuNLP Lab, Department of Computing, University of Turku, Vesilinnantie 5, 20014 Turku, Finland
Radja Hachilif Department of Molecular Life Sciences, University of Zurich, Winterthurerstrasse 190, 8057 Zurich, Switzerland SIB Swiss Institute of Bioinformatics, Amphipôle, Quartier UNIL-Sorge, 1015 Lausanne, Switzerland
Dewei Hu Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Blegdamsvej 3B, 2200 Copenhagen N, Denmark
Matteo E Peluso Department of Molecular Life Sciences, University of Zurich, Winterthurerstrasse 190, 8057 Zurich, Switzerland SIB Swiss Institute of Bioinformatics, Amphipôle, Quartier UNIL-Sorge, 1015 Lausanne, Switzerland
Qingyao Huang Department of Molecular Life Sciences, University of Zurich, Winterthurerstrasse 190, 8057 Zurich, Switzerland SIB Swiss Institute of Bioinformatics, Amphipôle, Quartier UNIL-Sorge, 1015 Lausanne, Switzerland
Tao Fang Department of Molecular Life Sciences, University of Zurich, Winterthurerstrasse 190, 8057 Zurich, Switzerland SIB Swiss Institute of Bioinformatics, Amphipôle, Quartier UNIL-Sorge, 1015 Lausanne, Switzerland
Nadezhda T Doncheva Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Blegdamsvej 3B, 2200 Copenhagen N, Denmark
Sampo Pyysalo TurkuNLP Lab, Department of Computing, University of Turku, Vesilinnantie 5, 20014 Turku, Finland
Peer Bork Structural and Computational Biology Unit, European Molecular Biology Laboratory, Meyerhofstrasse 1, 69117 Heidelberg, Germany Max Delbrück Centre for Molecular Medicine, Robert-Rössle-Strasse 10, 13125 Berlin, Germany Department of Bioinformatics, Biozentrum, University of Würzburg, Am Hubland, 97074 Würzburg, Germany
Lars J Jensen Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Blegdamsvej 3B, 2200 Copenhagen N, Denmark
Christian von Mering Department of Molecular Life Sciences, University of Zurich, Winterthurerstrasse 190, 8057 Zurich, Switzerland SIB Swiss Institute of Bioinformatics, Amphipôle, Quartier UNIL-Sorge, 1015 Lausanne, Switzerland

Shen H, Li Y, Pi Q, Tian J, Xu X, Huang Z, Huang J, Pian C, Mao S. Unveiling novel antimicrobial peptides from the ruminant gastrointestinal microbiomes: A deep learning-driven approach yields an anti-MRSA candidate. J Adv Res 2025:S2090-1232(25)00005-0. [PMID: 39756573 DOI: 10.1016/j.jare.2025.01.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/12/2024] [Revised: 01/01/2025] [Accepted: 01/02/2025] [Indexed: 01/07/2025] Open

Abstract

INTRODUCTION

Antimicrobial peptides (AMPs) present a promising avenue to combat the growing threat of antibiotic resistance. The ruminant gastrointestinal microbiome serves as a unique ecosystem that offers untapped potential for AMP discovery.

OBJECTIVES

The aims of this study are to develop an effective methodology for the identification of novel AMPs from ruminant gastrointestinal microbiomes, followed by evaluating their antimicrobial efficacy and elucidating the mechanisms underlying their activity.

METHODS

We developed a deep learning-based model to identify AMP candidates from a dataset comprising 120 metagenomes and 10,373 metagenome-assembled genomes derived from the ruminant gastrointestinal tract. Both in vivo and in vitro experiments were performed to examine and validate the antimicrobial activities of the AMP candidates that were selected through bioinformatic analysis and subsequently synthesized chemically. Additionally, molecular dynamics simulations were conducted to explore the action mechanism of the most potent AMP candidate.

RESULTS

The deep learning model identified 27,192 potential secretory AMP candidates. Following bioinformatic analysis, 39 candidates were synthesized and tested. Remarkably, all synthesized peptides demonstrated antimicrobial activity against Staphylococcus aureus, with 79.5% showing effectiveness against multiple pathogens. Notably, Peptide 4, which exhibited the highest antimicrobial activity against methicillin-resistant Staphylococcus aureus (MRSA), confirmed this effect in a mouse model with wound infection, exhibiting a low propensity for resistance development and minimal cytotoxicity and hemolysis towards mammalian cells. Molecular dynamics simulations provided insights into the mechanism of Peptide 4, primarily its ability to disrupt bacterial cell membranes, leading to cell death.

CONCLUSION

This study highlights the power of combining deep learning with microbiome research to uncover novel therapeutic candidates, paving the way for the development of next-generation antimicrobials like Peptide 4 to combat the growing threat of MRSA wound infections. It also underscores the value of utilizing ruminant microbial resources.

Affiliation(s)

Hong Shen Bioinformatics Center, Academy for Advanced Interdisciplinary Studies, Nanjing Agricultural University, Nanjing 210095, Jiangsu, China
Yanru Li College of Agriculture, Nanjing Agricultural University, Nanjing 210095, Jiangsu, China
Qingjie Pi Ruminant Nutrition and Feed Engineering Technology Research Center, College of Animal Science and Technology, Nanjing Agricultural University, Nanjing 210095, Jiangsu, China; Laboratory of Gastrointestinal Microbiology, Jiangsu Key Laboratory of Gastrointestinal Nutrition and Animal Health, National Center for International Research on Animal Gut Nutrition, College of Animal Science and Technology, Nanjing Agricultural University, Nanjing 210095, Jiangsu, China
Junru Tian Ruminant Nutrition and Feed Engineering Technology Research Center, College of Animal Science and Technology, Nanjing Agricultural University, Nanjing 210095, Jiangsu, China; Laboratory of Gastrointestinal Microbiology, Jiangsu Key Laboratory of Gastrointestinal Nutrition and Animal Health, National Center for International Research on Animal Gut Nutrition, College of Animal Science and Technology, Nanjing Agricultural University, Nanjing 210095, Jiangsu, China
Xianghan Xu College of Veterinary Medicine, Nanjing Agricultural University, Nanjing 210095, Jiangsu, China
Zan Huang Ruminant Nutrition and Feed Engineering Technology Research Center, College of Animal Science and Technology, Nanjing Agricultural University, Nanjing 210095, Jiangsu, China; Laboratory of Gastrointestinal Microbiology, Jiangsu Key Laboratory of Gastrointestinal Nutrition and Animal Health, National Center for International Research on Animal Gut Nutrition, College of Animal Science and Technology, Nanjing Agricultural University, Nanjing 210095, Jiangsu, China
Jinghu Huang College of Veterinary Medicine, Nanjing Agricultural University, Nanjing 210095, Jiangsu, China
Cong Pian School of Basic Medicine and Clinical Pharmacy, China Pharmaceutical University, Nanjing 211198, Jiangsu, China
Shengyong Mao Ruminant Nutrition and Feed Engineering Technology Research Center, College of Animal Science and Technology, Nanjing Agricultural University, Nanjing 210095, Jiangsu, China; Laboratory of Gastrointestinal Microbiology, Jiangsu Key Laboratory of Gastrointestinal Nutrition and Animal Health, National Center for International Research on Animal Gut Nutrition, College of Animal Science and Technology, Nanjing Agricultural University, Nanjing 210095, Jiangsu, China

Hennig J, Paulino C. 4D structural biology-The 9th Murnau Conference on structural biology. Structure 2025;33:1-5. [PMID: 39753099 DOI: 10.1016/j.str.2024.11.012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/10/2024] [Revised: 11/13/2024] [Accepted: 11/18/2024] [Indexed: 01/11/2025]

Chatzimiltis S, Agathocleous M, Promponas VJ, Christodoulou C. Post-processing enhances protein secondary structure prediction with second order deep learning and embeddings. Comput Struct Biotechnol J 2025;27:243-251. [PMID: 39866664 PMCID: PMC11764030 DOI: 10.1016/j.csbj.2024.12.022] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/16/2024] [Revised: 12/20/2024] [Accepted: 12/21/2024] [Indexed: 01/28/2025] Open

Daoud A, Ben-Hur A. The role of chromatin state in intron retention: A case study in leveraging large scale deep learning models. PLoS Comput Biol 2025;21:e1012755. [PMID: 39792954 PMCID: PMC11756788 DOI: 10.1371/journal.pcbi.1012755] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/24/2024] [Revised: 01/23/2025] [Accepted: 12/30/2024] [Indexed: 01/12/2025] Open

Song J, Kurgan L. Two decades of advances in sequence-based prediction of MoRFs, disorder-to-order transitioning binding regions. Expert Rev Proteomics 2025;22:1-9. [PMID: 39789785 DOI: 10.1080/14789450.2025.2451715] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/31/2024] [Revised: 12/20/2024] [Accepted: 12/26/2024] [Indexed: 01/12/2025]

Boadu F, Lee A, Cheng J. Deep learning methods for protein function prediction. Proteomics 2025;25:e2300471. [PMID: 38996351 PMCID: PMC11735672 DOI: 10.1002/pmic.202300471] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/26/2024] [Revised: 06/15/2024] [Accepted: 06/18/2024] [Indexed: 07/14/2024]

Pratyush P, Pokharel S, Ismail HD, Bahmani S, Kc DB. LMPTMSite: A Platform for PTM Site Prediction in Proteins Leveraging Transformer-Based Protein Language Models. Methods Mol Biol 2025;2867:261-297. [PMID: 39576587 DOI: 10.1007/978-1-0716-4196-5_16] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2024]

Brizuela CA, Liu G, Stokes JM, de la Fuente‐Nunez C. AI Methods for Antimicrobial Peptides: Progress and Challenges. Microb Biotechnol 2025;18:e70072. [PMID: 39754551 PMCID: PMC11702388 DOI: 10.1111/1751-7915.70072] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/19/2024] [Revised: 11/18/2024] [Accepted: 12/16/2024] [Indexed: 01/06/2025] Open

Gao M, Song C, Liu T. PLM-T3SE: Accurate Prediction of Type III Secretion Effectors Using Protein Language Model Embeddings. J Cell Biochem 2025;126:e30642. [PMID: 39164870 DOI: 10.1002/jcb.30642] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/08/2024] [Revised: 08/04/2024] [Accepted: 08/07/2024] [Indexed: 08/22/2024]

Pratyush P, Kc DB. Advances in Prediction of Posttranslational Modification Sites Known to Localize in Protein Supersecondary Structures. Methods Mol Biol 2025;2870:117-151. [PMID: 39543034 DOI: 10.1007/978-1-0716-4213-9_8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/17/2024]

Zhou Y, Liu W, Luo C, Huang Z, Samarappuli Mudiyanselage Savini G, Zhao L, Wang R, Huang J. Ab-Amy 2.0: Predicting light chain amyloidogenic risk of therapeutic antibodies based on antibody language model. Methods 2025;233:11-18. [PMID: 39550021 DOI: 10.1016/j.ymeth.2024.11.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/13/2024] [Revised: 10/28/2024] [Accepted: 11/06/2024] [Indexed: 11/18/2024] Open

Zhang M, Zhang Y, Dong K, Lin J, Cui X, Zhang Y. Identification of Critical Phosphorylation Sites Enhancing Kinase Activity With a Bimodal Fusion Framework. Mol Cell Proteomics 2025;24:100889. [PMID: 39617062 PMCID: PMC11774822 DOI: 10.1016/j.mcpro.2024.100889] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/01/2024] [Revised: 11/26/2024] [Accepted: 11/28/2024] [Indexed: 01/12/2025] Open