Lai H, Luo D, Yang M, Zhu T, Yang H, Luo X, Wei Y, Xie S, Hong F, Shu K, Dao F, Ding H. PBertKla: a protein large language model for predicting human lysine lactylation sites. BMC Biol 2025;23:95. [PMID: 40189537 PMCID: PMC11974188 DOI: 10.1186/s12915-025-02202-1]
[Received: 01/16/2025] [Accepted: 03/31/2025] [Indexed: 04/09/2025]
Abstract
BACKGROUND
Lactylation is a newly discovered type of post-translational modification, primarily occurring on lysine (K) residues of both histones and non-histones to exert diverse effects on target proteins. Research has shown that lysine lactylation (Kla) modification is ubiquitous in different cells and participates in the determination of cell function and fate, as well as in the initiation and progression of various diseases. Precise identification of Kla sites is fundamental for elucidating their biological functions and uncovering their application potential.
RESULTS
Here, we proposed a novel human Kla site predictor, named PBertKla, by curating a reliable benchmark dataset with an appropriate sample length and sequence identity threshold and using it to train a protein large language model with optimal hyperparameters. Extensive experimental results consistently demonstrated that our model possessed robust human Kla site prediction ability, achieving an AUC (area under the receiver operating characteristic curve) value of over 0.880 on the independent validation data. Feature visualization analysis further validated the effectiveness of PBertKla in feature learning and representation from Kla sequences. Moreover, we benchmarked PBertKla against other cutting-edge models on an independent testing dataset from different sources, highlighting its superiority and transferability.
CONCLUSIONS
All results indicated that PBertKla excelled as an automatic predictor of human Kla sites, and it would advance the investigation of lactylation modifications and their significance in health and disease.