He J, Hu J, Liu H. A three-gene random forest model for diagnosing idiopathic pulmonary fibrosis based on circadian rhythm-related genes in lung tissue. Expert Rev Respir Med 2023; 17:1307-1320. [PMID: 38285622 DOI: 10.1080/17476348.2024.2311262]
[Received: 12/13/2023] [Accepted: 01/24/2024] [Indexed: 01/31/2024]
Abstract
BACKGROUND
Disruption of circadian rhythm may be a key factor mediating fibrotic lung disease. This study therefore aimed to determine the diagnostic value of circadian rhythm-related genes (CRRGs) in idiopathic pulmonary fibrosis (IPF).
METHODS
We retrieved data on CRRGs from previous studies and from the GSE150910 dataset. Participants in the GSE150910 dataset were divided into training and internal validation sets. We then screened candidate genes using several bioinformatics methods and machine learning algorithms and identified SEMA5A, COL7A1, and TUBB3, which were included in a random forest (RF) diagnostic model. Finally, external validation was performed on data retrieved from the GSE184316 dataset.
RESULTS
The RF diagnostic model distinguished patients with IPF with an area under the receiver operating characteristic (ROC) curve (AUC) of 0.905 in the internal validation set and 0.767 in the external validation set. Furthermore, real-time quantitative PCR and western blotting revealed a significant decrease in SEMA5A expression (p < 0.05) and an increase in COL7A1 and TUBB3 expression in TGF-β1-treated normal human lung fibroblasts.
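As a reading aid for the AUC values reported above, the following minimal sketch shows one common way to compute an AUC from predicted scores: the fraction of (case, control) pairs in which the case receives the higher score, with ties counted as one half. The function name `roc_auc` and the toy labels and scores are hypothetical illustrations; they are not the study's data or code.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen positive sample
    scores higher than a randomly chosen negative one (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT from the study): 1 = IPF, 0 = control.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)
print(f"toy AUC = {toy_auc:.3f}")
```

An AUC of 1.0 means every case outranks every control, while 0.5 corresponds to chance-level ranking.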
CONCLUSION
We constructed an RF diagnostic model based on SEMA5A, COL7A1, and TUBB3 expression in lung tissue for diagnosing patients with IPF.