Liu Z, Elcheva I. A six-gene prognostic signature for both adult and pediatric acute myeloid leukemia identified with machine learning. Am J Transl Res 2022;14:6210-6221. [PMID: 36247279; PMCID: PMC9556437]
[Received: 03/21/2022] [Accepted: 07/19/2022] [Indexed: 06/16/2023]
Abstract
BACKGROUND
Although adult and pediatric acute myeloid leukemias (AMLs) are known to be genetically distinct diseases, they still share certain gene expression profiles. The age-related genetic heterogeneity of AMLs has been well studied, but the prognostic signatures and molecular mechanisms common to adult and pediatric AMLs have received less attention.
AIM
To identify genes and pathways that are associated with both pediatric and adult AMLs and discover a gene signature for overall survival (OS) prediction.
METHODS
By mining the transcriptome profiles of The Cancer Genome Atlas (TCGA) data sets of adult cancers and the Therapeutically Applicable Research to Generate Effective Treatments (TARGET) data of pediatric cancers, we identified genes that are commonly dysregulated in both pediatric and adult AMLs, discovered a common gene signature, and built two risk score models, one for the TCGA cohort and one for the TARGET cohort, using L0-regularized global AUC (area under the receiver operating characteristic curve) summary maximization.
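To make the quantities in this section concrete, the following is a minimal sketch of how a linear risk score can be scored by AUC against a fixed overall-survival cutoff. It is not the authors' L0-regularized optimization. The function names, the linear form of the score, and the rule of excluding patients censored before the cutoff are all assumptions made for illustration.

```python
def label_at_cutoff(time_years, event, cutoff_years):
    """Binary outcome at an OS cutoff (assumed convention, not from the paper).

    Returns 1 if the patient died at or before the cutoff, 0 if the patient
    was still under follow-up after the cutoff, and None if the patient was
    censored at or before the cutoff (outcome unknown, so excluded).
    """
    if time_years > cutoff_years:
        return 0
    if event == 1:
        return 1
    return None


def risk_score(expression, weights):
    """Linear risk score: weighted sum of expression values (hypothetical form)."""
    return sum(weights[gene] * expression[gene] for gene in weights)


def auc(scores, labels):
    """Mann-Whitney estimate of AUC.

    Fraction of (positive, negative) pairs in which the positive case has
    the higher score; ties count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

An AUC of 0.5 corresponds to a score that ranks patients no better than chance, and 1.0 to a score that ranks every patient who died before the cutoff above every patient who survived past it.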
RESULTS
We identified 57 genes that are differentially expressed and prognostically significant in both adult and childhood AMLs. The top four Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways enriched in those 57 genes are transcriptional misregulation, focal adhesion, PI3K-Akt signaling pathway, and signaling pathways regulating pluripotency of stem cells. We further identified a 6-gene signature for risk prediction, comprising ADAMTS3, DNMT3B, NYNRIN, SORT1, ZFHX3, and ZG16B. We constructed a risk score model with one dataset (either TCGA or TARGET) and evaluated its performance with the other. The test AUCs for risk prediction on TCGA data with 2-year and 5-year OS cutoffs are 0.762 (P = 2.33e-13, 95% CI: 0.69-0.83) and 0.759 (P = 7.26e-08, 95% CI: 0.66-0.85), respectively, while the test AUCs on TARGET data with the same cutoffs are 0.71 (P = 3.3e-07, 95% CI: 0.62-0.79) and 0.72 (P = 5.25e-09, 95% CI: 0.65-0.80), respectively. We further stratified patients into three equal-sized prognostic subtypes using the 6-gene risk scores. The P-values of the tertile partitions are 1.74e-07 and 3.28e-08 for the TARGET and TCGA cohorts, respectively, both smaller than those of the standard cytogenetic risk stratification (TARGET: P = 1.64e-06; TCGA: P = 1.79e-05). When validated with two other independent cohorts, the 6-gene risk score models remained significant predictors of OS. Investigating the common gene expression program matters because it may allow findings in adults to be extrapolated to children, avoiding unnecessary pediatric clinical trials.
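The tertile stratification described above can be sketched as a rank-based split of risk scores into three equal-sized groups. This is an illustrative assumption about how such a partition might be computed; the function name and the tie-breaking rule (original order among equal scores) are hypothetical and not taken from the paper.

```python
def tertiles(scores):
    """Assign each score to a tertile: 0 (lowest), 1 (middle), 2 (highest).

    Patients are ranked by score and the ranks are cut into three groups
    whose sizes differ by at most one. Ties are broken by input order.
    """
    n = len(scores)
    # Indices of patients sorted from lowest to highest risk score.
    order = sorted(range(n), key=lambda i: scores[i])
    groups = [0] * n
    for rank, idx in enumerate(order):
        groups[idx] = rank * 3 // n
    return groups
```

The resulting group labels would then be compared with a survival test such as the log-rank test, which is outside the scope of this sketch.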