Drozdov I, Szubert B, Rowe IA, Kendall TJ, Fallowfield JA. Accurate prediction of all-cause mortality in patients with metabolic dysfunction-associated steatotic liver disease using electronic health records. Ann Hepatol 2024; 29:101528. [PMID: 38971372 DOI: 10.1016/j.aohep.2024.101528]
[Received: 06/10/2024] [Accepted: 06/13/2024] [Indexed: 07/08/2024]
Abstract
INTRODUCTION AND OBJECTIVES
Despite the huge clinical burden of MASLD, validated tools for early risk stratification are lacking, and heterogeneous disease expression and a highly variable rate of progression to clinical outcomes result in prognostic uncertainty. We aimed to investigate longitudinal electronic health record-based outcome prediction in MASLD using a state-of-the-art machine learning model.
PATIENTS AND METHODS
A cohort of n = 940 patients with histologically defined MASLD was used to develop a deep-learning model for all-cause mortality prediction. Patient timelines, spanning 12 years, were fully annotated with demographic/clinical characteristics, ICD-9 and ICD-10 codes, blood test results, prescribing data, and secondary care activity. A Transformer neural network (TNN) was trained to output concomitant probabilities of 12-, 24-, and 36-month all-cause mortality. In-sample performance was assessed using 5-fold cross-validation. Out-of-sample performance was assessed in an independent set of n = 528 MASLD patients.
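As an illustration of the in-sample evaluation scheme, the following is a minimal sketch of how a 5-fold cross-validation split over 940 patients could be generated. The function name `k_fold_indices`, the random seed, and the simple shuffled, non-stratified split are assumptions made for illustration; the abstract does not state how the folds were constructed.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, validation) index lists for k-fold cross-validation.

    Hypothetical helper: a plain shuffled split, not necessarily the
    scheme used in the study.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


# With 940 patients and k = 5, each fold holds out 188 patients
# and trains on the remaining 752.
splits = list(k_fold_indices(940, k=5))
```

Each patient appears in exactly one validation fold, so every patient contributes one held-out prediction to the in-sample performance estimate.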
RESULTS
In-sample, the model achieved an AUROC of 0.74-0.90 (95 % CI: 0.72-0.94), sensitivity of 64 %-82 %, specificity of 75 %-92 %, and positive predictive value (PPV) of 94 %-98 %. In out-of-sample validation, the model achieved an AUROC of 0.70-0.86 (95 % CI: 0.67-0.90), sensitivity of 69 %-70 %, specificity of 96 %-97 %, and PPV of 75 %-77 %. Key predictive factors, identified using coefficients of determination, were age, presence of type 2 diabetes, and history of hospital admissions with length of stay >14 days.
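For readers less familiar with the metrics reported above, the following is a minimal sketch of how sensitivity, specificity, PPV, and AUROC are computed from binary outcomes and predicted probabilities. The labels, scores, and the 0.5 decision threshold are made-up illustrative values, not data from the study, and the function names are hypothetical. The sketch assumes both outcome classes are present and that at least one positive prediction is made.

```python
def confusion_counts(y_true, y_score, threshold=0.5):
    """Count true/false positives and negatives at a given threshold."""
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, y_score):
        predicted = 1 if s >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def summary_metrics(y_true, y_score, threshold=0.5):
    """Sensitivity, specificity, and PPV at a fixed threshold."""
    tp, fp, tn, fn = confusion_counts(y_true, y_score, threshold)
    return {
        "sensitivity": tp / (tp + fn),  # deaths correctly flagged
        "specificity": tn / (tn + fp),  # survivors correctly cleared
        "ppv": tp / (tp + fp),          # flagged patients who died
    }


def auroc(y_true, y_score):
    """AUROC as the probability that a random positive case scores
    higher than a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy example: 1 = died within the horizon, 0 = survived.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_score = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]

metrics = summary_metrics(y_true, y_score)
area = auroc(y_true, y_score)
```

Unlike AUROC, which is threshold-free, sensitivity, specificity, and PPV depend on the chosen threshold, and PPV additionally depends on the outcome prevalence in the evaluated cohort.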
CONCLUSIONS
A TNN, applied to routinely collected longitudinal electronic health records, achieved good performance in prediction of 12-, 24-, and 36-month all-cause mortality in patients with MASLD. Extrapolation of our technique to population-level data will enable scalable and accurate risk stratification to identify people most likely to benefit from anticipatory health care and personalized interventions.