Yang M, Zhuang J, Hu W, Li J, Wang Y, Zhang Z, Liu C, Chen H. Enhancing Patient Selection in Sepsis Clinical Trials Design Through an AI Enrichment Strategy: Algorithm Development and Validation.
J Med Internet Res 2024;
26:e54621. [PMID:
39231425 PMCID:
PMC11411223 DOI:
10.2196/54621]
[Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/16/2023] [Revised: 04/22/2024] [Accepted: 07/21/2024] [Indexed: 09/06/2024] Open
Abstract
BACKGROUND
Sepsis is a heterogeneous syndrome, and enrollment of more homogeneous patients is essential to improve the efficiency of clinical trials. Artificial intelligence (AI) has facilitated the identification of homogeneous subgroups, but how to estimate the uncertainty of the model outputs when applying AI to clinical decision-making remains unknown.
OBJECTIVE
We aimed to design an AI-based model for purposeful patient enrollment, ensuring that a patient with sepsis recruited into a trial would still be persistently ill by the time the proposed therapy could impact patient outcome. We also expected that the model could provide interpretable factors and estimate the uncertainty of the model outputs at a customized confidence level.
METHODS
In this retrospective study, 9135 patients with sepsis requiring vasopressor treatment within 24 hours after sepsis onset were enrolled from Beth Israel Deaconess Medical Center. This cohort was used for model development, and 10-fold cross-validation with 50 repeats was used for internal validation. In total, 3743 patients with sepsis from the eICU Collaborative Research Database were used as the external validation cohort. All included patients with sepsis were stratified based on disease progression trajectories: rapid death, recovery, and persistent ill. A total of 148 variables were selected for predicting the 3 trajectories. Four machine learning algorithms with 3 different setups were used. We estimated the uncertainty of the model outputs using conformal prediction (CP). The Shapley Additive Explanations method was used to explain the model.
RESULTS
The multiclass gradient boosting machine was identified as the best-performing model with good discrimination and calibration performance in both validation cohorts. The mean area under the receiver operating characteristic curve with SD was 0.906 (0.018) for rapid death, 0.843 (0.008) for recovery, and 0.807 (0.010) for persistent ill in the internal validation cohort. In the external validation cohort, the mean area under the receiver operating characteristic curve (SD) was 0.878 (0.003) for rapid death, 0.764 (0.008) for recovery, and 0.696 (0.007) for persistent ill. The maximum norepinephrine equivalence, total urine output, Acute Physiology Score III, mean systolic blood pressure, and the coefficient of variation of oxygen saturation contributed the most. Compared to the model without CP, using the model with CP at a mixed confidence approach reduced overall prediction errors by 27.6% (n=62) and 30.7% (n=412) in the internal and external validation cohorts, respectively, as well as enabled the identification of more potentially persistent ill patients.
CONCLUSIONS
The implementation of our model has the potential to reduce heterogeneity and enroll more homogeneous patients in sepsis clinical trials. The use of CP for estimating the uncertainty of the model outputs allows for a more comprehensive understanding of the model's reliability and assists in making informed decisions based on the predicted outcomes.
Collapse