Nasser L, McLeod SL, Hall JN. Evaluating the Reliability of a Remote Acuity Prediction Tool in a Canadian Academic Emergency Department. Ann Emerg Med 2024; 83:373-379. [PMID: 38180398 DOI: 10.1016/j.annemergmed.2023.11.018]
[Received: 08/25/2023] [Revised: 11/04/2023] [Accepted: 11/17/2023] [Indexed: 01/06/2024]
Abstract
STUDY OBJECTIVE
There is increasing interest in harnessing artificial intelligence to virtually triage patients seeking care. The objective was to examine the reliability of a virtual machine learning algorithm to remotely predict acuity scores for patients seeking emergency department (ED) care by applying the algorithm to retrospective ED data.
METHODS
This was a retrospective review of adult patients conducted at an academic tertiary care ED (annual census 65,000) from January 2021 to August 2022. Data including ED visit date and time, patient age, sex, reason for visit, presenting complaint and patient-reported pain score were used by the machine learning algorithm to predict acuity scores. The algorithm was designed to up-triage high-risk complaints to promote safety for remote use. The predicted scores were then compared to nurse-led triage scores previously derived in real time using the electronic Canadian Triage and Acuity Scale (eCTAS), an electronic triage decision-support tool used in the ED. Interrater reliability was estimated using kappa statistics with 95% confidence intervals (CIs).
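For readers unfamiliar with the kappa statistics named above, the following is a minimal sketch of Cohen's kappa with no weighting, linear weights, or quadratic weights. It is not the authors' code: the function name `weighted_kappa`, its parameters, and the toy score lists are all hypothetical, and the abstract does not say which software was used or how the confidence intervals were derived. The only domain assumption is that acuity scores fall on the five ordered CTAS levels (1 to 5).

```python
def weighted_kappa(rater_a, rater_b, categories, weighting="none"):
    """Cohen's kappa for two raters scoring the same cases.

    weighting: "none" (unweighted), "linear", or "quadratic".
    categories must be listed in their natural order (e.g. [1, 2, 3, 4, 5]).
    Hypothetical helper for illustration only.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("rater lists must be non-empty and of equal length")
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(rater_a)

    # Joint proportions: observed[i][j] = share of cases scored i by A and j by B.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[index[a]][index[b]] += 1.0 / n

    # Marginal proportions for each rater.
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i, j):
        # Unweighted: every mismatch counts the same.
        if weighting == "none":
            return 0.0 if i == j else 1.0
        # Weighted: penalty grows with the distance between the two levels.
        d = abs(i - j) / (k - 1)
        return d if weighting == "linear" else d * d

    obs = sum(disagreement(i, j) * observed[i][j]
              for i in range(k) for j in range(k))
    exp = sum(disagreement(i, j) * row[i] * col[j]
              for i in range(k) for j in range(k))
    # kappa = 1 - (observed disagreement / disagreement expected by chance)
    return 1.0 - obs / exp


# Toy scores (made up, not study data): nurse-assigned vs algorithm-predicted.
nurse = [1, 2, 3, 4, 5, 3, 3, 2]
model = [1, 2, 2, 3, 5, 3, 2, 2]
for scheme in ("none", "linear", "quadratic"):
    print(scheme, round(weighted_kappa(nurse, model, [1, 2, 3, 4, 5], scheme), 3))
```

Weighted variants give partial credit when the two scores differ by only one level, which is why a weighted kappa can exceed the unweighted value on the same data.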
RESULTS
In total, 21,469 unique ED patient encounters were included. Exact modal agreement was achieved for 10,396 (48.4%) patient encounters. Interrater reliability ranged from poor to fair, as estimated using unweighted kappa (0.18, 95% CI 0.17 to 0.19), linear-weighted kappa (0.25, 95% CI 0.24 to 0.26), and quadratic-weighted kappa (0.36, 95% CI 0.35 to 0.37) statistics. Using the nurse-led eCTAS score as the reference, the machine learning algorithm overtriaged 9,897 (46.1%) and undertriaged 1,176 (5.5%) cases. Some of the undertriaged presenting complaints were conditions that generally require further probing to delineate their nature, including abnormal lab/imaging results, visual disturbance, and fever.
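The agreement, overtriage, and undertriage categories reported above can be illustrated with a short sketch. It assumes the usual CTAS convention that level 1 is the most acute and level 5 the least, so a predicted score numerically lower than the nurse-assigned reference counts as overtriage. The names `classify_triage` and `triage_summary` are hypothetical and are not taken from the study.

```python
from collections import Counter


def classify_triage(predicted, reference):
    """Label one encounter relative to the reference (nurse-led) score.

    Assumes CTAS ordering: 1 = most acute, 5 = least acute.
    """
    if predicted < reference:
        return "overtriage"   # algorithm assigned a more acute level
    if predicted > reference:
        return "undertriage"  # algorithm assigned a less acute level
    return "agreement"


def triage_summary(pairs):
    """Return {label: (count, percent)} for (predicted, reference) pairs."""
    counts = Counter(classify_triage(p, r) for p, r in pairs)
    total = sum(counts.values())
    return {label: (n, round(100 * n / total, 1)) for label, n in counts.items()}


# Toy pairs (made up, not study data).
print(triage_summary([(2, 3), (3, 3), (4, 3), (1, 1)]))

# Arithmetic check of the percentages reported in the abstract.
total = 21469
for label, n in (("agreement", 10396), ("overtriage", 9897), ("undertriage", 1176)):
    print(label, round(100 * n / total, 1))
```

The three reported counts sum to the 21,469 included encounters, so the categories are mutually exclusive and exhaustive.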
CONCLUSION
This machine learning algorithm needs further refinement before being safely implemented for patient use.