1
Akhlaghi H, Freeman S, Vari C, McKenna B, Braitberg G, Karro J, Tahayori B. Machine learning in clinical practice: Evaluation of an artificial intelligence tool after implementation. Emerg Med Australas 2024; 36:118-124. [PMID: 37771067 DOI: 10.1111/1742-6723.14325] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/18/2023] [Revised: 09/14/2023] [Accepted: 09/19/2023] [Indexed: 09/30/2023]
Abstract
OBJECTIVE Artificial intelligence (AI) has gradually found its way into healthcare, and its future integration into clinical practice is inevitable. In the present study, we evaluate the accuracy of a novel AI algorithm designed to predict admission based on a triage note after clinical implementation. This is the first of such studies to investigate real-time AI performance in the emergency setting. METHODS The novel AI algorithm that predicts admission using a triage note was translated into clinical practice and integrated within St Vincent's Hospital Melbourne's electronic emergency patient management system. The data were collected from 1 January 2021 to 17 August 2022 to evaluate the diagnostic accuracy of the AI system after implementation. RESULTS A total of 77 125 ED presentations were included. The live AI algorithm has a sensitivity of 73.1% (95% confidence interval 72.5-73.8), specificity of 74.3% (73.9-74.7), positive predictive value of 50% (49.6-50.4) and negative predictive value of 88.7% (88.5-89) with a total accuracy of 74% (73.7-74.3). The accuracy of the system was at the lowest for admission to psychiatric units (34%) and at the highest for gastroenterology and medical admission (84% and 80%, respectively). CONCLUSION Our diagnostic evaluation showed that the real-time AI clinical decision-support tool was less accurate than the original. Although the real-time sensitivity and specificity of the AI tool were still acceptable for a decision-support tool in the ED, we propose that continuous training and evaluation of AI-enabled clinical support tools in healthcare be conducted to ensure consistent accuracy and performance and to prevent inadvertent consequences.
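The reported point estimates can be cross-checked against one another with Bayes' rule. The following is a minimal sketch, not the authors' analysis: the function names are hypothetical, and it assumes the four reported metrics come from a single 2x2 table. It solves for the admission rate implied by the sensitivity, specificity, and positive predictive value, then recomputes accuracy and negative predictive value from that rate.

```python
def implied_prevalence(sensitivity: float, specificity: float, ppv: float) -> float:
    """Invert Bayes' rule: prevalence implied by sensitivity, specificity and PPV."""
    false_positive_rate = 1.0 - specificity
    # PPV = sens*p / (sens*p + fpr*(1-p)), solved for p.
    return (ppv * false_positive_rate) / (
        sensitivity * (1.0 - ppv) + ppv * false_positive_rate
    )


def accuracy_and_npv(sensitivity: float, specificity: float, prevalence: float):
    """Return (accuracy, NPV) as fractions of a population with the given prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    accuracy = true_pos + true_neg
    npv = true_neg / (true_neg + false_neg)
    return accuracy, npv


# Point estimates taken from the abstract above.
SENS, SPEC, PPV = 0.731, 0.743, 0.50

prevalence = implied_prevalence(SENS, SPEC, PPV)
accuracy, npv = accuracy_and_npv(SENS, SPEC, prevalence)
print(f"implied admission rate: {prevalence:.3f}")
print(f"accuracy: {accuracy:.3f}, NPV: {npv:.3f}")
```

Under these assumptions the implied admission rate is roughly 26%, and the recomputed accuracy and negative predictive value land close to the reported 74% and 88.7%, which suggests the published figures are internally consistent.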
Affiliation(s)
- Hamed Akhlaghi
- Department of Emergency Medicine, St Vincent's Hospital Melbourne, Melbourne, Victoria, Australia
- Department of Medical Education, The University of Melbourne, Melbourne, Victoria, Australia
- Faculty of Health, Deakin University, Melbourne, Victoria, Australia
- Sam Freeman
- Department of Emergency Medicine, St Vincent's Hospital Melbourne, Melbourne, Victoria, Australia
- SensiLab, Monash University, Melbourne, Victoria, Australia
- Cynthia Vari
- Department of Emergency Medicine, St Vincent's Hospital Melbourne, Melbourne, Victoria, Australia
- Bede McKenna
- Department of Emergency Medicine, St Vincent's Hospital Melbourne, Melbourne, Victoria, Australia
- George Braitberg
- Department of Emergency Medicine, Austin Health, Melbourne, Victoria, Australia
- Department of Critical Care, The University of Melbourne, Melbourne, Victoria, Australia
- Jonathan Karro
- Department of Emergency Medicine, St Vincent's Hospital Melbourne, Melbourne, Victoria, Australia
- Bahman Tahayori
- Florey Institute of Neuroscience and Mental Health, The University of Melbourne, Melbourne, Victoria, Australia
2
Brann F, Sterling NW, Frisch SO, Schrager JD. Sepsis Prediction at Emergency Department Triage Using Natural Language Processing: Retrospective Cohort Study. JMIR AI 2024; 3:e49784. [PMID: 38875594 PMCID: PMC11041457 DOI: 10.2196/49784] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2023] [Revised: 08/15/2023] [Accepted: 12/16/2023] [Indexed: 06/16/2024]
Abstract
BACKGROUND Despite its high lethality, sepsis can be difficult to detect on initial presentation to the emergency department (ED). Machine learning-based tools may provide avenues for earlier detection and lifesaving intervention. OBJECTIVE The study aimed to predict sepsis at the time of ED triage using natural language processing of nursing triage notes and available clinical data. METHODS We constructed a retrospective cohort of all 1,234,434 consecutive ED encounters in 2015-2021 from 4 separate clinically heterogeneous academically affiliated EDs. After exclusion criteria were applied, the final cohort included 1,059,386 adult ED encounters. The primary outcome criteria for sepsis were presumed severe infection and acute organ dysfunction. After vectorization and dimensional reduction of triage notes and clinical data available at triage, a decision tree-based ensemble (time-of-triage) model was trained to predict sepsis using the training subset (n=950,921). A separate (comprehensive) model was trained using these data and laboratory data, as it became available at 1-hour intervals, after triage. Model performances were evaluated using the test (n=108,465) subset. RESULTS Sepsis occurred in 35,318 encounters (incidence 3.45%). For sepsis prediction at the time of patient triage, using the primary definition, the area under the receiver operating characteristic curve (AUC) and macro F1-score for sepsis were 0.94 and 0.61, respectively. Sensitivity, specificity, and false positive rate were 0.87, 0.85, and 0.15, respectively. The time-of-triage model accurately predicted sepsis in 76% (1635/2150) of sepsis cases where sepsis screening was not initiated at triage and 97.5% (1630/1671) of cases where sepsis screening was initiated at triage. Positive and negative predictive values were 0.18 and 0.99, respectively. For sepsis prediction using laboratory data available each hour after ED arrival, the AUC peaked to 0.97 at 12 hours. 
Similar results were obtained when stratifying by hospital and when Centers for Disease Control and Prevention hospital toolkit for adult sepsis surveillance criteria were used to define sepsis. Among septic cases, sepsis was predicted in 36.1% (1375/3814), 49.9% (1902/3814), and 68.3% (2604/3814) of encounters, respectively, at 3, 2, and 1 hours prior to the first intravenous antibiotic order or where antibiotics were not ordered within the first 12 hours. CONCLUSIONS Sepsis can accurately be predicted at ED presentation using nursing triage notes and clinical information available at the time of triage. This indicates that machine learning can facilitate timely and reliable alerting for intervention. Free-text data can improve the performance of predictive modeling at the time of triage and throughout the ED course.
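The low positive predictive value alongside a high AUC follows from the low incidence of sepsis. A minimal sketch of that arithmetic is given below, assuming the reported cohort incidence (3.45%) also applies to the test subset, which the abstract does not state; `predictive_values` is a hypothetical helper, not code from the study.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Return (PPV, NPV) from sensitivity, specificity and prevalence via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Reported time-of-triage sensitivity and specificity, with the cohort incidence
# used as a stand-in for test-set prevalence (an assumption).
ppv, npv = predictive_values(sensitivity=0.87, specificity=0.85, prevalence=0.0345)
print(f"PPV: {ppv:.2f}, NPV: {npv:.2f}")
```

With these rounded inputs the sketch gives a positive predictive value near 0.17 and a negative predictive value near 0.99, in line with the reported 0.18 and 0.99 once rounding of the published sensitivity and specificity is allowed for.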
Affiliation(s)
- Felix Brann
- Vital Software, Inc, Claymont, DE, United States
- Justin D Schrager
- Vital Software, Inc, Claymont, DE, United States
- Department of Emergency Medicine, Emory University School of Medicine, Atlanta, GA, United States
3
Stewart J, Lu J, Goudie A, Arendts G, Meka SA, Freeman S, Walker K, Sprivulis P, Sanfilippo F, Bennamoun M, Dwivedi G. Applications of natural language processing at emergency department triage: A narrative review. PLoS One 2023; 18:e0279953. [PMID: 38096321 PMCID: PMC10721204 DOI: 10.1371/journal.pone.0279953] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/18/2022] [Accepted: 11/30/2023] [Indexed: 12/18/2023]
Abstract
INTRODUCTION Natural language processing (NLP) uses various computational methods to analyse and understand human language, and has been applied to data acquired at Emergency Department (ED) triage to predict various outcomes. The objective of this scoping review is to evaluate how NLP has been applied to data acquired at ED triage, assess if NLP based models outperform humans or current risk stratification techniques when predicting outcomes, and assess if incorporating free-text improves predictive performance of models when compared to predictive models that use only structured data. METHODS All English language peer-reviewed research that applied an NLP technique to free-text obtained at ED triage was eligible for inclusion. We excluded studies focusing solely on disease surveillance, and studies that used information obtained after triage. We searched the electronic databases MEDLINE, Embase, Cochrane Database of Systematic Reviews, Web of Science, and Scopus for medical subject headings and text keywords related to NLP and triage. Databases were last searched on 01/01/2022. Risk of bias in studies was assessed using the Prediction model Risk of Bias Assessment Tool (PROBAST). Due to the high level of heterogeneity between studies and high risk of bias, a meta-analysis was not conducted. Instead, a narrative synthesis is provided. RESULTS In total, 3730 studies were screened, and 20 studies were included. The population size varied greatly between studies ranging from 1.8 million patients to 598 triage notes. The most common outcomes assessed were prediction of triage score, prediction of admission, and prediction of critical illness. NLP models achieved high accuracy in predicting need for admission, triage score, critical illness, and mapping free-text chief complaints to structured fields. Incorporating both structured data and free-text data improved results when compared to models that used only structured data. 
However, the majority of studies (80%) were assessed to have a high risk of bias, and only one study reported the deployment of an NLP model into clinical practice. CONCLUSION Unstructured free-text triage notes have been used by NLP models to predict clinically relevant outcomes. However, the majority of studies have a high risk of bias, most research is retrospective, and there are few examples of implementation into clinical practice. Future work is needed to prospectively assess if applying NLP to data acquired at ED triage improves ED outcomes when compared to usual clinical practice.
Affiliation(s)
- Jonathon Stewart
- School of Medicine, The University of Western Australia, Crawley, Western Australia, Australia
- Harry Perkins Institute of Medical Research, Murdoch, Western Australia, Australia
- Department of Emergency Medicine, Fiona Stanley Hospital, Murdoch, Western Australia, Australia
- Juan Lu
- School of Medicine, The University of Western Australia, Crawley, Western Australia, Australia
- Harry Perkins Institute of Medical Research, Murdoch, Western Australia, Australia
- Department of Computer Science and Software Engineering, The University of Western Australia, Crawley, Western Australia, Australia
- Adrian Goudie
- Department of Emergency Medicine, Fiona Stanley Hospital, Murdoch, Western Australia, Australia
- Glenn Arendts
- School of Medicine, The University of Western Australia, Crawley, Western Australia, Australia
- Department of Emergency Medicine, Fiona Stanley Hospital, Murdoch, Western Australia, Australia
- Shiv Akarsh Meka
- HIVE & Data and Digital Innovation, Royal Perth Hospital, Perth, Western Australia, Australia
- Sam Freeman
- Department of Emergency Medicine, St Vincent’s Hospital Melbourne, Melbourne, Victoria, Australia
- SensiLab, Monash University, Melbourne, Victoria, Australia
- Katie Walker
- School of Clinical Sciences at Monash Health, Monash University, Melbourne, Victoria, Australia
- Peter Sprivulis
- Western Australia Department of Health, East Perth, Western Australia, Australia
- Frank Sanfilippo
- School of Population and Global Health, University of Western Australia, Crawley, Western Australia, Australia
- Mohammed Bennamoun
- Department of Computer Science and Software Engineering, The University of Western Australia, Crawley, Western Australia, Australia
- Girish Dwivedi
- School of Medicine, The University of Western Australia, Crawley, Western Australia, Australia
- Harry Perkins Institute of Medical Research, Murdoch, Western Australia, Australia
- Department of Cardiology, Fiona Stanley Hospital, Murdoch, Western Australia, Australia
4
Sarbay İ, Berikol GB, Özturan İU. Performance of emergency triage prediction of an open access natural language processing based chatbot application (ChatGPT): A preliminary, scenario-based cross-sectional study. Turk J Emerg Med 2023; 23:156-161. [PMID: 37529789 PMCID: PMC10389099 DOI: 10.4103/tjem.tjem_79_23] [Citation(s) in RCA: 12] [Impact Index Per Article: 12.0] [Received: 03/18/2023] [Revised: 04/13/2023] [Accepted: 05/24/2023] [Indexed: 08/03/2023]
Abstract
OBJECTIVES Artificial intelligence companies have been increasing their initiatives recently to improve the results of chatbots, which are software programs that can converse with a human in natural language. The role of chatbots in health care is deemed worthy of research. OpenAI's ChatGPT is a supervised and empowered machine learning-based chatbot. The aim of this study was to determine the performance of ChatGPT in emergency medicine (EM) triage prediction. METHODS This was a preliminary, cross-sectional study conducted with case scenarios generated by the researchers based on the emergency severity index (ESI) handbook v4 cases. Two independent EM specialists who were experts in the ESI triage scale determined the triage categories for each case. A third independent EM specialist was consulted as arbiter, if necessary. Consensus results for each case scenario were assumed as the reference triage category. Subsequently, each case scenario was queried with ChatGPT and the answer was recorded as the index triage category. Inconsistent classifications between the ChatGPT and reference category were defined as over-triage (false positive) or under-triage (false negative). RESULTS Fifty case scenarios were assessed in the study. Reliability analysis showed a fair agreement between EM specialists and ChatGPT (Cohen's Kappa: 0.341). Eleven cases (22%) were over triaged and 9 (18%) cases were under triaged by ChatGPT. In 9 cases (18%), ChatGPT reported two consecutive triage categories, one of which matched the expert consensus. It had an overall sensitivity of 57.1% (95% confidence interval [CI]: 34-78.2), specificity of 34.5% (95% CI: 17.9-54.3), positive predictive value (PPV) of 38.7% (95% CI: 21.8-57.8), negative predictive value (NPV) of 52.6% (95% CI: 28.9-75.6), and an F1 score of 0.461. 
In high acuity cases (ESI-1 and ESI-2), ChatGPT showed a sensitivity of 76.2% (95% CI: 52.8-91.8), specificity of 93.1% (95% CI: 77.2-99.2), PPV of 88.9% (95% CI: 65.3-98.6), NPV of 84.4% (95% CI: 67.2-94.7), and an F1 score of 0.821. The receiver operating characteristic curve showed an area under the curve of 0.846 (95% CI: 0.724-0.969, P < 0.001) for high acuity cases. CONCLUSION The performance of ChatGPT was best when predicting high acuity cases (ESI-1 and ESI-2). It may be useful when determining the cases requiring critical care. When trained with more medical knowledge, ChatGPT may be more accurate for other triage category predictions.
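The F1 scores quoted above are the harmonic mean of positive predictive value (precision) and sensitivity (recall). The short sketch below recomputes them from the rounded percentages in the abstract; `f1_score` here is a hypothetical local helper written for illustration, not a library call and not the authors' code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision (PPV) and recall (sensitivity)."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# Rounded PPV and sensitivity values as reported in the abstract.
overall_f1 = f1_score(precision=0.387, recall=0.571)
high_acuity_f1 = f1_score(precision=0.889, recall=0.762)

print(f"overall F1: {overall_f1:.3f}")
print(f"high acuity F1: {high_acuity_f1:.3f}")
```

Both recomputed values agree with the reported F1 scores of 0.461 and 0.821 to three decimal places.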
Affiliation(s)
- İbrahim Sarbay
- Department of Emergency Medicine, Keşan State Hospital, Edirne, Turkey
- Göksu Bozdereli Berikol
- Department of Emergency Medicine, Bakırköy Dr. Sadi Konuk Training and Research Hospital, İstanbul, Turkey
- İbrahim Ulaş Özturan
- Department of Emergency Medicine, Kocaeli University, Faculty of Medicine, Kocaeli, Turkey
- Department of Medical Education, Acibadem University, Institute of Health Sciences, Istanbul, Turkey
5
Mitha S, Schwartz J, Hobensack M, Cato K, Woo K, Smaldone A, Topaz M. Natural Language Processing of Nursing Notes: An Integrative Review. Comput Inform Nurs 2023; 41:377-384. [PMID: 36730744 DOI: 10.1097/cin.0000000000000967] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/04/2023]
Abstract
Natural language processing includes a variety of techniques that help to extract meaning from narrative data. In healthcare, medical natural language processing has been a growing field of study; however, little is known about its use in nursing. We searched PubMed, EMBASE, and CINAHL and found 689 studies, narrowed to 43 eligible studies using natural language processing in nursing notes. Data related to the study purpose, patient population, methodology, performance evaluation metrics, and quality indicators were extracted for each study. The majority (86%) of the studies were conducted from 2015 to 2021. Most of the studies (58%) used inpatient data. One of four studies used data from open-source databases. The most common standard terminologies used were the Unified Medical Language System and Systematized Nomenclature of Medicine, whereas nursing-specific standard terminologies were used only in eight studies. Full system performance metrics (eg, F score) were reported for 61% of applicable studies. The overall number of nursing natural language processing publications remains relatively small compared with the other medical literature. Future studies should evaluate and report appropriate performance metrics and use existing standard nursing terminologies to enable future scalability of the methods and findings.
Affiliation(s)
- Shazia Mitha
- Columbia University School of Nursing, New York
6
Eysenbach G, Kleib M, Norris C, O'Rourke HM, Montgomery C, Douma M. The Use and Structure of Emergency Nurses' Triage Narrative Data: Scoping Review. JMIR Nurs 2023; 6:e41331. [PMID: 36637881 PMCID: PMC9883744 DOI: 10.2196/41331] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2022] [Revised: 11/24/2022] [Accepted: 11/28/2022] [Indexed: 11/30/2022]
Abstract
BACKGROUND Emergency departments use triage to ensure that patients with the highest level of acuity receive care quickly and safely. Triage is typically a nursing process that is documented as structured and unstructured (free text) data. Free-text triage narratives have been studied for specific conditions but never reviewed in a comprehensive manner. OBJECTIVE The objective of this paper was to identify and map the academic literature that examines triage narratives. The paper described the types of research conducted, identified gaps in the research, and determined where additional review may be warranted. METHODS We conducted a scoping review of unstructured triage narratives. We mapped the literature, described the use of triage narrative data, examined the information available on the form and structure of narratives, highlighted similarities among publications, and identified opportunities for future research. RESULTS We screened 18,074 studies published between 1990 and 2022 in CINAHL, MEDLINE, Embase, Cochrane, and ProQuest Central. We identified 0.53% (96/18,074) of studies that directly examined the use of triage nurses' narratives. More than 12 million visits were made to 2438 emergency departments included in the review. In total, 82% (79/96) of these studies were conducted in the United States (43/96, 45%), Australia (31/96, 32%), or Canada (5/96, 5%). Triage narratives were used for research and case identification, as input variables for predictive modeling, and for quality improvement. Overall, 31% (30/96) of the studies offered a description of the triage narrative, including a list of the keywords used (27/96, 28%) or fuller descriptions (such as word counts, character counts, abbreviations, etc; 7/96, 7%). We found limited use of reporting guidelines (8/96, 8%). CONCLUSIONS The breadth of the identified studies suggests that there is widespread routine collection and research use of triage narrative data. 
Despite the use of triage narratives as a source of data in studies, the narratives and nurses who generate them are poorly described in the literature, and data reporting is inconsistent. Additional research is needed to describe the structure of triage narratives, determine the best use of triage narratives, and improve the consistent use of triage-specific data reporting guidelines. INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID) RR2-10.1136/bmjopen-2021-055132.
Affiliation(s)
- Manal Kleib
- Faculty of Nursing, University of Alberta, Edmonton, AB, Canada
- Colleen Norris
- Faculty of Nursing, University of Alberta, Edmonton, AB, Canada
- Matthew Douma
- School of Nursing, Midwifery and Health Systems, University College Dublin, Dublin, Ireland