1
Liu W, Cai L, Li Y. Application of natural language processing to post-structuring of rectal cancer MRI reports. Clin Radiol 2024; 79:e204-e210. [PMID: 38042740 DOI: 10.1016/j.crad.2023.10.032] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2023] [Revised: 10/20/2023] [Accepted: 10/26/2023] [Indexed: 12/04/2023]
Abstract
AIM To evaluate a natural language processing (NLP) system for extracting structured information from the free-form text of rectal cancer magnetic resonance imaging (MRI) reports written in Chinese. MATERIALS AND METHODS A rule-based NLP model that could extract 11 key image features of rectal cancer was constructed using 358 MRI reports of rectal cancer written between 2015 and 2021. Fifty reports written before 2015 and 50 written after 2021 were used as test datasets, and the reference standard was determined by manual extraction of information by two radiologists. The length and reporting rate of image features in pre-2015 and post-2021 datasets, as well as the accuracy, precision, recall, and F1 score of feature extraction by the NLP system, were compared. The time required for the NLP to extract data was compared with that required by the radiologists. RESULTS Reports written after 2021 had longer diagnostic impression sections than reports written before 2015. The reporting rate of key imaging features of rectal cancer was 36.55% before 2015 and 79.82% after 2021. The accuracy, precision, recall, and F1 score of NLP for correct extraction of values from reports were 93.82%, 95.63%, 87.06%, and 91.15%, respectively, for pre-2015 reports, and 92.55%, 98.53%, 94.15%, and 96.29%, respectively, for post-2021 reports. NLP generated all the structured information in <1 second. CONCLUSIONS The NLP system with rule-based pattern matching achieved rapid and accurate structured processing of rectal cancer MRI reports. MRI reports with structured templates are more suitable for NLP-based extraction of information.
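The F1 scores reported above follow from the reported precision and recall, since F1 is their harmonic mean. The snippet below is a minimal sketch of that check; the function name `f1_score` and the `reported` dictionary are illustrative and not part of the study. Because the inputs are themselves rounded, the recomputed values agree with the published ones only to within rounding.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (result is in the same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision, recall, and F1 (all in percent) as quoted in the abstract.
reported = {
    "pre-2015": (95.63, 87.06, 91.15),
    "post-2021": (98.53, 94.15, 96.29),
}

for label, (precision, recall, f1_reported) in reported.items():
    f1 = f1_score(precision, recall)
    print(f"{label}: recomputed F1 = {f1:.2f}%, reported F1 = {f1_reported:.2f}%")
```

Accuracy cannot be recomputed the same way, because it also depends on the true-negative count, which the abstract does not give.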
Affiliation(s)
- W Liu
  - Department of Radiology, Aerospace Center Hospital, Beijing, 100049, China; Department of Radiology, Beijing Friendship Hospital, Capital Medical University, Beijing, 100050, China
- L Cai
  - School of Biological Science and Medical Engineering, Beihang University, Beijing, 100191, China
- Y Li
  - Department of General Surgery, Aerospace Center Hospital, Beijing, 100049, China
2
Karway GK, Koyner JL, Caskey J, Spicer AB, Carey KA, Gilbert ER, Dligach D, Mayampurath A, Afshar M, Churpek MM. Development and external validation of multimodal postoperative acute kidney injury risk machine learning models. JAMIA Open 2023; 6:ooad109. [PMID: 38144168 PMCID: PMC10746378 DOI: 10.1093/jamiaopen/ooad109] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2023] [Revised: 11/18/2023] [Accepted: 12/11/2023] [Indexed: 12/26/2023]
Abstract
Objectives To develop and externally validate machine learning models using structured and unstructured electronic health record data to predict postoperative acute kidney injury (AKI) across inpatient settings. Materials and Methods Data for adult postoperative admissions to the Loyola University Medical Center (2009-2017) were used for model development and admissions to the University of Wisconsin-Madison (2009-2020) were used for validation. Structured features included demographics, vital signs, laboratory results, and nurse-documented scores. Unstructured text from clinical notes was converted into concept unique identifiers (CUIs) using the clinical Text Analysis and Knowledge Extraction System. The primary outcome was the development of Kidney Disease: Improving Global Outcomes stage 2 AKI within 7 days after leaving the operating room. We derived unimodal extreme gradient boosting machines (XGBoost) and elastic net logistic regression (GLMNET) models using structured-only data and multimodal models combining structured data with CUI features. Model comparison was performed using the area under the receiver operating characteristic curve (AUROC), with DeLong's test for statistical differences. Results The study cohort included 138 389 adult patient admissions (mean [SD] age 58 [16] years; 11 506 [8%] African-American; and 70 826 [51%] female) across the 2 sites. Of those, 2959 (2.1%) developed stage 2 AKI or higher. Across all data types, XGBoost outperformed GLMNET (mean AUROC 0.81 [95% confidence interval (CI), 0.80-0.82] vs 0.78 [95% CI, 0.77-0.79]). The multimodal XGBoost model incorporating CUIs parameterized as term frequency-inverse document frequency (TF-IDF) showed the highest discrimination performance (AUROC 0.82 [95% CI, 0.81-0.83]) over unimodal models (AUROC 0.79 [95% CI, 0.78-0.80]). Discussion A multimodality approach with structured data and TF-IDF weighting of CUIs increased model performance over structured data-only models. Conclusion These findings highlight the predictive power of CUIs when merged with structured data for clinical prediction models, which may improve the detection of postoperative AKI.
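To make the TF-IDF weighting of CUIs mentioned above concrete, here is a minimal sketch using only the Python standard library. It assumes the plain textbook definition (relative term frequency times the unsmoothed log inverse document frequency); the abstract does not say which TF-IDF variant or software the authors used, and common libraries apply smoothing and normalization that would give different numbers. The identifiers `CUI_A` to `CUI_D` are hypothetical placeholders, not real UMLS concepts.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    tf  = occurrences of the term in the document / number of terms in the document
    idf = log(N / df), where df is the number of documents containing the term
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(doc))  # count each term once per document

    weights = []
    for doc in documents:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Each "note" is a list of placeholder concept identifiers (hypothetical, not real CUIs).
notes = [
    ["CUI_A", "CUI_B", "CUI_A"],
    ["CUI_A", "CUI_C"],
    ["CUI_A", "CUI_D", "CUI_D"],
]
weights = tf_idf(notes)
for i, w in enumerate(weights):
    print(i, w)
```

A concept that appears in every note (here `CUI_A`) gets weight zero, which is the point of the weighting: concepts that do not distinguish one admission from another contribute nothing to the feature vector.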
Affiliation(s)
- George K Karway
  - Department of Medicine, University of Wisconsin-Madison, Madison, WI 53792, United States
- Jay L Koyner
  - Section of Nephrology, Department of Medicine, University of Chicago, Chicago, IL 60637, United States
- John Caskey
  - Department of Medicine, University of Wisconsin-Madison, Madison, WI 53792, United States
- Alexandra B Spicer
  - Department of Medicine, University of Wisconsin-Madison, Madison, WI 53792, United States
- Kyle A Carey
  - Section of Nephrology, Department of Medicine, University of Chicago, Chicago, IL 60637, United States
- Emily R Gilbert
  - Department of Medicine, Loyola University Chicago, Chicago, IL 60153, United States
- Dmitriy Dligach
  - Department of Computer Science, Loyola University Chicago, Chicago, IL 60626, United States
- Anoop Mayampurath
  - Department of Medicine, University of Wisconsin-Madison, Madison, WI 53792, United States
  - Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53726, United States
- Majid Afshar
  - Department of Medicine, University of Wisconsin-Madison, Madison, WI 53792, United States
  - Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53726, United States
- Matthew M Churpek
  - Department of Medicine, University of Wisconsin-Madison, Madison, WI 53792, United States
  - Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53726, United States
3
Hsu E, Bako AT, Potter T, Pan AP, Britz GW, Tannous J, Vahidy FS. Extraction of Radiological Characteristics From Free-Text Imaging Reports Using Natural Language Processing Among Patients With Ischemic and Hemorrhagic Stroke: Algorithm Development and Validation. JMIR AI 2023; 2:e42884. [PMID: 38875556 PMCID: PMC11041442 DOI: 10.2196/42884] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/22/2022] [Revised: 01/10/2023] [Accepted: 04/08/2023] [Indexed: 06/16/2024]
Abstract
BACKGROUND Neuroimaging is the gold-standard diagnostic modality for all patients suspected of stroke. However, the unstructured nature of imaging reports remains a major challenge to extracting useful information from electronic health records systems. Despite the increasing adoption of natural language processing (NLP) for radiology reports, information extraction for many stroke imaging features has not been systematically evaluated. OBJECTIVE In this study, we propose an NLP pipeline, which adopts the state-of-the-art ClinicalBERT model with domain-specific pretraining and task-oriented fine-tuning to extract 13 stroke features from head computed tomography imaging notes. METHODS We used the model to generate structured data sets with information on the presence or absence of common stroke features for 24,924 patients with strokes. We compared the survival characteristics of patients with and without features of severe stroke (eg, midline shift, perihematomal edema, or mass effect) using the Kaplan-Meier curve and log-rank tests. RESULTS Pretrained on 82,073 head computed tomography notes with 13.7 million words and fine-tuned on 200 annotated notes, our HeadCT_BERT model achieved an average area under receiver operating characteristic curve of 0.9831, F1-score of 0.8683, and accuracy of 97%. Among patients with acute ischemic stroke, admissions with any severe stroke feature in initial imaging notes were associated with a lower probability of survival (P<.001). CONCLUSIONS Our proposed NLP pipeline achieved high performance and has the potential to improve medical research and patient safety.
Affiliation(s)
- Enshuo Hsu
  - Center for Health Data Science and Analytics, Houston Methodist Research Institute, Houston, TX, United States
  - School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, United States
- Abdulaziz T Bako
  - Center for Health Data Science and Analytics, Houston Methodist Research Institute, Houston, TX, United States
- Thomas Potter
  - Center for Health Data Science and Analytics, Houston Methodist Research Institute, Houston, TX, United States
- Alan P Pan
  - Center for Health Data Science and Analytics, Houston Methodist Research Institute, Houston, TX, United States
- Gavin W Britz
  - Department of Neurosurgery, Houston Methodist Neurological Institute, Houston, TX, United States
  - Department of Neurology, Weill Cornell Medical College, New York, NY, United States
- Jonika Tannous
  - Center for Health Data Science and Analytics, Houston Methodist Research Institute, Houston, TX, United States
- Farhaan S Vahidy
  - Center for Health Data Science and Analytics, Houston Methodist Research Institute, Houston, TX, United States
  - Department of Neurosurgery, Houston Methodist Neurological Institute, Houston, TX, United States
  - Department of Population Health Sciences, Weill Cornell Medical College, New York, NY, United States
4
Puts S, Nobel M, Zegers C, Bermejo I, Robben S, Dekker A. How Natural Language Processing Can Aid With Pulmonary Oncology Tumor Node Metastasis Staging From Free-Text Radiology Reports: Algorithm Development and Validation. JMIR Form Res 2023; 7:e38125. [PMID: 36947118 PMCID: PMC10131747 DOI: 10.2196/38125] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2022] [Revised: 09/25/2022] [Accepted: 12/22/2022] [Indexed: 03/23/2023]
Abstract
BACKGROUND Natural language processing (NLP) is thought to be a promising solution to extract and store concepts from free text in a structured manner for data mining purposes. This is also true for radiology reports, which still consist mostly of free text. Accurate and complete reports are very important for clinical decision support, for instance, in oncological staging. As such, NLP can be a tool to structure the content of the radiology report, thereby increasing the report's value. OBJECTIVE This study describes the implementation and validation of an N-stage classifier for pulmonary oncology. It is based on free-text radiological chest computed tomography reports according to the tumor, node, and metastasis (TNM) classification, which has been added to the already existing T-stage classifier to create a combined TN-stage classifier. METHODS SpaCy, PyContextNLP, and regular expressions were used for proper information extraction, after additional rules were set to accurately extract N-stage. RESULTS The overall TN-stage classifier accuracy scores were 0.84 and 0.85, respectively, for the training (N=95) and validation (N=97) sets. This is comparable to the outcomes of the T-stage classifier (0.87-0.92). CONCLUSIONS This study shows that NLP has potential in classifying pulmonary oncology from free-text radiological reports according to the TNM classification system as both the T- and N-stages can be extracted with high accuracy.
Affiliation(s)
- Sander Puts
  - GROW School for Oncology and Reproduction, Maastricht University Medical Centre+, Maastricht, Netherlands
  - Department of Radiation Oncology, Maastro, Maastricht, Netherlands
- Martijn Nobel
  - School of Health Professions Education, Maastricht University, Maastricht, Netherlands
  - Department of Radiology and Nuclear Medicine, Maastricht University Medical Center+, Maastricht, Netherlands
- Catharina Zegers
  - GROW School for Oncology and Reproduction, Maastricht University Medical Centre+, Maastricht, Netherlands
  - Department of Radiation Oncology, Maastro, Maastricht, Netherlands
- Iñigo Bermejo
  - GROW School for Oncology and Reproduction, Maastricht University Medical Centre+, Maastricht, Netherlands
- Simon Robben
  - School of Health Professions Education, Maastricht University, Maastricht, Netherlands
  - Department of Radiology and Nuclear Medicine, Maastricht University Medical Center+, Maastricht, Netherlands
- Andre Dekker
  - GROW School for Oncology and Reproduction, Maastricht University Medical Centre+, Maastricht, Netherlands
  - Department of Radiation Oncology, Maastro, Maastricht, Netherlands
5
Crema C, Attardi G, Sartiano D, Redolfi A. Natural language processing in clinical neuroscience and psychiatry: A review. Front Psychiatry 2022; 13:946387. [PMID: 36186874 PMCID: PMC9515453 DOI: 10.3389/fpsyt.2022.946387] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 05/17/2022] [Accepted: 08/22/2022] [Indexed: 11/13/2022]
Abstract
Natural language processing (NLP) is rapidly becoming an important topic in the medical community. The ability to automatically analyze any type of medical document could be the key factor to fully exploit the data it contains. Cutting-edge artificial intelligence (AI) architectures, particularly machine learning and deep learning, have begun to be applied to this topic and have yielded promising results. We conducted a literature search for 1,024 papers that used NLP technology in neuroscience and psychiatry from 2010 to early 2022. After a selection process, 115 papers were evaluated. Each publication was classified into one of three categories: information extraction, classification, and data inference. Automated understanding of clinical reports in electronic health records has the potential to improve healthcare delivery. Overall, the performance of NLP applications is high, with an average F1-score and AUC above 85%. We also derived a composite measure in the form of Z-scores to better compare the performance of NLP models and their different classes as a whole. No statistical differences were found in the unbiased comparison. Strong asymmetry between English and non-English models, difficulty in obtaining high-quality annotated data, and train biases causing low generalizability are the main limitations. This review suggests that NLP could be an effective tool to help clinicians gain insights from medical reports, clinical research forms, and more, making NLP an effective tool to improve the quality of healthcare services.
Affiliation(s)
- Claudio Crema
  - Laboratory of Neuroinformatics, IRCCS Istituto Centro San Giovanni di Dio Fatebenefratelli, Brescia, Italy
- Daniele Sartiano
  - Istituto di Informatica e Telematica, Consiglio Nazionale delle Ricerche, Pisa, Italy
- Alberto Redolfi
  - Laboratory of Neuroinformatics, IRCCS Istituto Centro San Giovanni di Dio Fatebenefratelli, Brescia, Italy
6
Bizzo BC, Almeida RR, Alkasab TK. Artificial Intelligence Enabling Radiology Reporting. Radiol Clin North Am 2021; 59:1045-1052. [PMID: 34689872 DOI: 10.1016/j.rcl.2021.07.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022]
Abstract
The radiology reporting process is beginning to incorporate structured, semantically labeled data. Tools based on artificial intelligence technologies using a structured reporting context can assist with internal report consistency and longitudinal tracking. To-do lists of relevant issues could be assembled by artificial intelligence tools, incorporating components of the patient's history. Radiologists will review and select artificial intelligence-generated and other data to be transmitted to the electronic health record and generate feedback for ongoing improvement of artificial intelligence tools. These technologies should make reports more valuable by making reports more accessible and better able to integrate into care pathways.
Affiliation(s)
- Bernardo C Bizzo
  - Department of Radiology, Massachusetts General Hospital, Harvard Medical School, 55 Fruit Street, Founders 210, Boston, MA 02114, USA
- Renata R Almeida
  - Department of Radiology, Brigham and Women's Hospital, Harvard Medical School, 75 Francis St, Boston, MA 02115, USA
- Tarik K Alkasab
  - Department of Radiology, Massachusetts General Hospital, Harvard Medical School, 55 Fruit Street, Founders 210, Boston, MA 02114, USA
7
Impact of Different Approaches to Preparing Notes for Analysis With Natural Language Processing on the Performance of Prediction Models in Intensive Care. Crit Care Explor 2021; 3:e0450. [PMID: 34136824 PMCID: PMC8202578 DOI: 10.1097/cce.0000000000000450] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 12/15/2022]
Abstract
OBJECTIVES: To evaluate whether different approaches to note text preparation (known as preprocessing) can impact machine learning model performance for mortality prediction in the ICU. DESIGN: Clinical note text was used to build machine learning models for adults admitted to the ICU. Preprocessing strategies studied were none (raw text), cleaning text, stemming, term frequency-inverse document frequency vectorization, and creation of n-grams. Model performance was assessed by the area under the receiver operating characteristic curve (AUROC). Models were trained and internally validated on University of California San Francisco data using 10-fold cross-validation. These models were then externally validated on Beth Israel Deaconess Medical Center data. SETTING: ICUs at University of California San Francisco and Beth Israel Deaconess Medical Center. SUBJECTS: Ten thousand patients in the University of California San Francisco training and internal testing dataset and 27,058 patients in the external validation dataset, Beth Israel Deaconess Medical Center. INTERVENTIONS: None. MEASUREMENTS AND MAIN RESULTS: Mortality rate at Beth Israel Deaconess Medical Center and University of California San Francisco was 10.9% and 7.4%, respectively. Data are presented as AUROC (95% CI) for models validated at University of California San Francisco and as AUROC for models validated at Beth Israel Deaconess Medical Center. Models built and trained on University of California San Francisco data for the prediction of in-hospital mortality improved from the raw note text model (AUROC, 0.84; CI, 0.80–0.89) to the term frequency-inverse document frequency model (AUROC, 0.89; CI, 0.85–0.94). When applying the models developed at University of California San Francisco to Beth Israel Deaconess Medical Center data, there was a similar increase in model performance from raw note text (AUROC at Beth Israel Deaconess Medical Center: 0.72) to the term frequency-inverse document frequency model (AUROC at Beth Israel Deaconess Medical Center: 0.83). CONCLUSIONS: Differences in preprocessing strategies for note text impacted model discrimination. A preprocessing pathway that included cleaning, stemming, and term frequency-inverse document frequency vectorization gave the greatest improvement in model performance. Further study is needed, with particular emphasis on how to manage author implicit bias present in note text, before natural language processing algorithms are implemented in the clinical setting.
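Two of the preprocessing steps named above, text cleaning and n-gram creation, can be sketched in a few lines. This is only an illustration under stated assumptions: the cleaning rule (lowercase, keep letters and digits) and the helper names `clean` and `ngrams` are hypothetical, the example note is invented, and stemming is omitted because the abstract does not say which stemmer was used.

```python
import re


def clean(text: str) -> list[str]:
    """Lowercase, replace runs of non-alphanumeric characters with a space, split on whitespace."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).split()


def ngrams(tokens: list[str], n: int) -> list[str]:
    """Return all contiguous n-grams, each joined by a single space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Invented example text, not taken from any dataset.
note = "Pt intubated; BP 90/60, started on pressors."
tokens = clean(note)
bigrams = ngrams(tokens, 2)
print(tokens)
print(bigrams)
```

In a full pipeline the resulting unigrams and n-grams would then be vectorized (for example with term frequency-inverse document frequency) before model fitting.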
8
Casey A, Davidson E, Poon M, Dong H, Duma D, Grivas A, Grover C, Suárez-Paniagua V, Tobin R, Whiteley W, Wu H, Alex B. A systematic review of natural language processing applied to radiology reports. BMC Med Inform Decis Mak 2021; 21:179. [PMID: 34082729 PMCID: PMC8176715 DOI: 10.1186/s12911-021-01533-7] [Citation(s) in RCA: 50] [Impact Index Per Article: 16.7] [Received: 02/09/2021] [Accepted: 05/17/2021] [Indexed: 11/10/2022]
Abstract
BACKGROUND Natural language processing (NLP) has a significant role in advancing healthcare and has been found to be key in extracting structured information from radiology reports. Understanding recent developments in NLP application to radiology is of significance but recent reviews on this are limited. This study systematically assesses and quantifies recent literature in NLP applied to radiology reports. METHODS We conduct an automated literature search yielding 4836 results using automated filtering, metadata enriching steps and citation search combined with manual review. Our analysis is based on 21 variables including radiology characteristics, NLP methodology, performance, study, and clinical application characteristics. RESULTS We present a comprehensive analysis of the 164 publications retrieved with publications in 2019 almost triple those in 2015. Each publication is categorised into one of 6 clinical application categories. Deep learning use increases in the period but conventional machine learning approaches are still prevalent. Deep learning remains challenged when data is scarce and there is little evidence of adoption into clinical practice. Despite 17% of studies reporting greater than 0.85 F1 scores, it is hard to comparatively evaluate these approaches given that most of them use different datasets. Only 14 studies made their data and 15 their code available with 10 externally validating results. CONCLUSIONS Automated understanding of clinical narratives of the radiology reports has the potential to enhance the healthcare process and we show that research in this field continues to grow. Reproducibility and explainability of models are important if the domain is to move applications into clinical use. More could be done to share code enabling validation of methods on different institutional data and to reduce heterogeneity in reporting of study properties allowing inter-study comparisons. 
Our results have significance for researchers in the field providing a systematic synthesis of existing work to build on, identify gaps, opportunities for collaboration and avoid duplication.
Affiliation(s)
- Arlene Casey
  - School of Literatures, Languages and Cultures (LLC), University of Edinburgh, Edinburgh, Scotland
- Emma Davidson
  - Centre for Clinical Brain Sciences, University of Edinburgh, Edinburgh, Scotland
- Michael Poon
  - Centre for Clinical Brain Sciences, University of Edinburgh, Edinburgh, Scotland
- Hang Dong
  - Centre for Medical Informatics, Usher Institute of Population Health Sciences and Informatics, University of Edinburgh, Edinburgh, Scotland
  - Health Data Research UK, London, UK
- Daniel Duma
  - School of Literatures, Languages and Cultures (LLC), University of Edinburgh, Edinburgh, Scotland
- Andreas Grivas
  - Institute for Language, Cognition and Computation, School of informatics, University of Edinburgh, Edinburgh, Scotland
- Claire Grover
  - Institute for Language, Cognition and Computation, School of informatics, University of Edinburgh, Edinburgh, Scotland
- Víctor Suárez-Paniagua
  - Centre for Medical Informatics, Usher Institute of Population Health Sciences and Informatics, University of Edinburgh, Edinburgh, Scotland
  - Health Data Research UK, London, UK
- Richard Tobin
  - Institute for Language, Cognition and Computation, School of informatics, University of Edinburgh, Edinburgh, Scotland
- William Whiteley
  - Centre for Clinical Brain Sciences, University of Edinburgh, Edinburgh, Scotland
  - Nuffield Department of Population Health, University of Oxford, Oxford, UK
- Honghan Wu
  - Health Data Research UK, London, UK
  - Institute of Health Informatics, University College London, London, UK
- Beatrice Alex
  - School of Literatures, Languages and Cultures (LLC), University of Edinburgh, Edinburgh, Scotland
  - Edinburgh Futures Institute, University of Edinburgh, Edinburgh, Scotland
9
Schultz MA, Walden RL, Cato K, Coviak CP, Cruz C, D'Agostino F, Douthit BJ, Forbes T, Gao G, Lee MA, Lekan D, Wieben A, Jeffery AD. Data Science Methods for Nursing-Relevant Patient Outcomes and Clinical Processes: The 2019 Literature Year in Review. Comput Inform Nurs 2021; 39:654-667. [PMID: 34747890 PMCID: PMC8578863 DOI: 10.1097/cin.0000000000000705] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022]
Abstract
Data science continues to be recognized and used within healthcare due to the increased availability of large data sets and advanced analytics. It can be challenging for nurse leaders to remain apprised of this rapidly changing landscape. In this article, we describe our findings from a scoping literature review of papers published in 2019 that use data science to explore, explain, and/or predict 15 phenomena of interest to nurses. Fourteen of the 15 phenomena were associated with at least one paper published in 2019. We identified the use of many contemporary data science methods (eg, natural language processing, neural networks) for many of the outcomes. We found many studies exploring Readmissions and Pressure Injuries. The topics of Artificial Intelligence/Machine Learning Acceptance, Burnout, Patient Safety, and Unit Culture were poorly represented. We hope that the studies described in this article help readers: (1) understand the breadth and depth of data science's ability to improve clinical processes and patient outcomes that are relevant to nurses and (2) identify gaps in the literature that are in need of exploration.
Affiliation(s)
- Mary Anne Schultz
  - Author Affiliations: California State University (Dr Schultz); Annette and Irwin Eskind Family Biomedical Library, Vanderbilt University (Ms Walden); Department of Emergency Medicine, Columbia University School of Nursing (Dr Cato); Grand Valley State University (Dr Coviak); Global Health Technology & Informatics, Chevron, San Ramon, CA (Mr Cruz); Saint Camillus International University of Health Sciences, Rome, Italy (Dr D'Agostino); Duke University School of Nursing (Mr Douthit); East Carolina University College of Nursing (Dr Forbes); St Catherine University Department of Nursing (Dr Gao); Texas Woman's University College of Nursing (Dr Lee); Assistant Professor, University of North Carolina at Greensboro School of Nursing (Dr Lekan); University of Wisconsin School of Nursing (Ms Wieben); and Vanderbilt University School of Nursing, and Tennessee Valley Healthcare System, US Department of Veterans Affairs (Dr Jeffery)
10
Senders JT, Cho LD, Calvachi P, McNulty JJ, Ashby JL, Schulte IS, Almekkawi AK, Mehrtash A, Gormley WB, Smith TR, Broekman MLD, Arnaout O. Automating Clinical Chart Review: An Open-Source Natural Language Processing Pipeline Developed on Free-Text Radiology Reports From Patients With Glioblastoma. JCO Clin Cancer Inform 2021; 4:25-34. [PMID: 31977252 DOI: 10.1200/cci.19.00060] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 01/05/2023]
Abstract
PURPOSE The aim of this study was to develop an open-source natural language processing (NLP) pipeline for text mining of medical information from clinical reports. We also aimed to provide insight into why certain variables or reports are more suitable for clinical text mining than others. MATERIALS AND METHODS Various NLP models were developed to extract 15 radiologic characteristics from free-text radiology reports for patients with glioblastoma. Ten-fold cross-validation was used to optimize the hyperparameter settings and estimate model performance. We examined how model performance was associated with quantitative attributes of the radiologic characteristics and reports. RESULTS In total, 562 unique brain magnetic resonance imaging reports were retrieved. NLP extracted 15 radiologic characteristics with high to excellent discrimination (area under the curve, 0.82 to 0.98) and accuracy (78.6% to 96.6%). Model performance was correlated with the inter-rater agreement of the manually provided labels (ρ = 0.904; P < .001) but not with the frequency distribution of the variables of interest (ρ = 0.179; P = .52). All variables labeled with a near perfect inter-rater agreement were classified with excellent performance (area under the curve > 0.95). Excellent performance could be achieved for variables with only 50 to 100 observations in the minority group and class imbalances up to a 9:1 ratio. Report-level classification accuracy was not associated with the number of words or the vocabulary size in the distinct text documents. CONCLUSION This study provides an open-source NLP pipeline that allows for text mining of narratively written clinical reports. Small sample sizes and class imbalance should not be considered as absolute contraindications for text mining in clinical research. 
However, future studies should report measures of inter-rater agreement whenever ground truth is based on a consensus label and use this measure to identify clinical variables eligible for text mining.
Collapse
Affiliation(s)
- Joeky T Senders
- Computational Neuroscience Outcomes Center, Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA; Department of Neurosurgery, Leiden University Medical Center, Leiden, the Netherlands
- Logan D Cho
- Computational Neuroscience Outcomes Center, Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA; Department of Neuroscience, Brown University, Providence, RI
- Paola Calvachi
- Computational Neuroscience Outcomes Center, Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
- John J McNulty
- Computational Neuroscience Outcomes Center, Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA; Vagelos College of Physicians and Surgeons, Columbia University, New York, NY
- Joanna L Ashby
- Computational Neuroscience Outcomes Center, Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
- Isabelle S Schulte
- Computational Neuroscience Outcomes Center, Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
- Ahmad Kareem Almekkawi
- Computational Neuroscience Outcomes Center, Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
- Alireza Mehrtash
- Department of Radiology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
- William B Gormley
- Computational Neuroscience Outcomes Center, Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
- Timothy R Smith
- Computational Neuroscience Outcomes Center, Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
- Marike L D Broekman
- Department of Neurosurgery, Leiden University Medical Center, Leiden, the Netherlands; Department of Neurosurgery, Haaglanden Medical Center, The Hague, the Netherlands
- Omar Arnaout
- Computational Neuroscience Outcomes Center, Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
|
11
|
Kulshrestha S, Dligach D, Joyce C, Baker MS, Gonzalez R, O’Rourke AP, Glazer JM, Stey A, Kruser JM, Churpek MM, Afshar M. Prediction of severe chest injury using natural language processing from the electronic health record. Injury 2021; 52:205-212. [PMID: 33131794 PMCID: PMC7856032 DOI: 10.1016/j.injury.2020.10.094] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 08/14/2020] [Revised: 10/02/2020] [Accepted: 10/23/2020] [Indexed: 02/02/2023]
Abstract
INTRODUCTION Trauma injury severity scores are currently calculated retrospectively from the electronic health record (EHR) using manual annotation by certified trauma coders. Natural language processing (NLP) of clinical documents in the EHR may enable automated injury scoring. We hypothesize that NLP with machine learning can discriminate between cases of severe and non-severe injury to the thorax after trauma. METHODS Clinical documents from a trauma center were examined between 2014 and 2018. Severe chest injury was defined as a thorax abbreviated injury score (AIS) >2 and served as the reference standard for supervised learning. Free text unigrams and concept unique identifiers (CUIs) from the Unified Medical Language System (UMLS) were extracted from clinical documents collected at one hour, four hours, and eight hours after patient arrival to the emergency department. Logistic regression models with elastic net regularization were tuned to maximize area under the receiver operating characteristic curve (AUROC) using 10-fold cross-validation on the training dataset (80%) and tested on a hold-out 20% dataset. RESULTS There were 6,891 traumas that met inclusion criteria. The complete data corpus consisted of 473,694 documents. Models trained using the first hour of data had a mean AUROC of 0.88 (95%CI [0.86, 0.89]); model discrimination and reclassification from the first hour significantly improved after eight hours with a mean AUROC of 0.94 (95%CI [0.93, 0.95]). Performance of models using CUIs was similar to that of models using unigrams (p>0.05). Models demonstrated excellent clinical face validity. CONCLUSIONS Both CUIs and unigrams demonstrated excellent discrimination in predicting severity of chest injury using the first eight hours of clinical documents. Our model demonstrates that automated anatomical injury scoring is feasible and may be used for aggregation of data for trauma research and quality programs.
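The reference standard and evaluation metric described above can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: it covers only the thorax AIS > 2 labelling rule and a pairwise (rank-based) AUROC, and it leaves out the elastic net model and cross-validation. The AIS values and model scores are invented, and the function names `is_severe` and `auroc` are hypothetical.

```python
def is_severe(thorax_ais):
    """Reference-standard label: severe chest injury means thorax AIS > 2."""
    return thorax_ais > 2


def auroc(labels, scores):
    """AUROC as the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count half)."""
    positives = [s for y, s in zip(labels, scores) if y]
    negatives = [s for y, s in zip(labels, scores) if not y]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical thorax AIS values and model-predicted probabilities.
thorax_ais = [0, 1, 2, 3, 4, 2, 5, 1]
scores = [0.10, 0.20, 0.60, 0.70, 0.90, 0.30, 0.55, 0.40]
labels = [is_severe(a) for a in thorax_ais]
example_auroc = auroc(labels, scores)
print(f"AUROC = {example_auroc:.3f}")
```

The pairwise form is quadratic in the number of cases, which is fine for a small illustration; a sort-based implementation would be the usual choice for a corpus the size of the one in the study.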
Affiliation(s)
- Sujay Kulshrestha
- Burn and Shock Trauma Research Institute, Loyola University Chicago, CTRE Building 115, Room 315, 2160 South 1st Avenue, Maywood, IL, USA; Department of Surgery, Loyola University Medical Center, EMS Building 110, Room 3210, 2160 South 1st Avenue, Maywood, IL, USA
- Dmitriy Dligach
- Center for Health Outcomes and Informatics Research, Health Sciences Division, Loyola University Chicago, CTRE Building 115, Room 126, 2160 South 1st Avenue, Maywood, IL, USA; Department of Public Health Sciences, Stritch School of Medicine, Loyola University Chicago, 2160 South 1st Avenue, Maywood, IL, USA; Department of Computer Science, Loyola University Chicago, 1052 West Loyola Avenue, Chicago, IL, USA
- Cara Joyce
- Center for Health Outcomes and Informatics Research, Health Sciences Division, Loyola University Chicago, CTRE Building 115, Room 126, 2160 South 1st Avenue, Maywood, IL, USA; Department of Public Health Sciences, Stritch School of Medicine, Loyola University Chicago, 2160 South 1st Avenue, Maywood, IL, USA
- Marshall S. Baker
- Department of Surgery, Loyola University Medical Center, EMS Building 110, Room 3210, 2160 South 1st Avenue, Maywood, IL, USA; Edward Hines Jr. Veterans Affairs Hospital, 5000 South Fifth Avenue, Hines, IL, USA
- Richard Gonzalez
- Burn and Shock Trauma Research Institute, Loyola University Chicago, CTRE Building 115, Room 315, 2160 South 1st Avenue, Maywood, IL, USA; Department of Surgery, Loyola University Medical Center, EMS Building 110, Room 3210, 2160 South 1st Avenue, Maywood, IL, USA
- Ann P. O’Rourke
- Department of Surgery, University of Wisconsin, 600 Highland Avenue, MC 3236, Madison, WI, USA
- Joshua M. Glazer
- Department of Emergency Medicine, University of Wisconsin, 800 University Bay Drive, Suite 310, MC 9123, Madison, WI, USA
- Anne Stey
- Division of Trauma and Surgical Critical Care, Department of Surgery, Northwestern University, 76 North St. Clair Street, Suite 650, Chicago, IL, USA
- Jacqueline M. Kruser
- Division of Pulmonary and Critical Care, Department of Medicine, Northwestern University, 633 North St. Clair Street, 20th Floor, McGaw M-335, Chicago, IL, USA; Department of Medical Social Sciences, Northwestern University, 633 North St. Clair Street, 19th Floor, Chicago, IL, USA
- Matthew M. Churpek
- Department of Medicine, University of Wisconsin, 8007 Excelsior Drive, Madison, WI, USA
- Majid Afshar
- Center for Health Outcomes and Informatics Research, Health Sciences Division, Loyola University Chicago, CTRE Building 115, Room 126, 2160 South 1st Avenue, Maywood, IL, USA; Department of Health Informatics and Data Science, Loyola University Chicago, 2160 South First Avenue, Maywood, IL, USA
|
12
|
Kirubarajan A, Taher A, Khan S, Masood S. Artificial intelligence in emergency medicine: A scoping review. J Am Coll Emerg Physicians Open 2020; 1:1691-1702. [PMID: 33392578 PMCID: PMC7771825 DOI: 10.1002/emp2.12277] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Received: 04/18/2020] [Revised: 09/04/2020] [Accepted: 09/22/2020] [Indexed: 01/08/2023]
Abstract
INTRODUCTION Despite the growing investment in and adoption of artificial intelligence (AI) in medicine, the applications of AI in an emergency setting remain unclear. This scoping review seeks to identify available literature regarding the applications of AI in emergency medicine. METHODS The scoping review was conducted according to Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines for scoping reviews using Medline-OVID, EMBASE, CINAHL, and IEEE, with a double screening and extraction process. The search included articles published until February 28, 2020. Articles were excluded if they did not self-classify as studying an AI intervention, were not relevant to the emergency department (ED), or did not report outcomes or evaluation. RESULTS Of the 1483 original database citations, 395 were eligible for full-text evaluation. Of these articles, a total of 150 were included in the scoping review. The majority of included studies were retrospective in nature (n = 124, 82.7%), with only 3 (2.0%) prospective controlled trials. We found 37 (24.7%) interventions aimed at improving diagnosis within the ED. Among the 150 studies, 19 (12.7%) focused on diagnostic imaging within the ED. A total of 16 (10.7%) studies were conducted in the out-of-hospital environment (eg, emergency medical services, paramedics) with the remainder occurring either in the ED or the trauma bay. Of the 24 (16%) studies that had human comparators, there were 12 (8%) studies in which AI interventions outperformed clinicians in at least 1 measured outcome. CONCLUSION AI-related research is rapidly increasing in emergency medicine. There are several promising AI interventions that can improve emergency care, particularly for acute radiographic imaging and prediction-based diagnoses. Higher quality evidence is needed to further assess both short- and long-term clinical outcomes.
Affiliation(s)
- Abirami Kirubarajan
- Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada
- Institute of Health Policy Management and Evaluation, University of Toronto, Toronto, Ontario, Canada
- Ahmed Taher
- Division of Emergency Medicine, Department of Medicine, University of Toronto, Toronto, Ontario, Canada
- Shawn Khan
- Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada
- Sameer Masood
- Division of Emergency Medicine, Department of Medicine, University of Toronto, Toronto, Ontario, Canada
- Toronto General Hospital Research Institute, University Health Network, Toronto, Ontario, Canada
|
13
|
Patel VD, Garcia RM, Swor DE, Liotta EM, Maas MB, Naidech A. Natural History of Infratentorial Intracerebral Hemorrhages: Two Subgroups with Distinct Presentations and Outcomes. J Stroke Cerebrovasc Dis 2020; 29:104920. [PMID: 32423853 PMCID: PMC7375913 DOI: 10.1016/j.jstrokecerebrovasdis.2020.104920] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 02/09/2020] [Revised: 04/24/2020] [Accepted: 04/26/2020] [Indexed: 01/02/2023]
Abstract
BACKGROUND/OBJECTIVE Infratentorial intracerebral hemorrhage (ICH) is associated with worse prognosis than supratentorial ICH; however, infratentorial ICH is often excluded or underrepresented in clinical trials of ICH. We sought to evaluate the natural history of infratentorial ICH stratified by brainstem or cerebellar location using a prospective observational study inclusive of all spontaneous ICH. METHODS Using a prospective, single-center cohort of patients with spontaneous ICH between 2008 and 2019, we conducted a descriptive analysis of baseline demographics, severity of injury scores, and long-term functional outcomes of infratentorial ICH stratified by cerebellar or brainstem location. RESULTS Infratentorial ICH occurred in 82 (13%) of 632 patients in our ICH cohort. Among infratentorial ICH, cerebellar ICH occurred in 45 (55%) and brainstem ICH occurred in 37 (45%). Compared with patients with cerebellar ICH, patients with brainstem ICH had significantly worse severity of injury scores, including lower admission Glasgow Coma Scale (cerebellar median 14 [7.0 - 15.0] versus brainstem median 4 [3.0 - 8.0]; P < 0.001) and higher ICH Score (cerebellar median 2 [1.0 - 3.0] versus brainstem median 3 [2.75 - 4.0]; P = 0.02). Patients with cerebellar ICH were more likely to be discharged home or to acute rehabilitation (OR 4.8, 95% CI 1.8 - 12.8) but there was no difference in in-hospital mortality (OR 0.4, 95% CI 0.1 - 1.1, P = 0.08) or cause of death (P = 0.5). Modified Rankin Scale scores at 3 months were significantly better in patients with cerebellar ICH compared to brainstem ICH (median 3.5 [1.8 - 6.0] versus median 6 [5.0 - 6.0], P = 0.03). CONCLUSIONS Location of infratentorial ICH is an important determinant of admission severity and clinical outcome in unselected patients with ICH. Patients with cerebellar ICH have less severe symptoms at presentation and more favorable functional outcomes compared to patients with brainstem ICH.
Affiliation(s)
- Viren D Patel
- Department of Neurology, Northwestern University, 710 N. Lake Shore Drive, Suite 1105, Chicago, IL 60611, USA.
- Roxanna M Garcia
- Department of Neurosurgery, Northwestern University, Chicago, IL, USA.
- Dionne E Swor
- Department of Neurology, Northwestern University, 710 N. Lake Shore Drive, Suite 1105, Chicago, IL 60611, USA.
- Eric M Liotta
- Department of Neurology, Northwestern University, 710 N. Lake Shore Drive, Suite 1105, Chicago, IL 60611, USA.
- Matthew B Maas
- Department of Neurology, Northwestern University, 710 N. Lake Shore Drive, Suite 1105, Chicago, IL 60611, USA.
- Andrew Naidech
- Department of Neurology, Northwestern University, 710 N. Lake Shore Drive, Suite 1105, Chicago, IL 60611, USA.
|
14
|
Foreman B. Neurocritical Care: Bench to Bedside (Eds. Claude Hemphill, Michael James) Integrating and Using Big Data in Neurocritical Care. Neurotherapeutics 2020; 17:593-605. [PMID: 32152955 PMCID: PMC7283405 DOI: 10.1007/s13311-020-00846-1] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 12/19/2022]
Abstract
The critical care environment drives huge volumes of data, and clinicians are tasked with quickly processing this data and responding to it urgently. The neurocritical care environment increasingly involves EEG, multimodal intracranial monitoring, and complex imaging which preclude comprehensive human synthesis, and requires new concepts to integrate data into clinical care. By definition, Big Data is data that cannot be handled using traditional infrastructures and is characterized by the volume, variety, velocity, and variability of the data being produced. Big Data in the neurocritical care unit requires rethinking of data storage infrastructures and the development of tools and analytics to drive advancements in the field. Preprocessing, feature extraction, statistical inference, and analytic tools are required in order to achieve the primary goals of Big Data for clinical use: description, prediction, and prescription. Barriers to its use at bedside include a lack of infrastructure development within the healthcare industry, lack of standardization of data inputs, and ultimately existential and scientific concerns about the outputs that result from the use of tools such as artificial intelligence. However, as implied by the fundamental theorem of biomedical informatics, physicians remain central to the development and utility of Big Data to improve patient care.
Affiliation(s)
- Brandon Foreman
- Department of Neurology & Rehabilitation Medicine, University of Cincinnati Medical Center, 231 Albert Sabin Way, Cincinnati, OH, 45267-0517, USA.
- Collaborative for Research on Acute Neurological Injuries (CRANI), University of Cincinnati, Cincinnati, OH, USA.
|