1
Eguia H, Sánchez-Bocanegra CL, Vinciarelli F, Alvarez-Lopez F, Saigí-Rubió F. Clinical Decision Support and Natural Language Processing in Medicine: Systematic Literature Review. J Med Internet Res 2024; 26:e55315. [PMID: 39348889 DOI: 10.2196/55315] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2023] [Revised: 04/20/2024] [Accepted: 07/24/2024] [Indexed: 10/02/2024]
Abstract
BACKGROUND Ensuring access to accurate and verified information is essential for effective patient treatment and diagnosis. Although health workers rely on the internet for clinical data, there is a need for a more streamlined approach. OBJECTIVE This systematic review aims to assess the current state of artificial intelligence (AI) and natural language processing (NLP) techniques in health care to identify their potential use in electronic health records and automated information searches. METHODS A search was conducted in the PubMed, Embase, ScienceDirect, Scopus, and Web of Science online databases for articles published between January 2000 and April 2023. The only inclusion criteria were (1) original research articles and studies on the application of AI-based medical clinical decision support using NLP techniques and (2) publications in English. A Critical Appraisal Skills Programme tool was used to assess the quality of the studies. RESULTS The search yielded 707 articles, from which 26 studies were included (24 original articles and 2 systematic reviews). Of the evaluated articles, 21 (81%) explained the use of NLP as a source of data collection, 18 (69%) used electronic health records as a data source, and a further 8 (31%) were based on clinical data. Only 5 (19%) of the articles showed the use of combined strategies for NLP to obtain clinical data. In total, 16 (62%) articles presented stand-alone data review algorithms. Other studies (n=9, 35%) showed that the clinical decision support system alternative was also a way of displaying the information obtained for immediate clinical use. CONCLUSIONS The use of NLP engines can effectively improve clinical decision systems' accuracy, while biphasic tools combining AI algorithms and human criteria may optimize clinical diagnosis and treatment flows. TRIAL REGISTRATION PROSPERO CRD42022373386; https://www.crd.york.ac.uk/prospero/display_record.php?RecordID=373386.
Affiliation(s)
- Hans Eguia
  - SEMERGEN New Technologies Working Group, Madrid, Spain
  - Faculty of Health Sciences, Universitat Oberta de Catalunya (UOC), Barcelona, Spain
- Franco Vinciarelli
  - SEMERGEN New Technologies Working Group, Madrid, Spain
  - Emergency Hospital Clemente Álvarez, Rosario (Santa Fe), Argentina
- Francesc Saigí-Rubió
  - Faculty of Health Sciences, Universitat Oberta de Catalunya (UOC), Barcelona, Spain
2
Loscertales J, Abrisqueta-Costa P, Gutierrez A, Hernández-Rivas JÁ, Andreu-Lapiedra R, Mora A, Leiva-Farré C, López-Roda MD, Callejo-Mellén Á, Álvarez-García E, García-Marco JA. Real-World Evidence on the Clinical Characteristics and Management of Patients with Chronic Lymphocytic Leukemia in Spain Using Natural Language Processing: The SRealCLL Study. Cancers (Basel) 2023; 15:4047. [PMID: 37627075 PMCID: PMC10452602 DOI: 10.3390/cancers15164047] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/23/2023] [Revised: 08/04/2023] [Accepted: 08/07/2023] [Indexed: 08/27/2023]
Abstract
The SRealCLL study aimed to obtain real-world evidence on the clinical characteristics and treatment patterns of patients with chronic lymphocytic leukemia (CLL) using natural language processing (NLP). Electronic health records (EHRs) from seven Spanish hospitals (January 2016-December 2018) were analyzed using EHRead® technology, based on NLP and machine learning. A total of 534 CLL patients were assessed. No treatment was detected in 270 (50.6%) patients (watch-and-wait, W&W). First-line (1L) treatment was identified in 230 (43.1%) patients and relapsed/refractory (2L) treatment was identified in 58 (10.9%). The median age ranged from 71 to 75 years, with a uniform male predominance (54.8-63.8%). The main comorbidities included hypertension (W&W: 35.6%; 1L: 38.3%; 2L: 39.7%), diabetes mellitus (W&W: 24.4%; 1L: 24.3%; 2L: 31%), cardiac arrhythmia (W&W: 16.7%; 1L: 17.8%; 2L: 17.2%), heart failure (W&W: 16.3%; 1L: 17.4%; 2L: 17.2%), and dyslipidemia (W&W: 13.7%; 1L: 18.7%; 2L: 19.0%). The most common antineoplastic treatment was ibrutinib in 1L (64.8%) and 2L (62.1%), followed by bendamustine + rituximab (12.6%), obinutuzumab + chlorambucil (5.2%), rituximab + chlorambucil (4.8%), and idelalisib + rituximab (3.9%) in 1L and venetoclax (15.5%), idelalisib + rituximab (6.9%), bendamustine + rituximab (3.5%), and venetoclax + rituximab (3.5%) in 2L. This study expands the information available on patients with CLL in Spain, describing the diversity in patient characteristics and therapeutic approaches in clinical practice.
Affiliation(s)
- Javier Loscertales
  - Hematology Department, Hospital Universitario de la Princesa, Calle de Diego de León 62, 28006 Madrid, Spain
- Pau Abrisqueta-Costa
  - Hematology Department, Hospital Universitari Vall d’Hebron, Pg de la vall d’Hebron 199, 08035 Barcelona, Spain
- Antonio Gutierrez
  - Hematology Department, Hospital Son Espases/IdISBa, Carretera de Valldemossa 79, 07120 Palma de Mallorca, Spain
- José Ángel Hernández-Rivas
  - Hematology Department, Hospital Universitario Infanta Leonor, Avda. Gran Vía del Este 80, 28031 Madrid, Spain
- Rafael Andreu-Lapiedra
  - Hematology Department, Hospital Universitario La Fe, Avinguda de Fernando Abril Martorell 106, 46026 Valencia, Spain
- Alba Mora
  - Hematology Department, Hospital de la Santa Creu i Sant Pau, Calle de St. Antoni Maria Claret 167, 08025 Barcelona, Spain
- Carolina Leiva-Farré
  - Medical Department, Astrazeneca Farmacéutica Spain S.A., Calle del Puerto de Somport 21, 28050 Madrid, Spain
- María Dolores López-Roda
  - Medical Department, Astrazeneca Farmacéutica Spain S.A., Calle del Puerto de Somport 21, 28050 Madrid, Spain
- Ángel Callejo-Mellén
  - Medical Department, Astrazeneca Farmacéutica Spain S.A., Calle del Puerto de Somport 21, 28050 Madrid, Spain
- Esther Álvarez-García
  - Medical Department, Astrazeneca Farmacéutica Spain S.A., Calle del Puerto de Somport 21, 28050 Madrid, Spain
- José Antonio García-Marco
  - Hematology Department, Hospital Universitario Puerta de Hierro-Majadahonda, Calle Joaquín Rodrigo 1, 28222 Majadahonda, Spain
3
Natural language processing for the surveillance of postoperative venous thromboembolism. Surgery 2021; 170:1175-1182. [PMID: 34090671 DOI: 10.1016/j.surg.2021.04.027] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 02/11/2021] [Revised: 04/07/2021] [Accepted: 04/20/2021] [Indexed: 11/20/2022]
Abstract
BACKGROUND The objective of the study was to develop a portable natural language processing approach to aid in the identification of postoperative venous thromboembolism events from free-text clinical notes. METHODS We abstracted clinical notes from 25,494 operative events from 2 independent health care systems. A venous thromboembolism detected as part of the American College of Surgeons National Surgical Quality Improvement Program was used as the reference standard. A natural language processing engine, easy clinical information extractor-pulmonary embolism/deep vein thrombosis (EasyCIE-PEDVT), was trained to detect pulmonary embolism and deep vein thrombosis from clinical notes. International Classification of Diseases discharge diagnosis codes for venous thromboembolism were used as baseline comparators. The classification performance of EasyCIE-PEDVT was compared with International Classification of Diseases codes using sensitivity, specificity, and area under the receiver operating characteristic curve, in an internal and an external validation cohort. RESULTS To detect pulmonary embolism, EasyCIE-PEDVT had a sensitivity of 0.714 and 0.815 in internal and external validation, respectively. To detect deep vein thrombosis, EasyCIE-PEDVT had a sensitivity of 0.846 and 0.849 in internal and external validation, respectively. EasyCIE-PEDVT had significantly higher discrimination for deep vein thrombosis compared with International Classification of Diseases codes in internal validation (area under the receiver operating characteristic curve: 0.920 vs 0.761; P < .001) and external validation (area under the receiver operating characteristic curve: 0.921 vs 0.794; P < .001). There was no significant difference in the discrimination for pulmonary embolism between EasyCIE-PEDVT and International Classification of Diseases codes.
CONCLUSION Accurate surveillance of postoperative venous thromboembolism may be achieved using natural language processing on clinical notes in 2 independent health care systems. These findings suggest natural language processing may augment manual chart abstraction for large registries such as National Surgical Quality Improvement Program.
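The metrics named in this abstract can be illustrated with a minimal sketch. It assumes both the NLP output and the diagnosis codes are treated as yes/no flags per operative event; in that case the ROC curve has a single operating point and the area under it equals the mean of sensitivity and specificity. The toy labels and all function names are hypothetical, and the study may have computed its values differently.

```python
from typing import Sequence, Tuple


def confusion_counts(truth: Sequence[int], pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(truth, pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(truth, pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def sensitivity(tp: int, fn: int) -> float:
    """Share of true events that the classifier flagged."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Share of non-events that the classifier left unflagged."""
    return tn / (tn + fp)


def binary_auc(sens: float, spec: float) -> float:
    """AUC of a classifier with a single yes/no operating point."""
    return (sens + spec) / 2


# Hypothetical toy data: 1 = event in the reference standard / flagged by the tool.
reference = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
flagged = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

tp, fp, tn, fn = confusion_counts(reference, flagged)
sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
auc = binary_auc(sens, spec)
```

Because the formula weights sensitivity and specificity equally, a rare outcome such as venous thromboembolism can show high specificity for both methods while the AUC gap is driven almost entirely by sensitivity.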
4
Afshar M, Dligach D, Sharma B, Cai X, Boyda J, Birch S, Valdez D, Zelisko S, Joyce C, Modave F, Price R. Development and application of a high throughput natural language processing architecture to convert all clinical documents in a clinical data warehouse into standardized medical vocabularies. J Am Med Inform Assoc 2019; 26:1364-1369. [PMID: 31145455 DOI: 10.1093/jamia/ocz068] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 01/15/2019] [Revised: 04/18/2019] [Accepted: 04/24/2019] [Indexed: 12/23/2022]
Abstract
OBJECTIVE Natural language processing (NLP) engines such as the clinical Text Analysis and Knowledge Extraction System are a solution for processing notes for research, but optimizing their performance for a clinical data warehouse remains a challenge. We aim to develop a high throughput NLP architecture using the clinical Text Analysis and Knowledge Extraction System and present a predictive model use case. MATERIALS AND METHODS The clinical data warehouse (CDW) comprised 1 103 038 patients across 10 years. The architecture was constructed using the Hadoop data repository for source data and 3 large-scale symmetric processing servers for NLP. Each named entity mention in a clinical document was mapped to the Unified Medical Language System concept unique identifier (CUI). RESULTS The NLP architecture processed 83 867 802 clinical documents in 13.33 days and produced 37 721 886 606 CUIs across 8 standardized medical vocabularies. Performance of the architecture exceeded 500 000 documents per hour across 30 parallel instances of the clinical Text Analysis and Knowledge Extraction System including 10 instances dedicated to documents greater than 20 000 bytes. In a use-case example for predicting 30-day hospital readmission, a CUI-based model had similar discrimination to n-grams with an area under the curve receiver operating characteristic of 0.75 (95% CI, 0.74-0.76). DISCUSSION AND CONCLUSION Our health system's high throughput NLP architecture may serve as a benchmark for large-scale clinical research using a CUI-based approach.
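The throughput figures in this abstract can be cross-checked with simple arithmetic. The sketch below uses only the numbers reported above; the function names are hypothetical. Averaged over the whole 13.33-day run, the rate comes to roughly 262 000 documents per hour and about 450 CUIs per document, so the "exceeded 500 000 documents per hour" figure presumably describes peak rather than sustained throughput. That reading is an assumption, not something the abstract states.

```python
DOCUMENTS = 83_867_802   # clinical documents processed (from the abstract)
CUIS = 37_721_886_606    # concept unique identifiers produced (from the abstract)
RUN_DAYS = 13.33         # total processing time in days (from the abstract)


def average_docs_per_hour(documents: int, days: float) -> float:
    """Mean documents processed per hour over the whole run."""
    return documents / (days * 24)


def average_cuis_per_document(cuis: int, documents: int) -> float:
    """Mean number of CUIs emitted per document."""
    return cuis / documents


docs_per_hour = average_docs_per_hour(DOCUMENTS, RUN_DAYS)
cuis_per_doc = average_cuis_per_document(CUIS, DOCUMENTS)
```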
Affiliation(s)
- Majid Afshar
  - Center for Health Outcomes and Informatics Research, Health Sciences Division, Loyola University Chicago, Maywood, Illinois, USA; Department of Public Health Sciences, Stritch School of Medicine, Loyola University Chicago, Maywood, Illinois, USA
- Dmitriy Dligach
  - Center for Health Outcomes and Informatics Research, Health Sciences Division, Loyola University Chicago, Maywood, Illinois, USA; Department of Public Health Sciences, Stritch School of Medicine, Loyola University Chicago, Maywood, Illinois, USA; Department of Computer Science, Loyola University, Chicago, Illinois, USA
- Brihat Sharma
  - Department of Computer Science, Loyola University, Chicago, Illinois, USA
- Xiaoyuan Cai
  - Informatics and Systems Development, Health Sciences Division, Loyola University Chicago, Maywood, Illinois, USA
- Jason Boyda
  - Informatics and Systems Development, Health Sciences Division, Loyola University Chicago, Maywood, Illinois, USA
- Steven Birch
  - Informatics and Systems Development, Health Sciences Division, Loyola University Chicago, Maywood, Illinois, USA
- Daniel Valdez
  - Informatics and Systems Development, Health Sciences Division, Loyola University Chicago, Maywood, Illinois, USA
- Suzan Zelisko
  - Informatics and Systems Development, Health Sciences Division, Loyola University Chicago, Maywood, Illinois, USA
- Cara Joyce
  - Center for Health Outcomes and Informatics Research, Health Sciences Division, Loyola University Chicago, Maywood, Illinois, USA; Department of Public Health Sciences, Stritch School of Medicine, Loyola University Chicago, Maywood, Illinois, USA
- François Modave
  - Center for Health Outcomes and Informatics Research, Health Sciences Division, Loyola University Chicago, Maywood, Illinois, USA; Department of Public Health Sciences, Stritch School of Medicine, Loyola University Chicago, Maywood, Illinois, USA
- Ron Price
  - Center for Health Outcomes and Informatics Research, Health Sciences Division, Loyola University Chicago, Maywood, Illinois, USA; Informatics and Systems Development, Health Sciences Division, Loyola University Chicago, Maywood, Illinois, USA
5
Izquierdo JL, Ancochea J, Soriano JB. Clinical Characteristics and Prognostic Factors for Intensive Care Unit Admission of Patients With COVID-19: Retrospective Study Using Machine Learning and Natural Language Processing. J Med Internet Res 2020; 22:e21801. [PMID: 33090964 PMCID: PMC7595750 DOI: 10.2196/21801] [Citation(s) in RCA: 69] [Impact Index Per Article: 17.3] [Received: 06/29/2020] [Revised: 07/28/2020] [Accepted: 10/20/2020] [Indexed: 12/13/2022]
Abstract
BACKGROUND Many factors involved in the onset and clinical course of the ongoing COVID-19 pandemic are still unknown. Although big data analytics and artificial intelligence are widely used in the realms of health and medicine, researchers are only beginning to use these tools to explore the clinical characteristics and predictive factors of patients with COVID-19. OBJECTIVE Our primary objectives are to describe the clinical characteristics and determine the factors that predict intensive care unit (ICU) admission of patients with COVID-19. Determining these factors using a well-defined population can increase our understanding of the real-world epidemiology of the disease. METHODS We used a combination of classic epidemiological methods, natural language processing (NLP), and machine learning (for predictive modeling) to analyze the electronic health records (EHRs) of patients with COVID-19. We explored the unstructured free text in the EHRs within the Servicio de Salud de Castilla-La Mancha (SESCAM) Health Care Network (Castilla-La Mancha, Spain) from the entire population with available EHRs (1,364,924 patients) from January 1 to March 29, 2020. We extracted related clinical information regarding diagnosis, progression, and outcome for all COVID-19 cases. RESULTS A total of 10,504 patients with a clinical or polymerase chain reaction-confirmed diagnosis of COVID-19 were identified; 5519 (52.5%) were male, with a mean age of 58.2 years (SD 19.7). Upon admission, the most common symptoms were cough, fever, and dyspnea; however, all three symptoms occurred in fewer than half of the cases. Overall, 6.1% (83/1353) of hospitalized patients required ICU admission. 
Using a machine-learning, data-driven algorithm, we identified that a combination of age, fever, and tachypnea was the most parsimonious predictor of ICU admission; patients younger than 56 years, without tachypnea, and temperature <39 °C (or >39 °C without respiratory crackles) were not admitted to the ICU. In contrast, patients with COVID-19 aged 40 to 79 years were likely to be admitted to the ICU if they had tachypnea and delayed their visit to the emergency department after being seen in primary care. CONCLUSIONS Our results show that a combination of easily obtainable clinical variables (age, fever, and tachypnea with or without respiratory crackles) predicts whether patients with COVID-19 will require ICU admission.
6
Combining Natural Language Processing of Electronic Medical Notes With Administrative Data to Determine Racial/Ethnic Differences in the Disclosure and Documentation of Military Sexual Trauma in Veterans. Med Care 2019; 57 Suppl 6 Suppl 2:S149-S156. [PMID: 31095054 DOI: 10.1097/mlr.0000000000001031] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 10/26/2022]
Abstract
BACKGROUND Despite national screening efforts, military sexual trauma (MST) is underreported. Little is known of racial/ethnic differences in MST reporting in the Veterans Health Administration (VHA). OBJECTIVE This study aimed to compare patterns of MST disclosure in VHA by race/ethnicity. RESEARCH DESIGN Retrospective cohort study of MST disclosures in a national, random sample of Veterans who served in Afghanistan and Iraq and completed MST screens from October 2009 to 2014. We used natural language processing (NLP) to extract MST concepts from electronic medical notes in the year following Veterans' first MST screen. MEASURE(S) Any evidence of MST (positive MST screen or NLP concepts) and late MST disclosure (NLP concepts following a negative MST screen). Multivariable logistic regressions, stratified by sex, tested racial/ethnic differences in any MST evidence, and late disclosure. RESULTS Of 6618 male and 6716 female Veterans with MST screen results, 1473 had a positive screen (68 male, 1%; 1405 female, 21%). Of those with a negative screen, 257 evidenced late MST disclosure by NLP (44 male, 39%; 213 female, 13%). Late MST disclosure was usually documented during mental health visits. There were no significant racial/ethnic differences in MST disclosure among men. Among women, blacks were less likely than whites to have any MST evidence (adjusted odds ratio=0.75). In the subsample with any MST evidence, black and Hispanic women were more likely than whites to disclose MST late (adjusted odds ratio=1.89 and 1.59, respectively). CONCLUSIONS Combining NLP results with MST screen data facilitated the identification of under-reported sexual trauma experiences among men and racial/ethnic minority women.
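The percentages in the RESULTS sentence use two different denominators, which a short calculation makes visible. The sketch below uses only the counts reported above; the helper name `pct` is hypothetical. The positive-screen percentages match a denominator of all screened Veterans of that sex, whereas the late-disclosure percentages (39% and 13%) match a denominator of Veterans with any MST evidence (positive screen plus late disclosure), not those with a negative screen. This is an inference from the arithmetic, not a statement made in the abstract.

```python
def pct(part: int, whole: int) -> float:
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Counts taken from the abstract.
screened = {"male": 6618, "female": 6716}
positive_screen = {"male": 68, "female": 1405}
late_disclosure = {"male": 44, "female": 213}

for sex in ("male", "female"):
    any_evidence = positive_screen[sex] + late_disclosure[sex]
    negative_screen = screened[sex] - positive_screen[sex]
    print(
        sex,
        "positive screen:", pct(positive_screen[sex], screened[sex]),
        "| late among any evidence:", pct(late_disclosure[sex], any_evidence),
        "| late among negative screens:", pct(late_disclosure[sex], negative_screen),
    )
```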
7
Shi J, Hurdle JF. Trie-based rule processing for clinical NLP: A use-case study of n-trie, making the ConText algorithm more efficient and scalable. J Biomed Inform 2018; 85:106-113. [PMID: 30092358 DOI: 10.1016/j.jbi.2018.08.002] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 03/07/2018] [Revised: 07/19/2018] [Accepted: 08/05/2018] [Indexed: 11/19/2022]
Abstract
OBJECTIVE To develop and evaluate an efficient Trie structure for large-scale, rule-based clinical natural language processing (NLP), which we call n-trie. BACKGROUND Despite the popularity of machine learning techniques in natural language processing, rule-based systems boast important advantages: distinctive transparency, ease of incorporating external knowledge, and less demanding annotation requirements. However, processing efficiency remains a major obstacle for adopting standard rule-based NLP solutions in big data analyses. METHODS We developed n-trie to specifically address the token-based nature of context detection, an important facet of clinical NLP that is known to slow down NLP pipelines. N-trie, a new rule processing engine using a revised Trie structure, allows fast execution of lexicon-based NLP rules. To determine its applicability and evaluate its performance, we applied the n-trie engine in an implementation (called FastContext) of the ConText algorithm and compared its processing speed and accuracy with JavaConText and GeneralConText, two widely used Java ConText implementations, as well as with a standalone machine learning NegEx implementation, NegScope. RESULTS The n-trie engine ran two orders of magnitude faster and was far less sensitive to rule set size than the comparison implementations, and it proved faster than the best machine learning negation detector. Additionally, the engine consistently gained accuracy improvement as the rule set increased (the desired outcome of adding new rules), while the other implementations did not. CONCLUSIONS The n-trie engine is an efficient, scalable engine to support NLP rule processing and shows the potential for application in other NLP tasks beyond context detection.
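The general idea of trie-based rule lookup can be sketched as follows. This is a plain token trie with longest-match scanning, not the authors' n-trie or FastContext; the class name, rule phrases, and labels are all hypothetical. It illustrates why such a structure is relatively insensitive to rule set size: each input token costs one dictionary lookup per trie level, no matter how many rules are loaded.

```python
from typing import Dict, List, Optional, Tuple


class TokenTrie:
    """Trie keyed by lowercase tokens; a node may carry a rule label."""

    def __init__(self) -> None:
        self.children: Dict[str, "TokenTrie"] = {}
        self.label: Optional[str] = None

    def add_rule(self, phrase: str, label: str) -> None:
        """Insert a whitespace-tokenized trigger phrase with its label."""
        node = self
        for token in phrase.lower().split():
            node = node.children.setdefault(token, TokenTrie())
        node.label = label

    def find_all(self, tokens: List[str]) -> List[Tuple[int, int, str]]:
        """Return (start, end, label) for the longest rule match at each start."""
        matches: List[Tuple[int, int, str]] = []
        i = 0
        while i < len(tokens):
            node = self
            best: Optional[Tuple[int, int, str]] = None
            j = i
            # Walk the trie as far as the input tokens allow.
            while j < len(tokens) and tokens[j].lower() in node.children:
                node = node.children[tokens[j].lower()]
                j += 1
                if node.label is not None:
                    best = (i, j, node.label)  # remember the longest match so far
            if best is not None:
                matches.append(best)
                i = best[1]  # resume scanning after the matched phrase
            else:
                i += 1
        return matches


# Hypothetical ConText-style trigger phrases.
trie = TokenTrie()
trie.add_rule("no", "negated")
trie.add_rule("no evidence of", "negated")
trie.add_rule("history of", "historical")

sentence = "No evidence of pulmonary embolism ; history of DVT".split()
found = trie.find_all(sentence)
```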
Affiliation(s)
- Jianlin Shi
  - Department of Biomedical Informatics, University of Utah, Salt Lake City, UT, USA
- John F Hurdle
  - Department of Biomedical Informatics, University of Utah, Salt Lake City, UT, USA
8
Johnson SB, Adekkanattu P, Campion TR, Flory J, Pathak J, Patterson OV, DuVall SL, Major V, Aphinyanaphongs Y. From Sour Grapes to Low-Hanging Fruit: A Case Study Demonstrating a Practical Strategy for Natural Language Processing Portability. AMIA Joint Summits on Translational Science Proceedings 2018; 2017:104-112. [PMID: 29888051 PMCID: PMC5961788] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/08/2023]
Abstract
Natural Language Processing (NLP) holds potential for patient care and clinical research, but a gap exists between promise and reality. While some studies have demonstrated portability of NLP systems across multiple sites, challenges remain. Strategies to mitigate these challenges can strive for complex NLP problems using advanced methods (hard-to-reach fruit), or focus on simple NLP problems using practical methods (low-hanging fruit). This paper investigates a practical strategy for NLP portability using extraction of left ventricular ejection fraction (LVEF) as a use case. We used a tool developed at the Department of Veterans Affairs (VA) to extract the LVEF values from free-text echocardiograms in the MIMIC-III database. The approach showed an accuracy of 98.4%, sensitivity of 99.4%, a positive predictive value of 98.7%, and F-score of 99.0%. This experience, in which a simple NLP solution proved highly portable with excellent performance, illustrates the point that simple NLP applications may be easier to disseminate and adapt, and in the short term may prove more useful, than complex applications.
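To make the "low-hanging fruit" idea concrete, here is a minimal sketch of rule-based LVEF extraction with a single regular expression. It is not the VA tool used in the study; the pattern, the function names, and the example sentences are hypothetical and cover only a few common phrasings. The second function checks that the reported F-score is consistent with the reported positive predictive value and sensitivity, using the standard harmonic-mean definition.

```python
import re
from typing import Optional, Tuple

# Hypothetical pattern: an LVEF keyword, up to 30 non-digit characters,
# then a percentage or a percentage range such as "30-35 %" or "60 to 65%".
LVEF_PATTERN = re.compile(
    r"(?:LVEF|ejection fraction|\bEF\b)\D{0,30}?(\d{1,2})"
    r"(?:\s*(?:-|to)\s*(\d{1,2}))?\s*%",
    re.IGNORECASE,
)


def extract_lvef(text: str) -> Optional[Tuple[int, int]]:
    """Return (low, high) LVEF percent from the first match, or None."""
    match = LVEF_PATTERN.search(text)
    if match is None:
        return None
    low = int(match.group(1))
    high = int(match.group(2)) if match.group(2) else low
    return low, high


def f_score(ppv: float, sensitivity: float) -> float:
    """Harmonic mean of positive predictive value and sensitivity (F1)."""
    return 2 * ppv * sensitivity / (ppv + sensitivity)


# Consistency check on the figures reported in the abstract.
reported_f = round(100 * f_score(0.987, 0.994), 1)
```

A single pattern like this will miss many real-world phrasings; the abstract's point is that even simple rules can transfer well between sites when the target concept is narrowly defined.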
Affiliation(s)
- Stephen B Johnson
  - Healthcare Policy and Research, Weill Cornell Medicine, New York, New York
- Prakash Adekkanattu
  - Information Technologies & Services, Weill Cornell Medicine, New York, New York
- Thomas R Campion
  - Healthcare Policy and Research, Weill Cornell Medicine, New York, New York
  - Information Technologies & Services, Weill Cornell Medicine, New York, New York
- James Flory
  - Healthcare Policy and Research, Weill Cornell Medicine, New York, New York
- Jyotishman Pathak
  - Healthcare Policy and Research, Weill Cornell Medicine, New York, New York
- Olga V Patterson
  - VA Salt Lake City Health Care System
  - University of Utah, Salt Lake City, UT
- Scott L DuVall
  - VA Salt Lake City Health Care System
  - University of Utah, Salt Lake City, UT
- Vincent Major
  - Center for Health Informatics and Bioinformatics, NYU Langone Medical Center, New York, New York
- Yindalon Aphinyanaphongs
  - Center for Health Informatics and Bioinformatics, NYU Langone Medical Center, New York, New York
9
Skelton F, Campbell B, Horwitz D, Krein S, Sales A, Gundlapalli A, Trautner BW. Developing a user-friendly report for electronically assisted surveillance of catheter-associated urinary tract infection. Am J Infect Control 2017; 45:572-574. [PMID: 28456323 PMCID: PMC7499359 DOI: 10.1016/j.ajic.2016.09.014] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 06/24/2016] [Revised: 09/13/2016] [Accepted: 09/13/2016] [Indexed: 11/22/2022]
Abstract
Catheter-associated urinary tract infection (CAUTI) surveillance is labor intensive, generally involving manual medical record review. We developed a prototype automated report through iterative design. Surveys and qualitative interviews were administered to key stakeholders to assess the report design. We found that different provider types expressed different needs regarding report content and format. Therefore, determining the primary audience for reporting data on CAUTI a priori is critical to developing useful reports, particularly as this process becomes standardized and automated.
Affiliation(s)
- Felicia Skelton
  - Centers for Innovations in Quality, Effectiveness and Safety, Michael E. DeBakey Veteran Affairs Medical Center, Houston, TX; Baylor College of Medicine, Houston, TX
- Bryan Campbell
  - Centers for Innovations in Quality, Effectiveness and Safety, Michael E. DeBakey Veteran Affairs Medical Center, Houston, TX
- Deborah Horwitz
  - Centers for Innovations in Quality, Effectiveness and Safety, Michael E. DeBakey Veteran Affairs Medical Center, Houston, TX; Baylor College of Medicine, Houston, TX
- Sarah Krein
  - Veteran Affairs Ann Arbor Healthcare System, Ann Arbor, MI; University of Michigan, Ann Arbor, MI
- Anne Sales
  - Veteran Affairs Ann Arbor Healthcare System, Ann Arbor, MI; University of Michigan, Ann Arbor, MI
- Adi Gundlapalli
  - University of Utah, Salt Lake City, UT; Veteran Affairs Salt Lake City Health Care System, Salt Lake City, UT
- Barbara W Trautner
  - Centers for Innovations in Quality, Effectiveness and Safety, Michael E. DeBakey Veteran Affairs Medical Center, Houston, TX; Baylor College of Medicine, Houston, TX
10
Divita G, Carter ME, Tran LT, Redd D, Zeng QT, Duvall S, Samore MH, Gundlapalli AV. v3NLP Framework: Tools to Build Applications for Extracting Concepts from Clinical Text. EGEMS (Washington, DC) 2016; 4:1228. [PMID: 27683667 PMCID: PMC5019303 DOI: 10.13063/2327-9214.1228] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Indexed: 12/01/2022]
Abstract
INTRODUCTION Substantial amounts of clinically significant information are contained only within the narrative of the clinical notes in electronic medical records. The v3NLP Framework is a set of "best-of-breed" functionalities developed to transform this information into structured data for use in quality improvement, research, population health surveillance, and decision support. BACKGROUND MetaMap, cTAKES and similar well-known natural language processing (NLP) tools do not have sufficient scalability out of the box. The v3NLP Framework evolved out of the necessity to scale these tools up and provide a framework to customize and tune techniques that fit a variety of tasks, including document classification, tuned concept extraction for specific conditions, patient classification, and information retrieval. INNOVATION Beyond scalability, several v3NLP Framework-developed projects have been efficacy tested and benchmarked. While v3NLP Framework includes annotators, pipelines and applications, its functionalities enable developers to create novel annotators and to place annotators into pipelines and scaled applications. DISCUSSION The v3NLP Framework has been successfully utilized in many projects including general concept extraction, risk factors for homelessness among veterans, and identification of mentions of the presence of an indwelling urinary catheter. Projects as diverse as predicting colonization with methicillin-resistant Staphylococcus aureus and extracting references to military sexual trauma are being built using v3NLP Framework components. CONCLUSION The v3NLP Framework is a set of functionalities and components that provide Java developers with the ability to create novel annotators and to place those annotators into pipelines and applications to extract concepts from clinical text. There are scale-up and scale-out functionalities to process large numbers of records.
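The annotator-and-pipeline pattern described in this abstract can be sketched generically. The v3NLP Framework itself is a Java toolkit; the Python below does not use or imitate its API, and every name in it (`Annotation`, `Document`, `keyword_annotator`, `run_pipeline`, the labels and patterns) is hypothetical. It shows only the general shape: each annotator adds labeled spans to a shared document, and a pipeline is an ordered list of annotators.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Annotation:
    start: int   # character offset where the span begins
    end: int     # character offset just past the span
    label: str   # concept label assigned by an annotator


@dataclass
class Document:
    text: str
    annotations: List[Annotation] = field(default_factory=list)


# An annotator is any callable that adds annotations to a document in place.
Annotator = Callable[[Document], None]


def keyword_annotator(label: str, pattern: str) -> Annotator:
    """Build an annotator that labels every case-insensitive regex match."""
    compiled = re.compile(pattern, re.IGNORECASE)

    def annotate(doc: Document) -> None:
        for match in compiled.finditer(doc.text):
            doc.annotations.append(Annotation(match.start(), match.end(), label))

    return annotate


def run_pipeline(doc: Document, annotators: List[Annotator]) -> Document:
    """Apply the annotators in order and return the annotated document."""
    for annotate in annotators:
        annotate(doc)
    return doc


note = run_pipeline(
    Document("Foley catheter in place. No fever."),
    [
        keyword_annotator("urinary_catheter", r"\bfoley catheter\b"),
        keyword_annotator("symptom", r"\bfever\b"),
    ],
)
```

Keeping annotators as independent callables is what makes scale-out straightforward in this style: documents share no state, so they can be distributed across workers.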
Affiliation(s)
- Guy Divita
  - VA Salt Lake City Health Care System and University of Utah School of Medicine
- Marjorie E Carter
  - VA Salt Lake City Health Care System and University of Utah School of Medicine
- Le-Thuy Tran
  - VA Salt Lake City Health Care System and University of Utah School of Medicine
- Doug Redd
  - VA Salt Lake City Health Care System and University of Utah School of Medicine
- Qing T Zeng
  - VA Salt Lake City Health Care System and University of Utah School of Medicine
- Scott Duvall
  - VA Salt Lake City Health Care System and University of Utah School of Medicine
- Matthew H Samore
  - VA Salt Lake City Health Care System and University of Utah School of Medicine
- Adi V Gundlapalli
  - VA Salt Lake City Health Care System and University of Utah School of Medicine
11
Kaggal VC, Elayavilli RK, Mehrabi S, Pankratz JJ, Sohn S, Wang Y, Li D, Rastegar MM, Murphy SP, Ross JL, Chaudhry R, Buntrock JD, Liu H. Toward a Learning Health-care System - Knowledge Delivery at the Point of Care Empowered by Big Data and NLP. Biomedical Informatics Insights 2016; 8:13-22. [PMID: 27385912 PMCID: PMC4920204 DOI: 10.4137/bii.s37977] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.1] [Received: 01/11/2016] [Revised: 03/20/2016] [Accepted: 03/29/2016] [Indexed: 11/24/2022]
Abstract
The concept of optimizing health care by understanding and generating knowledge from previous evidence, ie, the Learning Health-care System (LHS), has gained momentum and now has national prominence. Meanwhile, the rapid adoption of electronic health records (EHRs) enables the data collection required to form the basis for facilitating LHS. A prerequisite for using EHR data within the LHS is an infrastructure that enables access to EHR data longitudinally for health-care analytics and real time for knowledge delivery. Additionally, significant clinical information is embedded in the free text, making natural language processing (NLP) an essential component in implementing an LHS. Herein, we share our institutional implementation of a big data-empowered clinical NLP infrastructure, which not only enables health-care analytics but also has real-time NLP processing capability. The infrastructure has been utilized for multiple institutional projects including the MayoExpertAdvisor, an individualized care recommendation solution for clinical care. We compared the advantages of big data over two other environments. Big data infrastructure significantly outperformed other infrastructure in terms of computing speed, demonstrating its value in making the LHS a possibility in the near future.
Affiliation(s)
- Vinod C Kaggal
  - Division of Information Management and Analytics, Mayo Clinic, Rochester, MN, USA; Biomedical Informatics and Computational Biology, University of Minnesota, Rochester, MN, USA
- Saeed Mehrabi
  - Department of Health Sciences Research, Mayo Clinic, Rochester, MN, USA
- Joshua J Pankratz
  - Division of Information Management and Analytics, Mayo Clinic, Rochester, MN, USA
- Sunghwan Sohn
  - Department of Health Sciences Research, Mayo Clinic, Rochester, MN, USA
- Yanshan Wang
  - Department of Health Sciences Research, Mayo Clinic, Rochester, MN, USA
- Dingcheng Li
  - Department of Health Sciences Research, Mayo Clinic, Rochester, MN, USA
- Sean P Murphy
  - Division of Information Management and Analytics, Mayo Clinic, Rochester, MN, USA
- Jason L Ross
  - Division of Information Management and Analytics, Mayo Clinic, Rochester, MN, USA
- James D Buntrock
  - Division of Information Management and Analytics, Mayo Clinic, Rochester, MN, USA
- Hongfang Liu
  - Department of Health Sciences Research, Mayo Clinic, Rochester, MN, USA
12
Abstract
This editorial is part of the Focus Theme of Methods of Information in Medicine on "Big Data and Analytics in Healthcare". The amount of data being generated in the healthcare industry is growing at a rapid rate. This has generated immense interest in leveraging the availability of healthcare data (and "big data") to improve health outcomes and reduce costs. However, the nature of healthcare data, and especially big data, presents unique challenges in processing and analyzing big data in healthcare. This Focus Theme aims to disseminate some novel approaches to address these challenges. More specifically, approaches ranging from efficient methods of processing large clinical data to predictive models that could generate better predictions from healthcare data are presented.
Affiliation(s)
- S S-L Tan
  - Sharon Swee-Lin Tan, Centre for Health Informatics, Department of Information Systems, National University of Singapore, Singapore