1
Berge GT, Granmo OC, Tveit TO, Ruthjersen AL, Sharma J. Combining unsupervised, supervised and rule-based learning: the case of detecting patient allergies in electronic health records. BMC Med Inform Decis Mak 2023; 23:188. PMID: 37723446; PMCID: PMC10507898; DOI: 10.1186/s12911-023-02271-8.
Abstract
BACKGROUND Data mining of electronic health records (EHRs) has huge potential for improving clinical decision support and for helping healthcare deliver precision medicine. Unfortunately, the rule-based and machine learning-based approaches used for natural language processing (NLP) in healthcare today all struggle with shortcomings related to performance, efficiency, or transparency. METHODS In this paper, we address these issues by presenting a novel method for NLP that combines unsupervised learning of word embeddings, semi-supervised learning for simplified and accelerated building of clinical vocabularies and concepts, and deterministic rules for fine-grained control of information extraction. The clinical language is learnt automatically, and vocabularies, concepts, and rules supporting a variety of downstream NLP tasks can then be built with only minimal manual feature engineering and tagging from clinical experts. Together, these steps create an open processing pipeline that gradually refines the data in a transparent way, which greatly improves the interpretability of the method. Data transformations are thus made transparent and predictions interpretable, which is imperative in healthcare. The combined method has further advantages: it is potentially language independent, demands few domain resources for maintenance, and can handle misspellings, abbreviations, and acronyms. To test and evaluate the combined method, we developed a clinical decision support system (CDSS) named Information System for Clinical Concept Searching (ICCS) that implements the method for clinical concept tagging, extraction, and classification. RESULTS In empirical studies the method shows high performance (recall 92.6%, precision 88.8%, F-measure 90.7%) and has demonstrated its value in clinical practice. Here we employ a real-life EHR-derived dataset to evaluate the method's performance on a classification task (detecting patient allergies) against a range of common supervised learning algorithms. The combined method achieves state-of-the-art performance compared with the alternative methods we evaluate. We also perform a qualitative analysis of common word embedding methods on the task of word similarity to examine their potential for supporting automatic feature engineering for clinical NLP tasks. CONCLUSIONS Based on the promising results, we suggest more research should be aimed at exploiting the inherent synergies between the unsupervised, supervised, and rule-based paradigms for clinical NLP.
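The word-similarity analysis mentioned above rests on cosine similarity between embedding vectors. The sketch below is a minimal, self-contained illustration, not the authors' ICCS implementation; the tiny three-dimensional vectors and vocabulary are invented for the example, whereas real clinical embeddings would be learned from an EHR corpus with a method such as word2vec.

```python
import math

# Toy embeddings (invented for illustration; real vectors would be
# learned from a clinical corpus).
embeddings = {
    "penicillin":  [0.9, 0.1, 0.2],
    "amoxicillin": [0.85, 0.15, 0.25],
    "wheelchair":  [0.1, 0.9, 0.3],
}

def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def most_similar(word, k=1):
    """Rank the rest of the vocabulary by cosine similarity to `word`."""
    ranked = sorted(
        ((other, cosine(embeddings[word], vec))
         for other, vec in embeddings.items() if other != word),
        key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

print(most_similar("penicillin"))  # "amoxicillin" ranks above "wheelchair"
```

In a similarity-based vocabulary-building workflow, a clinical expert would seed a concept with a few terms and review the nearest neighbours suggested this way.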
Affiliation(s)
- Geir Thore Berge
- Department of Information Systems, University of Agder, Kristiansand, Norway
- Department of Technology and eHealth, Sørlandet Hospital Trust, Kristiansand, Norway
- Tor Oddbjørn Tveit
- Department of Technology and eHealth, Sørlandet Hospital Trust, Kristiansand, Norway
- Department of Anesthesia and Intensive Care, Sørlandet Hospital Trust, Kristiansand, Norway
- Anna Linda Ruthjersen
- Department of Technology and eHealth, Sørlandet Hospital Trust, Kristiansand, Norway
- Jivitesh Sharma
- Department of Technology and eHealth, Sørlandet Hospital Trust, Kristiansand, Norway
- Department of ICT, University of Agder, Grimstad, Norway
2
Humbert-Droz M, Corley J, Tamang S, Gevaert O. Development and validation of MedDRA Tagger: a tool for extraction and structuring medical information from clinical notes. medRxiv 2022:2022.12.14.22283470. PMID: 36561189; PMCID: PMC9774225; DOI: 10.1101/2022.12.14.22283470.
Abstract
Rapid, automated extraction of clinical information from patients' notes is a desirable though difficult task. Natural language processing (NLP) and machine learning have great potential to automate and accelerate such applications, but developing these models can require a large amount of labeled clinical text, which is slow and laborious to produce. To address this gap, we propose the MedDRA Tagger, a fast annotation tool that makes use of industrial-strength libraries such as spaCy, biomedical ontologies, and weak supervision to annotate and extract clinical concepts at scale. The tool can be used to annotate clinical text and obtain labels for training machine learning models and further refining clinical concept extraction performance, or to extract clinical concepts for observational studies. To demonstrate the usability and versatility of our tool, we present three different use cases: we use the tagger to identify patients with a primary brain cancer diagnosis, we show evidence of rising mental health symptoms at the population level, and we trace the evolution of COVID-19 symptomatology throughout three waves between February 2020 and October 2021. Validation of our tool showed good performance on both specific annotations from our development set (F1 score 0.81) and an open-source annotated dataset (F1 score 0.79). We successfully demonstrate the versatility of our pipeline with three different use cases. Finally, we note that the modular nature of our tool allows for straightforward adaptation to other biomedical ontologies. We also show that our tool is independent of the underlying EHR system and is therefore generalizable.
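The core of such a tagger, matching ontology terms against note text, can be sketched in a few lines. The actual MedDRA Tagger builds on spaCy pipelines and the full MedDRA ontology; the sketch below instead uses a plain-Python greedy longest-match over a tiny invented term dictionary, purely to illustrate the idea of dictionary-based concept tagging.

```python
# Minimal dictionary-based concept tagger (illustrative only; the real
# tool uses spaCy and the MedDRA ontology). Surface forms and labels
# below are invented examples.
TERMS = {
    "shortness of breath": "Dyspnoea",
    "headache": "Headache",
    "cough": "Cough",
}

def tag_concepts(text):
    """Greedy longest-match tagging of dictionary terms in `text`."""
    tokens = text.lower().split()
    spans, i = [], 0
    max_len = max(len(term.split()) for term in TERMS)
    while i < len(tokens):
        match = None
        # Try the longest candidate phrase first, shrinking to one token.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + n])
            if candidate in TERMS:
                match = (candidate, TERMS[candidate])
                i += n
                break
        if match:
            spans.append(match)
        else:
            i += 1
    return spans

print(tag_concepts("Patient reports shortness of breath and cough"))
```

Weak supervision would then use tags produced this way as (noisy) training labels for a statistical model, rather than as final annotations.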
Affiliation(s)
- Marie Humbert-Droz
- Stanford Center for Biomedical Informatics Research, Department of Medicine, Stanford University, Stanford, CA
- Suzanne Tamang
- Department of Biomedical Data Science, Stanford University, Stanford, CA
- Olivier Gevaert
- Stanford Center for Biomedical Informatics Research, Department of Medicine, Stanford University, Stanford, CA
- Department of Biomedical Data Science, Stanford University, Stanford, CA
3
Abdelkader W, Navarro T, Parrish R, Cotoi C, Germini F, Iorio A, Haynes RB, Lokker C. Machine Learning Approaches to Retrieve High-Quality, Clinically Relevant Evidence From the Biomedical Literature: Systematic Review. JMIR Med Inform 2021; 9:e30401. PMID: 34499041; PMCID: PMC8461527; DOI: 10.2196/30401.
Abstract
BACKGROUND The rapid growth of the biomedical literature makes identifying strong evidence a time-consuming task. Applying machine learning to the process could be a viable solution that limits effort while maintaining accuracy. OBJECTIVE The goal of the research was to summarize the nature and comparative performance of machine learning approaches that have been applied to retrieve high-quality evidence for clinical consideration from the biomedical literature. METHODS We conducted a systematic review of studies that applied machine learning techniques to identify high-quality clinical articles in the biomedical literature. Multiple databases were searched through July 2020. Extracted data focused on the applied machine learning model, the steps in model development, and model performance. RESULTS Of 3918 retrieved studies, 10 met our inclusion criteria. All followed a supervised machine learning approach and applied a high-quality standard, chosen from a limited range of options, for training their models. The results show that machine learning can achieve a sensitivity of 95% while maintaining a high precision of 86%. CONCLUSIONS Machine learning approaches perform well in retrieving high-quality clinical studies. Performance may improve further with more sophisticated approaches such as active learning and unsupervised machine learning.
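The sensitivity (recall) and precision figures reported above combine into a single F-measure via the standard harmonic mean. The snippet below shows the arithmetic; the resulting value (about 0.90) is our own computation for illustration, not a figure from the review.

```python
def f1(precision, recall):
    """F-measure: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Sensitivity (recall) 0.95 and precision 0.86, as reported in the review:
print(round(f1(0.86, 0.95), 3))
```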
Affiliation(s)
- Wael Abdelkader
- Health Information Research Unit, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, ON, Canada
- Tamara Navarro
- Health Information Research Unit, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, ON, Canada
- Rick Parrish
- Health Information Research Unit, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, ON, Canada
- Chris Cotoi
- Health Information Research Unit, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, ON, Canada
- Federico Germini
- Health Information Research Unit, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, ON, Canada
- Department of Medicine, McMaster University, Hamilton, ON, Canada
- Alfonso Iorio
- Health Information Research Unit, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, ON, Canada
- Department of Medicine, McMaster University, Hamilton, ON, Canada
- R Brian Haynes
- Health Information Research Unit, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, ON, Canada
- Department of Medicine, McMaster University, Hamilton, ON, Canada
- Cynthia Lokker
- Health Information Research Unit, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, ON, Canada
4
Sung SF, Lin CY, Hu YH. EMR-Based Phenotyping of Ischemic Stroke Using Supervised Machine Learning and Text Mining Techniques. IEEE J Biomed Health Inform 2020; 24:2922-2931. DOI: 10.1109/jbhi.2020.2976931.
5
Khaleghi T, Murat A, Arslanturk S, Davies E. Automated Surgical Term Clustering: A Text Mining Approach for Unstructured Textual Surgery Descriptions. IEEE J Biomed Health Inform 2019; 24:2107-2118. PMID: 31796420; DOI: 10.1109/jbhi.2019.2956973.
Abstract
High costs in health care and the everlasting need for quality improvement in care delivery are increasingly motivating novel predictive studies in healthcare informatics. Surgical services affect both operating theatre costs and revenues and play a critical role in care quality. The efficiency of such units relies heavily on effective operational planning and inventory management. A key ingredient in such planning activities is the structured and unstructured data available before the day of surgery from electronic health records and other information systems. Unstructured data, such as the textual features of procedure descriptions and notes, provide additional information where structured data alone are not sufficient. To utilize textual information effectively with text mining, textual features should be easily identifiable, i.e., free of typographical errors and ad hoc abbreviations. While numerous spelling correction and abbreviation identification tools exist, they are not suitable for surgical medical text because they require a dictionary and cannot accommodate ad hoc words such as abbreviations. This study proposes a novel preprocessing framework for surgical text data that detects misspellings and abbreviations prior to the application of any text mining or predictive modeling. The proposed approach helps extract the most salient text features from the unstructured principal procedure descriptions and additional notes by effectively reducing the dimension of the raw feature set. The transformed text feature set thus improves subsequent prediction tasks in surgery units. We test and validate the proposed approach using datasets from multiple hospitals' surgical departments and benchmark feature sets.
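A standard building block for the misspelling detection described above is edit (Levenshtein) distance between an observed token and known terms. The sketch below is a generic illustration of that building block, not the paper's framework; the paper's contribution is precisely handling ad hoc abbreviations that plain dictionary lookup like this misses, and the vocabulary here is invented.

```python
def edit_distance(a, b):
    """Levenshtein distance via dynamic programming (two-row variant)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def closest_term(token, vocabulary, max_dist=2):
    """Suggest the nearest known term within `max_dist` edits, else None."""
    best = min(vocabulary, key=lambda term: edit_distance(token, term))
    return best if edit_distance(token, best) <= max_dist else None

vocab = {"laparoscopy", "appendectomy", "arthroscopy"}  # invented sample
print(closest_term("appendectmy", vocab))
```

A dictionary-free approach, as in the paper, would instead infer likely corrections from term co-occurrence in the corpus itself.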
6
A Lightweight API-Based Approach for Building Flexible Clinical NLP Systems. J Healthc Eng 2019; 2019:3435609. PMID: 31511785; PMCID: PMC6714318; DOI: 10.1155/2019/3435609.
Abstract
Natural language processing (NLP) has become essential for secondary use of clinical data. Over the last two decades, many clinical NLP systems have been developed in both academia and industry. However, nearly all existing systems are restricted to specific clinical settings, mainly because they were developed for and tested with specific datasets, and they often fail to scale up. Therefore, using existing NLP systems for one's own clinical purposes requires substantial resources and long-term commitments for customization and testing, and maintenance is likewise troublesome and time-consuming. This research presents a lightweight approach for building clinical NLP systems with limited resources. Following the design science research approach, we propose a lightweight architecture designed to be composable, extensible, and configurable. It treats NLP as an external component that can be accessed independently and orchestrated in a pipeline via web APIs. To validate its feasibility, we developed a web-based prototype for clinical concept extraction with six well-known NLP APIs and evaluated it on three clinical datasets. In comparison with available benchmarks for the datasets, three high F1 scores (0.861, 0.724, and 0.805) were obtained in the evaluation. One test yielded a low F1 score (0.373), which is probably due to the small size of that test dataset. The development and evaluation of the prototype demonstrate that our approach has great potential for building effective clinical NLP systems with limited resources.
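The composable-pipeline idea can be sketched with plain function composition: each stage is an independent component that, in the paper's architecture, would be a web-API call rather than a local function. The stage names and logic below are invented for illustration only.

```python
# Sketch of a composable NLP pipeline in the spirit of the paper's
# architecture. Each stage takes and returns a document dict; in the
# real system each stage would wrap an external NLP web API.
def lowercase(doc):
    return {**doc, "text": doc["text"].lower()}

def tokenize(doc):
    return {**doc, "tokens": doc["text"].split()}

def extract_concepts(doc, vocabulary=("fever", "cough")):
    # Invented toy vocabulary; a real component would call a concept
    # extraction service.
    return {**doc, "concepts": [t for t in doc["tokens"] if t in vocabulary]}

def pipeline(*stages):
    """Compose stages left-to-right into a single callable."""
    def run(doc):
        for stage in stages:
            doc = stage(doc)
        return doc
    return run

nlp = pipeline(lowercase, tokenize, extract_concepts)
print(nlp({"text": "Fever and cough for 3 days"})["concepts"])
```

Because stages share only a document-shaped payload, any one of them can be swapped for a different API endpoint without touching the others, which is the configurability the paper argues for.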
7
Xu K, Zhou Z, Gong T, Hao T, Liu W. SBLC: a hybrid model for disease named entity recognition based on semantic bidirectional LSTMs and conditional random fields. BMC Med Inform Decis Mak 2018; 18:114. PMID: 30526592; PMCID: PMC6284263; DOI: 10.1186/s12911-018-0690-y.
Abstract
Background Disease named entity recognition (NER) is a fundamental step in the information processing of medical texts. However, disease NER involves complex issues in practice, such as descriptive modifiers. Accurate disease NER is still an open and essential research problem in medical information extraction and text mining. Methods A hybrid model named Semantic Bidirectional LSTM and CRF (SBLC) is proposed for the disease named entity recognition task. The model leverages word embeddings, bidirectional Long Short-Term Memory networks, and Conditional Random Fields. A publicly available NCBI disease dataset is used to evaluate the model by comparison with nine state-of-the-art baseline methods, including cTAKES, MetaMap, DNorm, C-Bi-LSTM-CRF, TaggerOne, and DNER. Results The results show that the SBLC model achieves an F1 score of 0.862 and outperforms the other methods. In addition, the model does not rely on external domain dictionaries and can thus be applied more conveniently in many areas of medical text processing. Conclusions In the performance comparison, the proposed SBLC model achieved the best results, demonstrating its effectiveness in disease named entity recognition.
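The decoding step of a (Bi)LSTM-CRF tagger such as SBLC is Viterbi search over emission and transition scores. The sketch below shows that step in isolation; the emission and transition numbers are invented toy values, whereas in SBLC they would come from the BiLSTM outputs and the learned CRF weights.

```python
# Minimal Viterbi decoder for BIO-style disease tagging.
def viterbi(emissions, transitions, tags):
    """emissions: one {tag: score} dict per token;
    transitions[(prev, cur)]: transition score."""
    score = {t: emissions[0][t] for t in tags}   # scores for first token
    back = []
    for em in emissions[1:]:
        new_score, ptr = {}, {}
        for t in tags:
            prev_t = max(tags, key=lambda p: score[p] + transitions[(p, t)])
            new_score[t] = score[prev_t] + transitions[(prev_t, t)] + em[t]
            ptr[t] = prev_t
        score, back = new_score, back + [ptr]
    best = max(tags, key=lambda t: score[t])     # backtrack best path
    path = [best]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return list(reversed(path))

tags = ["O", "B-Dis", "I-Dis"]
trans = {(a, b): 0.0 for a in tags for b in tags}
trans[("B-Dis", "I-Dis")] = 2.0    # reward B -> I continuation
trans[("O", "I-Dis")] = -5.0       # penalize I without a preceding entity
# Toy emission scores for the three tokens of "type 2 diabetes":
ems = [{"O": 1.0, "B-Dis": 2.0, "I-Dis": 0.0},
       {"O": 0.5, "B-Dis": 0.0, "I-Dis": 1.5},
       {"O": 0.0, "B-Dis": 0.5, "I-Dis": 2.0}]
print(viterbi(ems, trans, tags))
```

The transition scores are what let the CRF layer enforce tag-sequence constraints (for example, that I-Dis should not follow O) that per-token classification cannot.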
Affiliation(s)
- Kai Xu
- School of Computer Science and Technology, Guangdong University of Technology, Guangzhou, China
- Zhanfan Zhou
- School of Information Science and Technology, Guangdong University of Foreign Studies, Guangzhou, China
- Tao Gong
- Educational Testing Service, Princeton, NJ, USA
- Center for Linguistics and Applied Linguistics, Guangdong University of Foreign Studies, Guangzhou, China
- Tianyong Hao
- School of Computer Science, South China Normal University, Guangzhou, China
- Wenyin Liu
- School of Computer Science and Technology, Guangdong University of Technology, Guangzhou, China
8
Segura-Bedmar I, Colón-Ruíz C, Tejedor-Alonso MÁ, Moro-Moro M. Predicting of anaphylaxis in big data EMR by exploring machine learning approaches. J Biomed Inform 2018; 87:50-59. DOI: 10.1016/j.jbi.2018.09.012.
9
Karystianis G, Thayer K, Wolfe M, Tsafnat G. Evaluation of a rule-based method for epidemiological document classification towards the automation of systematic reviews. J Biomed Inform 2017; 70:27-34. PMID: 28455150; DOI: 10.1016/j.jbi.2017.04.004.
Abstract
INTRODUCTION Most data extraction efforts in epidemiology focus on obtaining targeted information from clinical trials. In contrast, limited research has been conducted on identifying information in observational studies, a major source of human evidence in many fields, including environmental health. The recognition of key epidemiological information (e.g., exposures) through text mining techniques can assist in the automation of systematic reviews and other evidence summaries. METHOD We designed and applied a knowledge-driven, rule-based approach to identify targeted information (study design, participant population, exposure, outcome, confounding factors, and the country where the study was conducted) from abstracts of epidemiological studies included in several systematic reviews of environmental health exposures. The rules are based on common syntactic patterns observed in text and are thus not specific to any one systematic review. To validate the general applicability of our approach, we compared the data extracted using our approach against hand curation for 35 epidemiological study abstracts manually selected for inclusion in two systematic reviews. RESULTS The F-score, precision, and recall ranged from 70% to 98%, 81% to 100%, and 54% to 97%, respectively. The highest precision was observed for exposure, outcome, and population (100%), while recall was best for exposure and study design (97% and 89%, respectively). The lowest recall was observed for population (54%), which also had the lowest F-score (70%). CONCLUSION Our text-mining approach demonstrated encouraging performance in identifying targeted information from abstracts of observational epidemiological studies related to environmental exposures. We have demonstrated that rules based on generic syntactic patterns in one corpus can be applied to other observational study designs by simply interchanging the dictionaries used to identify particular characteristics (e.g., outcomes, exposures). At the document level, the recognised information can assist in the selection and categorisation of studies included in a systematic review.
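Syntactic-pattern rules of the kind described can be approximated with regular expressions keyed to small dictionaries. The patterns below are invented simplifications for two of the targeted characteristics (study design and country), not the authors' actual rule set.

```python
import re

# Invented, simplified patterns in the spirit of the paper's rules:
# each maps a lexical/syntactic pattern to one extracted characteristic.
RULES = {
    "study_design": re.compile(
        r"\b(cohort|case-control|cross-sectional)\s+study\b", re.I),
    "country": re.compile(r"\bconducted in ([A-Z][a-z]+)\b"),
}

def extract(abstract):
    """Apply each rule; return the first match per characteristic."""
    out = {}
    for field, pattern in RULES.items():
        m = pattern.search(abstract)
        if m:
            out[field] = m.group(1)
    return out

text = ("A cohort study conducted in Australia examined lead exposure "
        "and cognitive outcomes.")
print(extract(text))
```

Swapping the dictionary of design keywords (the alternation in `study_design`) is exactly the kind of "interchanging the dictionaries" the authors describe for porting rules to a new review topic.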
Affiliation(s)
- George Karystianis
- Centre for Health Informatics, Australian Institute of Health Innovation, Macquarie University, Sydney, Australia.
- Kristina Thayer
- National Toxicology Program, National Institute of Environmental Health Sciences, Research Triangle Park, NC, USA
- Mary Wolfe
- National Toxicology Program, National Institute of Environmental Health Sciences, Research Triangle Park, NC, USA
- Guy Tsafnat
- Centre for Health Informatics, Australian Institute of Health Innovation, Macquarie University, Sydney, Australia
10
Kim YM, Delen D. Medical informatics research trend analysis: A text mining approach. Health Informatics J 2016; 24:432-452. PMID: 30376768; DOI: 10.1177/1460458216678443.
Abstract
The objective of this research is to identify the major subject areas of medical informatics and explore the time-variant changes therein. As such, it can inform the field about where medical informatics research has been and where it is heading. Furthermore, by identifying subject areas, this study delineates the development trends and the boundaries of medical informatics as an academic field. To conduct the study, we first identified 26,307 articles in the PubMed archives that were published in the top medical informatics journals between 2002 and 2013. Then, employing a text mining-based, semi-automated analytic approach, we clustered major research topics by analyzing the most frequently appearing subject terms extracted from the abstracts of these articles. The results indicated that some subject areas, such as biomedical research, are declining, while others, such as health information technology (HIT), Internet-enabled research, and electronic medical/health records (EMR/EHR), are growing. The changes within the research subject areas can largely be attributed to the increasing capabilities and use of HIT. The Internet, for example, has changed the way medical research is conducted in the health care field. While discovering new medical knowledge through clinical and biological experiments remains important, the utilization of EMR/EHR data has enabled researchers to discover novel medical insights buried deep inside massive data sets, and hence data analytics research has become a common complement in the medical field, rapidly growing in popularity.
11
Scuba W, Tharp M, Mowery D, Tseytlin E, Liu Y, Drews FA, Chapman WW. Knowledge Author: facilitating user-driven, domain content development to support clinical information extraction. J Biomed Semantics 2016; 7:42. PMID: 27338146; PMCID: PMC4919842; DOI: 10.1186/s13326-016-0086-9.
Abstract
BACKGROUND Clinical natural language processing (NLP) systems require a semantic schema comprising domain-specific concepts, their lexical variants, and associated modifiers to accurately extract information from clinical texts. An NLP system leverages this schema to structure concepts and extract meaning from free text. In the clinical domain, creating a semantic schema typically requires input from both a domain expert, such as a clinician, and an NLP expert who represents the clinical concepts derived from the clinician's domain expertise in a computable format usable by an NLP system. The goal of this work is to develop a web-based tool, Knowledge Author, that bridges the gap between the clinical domain expert and NLP system development by facilitating the development of domain content represented in a semantic schema for extracting information from clinical free text. RESULTS Knowledge Author is a web-based recommendation system that supports users in developing the domain content necessary for clinical NLP applications. Knowledge Author's schematic model leverages a set of semantic types derived from the Secondary Use Clinical Element Models and the Common Type System to allow the user to quickly create and modify domain-related concepts. Features such as collaborative development and domain content suggestions, provided through the mapping of concepts to the Unified Medical Language System Metathesaurus, further support the content creation process. Two proof-of-concept studies were performed to evaluate the system. The first evaluated Knowledge Author's flexibility in creating a broad range of concepts: of a dataset of 115 concepts, 87 (76%) could be created using Knowledge Author. The second evaluated the effectiveness of Knowledge Author's output in an NLP system by extracting concepts and associated modifiers representing a clinical element, carotid stenosis, from 34 clinical free-text radiology reports using Knowledge Author and an NLP system, pyConText. Knowledge Author's domain content produced high recall for concepts (targeted findings: 86%) and varied recall for modifiers (certainty: 91%, sidedness: 80%, neurovascular anatomy: 46%). CONCLUSION Knowledge Author can support clinical domain content development for information extraction by enabling semantic schema creation by domain experts.
Affiliation(s)
- William Scuba
- Department of Biomedical Informatics, University of Utah, Salt Lake City, UT, 84108, USA
- Melissa Tharp
- Department of Biomedical Informatics, University of Utah, Salt Lake City, UT, 84108, USA
- Danielle Mowery
- Department of Biomedical Informatics, University of Utah, Salt Lake City, UT, 84108, USA
- Eugene Tseytlin
- Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, 15206, USA
- Yang Liu
- University of California, San Diego, CA, 92093, USA
- Frank A Drews
- Department of Psychology, University of Utah, Salt Lake City, UT, 84108, USA
- Wendy W Chapman
- Department of Biomedical Informatics, University of Utah, Salt Lake City, UT, 84108, USA
12
Hernandez-Boussard T, Tamang S, Blayney D, Brooks J, Shah N. New Paradigms for Patient-Centered Outcomes Research in Electronic Medical Records: An Example of Detecting Urinary Incontinence Following Prostatectomy. EGEMS 2016; 4:1231. PMID: 27347492; PMCID: PMC4899050; DOI: 10.13063/2327-9214.1231.
Abstract
Introduction: National initiatives to develop quality metrics emphasize the need to include patient-centered outcomes. Patient-centered outcomes are complex, require documentation of patient communications, and have not been routinely collected by healthcare providers. The widespread implementation of electronic health records (EHRs) offers opportunities to assess patient-centered outcomes within the routine healthcare delivery system. The objective of this study was to test the feasibility and accuracy of identifying patient-centered outcomes within the EHR. Methods: Data from patients with localized prostate cancer undergoing prostatectomy were used to develop and test algorithms to accurately identify patient-centered outcomes in post-operative EHRs; we used urinary incontinence as the use case. Standard data mining techniques were used to extract and annotate free text and structured data to assess urinary incontinence recorded within the EHRs. Results: A total of 5,349 prostate cancer patients were identified in our EHR system between 1998 and 2013. Among these EHRs, 30.3% had a text mention of urinary incontinence within 90 days post-operatively, compared with less than 1.0% with a structured data field for urinary incontinence (i.e., an ICD-9 code). Our workflow had good precision and recall for urinary incontinence (positive predictive value: 0.73; sensitivity: 0.84). Discussion: Our data indicate that important patient-centered outcomes, such as urinary incontinence, are being captured in EHRs as free text and highlight the long-standing importance of accurate clinician documentation. Standard data mining algorithms can accurately and efficiently identify these outcomes in existing EHRs; the complete assessment of these outcomes is essential to move practice into the patient-centered realm of healthcare.
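Detecting a free-text outcome mention such as urinary incontinence typically needs at least a simple negation check, since clinical notes often record a symptom's absence. The sketch below is a generic, NegEx-style illustration with an invented negation list and context window; it is not the study's actual workflow.

```python
import re

# Invented, minimal negation cue list; real systems (e.g. NegEx) use
# much richer trigger sets and scope rules.
NEGATIONS = re.compile(r"\b(no|denies|without|negative for)\b", re.I)
TARGET = re.compile(r"\burinary incontinence\b", re.I)

def mentions_outcome(note, window=40):
    """True if the target phrase appears without a nearby preceding negation."""
    for m in TARGET.finditer(note):
        preceding = note[max(0, m.start() - window):m.start()]
        if not NEGATIONS.search(preceding):
            return True
    return False

print(mentions_outcome("Patient reports urinary incontinence at night."))
print(mentions_outcome("Denies urinary incontinence or dysuria."))
```

The 0.73 positive predictive value reported above hints at why such context handling matters: raw keyword matching over-counts negated and historical mentions.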
13
Karystianis G, Sheppard T, Dixon WG, Nenadic G. Modelling and extraction of variability in free-text medication prescriptions from an anonymised primary care electronic medical record research database. BMC Med Inform Decis Mak 2016; 16:18. PMID: 26860263; PMCID: PMC4748480; DOI: 10.1186/s12911-016-0255-x.
Abstract
Background Free-text medication prescriptions contain detailed instruction information that is key when preparing drug data for analysis. The objective of this study was to develop a novel model and automated text-mining method to extract detailed structured medication information from free-text prescriptions and to explore its variability (e.g. optional dosages) in primary care research databases. Methods We introduce a prescription model that provides minimum and maximum values for dose number, frequency, and interval, allowing variability and flexibility within a drug prescription to be modelled. We developed a rule-based text mining system to extract such structured information from free-text prescription dosage instructions. The system was applied to medication prescriptions from an anonymised primary care electronic record database (Clinical Practice Research Datalink, CPRD). Results We evaluated our approach on a test set of 220 CPRD free-text prescription directions. The system achieved an overall accuracy of 91% at the prescription level, with 97% accuracy across the attribute levels. We then analysed over 56,000 of the most common free-text prescriptions from CPRD records and found that 1 in 4 has inherent variability, i.e. a choice in how the medication is taken, specified by different minimum and maximum doses, durations, or frequencies. Conclusions Our approach provides an accurate, automated way of coding free-text prescription information, including information about flexibility and variability within a prescription. The method allows researchers to decide how best to prepare the prescription data for drug efficacy and safety analyses in any given setting, and to test various scenarios and their impact.
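The min/max prescription model can be illustrated with a single regular expression over one common direction pattern, such as "take 1-2 tablets every 4-6 hours". This is an invented simplification; the published rule system covers far more dose-unit, frequency, and duration variation than this.

```python
import re

# Matches directions like "take 1-2 tablets every 4-6 hours".
# Invented simplification of the paper's min/max prescription model.
PATTERN = re.compile(
    r"take\s+(\d+)(?:\s*-\s*(\d+))?\s+tablets?"
    r"(?:\s+every\s+(\d+)(?:\s*-\s*(\d+))?\s+hours?)?", re.I)

def parse_direction(text):
    """Return min/max dose and dosing interval; max defaults to min
    when no range is given, matching the fixed-dose case."""
    m = PATTERN.search(text)
    if not m:
        return None
    dose_min = int(m.group(1))
    dose_max = int(m.group(2) or dose_min)
    interval_min = int(m.group(3)) if m.group(3) else None
    interval_max = int(m.group(4)) if m.group(4) else interval_min
    return {"dose_min": dose_min, "dose_max": dose_max,
            "interval_min": interval_min, "interval_max": interval_max}

print(parse_direction("Take 1-2 tablets every 4-6 hours as needed"))
```

Representing every direction as a min/max pair, even when min equals max, is what lets downstream analyses quantify the "1 in 4" inherent variability the study reports.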
Affiliation(s)
- George Karystianis
- School of Computer Science, University of Manchester, Manchester, UK
- The Christie NHS Foundation Trust, Manchester, UK
- Therese Sheppard
- Arthritis Research UK Centre for Epidemiology, University of Manchester, Manchester, UK
- William G Dixon
- Arthritis Research UK Centre for Epidemiology, University of Manchester, Manchester, UK
- The Farr Institute of Health Informatics Research, Health eResearch Centre, Manchester, UK
- Goran Nenadic
- School of Computer Science, University of Manchester, Manchester, UK
- The Farr Institute of Health Informatics Research, Health eResearch Centre, Manchester, UK
- Manchester Institute of Biotechnology, University of Manchester, Manchester, UK
| |
14
An Introduction to Natural Language Processing: How You Can Get More From Those Electronic Notes You Are Generating. Pediatr Emerg Care 2015; 31:536-41. [PMID: 26148107 DOI: 10.1097/pec.0000000000000484] [Citation(s) in RCA: 45] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 12/26/2022]
Abstract
Electronically stored clinical documents may contain both structured data and unstructured data. The use of structured clinical data varies by facility, but clinicians are familiar with coded data such as International Classification of Diseases, Ninth Revision, and Systematized Nomenclature of Medicine-Clinical Terms codes, and commonly with other data such as patient chief complaints or laboratory results. Most electronic health records store much more clinical information as unstructured data; for example, clinical narrative such as the history of present illness, procedure notes, and clinical decision making are stored as unstructured data. Despite the importance of this information, electronic capture or retrieval of unstructured clinical data has been challenging. The field of natural language processing (NLP) is undergoing rapid development, and existing tools can be successfully used for quality improvement, research, healthcare coding, and even billing compliance. In this brief review, we provide examples of successful uses of NLP on emergency medicine physician visit notes for various projects, discuss the challenges of retrieving specific data, and finally present practical methods that can run on a standard personal computer as well as high-end, state-of-the-art funded processes run by leading NLP informatics researchers.
15
Bailey LC, Mistry KB, Tinoco A, Earls M, Rallins MC, Hanley K, Christensen K, Jones M, Woods D. Addressing electronic clinical information in the construction of quality measures. Acad Pediatr 2014; 14:S82-9. [PMID: 25169464 DOI: 10.1016/j.acap.2014.06.006] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 11/25/2013] [Revised: 06/10/2014] [Accepted: 06/12/2014] [Indexed: 10/24/2022]
Abstract
Electronic health records (EHR) and registries play a central role in health care and provide access to detailed clinical information at the individual, institutional, and population level. Use of these data for clinical quality/performance improvement and cost management has been a focus of policy initiatives over the past decade. The Children's Health Insurance Program Reauthorization Act of 2009 (CHIPRA)-mandated Pediatric Quality Measurement Program supports development and testing of quality measures for children on the basis of electronic clinical information, including de novo measures and respecification of existing measures designed for other data sources. Drawing on the experience of Centers of Excellence, we review both structural and pragmatic considerations in e-measurement. The presence of primary observations in EHR-derived data makes it possible to measure outcomes in ways that are difficult with administrative data alone. However, relevant information may be located in narrative text, making it difficult to interpret. EHR systems are collecting more discrete data, but the structure, semantics, and adoption of data elements vary across vendors and sites. EHR systems also differ in their ability to incorporate pediatric concepts such as variable dosing and growth percentiles. This variability complicates quality measurement, as do the limitations of established measure formats, such as the Quality Data Model, when applied to e-measurement. Addressing these challenges will require investment by vendors, researchers, and clinicians alike in developing better pediatric content for standard terminologies and data models, encouraging wider adoption of technical standards that support reliable quality measurement, better harmonizing data collection with clinical work flow in EHRs, and better understanding the behavior and potential of e-measures.
Affiliation(s)
- L Charles Bailey
- Department of Pediatrics, Children's Hospital of Philadelphia and Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pa.
- Aldo Tinoco
- National Committee for Quality Assurance, Washington, DC
- Marian Earls
- Community Care of North Carolina, Greensboro, NC
- Donna Woods
- Feinberg School of Medicine, Northwestern University, Chicago, Ill
16
Gobbel GT, Garvin J, Reeves R, Cronin RM, Heavirland J, Williams J, Weaver A, Jayaramaraja S, Giuse D, Speroff T, Brown SH, Xu H, Matheny ME. Assisted annotation of medical free text using RapTAT. J Am Med Inform Assoc 2014; 21:833-41. [PMID: 24431336 DOI: 10.1136/amiajnl-2013-002255] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/06/2023] Open
Abstract
OBJECTIVE To determine whether assisted annotation using interactive training can reduce the time required to annotate a clinical document corpus without introducing bias. MATERIALS AND METHODS A tool, RapTAT, was designed to assist annotation by iteratively pre-annotating probable phrases of interest within a document, presenting the annotations to a reviewer for correction, and then using the corrected annotations for further machine learning-based training before pre-annotating subsequent documents. Annotators reviewed 404 clinical notes either manually or using RapTAT assistance for concepts related to quality of care during heart failure treatment. Notes were divided into 20 batches of 19-21 documents for iterative annotation and training. RESULTS The number of correct RapTAT pre-annotations increased significantly and annotation time per batch decreased by ~50% over the course of annotation. Annotation rate increased from batch to batch for assisted but not manual reviewers. Pre-annotation F-measure increased from 0.5 to 0.6 to >0.80 (relative to both assisted reviewer and reference annotations) over the first three batches and more slowly thereafter. Overall inter-annotator agreement was significantly higher between RapTAT-assisted reviewers (0.89) than between manual reviewers (0.85). DISCUSSION The tool reduced workload by decreasing the number of annotations needing to be added and helping reviewers to annotate at an increased rate. Agreement between the pre-annotations and reference standard, and agreement between the pre-annotations and assisted annotations, were similar throughout the annotation process, which suggests that pre-annotation did not introduce bias. CONCLUSIONS Pre-annotations generated by a tool capable of interactive training can reduce the time required to create an annotated document corpus by up to 50%.
Affiliation(s)
- Glenn T Gobbel
- Department of Veterans Affairs Medical Center, Geriatric Research, Education and Clinical Center (GRECC), Nashville, Tennessee, USA; Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, Tennessee, USA; Division of General Internal Medicine & Public Health, Department of Medicine, Vanderbilt University School of Medicine, Nashville, Tennessee, USA
- Jennifer Garvin
- IDEAS Center, SLC VA Healthcare System, Salt Lake City, Utah, USA; Division of Epidemiology, University of Utah School of Medicine, Salt Lake City, Utah, USA; Department of Biomedical Informatics, University of Utah School of Medicine, Salt Lake City, Utah, USA; Department of Veterans Affairs Medical Center, Geriatric Research, Education and Clinical Center (GRECC), Salt Lake City, Utah, USA
- Ruth Reeves
- Department of Veterans Affairs Medical Center, Geriatric Research, Education and Clinical Center (GRECC), Nashville, Tennessee, USA; Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, Tennessee, USA
- Robert M Cronin
- Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, Tennessee, USA; Division of General Internal Medicine & Public Health, Department of Medicine, Vanderbilt University School of Medicine, Nashville, Tennessee, USA
- Julia Heavirland
- IDEAS Center, SLC VA Healthcare System, Salt Lake City, Utah, USA
- Jenifer Williams
- IDEAS Center, SLC VA Healthcare System, Salt Lake City, Utah, USA
- Allison Weaver
- IDEAS Center, SLC VA Healthcare System, Salt Lake City, Utah, USA
- Shrimalini Jayaramaraja
- Division of General Internal Medicine & Public Health, Department of Medicine, Vanderbilt University School of Medicine, Nashville, Tennessee, USA
- Dario Giuse
- Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, Tennessee, USA
- Theodore Speroff
- Department of Veterans Affairs Medical Center, Geriatric Research, Education and Clinical Center (GRECC), Nashville, Tennessee, USA; Division of General Internal Medicine & Public Health, Department of Medicine, Vanderbilt University School of Medicine, Nashville, Tennessee, USA; Department of Biostatistics, Vanderbilt University School of Medicine, Nashville, Tennessee, USA
- Steven H Brown
- Department of Veterans Affairs Medical Center, Geriatric Research, Education and Clinical Center (GRECC), Nashville, Tennessee, USA; Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, Tennessee, USA
- Hua Xu
- School of Biomedical Informatics, University of Texas Health Science Center, Houston, Texas, USA
- Michael E Matheny
- Department of Veterans Affairs Medical Center, Geriatric Research, Education and Clinical Center (GRECC), Nashville, Tennessee, USA; Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, Tennessee, USA; Division of General Internal Medicine & Public Health, Department of Medicine, Vanderbilt University School of Medicine, Nashville, Tennessee, USA; Department of Biostatistics, Vanderbilt University School of Medicine, Nashville, Tennessee, USA
17
Ping XO, Tseng YJ, Chung Y, Wu YL, Hsu CW, Yang PM, Huang GT, Lai F, Liang JD. Information extraction for tracking liver cancer patients' statuses: from mixture of clinical narrative report types. Telemed J E Health 2013; 19:704-10. [PMID: 23869395 DOI: 10.1089/tmj.2012.0241] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open
Abstract
OBJECTIVE To provide an efficient way of tracking patients' condition over long periods of time and to facilitate the collection of clinical data from different types of narrative reports, it is critical to develop an efficient method for smoothly analyzing the clinical data accumulated in narrative reports. MATERIALS AND METHODS To facilitate liver cancer clinical research, a method was developed for extracting clinical factors from various types of narrative clinical reports, including ultrasound reports, radiology reports, pathology reports, operation notes, admission notes, and discharge summaries. An information extraction (IE) module was developed for tracking disease progression in liver cancer patients over time, and a rule-based classifier was developed for answering whether patients met the clinical research eligibility criteria. The classifier provided the answers and direct/indirect evidence (evidence sentences) for the clinical questions. To evaluate the implemented IE module and the classifier, gold-standard annotations and answers were developed manually, and the results of the implemented system were compared with the gold standard. RESULTS The IE module achieved an F-score from 92.40% to 99.59%, and the classifier achieved accuracy from 96.15% to 100%. CONCLUSIONS The application was successfully applied to the various types of narrative clinical reports. It might also be applied to key information extraction for other types of cancer patients.
Affiliation(s)
- Xiao-Ou Ping
- Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan
18
Kottke TE, Baechler CJ. An algorithm that identifies coronary and heart failure events in the electronic health record. Prev Chronic Dis 2013; 10:E29. [PMID: 23449283 PMCID: PMC3592787 DOI: 10.5888/pcd10.120097] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022] Open
Abstract
Introduction The advent of universal health care coverage in the United States and the use of electronic health records can make the medical record a disease surveillance tool. The objective of our study was to identify criteria that accurately categorize acute coronary and heart failure events by using electronic health record data exclusively, so that the medical record can be used for surveillance without manual record review. Methods We serially compared 3 computer algorithms to manual record review. The first 2 algorithms relied on ICD-9-CM (International Classification of Diseases, 9th Revision, Clinical Modification) codes, troponin levels, electrocardiogram (ECG) data, and echocardiograph data. The third algorithm relied on a detailed coding system, Intelligent Medical Objects, Inc. (IMO) interface terminology, troponin levels, and echocardiograph data. Results Cohen’s κ for the initial algorithm was 0.47 (95% confidence interval [CI], 0.41–0.54). Cohen’s κ was 0.61 (95% CI, 0.55–0.68) for the second algorithm. Cohen’s κ for the third algorithm was 0.99 (95% CI, 0.98–1.00). Conclusion Electronic medical record data are sufficient to categorize coronary heart disease and heart failure events without manual record review. However, only moderate agreement with medical record review can be achieved when the classification is based on 4-digit ICD-9-CM codes, because ICD-9-CM 410.9 includes both myocardial infarction with elevation of the ST segment on ECG (STEMI) and myocardial infarction without elevation of the ST segment on ECG (nSTEMI). Nearly perfect agreement can be achieved using IMO interface terminology, a more detailed coding system that maps to ICD-9-CM, ICD-10-CM (International Classification of Diseases, Tenth Revision, Clinical Modification), and SNOMED CT (Systematized Nomenclature of Medicine-Clinical Terms).
Affiliation(s)
- Thomas E Kottke
- HealthPartners Institute for Education and Research, Minneapolis, MN 55440-1524, USA.
19
Wu Y, Denny JC, Rosenbloom ST, Miller RA, Giuse DA, Xu H. A comparative study of current Clinical Natural Language Processing systems on handling abbreviations in discharge summaries. AMIA Annu Symp Proc 2012; 2012:997-1003. [PMID: 23304375 PMCID: PMC3540461] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Subscribe] [Scholar Register] [Indexed: 06/01/2023]
Abstract
Clinical Natural Language Processing (NLP) systems extract clinical information from narrative clinical texts in many settings. Previous research mentions the challenges of handling abbreviations in clinical texts, but provides little insight into how well current NLP systems correctly recognize and interpret abbreviations. In this paper, we compared the performance of three existing clinical NLP systems in handling abbreviations: MetaMap, MedLEE, and cTAKES. The evaluation used an expert-annotated gold standard set of clinical documents (derived from 32 de-identified patient discharge summaries) containing 1,112 abbreviations. The existing NLP systems achieved suboptimal performance in abbreviation identification, with F-scores ranging from 0.165 to 0.601. MedLEE achieved the best F-score of 0.601 for all abbreviations and 0.705 for clinically relevant abbreviations. This study suggested that accurate identification of clinical abbreviations is a challenging task and that more advanced abbreviation recognition modules might improve existing clinical NLP systems.
Affiliation(s)
- Yonghui Wu
- Department of Biomedical Informatics, School of Medicine, Vanderbilt University, Nashville, TN, USA
20
Haerian K, Varn D, Vaidya S, Ena L, Chase HS, Friedman C. Detection of pharmacovigilance-related adverse events using electronic health records and automated methods. Clin Pharmacol Ther 2012; 92:228-34. [PMID: 22713699 DOI: 10.1038/clpt.2012.54] [Citation(s) in RCA: 87] [Impact Index Per Article: 7.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/30/2022]
Abstract
Electronic health records (EHRs) are an important source of data for detection of adverse drug reactions (ADRs). However, adverse events are frequently due not to medications but to the patients' underlying conditions. Mining to detect ADRs from EHR data must account for confounders. We developed an automated method using natural-language processing (NLP) and a knowledge source to differentiate cases in which the patient's disease is responsible for the event rather than a drug. Our method was applied to 199,920 hospitalization records, concentrating on two serious ADRs: rhabdomyolysis (n = 687) and agranulocytosis (n = 772). Our method automatically identified 75% of the cases, those with disease etiology. The sensitivity and specificity were 93.8% (confidence interval: 88.9-96.7%) and 91.8% (confidence interval: 84.0-96.2%), respectively. The method resulted in considerable saving of time: for every 1 h spent in development, there was a saving of at least 20 h in manual review. The review of the remaining 25% of the cases therefore became more feasible, allowing us to identify the medications that had caused the ADRs.
Affiliation(s)
- K Haerian
- Department of Biomedical Informatics, Columbia University, New York, NY, USA.
21
Harkema H, Chapman WW, Saul M, Dellon ES, Schoen RE, Mehrotra A. Developing a natural language processing application for measuring the quality of colonoscopy procedures. J Am Med Inform Assoc 2011; 18 Suppl 1:i150-6. [PMID: 21946240 PMCID: PMC3241178 DOI: 10.1136/amiajnl-2011-000431] [Citation(s) in RCA: 64] [Impact Index Per Article: 4.9] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/14/2011] [Accepted: 08/18/2011] [Indexed: 12/24/2022] Open
Abstract
OBJECTIVE The quality of colonoscopy procedures for colorectal cancer screening is often inadequate and varies widely among physicians. Routine measurement of quality is limited by the costs of manual review of free-text patient charts. Our goal was to develop a natural language processing (NLP) application to measure colonoscopy quality. MATERIALS AND METHODS Using a set of quality measures published by physician specialty societies, we implemented an NLP engine that extracts 21 variables for 19 quality measures from free-text colonoscopy and pathology reports. We evaluated the performance of the NLP engine on a test set of 453 colonoscopy reports and 226 pathology reports, considering accuracy in extracting the values of the target variables from text, and the reliability of the outcomes of the quality measures as computed from the NLP-extracted information. RESULTS The average accuracy of the NLP engine over all variables was 0.89 (range: 0.62-1.0) and the average F measure over all variables was 0.74 (range: 0.49-0.89). The average agreement score, measured as Cohen's κ, between the manually established and NLP-derived outcomes of the quality measures was 0.62 (range: 0.09-0.86). DISCUSSION For nine of the 19 colonoscopy quality measures, the agreement score was 0.70 or above, which we consider a sufficient score for the NLP-derived outcomes of these measures to be practically useful for quality measurement. CONCLUSION The use of NLP for information extraction from free-text colonoscopy and pathology reports creates opportunities for large scale, routine quality measurement, which can support quality improvement in colonoscopy care.
Affiliation(s)
- Henk Harkema
- Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, USA.
22
Lin JW, Chang CH, Lin MW, Ebell MH, Chiang JH. Automating the process of critical appraisal and assessing the strength of evidence with information extraction technology. J Eval Clin Pract 2011; 17:832-8. [PMID: 21707873 DOI: 10.1111/j.1365-2753.2011.01712.x] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]
Abstract
BACKGROUND Critical appraisal, one of the most crucial steps in the practice of evidence-based medicine, is expertise-dependent and time-consuming. The objective of this study was to develop and evaluate an automated text-mining system that could determine the evidence level provided by a medical article. METHODS A text processor was designed and built to interpret the abstracts of medical literature. The system extracted information about: (1) the impact factor of the journal; (2) study design; (3) human subject involvement; (4) number of subjects; (5) P-value; and (6) confidence intervals. We used a classification tree algorithm (C4.5) to create a decision tree using supervised classification. Each article was categorized into evidence level A, B or C, and the output was compared to that determined by domain experts (the reference standard). RESULTS We used a corpus of 3180 cardiovascular disease original research articles, of which 1108 were previously assigned evidence level A, 1705 level B and 367 level C by domain experts. The abstracts were analysed by our automated system and an evidence level was assigned. The algorithm accurately classified 85% of the articles. The agreement between computer and domain experts was substantial (κ-value: 0.78). Cross-validation showed consistent results across repeated tests. CONCLUSION The automated engine accurately classified the evidence level. Misclassification might have resulted from incomplete information retrieval and inaccurate data extraction. Further efforts will focus on assessing relevance and using additional study design features to refine evidence level classification.
Affiliation(s)
- Jou-Wei Lin
- Cardiovascular Center, National Taiwan University Hospital Yun-Lin Branch, Dou-Liou City, Taiwan
23
Buczak AL, Babin S, Moniz L. Data-driven approach for creating synthetic electronic medical records. BMC Med Inform Decis Mak 2010; 10:59. [PMID: 20946670 PMCID: PMC2972239 DOI: 10.1186/1472-6947-10-59] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/14/2010] [Accepted: 10/14/2010] [Indexed: 12/03/2022] Open
Abstract
BACKGROUND New algorithms for disease outbreak detection are being developed to take advantage of full electronic medical records (EMRs) that contain a wealth of patient information. However, due to privacy concerns, even anonymized EMRs cannot be shared among researchers, resulting in great difficulty in comparing the effectiveness of these algorithms. To bridge the gap between novel bio-surveillance algorithms operating on full EMRs and the lack of non-identifiable EMR data, a method for generating complete and synthetic EMRs was developed. METHODS This paper describes a novel methodology for generating complete synthetic EMRs both for an outbreak illness of interest (tularemia) and for background records. The method developed has three major steps: 1) synthetic patient identity and basic information generation; 2) identification of care patterns that the synthetic patients would receive based on the information present in real EMR data for similar health problems; 3) adaptation of these care patterns to the synthetic patient population. RESULTS We generated EMRs, including visit records, clinical activity, laboratory orders/results and radiology orders/results, for 203 synthetic tularemia outbreak patients. Validation of the records by a medical expert revealed problems in 19% of the records; these were subsequently corrected. We also generated background EMRs for over 3000 patients in the 4-11 yr age group. Validation of those records by a medical expert revealed problems in fewer than 3% of these background patient EMRs, and the errors were subsequently rectified. CONCLUSIONS A data-driven method was developed for generating fully synthetic EMRs. The method is general and can be applied to any data set that has similar data elements (such as laboratory and radiology orders and results, clinical activity, prescription orders). The pilot synthetic outbreak records were for tularemia, but our approach may be adapted to other infectious diseases. The pilot synthetic background records were in the 4-11 year old age group. The adaptations that must be made to the algorithms to produce synthetic background EMRs for other age groups are indicated.
Affiliation(s)
- Anna L Buczak
- Johns Hopkins University Applied Physics Laboratory, 11100 Johns Hopkins Rd, Laurel, MD 20723-6099, USA
- Steven Babin
- Johns Hopkins University Applied Physics Laboratory, 11100 Johns Hopkins Rd, Laurel, MD 20723-6099, USA
- Linda Moniz
- Johns Hopkins University Applied Physics Laboratory, 11100 Johns Hopkins Rd, Laurel, MD 20723-6099, USA