1
Koleck TA, Dreisbach C, Bourne PE, Bakken S. Natural language processing of symptoms documented in free-text narratives of electronic health records: a systematic review. J Am Med Inform Assoc 2019; 26:364-379. [PMID: 30726935 DOI: 10.1093/jamia/ocy173] [Citation(s) in RCA: 182] [Impact Index Per Article: 45.5] [Received: 06/29/2018] [Revised: 11/20/2018] [Accepted: 11/27/2018] [Indexed: 12/26/2022] Open
Abstract
OBJECTIVE Natural language processing (NLP) of symptoms from electronic health records (EHRs) could contribute to the advancement of symptom science. We aim to synthesize the literature on the use of NLP to process or analyze symptom information documented in EHR free-text narratives. MATERIALS AND METHODS Our search of 1964 records from PubMed and EMBASE was narrowed to 27 eligible articles. Data related to the purpose, free-text corpus, patients, symptoms, NLP methodology, evaluation metrics, and quality indicators were extracted for each study. RESULTS Symptom-related information was presented as a primary outcome in 14 studies. EHR narratives represented various inpatient and outpatient clinical specialties, with general, cardiology, and mental health occurring most frequently. Studies encompassed a wide variety of symptoms, including shortness of breath, pain, nausea, dizziness, disturbed sleep, constipation, and depressed mood. NLP approaches included previously developed NLP tools, classification methods, and manually curated rule-based processing. Only one-third (n = 9) of studies reported patient demographic characteristics. DISCUSSION NLP is used to extract information from EHR free-text narratives written by a variety of healthcare providers on an expansive range of symptoms across diverse clinical specialties. The current focus of this field is on the development of methods to extract symptom information and the use of symptom information for disease classification tasks rather than the examination of symptoms themselves. CONCLUSION Future NLP studies should concentrate on the investigation of symptoms and symptom documentation in EHR free-text narratives. Efforts should be undertaken to examine patient characteristics and make symptom-related NLP algorithms or pipelines and vocabularies openly available.
Affiliation(s)
- Caitlin Dreisbach
- School of Nursing, University of Virginia, Charlottesville, Virginia, USA; Data Science Institute, University of Virginia, Charlottesville, Virginia, USA
- Philip E Bourne
- Data Science Institute, University of Virginia, Charlottesville, Virginia, USA
- Suzanne Bakken
- School of Nursing, Columbia University, New York, New York, USA; Department of Biomedical Informatics, Columbia University, New York, New York, USA; Data Science Institute, Columbia University, New York, New York, USA
2
Horng S, Greenbaum NR, Nathanson LA, McClay JC, Goss FR, Nielson JA. Consensus Development of a Modern Ontology of Emergency Department Presenting Problems-The Hierarchical Presenting Problem Ontology (HaPPy). Appl Clin Inform 2019; 10:409-420. [PMID: 31189204 DOI: 10.1055/s-0039-1691842] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Indexed: 12/11/2022] Open
Abstract
OBJECTIVE Numerous attempts have been made to create a standardized "presenting problem" or "chief complaint" list to characterize the nature of an emergency department visit. Previous attempts have failed to gain widespread adoption as they were not freely shareable or did not contain the right level of specificity, structure, and clinical relevance to gain acceptance by the larger emergency medicine community. Using real-world data, we constructed a presenting problem list that addresses these challenges. MATERIALS AND METHODS We prospectively captured the presenting problems for 180,424 consecutive emergency department patient visits at an urban, academic, Level I trauma center in the Boston metro area. No patients were excluded. We used a consensus process to iteratively derive our system using real-world data. We used the first 70% of consecutive visits to derive our ontology, followed by a 6-month washout period, and the remaining 30% for validation. All concepts were mapped to Systematized Nomenclature of Medicine-Clinical Terms (SNOMED CT). RESULTS Our system consists of a polyhierarchical ontology containing 692 unique concepts, 2,118 synonyms, and 30,613 nonvisible descriptions to correct misspellings and nonstandard terminology. Our ontology successfully captured structured data for 95.9% of visits in our validation data set. DISCUSSION AND CONCLUSION We present the HierArchical Presenting Problem ontologY (HaPPy). This ontology was empirically derived and then iteratively validated by an expert consensus panel. HaPPy contains 692 presenting problem concepts, each concept being mapped to SNOMED CT. This freely sharable ontology can help to facilitate presenting problem-based quality metrics, research, and patient care.
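The lookup structure described in this abstract (concepts, visible synonyms, and nonvisible descriptions that absorb misspellings, arranged in a polyhierarchy) can be illustrated with a minimal sketch. The entries, field names, and helper functions below are hypothetical and are not taken from HaPPy itself; SNOMED CT codes are deliberately omitted.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    name: str
    parents: list = field(default_factory=list)     # polyhierarchy: several parents allowed
    synonyms: list = field(default_factory=list)    # visible alternative terms
    nonvisible: list = field(default_factory=list)  # misspellings and nonstandard terms


# Hypothetical toy entries for illustration only.
CONCEPTS = [
    Concept("Pain"),
    Concept("Chest"),
    Concept(
        "Chest pain",
        parents=["Pain", "Chest"],
        synonyms=["CP"],
        nonvisible=["chest pian", "chst pain"],
    ),
]


def build_index(concepts):
    """Map every name, synonym, and nonvisible description to its concept."""
    index = {}
    for concept in concepts:
        for term in [concept.name, *concept.synonyms, *concept.nonvisible]:
            index[term.strip().lower()] = concept
    return index


def lookup(index, text):
    """Return the matching concept, or None when the text is not captured."""
    return index.get(text.strip().lower())


def capture_rate(index, texts):
    """Fraction of free-text entries that resolve to a structured concept."""
    hits = sum(1 for text in texts if lookup(index, text) is not None)
    return hits / len(texts)


index = build_index(CONCEPTS)
```

A capture rate computed this way over a held-out set of visits is the kind of figure the abstract reports for its validation data, although the authors' actual matching procedure may differ.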
Affiliation(s)
- Steven Horng
- Division of Clinical Informatics, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, Massachusetts, United States; Department of Emergency Medicine, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, Massachusetts, United States
- Nathaniel R Greenbaum
- Division of Clinical Informatics, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, Massachusetts, United States; Department of Emergency Medicine, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, Massachusetts, United States
- Larry A Nathanson
- Division of Clinical Informatics, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, Massachusetts, United States; Department of Emergency Medicine, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, Massachusetts, United States
- James C McClay
- Department of Emergency Medicine, College of Medicine, University of Nebraska Medical Center, Omaha, Nebraska, United States
- Foster R Goss
- Department of Emergency Medicine, University of Colorado Hospital, University of Colorado School of Medicine, Aurora, Colorado, United States
- Jeffrey A Nielson
- Northeastern Ohio Medical University, University Hospitals Samaritan Medical Center, Ashland, Ohio, United States
3
Santhi B, Brindha G. Multinomial Naïve Bayes using similarity based conditional probability. Journal of Intelligent & Fuzzy Systems 2019. [DOI: 10.3233/jifs-181009] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/15/2022]
4
Prognosis Essay Scoring and Article Relevancy Using Multi-Text Features and Machine Learning. Symmetry (Basel) 2017. [DOI: 10.3390/sym9010011] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 11/16/2022] Open
5
Zimmerman PA, Mason M, Elder E. A healthy degree of suspicion: A discussion of the implementation of transmission based precautions in the emergency department. Australas Emerg Nurs J 2016; 19:149-52. [PMID: 27133874 PMCID: PMC7128487 DOI: 10.1016/j.aenj.2016.03.001] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 12/17/2015] [Revised: 03/10/2016] [Accepted: 03/29/2016] [Indexed: 02/01/2023]
Abstract
Background Emergency department (ED) presentations have increased significantly domestically and internationally. Swift identification and implementation of transmission based precautions (TBP) for patients known or suspected of having an epidemiologically important pathogen is important. ED staff, particularly triage nurses, are pivotal in detecting and preventing infection, including healthcare associated infections (HAI). Methods MEDLINE, CINAHL, PubMed and Ovid were searched for articles published between 2004 and 2015 using key search terms: infection control/prevention and emergency department(s), triage, and transmission based precautions and emergency department(s), and triage, to identify common themes for discussion. Systematic review/meta-analysis was not in the scope of this exploration. Findings Themes were identified relating to HAI and ED practices and grouped into: assisted detection of conditions for which TBP is required, ED and TBP, mass-casualty event/bioterrorism/pandemic/epidemic, infection control not TBP and multi-resistant organisms not TBP. The literature is heavily influenced by worldwide epidemics/pandemics and bioterrorist risks, resulting in increased awareness of the importance of swift identification of syndromes that require TBP, but only in these situations. Conclusion Implementation of appropriate TBP, changing triage practices, training and measures to assist decision-making could assist in preventing HAI in the ED context. A systematic quantitative review of the literature is recommended to guide practice change research.
Affiliation(s)
- Peta-Anne Zimmerman
- School of Nursing and Midwifery, Griffith University, Australia; Gold Coast Hospital and Health Service, Australia.
- Matt Mason
- School of Nursing, Midwifery and Paramedicine, University of the Sunshine Coast, Australia
- Elizabeth Elder
- School of Nursing and Midwifery, Griffith University, Australia
6
Hatakeyama Y, Miyano I, Kataoka H, Nakajima N, Watabe T, Yasuda N, Okuhara Y. Use of a Latent Topic Model for Characteristic Extraction from Health Checkup Questionnaire Data. Methods Inf Med 2015; 54:515-21. [PMID: 26063536 DOI: 10.3414/me15-01-0023] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 02/08/2015] [Accepted: 05/29/2015] [Indexed: 12/19/2022]
Abstract
OBJECTIVES When patients complete questionnaires during health checkups, many of their responses are subjective, making topic extraction difficult. Therefore, the purpose of this study was to develop a model capable of extracting appropriate topics from subjective data in questionnaires conducted during health checkups. METHODS We employed a latent topic model to group the lifestyle habits of the study participants and represented their responses to items on health checkup questionnaires as a probability model. For the probability model, we used latent Dirichlet allocation to extract 30 topics from the questionnaires. According to the model parameters, a total of 4381 study participants were then divided into groups based on these topics. Results from laboratory tests, including blood glucose level, triglycerides, and estimated glomerular filtration rate, were compared between each group, and these results were then compared with those obtained by hierarchical clustering. RESULTS If a significant (p < 0.05) difference was observed in any of the laboratory measurements between groups, it was considered to indicate a questionnaire response pattern corresponding to the value of the test result. A comparison between the latent topic model and hierarchical clustering grouping revealed that, in the latent topic model method, a small group of participants who reported having subjective signs of urinary disorder were allocated to a single group. CONCLUSIONS The latent topic model is useful for extracting characteristics from a small number of groups from questionnaires with a large number of items. These results show that, in addition to chief complaints and history of past illness, questionnaire data obtained during medical checkups can serve as useful judgment criteria for assessing the conditions of patients.
Affiliation(s)
- Y Hatakeyama
- Yutaka Hatakeyama, Center of Medical Information Science, Kochi University Medical School, Oko-cho Kohasu, Nankoku, Kochi 783-8505, Japan
7
Emergency Medical Text Classifier: New system improves processing and classification of triage notes. Online J Public Health Inform 2014; 6:e178. [PMID: 25379126 PMCID: PMC4221085 DOI: 10.5210/ojphi.v6i2.5469] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Indexed: 11/24/2022] Open
Abstract
Objective Automated syndrome classification aims to aid near real-time syndromic surveillance to serve as an early warning system for disease outbreaks, using Emergency Department (ED) data. We present a system that improves the automatic classification of an ED record with triage note into one or more syndrome categories using the vector space model coupled with a ‘learning’ module that employs a pseudo-relevance feedback mechanism. Materials and Methods Terms from standard syndrome definitions are used to construct an initial reference dictionary for generating the syndrome and triage note vectors. Based on cosine similarity between the vectors, each record is classified into a syndrome category. We then take terms from the top-ranked records that belong to the syndrome of interest as feedback. These terms are added to the reference dictionary and the process is repeated to determine the final classification. The system was tested on two different datasets for each of three syndromes: Gastro-Intestinal (GI), Respiratory (Resp) and Fever-Rash (FR). Performance was measured in terms of sensitivity (Se) and specificity (Sp). Results The use of relevance feedback produced high values of sensitivity and specificity for all three syndromes in both test sets: GI: 90% and 71%, Resp: 97% and 73%, FR: 100% and 87%, respectively, in test set 1, and GI: 88% and 69%, Resp: 87% and 61%, FR: 97% and 71%, respectively, in test set 2. Conclusions The new system for pre-processing and syndromic classification of ED records with triage notes achieved improvements in Se and Sp. Our results also demonstrate that the system can be tuned to achieve different levels of performance based on user requirements.
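The classification loop described above (score notes against a dictionary vector by cosine similarity, then expand the dictionary with terms from the top-ranked notes and rescore) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' system: the notes, stopword list, threshold, and `top_k` value are all hypothetical, and raw term counts stand in for whatever term weighting the paper used.

```python
import math
from collections import Counter

# Hypothetical stopword list; a real system would use a fuller one.
STOPWORDS = {"and", "since", "last", "night", "with", "the"}


def tokenize(text):
    return text.lower().split()


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def score_notes(notes, dictionary):
    """Score each note against a syndrome vector with weight 1 per dictionary term."""
    syndrome_vec = Counter(dictionary)
    return [cosine(Counter(tokenize(note)), syndrome_vec) for note in notes]


def expand_dictionary(notes, scores, dictionary, top_k=2):
    """Pseudo-relevance feedback: add terms from the top-ranked notes."""
    ranked = sorted(range(len(notes)), key=lambda i: scores[i], reverse=True)
    expanded = set(dictionary)
    for i in ranked[:top_k]:
        if scores[i] > 0:  # only notes that matched at all count as feedback
            expanded.update(t for t in tokenize(notes[i]) if t not in STOPWORDS)
    return expanded


def classify(notes, dictionary, threshold=0.3, top_k=2):
    """One feedback round followed by thresholding on the rescored notes."""
    first_pass = score_notes(notes, dictionary)
    expanded = expand_dictionary(notes, first_pass, dictionary, top_k)
    second_pass = score_notes(notes, expanded)
    return [score >= threshold for score in second_pass]


# Hypothetical triage notes for a gastrointestinal syndrome.
NOTES = [
    "vomiting and diarrhea since last night",
    "nausea vomiting abdominal cramps",
    "abdominal cramps and loose stools",
    "twisted ankle playing soccer",
]
GI_TERMS = {"vomiting", "diarrhea"}
labels = classify(NOTES, GI_TERMS)
```

In this toy data the third note shares no term with the initial dictionary and is picked up only after feedback adds "abdominal" and "cramps". Raising or lowering the hypothetical `threshold` trades sensitivity against specificity, which is one plausible reading of the tuning the abstract mentions.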
8
Zheng H, Gaff H, Smith G, DeLisle S. Epidemic surveillance using an electronic medical record: an empiric approach to performance improvement. PLoS One 2014; 9:e100845. [PMID: 25006878 PMCID: PMC4090236 DOI: 10.1371/journal.pone.0100845] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Received: 12/13/2013] [Accepted: 05/30/2014] [Indexed: 01/19/2023] Open
Abstract
BACKGROUND Electronic medical records (EMR) form a rich repository of information that could benefit public health. We asked how structured and free-text narrative EMR data should be combined to improve epidemic surveillance for acute respiratory infections (ARI). METHODS Eight previously characterized ARI case detection algorithms (CDA) were applied to historical EMR entries to create authentic time series of daily ARI case counts (background). An epidemic model simulated influenza cases (injection). From the time of the injection, cluster-detection statistics were applied daily on paired background+injection (combined) and background-only time series. This cycle was then repeated with the injection shifted to each week of the evaluation year. We computed: a) the time from injection to the first statistical alarm uniquely found in the combined dataset (Detection Delay); b) how often alarms originated in the background-only dataset (false-alarm rate, or FAR); and c) the number of cases found within these false alarms (Caseload). For each CDA, we plotted the Detection Delay as a function of FAR or Caseload, over a broad range of alarm thresholds. RESULTS CDAs that combined text analyses seeking ARI symptoms in clinical notes with provider-assigned diagnostic codes in order to maximize the precision rather than the sensitivity of case-detection lowered Detection Delay at any given FAR or Caseload. CONCLUSION An empiric approach can guide the integration of EMR data into case-detection methods that improve both the timeliness and efficiency of epidemic detection.
Affiliation(s)
- Hongzhang Zheng
- Veterans Affairs Maryland Health Care System, Baltimore, Maryland, United States of America
- School of Medicine, University of Maryland, Baltimore, Maryland, United States of America
- Holly Gaff
- Department of Biological Sciences, Old Dominion University, Norfolk, Virginia, United States of America
- Gary Smith
- School of Veterinary Medicine, University of Pennsylvania, Kennett Square, Pennsylvania, United States of America
- Sylvain DeLisle
- Veterans Affairs Maryland Health Care System, Baltimore, Maryland, United States of America
- School of Medicine, University of Maryland, Baltimore, Maryland, United States of America
9
Gerbier-Colomban S, Gicquel Q, Millet AL, Riou C, Grando J, Darmoni S, Potinet-Pagliaroli V, Metzger MH. Evaluation of syndromic algorithms for detecting patients with potentially transmissible infectious diseases based on computerised emergency-department data. BMC Med Inform Decis Mak 2013; 13:101. [PMID: 24004720 PMCID: PMC3766242 DOI: 10.1186/1472-6947-13-101] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 12/17/2012] [Accepted: 08/30/2013] [Indexed: 11/17/2022] Open
Abstract
Background The objective of this study was to ascertain the performance of syndromic algorithms for the early detection of patients in healthcare facilities who have potentially transmissible infectious diseases, using computerised emergency department (ED) data. Methods A retrospective cohort in an 810-bed University of Lyon hospital in France was analysed. Adults who were admitted to the ED and hospitalised between June 1, 2007, and March 31, 2010 were included (N=10895). Different algorithms were built to detect patients with infectious respiratory, cutaneous or gastrointestinal syndromes. The performance parameters of these algorithms were assessed with regard to the capacity of our infection-control team to investigate the detected cases. Results For respiratory syndromes, the sensitivity of the detection algorithms was 82.70%, and the specificity was 82.37%. For cutaneous syndromes, the sensitivity of the detection algorithms was 78.08%, and the specificity was 95.93%. For gastrointestinal syndromes, the sensitivity of the detection algorithms was 79.41%, and the specificity was 81.97%. Conclusions This assessment permitted us to detect patients with potentially transmissible infectious diseases, while striking a reasonable balance between true positives and false positives, for both respiratory and cutaneous syndromes. The algorithms for gastrointestinal syndromes were not specific enough for routine use, because they generated a large number of false positives relative to the number of infected patients. Detection of patients with potentially transmissible infectious diseases will enable us to take precautions to prevent transmission as soon as these patients come in contact with healthcare facilities.
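The abstract's point that roughly 80% specificity can still yield a large number of false positives relative to infected patients follows from the arithmetic of low prevalence. The sketch below computes sensitivity, specificity, and positive predictive value from confusion-matrix counts; the counts are hypothetical and chosen only to resemble the reported gastrointestinal figures, not taken from the study.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard detection metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # detected cases among all true cases
        "specificity": tn / (tn + fp),  # correctly cleared among all non-cases
        "ppv": tp / (tp + fp),          # true cases among all alerts
    }


# Hypothetical cohort: 1000 patients, 20 of whom are truly infected.
metrics = diagnostic_metrics(tp=16, fp=176, fn=4, tn=804)
```

With these assumed counts, sensitivity is 0.80 and specificity about 0.82, yet only 16 of 192 alerts are true cases, so an infection-control team would investigate about eleven false alerts for every real one.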
Affiliation(s)
- Solweig Gerbier-Colomban
- Hospices Civils de Lyon, Hôpital de la Croix-Rousse, Unité d'hygiène et d'épidémiologie, F-69317 Lyon, France.
10
Yan W, Palm L, Lu X, Nie S, Xu B, Zhao Q, Tao T, Cheng L, Tan L, Dong H, Diwan VK. ISS--an electronic syndromic surveillance system for infectious disease in rural China. PLoS One 2013; 8:e62749. [PMID: 23626853 PMCID: PMC3633833 DOI: 10.1371/journal.pone.0062749] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.2] [Received: 06/27/2012] [Accepted: 03/29/2013] [Indexed: 12/04/2022] Open
Abstract
Background Syndromic surveillance systems have great advantages in promoting the early detection of epidemics and reducing the need for disease confirmation, and they are especially effective for surveillance in resource-poor settings. However, most current syndromic surveillance systems are established in developed countries, and there are very few reports on the development of an electronic syndromic surveillance system in resource-constrained settings. Objective This study describes the design and pilot implementation of an electronic surveillance system (ISS) for the early detection of infectious disease epidemics in rural China, complementing the conventional case report surveillance system. Methods ISS was developed based on an existing platform, the ‘Crisis Information Sharing Platform’ (CRISP), combined with modern communication and GIS technology. ISS has four interconnected functions: 1) work group and communication group; 2) data source and collection; 3) data visualization; and 4) outbreak detection and alerting. Results As of Jan. 31st 2012, ISS had been installed and pilot tested for six months in four counties in rural China. 95 health facilities, 14 pharmacies and 24 primary schools participated in the pilot study, entering respectively 74256, 79701, and 2330 daily records into the central database. More than 90% of surveillance units at the study sites were able to send daily information into the system. In the paper, we also present the pilot data from health facilities in two counties, which showed that the ISS system had the potential to identify changes in disease patterns at the community level. Conclusions The ISS platform may facilitate the early detection of infectious disease epidemics as it provides near real-time syndromic data collection, interactive visualization, and automated aberration detection. However, several constraints and challenges were encountered during the pilot implementation of ISS in rural China.
Affiliation(s)
- Weirong Yan
- Division of Global Health (IHCAR), Department of Public Health Sciences, Karolinska Institutet, Stockholm, Sweden.
11
Conway M, Dowling JN, Chapman WW. Using chief complaints for syndromic surveillance: a review of chief complaint based classifiers in North America. J Biomed Inform 2013; 46:734-43. [PMID: 23602781 DOI: 10.1016/j.jbi.2013.04.003] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.1] [Received: 02/03/2012] [Revised: 08/30/2012] [Accepted: 04/03/2013] [Indexed: 11/27/2022]
Abstract
A major goal of Natural Language Processing in the public health informatics domain is the automatic extraction and encoding of data stored in free text patient records. This extracted data can then be utilized by computerized systems to perform syndromic surveillance. In particular, the chief complaint--a short string that describes a patient's symptoms--has come to be a vital resource for syndromic surveillance in the North American context due to its near ubiquity. This paper reviews fifteen systems in North America--at the city, county, state and federal level--that use chief complaints for syndromic surveillance.
Affiliation(s)
- Mike Conway
- Division of Biomedical Informatics, University of California, San Diego, 9500 Gilman Dr. MC 0505 La Jolla, California 92093, USA.
12
Dórea FC, Muckle CA, Kelton D, McClure JT, McEwen BJ, McNab WB, Sanchez J, Revie CW. Exploratory analysis of methods for automated classification of laboratory test orders into syndromic groups in veterinary medicine. PLoS One 2013; 8:e57334. [PMID: 23505427 PMCID: PMC3591392 DOI: 10.1371/journal.pone.0057334] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Received: 08/18/2012] [Accepted: 01/21/2013] [Indexed: 12/02/2022] Open
Abstract
BACKGROUND Recent focus on earlier detection of pathogen introduction in human and animal populations has led to the development of surveillance systems based on automated monitoring of health data. Real- or near real-time monitoring of pre-diagnostic data requires automated classification of records into syndromes--syndromic surveillance--using algorithms that incorporate medical knowledge in a reliable and efficient way, while remaining comprehensible to end users. METHODS This paper describes the application of two machine learning methods (Naïve Bayes and Decision Trees) and rule-based methods to extract syndromic information from laboratory test requests submitted to a veterinary diagnostic laboratory. RESULTS High performance (F1-macro = 0.9995) was achieved through the use of a rule-based syndrome classifier, based on rule induction followed by manual modification during the construction phase, which also resulted in clear interpretability of the resulting classification process. An unmodified rule induction algorithm achieved an F1-micro score of 0.979, though this fell to 0.677 when performance for individual classes was averaged in an unweighted manner (F1-macro), because the algorithm failed to learn 3 of the 16 classes from the training set. Decision Trees showed equal interpretability to the rule-based approaches, but achieved an F1-micro score of 0.923 (falling to 0.311 when classes are given equal weight). A Naïve Bayes classifier learned all classes and achieved high performance (F1-micro = 0.994 and F1-macro = 0.955); however, the classification process is not transparent to the domain experts. CONCLUSION The use of a manually customised rule set allowed for the development of a system for classification of laboratory tests into syndromic groups with very high performance, and high interpretability by the domain experts. Further research is required to develop internal validation rules in order to establish automated methods to update model rules without user input.
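The gap between the micro- and macro-averaged F1 scores reported above arises whenever a classifier never predicts a rare class: micro-averaging pools all decisions, while macro-averaging gives each class equal weight. A small self-contained sketch, with hypothetical class labels and predictions rather than the study's data, shows the effect.

```python
def f1_scores(y_true, y_pred):
    """Return (micro-F1, macro-F1) for single-label multiclass predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    tp_all = fp_all = fn_all = 0
    per_class = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        per_class.append(2 * tp / denom if denom else 0.0)  # class never seen or predicted scores 0
        tp_all, fp_all, fn_all = tp_all + tp, fp_all + fp, fn_all + fn
    micro = 2 * tp_all / (2 * tp_all + fp_all + fn_all)  # pooled over all decisions
    macro = sum(per_class) / len(per_class)              # unweighted mean over classes
    return micro, macro


# Hypothetical data: two common classes and one rare class that is never predicted.
y_true = ["A"] * 8 + ["B"] * 8 + ["rare"] * 2
y_pred = ["A"] * 8 + ["B"] * 8 + ["A", "B"]
micro, macro = f1_scores(y_true, y_pred)
```

Here the two missed rare-class records barely dent the micro score (about 0.89) but pull the macro score down to about 0.63, the same pattern the abstract describes for the unmodified rule induction and Decision Tree classifiers.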
Affiliation(s)
- Fernanda C Dórea
- Department of Health Management, Atlantic Veterinary College, University of Prince Edward Island, Charlottetown, Prince Edward Island, Canada.
13
Alemi F, Torii M, Atherton MJ, Pattie DC, Cox KL. Bayesian Processing of Context-Dependent Text. Med Decis Making 2012; 32:E1-9. [DOI: 10.1177/0272989x12439753] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 11/15/2022]
Abstract
Objective. This article aims to examine whether words listed in reasons for appointments could effectively predict laboratory-verified influenza cases in syndromic surveillance systems. Methods. Data were collected from the Armed Forces Health Longitudinal Technological Application medical record system. We used 2 algorithms to combine the impact of words within reasons for appointments: Dependent (DBSt) and Independent (IBSt) Bayesian System. We used receiver operating characteristic curves to compare the accuracy of these 2 methods of processing reasons for appointments against current and previous lists of diagnoses used in the Department of Defense’s syndromic surveillance system. Results. We examined 13,096 cases, where the results of influenza tests were available. Each reason for an appointment had an average of 3.5 words (standard deviation = 2.2 words). There was no difference in performance of the 2 algorithms. The area under the curve for IBSt was 0.58 and for DBSt was 0.56. The difference was not statistically significant (McNemar statistic = 0.0054; P = 0.07). Conclusions. These data suggest that reasons for appointments can improve the accuracy of lists of diagnoses in predicting laboratory-verified influenza cases. This study recommends further exploration of the DBSt algorithm and reasons for appointments in predicting likely influenza cases.
Affiliation(s)
- Farrokh Alemi
- Department of Health Systems Administration, Georgetown University, Washington, DC
- Manabu Torii
- Imaging Science and Information Systems Center, Georgetown University, Washington, DC
- Martin J. Atherton
- SciMetrika LLC, Falls Church, VA
- David C. Pattie
- Planned Systems International Inc., Falls Church, VA
- Kenneth L. Cox
- Health Surveillance Center, Silver Spring, MD
14
Abstract
OBJECTIVE Online social networking sites are web services in which users create public or semipublic profiles and connect to build online communities, finding like-minded people through self-labeled personal attributes including ethnicity, leisure interests, political beliefs, and, increasingly, health status. Thirty-nine percent of patients in the United States identified themselves as users of social networks in a recent survey. "Tags," user-generated descriptors functioning as labels for user-generated content, are increasingly important to social networking, and the language used by patients is thus becoming important for knowledge representation in these systems. However, patient language poses considerable challenges for health communication and networking. How have information systems traditionally incorporated these languages in their controlled vocabularies and thesauri? How do system builders know what consumers and patients say? METHODS This comprehensive review of the literature of health care (PubMed MEDLINE, CINAHL), library science, and information science (Library and Information Science and Technology Abstracts, Library and Information Science Abstracts, and Library Literature) examines the research domains in which consumer and patient language has been explored. RESULTS Consumer contributions to controlled vocabulary appear to be seriously under-researched inside and outside of health care. CONCLUSION The author reflects on the implications of these findings for online social networks devoted to patients and the patient experience.
Affiliation(s)
- Catherine A Smith
- School of Library and Information Studies, University of Wisconsin-Madison, 600 North Park Street #4255, Madison, WI 53706, USA.
|
15
|
Burkom HS. Comments on 'some methodological issues in biosurveillance'. Stat Med 2011; 30:426-9; discussion 434-41. [PMID: 21312212 DOI: 10.1002/sim.3986] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/08/2022]
Affiliation(s)
- Howard S Burkom
- Applied Physics Laboratory, The Johns Hopkins University, Laurel, MD, USA.
|
16
|
DeLisle S, South B, Anthony JA, Kalp E, Gundlapallli A, Curriero FC, Glass GE, Samore M, Perl TM. Combining free text and structured electronic medical record entries to detect acute respiratory infections. PLoS One 2010; 5:e13377. [PMID: 20976281 PMCID: PMC2954790 DOI: 10.1371/journal.pone.0013377] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.4] [Received: 05/21/2010] [Accepted: 08/30/2010] [Indexed: 11/25/2022] Open
Abstract
BACKGROUND The electronic medical record (EMR) contains a rich source of information that could be harnessed for epidemic surveillance. We asked if structured EMR data could be coupled with computerized processing of free-text clinical entries to enhance detection of acute respiratory infections (ARI). METHODOLOGY A manual review of EMR records related to 15,377 outpatient visits uncovered 280 reference cases of ARI. We used logistic regression with backward elimination to determine which among candidate structured EMR parameters (diagnostic codes, vital signs and orders for tests, imaging and medications) contributed to the detection of those reference cases. We also developed a computerized free-text search to identify clinical notes documenting at least two non-negated ARI symptoms. We then used heuristics to build case-detection algorithms that best combined the retained structured EMR parameters with the results of the text analysis. PRINCIPAL FINDINGS An adjusted grouping of diagnostic codes identified reference ARI patients with a sensitivity of 79%, a specificity of 96% and a positive predictive value (PPV) of 32%. Of the 21 additional structured clinical parameters considered, two contributed significantly to ARI detection: new prescriptions for cough remedies and elevations in body temperature to at least 38°C. Together with the diagnostic codes, these parameters increased detection sensitivity to 87%, but specificity and PPV declined to 95% and 25%, respectively. Adding text analysis increased sensitivity to 99%, but PPV dropped further to 14%. Algorithms that required satisfying both a query of structured EMR parameters as well as text analysis disclosed PPVs of 52-68% and retained sensitivities of 69-73%. CONCLUSION Structured EMR parameters and free-text analyses can be combined into algorithms that can detect ARI cases with new levels of sensitivity or precision. These results highlight potential paths by which repurposed EMR information could facilitate the discovery of epidemics before they cause mass casualties.
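The trade-off described in this abstract (adding text analysis raises sensitivity, while requiring both structured and text criteria raises PPV) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the symptom lexicon, the negation cues, the same-sentence negation rule, and the toy visits are hypothetical and are not the study's term lists, algorithm, or data.

```python
import re

# Hypothetical lexicon and negation cues (assumptions, not the study's lists).
ARI_SYMPTOMS = ("cough", "fever", "sore throat", "runny nose", "shortness of breath")
NEGATION_CUES = {"no", "not", "denies", "without"}


def count_non_negated_symptoms(note: str) -> int:
    """Count distinct symptoms with no negation cue earlier in the same sentence."""
    found = set()
    for sentence in re.split(r"[.;\n]+", note.lower()):
        for symptom in ARI_SYMPTOMS:
            pos = sentence.find(symptom)  # simple substring match, first hit only
            if pos == -1:
                continue
            preceding_words = re.findall(r"[a-z]+", sentence[:pos])
            if not any(word in NEGATION_CUES for word in preceding_words):
                found.add(symptom)
    return len(found)


def text_positive(note: str) -> bool:
    """Text criterion from the abstract: at least two non-negated ARI symptoms."""
    return count_non_negated_symptoms(note) >= 2


def metrics(predicted, actual):
    """Sensitivity, specificity, and PPV from paired boolean lists."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum(a and not p for p, a in zip(predicted, actual))
    tn = sum(not p and not a for p, a in zip(predicted, actual))

    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
    }


# Toy visits: (structured criteria met, free-text note, reference ARI case).
TOY_VISITS = [
    (True, "Cough and fever for three days.", True),
    (False, "Reports sore throat and cough. No fever.", True),
    (True, "Denies cough or fever.", False),
    (False, "Fever and runny nose after travel.", False),
    (False, "Follow-up for knee pain.", False),
]


def evaluate(rule):
    """Apply rule(structured_flag, text_flag) to every toy visit and score it."""
    predicted = [rule(s, text_positive(note)) for s, note, _ in TOY_VISITS]
    actual = [is_ari for _, _, is_ari in TOY_VISITS]
    return metrics(predicted, actual)


if __name__ == "__main__":
    print("either criterion:", evaluate(lambda s, t: s or t))
    print("both criteria:  ", evaluate(lambda s, t: s and t))
```

On these toy visits, flagging a visit when either criterion is met catches every reference case at the cost of more false positives, whereas requiring both criteria misses one case but produces no false positives, which mirrors the direction of the reported results without reproducing their magnitudes.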
Affiliation(s)
- Sylvain DeLisle
- Veterans Affairs Maryland Health Care System, Baltimore, Maryland, United States of America.
|
17
|
Chen H, Zeng D, Yan P. RODS. INTEGRATED SERIES IN INFORMATION SYSTEMS 2010. [PMCID: PMC7498900 DOI: 10.1007/978-1-4419-1278-7_8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
Abstract
The Real-time Outbreak and Disease Surveillance (RODS) system was initiated by the RODS Laboratory at the University of Pittsburgh in 1999. The system is now an open source project under the GNU license. The RODS development effort has been organized into seven functional areas: overall design, data collection, syndrome classification, database and data warehousing, outbreak detection algorithms, data access, and user interfaces. Each functional area has a coordinator for the open source project, and there is an overall coordinator responsible for the architecture, overall integration of components, and overall quality of the Java source code. Figure 8-1 illustrates the RODS system architecture. The RODS system as a syndromic surveillance application was originally deployed in Pennsylvania, Utah, and Ohio. As of 2006, RODS performs emergency department surveillance for other states (California, Illinois, Kentucky, Michigan, New Jersey, Nevada, and Wyoming) through an ASP model at the University of Pittsburgh, and through local installations in Taiwan, Canada, Mississippi, Michigan, California, and Texas. As of June 2006, about 20 regions with more than 200 healthcare facilities were connected to RODS in real time. It was also deployed during the 2002 Winter Olympics (Espino et al., 2004). It also serves as the user interface for national over-the-counter medication sales surveillance data collected through the NRDM.
|