Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Ye Y, Tsui FR, Wagner M, Espino JU, Li Q. Influenza detection from emergency department reports using natural language processing and Bayesian network classifiers. J Am Med Inform Assoc 2014;21:815-23. [PMID: 24406261 PMCID: PMC4147621 DOI: 10.1136/amiajnl-2013-001934] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.2] [Received: 04/16/2013] [Revised: 09/25/2013] [Accepted: 12/11/2013] [Indexed: 01/29/2023]

Cited by Other Article(s)

Safari N, Fang H, Veerareddy A, Xu P, Krueger F. The anatomical structure of sex differences in trust propensity: A voxel-based morphometry study. Cortex 2024;176:260-273. [PMID: 38677959 DOI: 10.1016/j.cortex.2024.02.018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/14/2023] [Revised: 08/14/2023] [Accepted: 02/28/2024] [Indexed: 04/29/2024]

Abstract

Trust is a key component of human relationships. Sex differences in trust behavior have been elucidated by parental investment theory and social role theory, attributing men's higher trust propensity to their increased engagement in physically and socially risky activities aimed at securing additional resources. Although sex differences in trust behavior exist and the neuropsychological signatures of trust are known, the underlying anatomical structure of sex differences is still unexplored. Our study aimed to investigate the anatomical structure of sex differences in trust behavior toward strangers (i.e., trust propensity, TP) by employing voxel-based morphometry (VBM) in a sample of healthy young adults. We collected behavioral data for TP as measured with participants in the role of trustors completing the one-shot trust game (TG) with anonymous partners as trustees. We conducted primary region of interest (ROI) and exploratory whole-brain (WB) VBM analyses of high-resolution structural images to test for the association between TP and regional gray matter volume (GMV) associated with sex differences. Confirming previous studies, our behavioral results demonstrated that men trusted more than women during the one-shot TG. Our WB analysis showed a greater GMV related to TP in men than women in the precuneus (PreC), whereas our ROI analysis in regions of the default-mode network (dorsomedial prefrontal cortex [dmPFC], PreC, superior temporal gyrus) to simulate the partner's trustworthiness, central-executive network (ventrolateral PFC) to implement a calculus-based trust strategy, and action-perception network (precentral gyrus) to perform cost-benefit calculations, as proposed by a neuropsychoeconomic model of trust. Our findings advance the neuropsychological understanding of sex differences in TP, which has implications for interpersonal partnerships, financial transactions, and societal engagements.

Wang H, Alanis N, Haygood L, Swoboda TK, Hoot N, Phillips D, Knowles H, Stinson SA, Mehta P, Sambamoorthi U. Using natural language processing in emergency medicine health service research: A systematic review and meta-analysis. Acad Emerg Med 2024. [PMID: 38757352 DOI: 10.1111/acem.14937] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2024] [Revised: 04/15/2024] [Accepted: 04/17/2024] [Indexed: 05/18/2024]

Wissel BD, Greiner HM, Glauser TA, Mangano FT, Holland-Bouley KD, Zhang N, Szczesniak RD, Santel D, Pestian JP, Dexheimer JW. Automated, machine learning-based alerts increase epilepsy surgery referrals: A randomized controlled trial. Epilepsia 2023;64:1791-1799. [PMID: 37102995 PMCID: PMC10524622 DOI: 10.1111/epi.17629] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 01/23/2023] [Revised: 04/25/2023] [Accepted: 04/25/2023] [Indexed: 04/28/2023]

Affiliation(s)

Benjamin D Wissel: Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, USA
Hansel M Greiner: Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, Ohio, USA; Division of Neurology, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, USA
Tracy A Glauser: Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, Ohio, USA; Division of Neurology, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, USA
Francesco T Mangano: Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, Ohio, USA; Division of Neurosurgery, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, USA
Katherine D Holland-Bouley: Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, Ohio, USA; Division of Neurology, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, USA
Nanhua Zhang: Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, Ohio, USA; Division of Biostatistics & Epidemiology, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, USA
Rhonda D Szczesniak: Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, Ohio, USA; Division of Biostatistics & Epidemiology, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, USA
Daniel Santel: Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, USA
John P Pestian: Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, USA; Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, Ohio, USA
Judith W Dexheimer: Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, USA; Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, Ohio, USA; Division of Emergency Medicine, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, USA

Dolatabadi E, Chen B, Buchan SA, Austin AM, Azimaee M, McGeer A, Mubareka S, Kwong JC. Natural Language Processing for Clinical Laboratory Data Repository Systems: Implementation and Evaluation for Respiratory Viruses. JMIR AI 2023;2:e44835. [PMID: 38875570 PMCID: PMC11057455 DOI: 10.2196/44835] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2022] [Revised: 03/31/2023] [Accepted: 04/18/2023] [Indexed: 06/16/2024]

Abstract

BACKGROUND

With the growing volume and complexity of laboratory repositories, it has become tedious to parse unstructured data into structured and tabulated formats for secondary uses such as decision support, quality assurance, and outcome analysis. However, advances in natural language processing (NLP) approaches have enabled efficient and automated extraction of clinically meaningful medical concepts from unstructured reports.

OBJECTIVE

In this study, we aimed to determine the feasibility of using the NLP model for information extraction as an alternative approach to a time-consuming and operationally resource-intensive handcrafted rule-based tool. Therefore, we sought to develop and evaluate a deep learning-based NLP model to derive knowledge and extract information from text-based laboratory reports sourced from a provincial laboratory repository system.

METHODS

The NLP model, a hierarchical multilabel classifier, was trained on a corpus of laboratory reports covering testing for 14 different respiratory viruses and viral subtypes. The corpus includes 87,500 unique laboratory reports annotated by 8 subject matter experts (SMEs). The classification task involved assigning the laboratory reports to labels at 2 levels: 24 fine-grained labels in level 1 and 6 coarse-grained labels in level 2. A "label" also refers to the status of a specific virus or strain being tested or detected (eg, influenza A is detected). The model's performance stability and variation were analyzed across all labels in the classification task. Additionally, the model's generalizability was evaluated internally and externally on various test sets.

RESULTS

Overall, the NLP model performed well on internal, out-of-time (pre-COVID-19), and external (different laboratories) test sets with microaveraged F1-scores >94% across all classes. Higher precision and recall scores with less variability were observed for the internal and pre-COVID-19 test sets. As expected, the model's performance varied across categories and virus types due to the imbalanced nature of the corpus and sample sizes per class. There were intrinsically fewer classes of viruses being detected than those tested; therefore, the model's performance (lowest F1-score of 57%) was noticeably lower in the detected cases.

CONCLUSIONS

We demonstrated that deep learning-based NLP models are promising solutions for information extraction from text-based laboratory reports. These approaches enable scalable, timely, and practical access to high-quality and encoded laboratory data if integrated into laboratory information system repositories.
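The gap reported above between a micro-averaged F1-score over 94% and a lowest per-class F1-score of 57% follows from how the two kinds of average weight rare labels. The following is a minimal Python sketch of that effect, not the authors' evaluation code. The label names and the four toy reports are hypothetical and are not taken from the study.

```python
from collections import Counter


def f1(tp, fp, fn):
    """F1 from raw counts; defined as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(y_true, y_pred):
    """Compare pooled (micro) and per-label (macro) F1 for multilabel output.

    y_true, y_pred: lists of label sets, one set per report.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for true, pred in zip(y_true, y_pred):
        for label in pred & true:
            tp[label] += 1
        for label in pred - true:
            fp[label] += 1
        for label in true - pred:
            fn[label] += 1
    labels = set(tp) | set(fp) | set(fn)
    # Micro: pool counts over all labels, so frequent labels dominate.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: score each label separately, then take the plain mean.
    per_label = {label: f1(tp[label], fp[label], fn[label]) for label in labels}
    macro = sum(per_label.values()) / len(per_label) if per_label else 0.0
    return micro, macro, per_label


# Hypothetical toy data: "tested" is common, "detected" is rare.
y_true = [{"fluA_tested"}, {"fluA_tested"},
          {"fluA_tested", "fluA_detected"}, {"fluA_tested"}]
y_pred = [{"fluA_tested"}, {"fluA_tested"},
          {"fluA_tested"}, {"fluA_tested", "fluA_detected"}]
micro, macro, per_label = micro_macro_f1(y_true, y_pred)
print(micro, macro, per_label)
```

In this toy case the frequent label is always right and the rare label is always wrong, so the pooled score stays high while the rare label's own score collapses.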

Aronis JM, Ye Y, Espino J, Hochheiser H, Michaels MG, Cooper GF. A Bayesian System to Track Outbreaks of Influenza-Like Illnesses Including Novel Diseases. medRxiv 2023:2023.05.10.23289799. [PMID: 37293033 PMCID: PMC10246032 DOI: 10.1101/2023.05.10.23289799] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]

Hao T, Wissel B, Ni Y, Pajor N, Glauser T, Pestian J, Dexheimer JW. Implementation of Machine Learning Pipelines for Clinical Practice: Development and Validation Study. JMIR Med Inform 2022;10:e37833. [PMID: 36525289 PMCID: PMC9804095 DOI: 10.2196/37833] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 03/08/2022] [Revised: 09/01/2022] [Accepted: 09/19/2022] [Indexed: 01/03/2023]

Tang KY, Hsiao CH, Hwang GJ. A scholarly network of AI research with an information science focus: Global North and Global South perspectives. PLoS One 2022;17:e0266565. [PMID: 35427381 PMCID: PMC9012391 DOI: 10.1371/journal.pone.0266565] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2021] [Accepted: 03/22/2022] [Indexed: 11/19/2022]

Abstract

This paper primarily aims to provide a citation-based method for exploring the scholarly network of artificial intelligence (AI)-related research in the information science (IS) domain, especially from Global North (GN) and Global South (GS) perspectives. Three research objectives were addressed, namely (1) the publication patterns in the field, (2) the most influential articles and researched keywords in the field, and (3) the visualization of the scholarly network between GN and GS researchers between the years 2010 and 2020. On the basis of the PRISMA statement, longitudinal research data were retrieved from the Web of Science and analyzed. Thirty-two AI-related keywords were used to retrieve relevant quality articles. Finally, 149 articles accompanying the follow-up 8838 citing articles were identified as eligible sources. A co-citation network analysis was adopted to scientifically visualize the intellectual structure of AI research in GN and GS networks. The results revealed that the United States, Australia, and the United Kingdom are the most productive GN countries; by contrast, China and India are the most productive GS countries. Next, the 10 most frequently co-cited AI research articles in the IS domain were identified. Third, the scholarly networks of AI research in the GN and GS areas were visualized. Between 2010 and 2015, GN researchers in the IS domain focused on applied research involving intelligent systems (e.g., decision support systems); between 2016 and 2020, GS researchers focused on big data applications (e.g., geospatial big data research). Both GN and GS researchers focused on technology adoption research (e.g., AI-related products and services) throughout the investigated period. Overall, this paper reveals the intellectual structure of the scholarly network on AI research and several applications in the IS literature. The findings provide research-based evidence for expanding global AI research.

Automating and Improving Cardiovascular Disease Prediction Using Machine Learning and EMR Data Features from a Regional Healthcare System. Int J Med Inform 2022;163:104786. [DOI: 10.1016/j.ijmedinf.2022.104786] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 02/15/2022] [Revised: 04/23/2022] [Accepted: 04/25/2022] [Indexed: 02/05/2023]

Detection and Prevention of Virus Infection. Advances in Experimental Medicine and Biology 2022;1368:21-52. [DOI: 10.1007/978-981-16-8969-7_2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/07/2023]

Chamola V, Hassija V, Gupta S, Goyal A, Guizani M, Sikdar B. Disaster and Pandemic Management Using Machine Learning: A Survey. IEEE Internet of Things Journal 2021;8:16047-16071. [PMID: 35782181 PMCID: PMC8768997 DOI: 10.1109/jiot.2020.3044966] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 10/16/2020] [Revised: 11/26/2020] [Accepted: 12/10/2020] [Indexed: 05/14/2023]

Dipaola F, Shiffer D, Gatti M, Menè R, Solbiati M, Furlan R. Machine Learning and Syncope Management in the ED: The Future Is Coming. Medicina (Kaunas) 2021;57:351. [PMID: 33917508 PMCID: PMC8067452 DOI: 10.3390/medicina57040351] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 03/06/2021] [Revised: 03/30/2021] [Accepted: 04/02/2021] [Indexed: 11/16/2022]

Oliveira CR, Niccolai P, Ortiz AM, Sheth SS, Shapiro ED, Niccolai LM, Brandt CA. Natural Language Processing for Surveillance of Cervical and Anal Cancer and Precancer: Algorithm Development and Split-Validation Study. JMIR Med Inform 2020;8:e20826. [PMID: 32469840 PMCID: PMC7671846 DOI: 10.2196/20826] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 05/29/2020] [Revised: 09/18/2020] [Accepted: 10/04/2020] [Indexed: 12/13/2022]

Abstract

Background

Accurate identification of new diagnoses of human papillomavirus–associated cancers and precancers is an important step toward the development of strategies that optimize the use of human papillomavirus vaccines. The diagnosis of human papillomavirus cancers hinges on a histopathologic report, which is typically stored in electronic medical records as free-form, or unstructured, narrative text. Previous efforts to perform surveillance for human papillomavirus cancers have relied on the manual review of pathology reports to extract diagnostic information, a process that is both labor- and resource-intensive. Natural language processing can be used to automate the structuring and extraction of clinical data from unstructured narrative text in medical records and may provide a practical and effective method for identifying patients with vaccine-preventable human papillomavirus disease for surveillance and research.

Objective

This study's objective was to develop and assess the accuracy of a natural language processing algorithm for the identification of individuals with cancer or precancer of the cervix and anus.

Methods

A pipeline-based natural language processing algorithm was developed, which incorporated machine learning and rule-based methods to extract diagnostic elements from the narrative pathology reports. To test the algorithm’s classification accuracy, we used a split-validation study design. Full-length cervical and anal pathology reports were randomly selected from 4 clinical pathology laboratories. Two study team members, blinded to the classifications produced by the natural language processing algorithm, manually and independently reviewed all reports and classified them at the document level according to 2 domains (diagnosis and human papillomavirus testing results). Using the manual review as the gold standard, the algorithm’s performance was evaluated using standard measurements of accuracy, recall, precision, and F-measure.

Results

The natural language processing algorithm’s performance was validated on 949 pathology reports. The algorithm demonstrated accurate identification of abnormal cytology, histology, and positive human papillomavirus tests with accuracies greater than 0.91. Precision was lowest for anal histology reports (0.87, 95% CI 0.59-0.98) and highest for cervical cytology (0.98, 95% CI 0.95-0.99). The natural language processing algorithm missed 2 out of the 15 abnormal anal histology reports, which led to a relatively low recall (0.68, 95% CI 0.43-0.87).

Conclusions

This study outlines the development and validation of a freely available and easily implementable natural language processing algorithm that can automate the extraction and classification of clinical data from cervical and anal cytology and histology.

Kasson PM. Infectious Disease Research in the Era of Big Data. Annu Rev Biomed Data Sci 2020. [DOI: 10.1146/annurev-biodatasci-121219-025722] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 11/09/2022]

Lu W, Ng R. Automated Analysis of Public Health Laboratory Test Results. AMIA Joint Summits on Translational Science Proceedings 2020;2020:393-402. [PMID: 32477660 PMCID: PMC7233052] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]

Tsui F, Ye Y, Ruiz V, Cooper GF, Wagner MM. Automated influenza case detection for public health surveillance and clinical diagnosis using dynamic influenza prevalence method. J Public Health (Oxf) 2019;40:878-885. [PMID: 29059331 DOI: 10.1093/pubmed/fdx141] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 01/30/2017] [Indexed: 11/13/2022]

A Review of Automatic Phenotyping Approaches using Electronic Health Records. Electronics 2019. [DOI: 10.3390/electronics8111235] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Indexed: 12/12/2022]

Shafaf N, Malek H. Applications of Machine Learning Approaches in Emergency Medicine; a Review Article. Archives of Academic Emergency Medicine 2019;7:34. [PMID: 31555764 PMCID: PMC6732202] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/30/2022]

Conway M, Keyhani S, Christensen L, South BR, Vali M, Walter LC, Mowery DL, Abdelrahman S, Chapman WW. Moonstone: a novel natural language processing system for inferring social risk from clinical narratives. J Biomed Semantics 2019;10:6. [PMID: 30975223 PMCID: PMC6458709 DOI: 10.1186/s13326-019-0198-0] [Citation(s) in RCA: 37] [Impact Index Per Article: 7.4] [Received: 10/01/2018] [Accepted: 03/18/2019] [Indexed: 11/10/2022]

Abstract

Background

Social risk factors are important dimensions of health and are linked to access to care, quality of life, health outcomes and life expectancy. However, in the Electronic Health Record, data related to many social risk factors are primarily recorded in free-text clinical notes, rather than as more readily computable structured data, and hence cannot currently be easily incorporated into automated assessments of health. In this paper, we present Moonstone, a new, highly configurable rule-based clinical natural language processing system designed to automatically extract information that requires inferencing from clinical notes. Our initial use case for the tool is focused on the automatic extraction of social risk factor information — in this case, housing situation, living alone, and social support — from clinical notes. Nursing notes, social work notes, emergency room physician notes, primary care notes, hospital admission notes, and discharge summaries, all derived from the Veterans Health Administration, were used for algorithm development and evaluation.

Results

An evaluation of Moonstone demonstrated that the system is highly accurate in extracting and classifying the three variables of interest (housing situation, living alone, and social support). The system achieved positive predictive value (i.e. precision) scores ranging from 0.66 (homeless/marginally housed) to 0.98 (lives at home/not homeless), accuracy scores ranging from 0.63 (lives in facility) to 0.95 (lives alone), and sensitivity (i.e. recall) scores ranging from 0.75 (lives in facility) to 0.97 (lives alone).

Conclusions

The Moonstone system is — to the best of our knowledge — the first freely available, open source natural language processing system designed to extract social risk factors from clinical text with good (lives in facility) to excellent (lives alone) performance. Although developed with the social risk factor identification task in mind, Moonstone provides a powerful tool to address a range of clinical natural language processing tasks, especially those tasks that require nuanced linguistic processing in conjunction with inference capabilities.

Smooth Bayesian network model for the prediction of future high-cost patients with COPD. Int J Med Inform 2019;126:147-155. [PMID: 31029256 DOI: 10.1016/j.ijmedinf.2019.03.017] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 09/05/2018] [Revised: 02/28/2019] [Accepted: 03/26/2019] [Indexed: 02/05/2023]

Abstract

INTRODUCTION

The clinical course of chronic obstructive pulmonary disease (COPD) is marked by acute exacerbation events that increase hospitalization rates and healthcare spending. The early identification of future high-cost patients with COPD may decrease healthcare spending by informing individualized interventions that prevent exacerbation events and decelerate disease progression. Existing studies of cost prediction of other chronic diseases have applied regression and machine-learning methods that cannot capture the complex causal relationships between COPD factors. Thus, the exploration of these factors through nonlinear, high-dimensional but explainable modeling is greatly needed.

OBJECTIVES

We aimed to develop a machine-learning model to identify future high-cost patients with COPD. Such a model should incorporate expert knowledge about causal relationships, and the method for estimating the model could provide more accurate predictions than other machine learning methods.

METHODS

We used the 2011-2013 medical insurance data of patients with COPD in a large city. The data set included demographic information and admission records. Leveraging developments in graphical modeling methods, we proposed a smooth Bayesian network (SBN) model for the prediction of high-cost individuals using medical insurance data. The modeling method incorporated some expert knowledge about causal relationships (i.e., about the Bayesian network structure). We employed a smoothing kernel based on the weighted nearest neighborhood method in the SBN model to address overfitting, case-mix effect, and data sparsity (i.e., using data about "similar patients").

RESULTS

The proposed SBN achieved an area under the curve (AUC) of 0.80 and showed considerable improvement over the baseline machine-learning methods. Besides confirming the known factors from the literature, we found "region" (i.e., a suburban or urban area) to be a significant factor, and that in a 3-tier system with primary, secondary and tertiary hospitals, COPD patients who had been admitted to primary hospitals were more likely to develop into future high-cost patients than patients who had been admitted to tertiary hospitals.

CONCLUSION

The proposed SBN model not only obtained higher prediction accuracy and stronger generalizability than a number of benchmark machine-learning methods, but also used the Bayesian network to capture the complex causal relationships between different predictors by incorporating expert knowledge. Furthermore, a framework was developed to establish the relationships between exposure to historical trajectory and future outcome, which can also be applied to other temporal data to model different trajectory information and predict other outcomes.

Zhou S, Ren X, Yang J, Jin Q. Evaluating the Value of Defensins for Diagnosing Secondary Bacterial Infections in Influenza-Infected Patients. Front Microbiol 2018;9:2762. [PMID: 30524393 PMCID: PMC6256186 DOI: 10.3389/fmicb.2018.02762] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 08/30/2018] [Accepted: 10/29/2018] [Indexed: 11/13/2022]

Tou H, Yao L, Wei Z, Zhuang X, Zhang B. Automatic infection detection based on electronic medical records. BMC Bioinformatics 2018;19:117. [PMID: 29671399 PMCID: PMC5907141 DOI: 10.1186/s12859-018-2101-x] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Indexed: 01/20/2023]

Kennell TI, Willig JH, Cimino JJ. Clinical Informatics Researcher's Desiderata for the Data Content of the Next Generation Electronic Health Record. Appl Clin Inform 2017;8:1159-1172. [PMID: 29270955 DOI: 10.4338/aci-2017-06-r-0101] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 01/29/2023]

Abstract

OBJECTIVE

Clinical informatics researchers depend on the availability of high-quality data from the electronic health record (EHR) to design and implement new methods and systems for clinical practice and research. However, these data are frequently unavailable or present in a format that requires substantial revision. This article reports the results of a review of informatics literature published from 2010 to 2016 that addresses these issues by identifying categories of data content that might be included or revised in the EHR.

MATERIALS AND METHODS

We used an iterative review process on 1,215 biomedical informatics research articles. We placed them into generic categories, reviewed and refined the categories, and then assigned additional articles, for a total of three iterations.

RESULTS

Our process identified eight categories of data content issues: Adverse Events, Clinician Cognitive Processes, Data Standards Creation and Data Communication, Genomics, Medication List Data Capture, Patient Preferences, Patient-reported Data, and Phenotyping.

DISCUSSION

These categories summarize discussions in biomedical informatics literature that concern data content issues restricting clinical informatics research. These barriers to research result from data that are either absent from the EHR or are inadequate (e.g., in narrative text form) for the downstream applications of the data. In light of these categories, we discuss changes to EHR data storage that should be considered in the redesign of EHRs, to promote continued innovation in clinical informatics.

CONCLUSION

Based on published literature of clinical informaticians' reuse of EHR data, we characterize eight types of data content that, if included in the next generation of EHRs, would find immediate application in advanced informatics tools and techniques.

Walsh JA, Shao Y, Leng J, He T, Teng CC, Redd D, Treitler Zeng Q, Burningham Z, Clegg DO, Sauer BC. Identifying Axial Spondyloarthritis in Electronic Medical Records of US Veterans. Arthritis Care Res (Hoboken) 2017;69:1414-1420. [PMID: 27813310 DOI: 10.1002/acr.23140] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 05/18/2016] [Revised: 10/19/2016] [Accepted: 11/01/2016] [Indexed: 11/09/2022]

Ferraro JP, Ye Y, Gesteland PH, Haug PJ, Tsui FR, Cooper GF, Van Bree R, Ginter T, Nowalk AJ, Wagner M. The effects of natural language processing on cross-institutional portability of influenza case detection for disease surveillance. Appl Clin Inform 2017;8:560-580. [PMID: 28561130 PMCID: PMC6241736 DOI: 10.4338/aci-2016-12-ra-0211] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 12/31/2016] [Accepted: 03/11/2017] [Indexed: 11/23/2022]

Abstract

OBJECTIVES

This study evaluates the accuracy and portability of a natural language processing (NLP) tool for extracting clinical findings of influenza from clinical notes across two large healthcare systems. Effectiveness is evaluated on how well NLP supports downstream influenza case-detection for disease surveillance.

METHODS

We independently developed two NLP parsers, one at Intermountain Healthcare (IH) in Utah and the other at University of Pittsburgh Medical Center (UPMC) using local clinical notes from emergency department (ED) encounters of influenza. We measured NLP parser performance for the presence and absence of 70 clinical findings indicative of influenza. We then developed Bayesian network models from NLP processed reports and tested their ability to discriminate among cases of (1) influenza, (2) non-influenza influenza-like illness (NI-ILI), and (3) 'other' diagnosis.

RESULTS

On Intermountain Healthcare reports, recall and precision of the IH NLP parser were 0.71 and 0.75, respectively, and UPMC NLP parser, 0.67 and 0.79. On University of Pittsburgh Medical Center reports, recall and precision of the UPMC NLP parser were 0.73 and 0.80, respectively, and IH NLP parser, 0.53 and 0.80. Bayesian case-detection performance measured by AUROC for influenza versus non-influenza on Intermountain Healthcare cases was 0.93 (using IH NLP parser) and 0.93 (using UPMC NLP parser). Case-detection on University of Pittsburgh Medical Center cases was 0.95 (using UPMC NLP parser) and 0.83 (using IH NLP parser). For influenza versus NI-ILI on Intermountain Healthcare cases performance was 0.70 (using IH NLP parser) and 0.76 (using UPMC NLP parser). On University of Pittsburgh Medical Center cases, 0.76 (using UPMC NLP parser) and 0.65 (using IH NLP parser).

CONCLUSION

In all but one instance (influenza versus NI-ILI using IH cases), local parsers were more effective at supporting case-detection although performances of non-local parsers were reasonable.

Ye Y, Wagner MM, Cooper GF, Ferraro JP, Su H, Gesteland PH, Haug PJ, Millett NE, Aronis JM, Nowalk AJ, Ruiz VM, López Pineda A, Shi L, Van Bree R, Ginter T, Tsui F. A study of the transferability of influenza case detection systems between two large healthcare systems. PLoS One 2017;12:e0174970. [PMID: 28380048 PMCID: PMC5381795 DOI: 10.1371/journal.pone.0174970] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/27/2016] [Accepted: 03/17/2017] [Indexed: 01/16/2023] Open

Abstract

Objectives

This study evaluates the accuracy and transferability of Bayesian case detection systems (BCD) that use clinical notes from the emergency department (ED) to detect influenza cases.

Methods

A BCD uses natural language processing (NLP) to infer the presence or absence of clinical findings from ED notes, which are fed into a Bayesian network classifier (BN) to infer patients’ diagnoses. We developed BCDs at the University of Pittsburgh Medical Center (BCD_UPMC) and Intermountain Healthcare in Utah (BCD_IH). At each site, we manually built a rule-based NLP parser and trained a Bayesian network classifier from over 40,000 ED encounters between Jan. 2008 and May 2010 using feature selection, machine learning, and an expert debiasing approach. Transferability of a BCD in this study may be impacted by seven factors: development (source) institution, development parser, application (target) institution, application parser, NLP transfer, BN transfer, and classification task. We employed an ANOVA analysis to study their impacts on BCD performance.

Results

Both BCDs discriminated well between influenza and non-influenza on local test cases (AUCs > 0.92). When tested for transferability using the other institution’s cases, BCD_UPMC discriminations declined minimally (AUC decreased from 0.95 to 0.94, p<0.01), and BCD_IH discriminations declined more (from 0.93 to 0.87, p<0.0001). We attributed the BCD_IH decline to the lower recall of the IH parser on UPMC notes. The ANOVA analysis showed five significant factors: development parser, application institution, application parser, BN transfer, and classification task.

Conclusion

We demonstrated high influenza case detection performance in two large healthcare systems in two geographically separated regions, providing evidentiary support for the use of automated case detection from routinely collected electronic clinical notes in national influenza surveillance. The transferability could be improved by training the Bayesian network classifier locally and increasing the accuracy of the NLP parser.

Affiliation(s)

Ye Ye Real-time Outbreak and Disease Surveillance Laboratory, Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America Intelligent Systems Program, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
Michael M. Wagner Real-time Outbreak and Disease Surveillance Laboratory, Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America Intelligent Systems Program, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
Gregory F. Cooper Real-time Outbreak and Disease Surveillance Laboratory, Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America Intelligent Systems Program, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
Jeffrey P. Ferraro Department of Biomedical Informatics, University of Utah, Salt Lake City, Utah, United States of America Intermountain Healthcare, Salt Lake City, Utah, United States of America
Howard Su Real-time Outbreak and Disease Surveillance Laboratory, Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
Per H. Gesteland Department of Biomedical Informatics, University of Utah, Salt Lake City, Utah, United States of America Intermountain Healthcare, Salt Lake City, Utah, United States of America Department of Pediatrics, University of Utah, Salt Lake City, Utah, United States of America
Peter J. Haug Department of Biomedical Informatics, University of Utah, Salt Lake City, Utah, United States of America Intermountain Healthcare, Salt Lake City, Utah, United States of America
Nicholas E. Millett Real-time Outbreak and Disease Surveillance Laboratory, Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
John M. Aronis Real-time Outbreak and Disease Surveillance Laboratory, Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
Andrew J. Nowalk Department of Pediatrics, Children's Hospital of Pittsburgh of UPMC, Pittsburgh, Pennsylvania, United States of America
Victor M. Ruiz Real-time Outbreak and Disease Surveillance Laboratory, Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
Arturo López Pineda Department of Genetics, Stanford University School of Medicine, Stanford, California, United States of America
Lingyun Shi Real-time Outbreak and Disease Surveillance Laboratory, Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
Rudy Van Bree Intermountain Healthcare, Salt Lake City, Utah, United States of America
Thomas Ginter VA Salt Lake City Healthcare System, Salt Lake City, Utah, United States of America
Fuchiang Tsui Real-time Outbreak and Disease Surveillance Laboratory, Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America Intelligent Systems Program, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America

Gaut G, Steyvers M, Imel ZE, Atkins DC, Smyth P. Content Coding of Psychotherapy Transcripts Using Labeled Topic Models. IEEE J Biomed Health Inform 2017;21:476-487. [PMID: 26625437 PMCID: PMC4879602 DOI: 10.1109/jbhi.2015.2503985] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/24/2022]

Abstract

Psychotherapy represents a broad class of medical interventions received by millions of patients each year. Unlike most medical treatments, its primary mechanisms are linguistic; i.e., the treatment relies directly on a conversation between a patient and provider. However, the evaluation of patient-provider conversation suffers from critical shortcomings, including intensive labor requirements, coder error, nonstandardized coding systems, and inability to scale up to larger data sets. To overcome these shortcomings, psychotherapy analysis needs a reliable and scalable method for summarizing the content of treatment encounters. We used a publicly available psychotherapy corpus from Alexander Street Press comprising a large collection of transcripts of patient-provider conversations to compare coding performance for two machine learning methods. We used the labeled latent Dirichlet allocation (L-LDA) model to learn associations between text and codes, to predict codes in psychotherapy sessions, and to localize specific passages of within-session text representative of a session code. We compared the L-LDA model to a baseline lasso regression model using predictive accuracy and model generalizability (measured by calculating the area under the curve (AUC) from the receiver operating characteristic curve). The L-LDA model outperforms the lasso logistic regression model at predicting session-level codes, with average AUC scores of 0.79 and 0.70, respectively. For fine-grained level coding, L-LDA and logistic regression are able to identify specific talk-turns representative of symptom codes. However, model performance for talk-turn identification is not yet as reliable as human coders. We conclude that the L-LDA model has the potential to be an objective, scalable method for accurate automated coding of psychotherapy sessions that performs better than comparable discriminative methods at session-level coding and can also predict fine-grained codes.

Ford E, Carroll JA, Smith HE, Scott D, Cassell JA. Extracting information from the text of electronic medical records to improve case detection: a systematic review. J Am Med Inform Assoc 2016;23:1007-15. [PMID: 26911811 PMCID: PMC4997034 DOI: 10.1093/jamia/ocv180] [Citation(s) in RCA: 205] [Impact Index Per Article: 25.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/12/2015] [Revised: 10/13/2015] [Accepted: 10/26/2015] [Indexed: 12/15/2022] Open

Abstract

BACKGROUND

Electronic medical records (EMRs) are revolutionizing health-related research. One key issue for study quality is the accurate identification of patients with the condition of interest. Information in EMRs can be entered as structured codes or unstructured free text. The majority of research studies have used only coded parts of EMRs for case-detection, which may bias findings, miss cases, and reduce study quality. This review examines whether incorporating information from text into case-detection algorithms can improve research quality.

METHODS

A systematic search returned 9659 papers, 67 of which reported on the extraction of information from free text of EMRs with the stated purpose of detecting cases of a named clinical condition. Methods for extracting information from text and the technical accuracy of case-detection algorithms were reviewed.

RESULTS

Studies mainly used US hospital-based EMRs, and extracted information from text for 41 conditions using keyword searches, rule-based algorithms, and machine learning methods. There was no clear difference in case-detection algorithm accuracy between rule-based and machine learning methods of extraction. Inclusion of information from text resulted in a significant improvement in algorithm sensitivity and area under the receiver operating characteristic in comparison to codes alone (median sensitivity 78% (codes + text) vs 62% (codes), P = .03; median area under the receiver operating characteristic 95% (codes + text) vs 88% (codes), P = .025).

CONCLUSIONS

Text in EMRs is accessible, especially with open source information extraction algorithms, and significantly improves case detection when combined with codes. More harmonization of reporting within EMR studies is needed, particularly standardized reporting of algorithm accuracy metrics like positive predictive value (precision) and sensitivity (recall).

López Pineda A, Ye Y, Visweswaran S, Cooper GF, Wagner MM, Tsui FR. Comparison of machine learning classifiers for influenza detection from emergency department free-text reports. J Biomed Inform 2015;58:60-69. [PMID: 26385375 PMCID: PMC4684714 DOI: 10.1016/j.jbi.2015.08.019] [Citation(s) in RCA: 56] [Impact Index Per Article: 6.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/02/2015] [Revised: 05/28/2015] [Accepted: 08/21/2015] [Indexed: 12/31/2022]

Han D, Wang S, Jiang C, Jiang X, Kim HE, Sun J, Ohno-Machado L. Trends in biomedical informatics: automated topic analysis of JAMIA articles. J Am Med Inform Assoc 2015;22:1153-63. [PMID: 26555018 PMCID: PMC5009912 DOI: 10.1093/jamia/ocv157] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/31/2015] [Revised: 09/08/2015] [Accepted: 09/14/2015] [Indexed: 01/26/2023] Open

Visualizing the structure and the evolving of digital medicine: a scientometrics review. Scientometrics 2015. [DOI: 10.1007/s11192-015-1696-1] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/23/2022]

An Introduction to Natural Language Processing: How You Can Get More From Those Electronic Notes You Are Generating. Pediatr Emerg Care 2015;31:536-41. [PMID: 26148107 DOI: 10.1097/pec.0000000000000484] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 12/26/2022]

Li Q, Spooner SA, Kaiser M, Lingren N, Robbins J, Lingren T, Tang H, Solti I, Ni Y. An end-to-end hybrid algorithm for automated medication discrepancy detection. BMC Med Inform Decis Mak 2015;15:37. [PMID: 25943550 PMCID: PMC4427951 DOI: 10.1186/s12911-015-0160-8] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/15/2014] [Accepted: 04/27/2015] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

In this study we implemented and developed state-of-the-art machine learning (ML) and natural language processing (NLP) technologies and built a computerized algorithm for medication reconciliation. Our specific aims are: (1) to develop a computerized algorithm for medication discrepancy detection between patients' discharge prescriptions (structured data) and medications documented in free-text clinical notes (unstructured data); and (2) to assess the performance of the algorithm on real-world medication reconciliation data.

METHODS

We collected clinical notes and discharge prescription lists for all 271 patients enrolled in the Complex Care Medical Home Program at Cincinnati Children's Hospital Medical Center between 1/1/2010 and 12/31/2013. A double-annotated, gold-standard set of medication reconciliation data was created for this collection. We then developed a hybrid algorithm consisting of three processes: (1) an ML algorithm to identify medication entities from clinical notes, (2) a rule-based method to link medication names with their attributes, and (3) an NLP-based, hybrid approach to match medications with structured prescriptions in order to detect medication discrepancies. The performance was validated on the gold-standard medication reconciliation data, where precision (P), recall (R), F-value (F) and workload were assessed.

RESULTS

The hybrid algorithm achieved 95.0%/91.6%/93.3% of P/R/F on medication entity detection and 98.7%/99.4%/99.1% of P/R/F on attribute linkage. The medication matching achieved 92.4%/90.7%/91.5% (P/R/F) on identifying matched medications in the gold-standard and 88.6%/82.5%/85.5% (P/R/F) on discrepant medications. By combining all processes, the algorithm achieved 92.4%/90.7%/91.5% (P/R/F) and 71.5%/65.2%/68.2% (P/R/F) on identifying the matched and the discrepant medications, respectively. The error analysis on algorithm outputs identified challenges to be addressed in order to improve medication discrepancy detection.

CONCLUSION

By leveraging ML and NLP technologies, an end-to-end, computerized algorithm achieves promising outcomes in reconciling medications between clinical notes and discharge prescriptions.
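
The P/R/F triples reported in this abstract can be related with a short sketch. It assumes that the F-value is the balanced F-measure (the harmonic mean of precision and recall), which the abstract does not state explicitly; the function name `f_measure` and the task labels are illustrative only. Because the inputs are the rounded percentages from the abstract, recomputed values may differ from the reported F-values in the last digit.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall.

    Assumption: the abstract's "F-value" is this balanced form.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs in percent, as reported in the abstract.
# The dictionary keys are illustrative labels, not terms from the paper.
reported = {
    "medication entity detection": (95.0, 91.6),
    "attribute linkage": (98.7, 99.4),
    "matched medications (end-to-end)": (92.4, 90.7),
    "discrepant medications (end-to-end)": (71.5, 65.2),
}

for task, (p, r) in reported.items():
    print(f"{task}: P={p:.1f} R={r:.1f} F={f_measure(p, r):.1f}")
```

The harmonic mean lies closer to the lower of the two inputs, which is why the end-to-end discrepancy result (P 71.5%, R 65.2%) yields an F-value nearer to the recall figure.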

Cooper GF, Villamarin R, Rich Tsui FC, Millett N, Espino JU, Wagner MM. A method for detecting and characterizing outbreaks of infectious disease from clinical reports. J Biomed Inform 2014;53:15-26. [PMID: 25181466 DOI: 10.1016/j.jbi.2014.08.011] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/05/2014] [Revised: 08/04/2014] [Accepted: 08/22/2014] [Indexed: 11/30/2022]