1
Hobensack M, Song J, Oh S, Evans L, Davoudi A, Bowles KH, McDonald MV, Barrón Y, Sridharan S, Wallace AS, Topaz M. Social Risk Factors are Associated with Risk for Hospitalization in Home Health Care: A Natural Language Processing Study. J Am Med Dir Assoc 2023; 24:1874-1880.e4. [PMID: 37553081 PMCID: PMC10839109 DOI: 10.1016/j.jamda.2023.06.031] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/04/2023] [Revised: 06/23/2023] [Accepted: 06/25/2023] [Indexed: 08/10/2023]
Abstract
OBJECTIVE This study aimed to develop a natural language processing (NLP) system that identified social risk factors in home health care (HHC) clinical notes and to examine the association between social risk factors and hospitalization or an emergency department (ED) visit. DESIGN Retrospective cohort study. SETTING AND PARTICIPANTS We used standardized assessments and clinical notes from one HHC agency located in the northeastern United States. This included 86,866 episodes of care for 65,593 unique patients. Patients received HHC services between 2015 and 2017. METHODS Guided by HHC experts, we created a vocabulary of social risk factors that influence hospitalization or ED visit risk in the HHC setting. We then developed an NLP system to automatically identify social risk factors documented in clinical notes. We used an adjusted logistic regression model to examine the association between the NLP-based social risk factors and hospitalization or an ED visit. RESULTS On the basis of expert consensus, the following social risk factors emerged: Social Environment, Physical Environment, Education and Literacy, Food Insecurity, Access to Care, and Housing and Economic Circumstances. Our NLP system performed "very good" with an F score of 0.91. Approximately 4% of clinical notes (33% episodes of care) documented a social risk factor. The most frequently documented social risk factors were Physical Environment and Social Environment. Except for Housing and Economic Circumstances, all NLP-based social risk factors were associated with higher odds of hospitalization and ED visits. CONCLUSIONS AND IMPLICATIONS HHC clinicians assess and document social risk factors associated with hospitalizations and ED visits in their clinical notes. Future studies can explore the social risk factors documented in HHC to improve communication across the health care system and to predict patients at risk for being hospitalized or visiting the ED.
Affiliation(s)
- Jiyoun Song
- Columbia University School of Nursing, New York City, NY, USA
- Sungho Oh
- University of Pennsylvania School of Nursing, Philadelphia, PA, USA
- Lauren Evans
- Center for Home Care Policy & Research, VNS Health, New York, NY, USA
- Anahita Davoudi
- Center for Home Care Policy & Research, VNS Health, New York, NY, USA
- Kathryn H Bowles
- Center for Home Care Policy & Research, VNS Health, New York, NY, USA; Department of Biobehavioral Health Sciences, NewCourtland Center for Transitions and Health, University of Pennsylvania School of Nursing, Philadelphia, PA, USA
- Yolanda Barrón
- Center for Home Care Policy & Research, VNS Health, New York, NY, USA
- Sridevi Sridharan
- Center for Home Care Policy & Research, VNS Health, New York, NY, USA
- Andrea S Wallace
- The University of Utah College of Nursing, Salt Lake City, UT, USA
- Maxim Topaz
- Columbia University School of Nursing, New York City, NY, USA; Center for Home Care Policy & Research, VNS Health, New York, NY, USA; Data Science Institute, Columbia University, New York City, NY, USA
2
Mitha S, Schwartz J, Hobensack M, Cato K, Woo K, Smaldone A, Topaz M. Natural Language Processing of Nursing Notes: An Integrative Review. Comput Inform Nurs 2023; 41:377-384. [PMID: 36730744 DOI: 10.1097/cin.0000000000000967] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/04/2023]
Abstract
Natural language processing includes a variety of techniques that help to extract meaning from narrative data. In healthcare, medical natural language processing has been a growing field of study; however, little is known about its use in nursing. We searched PubMed, EMBASE, and CINAHL and found 689 studies, narrowed to 43 eligible studies using natural language processing in nursing notes. Data related to the study purpose, patient population, methodology, performance evaluation metrics, and quality indicators were extracted for each study. The majority (86%) of the studies were conducted from 2015 to 2021. Most of the studies (58%) used inpatient data. One of four studies used data from open-source databases. The most common standard terminologies used were the Unified Medical Language System and Systematized Nomenclature of Medicine, whereas nursing-specific standard terminologies were used only in eight studies. Full system performance metrics (eg, F score) were reported for 61% of applicable studies. The overall number of nursing natural language processing publications remains relatively small compared with the other medical literature. Future studies should evaluate and report appropriate performance metrics and use existing standard nursing terminologies to enable future scalability of the methods and findings.
Affiliation(s)
- Shazia Mitha
- Author Affiliations: Columbia University School of Nursing, New York
3
Jeon E, Kim A, Lee J, Heo H, Lee H, Woo K. Developing a Classification Algorithm for Prediabetes Risk Detection From Home Care Nursing Notes: Using Natural Language Processing. Comput Inform Nurs 2023:00024665-990000000-00087. [PMID: 37165830 DOI: 10.1097/cin.0000000000001000] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/12/2023]
Abstract
This study developed and validated a rule-based classification algorithm for prediabetes risk detection using natural language processing from home care nursing notes. First, we developed prediabetes-related symptomatic terms in English and Korean. Second, we used natural language processing to preprocess the notes. Third, we created a rule-based classification algorithm with 31 484 notes, excluding 315 instances of missing data. The final algorithm was validated by measuring accuracy, precision, recall, and the F1 score against a gold standard testing set (400 notes). The developed terms comprised 11 categories and 1639 words in Korean and 1181 words in English. Using the rule-based classification algorithm, 42.2% of the notes comprised one or more prediabetic symptoms. The algorithm achieved high performance when applied to the gold standard testing set. We proposed a rule-based natural language processing algorithm to optimize the classification of the prediabetes risk group, depending on whether the home care nursing notes contain prediabetes-related symptomatic terms. Tokenization based on white space and the rule-based algorithm were brought into effect to detect the prediabetes symptomatic terms. Applying this algorithm to electronic health records systems will increase the possibility of preventing diabetes onset through early detection of risk groups and provision of tailored intervention.
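The approach described in this abstract (whitespace tokenization, matching against a symptom vocabulary, validation with precision, recall, and F1) can be illustrated with a minimal sketch. The term list `SYMPTOM_TERMS` and all function names are hypothetical; the study's actual vocabulary of 1639 Korean and 1181 English words and its exact rules are not reproduced here, and this sketch matches single-word terms only.

```python
def tokenize(note: str) -> list[str]:
    """Split a note on white space, lowercase, and strip edge punctuation."""
    return [tok.strip(".,;:!?()").lower() for tok in note.split()]

# Hypothetical vocabulary for illustration only (not the study's term list).
SYMPTOM_TERMS = {"thirst", "fatigue", "polyuria"}

def has_symptom(note: str, terms=SYMPTOM_TERMS) -> bool:
    """Rule: flag the note if any token appears in the vocabulary."""
    return any(tok in terms for tok in tokenize(note))

def precision_recall_f1(predicted, gold):
    """Compare predicted flags with gold-standard labels."""
    tp = sum(1 for p, g in zip(predicted, gold) if p and g)
    fp = sum(1 for p, g in zip(predicted, gold) if p and not g)
    fn = sum(1 for p, g in zip(predicted, gold) if not p and g)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1
```

In practice the predicted flags would come from running `has_symptom` over the annotated test notes, and the gold labels from the expert annotation.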
Affiliation(s)
- Eunjoo Jeon
- Author Affiliations: Technology Research, SamsungSDS (Dr Jeon); College of Nursing, Seoul National University (Mss Kim, J. Lee, and H. Lee and Dr Woo); and Seoul National University Hospital (Ms Heo), Seoul, South Korea
4
Chae S, Song J, Ojo M, Bowles KH, McDonald MV, Barrón Y, Hobensack M, Kennedy E, Sridharan S, Evans L, Topaz M. Factors associated with poor self-management documented in home health care narrative notes for patients with heart failure. Heart Lung 2022; 55:148-154. [PMID: 35597164 PMCID: PMC11021173 DOI: 10.1016/j.hrtlng.2022.05.004] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 10/03/2021] [Revised: 05/03/2022] [Accepted: 05/07/2022] [Indexed: 11/04/2022]
Abstract
BACKGROUND Patients with heart failure (HF) who actively engage in their own self-management have better outcomes. Extracting data through natural language processing (NLP) holds great promise for identifying patients with or at risk of poor self-management. OBJECTIVE To identify home health care (HHC) patients with HF who have poor self-management using NLP of narrative notes, and to examine patient factors associated with poor self-management. METHODS An NLP algorithm was applied to extract poor self-management documentation using 353,718 HHC narrative notes of 9,710 patients with HF. Sociodemographic and structured clinical data were incorporated into multivariate logistic regression models to identify factors associated with poor self-management. RESULTS There were 758 (7.8%) patients in this sample identified as having notes with language describing poor HF self-management. Younger age (OR 0.982, 95% CI 0.976-0.987, p < .001), longer length of stay in HHC (OR 1.036, 95% CI 1.029-1.043, p < .001), diagnosis of diabetes (OR 1.47, 95% CI 1.3-1.67, p < .001) and depression (OR 1.36, 95% CI 1.09-1.68, p < .01), impaired decision-making (OR 1.64, 95% CI 1.37-1.95, p < .001), smoking (OR 1.7, 95% CI 1.4-2.04, p < .001), and shortness of breath with exertion (OR 1.25, 95% CI 1.1-1.42, p < .01) were associated with poor self-management. CONCLUSIONS Patients with HF who have poor self-management can be identified from the narrative notes in HHC using novel NLP methods. Meaningful information about the self-management of patients with HF can support HHC clinicians in developing individualized care plans to improve self-management and clinical outcomes.
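The odds ratios reported in this abstract come from logistic regression, where an odds ratio is the exponential of a fitted coefficient and a Wald 95% confidence interval is obtained by exponentiating the coefficient plus or minus 1.96 standard errors. The sketch below shows that arithmetic. The function names are hypothetical, and the abstract does not report coefficients or standard errors, so any inputs are illustrative.

```python
import math

def odds_ratio_with_ci(beta: float, se: float, z: float = 1.96):
    """Return (OR, lower, upper) for a logistic regression coefficient.

    beta is the log-odds coefficient and se its standard error;
    z = 1.96 gives an approximate 95% Wald interval.
    """
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

def rescale_odds_ratio(or_per_unit: float, units: float) -> float:
    """Odds ratio for a change of `units` in a continuous predictor.

    Odds ratios multiply across unit steps, so the rescaled value
    is the per-unit odds ratio raised to the number of units.
    """
    return or_per_unit ** units
```

For example, if the age odds ratio of 0.982 is expressed per year of age (an assumption, since the abstract does not state the unit), `rescale_odds_ratio(0.982, 10)` gives the corresponding odds ratio per decade.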
Affiliation(s)
- Sena Chae
- College of Nursing, University of Iowa, 50 Newton Rd, Iowa City, IA 52242, United States.
- Jiyoun Song
- Columbia University School of Nursing, New York, NY, United States
- Marietta Ojo
- Center for Home Care Policy & Research, Visiting Nurse Service of New York, New York, NY, United States
- Kathryn H Bowles
- Department of Biobehavioral Health Sciences Philadelphia PA, Center for Home Care Policy & Research, University of Pennsylvania School of Nursing, Visiting Nurse Service of New York, New York, NY, United States
- Margaret V McDonald
- Center for Home Care Policy & Research, Visiting Nurse Service of New York, New York, NY, United States
- Yolanda Barrón
- Center for Home Care Policy & Research, Visiting Nurse Service of New York, New York, NY, United States
- Mollie Hobensack
- Columbia University School of Nursing, New York, NY, United States
- Erin Kennedy
- Department of Biobehavioral Health Sciences, University of Pennsylvania School of Nursing, Philadelphia, PA, United States
- Sridevi Sridharan
- Center for Home Care Policy & Research, Visiting Nurse Service of New York, New York, NY, United States
- Lauren Evans
- Center for Home Care Policy & Research, Visiting Nurse Service of New York, New York, NY, United States
- Maxim Topaz
- Center for Home Care Policy & Research, Columbia University School of Nursing, Data Science Institute, Columbia University, Visiting Nurse Service of New York, New York, NY, United States
5
Abstract
In recent years, the evolution of technology has led to an increase in text data obtained from many sources. In the biomedical domain, text information has also evidenced this accelerated growth, and automatic text summarization systems play an essential role in optimizing physicians’ time resources and identifying relevant information. In this paper, we present a systematic review in recent research of text summarization for biomedical textual data, focusing mainly on the methods employed, type of input data text, areas of application, and evaluation metrics used to assess systems. The survey was limited to the period between 1st January 2014 and 15th March 2022. The data collected was obtained from WoS, IEEE, and ACM digital libraries, while the search strategies were developed with the help of experts in NLP techniques and previous systematic reviews. The four phases of a systematic review by PRISMA methodology were conducted, and five summarization factors were determined to assess the studies included: Input, Purpose, Output, Method, and Evaluation metric. Results showed that 3.5% of 801 studies met the inclusion criteria. Moreover, Single-document, Biomedical Literature, Generic, and Extractive summarization proved to be the most common approaches employed, while techniques based on Machine Learning were performed in 16 studies and Rouge (Recall-Oriented Understudy for Gisting Evaluation) was reported as the evaluation metric in 26 studies. This review found that in recent years, more transformer-based methodologies for summarization purposes have been implemented compared to a previous survey. Additionally, there are still some challenges in text summarization in different domains, especially in the biomedical field in terms of demand for further research.
6
Xu D, Miller T. A simple neural vector space model for medical concept normalization using concept embeddings. J Biomed Inform 2022; 130:104080. [PMID: 35472514 PMCID: PMC9351985 DOI: 10.1016/j.jbi.2022.104080] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 01/19/2022] [Revised: 04/15/2022] [Accepted: 04/19/2022] [Indexed: 11/24/2022]
Abstract
OBJECTIVE Medical concept normalization (MCN), the task of linking textual mentions to concepts in an ontology, provides a solution to unify different ways of referring to the same concept. In this paper, we present a simple neural MCN model that takes mentions as input and directly predicts concepts. MATERIALS AND METHODS We evaluate our proposed model on clinical datasets from ShARe/CLEF eHealth 2013 shared task and 2019 n2c2/OHNLP shared task track 3. Our neural MCN model consists of an encoder, and a normalized temperature-scaled softmax (NT-softmax) layer that maximizes the cosine similarity score of matching the mention to the correct concept. We adopt SAPBERT as the encoder and initialize the weights in the NT-softmax layer with pre-computed concept embeddings from SAPBERT. RESULTS Our proposed neural model achieves competitive performance on ShARe/CLEF 2013 and establishes a new state-of-the-art on 2019-n2c2-MCN. Yet this model is simpler than most prior work: it requires no complex pipelines, no hand-crafted rules, and no preprocessing, making it simpler to apply in new settings. DISCUSSION Analyses of our proposed model show that the NT-softmax is better than the conventional softmax on the MCN task, and both the CUI-less threshold parameter and the initialization of the weight vectors in the NT-softmax layer contribute to the improvements. CONCLUSION We propose a simple neural model for clinical MCN, a one-step approach with simpler inference and more effective performance than prior work. Our analyses demonstrate future work on MCN may require more effort on unseen concepts.
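The scoring step named in this abstract, a softmax over temperature-scaled cosine similarities between a mention vector and concept vectors, can be sketched in plain Python. This is an illustration under stated assumptions rather than the authors' implementation: the vectors are toy inputs instead of SAPBERT embeddings, the temperature and threshold values are arbitrary, all function names are hypothetical, and applying the CUI-less threshold to the best cosine score is an assumption about how that parameter works. Vectors are assumed to be non-zero.

```python
import math

def l2_normalize(vec):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def cosine_scores(mention_vec, concept_vecs):
    """Cosine similarity between one mention and each concept vector."""
    m = l2_normalize(mention_vec)
    return [sum(a * b for a, b in zip(m, l2_normalize(c))) for c in concept_vecs]

def nt_softmax(scores, temperature=0.1):
    """Softmax over cosine scores divided by a temperature."""
    logits = [s / temperature for s in scores]
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(l - peak) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_concept(mention_vec, concept_ids, concept_vecs, cuiless_threshold=0.5):
    """Pick the most similar concept, or "CUI-less" below the threshold (assumed rule)."""
    scores = cosine_scores(mention_vec, concept_vecs)
    best = max(range(len(scores)), key=lambda i: scores[i])
    if scores[best] < cuiless_threshold:
        return "CUI-less"
    return concept_ids[best]
```

A lower temperature sharpens the resulting distribution, which is the usual motivation for scaling cosine scores before the softmax, since cosine similarity is bounded between -1 and 1.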
Affiliation(s)
- Dongfang Xu
- Computational Health Informatics Program, Boston Children's Hospital, Boston, MA, USA; Department of Pediatrics, Harvard Medical School, Boston, MA, USA
- Timothy Miller
- Computational Health Informatics Program, Boston Children's Hospital, Boston, MA, USA; Department of Pediatrics, Harvard Medical School, Boston, MA, USA
7
Yaeger JP, Lu J, Jones J, Ertefaie A, Fiscella K, Gildea D. Derivation of a natural language processing algorithm to identify febrile infants. J Hosp Med 2022; 17:11-18. [PMID: 35504534 DOI: 10.1002/jhm.2732] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/01/2021] [Revised: 11/24/2021] [Accepted: 12/09/2021] [Indexed: 11/08/2022]
Abstract
BACKGROUND Diagnostic codes can retrospectively identify samples of febrile infants, but sensitivity is low, resulting in many febrile infants eluding detection. To ensure study samples are representative, an improved approach is needed. OBJECTIVE To derive and internally validate a natural language processing algorithm to identify febrile infants and compare its performance to diagnostic codes. METHODS This cross-sectional study consisted of infants aged 0-90 days brought to one pediatric emergency department from January 2016 to December 2017. We aimed to identify infants with fever, defined as a documented temperature ≥38°C. We used 2017 clinical notes to develop two rule-based algorithms to identify infants with fever and tested them on data from 2016. Using manual abstraction as the gold standard, we compared performance of the two rule-based algorithms (Models 1, 2) to four previously published diagnostic code groups (Models 5-8) using area under the receiver-operating characteristics curve (AUC), sensitivity, and specificity. RESULTS For the test set (n = 1190 infants), 184 infants were febrile (15.5%). The AUCs (0.92-0.95) and sensitivities (86%-92%) of Models 1 and 2 were significantly greater than Models 5-8 (0.67-0.74; 20%-74%) with similar specificities (93%-99%). In contrast to Models 5-8, samples from Models 1 and 2 demonstrated similar characteristics to the gold standard, including fever prevalence, median age, and rates of bacterial infections, hospitalizations, and severe outcomes. CONCLUSIONS Findings suggest rule-based algorithms can accurately identify febrile infants with greater sensitivity while preserving specificity compared to diagnostic codes. If externally validated, rule-based algorithms may be important tools to create representative study samples, thereby improving generalizability of findings.
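A rule of the kind this abstract describes (flag a note when it documents a temperature of at least 38°C) and the sensitivity and specificity used to evaluate it can be sketched as follows. The regular expression, the Fahrenheit conversion, and the function names are assumptions made for illustration; the abstract does not describe the study's actual rules.

```python
import re

# Hypothetical pattern: a 2-3 digit number, optional decimals, then C or F.
TEMP_PATTERN = re.compile(r"\b(\d{2,3}(?:\.\d+)?)\s*°?\s*([CF])\b", re.IGNORECASE)

def max_temp_celsius(note: str):
    """Return the highest temperature mentioned in the note, in Celsius."""
    temps = []
    for value, unit in TEMP_PATTERN.findall(note):
        temp = float(value)
        if unit.upper() == "F":
            temp = (temp - 32) * 5 / 9  # convert Fahrenheit to Celsius
        temps.append(temp)
    return max(temps) if temps else None

def is_febrile(note: str, threshold_c: float = 38.0) -> bool:
    """Rule: febrile if any documented temperature meets the threshold."""
    temp = max_temp_celsius(note)
    return temp is not None and temp >= threshold_c

def sensitivity_specificity(predicted, gold):
    """Compare rule output with manually abstracted labels."""
    tp = sum(1 for p, g in zip(predicted, gold) if p and g)
    fn = sum(1 for p, g in zip(predicted, gold) if not p and g)
    tn = sum(1 for p, g in zip(predicted, gold) if not p and not g)
    fp = sum(1 for p, g in zip(predicted, gold) if p and not g)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity
```

A production rule set would also need to handle negation, historical mentions, and units written in other ways, none of which this sketch attempts.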
Affiliation(s)
- Jeffrey P Yaeger
- Department of Pediatrics, University of Rochester Medical Center, Rochester, New York, USA
- Department of Public Health Sciences, University of Rochester Medical Center, Rochester, New York, USA
- Jiahao Lu
- Department of Pediatrics, University of Rochester Medical Center, Rochester, New York, USA
- Jeremiah Jones
- Department of Biostatistics and Computational Biology, University of Rochester Medical Center, Rochester, New York, USA
- Ashkan Ertefaie
- Department of Biostatistics and Computational Biology, University of Rochester Medical Center, Rochester, New York, USA
- Kevin Fiscella
- Department of Family Medicine, University of Rochester Medical Center, Rochester, New York, USA
- Daniel Gildea
- Department of Computer Science, University of Rochester, Rochester, New York, USA
8
Von Gerich H, Moen H, Block LJ, Chu CH, DeForest H, Hobensack M, Michalowski M, Mitchell J, Nibber R, Olalia MA, Pruinelli L, Ronquillo CE, Topaz M, Peltonen LM. Artificial Intelligence -based technologies in nursing: A scoping literature review of the evidence. Int J Nurs Stud 2021; 127:104153. [DOI: 10.1016/j.ijnurstu.2021.104153] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 08/11/2021] [Revised: 11/23/2021] [Accepted: 12/01/2021] [Indexed: 12/20/2022]
9
Jaeger SR, Rasmussen MA. Importance of data preparation when analysing written responses to open-ended questions: An empirical assessment and comparison with manual coding. Food Qual Prefer 2021. [DOI: 10.1016/j.foodqual.2021.104270] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 10/21/2022]
10
Woo K, Song J, Adams V, Block LJ, Currie LM, Shang J, Topaz M. Exploring prevalence of wound infections and related patient characteristics in homecare using natural language processing. Int Wound J 2021; 19:211-221. [PMID: 34105873 PMCID: PMC8684883 DOI: 10.1111/iwj.13623] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 03/11/2021] [Revised: 05/06/2021] [Accepted: 05/12/2021] [Indexed: 12/13/2022]
Abstract
We aimed to create and validate a natural language processing algorithm to extract wound infection-related information from nursing notes. We also estimated wound infection prevalence in homecare settings and described related patient characteristics. In this retrospective cohort study, a natural language processing algorithm was developed and validated against a gold standard testing set. Cases with wound infection were identified using the algorithm and linked to Outcome and Assessment Information Set data to identify related patient characteristics. The final version of the natural language processing vocabulary contained 3914 terms and expressions related to the presence of wound infection. The natural language processing algorithm achieved overall good performance (F-measure = 0.88). The presence of wound infection was documented for 1.03% (n = 602) of patients without wounds, for 5.95% (n = 3232) of patients with wounds, and 19.19% (n = 152) of patients with wound-related hospitalisation or emergency department visits. Diabetes, peripheral vascular disease, and skin ulcer were significantly associated with wound infection among homecare patients. Our findings suggest that nurses frequently document wound infection-related information. The use of natural language processing demonstrated that valuable information can be extracted from nursing notes which can be used to improve our understanding of the care needs of people receiving homecare. By linking findings from clinical nursing notes with additional structured data, we can analyse related patients' characteristics and use them to develop a tailored intervention that may potentially lead to reduced wound infection-related hospitalizations.
Affiliation(s)
- Kyungmi Woo
- College of Nursing, Seoul National University, Seoul, South Korea
- Jiyoun Song
- School of Nursing, Columbia University, New York City, New York, USA
- Victoria Adams
- Visiting Nurse Service of New York, New York City, New York, USA
- Lorraine J Block
- School of Nursing, University of British Columbia, Vancouver, British Columbia, Canada
- Leanne M Currie
- School of Nursing, University of British Columbia, Vancouver, British Columbia, Canada
- Jingjing Shang
- School of Nursing, Columbia University, New York City, New York, USA
- Maxim Topaz
- School of Nursing, Columbia University, New York City, New York, USA; Visiting Nurse Service of New York, New York City, New York, USA; Data Science Institute, Columbia University, New York City, New York, USA
11
Senders JT, Cho LD, Calvachi P, McNulty JJ, Ashby JL, Schulte IS, Almekkawi AK, Mehrtash A, Gormley WB, Smith TR, Broekman MLD, Arnaout O. Automating Clinical Chart Review: An Open-Source Natural Language Processing Pipeline Developed on Free-Text Radiology Reports From Patients With Glioblastoma. JCO Clin Cancer Inform 2021; 4:25-34. [PMID: 31977252 DOI: 10.1200/cci.19.00060] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 01/05/2023]
Abstract
PURPOSE The aim of this study was to develop an open-source natural language processing (NLP) pipeline for text mining of medical information from clinical reports. We also aimed to provide insight into why certain variables or reports are more suitable for clinical text mining than others. MATERIALS AND METHODS Various NLP models were developed to extract 15 radiologic characteristics from free-text radiology reports for patients with glioblastoma. Ten-fold cross-validation was used to optimize the hyperparameter settings and estimate model performance. We examined how model performance was associated with quantitative attributes of the radiologic characteristics and reports. RESULTS In total, 562 unique brain magnetic resonance imaging reports were retrieved. NLP extracted 15 radiologic characteristics with high to excellent discrimination (area under the curve, 0.82 to 0.98) and accuracy (78.6% to 96.6%). Model performance was correlated with the inter-rater agreement of the manually provided labels (ρ = 0.904; P < .001) but not with the frequency distribution of the variables of interest (ρ = 0.179; P = .52). All variables labeled with a near perfect inter-rater agreement were classified with excellent performance (area under the curve > 0.95). Excellent performance could be achieved for variables with only 50 to 100 observations in the minority group and class imbalances up to a 9:1 ratio. Report-level classification accuracy was not associated with the number of words or the vocabulary size in the distinct text documents. CONCLUSION This study provides an open-source NLP pipeline that allows for text mining of narratively written clinical reports. Small sample sizes and class imbalance should not be considered as absolute contraindications for text mining in clinical research. However, future studies should report measures of inter-rater agreement whenever ground truth is based on a consensus label and use this measure to identify clinical variables eligible for text mining.
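The ten-fold cross-validation mentioned in the methods can be illustrated with a small index-splitting sketch. It shows only how a set of reports might be partitioned into folds; the function name, the shuffling seed, and the unstratified split are assumptions, and the study's actual pipeline (including how hyperparameters were tuned within folds) is not reproduced here.

```python
import random

def k_fold_indices(n_samples: int, k: int = 10, seed: int = 0):
    """Yield (train_indices, test_indices) for each of k folds.

    Indices are shuffled once, then dealt into k folds of nearly
    equal size; each fold serves as the test set exactly once.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test
```

With 562 reports and `k=10`, each test fold holds 56 or 57 reports and the remaining reports form the training set. For imbalanced labels, a stratified split that preserves class proportions in each fold is a common alternative.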
Affiliation(s)
- Joeky T Senders
- Computational Neuroscience Outcomes Center, Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA; Department of Neurosurgery, Leiden University Medical Center, Leiden, the Netherlands
- Logan D Cho
- Computational Neuroscience Outcomes Center, Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA; Department of Neuroscience, Brown University, Providence, RI
- Paola Calvachi
- Computational Neuroscience Outcomes Center, Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
- John J McNulty
- Computational Neuroscience Outcomes Center, Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA; Vagelos College of Physicians and Surgeons, Columbia University, New York, NY
- Joanna L Ashby
- Computational Neuroscience Outcomes Center, Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
- Isabelle S Schulte
- Computational Neuroscience Outcomes Center, Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
- Ahmad Kareem Almekkawi
- Computational Neuroscience Outcomes Center, Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
- Alireza Mehrtash
- Department of Radiology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
- William B Gormley
- Computational Neuroscience Outcomes Center, Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
- Timothy R Smith
- Computational Neuroscience Outcomes Center, Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
- Marike L D Broekman
- Department of Neurosurgery, Leiden University Medical Center, Leiden, the Netherlands; Department of Neurosurgery, Haaglanden Medical Center, The Hague, the Netherlands
- Omar Arnaout
- Computational Neuroscience Outcomes Center, Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
12
Woo K, Adams V, Wilson P, Fu LH, Cato K, Rossetti SC, McDonald M, Shang J, Topaz M. Identifying Urinary Tract Infection-Related Information in Home Care Nursing Notes. J Am Med Dir Assoc 2021; 22:1015-1021.e2. [PMID: 33434568 PMCID: PMC8106637 DOI: 10.1016/j.jamda.2020.12.010] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 08/22/2019] [Revised: 07/28/2020] [Accepted: 12/06/2020] [Indexed: 12/12/2022]
Abstract
Objectives: Urinary tract infection (UTI) is common in home care but not easily captured with standard assessment. This study aimed to examine the value of nursing notes in detecting UTI signs and symptoms in home care. Design: The study developed a natural language processing (NLP) algorithm to automatically identify UTI-related information in nursing notes. Setting and Participants: Home care visit notes (n = 1,149,586) and care coordination notes (n = 1,461,171) for 89,459 patients treated in the largest nonprofit home care agency in the United States during 2014. Measures: We generated 6 categories of UTI-related information from literature and used the Unified Medical Language System (UMLS) to identify a preliminary list of terms. The NLP algorithm was tested on a gold standard set of 300 clinical notes annotated by clinical experts. We used structured Outcome and Assessment Information Set data to extract the frequency of UTI-related emergency department (ED) visits or hospitalizations and explored time-patterns in documentation of UTI-related information. Results: The NLP system achieved very good overall performance (F measure = 0.9, 95% CI: 0.87–0.93) based on the test results obtained by using the notes for patients admitted to the ED or hospital due to UTI. UTI-related information was significantly more prevalent (P < .01 for all the tests) in home care episodes with UTI-related ED admission or hospitalization vs the general patient population; 81% of home care episodes with UTI-related hospitalization or ED admission had at least 1 category of UTI-related information vs 21.6% among episodes without UTI-related hospitalization or ED admission. Frequency of UTI-related information documentation increased in advance of UTI-related hospitalization or ED admission, peaking within a few days before the event. 
Conclusions and Implications: Information in nursing notes is often overlooked by stakeholders and not integrated into predictive modeling for decision-making support, but our findings highlight their value in early risk identification and care guidance. Health care administrators should consider using NLP to extract clinical data from nursing notes to improve early detection and treatment, which may lead to quality improvement and cost reduction.
Affiliation(s)
- Kyungmi Woo
- College of Nursing, Seoul National University, Seoul, Republic of Korea.
- Victoria Adams
- Center for Home Care Policy & Research, Visiting Nurse Service of New York, New York, NY, USA
- Paula Wilson
- Center for Home Care Policy & Research, Visiting Nurse Service of New York, New York, NY, USA
- Li-Heng Fu
- Department of Biomedical Informatics, Columbia University, New York, NY, USA
- Kenrick Cato
- College of Nursing, Seoul National University, Seoul, Republic of Korea
- Sarah Collins Rossetti
- Department of Biomedical Informatics, Columbia University, New York, NY, USA; School of Nursing, Columbia University, New York, NY, USA
- Margaret McDonald
- Center for Home Care Policy & Research, Visiting Nurse Service of New York, New York, NY, USA
- Jingjing Shang
- School of Nursing, Columbia University, New York, NY, USA
- Maxim Topaz
- Center for Home Care Policy & Research, Visiting Nurse Service of New York, New York, NY, USA; School of Nursing, Columbia University, New York, NY, USA; Data Science Institute, Columbia University, New York, NY, USA
| |
|
13
|
Topaz M, Koleck TA, Onorato N, Smaldone A, Bakken S. Nursing documentation of symptoms is associated with higher risk of emergency department visits and hospitalizations in homecare patients. Nurs Outlook 2020; 69:435-446. [PMID: 33386145 DOI: 10.1016/j.outlook.2020.12.007] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 09/08/2020] [Revised: 10/23/2020] [Accepted: 12/11/2020] [Indexed: 10/22/2022]
Abstract
BACKGROUND Nurses often document patient symptoms in narrative notes. PURPOSE This study used a technique called natural language processing (NLP) to: (1) Automatically identify documentation of seven common symptoms (anxiety, cognitive disturbance, depressed mood, fatigue, sleep disturbance, pain, and well-being) in homecare narrative nursing notes, and (2) examine the association between symptoms and emergency department visits or hospital admissions from homecare. METHOD NLP was applied on a large subset of narrative notes (2.5 million notes) documented for 89,825 patients admitted to one large homecare agency in the Northeast United States. FINDINGS NLP accurately identified symptoms in narrative notes. Patients with more documented symptom categories had higher risk of emergency department visit or hospital admission. DISCUSSION Further research is needed to explore additional symptoms and implement NLP systems in the homecare setting to enable early identification of concerning patient trends leading to emergency department visit or hospital admission.
Affiliation(s)
- Maxim Topaz: Center for Home Care Policy and Research, Visiting Nurse Service of New York, New York, NY; Columbia University School of Nursing, Columbia University Data Science Institute, New York, NY
- Nicole Onorato: Center for Home Care Policy and Research, Visiting Nurse Service of New York, New York, NY
- Arlene Smaldone: Columbia University School of Nursing, Columbia University College of Dental Medicine, New York, NY
- Suzanne Bakken: Columbia University School of Nursing, Columbia University Department of Biomedical Informatics, Columbia University Data Science Institute, New York, NY
|
14
|
Dionisi S, Di Simone E, Alicastro GM, Angelini S, Giannetta N, Iacorossi L, Di Muzio M. Nursing Summary: designing a nursing section in the Electronic Health Record. ACTA BIO-MEDICA : ATENEI PARMENSIS 2019; 90:293-299. [PMID: 31580318 PMCID: PMC7233749 DOI: 10.23750/abm.v90i3.7411] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 06/10/2018] [Accepted: 06/21/2018] [Indexed: 11/23/2022]
Abstract
The introduction of new information technologies in healthcare led to major changes in the field of tools for managing and evaluating the assistance. In Italy, an example of applying new technologies to the healthcare context is the realization of Fascicolo Sanitario Elettronico (FSE). The FSE is a tool that collects online data and health and socio-health information that make up the patient’s clinical history. The aim of this review is to analyze which components are needed to organize and structure the information and data within the “Nursing Summary”. Literature searches were conducted using the following available online Databases: CINAHL, PubMed and Cochrane Library. The searches were conducted by analyzing publications from the last five years (2012-2016). The process of selection of articles led to the choice of 14 research studies. Additionally, national guidelines were analyzed, concerning official documents and technical specifications for the development of projects of FSE. The analysis of the scientific literature showed that nursing data in the EHR can be used to develop some Clinical Decision Support Systems. Relevant were also used to clarify how the nursing data could be structured in the “Nursing Summary”. The research findings have identified which could be the main components of a possible nursing section to integrate the FSE. This project is proposed as a preliminary study that needs further development. (www.actabiomedica.it)
Affiliation(s)
- Sara Dionisi: Department of Biomedicine and Prevention, Tor Vergata University of Rome, Rome, Italy
|
15
|
Ho KF, Ho CH, Chung MH. Theoretical integration of user satisfaction and technology acceptance of the nursing process information system. PLoS One 2019; 14:e0217622. [PMID: 31163076 PMCID: PMC6548361 DOI: 10.1371/journal.pone.0217622] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 11/13/2018] [Accepted: 05/15/2019] [Indexed: 12/01/2022]
Abstract
Background: The nursing process system (NPS) is used to establish the nursing process involving assessment, diagnosis, planning, intervention, and evaluation in solving the health problems of patients. Objectives: The factors influencing the use of the NPS by nurses were analyzed based on user satisfaction and technology acceptance within the 3Q (service quality, information quality, and system quality) model. Methods: In this cross-sectional quantitative study, the valid responses of 222 nurses to a questionnaire were obtained; these nurses worked at eight hospitals affiliated with public organizations in Taiwan. Structural equation modeling was used to analyze information quality, system quality, service quality, user satisfaction, perceived usefulness, perceived ease of use, perceived enjoyment, behavioral attitude, and intention after the nurses had used the NPS for more than 1 month. Results: Information quality, service quality, and system quality influenced user satisfaction. User satisfaction affected perceived usefulness, perceived ease of use, and perceived enjoyment and had the highest explanatory power (R² = 0.75). Furthermore, perceived usefulness, perceived ease of use, and perceived enjoyment influenced behavioral attitude and intention to use the system. The proposed model explained 53% of the variance in the intention to use the NPS. Conclusions: The relationships between the variables of the 3Q model were successfully used to examine the intention of nurses toward using the NPS. Using the findings of this study, designers and programmers can comprehensively understand the perceptions of nurses and further improve the performance of the NPS.
Affiliation(s)
- Kuei-Fang Ho: School of Nursing, College of Nursing, Taipei Medical University, Taipei, Taiwan
- Cheng-Hsun Ho: Graduate Institute of Information Management, National Taipei University, New Taipei City, Taiwan
- Min-Huey Chung: School of Nursing, College of Nursing, Taipei Medical University, Taipei, Taiwan; Department of Nursing, Shuang Ho Hospital, Taipei Medical University, New Taipei City, Taiwan
|
16
|
Gulden C, Kirchner M, Schüttler C, Hinderer M, Kampf M, Prokosch HU, Toddenroth D. Extractive summarization of clinical trial descriptions. Int J Med Inform 2019; 129:114-121. [PMID: 31445245 DOI: 10.1016/j.ijmedinf.2019.05.019] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 08/19/2018] [Revised: 04/06/2019] [Accepted: 05/21/2019] [Indexed: 10/26/2022]
Abstract
PURPOSE Text summarization of clinical trial descriptions has the potential to reduce the time required to familiarize oneself with the subject of studies by condensing long-form detailed descriptions to concise, meaning-preserving synopses. This work describes the process and quality of automatically generated summaries of clinical trial descriptions using extractive text summarization methods. METHODS We generated a novel dataset from the detailed descriptions and brief summaries of trials registered on clinicaltrials.gov. We executed several text summarization algorithms on the detailed descriptions in this corpus and calculated the standard ROUGE metrics using the brief summaries included in the record as a reference. To investigate the correlation of these metrics with human sentiments, four reviewers assessed the content-completeness of the generated summaries and the helpfulness of both the generated and reference summaries via a Likert scale questionnaire. RESULTS The filtering stages of the dataset generation process reduce the 277,228 trials registered on clinicaltrials.gov to 101,016 records usable for the summarization task. On average, the summaries in this corpus are 25% the length of the detailed descriptions. Of the evaluated text summarization methods, the TextRank algorithm exhibits the overall best performance with a ROUGE-1 F1 score of 0.3531, ROUGE-2 F1 score of 0.1723, and ROUGE-L F1 score of 0.3003. These scores correlate with the assessment of the helpfulness and content similarity by the human reviewers. Inter-rater agreement for the helpfulness and content similarity was slight and fair respectively (Fleiss' kappa of 0.12 and 0.22). CONCLUSIONS Extractive summarization is a viable tool for generating meaning-preserving synopses of detailed clinical trial descriptions. 
Further, the human evaluation has shown that the ROUGE-L F1 score is useful for rating the general quality of generated summaries of clinical trial descriptions in an automated way.
Affiliation(s)
- Christian Gulden: Medical Informatics, Friedrich-Alexander Universität Erlangen-Nürnberg (FAU), Erlangen, Germany
- Melanie Kirchner: Medical Center for Information and Communication Technology, University Hospital Erlangen, Erlangen, Germany
- Christina Schüttler: Medical Informatics, Friedrich-Alexander Universität Erlangen-Nürnberg (FAU), Erlangen, Germany
- Marc Hinderer: Medical Informatics, Friedrich-Alexander Universität Erlangen-Nürnberg (FAU), Erlangen, Germany
- Marvin Kampf: Medical Center for Information and Communication Technology, University Hospital Erlangen, Erlangen, Germany
- Hans-Ulrich Prokosch: Medical Informatics, Friedrich-Alexander Universität Erlangen-Nürnberg (FAU), Erlangen, Germany
- Dennis Toddenroth: Medical Informatics, Friedrich-Alexander Universität Erlangen-Nürnberg (FAU), Erlangen, Germany
|
17
|
Luo YF, Sun W, Rumshisky A. MCN: A comprehensive corpus for medical concept normalization. J Biomed Inform 2019; 92:103132. [DOI: 10.1016/j.jbi.2019.103132] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 11/12/2018] [Revised: 01/18/2019] [Accepted: 02/15/2019] [Indexed: 11/25/2022]
|
18
|
Topaz M, Murga L, Gaddis KM, McDonald MV, Bar-Bachar O, Goldberg Y, Bowles KH. Mining fall-related information in clinical notes: Comparison of rule-based and novel word embedding-based machine learning approaches. J Biomed Inform 2019; 90:103103. [PMID: 30639392 DOI: 10.1016/j.jbi.2019.103103] [Citation(s) in RCA: 41] [Impact Index Per Article: 8.2] [Received: 06/04/2018] [Revised: 11/14/2018] [Accepted: 12/31/2018] [Indexed: 10/27/2022]
Abstract
BACKGROUND Natural language processing (NLP) of health-related data is still an expertise-demanding and resource-expensive process. We created a novel, open source rapid clinical text mining system called NimbleMiner. NimbleMiner combines several machine learning techniques (word embedding models and positive only labels learning) to facilitate the process in which a human rapidly performs text mining of clinical narratives, while being aided by the machine learning components. OBJECTIVE This manuscript describes the general system architecture and user interface and presents results of a case study aimed at classifying fall-related information (including fall history, fall prevention interventions, and fall risk) in homecare visit notes. METHODS We extracted a corpus of homecare visit notes (n = 1,149,586) for 89,459 patients from a large US-based homecare agency. We used a gold standard testing dataset of 750 notes annotated by two human reviewers to compare NimbleMiner's ability to classify documents regarding whether they contain fall-related information with a previously developed rule-based NLP system. RESULTS NimbleMiner outperformed the rule-based system in almost all domains. The overall F-score was 85.8% compared to 81% by the rule-based system with the best performance for identifying general fall history (F = 89% vs. F = 85.1% rule-based), followed by fall risk (F = 87% vs. F = 78.7% rule-based), fall prevention interventions (F = 88.1% vs. F = 78.2% rule-based) and fall within 2 days of the note date (F = 83.1% vs. F = 80.6% rule-based). The rule-based system achieved slightly better performance for fall within 2 weeks of the note date (F = 81.9% vs. F = 84% rule-based). DISCUSSION & CONCLUSIONS NimbleMiner outperformed other systems aimed at fall information classification, including our previously developed rule-based approach. These promising results indicate that clinical text mining can be implemented without the need for large labeled datasets necessary for other types of machine learning. This is critical for domains with little NLP development, like nursing or allied health professions.
Affiliation(s)
- Maxim Topaz: School of Nursing & Data Science Institute, Columbia University, New York, NY, USA; The Visiting Nurse Service of New York, New York, NY, USA
- Ludmila Murga: Cheryl Spencer Department of Nursing, University of Haifa, Haifa, Israel
- Ofrit Bar-Bachar: Cheryl Spencer Department of Nursing, University of Haifa, Haifa, Israel
- Yoav Goldberg: Department of Computer Science, Bar Ilan University, Tel Aviv, Israel
- Kathryn H Bowles: The Visiting Nurse Service of New York, New York, NY, USA; School of Nursing, University of Pennsylvania, Philadelphia, PA, USA
|
19
|
Gehrmann S, Dernoncourt F, Li Y, Carlson ET, Wu JT, Welt J, Foote J, Moseley ET, Grant DW, Tyler PD, Celi LA. Comparing deep learning and concept extraction based methods for patient phenotyping from clinical narratives. PLoS One 2018; 13:e0192360. [PMID: 29447188 PMCID: PMC5813927 DOI: 10.1371/journal.pone.0192360] [Citation(s) in RCA: 95] [Impact Index Per Article: 15.8] [Received: 06/16/2017] [Accepted: 01/21/2018] [Indexed: 01/22/2023]
Abstract
In secondary analysis of electronic health records, a crucial task consists in correctly identifying the patient cohort under investigation. In many cases, the most valuable and relevant information for an accurate classification of medical conditions exist only in clinical narratives. Therefore, it is necessary to use natural language processing (NLP) techniques to extract and evaluate these narratives. The most commonly used approach to this problem relies on extracting a number of clinician-defined medical concepts from text and using machine learning techniques to identify whether a particular patient has a certain condition. However, recent advances in deep learning and NLP enable models to learn a rich representation of (medical) language. Convolutional neural networks (CNN) for text classification can augment the existing techniques by leveraging the representation of language to learn which phrases in a text are relevant for a given medical condition. In this work, we compare concept extraction based methods with CNNs and other commonly used models in NLP in ten phenotyping tasks using 1,610 discharge summaries from the MIMIC-III database. We show that CNNs outperform concept extraction based methods in almost all of the tasks, with an improvement in F1-score of up to 26 and up to 7 percentage points in area under the ROC curve (AUC). We additionally assess the interpretability of both approaches by presenting and evaluating methods that calculate and extract the most salient phrases for a prediction. The results indicate that CNNs are a valid alternative to existing approaches in patient phenotyping and cohort identification, and should be further investigated. Moreover, the deep learning approach presented in this paper can be used to assist clinicians during chart review or support the extraction of billing codes from text by identifying and highlighting relevant phrases for various medical conditions.
Affiliation(s)
- Sebastian Gehrmann: MIT Critical Data, Laboratory for Computational Physiology, Cambridge, MA, United States of America; Harvard SEAS, Harvard University, Cambridge, MA, United States of America
- Franck Dernoncourt: MIT Critical Data, Laboratory for Computational Physiology, Cambridge, MA, United States of America; Massachusetts Institute of Technology, Cambridge, MA, United States of America; Adobe Research, San Jose, CA, United States of America
- Yeran Li: MIT Critical Data, Laboratory for Computational Physiology, Cambridge, MA, United States of America; Harvard T.H. Chan School of Public Health, Cambridge, MA, United States of America
- Eric T. Carlson: MIT Critical Data, Laboratory for Computational Physiology, Cambridge, MA, United States of America; Philips Research North America, Cambridge, MA, United States of America
- Joy T. Wu: MIT Critical Data, Laboratory for Computational Physiology, Cambridge, MA, United States of America; Harvard T.H. Chan School of Public Health, Cambridge, MA, United States of America
- Jonathan Welt: MIT Critical Data, Laboratory for Computational Physiology, Cambridge, MA, United States of America; Wellman Center for Photomedicine, Massachusetts General Hospital, Boston, MA, United States of America
- John Foote: MIT Critical Data, Laboratory for Computational Physiology, Cambridge, MA, United States of America; Tufts University School of Medicine, Cambridge, MA, United States of America
- Edward T. Moseley: MIT Critical Data, Laboratory for Computational Physiology, Cambridge, MA, United States of America; College of Science and Mathematics, University of Massachusetts, Boston, MA, United States of America
- David W. Grant: MIT Critical Data, Laboratory for Computational Physiology, Cambridge, MA, United States of America; Department of Surgery, Division of Plastic and Reconstructive Surgery, Washington University School of Medicine, St. Louis, MO, United States of America
- Patrick D. Tyler: MIT Critical Data, Laboratory for Computational Physiology, Cambridge, MA, United States of America; Department of Internal Medicine, Beth Israel Deaconess Medical Center, Boston, MA, United States of America
- Leo A. Celi: MIT Critical Data, Laboratory for Computational Physiology, Cambridge, MA, United States of America; Massachusetts Institute of Technology, Cambridge, MA, United States of America
|
20
|
Manias E, Gray K, Wickramasinghe N. Patient and family engagement with hospital electronic systems: Juggling for co-existence. Int J Nurs Stud 2017; 68:A1-A3. [PMID: 28187902 DOI: 10.1016/j.ijnurstu.2017.01.010] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/20/2022]
Affiliation(s)
- Elizabeth Manias: School of Nursing and Midwifery, Centre for Quality and Patient Safety Research, Deakin University, Burwood, Australia; Department of Medicine, Royal Melbourne Hospital, The University of Melbourne, Parkville, Australia; Melbourne School of Health Sciences, The University of Melbourne, Parkville, Australia
- Kathleen Gray: School of Computing and Information Systems, The University of Melbourne, Parkville, Australia
- Nilmini Wickramasinghe: Office of the Faculty of Health, Deakin University, Burwood, Australia; Epworth HealthCare, Richmond, Australia
|