1
Xu Z, Scharp D, Hobensek M, Ye J, Zou J, Ding S, Shang J, Topaz M. Machine learning-based infection diagnostic and prognostic models in post-acute care settings: a systematic review. J Am Med Inform Assoc 2024:ocae278. [PMID: 39530740 DOI: 10.1093/jamia/ocae278] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/12/2024] [Revised: 10/10/2024] [Accepted: 10/21/2024] [Indexed: 11/16/2024] Open
Abstract
OBJECTIVES: This study aims to (1) review machine learning (ML)-based models for early infection diagnosis and prognosis prediction in post-acute care (PAC) settings, (2) identify key risk predictors influencing infection-related outcomes, and (3) examine the quality and limitations of these models.
MATERIALS AND METHODS: PubMed, Web of Science, Scopus, IEEE Xplore, CINAHL, and ACM Digital Library were searched in February 2024. Eligible studies leveraged PAC data to develop and evaluate ML models for infection-related risks. Data extraction followed the CHARMS checklist. Quality appraisal followed the PROBAST tool. Data synthesis was guided by the socio-ecological conceptual framework.
RESULTS: Thirteen studies were included, mainly focusing on respiratory infections and nursing homes. Most used regression models with structured electronic health record data. Since 2020, there has been a shift toward advanced ML algorithms and multimodal data, with biosensors and clinical notes being significant sources of unstructured data. Despite these advances, there is insufficient evidence to support performance improvements over traditional models. Individual-level risk predictors, like impaired cognition, declined function, and tachycardia, were commonly used, while contextual-level predictors were barely utilized, consequently limiting model fairness. Major sources of bias included lack of external validation, inadequate model calibration, and insufficient consideration of data complexity.
DISCUSSION AND CONCLUSION: Despite the growth of advanced modeling approaches in infection-related models in PAC settings, evidence supporting their superiority remains limited. Future research should leverage a socio-ecological lens for predictor selection and model construction, exploring optimal data modalities and ML model usage in PAC, while ensuring rigorous methodologies and fairness considerations.
Affiliation(s)
- Zidu Xu
  - School of Nursing, Columbia University, New York, NY 10032, United States
- Danielle Scharp
  - School of Nursing, Columbia University, New York, NY 10032, United States
- Mollie Hobensek
  - Icahn School of Medicine at Mount Sinai, New York, NY 10029, United States
- Jiancheng Ye
  - Weill Cornell Medicine, Cornell University, New York, NY 10065, United States
- Jungang Zou
  - Department of Biostatistics, Mailman School of Public Health, Columbia University, New York, NY 10032, United States
- Sirui Ding
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, CA 94158, United States
- Jingjing Shang
  - School of Nursing, Columbia University, New York, NY 10032, United States
- Maxim Topaz
  - School of Nursing, Columbia University, New York, NY 10032, United States
  - Center for Home Care Policy & Research, VNS Health, New York, NY 10001, United States
  - Data Science Institute, Columbia University, New York, NY 10027, United States
2
Xu Z, Evans L, Song J, Chae S, Davoudi A, Bowles KH, McDonald MV, Topaz M. Exploring home healthcare clinicians' needs for using clinical decision support systems for early risk warning. J Am Med Inform Assoc 2024; 31:2641-2650. [PMID: 39302103 PMCID: PMC11491664 DOI: 10.1093/jamia/ocae247] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/05/2024] [Revised: 07/05/2024] [Accepted: 09/11/2024] [Indexed: 09/22/2024] Open
Abstract
OBJECTIVES: To explore home healthcare (HHC) clinicians' needs for Clinical Decision Support Systems (CDSS) information delivery for early risk warning within HHC workflows.
METHODS: Guided by the CDS "Five-Rights" framework, we conducted semi-structured interviews with multidisciplinary HHC clinicians from April 2023 to August 2023. We used deductive and inductive content analysis to investigate informants' responses regarding CDSS information delivery.
RESULTS: Interviews with thirteen HHC clinicians yielded 16 codes mapping to the CDS "Five-Rights" framework (right information, right person, right format, right channel, right time) and 11 codes for unintended consequences and training needs. Clinicians favored risk levels displayed in color-coded horizontal bars, concrete risk indicators in bullet points, and actionable instructions in the existing EHR system. They preferred non-intrusive risk alerts requiring mandatory confirmation. Clinicians anticipated risk information updates aligned with patients' condition severity and their visit pace. Additionally, they requested training to understand the CDSS's underlying logic and raised concerns about information accuracy and data privacy.
DISCUSSION: While recognizing CDSS's value in enhancing early risk warning, clinicians highlighted concerns about increased workload, alert fatigue, and CDSS misuse. The top risk factors identified by machine learning algorithms, especially text features, can be ambiguous due to a lack of context. Future research should ensure that CDSS outputs align with clinical evidence and are explainable.
CONCLUSION: This study identified HHC clinicians' expectations, preferences, adaptations, and unintended uses of CDSS for early risk warning. Our findings endorse operationalizing the CDS "Five-Rights" framework to optimize CDSS information delivery and integration into HHC workflows.
Affiliation(s)
- Zidu Xu
  - School of Nursing, Columbia University, New York, NY 10032, United States
- Lauren Evans
  - Center for Home Care Policy & Research, VNS Health, New York, NY 10017, United States
- Jiyoun Song
  - School of Nursing, University of Pennsylvania, Philadelphia, PA 19104, United States
- Sena Chae
  - College of Nursing, The University of Iowa, Iowa City, IA 52242, United States
- Anahita Davoudi
  - Center for Home Care Policy & Research, VNS Health, New York, NY 10017, United States
- Kathryn H Bowles
  - Center for Home Care Policy & Research, VNS Health, New York, NY 10017, United States
  - School of Nursing, University of Pennsylvania, Philadelphia, PA 19104, United States
- Margaret V McDonald
  - Center for Home Care Policy & Research, VNS Health, New York, NY 10017, United States
- Maxim Topaz
  - School of Nursing, Columbia University, New York, NY 10032, United States
  - Center for Home Care Policy & Research, VNS Health, New York, NY 10017, United States
3
Scharp D, Hobensack M, Davoudi A, Topaz M. Natural Language Processing Applied to Clinical Documentation in Post-acute Care Settings: A Scoping Review. J Am Med Dir Assoc 2024; 25:69-83. [PMID: 37838000 PMCID: PMC10792659 DOI: 10.1016/j.jamda.2023.09.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2023] [Revised: 09/05/2023] [Accepted: 09/07/2023] [Indexed: 10/16/2023]
Abstract
OBJECTIVES: To determine the scope of the application of natural language processing to free-text clinical notes in post-acute care and provide a foundation for future natural language processing-based research in these settings.
DESIGN: Scoping review; reported according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews guidelines.
SETTING AND PARTICIPANTS: Post-acute care (ie, home health care, long-term care, skilled nursing facilities, and inpatient rehabilitation facilities).
METHODS: PubMed, Cumulative Index of Nursing and Allied Health Literature, and Embase were searched in February 2023. Eligible studies had quantitative designs that used natural language processing applied to clinical documentation in post-acute care settings. The quality of each study was appraised.
RESULTS: Twenty-one studies were included. Almost all studies were conducted in home health care settings. Most studies extracted data from electronic health records to examine the risk for negative outcomes, including acute care utilization, medication errors, and suicide mortality. About half of the studies did not report age, sex, race, or ethnicity data or use standardized terminologies. Only 8 studies included variables from socio-behavioral domains. Most studies fulfilled all quality appraisal indicators.
CONCLUSIONS AND IMPLICATIONS: The application of natural language processing is nascent in post-acute care settings. Future research should apply natural language processing using standardized terminologies to leverage free-text clinical notes in post-acute care to promote timely, comprehensive, and equitable care. Natural language processing could be integrated with predictive models to help identify patients who are at risk of negative outcomes. Future research should incorporate socio-behavioral determinants and diverse samples to improve health equity in informatics tools.
Affiliation(s)
- Anahita Davoudi
  - VNS Health, Center for Home Care Policy & Research, New York, NY, USA
- Maxim Topaz
  - Columbia University School of Nursing, New York, NY, USA
4
Song J, Min SH, Chae S, Bowles KH, McDonald MV, Hobensack M, Barrón Y, Sridharan S, Davoudi A, Oh S, Evans L, Topaz M. Uncovering hidden trends: identifying time trajectories in risk factors documented in clinical notes and predicting hospitalizations and emergency department visits during home health care. J Am Med Inform Assoc 2023; 30:1801-1810. [PMID: 37339524 PMCID: PMC10586044 DOI: 10.1093/jamia/ocad101] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2023] [Revised: 05/04/2023] [Accepted: 06/02/2023] [Indexed: 06/22/2023] Open
Abstract
OBJECTIVE: This study aimed to identify temporal risk factor patterns documented in home health care (HHC) clinical notes and examine their association with hospitalizations or emergency department (ED) visits.
MATERIALS AND METHODS: Data for 73 350 episodes of care from one large HHC organization were analyzed using dynamic time warping and hierarchical clustering analysis to identify the temporal patterns of risk factors documented in clinical notes. The Omaha System nursing terminology represented risk factors. First, clinical characteristics were compared between clusters. Next, multivariate logistic regression was used to examine the association between clusters and risk for hospitalizations or ED visits. Omaha System domains corresponding to risk factors were analyzed and described in each cluster.
RESULTS: Six temporal clusters emerged, showing different patterns in how risk factors were documented over time. Patients with a steep increase in documented risk factors over time had a 3 times higher likelihood of hospitalization or ED visit than patients with no documented risk factors. Most risk factors belonged to the physiological domain, and only a few were in the environmental domain.
DISCUSSION: An analysis of risk factor trajectories reflects a patient's evolving health status during an HHC episode. Using standardized nursing terminology, this study provided new insights into the complex temporal dynamics of HHC, which may lead to improved patient outcomes through better treatment and management plans.
CONCLUSION: Incorporating temporal patterns in documented risk factors and their clusters into early warning systems may activate interventions to prevent hospitalizations or ED visits in HHC.
Affiliation(s)
- Jiyoun Song
  - Columbia University School of Nursing, New York City, New York, USA
- Se Hee Min
  - Columbia University School of Nursing, New York City, New York, USA
- Sena Chae
  - College of Nursing, University of Iowa, Iowa City, Iowa, USA
- Kathryn H Bowles
  - Department of Biobehavioral Health Sciences, University of Pennsylvania School of Nursing, Philadelphia, Pennsylvania, USA
  - Center for Home Care Policy & Research, VNS Health, New York, New York, USA
- Mollie Hobensack
  - Columbia University School of Nursing, New York City, New York, USA
- Yolanda Barrón
  - Center for Home Care Policy & Research, VNS Health, New York, New York, USA
- Sridevi Sridharan
  - Center for Home Care Policy & Research, VNS Health, New York, New York, USA
- Anahita Davoudi
  - Center for Home Care Policy & Research, VNS Health, New York, New York, USA
- Sungho Oh
  - Department of Biobehavioral Health Sciences, University of Pennsylvania School of Nursing, Philadelphia, Pennsylvania, USA
- Lauren Evans
  - Center for Home Care Policy & Research, VNS Health, New York, New York, USA
- Maxim Topaz
  - Columbia University School of Nursing, New York City, New York, USA
  - Center for Home Care Policy & Research, VNS Health, New York, New York, USA
  - Data Science Institute, Columbia University, New York City, New York, USA
5
Mitha S, Schwartz J, Hobensack M, Cato K, Woo K, Smaldone A, Topaz M. Natural Language Processing of Nursing Notes: An Integrative Review. Comput Inform Nurs 2023; 41:377-384. [PMID: 36730744 PMCID: PMC11499545 DOI: 10.1097/cin.0000000000000967] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/04/2023]
Abstract
Natural language processing includes a variety of techniques that help to extract meaning from narrative data. In healthcare, medical natural language processing has been a growing field of study; however, little is known about its use in nursing. We searched PubMed, EMBASE, and CINAHL and found 689 studies, narrowed to 43 eligible studies using natural language processing in nursing notes. Data related to the study purpose, patient population, methodology, performance evaluation metrics, and quality indicators were extracted for each study. The majority (86%) of the studies were conducted from 2015 to 2021. Most of the studies (58%) used inpatient data. One of four studies used data from open-source databases. The most common standard terminologies used were the Unified Medical Language System and Systematized Nomenclature of Medicine, whereas nursing-specific standard terminologies were used only in eight studies. Full system performance metrics (eg, F score) were reported for 61% of applicable studies. The overall number of nursing natural language processing publications remains relatively small compared with the other medical literature. Future studies should evaluate and report appropriate performance metrics and use existing standard nursing terminologies to enable future scalability of the methods and findings.
Affiliation(s)
- Shazia Mitha
  - Columbia University School of Nursing, New York
6
Song J, Chae S, Bowles KH, McDonald MV, Barrón Y, Cato K, Collins Rossetti S, Hobensack M, Sridharan S, Evans L, Davoudi A, Topaz M. The identification of clusters of risk factors and their association with hospitalizations or emergency department visits in home health care. J Adv Nurs 2023; 79:593-604. [PMID: 36414419 PMCID: PMC10163408 DOI: 10.1111/jan.15498] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2022] [Revised: 09/30/2022] [Accepted: 10/31/2022] [Indexed: 11/24/2022]
Abstract
AIMS: To identify clusters of risk factors in home health care and determine if the clusters are associated with hospitalizations or emergency department visits.
DESIGN: A retrospective cohort study.
METHODS: This study included 61,454 patients pertaining to 79,079 episodes receiving home health care between 2015 and 2017 from one of the largest home health care organizations in the United States. Potential risk factors were extracted from structured data and unstructured clinical notes analysed by natural language processing. A K-means cluster analysis was conducted. Kaplan-Meier analysis was conducted to identify the association between clusters and hospitalizations or emergency department visits during home health care.
RESULTS: A total of 11.6% of home health episodes resulted in hospitalizations or emergency department visits. Risk factors formed three clusters. Cluster 1 is characterized by a combination of risk factors related to "impaired physical comfort with pain," defined as situations where patients may experience increased pain. Cluster 2 is characterized by "high comorbidity burden," defined as multiple comorbidities or other risks for hospitalization (e.g., prior falls). Cluster 3 is characterized by "impaired cognitive/psychological and skin integrity," including dementia or skin ulcer. Compared to Cluster 1, the risk of hospitalizations or emergency department visits increased by 1.95 times for Cluster 2 and by 2.12 times for Cluster 3 (all p < .001).
CONCLUSION: Risk factors were clustered into three types describing distinct characteristics for hospitalizations or emergency department visits. Different combinations of risk factors affected the likelihood of these negative outcomes.
IMPACT: Cluster-based risk prediction models could be integrated into early warning systems to identify patients at risk for hospitalizations or emergency department visits, leading to more timely, patient-centred care, ultimately preventing these events.
PATIENT OR PUBLIC CONTRIBUTION: There was no involvement of patients in developing the research question, determining the outcome measures, or implementing the study.
Affiliation(s)
- Jiyoun Song
  - Columbia University School of Nursing, New York City, New York, USA
- Sena Chae
  - College of Nursing, The University of Iowa, Iowa City, Iowa, USA
- Kathryn H. Bowles
  - Department of Biobehavioral Health Sciences, University of Pennsylvania School of Nursing, Philadelphia, Pennsylvania, USA
  - Center for Home Care Policy & Research, VNS Health, New York, New York City, USA
- Margaret V. McDonald
  - Center for Home Care Policy & Research, VNS Health, New York, New York City, USA
- Yolanda Barrón
  - Center for Home Care Policy & Research, VNS Health, New York, New York City, USA
- Kenrick Cato
  - Columbia University School of Nursing, New York City, New York, USA
  - Emergency Medicine, Columbia University Irving Medical Center, New York City, New York, USA
- Sarah Collins Rossetti
  - Columbia University School of Nursing, New York City, New York, USA
  - Department of Biomedical Informatics, Columbia University, New York City, New York, USA
- Mollie Hobensack
  - Columbia University School of Nursing, New York City, New York, USA
- Sridevi Sridharan
  - Center for Home Care Policy & Research, VNS Health, New York, New York City, USA
- Lauren Evans
  - Center for Home Care Policy & Research, VNS Health, New York, New York City, USA
- Anahita Davoudi
  - Center for Home Care Policy & Research, VNS Health, New York, New York City, USA
- Maxim Topaz
  - Columbia University School of Nursing, New York City, New York, USA
  - Center for Home Care Policy & Research, VNS Health, New York, New York City, USA
  - Data Science Institute, Columbia University, New York City, New York, USA
7
Hobensack M, Song J, Scharp D, Bowles KH, Topaz M. Machine learning applied to electronic health record data in home healthcare: A scoping review. Int J Med Inform 2023; 170:104978. [PMID: 36592572 PMCID: PMC9869861 DOI: 10.1016/j.ijmedinf.2022.104978] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2022] [Revised: 12/13/2022] [Accepted: 12/23/2022] [Indexed: 12/31/2022]
Abstract
OBJECTIVE: Despite recent calls for home healthcare (HHC) to integrate informatics, the application of machine learning in HHC is relatively unknown. Thus, this study aimed to synthesize and appraise the literature describing the application of machine learning to predict adverse outcomes (e.g., hospitalization, mortality) using electronic health record (EHR) data in the HHC setting. Our secondary aim was to evaluate the comprehensiveness of predictors used in the machine learning algorithms guided by the Biopsychosocial Model.
METHODS: During March 2022, we conducted a literature search in four databases: PubMed, Embase, CINAHL, and Scopus. Inclusion criteria were 1) describing services provided in the HHC setting, 2) applying machine learning algorithms to predict adverse outcomes, defined as outcomes related to patient deterioration, 3) using EHR data, and 4) focusing on the adult population. Predictors were mapped to the Biopsychosocial Model. A risk of bias analysis was conducted using the Prediction Model Risk Of Bias Assessment Tool.
RESULTS: The final sample included 20 studies. Eighteen studies used predictors from standardized assessments integrated in the EHR. The most common outcome of interest was hospitalization (55%), followed by mortality (25%). Psychological predictors were frequently excluded (35%). Tree-based algorithms were most frequently applied (75%). Most studies demonstrated high or unclear risk of bias (75%).
CONCLUSION: Future studies in HHC should consider incorporating machine learning algorithms into clinical decision support systems to identify patients at risk. Based on the Biopsychosocial Model, psychological and interpersonal characteristics should be used along with biological characteristics to enhance risk prediction. To facilitate the widespread adoption of machine learning, stakeholders should encourage standardization in the HHC setting.
Affiliation(s)
- Jiyoun Song
  - Columbia University School of Nursing, New York, NY, USA
- Kathryn H Bowles
  - Department of Biobehavioral Health Sciences, University of Pennsylvania School of Nursing, Philadelphia, PA, USA
  - Center for Home Care Policy & Research, VNS Health, New York, NY, USA
- Maxim Topaz
  - Columbia University School of Nursing, New York, NY, USA
  - Center for Home Care Policy & Research, VNS Health, New York, NY, USA
  - Data Science Institute, Columbia University, New York, NY, USA
8
Song J, Ojo M, Bowles KH, McDonald MV, Cato K, Rossetti SC, Adams V, Chae S, Hobensack M, Kennedy E, Tark A, Kang MJ, Woo K, Barrón Y, Sridharan S, Topaz M. Detecting Language Associated With Home Healthcare Patient's Risk for Hospitalization and Emergency Department Visit. Nurs Res 2022; 71:285-294. [PMID: 35171126 PMCID: PMC9246992 DOI: 10.1097/nnr.0000000000000586] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 11/25/2022]
Abstract
BACKGROUND: About one in five patients receiving home healthcare (HHC) services are hospitalized or visit an emergency department (ED) during a home care episode. Early identification of at-risk patients can prevent these negative outcomes. However, risk indicators, including language in clinical notes that indicates a concern about a patient, are often hidden in narrative documentation throughout their HHC episode.
OBJECTIVE: The aim of the study was to develop an automated natural language processing (NLP) algorithm to identify concerning language indicative of HHC patients' risk of hospitalizations or ED visits.
METHODS: This study used the Omaha System, a standardized nursing terminology that describes problems/signs/symptoms that can occur in the community setting. First, five HHC experts iteratively reviewed the Omaha System and identified concerning concepts indicative of HHC patients' risk of hospitalizations or ED visits. Next, we developed and tested an NLP algorithm to identify these concerning concepts in HHC clinical notes automatically. The resulting NLP algorithm was applied to a large subset of narrative notes (2.3 million notes) documented for 66,317 unique patients (n = 87,966 HHC episodes) admitted to one large HHC agency in the Northeast United States between 2015 and 2017.
RESULTS: A total of 160 Omaha System signs/symptoms were identified as concerning concepts for hospitalizations or ED visits in HHC. These signs/symptoms belong to 31 of the 42 available Omaha System problems. Overall, the NLP algorithm showed good performance in identifying concerning concepts in clinical notes. More than 18% of clinical notes were detected as having at least one concerning concept, and more than 90% of HHC episodes included at least one Omaha System problem. The most frequently documented concerning concepts were pain, followed by issues related to neuromusculoskeletal function, circulation, mental health, and communicable/infectious conditions.
CONCLUSION: Our findings suggest that concerning problems or symptoms that could increase the risk of hospitalization or ED visit were frequently documented in narrative clinical notes. NLP can automatically extract information from narrative clinical notes to improve our understanding of care needs in HHC. Next steps are to evaluate which concerning concepts identified in clinical notes predict hospitalization or ED visit.
9
Song J, Hobensack M, Bowles KH, McDonald MV, Cato K, Rossetti SC, Chae S, Kennedy E, Barrón Y, Sridharan S, Topaz M. Clinical notes: An untapped opportunity for improving risk prediction for hospitalization and emergency department visit during home health care. J Biomed Inform 2022; 128:104039. [PMID: 35231649 PMCID: PMC9825202 DOI: 10.1016/j.jbi.2022.104039] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 09/20/2021] [Revised: 02/22/2022] [Accepted: 02/23/2022] [Indexed: 01/11/2023]
Abstract
BACKGROUND/OBJECTIVE: Between 10% and 25% of patients are hospitalized or visit an emergency department (ED) during home healthcare (HHC). Given that up to 40% of these negative clinical outcomes are preventable, early and accurate prediction of hospitalization risk can be one strategy to prevent them. In recent years, machine learning-based predictive modeling has become widely used for building risk models. This study aimed to compare the predictive performance of four risk models built with various data sources for hospitalization and ED visits in HHC.
METHODS: Four risk models were built using different variables from two data sources: structured data (i.e., Outcome and Assessment Information Set (OASIS) and other assessment items from the electronic health record (EHR)) and unstructured narrative free-text clinical notes for patients who received HHC services from the largest non-profit HHC organization in New York between 2015 and 2017. Then, five machine learning algorithms (logistic regression, Random Forest, Bayesian network, support vector machine (SVM), and Naïve Bayes) were used on each risk model. Risk model performance was evaluated using the F-score and Precision-Recall Curve (PRC) area metrics.
RESULTS: During the study period, 8373/86,823 (9.6%) HHC episodes resulted in hospitalization or ED visits. Among the five machine learning algorithms applied to each model, the SVM showed the highest F-score (0.82), while the Random Forest showed the highest PRC area (0.864). Adding information extracted from clinical notes significantly improved the risk prediction ability by up to 16.6% in F-score and 17.8% in PRC area.
CONCLUSION: All models showed relatively good hospitalization or ED visit risk predictive performance in HHC. Information from clinical notes integrated with the structured data improved the ability to identify patients at risk for these emergent care events.
Affiliation(s)
- Jiyoun Song
  - Columbia University School of Nursing, New York City, NY, USA
  - Center for Home Care Policy & Research, Visiting Nurse Service of New York, New York City, NY, USA
  - Corresponding author at: Columbia University School of Nursing, 560 West 168th Street, New York, NY 10032, USA (J. Song)
- Kathryn H. Bowles
  - Center for Home Care Policy & Research, Visiting Nurse Service of New York, New York City, NY, USA
  - University of Pennsylvania School of Nursing, Department of Biobehavioral Health Sciences, Philadelphia, PA, USA
- Margaret V. McDonald
  - Center for Home Care Policy & Research, Visiting Nurse Service of New York, New York City, NY, USA
- Kenrick Cato
  - Columbia University School of Nursing, New York City, NY, USA
  - Emergency Medicine, Columbia University Irving Medical Center, New York, NY, USA
- Sarah Collins Rossetti
  - Columbia University School of Nursing, New York City, NY, USA
  - Columbia University, Department of Biomedical Informatics, New York City, NY, USA
- Sena Chae
  - College of Nursing, University of Iowa, Iowa City, IA, USA
- Erin Kennedy
  - University of Pennsylvania School of Nursing, Department of Biobehavioral Health Sciences, Philadelphia, PA, USA
- Yolanda Barrón
  - Center for Home Care Policy & Research, Visiting Nurse Service of New York, New York City, NY, USA
- Sridevi Sridharan
  - Center for Home Care Policy & Research, Visiting Nurse Service of New York, New York City, NY, USA
- Maxim Topaz
  - Columbia University School of Nursing, New York City, NY, USA
  - Center for Home Care Policy & Research, Visiting Nurse Service of New York, New York City, NY, USA
  - Data Science Institute, Columbia University, New York City, NY, USA