1
Topaz M, Davoudi A, Evans L, Sridharan S, Song J, Chae S, Barrón Y, Hobensack M, Scharp D, Cato K, Rossetti SC, Kapela P, Xu Z, Gupta P, Zhang Z, Mcdonald MV, Bowles KH. Building a Time-Series Model to Predict Hospitalization Risks in Home Health Care: Insights Into Development, Accuracy, and Fairness. J Am Med Dir Assoc 2025; 26:105417. [PMID: 39689864 DOI: 10.1016/j.jamda.2024.105417] [Received: 10/07/2024] [Revised: 11/09/2024] [Accepted: 11/12/2024] [Indexed: 12/19/2024]
Abstract
OBJECTIVES: Home health care (HHC) serves more than 5 million older adults annually in the United States, aiming to prevent unnecessary hospitalizations and emergency department (ED) visits. Despite efforts, up to 25% of patients in HHC experience these adverse events. The underutilization of clinical notes, aggregated data approaches, and potential demographic biases have limited previous HHC risk prediction models. This study aimed to develop a time-series risk model to predict hospitalizations and ED visits in patients in HHC, examine model performance over various prediction windows, identify top predictive variables and map them to data standards, and assess model fairness across demographic subgroups.
SETTING AND PARTICIPANTS: A total of 27,222 HHC episodes between 2015 and 2017.
METHODS: The study used health care process modeling of electronic health records, including clinical notes processed with natural language processing techniques and Medicare claims data. A Light Gradient Boosting Machine algorithm was used to develop the risk prediction model, with performance evaluated using 5-fold cross-validation. Model fairness was assessed across gender, race/ethnicity, and socioeconomic subgroups.
RESULTS: The model achieved high predictive performance, with an F1 score of 0.84 for a 5-day prediction window. Twenty top predictive variables were identified, including novel indicators such as the length of nurse-patient visits and visit frequency. Eighty-five percent of these variables mapped completely to the US Core Data for Interoperability standard. Fairness assessment revealed performance disparities across demographic and socioeconomic groups, with lower model effectiveness for more historically underserved populations.
CONCLUSIONS AND IMPLICATIONS: This study developed a robust time-series risk model for predicting adverse events in patients in HHC, incorporating diverse data types and demonstrating high predictive accuracy. The findings highlight the importance of considering established and novel risk factors in HHC. Importantly, the observed performance disparities across subgroups emphasize the need for fairness adjustments to ensure equitable risk prediction across all patient populations.
Affiliation(s)
- Maxim Topaz: Columbia University School of Nursing, New York City, NY, USA; Data Science Institute, Columbia University, New York City, NY, USA; Center for Home Care Policy and Research, VNS Health, New York City, NY, USA
- Anahita Davoudi: Center for Home Care Policy and Research, VNS Health, New York City, NY, USA
- Lauren Evans: Center for Home Care Policy and Research, VNS Health, New York City, NY, USA
- Sridevi Sridharan: Center for Home Care Policy and Research, VNS Health, New York City, NY, USA
- Jiyoun Song: Department Behavioral Health Sciences, University of Pennsylvania School of Nursing, Philadelphia, PA, USA
- Sena Chae: College of Nursing, The University of Iowa, Iowa City, IA, USA
- Yolanda Barrón: Center for Home Care Policy and Research, VNS Health, New York City, NY, USA
- Danielle Scharp: Columbia University School of Nursing, New York City, NY, USA
- Kenrick Cato: School of Nursing, University of Pennsylvania, Philadelphia, PA, USA
- Sarah Collins Rossetti: Columbia University School of Nursing, New York City, NY, USA; Department of Biomedical Informatics, Columbia University, New York City, NY, USA
- Piotr Kapela: Center for Home Care Policy and Research, VNS Health, New York City, NY, USA
- Zidu Xu: Columbia University School of Nursing, New York City, NY, USA
- Pallavi Gupta: Columbia University School of Nursing, New York City, NY, USA
- Zhihong Zhang: Columbia University School of Nursing, New York City, NY, USA; Data Science Institute, Columbia University, New York City, NY, USA
- Margaret V Mcdonald: Center for Home Care Policy and Research, VNS Health, New York City, NY, USA
- Kathryn H Bowles: Center for Home Care Policy and Research, VNS Health, New York City, NY, USA; Department Behavioral Health Sciences, University of Pennsylvania School of Nursing, Philadelphia, PA, USA; New Courtland Center for Transitions and Health, School of Nursing, University of Pennsylvania, Philadelphia, PA, USA
2
Hobensack M, Scharp D, Song J, Topaz M. Documentation of social determinants of health across individuals from different racial and ethnic groups in home healthcare. J Nurs Scholarsh 2025; 57:39-46. [PMID: 38739091 DOI: 10.1111/jnu.12980] [Received: 10/28/2023] [Revised: 04/23/2024] [Accepted: 04/24/2024] [Indexed: 05/14/2024]
Abstract
INTRODUCTION: Home healthcare (HHC) enables patients to receive healthcare services within their homes to manage chronic conditions and recover from illnesses. Recent research has identified disparities in HHC based on race or ethnicity. Social determinants of health (SDOH) describe the external factors influencing a patient's health, such as access to care and social support. Individuals from racially or ethnically minoritized communities are known to be disproportionately affected by SDOH. Existing evidence suggests that SDOH are documented in clinical notes. However, no prior study has investigated the documentation of SDOH across individuals from different racial or ethnic backgrounds in the HHC setting. This study aimed to (1) describe frequencies of SDOH documented in clinical notes by race or ethnicity and (2) determine associations between race or ethnicity and SDOH documentation.
DESIGN: Retrospective data analysis.
METHODS: We conducted a cross-sectional secondary data analysis of 86,866 HHC episodes representing 65,693 unique patients from one large HHC agency in New York collected between January 1, 2015, and December 31, 2017. We reported the frequency of six SDOH (physical environment, social environment, housing and economic circumstances, food insecurity, access to care, and education and literacy) documented in clinical notes across individuals reported as Asian/Pacific Islander, Black, Hispanic, multi-racial, Native American, or White. We analyzed differences in SDOH documentation by race or ethnicity using logistic regression models.
RESULTS: Compared to patients reported as White, patients across other racial or ethnic groups had higher frequencies of SDOH documented in their clinical notes. Our results suggest that race or ethnicity is associated with SDOH documentation in HHC.
CONCLUSION: As the study of SDOH in HHC continues to evolve, our results provide a foundation to evaluate social information in the HHC setting and understand how it influences the quality of care provided.
CLINICAL RELEVANCE: The results of this exploratory study can help clinicians understand the differences in SDOH across individuals from different racial and ethnic groups and serve as a foundation for future research aimed at fostering more inclusive HHC documentation practices.
Affiliation(s)
- Mollie Hobensack: Department of Geriatrics and Palliative Care, Icahn School of Medicine at Mount Sinai, New York City, New York, USA
- Danielle Scharp: Columbia University School of Nursing, New York City, New York, USA
- Jiyoun Song: University of Pennsylvania School of Nursing, Philadelphia, Pennsylvania, USA
- Maxim Topaz: Columbia University School of Nursing, New York City, New York, USA; Data Science Institute, Columbia University, New York City, New York, USA; Center for Home Care Policy & Research, VNS Health, New York City, New York, USA
4
Sun S, Zack T, Williams CYK, Sushil M, Butte AJ. Topic modeling on clinical social work notes for exploring social determinants of health factors. JAMIA Open 2024; 7:ooad112. [PMID: 38223407 PMCID: PMC10788143 DOI: 10.1093/jamiaopen/ooad112] [Received: 08/03/2023] [Revised: 12/17/2023] [Accepted: 12/23/2023] [Indexed: 01/16/2024] Open
Abstract
Objective: Existing research on social determinants of health (SDoH) predominantly focuses on physician notes and structured data within electronic medical records. This study posits that social work notes are an untapped, potentially rich source for SDoH information. We hypothesize that clinical notes recorded by social workers, whose role is to ameliorate social and economic factors, might provide a complementary information source of data on SDoH compared to physician notes, which primarily concentrate on medical diagnoses and treatments. We aimed to use word frequency analysis and topic modeling to identify prevalent terms and robust topics of discussion within a large cohort of social work notes including both outpatient and in-patient consultations.
Materials and methods: We retrieved a diverse, deidentified corpus of 0.95 million clinical social work notes from 181 644 patients at the University of California, San Francisco. We conducted word frequency analysis related to ICD-10 chapters to identify prevalent terms within the notes. We then applied Latent Dirichlet Allocation (LDA) topic modeling analysis to characterize this corpus and identify potential topics of discussion, which was further stratified by note types and disease groups.
Results: Word frequency analysis primarily identified medical-related terms associated with specific ICD10 chapters, though it also detected some subtle SDoH terms. In contrast, the LDA topic modeling analysis extracted 11 topics explicitly related to social determinants of health risk factors, such as financial status, abuse history, social support, risk of death, and mental health. The topic modeling approach effectively demonstrated variations between different types of social work notes and across patients with different types of diseases or conditions.
Discussion: Our findings highlight LDA topic modeling's effectiveness in extracting SDoH-related themes and capturing variations in social work notes, demonstrating its potential for informing targeted interventions for at-risk populations.
Conclusion: Social work notes offer a wealth of unique and valuable information on an individual's SDoH. These notes present consistent and meaningful topics of discussion that can be effectively analyzed and utilized to improve patient care and inform targeted interventions for at-risk populations.
Affiliation(s)
- Shenghuan Sun: Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA 94158, United States
- Travis Zack: Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA 94158, United States; Division of Hematology/Oncology, Department of Medicine, UCSF, San Francisco, CA 94143, United States
- Christopher Y K Williams: Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA 94158, United States
- Madhumita Sushil: Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA 94158, United States
- Atul J Butte: Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA 94158, United States; Center for Data-driven Insights and Innovation, University of California, Office of the President, Oakland, CA 94607, United States; Department of Pediatrics, University of California, San Francisco, San Francisco, CA 94143, United States
5
Scharp D, Hobensack M, Davoudi A, Topaz M. Natural Language Processing Applied to Clinical Documentation in Post-acute Care Settings: A Scoping Review. J Am Med Dir Assoc 2024; 25:69-83. [PMID: 37838000 PMCID: PMC10792659 DOI: 10.1016/j.jamda.2023.09.006] [Received: 06/29/2023] [Revised: 09/05/2023] [Accepted: 09/07/2023] [Indexed: 10/16/2023]
Abstract
OBJECTIVES: To determine the scope of the application of natural language processing to free-text clinical notes in post-acute care and provide a foundation for future natural language processing-based research in these settings.
DESIGN: Scoping review; reported according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews guidelines.
SETTING AND PARTICIPANTS: Post-acute care (ie, home health care, long-term care, skilled nursing facilities, and inpatient rehabilitation facilities).
METHODS: PubMed, Cumulative Index of Nursing and Allied Health Literature, and Embase were searched in February 2023. Eligible studies had quantitative designs that used natural language processing applied to clinical documentation in post-acute care settings. The quality of each study was appraised.
RESULTS: Twenty-one studies were included. Almost all studies were conducted in home health care settings. Most studies extracted data from electronic health records to examine the risk for negative outcomes, including acute care utilization, medication errors, and suicide mortality. About half of the studies did not report age, sex, race, or ethnicity data or use standardized terminologies. Only 8 studies included variables from socio-behavioral domains. Most studies fulfilled all quality appraisal indicators.
CONCLUSIONS AND IMPLICATIONS: The application of natural language processing is nascent in post-acute care settings. Future research should apply natural language processing using standardized terminologies to leverage free-text clinical notes in post-acute care to promote timely, comprehensive, and equitable care. Natural language processing could be integrated with predictive models to help identify patients who are at risk of negative outcomes. Future research should incorporate socio-behavioral determinants and diverse samples to improve health equity in informatics tools.
Affiliation(s)
- Anahita Davoudi: VNS Health, Center for Home Care Policy & Research, New York, NY, USA
- Maxim Topaz: Columbia University School of Nursing, New York, NY, USA