1. Gu Z, He L, Naeem A, Chan PM, Mohamed A, Khalil H, Guo Y, Shi W, Dupre ME, Xiao G, Peterson ED, Xie Y, Navar AM, Yang DM. SBDH-Reader: an LLM-powered method for extracting social and behavioral determinants of health from medical notes. medRxiv 2025:2025.02.19.25322576. [PMID: 40034759 PMCID: PMC11875322 DOI: 10.1101/2025.02.19.25322576] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/05/2025]
Abstract
Introduction: Social and behavioral determinants of health (SBDH) are increasingly recognized as essential for prognostication and informing targeted interventions. While medical notes contain rich SBDH details, these are unstructured, and conventional extraction methods tend to be labor intensive, inaccurate, and/or unscalable. The emergence of large language models (LLMs) presents an opportunity to develop more effective approaches for extracting SBDH data. Materials and Methods: We developed the SBDH-Reader, an LLM-powered method to extract structured SBDH data from full-length medical notes through prompt engineering. Six SBDH categories were queried: employment, housing, marital relationship, and three forms of substance use (alcohol, tobacco, and drug use). The development dataset included 7,225 notes from 6,382 patients in the MIMIC-III database. The method was then independently tested on 971 notes from 437 patients at UT Southwestern Medical Center (UTSW). We evaluated SBDH-Reader's performance using precision, recall, F1, and confusion matrices. Results: When tested on the UTSW validation set, the GPT-4o-based SBDH-Reader achieved a macro-average F1 ranging from 0.85 to 0.98 across the six SBDH categories. For clinically relevant adverse attributes, F1 ranged from 0.94 (employment) to 0.99 (tobacco use). When extracting any adverse attributes across all SBDH categories, the SBDH-Reader achieved an F1 of 0.96, recall of 0.97, and precision of 0.96 in this independent validation set. Conclusion: A general-purpose LLM can accurately extract structured SBDH data through effective prompt engineering. The SBDH-Reader has the potential to serve as a scalable and effective method for collecting real-time, patient-level SBDH data to support clinical research and care.
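As a quick arithmetic check on the figures in this abstract, the sketch below computes F1 as the harmonic mean of precision and recall, and a macro-average as the unweighted mean of per-class F1 scores. The function names are illustrative assumptions and are not taken from SBDH-Reader; only the precision (0.96) and recall (0.97) values come from the abstract.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_average(scores: list[float]) -> float:
    """Unweighted mean of per-class scores (each class counts equally)."""
    if not scores:
        raise ValueError("at least one score is required")
    return sum(scores) / len(scores)


# Precision and recall as reported for "any adverse attribute" in the abstract.
overall_f1 = f1_score(precision=0.96, recall=0.97)
print(round(overall_f1, 2))
```

With the rounded inputs from the abstract, the result rounds to the reported F1 of 0.96, so the three published values are mutually consistent.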
Affiliation(s)
- Zifan Gu
  - Quantitative Biomedical Research Center, Peter O’Donnell Jr. School of Public Health, The University of Texas Southwestern Medical Center, Dallas, Texas, USA
- Lesi He
  - Quantitative Biomedical Research Center, Peter O’Donnell Jr. School of Public Health, The University of Texas Southwestern Medical Center, Dallas, Texas, USA
- Awais Naeem
  - School of Information, University of Texas at Austin, Austin, Texas, USA
- Pui Man Chan
  - Quantitative Biomedical Research Center, Peter O’Donnell Jr. School of Public Health, The University of Texas Southwestern Medical Center, Dallas, Texas, USA
- Asim Mohamed
  - The University of Texas Southwestern Medical Center, Dallas, Texas, USA
- Hafsa Khalil
  - The University of Texas Southwestern Medical Center, Dallas, Texas, USA
- Yujia Guo
  - Quantitative Biomedical Research Center, Peter O’Donnell Jr. School of Public Health, The University of Texas Southwestern Medical Center, Dallas, Texas, USA
- Wenqi Shi
  - Quantitative Biomedical Research Center, Peter O’Donnell Jr. School of Public Health, The University of Texas Southwestern Medical Center, Dallas, Texas, USA
- Matthew E. Dupre
  - Department of Population Health Sciences, Duke University, Durham, North Carolina, USA
  - Department of Sociology, Duke University, Durham, North Carolina, USA
- Guanghua Xiao
  - Quantitative Biomedical Research Center, Peter O’Donnell Jr. School of Public Health, The University of Texas Southwestern Medical Center, Dallas, Texas, USA
  - Department of Bioinformatics, The University of Texas Southwestern Medical Center, Dallas, Texas, USA
  - Simmons Comprehensive Cancer Center, The University of Texas Southwestern Medical Center, Dallas, Texas, USA
- Eric D. Peterson
  - Department of Internal Medicine, The University of Texas Southwestern Medical Center, Dallas, TX, USA
- Yang Xie
  - Quantitative Biomedical Research Center, Peter O’Donnell Jr. School of Public Health, The University of Texas Southwestern Medical Center, Dallas, Texas, USA
  - Department of Bioinformatics, The University of Texas Southwestern Medical Center, Dallas, Texas, USA
  - Simmons Comprehensive Cancer Center, The University of Texas Southwestern Medical Center, Dallas, Texas, USA
- Ann Marie Navar
  - Department of Internal Medicine, The University of Texas Southwestern Medical Center, Dallas, TX, USA
- Donghan M. Yang
  - Quantitative Biomedical Research Center, Peter O’Donnell Jr. School of Public Health, The University of Texas Southwestern Medical Center, Dallas, Texas, USA
2. Farmer HR, Xu H, Granger BB, Thomas KL, Dupre ME. Factors associated with racial differences in all-cause 30-day readmission in adults with cardiovascular disease: an observational study of a large healthcare system. BMJ Open 2022; 12:e051661. [PMID: 36424114 PMCID: PMC9693888 DOI: 10.1136/bmjopen-2021-051661] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]
Abstract
OBJECTIVE: To examine factors contributing to racial differences in 30-day readmission in patients with cardiovascular disease (CVD). DESIGN: Patients were enrolled from 1 January 2015 to 31 August 2017 and data were collected from electronic health records and a standardised interview administered prior to discharge. SETTING: Duke Heart Center in the Duke University Health System. PARTICIPANTS: Patients aged 18 and older admitted for the treatment of cardiovascular-related conditions (n=734). MAIN OUTCOME AND MEASURES: All-cause readmission within 30 days was the main outcome. Multivariate logistic regression models were used to examine whether and to what extent socioeconomic, psychosocial, behavioural and healthcare-related factors contributed to 30-day readmissions in Black and White CVD patients. RESULTS: The median age of patients was 66 years and 18.1% (n=133) were readmitted within 30 days after discharge. Black patients were more likely than White patients to be readmitted (OR 1.62; 95% CI 1.18 to 2.23) and the racial difference in readmissions was largely reduced after taking into account differences in a wide range of clinical and non-clinical factors (OR 1.37; 95% CI 0.98 to 1.91). In Black patients, readmission risks were especially high in those who were retired (OR 3.71; 95% CI 1.71 to 8.07), never married (OR 2.21; 95% CI 1.21 to 4.05), had difficulty accessing their routine care (OR 2.88; 95% CI 1.70 to 4.88) or had been hospitalised in the prior year (OR 1.97; 95% CI 1.16 to 3.37). In White patients, being widowed (OR 2.39; 95% CI 1.41 to 4.07) and reporting a higher number of depressive symptoms (OR 1.07; 95% CI 1.00 to 1.13) were the key factors associated with higher risks of readmission. CONCLUSIONS AND RELEVANCE: Black patients were more likely than White patients to be readmitted within 30 days after hospitalisation for CVD. The factors contributing to readmission differed by race and offer important clues for identifying patients at high risk of readmission and tailoring interventions to reduce these risks.
Affiliation(s)
- Heather R Farmer
  - Department of Human Development and Family Sciences, University of Delaware, Newark, Delaware, USA
- Hanzhang Xu
  - Duke University School of Nursing, Durham, North Carolina, USA
- Bradi B Granger
  - Duke University School of Nursing, Durham, North Carolina, USA
- Kevin L Thomas
  - Division of Cardiology, Department of Medicine, Duke University, Durham, North Carolina, USA
  - Duke Clinical Research Institute, Durham, North Carolina, USA
- Matthew E Dupre
  - Duke Clinical Research Institute, Durham, North Carolina, USA
  - Department of Population Health Sciences, Duke University, Durham, North Carolina, USA
  - Department of Sociology, Duke University, Durham, North Carolina, USA
3. Xu H, Farmer HR, Granger BB, Thomas KL, Peterson ED, Dupre ME. Perceived Versus Actual Risks of 30-Day Readmission in Patients With Cardiovascular Disease. Circ Cardiovasc Qual Outcomes 2021; 14:e006586. [PMID: 33430612 DOI: 10.1161/circoutcomes.120.006586] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/16/2022]
Abstract
BACKGROUND: Cardiovascular disease (CVD) is the leading cause of hospitalization in the United States, and patients with CVD are at a high risk of readmission after discharge. We examined whether patients' perceived risk of readmission at discharge was associated with actual 30-day readmissions in patients hospitalized with CVD. METHODS: We recruited 730 patients from the Duke Heart Center who were admitted for treatment of CVD between January 1, 2015, and August 31, 2017. A standardized survey was linked with electronic health records to ascertain patients' perceived risk of readmission, and other sociodemographic, psychosocial, behavioral, and clinical data before discharge. All-cause readmission within 30 days after discharge was examined. RESULTS: Nearly 1-in-3 patients perceived a high risk of readmission at index admission and those who perceived a high risk had significantly more readmissions within 30 days than patients who perceived low risks of readmission (23.6% versus 15.8%, P=0.016). Among those who perceived a high risk of readmission, non-White patients (odds ratio [OR], 2.07 [95% CI, 1.28-3.36]), those with poor self-rated health (OR, 2.30 [95% CI, 1.38-3.85]), difficulty accessing care (OR, 2.72 [95% CI, 1.24-6.00]), and prior hospitalizations in the past year (OR, 2.13 [95% CI, 1.21-3.74]) were more likely to be readmitted. Among those who perceived a low risk of readmission, patients who were widowed (OR, 2.69 [95% CI, 1.60-4.51]) and reported difficulty accessing care (OR, 1.89 [95% CI, 1.07-3.33]) were more likely to be readmitted. CONCLUSIONS: Patients who perceived a high risk of readmission had a higher rate of 30-day readmission than patients who perceived a low risk. These findings have important implications for identifying CVD patients at a high risk of 30-day readmission and targeting the factors associated with perceived and actual risks of readmission.
Affiliation(s)
- Hanzhang Xu
  - Department of Family Medicine and Community Health, Duke University, Durham, NC
  - Duke University School of Nursing, Duke University, Durham, NC
  - Center for the Study of Aging and Human Development, Duke University, Durham, NC
- Heather R Farmer
  - Department of Human Development and Family Sciences, University of Delaware, Newark
- Bradi B Granger
  - Duke University School of Nursing, Duke University, Durham, NC
- Kevin L Thomas
  - Duke Clinical Research Institute, Duke University, Durham, NC
  - Division of Cardiology, Department of Medicine, Duke University, Durham, NC
- Eric D Peterson
  - Duke Clinical Research Institute, Duke University, Durham, NC
  - Department of Internal Medicine, University of Texas Southwestern Medical Center
- Matthew E Dupre
  - Center for the Study of Aging and Human Development, Duke University, Durham, NC
  - Duke Clinical Research Institute, Duke University, Durham, NC
  - Department of Population Health Sciences, Duke University, Durham, NC
  - Department of Sociology, Duke University, Durham, NC
4. de Albuquerque NLS, de Araujo TL, de Oliveira Lopes MV, Moreira TMM. Hierarchical analysis of factors associated with hospital readmissions for coronary heart disease: A case-control study. J Clin Nurs 2020; 29:2329-2337. [PMID: 32222077 DOI: 10.1111/jocn.15244] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 08/14/2019] [Revised: 01/18/2020] [Accepted: 03/14/2020] [Indexed: 12/20/2022]
Abstract
AIMS AND OBJECTIVES: To analyse, hierarchically, factors associated with hospital readmissions for acute coronary syndrome. BACKGROUND: Hospital readmissions have risen, especially in patients with multiple comorbidities, which are most often chronic. The leading causes of hospital readmission include acute coronary syndrome, which is costly and often preventable. Determining clinical and nonclinical variables that increase the chances of readmission is important to assess and evaluate patients hospitalised for coronary heart diseases. DESIGN: A case-control study whose dependent variable was hospital readmission for acute coronary syndrome. METHODS: The study included 277 inpatients, of whom 132 were in their first hospitalisation and 145 had already been hospitalised for acute coronary syndrome. The independent variables for this hierarchical model were sociodemographic conditions, life habits, access to health services and physical health measures. Data were obtained by interviews, anthropometric measurements and patient records. Logistic regression analysis was performed using the stepwise technique, with Microsoft Excel and R version 3.2.3. The research was reported via Strengthening the Reporting of Observational Studies in Epidemiology (STROBE). RESULTS: In the final hierarchical logistic model, the following risk factors were associated with readmission for acute coronary syndrome: inadequate drug therapy adherence, stress, history of smoking for 30 years or more, and the lack of use of primary healthcare services. CONCLUSIONS: Clinical and nonclinical variables are related to hospital readmission for acute coronary syndrome and can increase the chance of readmission by up to six times. RELEVANCE TO CLINICAL PRACTICE: The predictive model can be used to avoid readmission for acute coronary syndrome, and it represents an advance in the prediction of the occurrence of the outcome. This implies the need for a reorientation of the network for postdischarge care in the first hospitalisation for acute coronary syndrome.
5. Dupre ME, Xu H, Granger BB, Lynch SM, Nelson A, Churchill E, Willis JM, Curtis LH, Peterson ED. Access to routine care and risks for 30-day readmission in patients with cardiovascular disease. Am Heart J 2018; 196:9-17. [PMID: 29421019 PMCID: PMC5919257 DOI: 10.1016/j.ahj.2017.10.001] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 08/07/2017] [Accepted: 10/02/2017] [Indexed: 01/19/2023]
Abstract
BACKGROUND: Studies have shown that access to routine medical care is associated with the prevention, diagnosis, and treatment of chronic diseases. However, studies have not examined whether patient-reported difficulties in access to care are associated with rehospitalization in patients with cardiovascular disease. METHODS: Electronic medical records and a standardized survey were used to examine cardiovascular patients admitted to a large medical center from January 1, 2015 through January 10, 2017 (n=520). All-cause readmission within 30 days of discharge was the primary outcome for analysis. Logistic regression models were used to examine the association between access to care and 30-day readmission while adjusting for patient demographics, socioeconomic status, healthcare utilization, and health status. RESULTS: Nearly 1-in-6 patients (15.7%) reported difficulty in accessing routine medical care; and those who were younger, male, non-white, uninsured, with heart failure, and had low social support were significantly more likely to report difficulty. Patients who reported difficulty in accessing care had significantly higher rates of 30-day readmission than patients who did not report difficulty (33.3% vs. 17.9%; P=.001); and the risks remained largely unchanged after accounting for nearly two dozen covariates (unadjusted odds ratio [OR]=2.29; 95% CI, 1.46-3.60 vs. adjusted OR=2.17; 95% CI, 1.29-3.66). Risks for readmission were especially high for patients who reported issues with transportation (OR=3.24; 95% CI, 1.28-8.16) and scheduling appointments (OR=3.56; 95% CI, 1.43-8.84), but not for other reasons (OR=1.47; 95% CI, 0.61-3.54). CONCLUSIONS: Cardiovascular patients who reported difficulty in accessing routine care had substantial risks of readmission within 30 days after discharge. These findings have important implications for identifying high-risk patients and developing interventions to improve access to routine medical care.
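The unadjusted odds ratio in this abstract can be approximately reproduced from the two readmission rates alone. The following is a minimal sketch, assuming the rounded percentages from the abstract (33.3% and 17.9%) as inputs; the function name is hypothetical. The adjusted OR cannot be derived this way, because it depends on the regression covariates.

```python
def odds_ratio(p_exposed: float, p_unexposed: float) -> float:
    """Odds ratio comparing two event proportions, each strictly between 0 and 1."""
    for p in (p_exposed, p_unexposed):
        if not 0.0 < p < 1.0:
            raise ValueError("proportions must lie strictly between 0 and 1")
    odds_exposed = p_exposed / (1.0 - p_exposed)        # odds of readmission, difficulty reported
    odds_unexposed = p_unexposed / (1.0 - p_unexposed)  # odds of readmission, no difficulty
    return odds_exposed / odds_unexposed


# Readmission rates from the abstract: 33.3% (difficulty) vs. 17.9% (no difficulty).
print(round(odds_ratio(0.333, 0.179), 2))
```

Using the rounded rates, the result rounds to 2.29, matching the unadjusted OR in the abstract.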
Affiliation(s)
- Matthew E Dupre
  - Duke Clinical Research Institute, Duke University, Durham, NC
  - Department of Population Health Sciences, Duke University, Durham, NC
  - Department of Sociology, Duke University, Durham, NC
- Hanzhang Xu
  - Duke School of Nursing, Duke University Medical Center, Durham, NC
- Bradi B Granger
  - Duke School of Nursing, Duke University Medical Center, Durham, NC
- Scott M Lynch
  - Department of Sociology, Duke University, Durham, NC
- Alicia Nelson
  - Department of Population Health Sciences, Duke University, Durham, NC
- Erik Churchill
  - Duke Office of Clinical Research, Duke University Medical Center, Durham, NC
- Janese M Willis
  - Department of Population Health Sciences, Duke University, Durham, NC
- Lesley H Curtis
  - Duke Clinical Research Institute, Duke University, Durham, NC
  - Department of Population Health Sciences, Duke University, Durham, NC
- Eric D Peterson
  - Duke Clinical Research Institute, Duke University, Durham, NC
  - Department of Medicine, Division of Cardiology, Duke University, Durham, NC
6. Dupre ME, Nelson A, Lynch SM, Granger BB, Xu H, Churchill E, Willis JM, Curtis LH, Peterson ED. Socioeconomic, Psychosocial and Behavioral Characteristics of Patients Hospitalized With Cardiovascular Disease. Am J Med Sci 2017; 354:565-572. [PMID: 29208253 DOI: 10.1016/j.amjms.2017.07.011] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 03/16/2017] [Revised: 06/13/2017] [Accepted: 07/24/2017] [Indexed: 11/18/2022]
Abstract
BACKGROUND: Recent studies have drawn attention to nonclinical factors to better understand disparities in the development, treatment and prognosis of patients with cardiovascular disease. However, there has been limited research describing the nonclinical characteristics of patients hospitalized for cardiovascular care. METHODS: Data for this study come from 520 patients admitted to the Duke Heart Center from January 1, 2015 through January 10, 2017. Electronic medical records and a standardized survey administered before discharge were used to ascertain detailed information on patients' demographic (age, sex, race, marital status and living arrangement), socioeconomic (education, employment and health insurance), psychosocial (health literacy, health self-efficacy, social support, stress and depressive symptoms) and behavioral (smoking, drinking and medication adherence) attributes. RESULTS: Study participants were of a median age of 65 years, predominantly male (61.4%), non-Hispanic white (67.1%), hospitalized for 5.11 days and comparable to all patients admitted during this period. Results from the survey showed significant heterogeneity among patients in their demographic, socioeconomic and behavioral characteristics. We also found that the patients' levels of psychosocial risks and resources were significantly associated with many of these nonclinical characteristics. Patients who were older, women, nonwhite and unmarried had generally lower levels of health literacy, self-efficacy and social support, and higher levels of stress and depressive symptoms than their counterparts. CONCLUSIONS: Patients hospitalized with cardiovascular disease have diverse nonclinical profiles that have important implications for targeting interventions. A better understanding of these characteristics will enhance the personalized delivery of care and improve outcomes in vulnerable patient groups.
Affiliation(s)
- Matthew E Dupre
  - Duke Clinical Research Institute, Duke University, Durham, North Carolina
  - Department of Population Health Sciences, Duke University, Durham, North Carolina
  - Department of Sociology, Duke University, Durham, North Carolina
- Alicia Nelson
  - Department of Community and Family Medicine, Duke University, Durham, North Carolina
- Scott M Lynch
  - Department of Sociology, Duke University, Durham, North Carolina
- Bradi B Granger
  - Duke School of Nursing, Duke University Medical Center, Durham, North Carolina
- Hanzhang Xu
  - Duke School of Nursing, Duke University Medical Center, Durham, North Carolina
- Erik Churchill
  - Duke Office of Clinical Research, Duke University Medical Center, Durham, North Carolina
- Janese M Willis
  - Department of Community and Family Medicine, Duke University, Durham, North Carolina
- Lesley H Curtis
  - Duke Clinical Research Institute, Duke University, Durham, North Carolina
  - Department of Population Health Sciences, Duke University, Durham, North Carolina
- Eric D Peterson
  - Duke Clinical Research Institute, Duke University, Durham, North Carolina
  - Department of Medicine, Duke University, Durham, North Carolina