Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Bejan CA, Angiolillo J, Conway D, Nash R, Shirey-Rice JK, Lipworth L, Cronin RM, Pulley J, Kripalani S, Barkin S, Johnson KB, Denny JC. Mining 100 million notes to find homelessness and adverse childhood experiences: 2 case studies of rare and severe social determinants of health in electronic health records. J Am Med Inform Assoc 2018;25:61-71. [PMID: 29016793 PMCID: PMC6080810 DOI: 10.1093/jamia/ocx059] [Citation(s) in RCA: 62] [Impact Index Per Article: 10.3] [Received: 11/03/2016] [Revised: 04/22/2017] [Accepted: 05/10/2017] [Indexed: 01/25/2023] Open



Cited by Other Article(s)

Farcas AM, Crowe RP, Kennel J, Little N, Haamid A, Camacho MA, Pleasant T, Owusu-Ansah S, Joiner AP, Tripp R, Kimbrell J, Grover JM, Ashford S, Burton B, Uribe J, Innes JC, Page DI, Taigman M, Dorsett M. Achieving Equity in EMS Care and Patient Outcomes Through Quality Management Systems: A Position Statement. Prehosp Emerg Care 2024:1-11. [PMID: 38727731 DOI: 10.1080/10903127.2024.2352582] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2024] [Accepted: 04/29/2024] [Indexed: 05/18/2024]

Ralevski A, Taiyab N, Nossal M, Mico L, Piekos SN, Hadlock J. Using Large Language Models to Annotate Complex Cases of Social Determinants of Health in Longitudinal Clinical Records. medRxiv 2024:2024.04.25.24306380. [PMID: 38712224 PMCID: PMC11071574 DOI: 10.1101/2024.04.25.24306380] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/08/2024]

Abstract

Social Determinants of Health (SDoH) are an important part of the exposome and are known to have a large impact on variation in health outcomes. In particular, housing stability is known to be intricately linked to a patient's health status, and pregnant women experiencing housing instability (HI) are known to have worse health outcomes. Most SDoH information is stored in electronic health records (EHRs) as free text (unstructured) clinical notes, which traditionally required natural language processing (NLP) for automatic identification of relevant text or keywords. A patient's housing status can be ambiguous or subjective, and can change from note to note or within the same note, making it difficult to use existing NLP solutions. New developments in NLP allow researchers to prompt LLMs to perform complex, subjective annotation tasks that require reasoning that previously could only be attempted by human annotators. For example, large language models (LLMs) such as GPT (Generative Pre-trained Transformer) enable researchers to analyze complex, unstructured data using simple prompts. We used a secure platform within a large healthcare system to compare the ability of GPT-3.5 and GPT-4 to identify instances of both current and past housing instability, as well as general housing status, from 25,217 notes from 795 pregnant women. Results from these LLMs were compared with results from manual annotation, a named entity recognition (NER) model, and regular expressions (RegEx). We developed a chain-of-thought prompt requiring evidence and justification for each note from the LLMs, to help maximize the chances of finding relevant text related to HI while minimizing hallucinations and false positives. 
Compared with GPT-3.5 and the NER model, GPT-4 had the highest performance and had a much higher recall (0.924) than human annotators (0.702) in identifying patients experiencing current or past housing instability, although precision was lower (0.850) compared with human annotators (0.971). In most cases, the evidence output by GPT-4 was similar or identical to that of human annotators, and there was no evidence of hallucinations in any of the outputs from GPT-4. Most cases where the annotators and GPT-4 differed were ambiguous or subjective, such as "living in an apartment with too many people". We also looked at GPT-4 performance on de-identified versions of the same notes and found that precision improved slightly (0.936 original, 0.939 de-identified), while recall dropped (0.781 original, 0.704 de-identified). This work demonstrates that, while manual annotation is likely to yield slightly more accurate results overall, LLMs, when compared with manual annotation, provide a scalable, cost-effective solution with the advantage of greater recall. At the same time, further evaluation is needed to address the risk of missed cases and bias in the initial selection of housing-related notes. Additionally, while it was possible to reduce confabulation, signs of unusual justifications remained. Given these factors, together with changes in both LLMs and charting over time, this approach is not yet appropriate for use as a fully-automated process. However, these results demonstrate the potential for using LLMs for computer-assisted annotation with human review, reducing cost and increasing recall. More efficient methods for obtaining structured SDoH data can help accelerate inclusion of exposome variables in biomedical research, and support healthcare systems in identifying patients who could benefit from proactive outreach.
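As an illustrative aside, the precision and recall figures quoted in this abstract can be combined into a single F1 score for a rough side-by-side comparison. The sketch below is not from the authors: the function name `f1_score` is hypothetical, and the F1 values it prints are derived here from the quoted figures, not reported in the abstract.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) as quoted in the abstract for identifying
# current or past housing instability.
reported = {
    "GPT-4": (0.850, 0.924),
    "human annotators": (0.971, 0.702),
}

for name, (precision, recall) in reported.items():
    print(f"{name}: derived F1 = {f1_score(precision, recall):.3f}")
```

Because F1 weights precision and recall equally, it hides the trade-off the abstract emphasizes (higher recall for GPT-4, higher precision for human annotators), so it is best read alongside the two original figures.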


Scherbakov D, Mollalo A, Lenert L. Stressful life events in electronic health records: a scoping review. J Am Med Inform Assoc 2024;31:1025-1035. [PMID: 38349862 PMCID: PMC10990522 DOI: 10.1093/jamia/ocae023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2023] [Revised: 01/19/2024] [Accepted: 01/27/2024] [Indexed: 02/15/2024] Open

Sun S, Zack T, Williams CYK, Sushil M, Butte AJ. Topic modeling on clinical social work notes for exploring social determinants of health factors. JAMIA Open 2024;7:ooad112. [PMID: 38223407 PMCID: PMC10788143 DOI: 10.1093/jamiaopen/ooad112] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/03/2023] [Revised: 12/17/2023] [Accepted: 12/23/2023] [Indexed: 01/16/2024] Open

Abstract

Objective

Existing research on social determinants of health (SDoH) predominantly focuses on physician notes and structured data within electronic medical records. This study posits that social work notes are an untapped, potentially rich source for SDoH information. We hypothesize that clinical notes recorded by social workers, whose role is to ameliorate social and economic factors, might provide a complementary information source of data on SDoH compared to physician notes, which primarily concentrate on medical diagnoses and treatments. We aimed to use word frequency analysis and topic modeling to identify prevalent terms and robust topics of discussion within a large cohort of social work notes including both outpatient and in-patient consultations.

Materials and methods

We retrieved a diverse, deidentified corpus of 0.95 million clinical social work notes from 181 644 patients at the University of California, San Francisco. We conducted word frequency analysis related to ICD-10 chapters to identify prevalent terms within the notes. We then applied Latent Dirichlet Allocation (LDA) topic modeling analysis to characterize this corpus and identify potential topics of discussion, which was further stratified by note types and disease groups.

Results

Word frequency analysis primarily identified medical-related terms associated with specific ICD-10 chapters, though it also detected some subtle SDoH terms. In contrast, the LDA topic modeling analysis extracted 11 topics explicitly related to social determinants of health risk factors, such as financial status, abuse history, social support, risk of death, and mental health. The topic modeling approach effectively demonstrated variations between different types of social work notes and across patients with different types of diseases or conditions.

Discussion

Our findings highlight LDA topic modeling's effectiveness in extracting SDoH-related themes and capturing variations in social work notes, demonstrating its potential for informing targeted interventions for at-risk populations.

Conclusion

Social work notes offer a wealth of unique and valuable information on an individual's SDoH. These notes present consistent and meaningful topics of discussion that can be effectively analyzed and utilized to improve patient care and inform targeted interventions for at-risk populations.


Hatef E, Chang HY, Richards TM, Kitchen C, Budaraju J, Foroughmand I, Lasser EC, Weiner JP. Development of a Social Risk Score in the Electronic Health Record to Identify Social Needs Among Underserved Populations: Retrospective Study. JMIR Form Res 2024;8:e54732. [PMID: 38470477 DOI: 10.2196/54732] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/20/2023] [Revised: 02/02/2024] [Accepted: 02/08/2024] [Indexed: 03/13/2024] Open

Abstract

BACKGROUND

Patients with unmet social needs and social determinants of health (SDOH) challenges continue to face a disproportionate risk of increased prevalence of disease, health care use, higher health care costs, and worse outcomes. Some existing predictive models have used the available data on social needs and SDOH challenges to predict health-related social needs or the need for various social service referrals. Despite these one-off efforts, the work to date suggests that many technical and organizational challenges must be surmounted before SDOH-integrated solutions can be implemented on an ongoing, wide-scale basis within most US-based health care organizations.

OBJECTIVE

We aimed to retrieve available information in the electronic health record (EHR) relevant to the identification of persons with social needs and to develop a social risk score for use within clinical practice to better identify patients at risk of having future social needs.

METHODS

We conducted a retrospective study using EHR data (2016-2021) and data from the US Census American Community Survey. We developed a prospective model using current year-1 risk factors to predict future year-2 outcomes within four 2-year cohorts. Predictors of interest included demographics, previous health care use, comorbidity, previously identified social needs, and neighborhood characteristics as reflected by the area deprivation index. The outcome variable was a binary indicator reflecting the likelihood of the presence of a patient with social needs. We applied a generalized estimating equation approach, adjusting for patient-level risk factors, the possible effect of geographically clustered data, and the effect of multiple visits for each patient.

RESULTS

The study population of 1,852,228 patients included middle-aged (mean age range 53.76-55.95 years), White (range 324,279/510,770, 63.49% to 290,688/448,666, 64.79%), and female (range 314,741/510,770, 61.62% to 278,488/448,666, 62.07%) patients from neighborhoods with high socioeconomic status (mean area deprivation index percentile range 28.76-30.31). Between 8.28% (37,137/448,666) and 11.55% (52,037/450,426) of patients across the study cohorts had at least 1 social need documented in their EHR, with safety issues and economic challenges (ie, financial resource strain, employment, and food insecurity) being the most common documented social needs (87,152/1,852,228, 4.71% and 58,242/1,852,228, 3.14% of overall patients, respectively). The model had an area under the curve of 0.702 (95% CI 0.699-0.705) in predicting prospective social needs in the overall study population. Previous social needs (odds ratio 3.285, 95% CI 3.237-3.335) and emergency department visits (odds ratio 1.659, 95% CI 1.634-1.684) were the strongest predictors of future social needs.

CONCLUSIONS

Our model provides an opportunity to make use of available EHR data to help identify patients with high social needs. Our proposed social risk score could help identify the subset of patients who would most benefit from further social needs screening and data collection to avoid potentially more burdensome primary data collection on all patients in a target population of interest.


Li C, Mowery DL, Ma X, Yang R, Vurgun U, Hwang S, Donnelly HK, Bandhey H, Akhtar Z, Senathirajah Y, Sadhu EM, Getzen E, Freda PJ, Long Q, Becich MJ. Realizing the Potential of Social Determinants Data: A Scoping Review of Approaches for Screening, Linkage, Extraction, Analysis and Interventions. medRxiv 2024:2024.02.04.24302242. [PMID: 38370703 PMCID: PMC10871446 DOI: 10.1101/2024.02.04.24302242] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/20/2024]

Guevara M, Chen S, Thomas S, Chaunzwa TL, Franco I, Kann BH, Moningi S, Qian JM, Goldstein M, Harper S, Aerts HJWL, Catalano PJ, Savova GK, Mak RH, Bitterman DS. Large language models to identify social determinants of health in electronic health records. NPJ Digit Med 2024;7:6. [PMID: 38200151 PMCID: PMC10781957 DOI: 10.1038/s41746-023-00970-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2023] [Accepted: 11/15/2023] [Indexed: 01/12/2024] Open

Affiliation(s)

Marco Guevara: Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Boston, MA, USA; Department of Radiation Oncology, Brigham and Women's Hospital/Dana-Farber Cancer Institute, Boston, MA, USA
Shan Chen: Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Boston, MA, USA; Department of Radiation Oncology, Brigham and Women's Hospital/Dana-Farber Cancer Institute, Boston, MA, USA
Spencer Thomas: Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Boston, MA, USA; Department of Radiation Oncology, Brigham and Women's Hospital/Dana-Farber Cancer Institute, Boston, MA, USA; Computational Health Informatics Program, Boston Children's Hospital, Harvard Medical School, Boston, MA, USA
Tafadzwa L Chaunzwa: Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Boston, MA, USA; Department of Radiation Oncology, Brigham and Women's Hospital/Dana-Farber Cancer Institute, Boston, MA, USA
Idalid Franco: Department of Radiation Oncology, Brigham and Women's Hospital/Dana-Farber Cancer Institute, Boston, MA, USA
Benjamin H Kann: Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Boston, MA, USA; Department of Radiation Oncology, Brigham and Women's Hospital/Dana-Farber Cancer Institute, Boston, MA, USA
Shalini Moningi: Department of Radiation Oncology, Brigham and Women's Hospital/Dana-Farber Cancer Institute, Boston, MA, USA
Jack M Qian: Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Boston, MA, USA; Department of Radiation Oncology, Brigham and Women's Hospital/Dana-Farber Cancer Institute, Boston, MA, USA
Madeleine Goldstein: Adult Resource Office, Dana-Farber Cancer Institute, Boston, MA, USA
Susan Harper: Adult Resource Office, Dana-Farber Cancer Institute, Boston, MA, USA
Hugo J W L Aerts: Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Boston, MA, USA; Department of Radiation Oncology, Brigham and Women's Hospital/Dana-Farber Cancer Institute, Boston, MA, USA; Radiology and Nuclear Medicine, GROW & CARIM, Maastricht University, Maastricht, The Netherlands
Paul J Catalano: Department of Data Science, Dana-Farber Cancer Institute and Department of Biostatistics, Harvard T. H. Chan School of Public Health, Boston, MA, USA
Guergana K Savova: Computational Health Informatics Program, Boston Children's Hospital, Harvard Medical School, Boston, MA, USA
Raymond H Mak: Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Boston, MA, USA; Department of Radiation Oncology, Brigham and Women's Hospital/Dana-Farber Cancer Institute, Boston, MA, USA
Danielle S Bitterman: Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Boston, MA, USA; Department of Radiation Oncology, Brigham and Women's Hospital/Dana-Farber Cancer Institute, Boston, MA, USA


Scherbakov D, Mollalo A, Lenert L. Stressful life events in electronic health records: a scoping review. Research Square 2023:rs.3.rs-3458708. [PMID: 37886439 PMCID: PMC10602151 DOI: 10.21203/rs.3.rs-3458708/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/28/2023]

Scherbakov D, Mollalo A, Lenert L. Stressful life events in electronic health records: a scoping review. Research Square 2023:rs.3.rs-3458708. [PMID: 37886439 PMCID: PMC10602151 DOI: 10.21203/rs.3.rs-3458708/v2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/27/2024]

Shafer PR, Davis A, Clark JA. Finding social need-les in a haystack: ascertaining social needs of Medicare patients recorded in the notes of care managers. BMC Health Serv Res 2023;23:1400. [PMID: 38087286 PMCID: PMC10717654 DOI: 10.1186/s12913-023-10446-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2023] [Accepted: 12/06/2023] [Indexed: 12/18/2023] Open

Harris DR, Anthony N, Quesinberry D, Delcher C. Evidence of housing instability identified by addresses, clinical notes, and diagnostic codes in a real-world population with substance use disorders. J Clin Transl Sci 2023;7:e196. [PMID: 37771412 PMCID: PMC10523293 DOI: 10.1017/cts.2023.626] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/31/2023] [Revised: 08/23/2023] [Accepted: 08/25/2023] [Indexed: 09/30/2023] Open

Edgcomb JB, Tseng CH, Pan M, Klomhaus A, Zima BT. Assessing Detection of Children With Suicide-Related Emergencies: Evaluation and Development of Computable Phenotyping Approaches. JMIR Ment Health 2023;10:e47084. [PMID: 37477974 PMCID: PMC10403798 DOI: 10.2196/47084] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/09/2023] [Revised: 05/11/2023] [Accepted: 05/29/2023] [Indexed: 07/22/2023] Open

Abstract

BACKGROUND

Although suicide is a leading cause of death among children, the optimal approach for using health care data sets to detect suicide-related emergencies among children is not known.

OBJECTIVE

This study aimed to assess the performance of suicide-related International Classification of Diseases, Tenth Revision, Clinical Modification (ICD-10-CM) codes and suicide-related chief complaint in detecting self-injurious thoughts and behaviors (SITB) among children compared with clinician chart review. The study also aimed to examine variations in performance by child sociodemographics and type of self-injury, as well as develop machine learning models trained on codified health record data (features) and clinician chart review (gold standard) and test model detection performance.

METHODS

A gold standard classification of suicide-related emergencies was determined through clinician manual review of clinical notes from 600 emergency department visits between 2015 and 2019 by children aged 10 to 17 years. Visits classified with nonfatal suicide attempt or intentional self-harm using the Centers for Disease Control and Prevention surveillance case definition list of ICD-10-CM codes and suicide-related chief complaint were compared with the gold standard classification. Machine learning classifiers (least absolute shrinkage and selection operator-penalized logistic regression and random forest) were then trained and tested using codified health record data (eg, child sociodemographics, medications, disposition, and laboratory testing) and the gold standard classification. The accuracy, sensitivity, and specificity of each detection approach and relative importance of features were examined.

RESULTS

SITB accounted for 47.3% (284/600) of the visits. Suicide-related diagnostic codes missed nearly one-third (82/284, 28.9%) and suicide-related chief complaints missed more than half (153/284, 53.9%) of the children presenting to emergency departments with SITB. Sensitivity was significantly lower for male children than for female children (0.69, 95% CI 0.61-0.77 vs 0.84, 95% CI 0.78-0.90, respectively) and for preteens compared with adolescents (0.66, 95% CI 0.54-0.78 vs 0.86, 95% CI 0.80-0.92, respectively). Specificity was significantly lower for detecting preparatory acts (0.68, 95% CI 0.64-0.72) and attempts (0.67, 95% CI 0.63-0.71) than for detecting ideation (0.79, 95% CI 0.75-0.82). Machine learning-based models significantly improved the sensitivity of detection compared with suicide-related codes and chief complaint alone. Models considering all 84 features performed similarly to models considering only mental health-related ICD-10-CM codes and chief complaints (34 features) and models considering non-ICD-10-CM code indicators and mental health-related chief complaints (53 features).

CONCLUSIONS

The capacity to detect children with SITB may be strengthened by applying a machine learning-based approach to codified health record data. To improve integration between clinical research informatics and child mental health care, future research is needed to evaluate the potential benefits of implementing detection approaches at the point of care and identifying precise targets for suicide prevention interventions in children.
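As a small worked example, the "missed" counts quoted in the Results of this abstract translate directly into overall sensitivities for the two rule-based detection approaches. This is a sketch derived from the quoted counts only; the helper name `sensitivity_from_missed` is hypothetical, and the resulting sensitivities are computed here, not quoted from the abstract.

```python
def sensitivity_from_missed(total_positive: int, missed: int) -> float:
    """Sensitivity = detected true cases / all true cases."""
    if total_positive <= 0:
        raise ValueError("total_positive must be greater than zero")
    if not 0 <= missed <= total_positive:
        raise ValueError("missed must lie between 0 and total_positive")
    return (total_positive - missed) / total_positive


# Counts quoted in the abstract: 284 gold-standard SITB visits,
# of which each approach missed the number shown.
sitb_visits = 284
missed_by_approach = {
    "suicide-related diagnostic codes": 82,
    "suicide-related chief complaints": 153,
}

for approach, missed in missed_by_approach.items():
    value = sensitivity_from_missed(sitb_visits, missed)
    print(f"{approach}: sensitivity = {value:.3f}")
```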


Romanowski B, Ben Abacha A, Fan Y. Extracting social determinants of health from clinical note text with classification and sequence-to-sequence approaches. J Am Med Inform Assoc 2023;30:1448-1455. [PMID: 37100768 PMCID: PMC10354779 DOI: 10.1093/jamia/ocad071] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 12/05/2022] [Revised: 03/07/2023] [Accepted: 04/18/2023] [Indexed: 04/28/2023] Open

Allen KS, Hood DR, Cummins J, Kasturi S, Mendonca EA, Vest JR. Natural language processing-driven state machines to extract social factors from unstructured clinical documentation. JAMIA Open 2023;6:ooad024. [PMID: 37081945 PMCID: PMC10112959 DOI: 10.1093/jamiaopen/ooad024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/28/2022] [Revised: 03/08/2023] [Accepted: 03/28/2023] [Indexed: 04/22/2023] Open

van Baar JM, Shields-Zeeman L, Stronks K, Hagenaars LL. Lifestyle versus social determinants of health in the Dutch parliament: An automated analysis of debate transcripts. SSM Popul Health 2023;22:101399. [PMID: 37114238 PMCID: PMC10127107 DOI: 10.1016/j.ssmph.2023.101399] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/23/2022] [Revised: 03/12/2023] [Accepted: 04/06/2023] [Indexed: 04/29/2023] Open

Wang X, Gupta D, Killian M, He Z. Benchmarking Transformer-Based Models for Identifying Social Determinants of Health in Clinical Notes. IEEE International Conference on Healthcare Informatics 2023;2023:570-574. [PMID: 38239824 PMCID: PMC10795706 DOI: 10.1109/ichi57859.2023.00102] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/22/2024]

Derton A, Guevara M, Chen S, Moningi S, Kozono DE, Liu D, Miller TA, Savova GK, Mak RH, Bitterman DS. Natural Language Processing Methods to Empirically Explore Social Contexts and Needs in Cancer Patient Notes. JCO Clin Cancer Inform 2023;7:e2200196. [PMID: 37235847 DOI: 10.1200/cci.22.00196] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 12/19/2022] [Revised: 02/22/2023] [Accepted: 03/23/2023] [Indexed: 05/28/2023] Open

Abstract

PURPOSE

There is an unmet need to empirically explore and understand drivers of cancer disparities, particularly social determinants of health. We explored natural language processing methods to automatically and empirically extract clinical documentation of social contexts and needs that may underlie disparities.

METHODS

This was a retrospective analysis of 230,325 clinical notes from 5,285 patients treated with radiotherapy from 2007 to 2019. We compared linguistic features among White versus non-White, low-income insurance versus other insurance, and male versus female patients' notes. Log odds ratios with an informative Dirichlet prior were calculated to compare words over-represented in each group. A variational autoencoder topic model was applied, and topic probability was compared between groups. The presence of machine-learnable bias was explored by developing statistical and neural demographic group classifiers.

RESULTS

Terms associated with varied social contexts and needs were identified for all demographic group comparisons. For example, notes of non-White and low-income insurance patients were over-represented with terms associated with housing and transportation, whereas notes of White and other insurance patients were over-represented with terms related to physical activity. Topic models identified a social history topic, and topic probability varied significantly between the demographic group comparisons. Classification models performed poorly at classifying notes of non-White and low-income insurance patients (F1 of 0.30 and 0.23, respectively).

CONCLUSION

Exploration of linguistic differences in clinical notes between patients of different race/ethnicity, insurance status, and sex identified social contexts and needs in patients with cancer and revealed high-level differences in notes. Future work is needed to validate whether these findings may play a role in cancer disparities.
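The Methods of this abstract mention log odds ratios with an informative Dirichlet prior for finding over-represented words. A minimal sketch of that general technique follows, assuming the commonly used formulation in which background word counts act as the prior and the statistic is a z-score of the smoothed log-odds difference. The function name, the toy word lists, and the choice of pooled counts as the prior are all illustrative assumptions, not details taken from the study.

```python
from collections import Counter
from math import log, sqrt


def log_odds_dirichlet(counts_a: Counter, counts_b: Counter, prior: Counter) -> dict:
    """z-scored log-odds ratio per word; positive means over-represented in group A."""
    n_a = sum(counts_a.values())
    n_b = sum(counts_b.values())
    a0 = sum(prior.values())  # total prior mass
    z_scores = {}
    for word, a_w in prior.items():
        y_a = counts_a.get(word, 0)
        y_b = counts_b.get(word, 0)
        # Smoothed log odds of the word within each group.
        odds_a = (y_a + a_w) / (n_a + a0 - y_a - a_w)
        odds_b = (y_b + a_w) / (n_b + a0 - y_b - a_w)
        delta = log(odds_a) - log(odds_b)
        # Approximate variance of the log-odds difference.
        variance = 1.0 / (y_a + a_w) + 1.0 / (y_b + a_w)
        z_scores[word] = delta / sqrt(variance)
    return z_scores


# Toy, made-up token lists standing in for two groups of notes.
group_a = Counter("housing shelter transportation housing clinic pain".split())
group_b = Counter("exercise gym walking clinic pain exercise".split())
background = group_a + group_b  # assumption: pooled counts serve as the prior

scores = log_odds_dirichlet(group_a, group_b, background)
for word, z in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{word:15s} {z:+.2f}")
```

The prior shrinks estimates for rare words toward zero, which keeps words seen only a handful of times from dominating the ranking.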


Lituiev DS, Lacar B, Pak S, Abramowitsch PL, De Marchis EH, Peterson TA. Automatic extraction of social determinants of health from medical notes of chronic lower back pain patients. J Am Med Inform Assoc 2023:7133957. [PMID: 37080559 DOI: 10.1093/jamia/ocad054] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 12/05/2022] [Revised: 02/15/2023] [Accepted: 03/18/2023] [Indexed: 04/22/2023] Open

Lee RY, Kross EK, Torrence J, Li KS, Sibley J, Cohen T, Lober WB, Engelberg RA, Curtis JR. Assessment of Natural Language Processing of Electronic Health Records to Measure Goals-of-Care Discussions as a Clinical Trial Outcome. JAMA Netw Open 2023;6:e231204. [PMID: 36862411 PMCID: PMC9982698 DOI: 10.1001/jamanetworkopen.2023.1204] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Indexed: 03/03/2023] Open

Abstract

IMPORTANCE

Many clinical trial outcomes are documented in free-text electronic health records (EHRs), making manual data collection costly and infeasible at scale. Natural language processing (NLP) is a promising approach for measuring such outcomes efficiently, but ignoring NLP-related misclassification may lead to underpowered studies.

OBJECTIVE

To evaluate the performance, feasibility, and power implications of using NLP to measure the primary outcome of EHR-documented goals-of-care discussions in a pragmatic randomized clinical trial of a communication intervention.

DESIGN, SETTING, AND PARTICIPANTS

This diagnostic study compared the performance, feasibility, and power implications of measuring EHR-documented goals-of-care discussions using 3 approaches: (1) deep-learning NLP, (2) NLP-screened human abstraction (manual verification of NLP-positive records), and (3) conventional manual abstraction. The study included hospitalized patients aged 55 years or older with serious illness enrolled between April 23, 2020, and March 26, 2021, in a pragmatic randomized clinical trial of a communication intervention in a multihospital US academic health system.

MAIN OUTCOMES AND MEASURES

Main outcomes were natural language processing performance characteristics, human abstractor-hours, and misclassification-adjusted statistical power of methods of measuring clinician-documented goals-of-care discussions. Performance of NLP was evaluated with receiver operating characteristic (ROC) curves and precision-recall (PR) analyses, and the effects of misclassification on power were examined using mathematical substitution and Monte Carlo simulation.

RESULTS

A total of 2512 trial participants (mean [SD] age, 71.7 [10.8] years; 1456 [58%] female) amassed 44 324 clinical notes during 30-day follow-up. In a validation sample of 159 participants, deep-learning NLP trained on a separate training data set identified patients with documented goals-of-care discussions with moderate accuracy (maximal F1 score, 0.82; area under the ROC curve, 0.924; area under the PR curve, 0.879). Manual abstraction of the outcome from the trial data set would require an estimated 2000 abstractor-hours and would power the trial to detect a risk difference of 5.4% (assuming 33.5% control-arm prevalence, 80% power, and 2-sided α = .05). Measuring the outcome by NLP alone would power the trial to detect a risk difference of 7.6%. Measuring the outcome by NLP-screened human abstraction would require 34.3 abstractor-hours to achieve estimated sensitivity of 92.6% and would power the trial to detect a risk difference of 5.7%. Monte Carlo simulations corroborated misclassification-adjusted power calculations.

CONCLUSIONS AND RELEVANCE

In this diagnostic study, deep-learning NLP and NLP-screened human abstraction had favorable characteristics for measuring an EHR outcome at scale. Adjusted power calculations accurately quantified power loss from NLP-related misclassification, suggesting that incorporation of this approach into the design of studies using NLP would be beneficial.
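The idea of a misclassification-adjusted power calculation can be illustrated with a minimal sketch. It is not the study's own procedure, which the abstract does not spell out. It assumes non-differential misclassification, so that a true prevalence p is observed as Se·p + (1 − Sp)·(1 − p), and uses a normal-approximation test for two proportions with equal arm sizes. The function names and the equal-allocation assumption are hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def observed_prevalence(p, sensitivity, specificity):
    """Prevalence as seen through an imperfect classifier (assumed non-differential)."""
    return sensitivity * p + (1.0 - specificity) * (1.0 - p)


def power_two_proportions(p_control, risk_diff, n_per_arm,
                          sensitivity=1.0, specificity=1.0, alpha=0.05):
    """Approximate power of a 2-sided two-proportion z-test on the *observed* outcome."""
    q0 = observed_prevalence(p_control, sensitivity, specificity)
    q1 = observed_prevalence(p_control + risk_diff, sensitivity, specificity)
    std_err = ((q0 * (1 - q0) + q1 * (1 - q1)) / n_per_arm) ** 0.5
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    # The far rejection tail is ignored, as is usual in this approximation.
    return _STD_NORMAL.cdf(abs(q1 - q0) / std_err - z_crit)


def minimal_detectable_difference(p_control, n_per_arm, target_power=0.80, **kwargs):
    """Smallest true risk difference reaching target_power, found by bisection."""
    lo, hi = 0.0, 1.0 - p_control
    for _ in range(60):
        mid = (lo + hi) / 2
        if power_two_proportions(p_control, mid, n_per_arm, **kwargs) < target_power:
            lo = mid
        else:
            hi = mid
    return hi


# Hypothetical inputs: 2512 participants split evenly, 33.5% control-arm prevalence.
perfect = minimal_detectable_difference(0.335, 1256)
imperfect = minimal_detectable_difference(0.335, 1256, sensitivity=0.90, specificity=0.90)
print(f"perfect measurement: {perfect:.3f}; Se=Sp=0.90: {imperfect:.3f}")
```

Under this model the observed risk difference shrinks by a factor of (Se + Sp − 1), which is why an imperfect classifier raises the smallest difference a trial of fixed size can detect.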

Affiliation(s)

Robert Y. Lee Cambia Palliative Care Center of Excellence at UW Medicine, University of Washington, Seattle; Division of Pulmonary, Critical Care, and Sleep Medicine, Department of Medicine, University of Washington, Seattle
Erin K. Kross Cambia Palliative Care Center of Excellence at UW Medicine, University of Washington, Seattle; Division of Pulmonary, Critical Care, and Sleep Medicine, Department of Medicine, University of Washington, Seattle
Janaki Torrence Cambia Palliative Care Center of Excellence at UW Medicine, University of Washington, Seattle; Division of Pulmonary, Critical Care, and Sleep Medicine, Department of Medicine, University of Washington, Seattle
Kevin S. Li Division of Biomedical and Health Informatics, Department of Biomedical Informatics and Medical Education, University of Washington, Seattle
James Sibley Cambia Palliative Care Center of Excellence at UW Medicine, University of Washington, Seattle; Department of Biobehavioral Nursing and Health Informatics, University of Washington, Seattle
Trevor Cohen Cambia Palliative Care Center of Excellence at UW Medicine, University of Washington, Seattle; Division of Biomedical and Health Informatics, Department of Biomedical Informatics and Medical Education, University of Washington, Seattle
William B. Lober Cambia Palliative Care Center of Excellence at UW Medicine, University of Washington, Seattle; Division of Biomedical and Health Informatics, Department of Biomedical Informatics and Medical Education, University of Washington, Seattle; Department of Biobehavioral Nursing and Health Informatics, University of Washington, Seattle; Department of Global Health, University of Washington, Seattle
Ruth A. Engelberg Cambia Palliative Care Center of Excellence at UW Medicine, University of Washington, Seattle; Division of Pulmonary, Critical Care, and Sleep Medicine, Department of Medicine, University of Washington, Seattle
J. Randall Curtis Cambia Palliative Care Center of Excellence at UW Medicine, University of Washington, Seattle; Division of Pulmonary, Critical Care, and Sleep Medicine, Department of Medicine, University of Washington, Seattle; Department of Biobehavioral Nursing and Health Informatics, University of Washington, Seattle; Department of Health Systems and Population Health, University of Washington, Seattle

Stewart de Ramirez S, Shallat J, McClure K, Foulger R, Barenblat L. Screening for Social Determinants of Health: Active and Passive Information Retrieval Methods. Popul Health Manag 2022;25:781-788. [PMID: 36454231 DOI: 10.1089/pop.2022.0228] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/03/2022] Open

Yang R, Zhu D, Howard LE, De Hoedt A, Williams SB, Freedland SJ, Klaassen Z. Identification of Patients With Metastatic Prostate Cancer With Natural Language Processing and Machine Learning. JCO Clin Cancer Inform 2022;6:e2100071. [PMID: 36215673 DOI: 10.1200/cci.21.00071] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/05/2022] Open

Abstract

PURPOSE

Understanding treatment patterns and effectiveness for patients with metastatic prostate cancer (mPCa) is dependent on accurate assessment of metastatic status. The objective was to develop a natural language processing (NLP) model for identifying patients with mPCa and evaluate the model's performance against chart-reviewed data and an International Classification of Diseases (ICD) 9/10 code-based method.

METHODS

In total, 139,057 radiology reports on 6,211 unique patients from the Department of Veterans Affairs were used. The gold standard was metastases by detailed chart review of radiology reports. NLP performance was assessed by sensitivity, specificity, positive predictive value, negative predictive value, and date of metastases detection. Receiver operating characteristic curves were used to assess model performance.

RESULTS

When compared with chart review, the NLP model had high sensitivity and specificity (85% and 96%, respectively). The NLP model was able to predict patient-level metastasis status with a sensitivity of 91% and specificity of 81%, whereas sensitivity and specificity using ICD9/10 billing codes were 73% and 86%, respectively. For the NLP model, date of metastases detection was exactly concordant and within < 1 week in 55% and 58% of patients, compared with 8% and 17%, respectively, using the ICD9/10 billing codes method. The area under the curve for the NLP model was 0.911. A limitation is that the NLP model was developed on the basis of a subset of patients with mPCa and may not be generalizable to all patients with mPCa.

CONCLUSION

This population-level NLP model for identifying patients with mPCa was more accurate than using ICD9/10 billing codes when compared with chart-reviewed data. Upon further validation, this model may allow for efficient population-level identification of patients with mPCa.
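The four performance measures named in the methods above all derive from a 2×2 comparison against the chart-review gold standard. The following is a minimal sketch of that arithmetic. The class and function names are hypothetical, and the counts in the usage lines are illustrative only, not data from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from comparing a classifier with a gold standard."""
    tp: int  # classifier positive, gold standard positive
    fp: int  # classifier positive, gold standard negative
    fn: int  # classifier negative, gold standard positive
    tn: int  # classifier negative, gold standard negative


def diagnostic_metrics(c):
    """Return sensitivity, specificity, PPV, and NPV; NaN when a denominator is zero."""
    def ratio(numerator, denominator):
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(c.tp, c.tp + c.fn),
        "specificity": ratio(c.tn, c.tn + c.fp),
        "ppv": ratio(c.tp, c.tp + c.fp),
        "npv": ratio(c.tn, c.tn + c.fn),
    }


# Illustrative counts only (not taken from the abstract).
example = diagnostic_metrics(ConfusionCounts(tp=40, fp=5, fn=10, tn=45))
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity depend only on the classifier, whereas PPV and NPV also shift with the prevalence of the condition in the evaluated sample.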

Improving ascertainment of suicidal ideation and suicide attempt with natural language processing. Sci Rep 2022;12:15146. [PMID: 36071081 PMCID: PMC9452591 DOI: 10.1038/s41598-022-19358-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/15/2022] [Accepted: 08/29/2022] [Indexed: 12/03/2022] Open

Dorr DA, Quiñones AR, King T, Wei MY, White K, Bejan CA. Prediction of Future Health Care Utilization Through Note-extracted Psychosocial Factors. Med Care 2022;60:570-578. [PMID: 35658116 PMCID: PMC9262845 DOI: 10.1097/mlr.0000000000001742] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/22/2022]

Abstract

BACKGROUND

Persons with multimorbidity (≥2 chronic conditions) face an increased risk of poor health outcomes, especially as they age. Psychosocial factors such as social isolation, chronic stress, housing insecurity, and financial insecurity have been shown to exacerbate these outcomes, but are not routinely assessed during the clinical encounter. Our objective was to extract these concepts from chart notes using natural language processing and predict their impact on health care utilization for patients with multimorbidity.

METHODS

We conducted a cohort study to predict the 1-year likelihood of hospitalizations and emergency department visits for patients aged 65 years or older with multimorbidity, with and without psychosocial factors. Psychosocial factors were extracted from narrative notes; all other covariates were extracted from electronic health record data from a large academic medical center using validated algorithms and concept sets. Logistic regression was performed to predict the likelihood of hospitalization and emergency department visit in the next year.

RESULTS

In all, 76,479 patients were eligible; the majority were White (89%), 54% were female, and the mean age was 73. Those with psychosocial factors were older, had higher baseline utilization, and had more chronic illnesses. The 4 psychosocial factors all independently predicted future utilization (odds ratio=1.27-2.77, C-statistic=0.63). Accounting for demographics, specific conditions, and previous utilization, 3 of 4 of the extracted factors remained predictive (odds ratio=1.13-1.86) for future utilization. Compared with models with no psychosocial factors, models that included them had improved discrimination. Individual predictions were mixed, with social isolation predicting depression and morbidity; stress predicting atherosclerotic cardiovascular disease onset; and housing insecurity predicting substance use disorder morbidity.

DISCUSSION

Psychosocial factors are known to have adverse health impacts, but are rarely measured; using natural language processing, we extracted factors that identified a higher risk segment of older adults with multimorbidity. Combining these extraction techniques with other measures of social determinants may help catalyze population health efforts to address psychosocial factors to mitigate their health impacts.

Shah-Mohammadi F, Cui W, Bachi K, Hurd Y, Finkelstein J. Using Natural Language Processing of Clinical Notes to Predict Outcomes of Opioid Treatment Program. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2022;2022:4415-4420. [PMID: 36085896 PMCID: PMC9472807 DOI: 10.1109/embc48229.2022.9871960] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/15/2023]

A case for developing domain-specific vocabularies for extracting suicide factors from healthcare notes. J Psychiatr Res 2022;151:328-338. [PMID: 35533516 DOI: 10.1016/j.jpsychires.2022.04.009] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 01/04/2022] [Revised: 04/09/2022] [Accepted: 04/18/2022] [Indexed: 11/23/2022]

Boch S, Hussain SA, Bambach S, DeShetler C, Chisolm D, Linwood S. Locating Youth Exposed to Parental Justice Involvement in the Electronic Health Record: Development of a Natural Language Processing Model. JMIR Pediatr Parent 2022;5:e33614. [PMID: 35311681 PMCID: PMC8981008 DOI: 10.2196/33614] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 09/15/2021] [Revised: 01/16/2022] [Accepted: 01/25/2022] [Indexed: 12/29/2022] Open

Patel SB, Nguyen NT. Creation of a Mapped, Machine-Readable Taxonomy to Facilitate Extraction of Social Determinants of Health Data from Electronic Health Records. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2022;2021:959-968. [PMID: 35308929 PMCID: PMC8861691] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Subscribe] [Scholar Register] [Indexed: 06/14/2023]

Park Y, Mulligan N, Gleize M, Kristiansen M, Bettencourt-Silva JH. Discovering Associations between Social Determinants and Health Outcomes: Merging Knowledge Graphs from Literature and Electronic Health Data. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2022;2021:940-949. [PMID: 35308956 PMCID: PMC8861749] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Subscribe] [Scholar Register] [Indexed: 06/14/2023]

Hatef E, Rouhizadeh M, Nau C, Xie F, Rouillard C, Abu-Nasser M, Padilla A, Lyons LJ, Kharrazi H, Weiner JP, Roblin D. Development and assessment of a natural language processing model to identify residential instability in electronic health records’ unstructured data: a comparison of 3 integrated healthcare delivery systems. JAMIA Open 2022;5:ooac006. [PMID: 35224458 PMCID: PMC8867582 DOI: 10.1093/jamiaopen/ooac006] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/13/2021] [Revised: 01/03/2022] [Accepted: 01/27/2022] [Indexed: 11/14/2022] Open

Abstract

OBJECTIVE

To evaluate whether a natural language processing (NLP) algorithm could be adapted to extract, with acceptable validity, markers of residential instability (ie, homelessness and housing insecurity) from electronic health records (EHRs) of 3 healthcare systems.

MATERIALS AND METHODS

We included patients 18 years and older who received care at 1 of 3 healthcare systems from 2016 through 2020 and had at least 1 free-text note in the EHR during this period. We conducted the study independently; the NLP algorithm logic and method of validity assessment were identical across sites. The approach to the development of the gold standard for assessment of validity differed across sites. Using the EntityRuler module of spaCy 2.3 Python toolkit, we created a rule-based NLP system made up of expert-developed patterns indicating residential instability at the lead site and enriched the NLP system using insight gained from its application at the other 2 sites. We adapted the algorithm at each site then validated the algorithm using a split-sample approach. We assessed the performance of the algorithm by measures of positive predictive value (precision), sensitivity (recall), and specificity.

RESULTS

The NLP algorithm performed with moderate precision (0.45, 0.73, and 1.0) at 3 sites. The sensitivity and specificity of the NLP algorithm varied across 3 sites (sensitivity: 0.68, 0.85, and 0.96; specificity: 0.69, 0.89, and 1.0).

DISCUSSION

The performance of this NLP algorithm to identify residential instability in 3 different healthcare systems suggests the algorithm is generally valid and applicable in other healthcare systems with similar EHRs.

CONCLUSION

The NLP approach developed in this project is adaptable and can be modified to extract types of social needs other than residential instability from EHRs across different healthcare systems.
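A rule-based system of the kind described in this abstract matches expert-written patterns against note text. The sketch below uses plain regular expressions from the Python standard library as a stand-in; it does not use spaCy's EntityRuler and does not reproduce the authors' patterns. Every pattern, the negation cue list, and the 60-character look-back window are hypothetical choices made for illustration.

```python
import re

# Hypothetical surface patterns suggesting residential instability.
INSTABILITY_PATTERNS = [
    r"\bhomeless(?:ness)?\b",
    r"\bliving in (?:a |his |her |their )?(?:car|shelter)\b",
    r"\bevict(?:ed|ion)\b",
    r"\bcouch[- ]surfing\b",
]
_COMPILED = [re.compile(p, re.IGNORECASE) for p in INSTABILITY_PATTERNS]

# Crude negation check: a cue word followed by up to 40 characters with no
# sentence-ending punctuation, immediately before the matched phrase.
NEGATION_CUES = re.compile(
    r"\b(?:no|not|denies|denied|never|without)\b[^.;]{0,40}$", re.IGNORECASE
)


def find_instability_mentions(note):
    """Return (matched_text, is_negated) pairs for every pattern hit in the note."""
    mentions = []
    for pattern in _COMPILED:
        for match in pattern.finditer(note):
            preceding = note[max(0, match.start() - 60):match.start()]
            negated = bool(NEGATION_CUES.search(preceding))
            mentions.append((match.group(0), negated))
    return mentions


def flag_note(note):
    """Flag a note when it contains at least one non-negated mention."""
    return any(not negated for _, negated in find_instability_mentions(note))


print(flag_note("Patient is currently homeless and living in a shelter."))
print(flag_note("Patient denies homelessness."))
```

A production system would need far richer patterns, handling of family history and hypothetical statements, and validation against a gold standard, as the abstract describes.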

Shah-Mohammadi F, Cui W, Bachi K, Hurd Y, Finkelstein J. Comparative Analysis of Patient Distress in Opioid Treatment Programs using Natural Language Processing. BIOMEDICAL ENGINEERING SYSTEMS AND TECHNOLOGIES, INTERNATIONAL JOINT CONFERENCE, BIOSTEC ... REVISED SELECTED PAPERS. BIOSTEC (CONFERENCE) 2022;2022:319-326. [PMID: 35265945] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Subscribe] [Scholar Register] [Indexed: 06/14/2023]

Edgcomb J, Coverdale J, Aggarwal R, Guerrero APS, Brenner AM. Applications of Clinical Informatics to Child Mental Health Care: a Call to Action to Bridge Practice and Training. ACADEMIC PSYCHIATRY : THE JOURNAL OF THE AMERICAN ASSOCIATION OF DIRECTORS OF PSYCHIATRIC RESIDENCY TRAINING AND THE ASSOCIATION FOR ACADEMIC PSYCHIATRY 2022;46:11-17. [PMID: 35175570 PMCID: PMC8852995 DOI: 10.1007/s40596-022-01595-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/14/2023]

Patra BG, Sharma MM, Vekaria V, Adekkanattu P, Patterson OV, Glicksberg B, Lepow LA, Ryu E, Biernacka JM, Furmanchuk A, George TJ, Hogan W, Wu Y, Yang X, Bian J, Weissman M, Wickramaratne P, Mann JJ, Olfson M, Campion TR, Weiner M, Pathak J. Extracting social determinants of health from electronic health records using natural language processing: a systematic review. J Am Med Inform Assoc 2021;28:2716-2727. [PMID: 34613399 PMCID: PMC8633615 DOI: 10.1093/jamia/ocab170] [Citation(s) in RCA: 61] [Impact Index Per Article: 20.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/29/2021] [Revised: 07/09/2021] [Accepted: 08/04/2021] [Indexed: 11/27/2022] Open

Abstract

OBJECTIVE

Social determinants of health (SDoH) are nonclinical dispositions that impact patient health risks and clinical outcomes. Leveraging SDoH in clinical decision-making can potentially improve diagnosis, treatment planning, and patient outcomes. Despite increased interest in capturing SDoH in electronic health records (EHRs), such information is typically locked in unstructured clinical notes. Natural language processing (NLP) is the key technology to extract SDoH information from clinical text and expand its utility in patient care and research. This article presents a systematic review of the state-of-the-art NLP approaches and tools that focus on identifying and extracting SDoH data from unstructured clinical text in EHRs.

MATERIALS AND METHODS

A broad literature search was conducted in February 2021 using 3 scholarly databases (ACL Anthology, PubMed, and Scopus) following Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. A total of 6402 publications were initially identified, and after applying the study inclusion criteria, 82 publications were selected for the final review.

RESULTS

Smoking status (n = 27), substance use (n = 21), homelessness (n = 20), and alcohol use (n = 15) are the most frequently studied SDoH categories. Homelessness (n = 7) and other less-studied SDoH (eg, education, financial problems, social isolation and support, family problems) are mostly identified using rule-based approaches. In contrast, machine learning approaches are popular for identifying smoking status (n = 13), substance use (n = 9), and alcohol use (n = 9).

CONCLUSION

NLP offers significant potential to extract SDoH data from narrative clinical notes, which in turn can aid in the development of screening tools, risk prediction models, and clinical decision support systems.

Affiliation(s)

Braja G Patra Department of Population Health Sciences, Weill Cornell Medicine, New York, New York, USA
Mohit M Sharma Department of Population Health Sciences, Weill Cornell Medicine, New York, New York, USA
Veer Vekaria Department of Population Health Sciences, Weill Cornell Medicine, New York, New York, USA
Prakash Adekkanattu Information Technologies and Services, Weill Cornell Medicine, New York, New York, USA
Olga V Patterson Department of Internal Medicine, Division of Epidemiology, University of Utah, Salt Lake City, Utah, USA; US Department of Veterans Affairs, Salt Lake City, Utah, USA
Benjamin Glicksberg Icahn School of Medicine at Mount Sinai, New York, New York, USA
Lauren A Lepow Icahn School of Medicine at Mount Sinai, New York, New York, USA
Euijung Ryu Department of Quantitative Health Sciences, Mayo Clinic, Rochester, Minnesota, USA
Joanna M Biernacka Department of Quantitative Health Sciences, Mayo Clinic, Rochester, Minnesota, USA
Al’ona Furmanchuk Northwestern University, Chicago, Illinois, USA
Thomas J George Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, Florida, USA
William Hogan Division of Hematology & Oncology, Department of Medicine, College of Medicine, University of Florida, Gainesville, Florida, USA
Yonghui Wu Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, Florida, USA
Xi Yang Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, Florida, USA
Jiang Bian Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, Florida, USA
Myrna Weissman Vagelos College of Physicians and Surgeons, Columbia University, New York, New York, USA
Priya Wickramaratne Vagelos College of Physicians and Surgeons, Columbia University, New York, New York, USA
J John Mann Vagelos College of Physicians and Surgeons, Columbia University, New York, New York, USA
Mark Olfson Vagelos College of Physicians and Surgeons, Columbia University, New York, New York, USA
Thomas R Campion Department of Population Health Sciences, Weill Cornell Medicine, New York, New York, USA; Information Technologies and Services, Weill Cornell Medicine, New York, New York, USA
Mark Weiner Department of Population Health Sciences, Weill Cornell Medicine, New York, New York, USA
Jyotishman Pathak Department of Population Health Sciences, Weill Cornell Medicine, New York, New York, USA

Bompelli A, Wang Y, Wan R, Singh E, Zhou Y, Xu L, Oniani D, Kshatriya BSA, Balls-Berry JE, Zhang R. Social and Behavioral Determinants of Health in the Era of Artificial Intelligence with Electronic Health Records: A Scoping Review. HEALTH DATA SCIENCE 2021;2021:9759016. [PMID: 38487504 PMCID: PMC10880156 DOI: 10.34133/2021/9759016] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 01/22/2021] [Accepted: 06/28/2021] [Indexed: 03/17/2024]

Stemerman R, Arguello J, Brice J, Krishnamurthy A, Houston M, Kitzmiller R. Identification of social determinants of health using multi-label classification of electronic health record clinical notes. JAMIA Open 2021;4:ooaa069. [PMID: 34514351 PMCID: PMC8423426 DOI: 10.1093/jamiaopen/ooaa069] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/16/2020] [Revised: 11/16/2020] [Accepted: 11/20/2020] [Indexed: 11/13/2022] Open

Reeves RM, Christensen L, Brown JR, Conway M, Levis M, Gobbel GT, Shah RU, Goodrich C, Ricket I, Minter F, Bohm A, Bray BE, Matheny ME, Chapman W. Adaptation of an NLP system to a new healthcare environment to identify social determinants of health. J Biomed Inform 2021;120:103851. [PMID: 34174396 DOI: 10.1016/j.jbi.2021.103851] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/22/2021] [Revised: 06/16/2021] [Accepted: 06/21/2021] [Indexed: 11/18/2022]

Affiliation(s)

Ruth M Reeves Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States; Geriatric Research Education and Clinical Care Center, Tennessee Valley Healthcare System VA, Nashville, TN, United States.
Lee Christensen Department of Biomedical Informatics, University of Utah School of Medicine, Salt Lake City, UT, United States
Jeremiah R Brown Department of Epidemiology and Biomedical Data Science, Dartmouth Geisel School of Medicine, Hanover, NH, United States
Michael Conway Department of Biomedical Informatics, University of Utah School of Medicine, Salt Lake City, UT, United States
Maxwell Levis Department of Epidemiology and Biomedical Data Science, Dartmouth Geisel School of Medicine, Hanover, NH, United States
Glenn T Gobbel Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States; Division of General Internal Medicine, Vanderbilt University Medical Center, Nashville, TN, United States; Geriatric Research Education and Clinical Care Center, Tennessee Valley Healthcare System VA, Nashville, TN, United States
Rashmee U Shah Division of Cardiovascular Medicine, University of Utah School of Medicine, Salt Lake City, UT, United States
Christine Goodrich Department of Epidemiology and Biomedical Data Science, Dartmouth Geisel School of Medicine, Hanover, NH, United States
Iben Ricket Department of Epidemiology and Biomedical Data Science, Dartmouth Geisel School of Medicine, Hanover, NH, United States
Freneka Minter Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States
Andrew Bohm Department of Epidemiology and Biomedical Data Science, Dartmouth Geisel School of Medicine, Hanover, NH, United States
Bruce E Bray Division of Cardiovascular Medicine, University of Utah School of Medicine, Salt Lake City, UT, United States; Department of Biomedical Informatics, University of Utah School of Medicine, Salt Lake City, UT, United States
Michael E Matheny Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States; Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN, United States; Division of General Internal Medicine, Vanderbilt University Medical Center, Nashville, TN, United States; Geriatric Research Education and Clinical Care Center, Tennessee Valley Healthcare System VA, Nashville, TN, United States
Wendy Chapman Department of Biomedical Informatics, University of Utah School of Medicine, Salt Lake City, UT, United States; Centre for Clinical and Public Health Informatics, University of Melbourne, Melbourne, Australia

Makridis CA, Strebel T, Marconi V, Alterovitz G. Designing COVID-19 mortality predictions to advance clinical outcomes: Evidence from the Department of Veterans Affairs. BMJ Health Care Inform 2021;28:bmjhci-2020-100312. [PMID: 34108143 PMCID: PMC8190987 DOI: 10.1136/bmjhci-2020-100312] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/23/2020] [Revised: 03/17/2021] [Accepted: 03/31/2021] [Indexed: 12/21/2022] Open

Bear Don't Walk IV OJ, Sun T, Perotte A, Elhadad N. Clinically relevant pretraining is all you need. J Am Med Inform Assoc 2021;28:1970-1976. [PMID: 34151966 DOI: 10.1093/jamia/ocab086] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/23/2020] [Revised: 04/19/2021] [Accepted: 05/03/2021] [Indexed: 11/14/2022] Open

Chen M, Tan X, Padman R. Social determinants of health in electronic health records and their impact on analysis and risk prediction: A systematic review. J Am Med Inform Assoc 2021;27:1764-1773. [PMID: 33202021 DOI: 10.1093/jamia/ocaa143] [Citation(s) in RCA: 99] [Impact Index Per Article: 33.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/15/2020] [Revised: 06/10/2020] [Accepted: 06/20/2020] [Indexed: 11/13/2022] Open

Makridis CA, Zhao DY, Bejan CA, Alterovitz G. Leveraging machine learning to characterize the role of socio-economic determinants on physical health and well-being among veterans. Comput Biol Med 2021;133:104354. [PMID: 33845269 DOI: 10.1016/j.compbiomed.2021.104354] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/26/2021] [Revised: 03/07/2021] [Accepted: 03/20/2021] [Indexed: 02/07/2023]

Unertl KM, Walsh CG, Clayton EW. Combatting human trafficking in the United States: how can medical informatics help? J Am Med Inform Assoc 2021;28:384-388. [PMID: 33120418 DOI: 10.1093/jamia/ocaa142] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/11/2020] [Revised: 05/11/2020] [Accepted: 06/15/2020] [Indexed: 11/14/2022] Open

Stemerman R, Bunning T, Grover J, Kitzmiller R, Patel MD. Identifying Patient Phenotype Cohorts Using Prehospital Electronic Health Record Data. PREHOSP EMERG CARE 2021:1-14. [PMID: 33315497 DOI: 10.1080/10903127.2020.1859658] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/19/2020] [Accepted: 12/01/2020] [Indexed: 10/22/2022]

Abstract

Objective: Emergency medical services (EMS) provide critical interventions for patients with acute illness and injury and are important in implementing prehospital emergency care research. Retrospective, manual patient record review, the current reference-standard for identifying patient cohorts, requires significant time and financial investment. We developed automated classification models to identify eligible patients for prehospital clinical trials using EMS clinical notes and compared model performance to manual review. Methods: With eligibility criteria for an ongoing prehospital study of chest pain patients, we used EMS clinical notes (n = 1208) to manually classify patients as eligible, ineligible, and indeterminate. We randomly split these same records into training and test sets to develop and evaluate machine-learning (ML) algorithms using natural language processing (NLP) for feature (variable) selection. We compared models to the manual classification to calculate sensitivity, specificity, accuracy, positive predictive value, and F1 measure. We measured clinical expert time to perform review for manual and automated methods. Results: ML models' sensitivity, specificity, accuracy, positive predictive value, and F1 measure ranged from 0.93 to 0.98. Compared to manual classification (N = 363 records), the automated method excluded 90.9% of records as ineligible, leaving only 33 records for manual review. Conclusions: Our ML derived approach demonstrates the feasibility of developing a high-performing, automated classification system using EMS clinical notes to streamline the identification of a specific cardiac patient cohort. This efficient approach can be leveraged to facilitate prehospital patient-trial matching and patient phenotyping (i.e. influenza-like illness) and to create prehospital patient registries.

Affiliation(s)

Rachel Stemerman Carolina Health Informatics Program, University of North Carolina, Chapel Hill, North Carolina
Thomas Bunning Department of Anesthesiology, Duke University Medical Center, Durham, North Carolina
Joseph Grover Department of Emergency Medicine, University of North Carolina, Chapel Hill, North Carolina
Rebecca Kitzmiller Carolina Health Informatics Program, University of North Carolina, Chapel Hill, North Carolina
Mehul D Patel Department of Emergency Medicine, University of North Carolina, Chapel Hill, North Carolina

Decker BM, Hill CE, Baldassano SN, Khankhanian P. Can antiepileptic efficacy and epilepsy variables be studied from electronic health records? A review of current approaches. Seizure 2021;85:138-144. [PMID: 33461032 DOI: 10.1016/j.seizure.2020.11.011] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/30/2020] [Revised: 11/16/2020] [Accepted: 11/17/2020] [Indexed: 12/16/2022] Open

Bensken WP, Krieger NI, Berg KA, Einstadter D, Dalton JE, Perzynski AT. Health Status and Chronic Disease Burden of the Homeless Population: An Analysis of Two Decades of Multi-Institutional Electronic Medical Records. J Health Care Poor Underserved 2021;32:1619-1634. [PMID: 34421052 PMCID: PMC8477616 DOI: 10.1353/hpu.2021.0153] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Indexed: 11/11/2022]

Lee RY, Brumback LC, Lober WB, Sibley J, Nielsen EL, Treece PD, Kross EK, Loggers ET, Fausto JA, Lindvall C, Engelberg RA, Curtis JR. Identifying Goals of Care Conversations in the Electronic Health Record Using Natural Language Processing and Machine Learning. J Pain Symptom Manage 2021;61:136-142.e2. [PMID: 32858164 PMCID: PMC7769906 DOI: 10.1016/j.jpainsymman.2020.08.024] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 06/24/2020] [Revised: 08/14/2020] [Accepted: 08/20/2020] [Indexed: 11/19/2022]

Affiliation(s)

Robert Y Lee Cambia Palliative Care Center of Excellence, University of Washington, Seattle, Washington, USA; Division of Pulmonary, Critical Care, and Sleep Medicine, Department of Medicine, Harborview Medical Center, University of Washington, Seattle, Washington, USA
Lyndia C Brumback Cambia Palliative Care Center of Excellence, University of Washington, Seattle, Washington, USA; Department of Biostatistics, University of Washington, Seattle, Washington, USA
William B Lober Cambia Palliative Care Center of Excellence, University of Washington, Seattle, Washington, USA; Department of Biobehavioral Nursing and Health Informatics, University of Washington, Seattle, Washington, USA; Department of Bioinformatics and Medical Education, University of Washington, Seattle, Washington, USA
James Sibley Cambia Palliative Care Center of Excellence, University of Washington, Seattle, Washington, USA; Department of Biobehavioral Nursing and Health Informatics, University of Washington, Seattle, Washington, USA; Department of Bioinformatics and Medical Education, University of Washington, Seattle, Washington, USA
Elizabeth L Nielsen Cambia Palliative Care Center of Excellence, University of Washington, Seattle, Washington, USA; Division of Pulmonary, Critical Care, and Sleep Medicine, Department of Medicine, Harborview Medical Center, University of Washington, Seattle, Washington, USA
Patsy D Treece Cambia Palliative Care Center of Excellence, University of Washington, Seattle, Washington, USA; Division of Pulmonary, Critical Care, and Sleep Medicine, Department of Medicine, Harborview Medical Center, University of Washington, Seattle, Washington, USA; Department of Biobehavioral Nursing and Health Informatics, University of Washington, Seattle, Washington, USA
Erin K Kross Cambia Palliative Care Center of Excellence, University of Washington, Seattle, Washington, USA; Division of Pulmonary, Critical Care, and Sleep Medicine, Department of Medicine, Harborview Medical Center, University of Washington, Seattle, Washington, USA
Elizabeth T Loggers Cambia Palliative Care Center of Excellence, University of Washington, Seattle, Washington, USA; Clinical Research Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA; Seattle Cancer Care Alliance, Seattle, Washington, USA
James A Fausto Cambia Palliative Care Center of Excellence, University of Washington, Seattle, Washington, USA; Department of Family Medicine, University of Washington, Seattle, Washington, USA
Charlotta Lindvall Department of Psychosocial Oncology and Palliative Care, Dana-Farber Cancer Institute, Boston, Massachusetts, USA
Ruth A Engelberg Cambia Palliative Care Center of Excellence, University of Washington, Seattle, Washington, USA; Division of Pulmonary, Critical Care, and Sleep Medicine, Department of Medicine, Harborview Medical Center, University of Washington, Seattle, Washington, USA
J Randall Curtis Cambia Palliative Care Center of Excellence, University of Washington, Seattle, Washington, USA; Division of Pulmonary, Critical Care, and Sleep Medicine, Department of Medicine, Harborview Medical Center, University of Washington, Seattle, Washington, USA; Department of Biobehavioral Nursing and Health Informatics, University of Washington, Seattle, Washington, USA; Department of Bioethics and Humanities, University of Washington, Seattle, Washington, USA.


Montgomery AE, Tsai J, Blosnich JR. Demographic Correlates of Veterans' Adverse Social Determinants of Health. Am J Prev Med 2020;59:828-836. [PMID: 33220754 DOI: 10.1016/j.amepre.2020.05.024] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 02/19/2020] [Revised: 04/28/2020] [Accepted: 05/14/2020] [Indexed: 10/23/2022]

Abstract

INTRODUCTION

Identifying patient populations most affected by adverse social determinants of health can direct epidemiologic investigation, guide development of tailored interventions, and improve clinical care and outcomes. This study explores how demographic characteristics are associated with specific types, and with the cumulative burden, of adverse social determinants of health among Veterans seeking Veterans Health Administration health care.

METHODS

Data included electronic health records for 293,872 patients of Veterans Health Administration facilities in one region of the country between October 1, 2015 and September 30, 2016. A series of multiple logistic regressions conducted between August and December 2019 examined how demographic variables are associated with 7 adverse social determinants of health. A negative binomial regression examined the association between demographic characteristics and cumulative burden of social determinants of health.

RESULTS

Three demographic characteristics were associated with increased odds of each type of adverse social determinant of health: minority race, unmarried status, and Veterans' service-connected disability status. Conversely, living in a rural area and being aged >40 years were associated with decreased odds of most of the adverse social determinants of health studied here. Hispanic ethnicity and female sex were inconsistently associated with increased odds of some adverse social determinants of health and decreased odds of others. These results are mirrored in the analysis of predictors of cumulative burden of adverse social determinants of health.

CONCLUSIONS

There is increasing and ongoing interest in ways to identify and respond to patients' experiences of or exposures to adverse social determinants of health. Demographic characteristics may signal the need to assess for adverse social determinants of health. Analyses exploring latent factors among these social determinants (e.g., poverty) may inform strategies to identify patients experiencing adverse social determinants of health and provide responsive interventions.


Lynch KE, Gatsby E, Viernes B, Schliep KC, Whitcomb BW, Alba PR, DuVall SL, Blosnich JR. Evaluation of Suicide Mortality Among Sexual Minority US Veterans From 2000 to 2017. JAMA Netw Open 2020;3:e2031357. [PMID: 33369662 PMCID: PMC7770555 DOI: 10.1001/jamanetworkopen.2020.31357] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 01/02/2023] Open

Abstract

IMPORTANCE

Identification of subgroups at greatest risk for suicide mortality is essential for prevention efforts and targeting interventions. Sexual minority individuals may have an increased risk for suicide compared with heterosexual individuals, but a lack of sufficiently powered studies with rigorous methods for determining sexual orientation has limited the knowledge on this potential health disparity.

OBJECTIVE

To investigate suicide mortality among sexual minority veterans using Veterans Health Administration (VHA) electronic health record data.

DESIGN, SETTING, AND PARTICIPANTS

This retrospective population-based cohort study used data on 8.1 million US veterans enrolled in the VHA after fiscal year 1999 that were obtained from VHA electronic health records from October 1, 1999 to September 30, 2017. Data analysis was carried out from March 1, 2020 to October 31, 2020.

EXPOSURE

Veterans with documentation of a minority sexual orientation. Documentation of sexual minority status was obtained through natural language processing of clinical notes and extraction of structured administrative data for sexual orientation in VHA electronic health records.

MAIN OUTCOMES AND MEASURES

Suicide mortality rate using data on the underlying cause of death obtained from the National Death Index. Crude and age-adjusted mortality rates were calculated for all-cause death and death from suicide among sexual minority veterans compared with the general US population and the general population of veterans.

RESULTS

Among the 96 893 veterans with at least 1 sexual minority documentation in the electronic health record, the mean (SD) age was 46 (16) years, 68% were male, and 70% were White. Of the 12 591 total deaths, 3.5% were from suicide. Sexual minority veterans had a significantly higher rate of mortality from suicide (standardized mortality ratio, 4.50; 95% CI, 4.13-4.99) compared with the general US population. Suicide was the fifth leading cause of death in 2017 among sexual minority veterans (3.8% of deaths) and the tenth leading cause of death in the general US population (1.7% of deaths). The crude suicide rate among sexual minority veterans (82.5 per 100 000 person-years) was higher than the rate in the general veteran population (37.7 per 100 000 person-years).

CONCLUSIONS AND RELEVANCE

The results of this population-based cohort study suggest that sexual minority veterans have a greater risk for suicide than the general US population and the general veteran population. Further research is needed to determine whether and how suicide prevention efforts reach sexual minority veterans.
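
The two summary measures reported in this abstract, a crude rate per 100 000 person-years and a standardized mortality ratio (SMR), can be illustrated with a minimal sketch. It assumes indirect standardization (observed deaths divided by the deaths expected if reference-population rates applied to the cohort's person-years in each stratum); the abstract does not state how the expected counts were derived. The function names and all input numbers below are hypothetical and are not taken from the study.

```python
from typing import Sequence


def crude_rate_per_100k(events: int, person_years: float) -> float:
    """Crude event rate expressed per 100 000 person-years."""
    return events / person_years * 100_000


def standardized_mortality_ratio(
    observed_deaths: int,
    stratum_person_years: Sequence[float],
    reference_rates_per_100k: Sequence[float],
) -> float:
    """SMR by indirect standardization (an assumed method, see lead-in).

    Expected deaths are the sum over strata (for example age bands) of
    cohort person-years multiplied by the reference population's rate.
    """
    expected_deaths = sum(
        py * rate / 100_000
        for py, rate in zip(stratum_person_years, reference_rates_per_100k)
    )
    return observed_deaths / expected_deaths


# Hypothetical inputs for illustration only, not study data.
example_rate = crude_rate_per_100k(events=30, person_years=40_000)
example_smr = standardized_mortality_ratio(
    observed_deaths=90,
    stratum_person_years=[10_000, 20_000],
    reference_rates_per_100k=[100.0, 50.0],
)
print(f"crude rate per 100k person-years: {example_rate:.1f}")
print(f"SMR: {example_smr:.2f}")
```

An SMR above 1 indicates more deaths in the cohort than the reference rates would predict, which is how the reported value of 4.50 is read.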


Byrne T, Baggett T, Land T, Bernson D, Hood ME, Kennedy-Perez C, Monterrey R, Smelson D, Dones M, Bharel M. A classification model of homelessness using integrated administrative data: Implications for targeting interventions to improve the housing status, health and well-being of a highly vulnerable population. PLoS One 2020;15:e0237905. [PMID: 32817717 PMCID: PMC7446866 DOI: 10.1371/journal.pone.0237905] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 10/03/2019] [Accepted: 08/06/2020] [Indexed: 11/19/2022] Open

Cohen DJ, Wyte-Lake T, Dorr DA, Gold R, Holden RJ, Koopman RJ, Colasurdo J, Warren N. Unmet information needs of clinical teams delivering care to complex patients and design strategies to address those needs. J Am Med Inform Assoc 2020;27:690-699. [PMID: 32134456 PMCID: PMC7647291 DOI: 10.1093/jamia/ocaa010] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 09/08/2019] [Revised: 01/06/2020] [Accepted: 01/16/2020] [Indexed: 12/31/2022] Open

Feller DJ, Zucker J, Walk OBD, Yin MT, Gordon P, Elhadad N. Longitudinal analysis of social and behavioral determinants of health in the EHR: exploring the impact of patient trajectories and documentation practices. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2020;2019:399-407. [PMID: 32308833 PMCID: PMC7153098] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]

Feller DJ, Bear Don't Walk IV OJ, Zucker J, Yin MT, Gordon P, Elhadad N. Detecting Social and Behavioral Determinants of Health with Structured and Free-Text Clinical Data. Appl Clin Inform 2020;11:172-181. [PMID: 32131117 DOI: 10.1055/s-0040-1702214] [Citation(s) in RCA: 38] [Impact Index Per Article: 9.5] [Indexed: 10/24/2022] Open

Abstract

BACKGROUND

Social and behavioral determinants of health (SBDH) are environmental and behavioral factors that often impede disease management and result in sexually transmitted infections. Despite their importance, SBDH are inconsistently documented in electronic health records (EHRs) and typically collected only in an unstructured format. Evidence suggests that structured data elements present in EHRs can contribute further to identify SBDH in the patient record.

OBJECTIVE

Explore the automated inference of both the presence of SBDH documentation and individual SBDH risk factors in patient records. Compare the relative ability of clinical notes and structured EHR data, such as laboratory measurements and diagnoses, to support inference.

METHODS

We attempt to infer the presence of SBDH documentation in patient records, as well as patient status of 11 SBDH, including alcohol abuse, homelessness, and sexual orientation. We compare classification performance when considering clinical notes only, structured data only, and notes and structured data together. We perform an error analysis across several SBDH risk factors.

RESULTS

Classification models inferring the presence of SBDH documentation achieved good performance (F1 score range: 78.7-92.7; F1 considered as the primary evaluation metric). Performance was variable for models inferring patient SBDH risk status; results ranged from F1 = 82.7 for LGBT (lesbian, gay, bisexual, and transgender) status to F1 = 28.5 for intravenous drug use. Error analysis demonstrated that lexical diversity and documentation of historical SBDH status challenge inference of patient SBDH status. Three of five classifiers inferring topic-specific SBDH documentation and 10 of 11 patient SBDH status classifiers achieved highest performance when trained using both clinical notes and structured data.

CONCLUSION

Our findings suggest that combining clinical free-text notes and structured data provides the best approach to classifying patient SBDH status. Inferring patient SBDH status is most challenging among SBDH with low prevalence and high lexical diversity.
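
For readers unfamiliar with the F1 score used as the primary metric in this abstract, the following minimal sketch shows the standard definition: the harmonic mean of precision and recall. The function name and the confusion-matrix counts are hypothetical and are not taken from the study. The abstract reports F1 on a 0-100 scale, whereas this sketch returns a value between 0 and 1.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Harmonic mean of precision and recall, on a 0-1 scale.

    Returns 0.0 when precision and recall are both undefined or zero,
    which is a common convention rather than part of the definition.
    """
    predicted_positive = true_positives + false_positives
    actual_positive = true_positives + false_negatives
    precision = true_positives / predicted_positive if predicted_positive else 0.0
    recall = true_positives / actual_positive if actual_positive else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for illustration only, not study data.
example_f1 = f1_score(true_positives=8, false_positives=2, false_negatives=2)
print(f"F1 on a 0-100 scale: {100 * example_f1:.1f}")
```

Because F1 ignores true negatives, it stays informative for rare labels, which is relevant to the low-prevalence SBDH categories the abstract describes as hardest to infer.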
