Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Topaz M, Murga L, Gaddis KM, McDonald MV, Bar-Bachar O, Goldberg Y, Bowles KH. Mining fall-related information in clinical notes: Comparison of rule-based and novel word embedding-based machine learning approaches. J Biomed Inform 2019;90:103103. [PMID: 30639392 DOI: 10.1016/j.jbi.2019.103103] [Citation(s) in RCA: 31] [Impact Index Per Article: 6.2] [Received: 06/04/2018] [Revised: 11/14/2018] [Accepted: 12/31/2018] [Indexed: 10/27/2022]

Cited by Other Article(s)

Gan S, Kim C, Chang J, Lee DY, Park RW. Enhancing readmission prediction models by integrating insights from home healthcare notes: Retrospective cohort study. Int J Nurs Stud 2024;158:104850. [PMID: 39024965 DOI: 10.1016/j.ijnurstu.2024.104850] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2024] [Revised: 06/24/2024] [Accepted: 06/27/2024] [Indexed: 07/20/2024]

Abstract

BACKGROUND

Hospital readmission is an important indicator of inpatient care quality and a significant driver of increasing medical costs. Therefore, it is important to explore the effects of postdischarge information, particularly from home healthcare notes, on enhancing readmission prediction models. Despite the use of Natural Language Processing (NLP) and machine learning in prediction model development, current studies often overlook insights from home healthcare notes.

OBJECTIVE

This study aimed to develop prediction models for 30-day readmissions using home healthcare notes and structured data. In addition, it explored the development of 14- and 180-day prediction models using variables in the 30-day model.

DESIGN

A retrospective observational cohort study.

SETTING(S)

This study was conducted at Ajou University School of Medicine in South Korea.

PARTICIPANTS

Electronic health record data for 1819 participants were used, covering demographic characteristics, conditions, drugs, and home healthcare.

METHODS

Two distinct models were developed for each prediction window (30-, 14-, 180-day): the traditional model, which utilized structured variables alone, and the common data model (CDM)-NLP model, which incorporated structured and topic variables extracted from home healthcare notes. BERTopic facilitated topic generation and risk probability, representing the likelihood of documents being assigned to specific topics. Feature selection involved experimenting with various algorithms. The best-performing algorithm, determined using the area under the receiver operating characteristic curve (AUROC), was used for model development. Model performance was assessed using various learning metrics including AUROC.
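As a rough illustration of selecting a model by AUROC, the sketch below computes the rank-based AUROC (the share of positive/negative pairs in which the positive case gets the higher score) and picks the best of several candidates. The function names, candidate names, labels, and scores are all hypothetical toy values and are not taken from the study.

```python
def auroc(labels, scores):
    """Rank-based AUROC: fraction of (positive, negative) pairs where the
    positive case scores higher; tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def pick_best(candidates, labels):
    """Return (name, auroc) for the candidate with the highest AUROC.
    `candidates` maps a model name to its predicted risk scores."""
    scored = {name: auroc(labels, s) for name, s in candidates.items()}
    best = max(scored, key=scored.get)
    return best, scored[best]


# Hypothetical toy data: 1 = readmitted, 0 = not readmitted.
labels = [1, 0, 1, 0, 0, 1]
candidates = {
    "model_a": [0.9, 0.2, 0.7, 0.4, 0.3, 0.8],
    "model_b": [0.6, 0.5, 0.4, 0.7, 0.2, 0.9],
}
best_name, best_auc = pick_best(candidates, labels)
print(best_name, round(best_auc, 3))
```

The pairwise loop is quadratic in the number of cases, which is fine for a toy example; a library implementation would normally be used on a real cohort.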

RESULTS

Among 1819 patients, 251 (13.80%) experienced 30-day readmission. The least absolute shrinkage and selection operator was used for feature extraction and model development. Fifteen structured features were used in the traditional model, and five additional topic variables from the home healthcare notes were applied in the CDM-NLP model. The AUROC of the traditional model was 0.739 (95% CI: 0.672-0.807). The AUROC of the CDM-NLP model was higher at 0.824 (95% CI: 0.768-0.880), which indicated an outstanding performance. The topics in the CDM-NLP model included emotional distress, daily living functions, nutrition, postoperative status, and cardiorespiratory issues. In extended prediction model development for 14- and 180-day readmissions, the CDM-NLP model consistently outperformed the traditional model.

CONCLUSIONS

This study developed effective prediction models using both structured and unstructured data, thereby emphasizing the significance of postdischarge information from home healthcare notes in readmission prediction.

Albashayreh A, Bandyopadhyay A, Zeinali N, Zhang M, Fan W, Gilbertson White S. Natural Language Processing Accurately Differentiates Cancer Symptom Information in Electronic Health Record Narratives. JCO Clin Cancer Inform 2024;8:e2300235. [PMID: 39116379 DOI: 10.1200/cci.23.00235] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2023] [Revised: 04/29/2024] [Accepted: 05/30/2024] [Indexed: 08/10/2024] Open

Abstract

PURPOSE

Identifying cancer symptoms in electronic health record (EHR) narratives is feasible with natural language processing (NLP). However, more efficient NLP systems are needed to detect various symptoms and distinguish observed symptoms from negated symptoms and medication-related side effects. We evaluated the accuracy of NLP in (1) detecting 14 symptom groups (ie, pain, fatigue, swelling, depressed mood, anxiety, nausea/vomiting, pruritus, headache, shortness of breath, constipation, numbness/tingling, decreased appetite, impaired memory, disturbed sleep) and (2) distinguishing observed symptoms in EHR narratives among patients with cancer.

METHODS

We extracted 902,508 notes for 11,784 unique patients diagnosed with cancer and developed a gold standard corpus of 1,112 notes labeled for presence or absence of 14 symptom groups. We trained an embeddings-augmented NLP system integrating human and machine intelligence and conventional machine learning algorithms. NLP metrics were calculated on a gold standard corpus subset for testing.

RESULTS

The interannotator agreement for labeling the gold standard corpus was excellent at 92%. The embeddings-augmented NLP model achieved the best performance (F1 score = 0.877). The highest NLP accuracy was observed in pruritus (F1 score = 0.937) while the lowest accuracy was in swelling (F1 score = 0.787). After classifying the entire data set with embeddings-augmented NLP, we found that 41% of the notes included symptom documentation. Pain was the most documented symptom (29% of all notes) while impaired memory was the least documented (0.7% of all notes).

CONCLUSION

We illustrated the feasibility of detecting 14 symptom groups in EHR narratives and showed that an embeddings-augmented NLP system outperforms conventional machine learning algorithms in detecting symptom information and differentiating observed symptoms from negated symptoms and medication-related side effects.

Park J, Ahn H. Translating innovative technology-based interventions into nursing practice. Res Nurs Health 2024;47:366-367. [PMID: 38752681 DOI: 10.1002/nur.22392] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/07/2024] [Accepted: 05/08/2024] [Indexed: 07/11/2024]

Zeinali N, Albashayreh A, Fan W, White SG. Symptom-BERT: Enhancing Cancer Symptom Detection in EHR Clinical Notes. J Pain Symptom Manage 2024;68:190-198.e1. [PMID: 38789092 DOI: 10.1016/j.jpainsymman.2024.05.015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/05/2024] [Revised: 05/08/2024] [Accepted: 05/14/2024] [Indexed: 05/26/2024]

Abstract

CONTEXT

Extracting cancer symptom documentation allows clinicians to develop highly individualized symptom prediction algorithms to deliver symptom management care. Leveraging advanced language models to detect symptom data in clinical narratives can significantly enhance this process.

OBJECTIVE

This study uses a pretrained large language model to detect and extract cancer symptoms in clinical notes.

METHODS

We developed a pretrained language model to identify cancer symptoms in clinical notes based on a clinical corpus from the Enterprise Data Warehouse for Research at a healthcare system in the Midwestern United States. This study was conducted in 4 phases: (1) pretraining a Bio-Clinical BERT model on one million unlabeled clinical documents, (2) fine-tuning Symptom-BERT for detecting 13 cancer symptom groups within 1112 annotated clinical notes, (3) generating 180 synthetic clinical notes using ChatGPT-4 for external validation, and (4) comparing the internal and external performance of Symptom-BERT against a non-pretrained version and six other BERT implementations.

RESULTS

The Symptom-BERT model effectively detected cancer symptoms in clinical notes. It achieved a micro-averaged F1-score of 0.933, an AUC of 0.929 internally, and 0.831 and 0.834 externally. Our analysis shows that physical symptoms, such as pruritus, are typically identified with higher performance than psychological symptoms, such as anxiety.

CONCLUSION

This study underscores the transformative potential of specialized pretraining on domain-specific data in boosting the performance of language models for medical applications. The Symptom-BERT model's exceptional efficacy in detecting cancer symptoms heralds a groundbreaking stride in patient-centered AI technologies, offering a promising path to elevate symptom management and cultivate superior patient self-care outcomes.

Song J, Topaz M, Landau AY, Klitzman RL, Shang J, Stone PW, McDonald MV, Cohen B. Natural Language Processing to Identify Home Health Care Patients at Risk for Becoming Incapacitated With No Evident Advance Directives or Surrogates. J Am Med Dir Assoc 2024;25:105019. [PMID: 38754475 DOI: 10.1016/j.jamda.2024.105019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2023] [Revised: 04/01/2024] [Accepted: 04/02/2024] [Indexed: 05/18/2024]

Abstract

OBJECTIVES

Home health care patients who are at risk for becoming Incapacitated with No Evident Advance Directives or Surrogates (INEADS) may benefit from timely intervention to assist them with advance care planning. This study aimed to develop natural language processing algorithms for identifying home care patients who do not have advance directives, family members, or close social contacts who can serve as surrogate decision-makers in the event that they lose decisional capacity.

DESIGN

Cross-sectional study of electronic health records.

SETTING AND PARTICIPANTS

Patients receiving post-acute care discharge services from a large home health agency in New York City in 2019 (n = 45,390 enrollment episodes).

METHODS

We developed a natural language processing algorithm for identifying information documented in free-text clinical notes (n = 1,429,030 notes) related to 4 categories: evidence of close relationships, evidence of advance directives, evidence suggesting lack of close relationships, and evidence suggesting lack of advance directives. We validated the algorithm against Gold Standard clinician review for 50 patients (n = 314 notes) to calculate precision, recall, and F-score.
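Validation against a gold standard of this kind comes down to counting agreements and disagreements per note. The sketch below shows one way precision, recall, and F-score could be computed for a single category; the function names and the toy label lists are hypothetical and do not reproduce the study's data.

```python
def confusion_counts(gold, predicted):
    """Count true positives, false positives, and false negatives for one
    category, given parallel lists of 0/1 labels (gold = clinician review)."""
    tp = sum(1 for g, p in zip(gold, predicted) if g and p)
    fp = sum(1 for g, p in zip(gold, predicted) if not g and p)
    fn = sum(1 for g, p in zip(gold, predicted) if g and not p)
    return tp, fp, fn


def precision_recall_f1(tp, fp, fn):
    """Precision = TP/(TP+FP), recall = TP/(TP+FN), F-score = harmonic mean.
    Each value falls back to 0.0 when its denominator is zero."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical toy labels for eight notes in one category.
gold = [1, 1, 1, 0, 0, 1, 0, 1]
predicted = [1, 1, 0, 0, 1, 1, 0, 1]
tp, fp, fn = confusion_counts(gold, predicted)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
print(tp, fp, fn, round(precision, 2), round(recall, 2), round(f1, 2))
```

An average F-score across categories would then be the mean of the per-category values, assuming an unweighted (macro) average.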

RESULTS

Algorithm performance for identifying text related to the 4 categories was excellent (average F-score = 0.91), with the best results for "evidence of close relationships" (F-score = 0.99) and the worst results for "evidence of advance directives" (F-score = 0.86). The algorithm identified 22% of all clinical notes (313,290 of 1,429,030) as having text related to 1 or more categories. More than 98% of enrollment episodes (48,164 of 49,141) included at least 1 clinical note containing text related to 1 or more categories.

CONCLUSIONS AND IMPLICATIONS

This study establishes the feasibility of creating an automated screening algorithm to aid home health care agencies with identifying patients at risk of becoming INEADS. This screening algorithm can be applied as part of a multipronged approach to facilitate clinician support for advance care planning with patients at risk of becoming INEADS.

Osman M, Cooper R, Sayer AA, Witham MD. The use of natural language processing for the identification of ageing syndromes including sarcopenia, frailty and falls in electronic healthcare records: a systematic review. Age Ageing 2024;53:afae135. [PMID: 38970549 PMCID: PMC11227113 DOI: 10.1093/ageing/afae135] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2023] [Indexed: 07/08/2024] Open

Abstract

BACKGROUND

Recording and coding of ageing syndromes in hospital records is known to be suboptimal. Natural Language Processing algorithms may be useful to identify diagnoses in electronic healthcare records to improve the recording and coding of these ageing syndromes, but the feasibility and diagnostic accuracy of such algorithms are unclear.

METHODS

We conducted a systematic review according to a predefined protocol and in line with Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) guidelines. Searches were run from the inception of each database to the end of September 2023 in PubMed, Medline, Embase, CINAHL, ACM digital library, IEEE Xplore and Scopus. Eligible studies were identified via independent review of search results by two coauthors and data extracted from each study to identify the computational method, source of text, testing strategy and performance metrics. Data were synthesised narratively by ageing syndrome and computational method in line with the Studies Without Meta-analysis guidelines.

RESULTS

From 1030 titles screened, 22 studies were eligible for inclusion. One study focussed on identifying sarcopenia, one frailty, twelve falls, five delirium, five dementia and four incontinence. Sensitivity (57.1%-100%) of algorithms compared with a reference standard was reported in 20 studies, and specificity (84.0%-100%) was reported in only 12 studies. Study design quality was variable with results relevant to diagnostic accuracy not always reported, and few studies undertaking external validation of algorithms.

CONCLUSIONS

Current evidence suggests that Natural Language Processing algorithms can identify ageing syndromes in electronic health records. However, algorithms require testing in rigorously designed diagnostic accuracy studies with appropriate metrics reported.

Park JI, Park JW, Zhang K, Kim D. Advancing equity in breast cancer care: natural language processing for analysing treatment outcomes in under-represented populations. BMJ Health Care Inform 2024;31:e100966. [PMID: 38955389 PMCID: PMC11218025 DOI: 10.1136/bmjhci-2023-100966] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2023] [Accepted: 06/21/2024] [Indexed: 07/04/2024] Open

Abstract

OBJECTIVE

The study aimed to develop natural language processing (NLP) algorithms to automate extracting patient-centred breast cancer treatment outcomes from clinical notes in electronic health records (EHRs), particularly for women from under-represented populations.

METHODS

The study used clinical notes from 2010 to 2021 from a tertiary hospital in the USA. The notes were processed through various NLP techniques, including vectorisation methods (term frequency-inverse document frequency (TF-IDF), Word2Vec, Doc2Vec) and classification models (support vector classification, K-nearest neighbours (KNN), random forest (RF)). Feature selection and optimisation through random search and fivefold cross-validation were also conducted.
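To make the TF-IDF vectorisation step concrete, the sketch below applies the textbook formula (term frequency times the log of inverse document frequency) to a few invented note fragments. It is a minimal illustration only: the tokeniser is a plain whitespace split, the example notes are hypothetical, and library implementations often add smoothing or normalisation that this sketch omits.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document using
    weight = (count / doc_length) * log(N / document_frequency)."""
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return vectors


# Hypothetical note fragments, not taken from the study corpus.
notes = [
    "patient reports pain",
    "patient denies pain",
    "patient reports nausea",
]
vectors = tfidf(notes)
for note, vec in zip(notes, vectors):
    print(note, {t: round(w, 3) for t, w in vec.items()})
```

A term that appears in every note (here "patient") gets a weight of zero, while a term unique to one note (here "nausea") gets the highest weight, which is the property that lets a downstream classifier focus on distinguishing words.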

RESULTS

The study annotated 100 out of 1000 clinical notes, using 970 notes to build the text corpus. TF-IDF and Doc2Vec combined with RF showed the highest performance, while Word2Vec was less effective. The RF classifier demonstrated the best performance, although with lower recall rates, suggesting more false negatives. KNN showed lower recall due to its sensitivity to data noise.

DISCUSSION

The study highlights the significance of using NLP in analysing clinical notes to understand breast cancer treatment outcomes in under-represented populations. The TF-IDF and Doc2Vec models were more effective in capturing relevant information than Word2Vec. The study observed lower recall rates in RF models, attributed to the dataset's imbalanced nature and the complexity of clinical notes.

CONCLUSION

The study developed a high-performing NLP pipeline to capture treatment outcomes for breast cancer in under-represented populations, demonstrating the importance of document-level vectorisation and ensemble methods in clinical notes analysis. The findings provide insights for more equitable healthcare strategies and show the potential for broader NLP applications in clinical settings.

Miller M, Jorm L, Partyka C, Burns B, Habig K, Oh C, Immens S, Ballard N, Gallego B. Identifying prehospital trauma patients from ambulance patient care records; comparing two methods using linked data in New South Wales, Australia. Injury 2024;55:111570. [PMID: 38664086 DOI: 10.1016/j.injury.2024.111570] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2024] [Revised: 04/11/2024] [Accepted: 04/14/2024] [Indexed: 06/16/2024]

Abstract

BACKGROUND

Linked datasets for trauma system monitoring should ideally follow patients from the prehospital scene to hospital admission and post-discharge. Having a well-defined cohort when using administrative datasets is essential because they must capture the representative population. Unlike hospital electronic health records (EHR), ambulance patient-care records lack access to sources beyond immediate clinical notes. Relying on a limited set of variables to define a study population might result in missed patient inclusion. We aimed to compare two methods of identifying prehospital trauma patients: one using only those documented under a trauma protocol and another incorporating additional data elements from ambulance patient care records.

METHODS

We analyzed data from six routinely collected administrative datasets from 2015 to 2018, including ambulance patient-care records, aeromedical data, emergency department visits, hospitalizations, rehabilitation outcomes, and death records. Three prehospital trauma cohorts were created: an Extended-T-protocol cohort (patients transported under a trauma protocol and/or patients with prespecified criteria from structured data fields), T-protocol cohort (only patients documented as transported under a trauma protocol) and non-T-protocol (extended-T-protocol population not in the T-protocol cohort). Patient-encounter characteristics, mortality, clinical and post-hospital discharge outcomes were compared. A conservative p-value of 0.01 was considered significant.

RESULTS

Of 1,038,263 patient-encounters included in the extended-T-population, 814,729 (78.5%) were transported, with 438,893 (53.9%) documented as a T-protocol patient. Half (49.6%) of the non-T-protocol sub-cohort had an International Classification of Disease 10th edition injury or external cause code, indicating 79,644 missed patients when a T-protocol-only definition was used. The non-T-protocol sub-cohort also identified additional patients with intubation, prehospital blood transfusion and positive eFAST. A higher proportion of non-T-protocol patients than T-protocol patients were admitted to the ICU (4.6% vs 3.6%), ventilated (1.8% vs 1.3%), received in-hospital transfusion (7.9% vs 6.8%) or died (1.8% vs 1.3%). Urgent trauma surgery was similar between groups (1.3% vs 1.4%).

CONCLUSION

The extended-T-population definition identified 50% more admitted patients with an ICD-10-AM code consistent with an injury, including patients with severe trauma. Developing an EHR phenotype incorporating multiple data fields of ambulance-transported trauma patients for use with linked data may avoid missing these patients.

Yin K, Xu W, Ren S, Xu Q, Zhang S, Zhang R, Jiang M, Zhang Y, Xu D, Li R. Machine Learning Accelerates De Novo Design of Antimicrobial Peptides. Interdiscip Sci 2024;16:392-403. [PMID: 38416364 DOI: 10.1007/s12539-024-00612-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2023] [Revised: 01/17/2024] [Accepted: 01/23/2024] [Indexed: 02/29/2024]

Affiliation(s)

Kedong Yin: Key Laboratory of Functional Molecules for Biomedical Research, Henan University of Technology, 100 Lianhua Street, Zhengzhou, 450001, Henan, People's Republic of China; College of Information Science and Engineering, Henan University of Technology, 100 Lianhua Street, Zhengzhou, 450001, Henan, People's Republic of China.
Wen Xu: Key Laboratory of Functional Molecules for Biomedical Research, Henan University of Technology, 100 Lianhua Street, Zhengzhou, 450001, Henan, People's Republic of China; Law College, Henan University of Technology, Zhengzhou, 450001, Henan, People's Republic of China.
Shiming Ren: Key Laboratory of Functional Molecules for Biomedical Research, Henan University of Technology, 100 Lianhua Street, Zhengzhou, 450001, Henan, People's Republic of China; College of Biological Engineering, Henan University of Technology, 100 Lianhua Street, Zhengzhou, 450001, Henan, People's Republic of China.
Qingpeng Xu: Key Laboratory of Functional Molecules for Biomedical Research, Henan University of Technology, 100 Lianhua Street, Zhengzhou, 450001, Henan, People's Republic of China; School of Artificial Intelligence and Big Data, Henan University of Technology, Zhengzhou, 450001, Henan, People's Republic of China.
Shaojie Zhang: Key Laboratory of Functional Molecules for Biomedical Research, Henan University of Technology, 100 Lianhua Street, Zhengzhou, 450001, Henan, People's Republic of China; College of Biological Engineering, Henan University of Technology, 100 Lianhua Street, Zhengzhou, 450001, Henan, People's Republic of China.
Ruiling Zhang: Key Laboratory of Functional Molecules for Biomedical Research, Henan University of Technology, 100 Lianhua Street, Zhengzhou, 450001, Henan, People's Republic of China; School of Economics and Trade, Henan University of Technology, Zhengzhou, 450001, Henan, People's Republic of China.
Mengwan Jiang: School of Artificial Intelligence and Big Data, Henan University of Technology, Zhengzhou, 450001, Henan, People's Republic of China.
Yuhong Zhang: School of Artificial Intelligence and Big Data, Henan University of Technology, Zhengzhou, 450001, Henan, People's Republic of China.
Degang Xu: College of Information Science and Engineering, Henan University of Technology, 100 Lianhua Street, Zhengzhou, 450001, Henan, People's Republic of China.
Ruifang Li: Key Laboratory of Functional Molecules for Biomedical Research, Henan University of Technology, 100 Lianhua Street, Zhengzhou, 450001, Henan, People's Republic of China; College of Biological Engineering, Henan University of Technology, 100 Lianhua Street, Zhengzhou, 450001, Henan, People's Republic of China.

Colomer-Lahiguera S, Gentizon J, Christofis M, Darnac C, Serena A, Eicher M. Achieving Comprehensive, Patient-Centered Cancer Services: Optimizing the Role of Advanced Practice Nurses at the Core of Precision Health. Semin Oncol Nurs 2024;40:151629. [PMID: 38584046 DOI: 10.1016/j.soncn.2024.151629] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/29/2024] [Revised: 03/11/2024] [Accepted: 03/13/2024] [Indexed: 04/09/2024]

Abstract

OBJECTIVES

The field of oncology has been revolutionized by precision medicine, driven by advancements in molecular and genomic profiling. High-throughput genomic sequencing and non-invasive diagnostic methods have deepened our understanding of cancer biology, leading to personalized treatment approaches. Precision health expands on precision medicine, emphasizing holistic healthcare, integrating molecular profiling and genomics, physiology, behavioral, and social and environmental factors. Precision health encompasses traditional and emerging data, including electronic health records, patient-generated health data, and artificial intelligence-based health technologies. This article aims to explore the opportunities and challenges faced by advanced practice nurses (APNs) within the precision health paradigm.

METHODS

We searched for peer-reviewed and professionally relevant studies and articles on advanced practice nursing, oncology, precision medicine and precision health, and symptom science.

RESULTS

APNs' roles and competencies align with the core principles of precision health, allowing for personalized interventions based on comprehensive patient characteristics. We identified educational needs and policy gaps as limitations faced by APNs in fully embracing precision health.

CONCLUSION

APNs, including nurse practitioners and clinical nurse specialists, are ideally positioned to advance precision health. Nevertheless, it is imperative to overcome a series of barriers to fully leverage APNs' potential in this context.

IMPLICATIONS FOR NURSING PRACTICE

APNs can significantly contribute to precision health through their competencies in predictive, preventive, and health promotion strategies, personalized and collaborative care plans, ethical considerations, and interdisciplinary collaboration. However, there is a need to foster education in genetics and genomics, encourage continuous professional development, and enhance understanding of artificial intelligence-related technologies and digital health. Furthermore, APNs' scope of practice needs to be reflected in policy making and legislation to enable effective contribution of APNs to precision health.

Scharp D, Hobensack M, Davoudi A, Topaz M. Natural Language Processing Applied to Clinical Documentation in Post-acute Care Settings: A Scoping Review. J Am Med Dir Assoc 2024;25:69-83. [PMID: 37838000 PMCID: PMC10792659 DOI: 10.1016/j.jamda.2023.09.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2023] [Revised: 09/05/2023] [Accepted: 09/07/2023] [Indexed: 10/16/2023]

Abstract

OBJECTIVES

To determine the scope of the application of natural language processing to free-text clinical notes in post-acute care and provide a foundation for future natural language processing-based research in these settings.

DESIGN

Scoping review; reported according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews guidelines.

SETTING AND PARTICIPANTS

Post-acute care (ie, home health care, long-term care, skilled nursing facilities, and inpatient rehabilitation facilities).

METHODS

PubMed, Cumulative Index of Nursing and Allied Health Literature, and Embase were searched in February 2023. Eligible studies had quantitative designs that used natural language processing applied to clinical documentation in post-acute care settings. The quality of each study was appraised.

RESULTS

Twenty-one studies were included. Almost all studies were conducted in home health care settings. Most studies extracted data from electronic health records to examine the risk for negative outcomes, including acute care utilization, medication errors, and suicide mortality. About half of the studies did not report age, sex, race, or ethnicity data or use standardized terminologies. Only 8 studies included variables from socio-behavioral domains. Most studies fulfilled all quality appraisal indicators.

CONCLUSIONS AND IMPLICATIONS

The application of natural language processing is nascent in post-acute care settings. Future research should apply natural language processing using standardized terminologies to leverage free-text clinical notes in post-acute care to promote timely, comprehensive, and equitable care. Natural language processing could be integrated with predictive models to help identify patients who are at risk of negative outcomes. Future research should incorporate socio-behavioral determinants and diverse samples to improve health equity in informatics tools.

Trinh VQN, Zhang S, Kovoor J, Gupta A, Chan WO, Gilbert T, Bacchi S. The use of natural language processing in detecting and predicting falls within the healthcare setting: a systematic review. Int J Qual Health Care 2023;35:mzad077. [PMID: 37758209 PMCID: PMC10585351 DOI: 10.1093/intqhc/mzad077] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2023] [Revised: 08/30/2023] [Accepted: 09/23/2023] [Indexed: 10/03/2023] Open

Mishra AK, Chappell MJ, Emerson S, Skubic M. Fall Risk Prediction in Older Adults Using Free-Text Nursing Notes and Medications in Electronic Health Records. Annu Int Conf IEEE Eng Med Biol Soc 2023;2023:1-4. [PMID: 38082830 DOI: 10.1109/embc40787.2023.10341127] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/18/2023]

Mitha S, Schwartz J, Hobensack M, Cato K, Woo K, Smaldone A, Topaz M. Natural Language Processing of Nursing Notes: An Integrative Review. Comput Inform Nurs 2023;41:377-384. [PMID: 36730744 DOI: 10.1097/cin.0000000000000967] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/04/2023]

Lituiev DS, Lacar B, Pak S, Abramowitsch PL, De Marchis EH, Peterson TA. Automatic extraction of social determinants of health from medical notes of chronic lower back pain patients. J Am Med Inform Assoc 2023:7133957. [PMID: 37080559 DOI: 10.1093/jamia/ocad054] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 12/05/2022] [Revised: 02/15/2023] [Accepted: 03/18/2023] [Indexed: 04/22/2023] Open

Topaz M, Song J, Davoudi A, McDonald M, Taylor J, Sittig S, Bowles K. Home Health Care Clinicians' Use of Judgment Language for Black and Hispanic Patients: Natural Language Processing Study. JMIR Nurs 2023;6:e42552. [PMID: 37067893 PMCID: PMC10152333 DOI: 10.2196/42552] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/08/2022] [Revised: 12/05/2022] [Accepted: 03/16/2023] [Indexed: 03/18/2023] Open

Abstract

BACKGROUND

A clinician's biased behavior toward patients can affect the quality of care. Recent literature reviews report on widespread implicit biases among clinicians. Although emerging studies in hospital settings show racial biases in the language used in clinical documentation within electronic health records, no studies have yet investigated the extent of judgment language in home health care.

OBJECTIVE

We aimed to examine racial differences in judgment language use and the relationship between judgment language use and the amount of time clinicians spent on home visits as a reflection of care quality in home health care.

METHODS

This study is a retrospective observational cohort study. Study data were extracted from a large urban home health care organization in the Northeastern United States. The study data set included patients (N=45,384) who received home health care services between January 1 and December 31, 2019. The study applied a natural language processing algorithm to automatically detect the language of judgment in clinical notes.

RESULTS

The use of judgment language was observed in 38% (n=17,141) of the patients. The highest use of judgment language was found in Hispanic (7,167/66,282, 10.8% of all clinical notes), followed by Black (7,010/65,628, 10.7%), White (10,206/107,626, 9.5%), and Asian (1,756/22,548, 7.8%) patients. Black and Hispanic patients were 14% more likely to have notes with judgment language than White patients. The length of a home health care visit was reduced by 21 minutes when judgment language was used.

CONCLUSIONS

Racial differences were identified in judgment language use. When judgment language is used, clinicians spend less time at patients' homes. Because the language clinicians use in documentation is associated with the time spent providing care, further research is needed to study the impact of using judgment language on quality of home health care. Policy, education, and clinical practice improvements are needed to address the biases behind judgment language.

Yang S, Varghese P, Stephenson E, Tu K, Gronsbell J. Machine learning approaches for electronic health records phenotyping: a methodical review. J Am Med Inform Assoc 2023;30:367-381. [PMID: 36413056 PMCID: PMC9846699 DOI: 10.1093/jamia/ocac216] [Citation(s) in RCA: 30] [Impact Index Per Article: 30.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/28/2022] [Revised: 09/27/2022] [Accepted: 10/27/2022] [Indexed: 11/23/2022] Open

Abstract

OBJECTIVE

Accurate and rapid phenotyping is a prerequisite to leveraging electronic health records for biomedical research. While early phenotyping relied on rule-based algorithms curated by experts, machine learning (ML) approaches have emerged as an alternative to improve scalability across phenotypes and healthcare settings. This study evaluates ML-based phenotyping with respect to (1) the data sources used, (2) the phenotypes considered, (3) the methods applied, and (4) the reporting and evaluation methods used.

MATERIALS AND METHODS

We searched PubMed and Web of Science for articles published between 2018 and 2022. After screening 850 articles, we recorded 37 variables on 100 studies.

RESULTS

Most studies utilized data from a single institution and included information in clinical notes. Although chronic conditions were most commonly considered, ML also enabled the characterization of nuanced phenotypes such as social determinants of health. Supervised deep learning was the most popular ML paradigm, while semi-supervised and weakly supervised learning were applied to expedite algorithm development and unsupervised learning to facilitate phenotype discovery. ML approaches did not uniformly outperform rule-based algorithms, but deep learning offered a marginal improvement over traditional ML for many conditions.

DISCUSSION

Despite the progress in ML-based phenotyping, most articles focused on binary phenotypes and few articles evaluated external validity or used multi-institution data. Study settings were infrequently reported and analytic code was rarely released.

CONCLUSION

Continued research in ML-based phenotyping is warranted, with emphasis on characterizing nuanced phenotypes, establishing reporting and evaluation standards, and developing methods to accommodate misclassified phenotypes due to algorithm errors in downstream applications.

Bull NJ, Honan B, Spratt NJ, Quilty S. A method for rapid machine learning development for data mining with doctor-in-the-loop. PLoS One 2023;18:e0284965. [PMID: 37163511 PMCID: PMC10171605 DOI: 10.1371/journal.pone.0284965] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/31/2022] [Accepted: 04/13/2023] [Indexed: 05/12/2023] Open

Abstract

Classifying free text from historical databases into research-compatible formats is a barrier for clinicians undertaking audit and research projects. The aim of this study was to (a) develop an interactive active machine-learning model training methodology using readily available software that was (b) easily adaptable to a wide range of natural language databases and allowed customised researcher-defined categories, and then (c) evaluate the accuracy and speed of this model for classifying free text from two unique and unrelated clinical notes into coded data. A user interface for medical experts to train and evaluate the algorithm was created. The data requiring coding took the form of two independent databases of free-text clinical notes, each with a unique natural language structure. Medical experts defined categories relevant to research projects and performed 'label-train-evaluate' loops on the training data set. A separate dataset was used for validation, with the medical experts blinded to the label given by the algorithm. The first dataset was 32,034 death certificate records from Northern Territory Births Deaths and Marriages, which were coded into 3 categories: haemorrhagic stroke, ischaemic stroke or no stroke. The second dataset was 12,039 recorded episodes of aeromedical retrieval from two prehospital and retrieval services in Northern Territory, Australia, which were coded into 5 categories: medical, surgical, trauma, obstetric or psychiatric. For the first dataset, macro-accuracy of the algorithm was 94.7%. For the second dataset, macro-accuracy was 92.4%. The time taken to develop and train the algorithm was 124 minutes for the death certificate coding, and 144 minutes for the aeromedical retrieval coding. This machine-learning training method was able to classify free-text clinical notes quickly and accurately from two different health datasets into categories of relevance to clinicians undertaking health service research.

Cusick M, Velupillai S, Downs J, Campion TR, Sholle ET, Dutta R, Pathak J. Portability of natural language processing methods to detect suicidality from clinical text in US and UK electronic health records. J Affect Disord Rep 2022;10:100430. [PMID: 36644339 PMCID: PMC9835770 DOI: 10.1016/j.jadr.2022.100430] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022] Open

Marchesin S, Giachelle F, Marini N, Atzori M, Boytcheva S, Buttafuoco G, Ciompi F, Di Nunzio GM, Fraggetta F, Irrera O, Müller H, Primov T, Vatrano S, Silvello G. Empowering digital pathology applications through explainable knowledge extraction tools. J Pathol Inform 2022;13:100139. [PMID: 36268087 PMCID: PMC9577130 DOI: 10.1016/j.jpi.2022.100139] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/16/2022] [Revised: 09/06/2022] [Accepted: 09/07/2022] [Indexed: 11/25/2022] Open

Abuzaid MM, Elshami W, Fadden SM. Integration of artificial intelligence into nursing practice. Health Technol (Berl) 2022;12:1109-1115. [PMID: 36117522 PMCID: PMC9470236 DOI: 10.1007/s12553-022-00697-0] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/21/2022] [Revised: 08/15/2022] [Accepted: 08/16/2022] [Indexed: 10/31/2022]

Chae S, Song J, Ojo M, Bowles KH, McDonald MV, Barrón Y, Hobensack M, Kennedy E, Sridharan S, Evans L, Topaz M. Factors associated with poor self-management documented in home health care narrative notes for patients with heart failure. Heart Lung 2022;55:148-154. [PMID: 35597164 PMCID: PMC11021173 DOI: 10.1016/j.hrtlng.2022.05.004] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/03/2021] [Revised: 05/03/2022] [Accepted: 05/07/2022] [Indexed: 11/04/2022]

Bi Q, Kuang Z, Haihong E, Song M, Tan L, Tang X, Liu X. Research on early warning of renal damage in hypertensive patients based on the stacking strategy. BMC Med Inform Decis Mak 2022;22:212. [PMID: 35945608 PMCID: PMC9361646 DOI: 10.1186/s12911-022-01889-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/01/2020] [Accepted: 03/31/2022] [Indexed: 11/26/2022] Open

Abstract

Background

Among the problems caused by hypertension, early renal damage is often ignored. It cannot be diagnosed until the condition is severe and irreversible damage has occurred. We therefore set out to screen and explore related risk factors for hypertensive patients with early renal damage and to establish an early-warning model of renal damage based on data-mining methods, so as to achieve an early diagnosis for hypertensive patients with renal damage.

Methods

With the aid of an electronic information management system for hypertensive out-patients, we collected 513 cases of original, untreated hypertensive patients. We recorded their demographic data, ambulatory blood pressure parameters, blood routine index, and blood biochemical index to establish the clinical database. We then screened risk factors for early renal damage through feature engineering and used Random Forest, Extra-Trees, and XGBoost to build separate early-warning models. Finally, we built a new model by model fusion based on the Stacking strategy. We used cross-validation to evaluate the stability and reliability of each model to determine the best risk assessment model.

Results

In descending order of importance, the features selected by feature engineering are the drop rate of systolic blood pressure at night, the red blood cell distribution width, blood pressure circadian rhythm, the average diastolic blood pressure at daytime, body surface area, smoking, age, and HDL. The average precision of the two-dimensional fusion model based on the Stacking strategy is 0.89685 with the full feature set and 0.93824 with the selected features, a marked improvement.

Conclusions

Through feature engineering and risk factor analysis, we select the drop rate of systolic blood pressure at night, the red blood cell distribution width, blood pressure circadian rhythm, and the average diastolic blood pressure at daytime as early-warning factors of early renal damage in patients with hypertension. On this basis, the two-dimensional fusion model based on the Stacking strategy has a better effect than the single model, which can be used for risk assessment of early renal damage in hypertensive patients.

Paladino MS. Cuidado e inteligencia artificial: una reflexión necesaria. PERSONA Y BIOÉTICA 2022. [DOI: 10.5294/pebi.2021.25.2.8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/23/2022] Open

Song J, Hobensack M, Bowles KH, McDonald MV, Cato K, Rossetti SC, Chae S, Kennedy E, Barrón Y, Sridharan S, Topaz M. Clinical notes: An untapped opportunity for improving risk prediction for hospitalization and emergency department visit during home health care. J Biomed Inform 2022;128:104039. [PMID: 35231649 PMCID: PMC9825202 DOI: 10.1016/j.jbi.2022.104039] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/20/2021] [Revised: 02/22/2022] [Accepted: 02/23/2022] [Indexed: 01/11/2023]

Abstract

BACKGROUND/OBJECTIVE

Between 10% and 25% of patients are hospitalized or visit the emergency department (ED) during home healthcare (HHC). Given that up to 40% of these negative clinical outcomes are preventable, early and accurate prediction of hospitalization risk can be one strategy to prevent them. In recent years, machine learning-based predictive modeling has become widely used for building risk models. This study aimed to compare the predictive performance of four risk models built with various data sources for hospitalization and ED visits in HHC.

METHODS

Four risk models were built using different variables from two data sources: structured data (i.e., Outcome and Assessment Information Set (OASIS) and other assessment items from the electronic health record (EHR)) and unstructured narrative free-text clinical notes for patients who received HHC services from the largest non-profit HHC organization in New York between 2015 and 2017. Then, five machine learning algorithms (logistic regression, Random Forest, Bayesian network, support vector machine (SVM), and Naïve Bayes) were used on each risk model. Risk model performance was evaluated using the F-score and Precision-Recall Curve (PRC) area metrics.

RESULTS

During the study period, 8373/86,823 (9.6%) HHC episodes resulted in hospitalization or ED visits. Among five machine learning algorithms on each model, the SVM showed the highest F-score (0.82), while the Random Forest showed the highest PRC area (0.864). Adding information extracted from clinical notes significantly improved the risk prediction ability by up to 16.6% in F-score and 17.8% in PRC.

CONCLUSION

All models showed relatively good hospitalization or ED visit risk predictive performance in HHC. Information from clinical notes integrated with the structured data improved the ability to identify patients at risk for these emergent care events.

Combining supervised and unsupervised named entity recognition to detect psychosocial risk factors in occupational health checks. Int J Med Inform 2022;160:104695. [DOI: 10.1016/j.ijmedinf.2022.104695] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/13/2021] [Revised: 10/25/2021] [Accepted: 01/16/2022] [Indexed: 11/17/2022]

Zolnoori M, Song J, McDonald MV, Barrón Y, Cato K, Sockolow P, Sridharan S, Onorato N, Bowles KH, Topaz M. Exploring Reasons for Delayed Start-of-Care Nursing Visits in Home Health Care: Algorithm Development and Data Science Study. JMIR Nurs 2021;4:e31038. [PMID: 34967749 PMCID: PMC8759020 DOI: 10.2196/31038] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/07/2021] [Revised: 08/31/2021] [Accepted: 10/28/2021] [Indexed: 01/27/2023] Open

Von Gerich H, Moen H, Block LJ, Chu CH, DeForest H, Hobensack M, Michalowski M, Mitchell J, Nibber R, Olalia MA, Pruinelli L, Ronquillo CE, Topaz M, Peltonen LM. Artificial Intelligence-based technologies in nursing: A scoping literature review of the evidence. Int J Nurs Stud 2021;127:104153. [DOI: 10.1016/j.ijnurstu.2021.104153] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/11/2021] [Revised: 11/23/2021] [Accepted: 12/01/2021] [Indexed: 12/20/2022]

Santus E, Schuster T, Tahmasebi AM, Li C, Yala A, Lanahan CR, Prinsen P, Thompson SF, Coons S, Mynderse L, Barzilay R, Hughes K. Exploiting Rules to Enhance Machine Learning in Extracting Information From Multi-Institutional Prostate Pathology Reports. JCO Clin Cancer Inform 2021;4:865-874. [PMID: 33006906 DOI: 10.1200/cci.20.00028] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/13/2022] Open

Ronquillo CE, Peltonen LM, Pruinelli L, Chu CH, Bakken S, Beduschi A, Cato K, Hardiker N, Junger A, Michalowski M, Nyrup R, Rahimi S, Reed DN, Salakoski T, Salanterä S, Walton N, Weber P, Wiegand T, Topaz M. Artificial intelligence in nursing: Priorities and opportunities from an international invitational think-tank of the Nursing and Artificial Intelligence Leadership Collaborative. J Adv Nurs 2021;77:3707-3717. [PMID: 34003504 PMCID: PMC7612744 DOI: 10.1111/jan.14855] [Citation(s) in RCA: 62] [Impact Index Per Article: 20.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/04/2021] [Accepted: 03/21/2021] [Indexed: 01/23/2023]

Abstract

Aim

To develop a consensus paper on the central points of an international invitational think‐tank on nursing and artificial intelligence (AI).

Methods

We established the Nursing and Artificial Intelligence Leadership (NAIL) Collaborative, comprising interdisciplinary experts in AI development, biomedical ethics, AI in primary care, AI legal aspects, philosophy of AI in health, nursing practice, implementation science, leaders in health informatics practice and international health informatics groups, a representative of patients and the public, and the Chair of the ITU/WHO Focus Group on Artificial Intelligence for Health. The NAIL Collaborative convened at a 3‐day invitational think tank in autumn 2019. Activities included a pre‐event survey, expert presentations and working sessions to identify priority areas for action, opportunities and recommendations to address these. In this paper, we summarize the key discussion points and notes from the aforementioned activities.

Implications for nursing

Nursing's limited current engagement with discourses on AI and health poses a risk that the profession is not part of the conversations that have potentially significant impacts on nursing practice.

Conclusion

There are numerous gaps and a timely need for the nursing profession to be among the leaders and drivers of conversations around AI in health systems.

Impact

We outline crucial gaps where focused effort is required for nursing to take a leadership role in shaping AI use in health systems. Three priorities were identified that need to be addressed in the near future: (a) Nurses must understand the relationship between the data they collect and AI technologies they use; (b) Nurses need to be meaningfully involved in all stages of AI: from development to implementation; and (c) There is substantial untapped and unexplored potential for nursing to contribute to the development of AI technologies for global health and humanitarian efforts.

Affiliation(s)

Charlene Esteban Ronquillo: Daphne Cockwell School of Nursing, Faculty of Community Services, Ryerson University, Toronto, ON, Canada; School of Nursing, Faculty of Health and Social Development, University of British Columbia Okanagan, Kelowna, BC, Canada; International Medical Informatics Association, Student and Emerging Professionals Special Interest Group
Laura-Maria Peltonen: International Medical Informatics Association, Student and Emerging Professionals Special Interest Group; Department of Nursing Science, University of Turku, Turku, Finland
Lisiane Pruinelli: School of Nursing, University of Minnesota, Minneapolis, MN, USA
Charlene H Chu: Lawrence S. Bloomberg Faculty of Nursing, University of Toronto, Toronto, ON, Canada
Suzanne Bakken: School of Nursing, Department of Biomedical Informatics, Data Science Institute, Columbia University, New York, NY, USA; Precision in Symptom Self-Management (PriSSM) Center, Reducing Health Disparities Through Informatics Training Program (RHeaDI), Columbia University, New York, NY, USA
Ana Beduschi: Law School, University of Exeter, Exeter, UK
Kenrick Cato: School of Nursing, Department of Biomedical Informatics, Data Science Institute, Columbia University, New York, NY, USA
Nicholas Hardiker: School of Human & Health Sciences, University of Huddersfield, Huddersfield, UK
Alain Junger: Nursing Direction, Nursing Information System Unit, Centre Hospitalier Universitaire Vaudois (CHUV) Lausanne, Lausanne, Switzerland
Martin Michalowski: School of Nursing, University of Minnesota, Minneapolis, MN, USA
Rune Nyrup: Leverhulme Centre for the Future of Intelligence, University of Cambridge, Cambridge, UK
Samira Rahimi: Department of Family Medicine, McGill University, Lady Davis Institute for Medical Research of Jewish General Hospital, Mila Quebec Artificial Intelligence Institute, Montreal, QC, Canada
Donald Nigel Reed: College of Medicine and Health, University of Exeter, Exeter, UK
Tapio Salakoski: Department of Mathematics and Statistics, University of Turku, Turku, Finland
Sanna Salanterä: Department of Nursing Science, University of Turku and Turku University Hospital, Turku, Finland
Nancy Walton: Daphne Cockwell School of Nursing, Faculty of Community Services, Ryerson University, Toronto, ON, Canada; Research Ethics Board, Women's College Hospital, Toronto, ON, Canada; Health Canada and Public Health Agency of Canada's Research Ethics Board, Toronto, ON, Canada
Patrick Weber: NICE Computing SA, Lausanne, Switzerland; European Federation for Medical Informatics (EFMI)
Thomas Wiegand: ITU/WHO Focus Group on Artificial Intelligence for Health (FG-AI4H); Fraunhofer Heinrich Hertz Institute, Berlin, Germany; Berlin Institute of Technology, Berlin, Germany
Maxim Topaz: International Medical Informatics Association, Student and Emerging Professionals Special Interest Group; School of Nursing, Department of Biomedical Informatics, Data Science Institute, Columbia University, New York, NY, USA

Koleck TA, Tatonetti NP, Bakken S, Mitha S, Henderson MM, George M, Miaskowski C, Smaldone A, Topaz M. Identifying Symptom Information in Clinical Notes Using Natural Language Processing. Nurs Res 2021;70:173-183. [PMID: 33196504 PMCID: PMC9109773 DOI: 10.1097/nnr.0000000000000488] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/22/2022]

Abstract

BACKGROUND

Symptoms are a core concept of nursing interest. Large-scale secondary data reuse of notes in electronic health records (EHRs) has the potential to increase the quantity and quality of symptom research. However, the symptom language used in clinical notes is complex. A need exists for methods designed specifically to identify and study symptom information from EHR notes.

OBJECTIVES

We aim to describe a method that combines standardized vocabularies, clinical expertise, and natural language processing to generate comprehensive symptom vocabularies and identify symptom information in EHR notes. We piloted this method with five diverse symptom concepts: constipation, depressed mood, disturbed sleep, fatigue, and palpitations.

METHODS

First, we obtained synonym lists for each pilot symptom concept from the Unified Medical Language System. Then, we used two large bodies of text (clinical notes from Columbia University Irving Medical Center and PubMed abstracts containing Medical Subject Headings or key words related to the pilot symptoms) to further expand our initial vocabulary of synonyms for each pilot symptom concept. We used NimbleMiner, an open-source natural language processing tool, to accomplish these tasks and evaluated NimbleMiner symptom identification performance by comparison to a manually annotated set of nurse- and physician-authored common EHR note types.

RESULTS

Compared to the baseline Unified Medical Language System synonym lists, we identified up to 11 times more additional synonym words or expressions, including abbreviations, misspellings, and unique multiword combinations, for each symptom concept. Natural language processing system symptom identification performance was excellent.

DISCUSSION

Using our comprehensive symptom vocabularies and NimbleMiner to label symptoms in clinical notes produced excellent performance metrics. The ability to extract symptom information from EHR notes in an accurate and scalable manner has the potential to greatly facilitate symptom science research.

Kulshrestha S, Dligach D, Joyce C, Gonzalez R, O'Rourke AP, Glazer JM, Stey A, Kruser JM, Churpek MM, Afshar M. Comparison and interpretability of machine learning models to predict severity of chest injury. JAMIA Open 2021;4:ooab015. [PMID: 33709067 PMCID: PMC7935500 DOI: 10.1093/jamiaopen/ooab015] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/18/2020] [Revised: 02/08/2021] [Accepted: 02/12/2021] [Indexed: 11/15/2022] Open

Parikh S, Davoudi A, Yu S, Giraldo C, Schriver E, Mowery D. Lexicon Development for COVID-19-related Concepts Using Open-source Word Embedding Sources: An Intrinsic and Extrinsic Evaluation. JMIR Med Inform 2021;9:e21679. [PMID: 33544689 PMCID: PMC7901592 DOI: 10.2196/21679] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/21/2020] [Revised: 09/20/2020] [Accepted: 01/31/2021] [Indexed: 11/13/2022] Open

Abstract

BACKGROUND

Scientists are developing new computational methods and prediction models to better clinically understand COVID-19 prevalence, treatment efficacy, and patient outcomes. These efforts could be improved by leveraging documented COVID-19-related symptoms, findings, and disorders from clinical text sources in an electronic health record. Word embeddings can identify terms related to these clinical concepts from both the biomedical and nonbiomedical domains, and are being shared with the open-source community at large. However, it's unclear how useful openly available word embeddings are for developing lexicons for COVID-19-related concepts.

OBJECTIVE

Given an initial lexicon of COVID-19-related terms, this study aims to characterize the returned terms by similarity across various open-source word embeddings and determine common semantic and syntactic patterns between the COVID-19 queried terms and returned terms specific to the word embedding source.

METHODS

We compared seven openly available word embedding sources. Using a series of COVID-19-related terms for associated symptoms, findings, and disorders, we conducted an interannotator agreement study to determine how accurately the most similar returned terms could be classified according to semantic types by three annotators. We conducted a qualitative study of COVID-19 queried terms and their returned terms to detect informative patterns for constructing lexicons. We demonstrated the utility of applying such learned synonyms to discharge summaries by reporting the proportion of patients identified by concept among three patient cohorts: pneumonia (n=6410), acute respiratory distress syndrome (n=8647), and COVID-19 (n=2397).

RESULTS

We observed high pairwise interannotator agreement (Cohen kappa) for symptoms (0.86-0.99), findings (0.93-0.99), and disorders (0.93-0.99). Word embedding sources generated based on characters tend to return more synonyms (mean count of 7.2 synonyms) compared to token-based embedding sources (mean counts range from 2.0 to 3.4). Word embedding sources queried using a qualifier term (eg, dry cough or muscle pain) more often returned qualifiers of a similar semantic type (eg, "dry" returns consistency qualifiers like "wet" and "runny") compared to single-term queries (eg, cough or pain). A higher proportion of patients had documented fever (0.61-0.84), cough (0.41-0.55), shortness of breath (0.40-0.59), and hypoxia (0.51-0.56) retrieved than other clinical features. Terms for dry cough returned a higher proportion of patients with COVID-19 (0.07) than the pneumonia (0.05) and acute respiratory distress syndrome (0.03) populations.

CONCLUSIONS

Word embeddings are valuable technology for learning related terms, including synonyms. When leveraging openly available word embedding sources, choices made for the construction of the word embeddings can significantly influence the words learned.

Home Healthcare Clinical Notes Predict Patient Hospitalization and Emergency Department Visits. Nurs Res 2021;69:448-454. [PMID: 32852359 DOI: 10.1097/nnr.0000000000000470] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/02/2023]

Woo K, Adams V, Wilson P, Fu LH, Cato K, Rossetti SC, McDonald M, Shang J, Topaz M. Identifying Urinary Tract Infection-Related Information in Home Care Nursing Notes. J Am Med Dir Assoc 2021;22:1015-1021.e2. [PMID: 33434568 PMCID: PMC8106637 DOI: 10.1016/j.jamda.2020.12.010] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/22/2019] [Revised: 07/28/2020] [Accepted: 12/06/2020] [Indexed: 12/12/2022]

Abstract

Objectives:

Urinary tract infection (UTI) is common in home care but not easily captured with standard assessment. This study aimed to examine the value of nursing notes in detecting UTI signs and symptoms in home care.

Design:

The study developed a natural language processing (NLP) algorithm to automatically identify UTI-related information in nursing notes.

Setting and Participants:

Home care visit notes (n = 1,149,586) and care coordination notes (n = 1,461,171) for 89,459 patients treated in the largest nonprofit home care agency in the United States during 2014.

Measures:

We generated 6 categories of UTI-related information from literature and used the Unified Medical Language System (UMLS) to identify a preliminary list of terms. The NLP algorithm was tested on a gold standard set of 300 clinical notes annotated by clinical experts. We used structured Outcome and Assessment Information Set data to extract the frequency of UTI-related emergency department (ED) visits or hospitalizations and explored time-patterns in documentation of UTI-related information.

Results:

The NLP system achieved very good overall performance (F measure = 0.9, 95% CI: 0.87–0.93) based on the test results obtained by using the notes for patients admitted to the ED or hospital due to UTI. UTI-related information was significantly more prevalent (P < .01 for all the tests) in home care episodes with UTI-related ED admission or hospitalization vs the general patient population; 81% of home care episodes with UTI-related hospitalization or ED admission had at least 1 category of UTI-related information vs 21.6% among episodes without UTI-related hospitalization or ED admission. Frequency of UTI-related information documentation increased in advance of UTI-related hospitalization or ED admission, peaking within a few days before the event.

Conclusions and Implications:

Information in nursing notes is often overlooked by stakeholders and not integrated into predictive modeling for decision-making support, but our findings highlight their value in early risk identification and care guidance. Health care administrators should consider using NLP to extract clinical data from nursing notes to improve early detection and treatment, which may lead to quality improvement and cost reduction.

Sockolow PS, Bowles KH, Topaz M, Koru G, Hellesø R, O'Connor M, Bass EJ. The Time is Now: Informatics Research Opportunities in Home Health Care. Appl Clin Inform 2021;12:100-106. [PMID: 33598906 PMCID: PMC7889426 DOI: 10.1055/s-0040-1722222] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2020] [Accepted: 11/21/2020] [Indexed: 10/22/2022]

Topaz M, Koleck TA, Onorato N, Smaldone A, Bakken S. Nursing documentation of symptoms is associated with higher risk of emergency department visits and hospitalizations in homecare patients. Nurs Outlook 2020;69:435-446. [PMID: 33386145 DOI: 10.1016/j.outlook.2020.12.007] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 09/08/2020] [Revised: 10/23/2020] [Accepted: 12/11/2020] [Indexed: 10/22/2022]

Topaz M, Adams V, Wilson P, Woo K, Ryvicker M. Free-Text Documentation of Dementia Symptoms in Home Healthcare: A Natural Language Processing Study. Gerontol Geriatr Med 2020;6:2333721420959861. [PMID: 33029550 PMCID: PMC7520927 DOI: 10.1177/2333721420959861] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 05/30/2020] [Revised: 08/18/2020] [Accepted: 08/21/2020] [Indexed: 01/11/2023]

Sterckx L, Vandewiele G, Dehaene I, Janssens O, Ongenae F, De Backere F, De Turck F, Roelens K, Decruyenaere J, Van Hoecke S, Demeester T. Clinical information extraction for preterm birth risk prediction. J Biomed Inform 2020;110:103544. [PMID: 32858168 DOI: 10.1016/j.jbi.2020.103544] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/07/2020] [Revised: 08/18/2020] [Accepted: 08/20/2020] [Indexed: 10/23/2022]

Paulin J, Kurola J, Salanterä S, Moen H, Guragain N, Koivisto M, Käyhkö N, Aaltonen V, Iirola T. Changing role of EMS -analyses of non-conveyed and conveyed patients in Finland. Scand J Trauma Resusc Emerg Med 2020;28:45. [PMID: 32471460 PMCID: PMC7260794 DOI: 10.1186/s13049-020-00741-w] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Received: 03/02/2020] [Accepted: 05/20/2020] [Indexed: 12/16/2022]

Abstract

Background

Emergency Medical Services (EMS) and Emergency Departments (ED) have seen increasing attendance rates in the last decades. Currently, EMS are increasingly assessing and treating patients without the need to convey patients to a health care facility. The aim of this study was to describe and compare the patient case-mix between conveyed and non-conveyed patients and to analyze factors related to non-conveyance decision making.

Methods

This was a prospective study design of EMS patients in Finland, and data was collected between 1st June and 30th November 2018. Adjusted ICPC2-classification was used as the reason for care. NEWS2-points were collected and analyzed both statistically and with a semi-supervised information extraction method. EMS patients’ geographic location and distance to health care facilities were analyzed by urban–rural classification.

Results

Of the EMS patients (40,263), 59.8% were over 65 years of age and 46.0% of the patients had zero NEWS2 points. The most common ICPC2 code was weakness/tiredness, general (A04), as seen in 13.5% of all patients. When comparing patients between the non-conveyance and conveyance group, a total of 35,454 EMS patients met the inclusion criteria and 14,874 patients (42.0%) were not conveyed to health care facilities. According to the multivariable logistic regression model, the non-conveyance decision was more likely made by ALS units, when the EMS arrival time was in the evening or night and when the distance to the health care facility was 21-40 km. Furthermore, younger patients, female gender, whether the patient had used alcohol and a rural area were also related to the non-conveyance decision. If the patient’s NEWS2 score increased by one or two points, the likelihood of conveyance increased. When there was less than 1 h to complete a shift, this did not associate with either non-conveyance or conveyance decisions.

Conclusions

The role of EMS might be changing. This warrants redesigning the chain-of-survival in EMS to include not only high-risk patient groups but also non-critical and general acute patients with non-specific reasons for care. Assessment and on-scene treatment without conveyance can be called the “stretched arm of the emergency department”, but should be planned carefully to ensure patient safety.

Uronen L, Moen H, Teperi S, Martimo KP, Hartiala J, Salanterä S. Towards automated detection of psychosocial risk factors with text mining. Occup Med (Lond) 2020;70:203-206. [DOI: 10.1093/occmed/kqaa022] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 11/13/2022]

Abstract

Background

Psychosocial risk factors influence early retirement and absence from work. Health checks by occupational health nurses (OHNs) may prevent deterioration of work ability. Health checks are documented electronically mostly as free text, and therefore the effect of psychological risk factors on working capacity is difficult to detect.

Aims

To evaluate the potential of text mining for automated early detection of psychosocial risk factors by examining health check free-text documentation, which may indicate medical statements recommending early retirement, prolonged sick leave or rehabilitation. Psychosocial risk factors were extracted from OHN documentation in a nationwide occupational health care registry.

Methods

Analysis of health check documentation and medical statements regarding pension, sick leave and rehabilitation. Annotations of 13 psychosocial factors based on the Prima-EF standard (PAS 1010) were used with a combination of unsupervised machine learning, a document search engine and manual filtering.

Results

Health check documentation was analysed for 7078 employees. In 83% of their health checks, psychosocial risk factors were mentioned. All of these occurred more frequently in the group that received medical statements for pension, rehabilitation or sick leave than in the group that did not receive a medical statement. Documentation of career development and work control indicated future loss of work ability.

Conclusions

This study showed that it was possible to detect risk factors for sick leave, rehabilitation and pension from free-text documentation of health checks. It is suggested to develop a text mining tool to automate the detection of psychosocial risk factors at an early stage.