1
Gan S, Kim C, Chang J, Lee DY, Park RW. Enhancing readmission prediction models by integrating insights from home healthcare notes: Retrospective cohort study. Int J Nurs Stud 2024; 158:104850. [PMID: 39024965 DOI: 10.1016/j.ijnurstu.2024.104850] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2024] [Revised: 06/24/2024] [Accepted: 06/27/2024] [Indexed: 07/20/2024]
Abstract
BACKGROUND Hospital readmission is an important indicator of inpatient care quality and a significant driver of increasing medical costs. Therefore, it is important to explore the effects of postdischarge information, particularly from home healthcare notes, on enhancing readmission prediction models. Despite the use of Natural Language Processing (NLP) and machine learning in prediction model development, current studies often overlook insights from home healthcare notes. OBJECTIVE This study aimed to develop prediction models for 30-day readmissions using home healthcare notes and structured data. In addition, it explored the development of 14- and 180-day prediction models using variables in the 30-day model. DESIGN A retrospective observational cohort study. SETTING(S) This study was conducted at Ajou University School of Medicine in South Korea. PARTICIPANTS Data from electronic health records, encompassing demographic characteristics of 1819 participants, along with information on conditions, drugs, and home healthcare, were utilized. METHODS Two distinct models were developed for each prediction window (30-, 14-, 180-day): the traditional model, which utilized structured variables alone, and the common data model (CDM)-NLP model, which incorporated structured and topic variables extracted from home healthcare notes. BERTopic facilitated topic generation and risk probability, representing the likelihood of documents being assigned to specific topics. Feature selection involved experimenting with various algorithms. The best-performing algorithm, determined using the area under the receiver operating characteristic curve (AUROC), was used for model development. Model performance was assessed using various learning metrics including AUROC. RESULTS Among 1819 patients, 251 (13.80%) experienced 30-day readmission. The least absolute shrinkage and selection operator was used for feature extraction and model development. The 15 structured features were used in the traditional model. Moreover, five additional topic variables from the home healthcare notes were applied in the CDM-NLP model. The AUROC of the traditional model was 0.739 (95% CI: 0.672-0.807). The AUROC of the CDM-NLP model was high at 0.824 (95% CI: 0.768-0.880), which indicated an outstanding performance. The topics in the CDM-NLP model included emotional distress, daily living functions, nutrition, postoperative status, and cardiorespiratory issues. In extended prediction model development for 14- and 180-day readmissions, the CDM-NLP consistently outperformed the traditional model. CONCLUSIONS This study developed effective prediction models using both structured and unstructured data, thereby emphasizing the significance of postdischarge information from home healthcare notes in readmission prediction.
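The AUROC values reported in this abstract can be read as the probability that a randomly chosen readmitted patient receives a higher predicted risk than a randomly chosen non-readmitted patient. A minimal sketch of that rank-based definition follows; the function name `auroc` and the toy labels and scores are hypothetical and are not taken from the study.

```python
def auroc(labels, scores):
    """AUROC as the chance that a random positive outranks a random negative.

    Ties between a positive and a negative count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = readmitted within 30 days, 0 = not readmitted.
toy_labels = [1, 0, 1, 0, 0, 1]
toy_scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8]
toy_auroc = auroc(toy_labels, toy_scores)
print(f"toy AUROC = {toy_auroc:.3f}")
```

The pairwise loop is quadratic in the number of cases, which is acceptable for illustration; larger cohorts would normally use a sort-based computation.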
Affiliation(s)
- Sujin Gan
- Department of Biomedical Sciences, Ajou University Graduate School of Medicine, Suwon, Gyeonggi-do, Republic of Korea.
- Chungsoo Kim
- Section of Cardiovascular Medicine, Department of Internal Medicine, Yale University School of Medicine, New Haven, CT, USA
- Junhyuck Chang
- Department of Biomedical Sciences, Ajou University Graduate School of Medicine, Suwon, Gyeonggi-do, Republic of Korea
- Dong Yun Lee
- Department of Biomedical Informatics, Ajou University School of Medicine, Suwon, Gyeonggi-do, Republic of Korea
- Rae Woong Park
- Department of Biomedical Sciences, Ajou University Graduate School of Medicine, Suwon, Gyeonggi-do, Republic of Korea; Department of Biomedical Informatics, Ajou University School of Medicine, Suwon, Gyeonggi-do, Republic of Korea.
2
Albashayreh A, Bandyopadhyay A, Zeinali N, Zhang M, Fan W, Gilbertson White S. Natural Language Processing Accurately Differentiates Cancer Symptom Information in Electronic Health Record Narratives. JCO Clin Cancer Inform 2024; 8:e2300235. [PMID: 39116379 DOI: 10.1200/cci.23.00235] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2023] [Revised: 04/29/2024] [Accepted: 05/30/2024] [Indexed: 08/10/2024] Open
Abstract
PURPOSE Identifying cancer symptoms in electronic health record (EHR) narratives is feasible with natural language processing (NLP). However, more efficient NLP systems are needed to detect various symptoms and distinguish observed symptoms from negated symptoms and medication-related side effects. We evaluated the accuracy of NLP in (1) detecting 14 symptom groups (ie, pain, fatigue, swelling, depressed mood, anxiety, nausea/vomiting, pruritus, headache, shortness of breath, constipation, numbness/tingling, decreased appetite, impaired memory, disturbed sleep) and (2) distinguishing observed symptoms in EHR narratives among patients with cancer. METHODS We extracted 902,508 notes for 11,784 unique patients diagnosed with cancer and developed a gold standard corpus of 1,112 notes labeled for presence or absence of 14 symptom groups. We trained an embeddings-augmented NLP system integrating human and machine intelligence and conventional machine learning algorithms. NLP metrics were calculated on a gold standard corpus subset for testing. RESULTS The interannotator agreement for labeling the gold standard corpus was excellent at 92%. The embeddings-augmented NLP model achieved the best performance (F1 score = 0.877). The highest NLP accuracy was observed in pruritus (F1 score = 0.937) while the lowest accuracy was in swelling (F1 score = 0.787). After classifying the entire data set with embeddings-augmented NLP, we found that 41% of the notes included symptom documentation. Pain was the most documented symptom (29% of all notes) while impaired memory was the least documented (0.7% of all notes). CONCLUSION We illustrated the feasibility of detecting 14 symptom groups in EHR narratives and showed that an embeddings-augmented NLP system outperforms conventional machine learning algorithms in detecting symptom information and differentiating observed symptoms from negated symptoms and medication-related side effects.
Affiliation(s)
- Min Zhang
- School of Economics and Management, Communication University of China, Beijing, China
- Weiguo Fan
- Tippie College of Business, University of Iowa, Iowa City, IA
3
Park J, Ahn H. Translating innovative technology-based interventions into nursing practice. Res Nurs Health 2024; 47:366-367. [PMID: 38752681 DOI: 10.1002/nur.22392] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/07/2024] [Accepted: 05/08/2024] [Indexed: 07/11/2024]
Affiliation(s)
- Juyoung Park
- College of Nursing, The University of Arizona, Tucson, Arizona, USA
- Hyochol Ahn
- College of Nursing, The University of Arizona, Tucson, Arizona, USA
4
Zeinali N, Albashayreh A, Fan W, White SG. Symptom-BERT: Enhancing Cancer Symptom Detection in EHR Clinical Notes. J Pain Symptom Manage 2024; 68:190-198.e1. [PMID: 38789092 DOI: 10.1016/j.jpainsymman.2024.05.015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/05/2024] [Revised: 05/08/2024] [Accepted: 05/14/2024] [Indexed: 05/26/2024]
Abstract
CONTEXT Extracting cancer symptom documentation allows clinicians to develop highly individualized symptom prediction algorithms to deliver symptom management care. Leveraging advanced language models to detect symptom data in clinical narratives can significantly enhance this process. OBJECTIVE This study uses a pretrained large language model to detect and extract cancer symptoms in clinical notes. METHODS We developed a pretrained language model to identify cancer symptoms in clinical notes based on a clinical corpus from the Enterprise Data Warehouse for Research at a healthcare system in the Midwestern United States. This study was conducted in 4 phases: (1) pretraining a Bio-Clinical BERT model on one million unlabeled clinical documents, (2) fine-tuning Symptom-BERT for detecting 13 cancer symptom groups within 1112 annotated clinical notes, (3) generating 180 synthetic clinical notes using ChatGPT-4 for external validation, and (4) comparing the internal and external performance of Symptom-BERT against a non-pretrained version and six other BERT implementations. RESULTS The Symptom-BERT model effectively detected cancer symptoms in clinical notes. It achieved results with a micro-averaged F1-score of 0.933, an AUC of 0.929 internally, and 0.831 and 0.834 externally. Our analysis shows that physical symptoms, like pruritus, are typically identified with higher performance than psychological symptoms, such as anxiety. CONCLUSION This study underscores the transformative potential of specialized pretraining on domain-specific data in boosting the performance of language models for medical applications. The Symptom-BERT model's exceptional efficacy in detecting cancer symptoms heralds a groundbreaking stride in patient-centered AI technologies, offering a promising path to elevate symptom management and cultivate superior patient self-care outcomes.
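A micro-averaged F1-score, as reported in this abstract, pools true positives, false positives, and false negatives across all symptom groups before computing a single score. The sketch below illustrates that pooling on multi-label data; the function name `micro_f1` and the toy notes are hypothetical and are not drawn from the study.

```python
def micro_f1(true_sets, predicted_sets):
    """Micro-averaged F1 over multi-label examples.

    Each element of true_sets and predicted_sets is the set of labels
    attached to one note. Counts are pooled across all labels and notes.
    """
    tp = fp = fn = 0
    for truth, pred in zip(true_sets, predicted_sets):
        tp += len(truth & pred)   # labels correctly predicted
        fp += len(pred - truth)   # labels predicted but not annotated
        fn += len(truth - pred)   # labels annotated but missed
    if tp == 0:
        return 0.0
    return 2 * tp / (2 * tp + fp + fn)


# Hypothetical annotations for three notes (the third has no symptoms).
toy_truth = [{"pain", "fatigue"}, {"pruritus"}, set()]
toy_pred = [{"pain"}, {"pruritus", "anxiety"}, set()]
toy_micro_f1 = micro_f1(toy_truth, toy_pred)
print(f"toy micro-F1 = {toy_micro_f1:.3f}")
```

Because the counts are pooled, frequently documented symptoms weigh more heavily in the micro average than rare ones; a macro average would instead weight every symptom group equally.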
Affiliation(s)
- Nahid Zeinali
- Department of Computer Science and Informatics (N.Z.), University of Iowa, Iowa, USA.
- Alaa Albashayreh
- College of Nursing (A.A., S.G.W.), University of Iowa, Iowa, USA
- Weiguo Fan
- Department of Business Analytics (W.F.), University of Iowa, Iowa, USA
5
Song J, Topaz M, Landau AY, Klitzman RL, Shang J, Stone PW, McDonald MV, Cohen B. Natural Language Processing to Identify Home Health Care Patients at Risk for Becoming Incapacitated With No Evident Advance Directives or Surrogates. J Am Med Dir Assoc 2024; 25:105019. [PMID: 38754475 DOI: 10.1016/j.jamda.2024.105019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2023] [Revised: 04/01/2024] [Accepted: 04/02/2024] [Indexed: 05/18/2024]
Abstract
OBJECTIVES Home health care patients who are at risk for becoming Incapacitated with No Evident Advance Directives or Surrogates (INEADS) may benefit from timely intervention to assist them with advance care planning. This study aimed to develop natural language processing algorithms for identifying home care patients who do not have advance directives, family members, or close social contacts who can serve as surrogate decision-makers in the event that they lose decisional capacity. DESIGN Cross-sectional study of electronic health records. SETTING AND PARTICIPANTS Patients receiving post-acute care discharge services from a large home health agency in New York City in 2019 (n = 45,390 enrollment episodes). METHODS We developed a natural language processing algorithm for identifying information documented in free-text clinical notes (n = 1,429,030 notes) related to 4 categories: evidence of close relationships, evidence of advance directives, evidence suggesting lack of close relationships, and evidence suggesting lack of advance directives. We validated the algorithm against Gold Standard clinician review for 50 patients (n = 314 notes) to calculate precision, recall, and F-score. RESULTS Algorithm performance for identifying text related to the 4 categories was excellent (average F-score = 0.91), with the best results for "evidence of close relationships" (F-score = 0.99) and the worst results for "evidence of advance directives" (F-score = 0.86). The algorithm identified 22% of all clinical notes (313,290 of 1,429,030) as having text related to 1 or more categories. More than 98% of enrollment episodes (48,164 of 49,141) included at least 1 clinical note containing text related to 1 or more categories. CONCLUSIONS AND IMPLICATIONS This study establishes the feasibility of creating an automated screening algorithm to aid home health care agencies with identifying patients at risk of becoming INEADS. 
This screening algorithm can be applied as part of a multipronged approach to facilitate clinician support for advance care planning with patients at risk of becoming INEADS.
Affiliation(s)
- Jiyoun Song
- Department of Biobehavioral Health Sciences, University of Pennsylvania School of Nursing, Philadelphia, PA, USA
- Maxim Topaz
- Columbia University School of Nursing, New York, NY, USA; Data Science Institute, Columbia University, New York, NY, USA; Center for Home Care Policy & Research, VNS Health, New York, NY, USA
- Aviv Y Landau
- School of Social Policy and Practice, University of Pennsylvania, Philadelphia, PA, USA
- Robert L Klitzman
- Columbia University College of Physicians and Surgeons, New York, NY, USA; Columbia University Joseph Mailman School of Public Health, New York, NY, USA
- Jingjing Shang
- Columbia University School of Nursing, New York, NY, USA
- Bevin Cohen
- Brookdale Department of Geriatrics and Palliative Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA; Center for Nursing Research and Innovation, Mount Sinai Health System, New York, NY, USA.
6
Osman M, Cooper R, Sayer AA, Witham MD. The use of natural language processing for the identification of ageing syndromes including sarcopenia, frailty and falls in electronic healthcare records: a systematic review. Age Ageing 2024; 53:afae135. [PMID: 38970549 PMCID: PMC11227113 DOI: 10.1093/ageing/afae135] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2023] [Indexed: 07/08/2024] Open
Abstract
BACKGROUND Recording and coding of ageing syndromes in hospital records is known to be suboptimal. Natural Language Processing algorithms may be useful to identify diagnoses in electronic healthcare records to improve the recording and coding of these ageing syndromes, but the feasibility and diagnostic accuracy of such algorithms are unclear. METHODS We conducted a systematic review according to a predefined protocol and in line with Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) guidelines. Searches were run from the inception of each database to the end of September 2023 in PubMed, Medline, Embase, CINAHL, ACM digital library, IEEE Xplore and Scopus. Eligible studies were identified via independent review of search results by two coauthors and data extracted from each study to identify the computational method, source of text, testing strategy and performance metrics. Data were synthesised narratively by ageing syndrome and computational method in line with the Studies Without Meta-analysis guidelines. RESULTS From 1030 titles screened, 22 studies were eligible for inclusion. One study focussed on identifying sarcopenia, one frailty, twelve falls, five delirium, five dementia and four incontinence. Sensitivity (57.1%-100%) of algorithms compared with a reference standard was reported in 20 studies, and specificity (84.0%-100%) was reported in only 12 studies. Study design quality was variable with results relevant to diagnostic accuracy not always reported, and few studies undertaking external validation of algorithms. CONCLUSIONS Current evidence suggests that Natural Language Processing algorithms can identify ageing syndromes in electronic health records. However, algorithms require testing in rigorously designed diagnostic accuracy studies with appropriate metrics reported.
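The review notes that sensitivity was reported far more often than specificity. Both follow from the same confusion-matrix counts against a reference standard, as the following sketch shows; the function name and the counts are hypothetical and do not come from any included study.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    sensitivity = TP / (TP + FN): share of true cases the algorithm finds.
    specificity = TN / (TN + FP): share of non-cases it correctly rejects.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one case and one non-case")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical validation sample: 10 records with a fall, 50 without.
toy_sens, toy_spec = sensitivity_specificity(tp=8, fn=2, tn=45, fp=5)
print(f"sensitivity = {toy_sens:.1%}, specificity = {toy_spec:.1%}")
```

Reporting sensitivity alone leaves the false-positive burden unknown, which is one reason the two metrics are usually presented together in diagnostic accuracy work.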
Affiliation(s)
- Mo Osman
- AGE Research Group, Translational and Clinical Research Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne, UK
- NIHR Newcastle Biomedical Research Centre, Newcastle upon Tyne NHS Foundation Trust, Cumbria Northumberland Tyne and Wear NHS Foundation Trust and Newcastle University, Newcastle upon Tyne, UK
- Rachel Cooper
- AGE Research Group, Translational and Clinical Research Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne, UK
- NIHR Newcastle Biomedical Research Centre, Newcastle upon Tyne NHS Foundation Trust, Cumbria Northumberland Tyne and Wear NHS Foundation Trust and Newcastle University, Newcastle upon Tyne, UK
- Avan A Sayer
- AGE Research Group, Translational and Clinical Research Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne, UK
- NIHR Newcastle Biomedical Research Centre, Newcastle upon Tyne NHS Foundation Trust, Cumbria Northumberland Tyne and Wear NHS Foundation Trust and Newcastle University, Newcastle upon Tyne, UK
- Miles D Witham
- AGE Research Group, Translational and Clinical Research Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne, UK
- NIHR Newcastle Biomedical Research Centre, Newcastle upon Tyne NHS Foundation Trust, Cumbria Northumberland Tyne and Wear NHS Foundation Trust and Newcastle University, Newcastle upon Tyne, UK
7
Park JI, Park JW, Zhang K, Kim D. Advancing equity in breast cancer care: natural language processing for analysing treatment outcomes in under-represented populations. BMJ Health Care Inform 2024; 31:e100966. [PMID: 38955389 PMCID: PMC11218025 DOI: 10.1136/bmjhci-2023-100966] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2023] [Accepted: 06/21/2024] [Indexed: 07/04/2024] Open
Abstract
OBJECTIVE The study aimed to develop natural language processing (NLP) algorithms to automate extracting patient-centred breast cancer treatment outcomes from clinical notes in electronic health records (EHRs), particularly for women from under-represented populations. METHODS The study used clinical notes from 2010 to 2021 from a tertiary hospital in the USA. The notes were processed through various NLP techniques, including vectorisation methods (term frequency-inverse document frequency (TF-IDF), Word2Vec, Doc2Vec) and classification models (support vector classification, K-nearest neighbours (KNN), random forest (RF)). Feature selection and optimisation through random search and fivefold cross-validation were also conducted. RESULTS The study annotated 100 out of 1000 clinical notes, using 970 notes to build the text corpus. TF-IDF and Doc2Vec combined with RF showed the highest performance, while Word2Vec was less effective. The RF classifier demonstrated the best performance, although with lower recall rates, suggesting more false negatives. KNN showed lower recall due to its sensitivity to data noise. DISCUSSION The study highlights the significance of using NLP in analysing clinical notes to understand breast cancer treatment outcomes in under-represented populations. The TF-IDF and Doc2Vec models were more effective in capturing relevant information than Word2Vec. The study observed lower recall rates in RF models, attributed to the dataset's imbalanced nature and the complexity of clinical notes. CONCLUSION The study developed a high-performing NLP pipeline to capture treatment outcomes for breast cancer in under-represented populations, demonstrating the importance of document-level vectorisation and ensemble methods in clinical notes analysis. The findings provide insights for more equitable healthcare strategies and show the potential for broader NLP applications in clinical settings.
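TF-IDF, one of the vectorisation methods named in this abstract, weights a term by how often it occurs in a note and how rare it is across the corpus. The sketch below uses one common textbook variant (relative term frequency times the natural log of inverse document frequency); the abstract does not state which variant the study used, library implementations often apply smoothing, and the function name and toy notes are hypothetical.

```python
import math
from collections import Counter


def tfidf(documents):
    """Return one {term: weight} dict per document.

    weight = (term count / document length) * ln(N / document frequency)
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))  # count each term once per document
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical, non-clinical toy notes.
toy_notes = [
    "patient reports fatigue",
    "patient denies pain",
    "patient reports pain after surgery",
]
toy_vectors = tfidf(toy_notes)
print(toy_vectors[0])
```

A term that appears in every note, such as "patient" here, receives a weight of zero under this variant, which is how the scheme pushes uninformative vocabulary out of the feature space.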
Affiliation(s)
- Jung In Park
- University of California Irvine, Irvine, California, USA
- Jong Won Park
- Yonsei Cancer Center, Yonsei University College of Medicine, Seoul, South Korea
- Kexin Zhang
- Donald Bren School of Information & Computer Sciences, University of California Irvine, Irvine, California, USA
- Doyop Kim
- Independent Researcher, Irvine, California, USA
8
Miller M, Jorm L, Partyka C, Burns B, Habig K, Oh C, Immens S, Ballard N, Gallego B. Identifying prehospital trauma patients from ambulance patient care records; comparing two methods using linked data in New South Wales, Australia. Injury 2024; 55:111570. [PMID: 38664086 DOI: 10.1016/j.injury.2024.111570] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2024] [Revised: 04/11/2024] [Accepted: 04/14/2024] [Indexed: 06/16/2024]
Abstract
BACKGROUND Linked datasets for trauma system monitoring should ideally follow patients from the prehospital scene to hospital admission and post-discharge. Having a well-defined cohort when using administrative datasets is essential because they must capture the representative population. Unlike hospital electronic health records (EHR), ambulance patient-care records lack access to sources beyond immediate clinical notes. Relying on a limited set of variables to define a study population might result in missed patient inclusion. We aimed to compare two methods of identifying prehospital trauma patients: one using only those documented under a trauma protocol and another incorporating additional data elements from ambulance patient care records. METHODS We analyzed data from six routinely collected administrative datasets from 2015 to 2018, including ambulance patient-care records, aeromedical data, emergency department visits, hospitalizations, rehabilitation outcomes, and death records. Three prehospital trauma cohorts were created: an Extended-T-protocol cohort (patients transported under a trauma protocol and/or patients with prespecified criteria from structured data fields), T-protocol cohort (only patients documented as transported under a trauma protocol) and non-T-protocol (extended-T-protocol population not in the T-protocol cohort). Patient-encounter characteristics, mortality, clinical and post-hospital discharge outcomes were compared. A conservative p-value of 0.01 was considered significant. RESULTS Of 1 038 263 patient-encounters included in the extended-T-population, 814 729 (78.5%) were transported, with 438 893 (53.9%) documented as a T-protocol patient. Half (49.6%) of the non-T-protocol sub-cohort had an International Classification of Diseases, 10th edition injury or external cause code, indicating 79 644 missed patients when a T-protocol-only definition was used. The non-T-protocol sub-cohort also identified additional patients with intubation, prehospital blood transfusion and positive eFAST. A higher proportion of non-T-protocol patients than T-protocol patients were admitted to the ICU (4.6% vs 3.6%), ventilated (1.8% vs 1.3%), received in-hospital transfusion (7.9% vs 6.8%) or died (1.8% vs 1.3%). Urgent trauma surgery was similar between groups (1.3% vs 1.4%). CONCLUSION The extended-T-population definition identified 50% more admitted patients with an ICD-10-AM code consistent with an injury, including patients with severe trauma. Developing an EHR phenotype incorporating multiple data fields of ambulance-transported trauma patients for use with linked data may avoid missing these patients.
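The two headline proportions in the results can be recomputed from the counts given in the abstract, which also makes the denominators explicit: the transported share is taken over all included encounters, and the T-protocol share over transported encounters. The helper name `percent` is hypothetical; the counts are those quoted in the abstract.

```python
def percent(part, whole, digits=1):
    """Return part/whole as a percentage rounded to `digits` decimals."""
    return round(100 * part / whole, digits)


# Counts quoted in the abstract.
included = 1_038_263     # patient-encounters in the extended-T-population
transported = 814_729    # of those, transported
t_protocol = 438_893     # of transported, documented as T-protocol

transported_pct = percent(transported, included)
t_protocol_pct = percent(t_protocol, transported)
print(transported_pct, t_protocol_pct)
```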
Affiliation(s)
- Matthew Miller
- Aeromedical Operations, New South Wales Ambulance, Rozelle, NSW 2039, Australia; Department of Anesthesia, St George Hospital, Kogarah, NSW 2217 Australia; Centre for Big Data Research in Health at UNSW Sydney, Kensington, NSW 2052, Australia.
- Louisa Jorm
- Foundation Director of the Centre for Big Data Research in Health at UNSW Sydney, Kensington 2052, Australia
- Chris Partyka
- Aeromedical Operations, New South Wales Ambulance, Rozelle, NSW 2039, Australia; Department of Emergency Medicine, Royal North Shore Hospital, St Leonards, NSW 2065, Australia
- Brian Burns
- Aeromedical Operations, New South Wales Ambulance, Rozelle, NSW 2039, Australia; Royal North Shore Hospital, St Leonards, NSW 2065, Australia; Faculty of Medicine & Health, University of Sydney, Camperdown, NSW 2050, Australia
- Karel Habig
- Aeromedical Operations, New South Wales Ambulance, Rozelle, NSW 2039, Australia
- Carissa Oh
- Aeromedical Operations, New South Wales Ambulance, Rozelle, NSW 2039, Australia; Department of Emergency Medicine, St George Hospital, Kogarah, NSW 2217 Australia
- Sam Immens
- Aeromedical Operations, New South Wales Ambulance, Rozelle, NSW 2039, Australia
- Neil Ballard
- Aeromedical Operations, New South Wales Ambulance, Rozelle, NSW 2039, Australia; Department of Paediatric Emergency Medicine, Sydney Children's Hospital, Randwick, NSW 2031, Australia; Department of Emergency Medicine, Royal Prince Alfred Hospital, Camperdown, NSW 2050, Australia
- Blanca Gallego
- Clinical analytics and machine learning unit, Centre for Big Data Research in Health at UNSW Sydney, Kensington 2052, Australia
9
Yin K, Xu W, Ren S, Xu Q, Zhang S, Zhang R, Jiang M, Zhang Y, Xu D, Li R. Machine Learning Accelerates De Novo Design of Antimicrobial Peptides. Interdiscip Sci 2024; 16:392-403. [PMID: 38416364 DOI: 10.1007/s12539-024-00612-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2023] [Revised: 01/17/2024] [Accepted: 01/23/2024] [Indexed: 02/29/2024]
Abstract
Efficient and precise design of antimicrobial peptides (AMPs) is of great importance in the field of AMP development. Computing provides opportunities for peptide de novo design. In the present investigation, a new machine learning-based AMP prediction model, AP_Sin, was trained using 1160 AMP sequences and 1160 non-AMP sequences. The results showed that AP_Sin correctly classified 94.61% of AMPs on a comprehensive dataset, outperforming the mainstream and open-source models (Antimicrobial Peptide Scanner vr.2, iAMPpred and AMPlify) and being effective in identifying AMPs. In addition, a peptide sequence generator, AP_Gen, was devised based on the concept of recombining dominant amino acids and dipeptide compositions. After inputting the parameters of the 71 tridecapeptides from the antimicrobial peptides database (APD3) into AP_Gen, a tridecapeptide bank consisting of 17,496 de novo designed tridecapeptide sequences was randomly generated, from which 2675 candidate AMP sequences were identified by AP_Sin. Chemical synthesis was performed on 180 randomly selected candidate AMP sequences, of which 18 showed high antimicrobial activities against a wide range of the tested pathogenic microorganisms, and 16 of which had a minimal inhibitory concentration of less than 10 μg/mL against at least one of the tested pathogenic microorganisms. The method established in this research accelerates the discovery of valuable candidate AMPs and provides a novel approach for de novo design of antimicrobial peptides.
Affiliation(s)
- Kedong Yin
- Key Laboratory of Functional Molecules for Biomedical Research, Henan University of Technology, 100 Lianhua Street, Zhengzhou, 450001, Henan, People's Republic of China
- College of Information Science and Engineering, Henan University of Technology, 100 Lianhua Street, Zhengzhou, 450001, Henan, People's Republic of China
- Wen Xu
- Key Laboratory of Functional Molecules for Biomedical Research, Henan University of Technology, 100 Lianhua Street, Zhengzhou, 450001, Henan, People's Republic of China.
- Law College, Henan University of Technology, Zhengzhou, 450001, Henan, People's Republic of China.
- Shiming Ren
- Key Laboratory of Functional Molecules for Biomedical Research, Henan University of Technology, 100 Lianhua Street, Zhengzhou, 450001, Henan, People's Republic of China
- College of Biological Engineering, Henan University of Technology, 100 Lianhua Street, Zhengzhou, 450001, Henan, People's Republic of China
- Qingpeng Xu
- Key Laboratory of Functional Molecules for Biomedical Research, Henan University of Technology, 100 Lianhua Street, Zhengzhou, 450001, Henan, People's Republic of China
- School of Artificial Intelligence and Big Data, Henan University of Technology, Zhengzhou, 450001, Henan, People's Republic of China
- Shaojie Zhang
- Key Laboratory of Functional Molecules for Biomedical Research, Henan University of Technology, 100 Lianhua Street, Zhengzhou, 450001, Henan, People's Republic of China
- College of Biological Engineering, Henan University of Technology, 100 Lianhua Street, Zhengzhou, 450001, Henan, People's Republic of China
- Ruiling Zhang
- Key Laboratory of Functional Molecules for Biomedical Research, Henan University of Technology, 100 Lianhua Street, Zhengzhou, 450001, Henan, People's Republic of China
- School of Economics and Trade, Henan University of Technology, Zhengzhou, 450001, Henan, People's Republic of China
- Mengwan Jiang
- School of Artificial Intelligence and Big Data, Henan University of Technology, Zhengzhou, 450001, Henan, People's Republic of China
- Yuhong Zhang
- School of Artificial Intelligence and Big Data, Henan University of Technology, Zhengzhou, 450001, Henan, People's Republic of China
- Degang Xu
- College of Information Science and Engineering, Henan University of Technology, 100 Lianhua Street, Zhengzhou, 450001, Henan, People's Republic of China.
- Ruifang Li
- Key Laboratory of Functional Molecules for Biomedical Research, Henan University of Technology, 100 Lianhua Street, Zhengzhou, 450001, Henan, People's Republic of China.
- College of Biological Engineering, Henan University of Technology, 100 Lianhua Street, Zhengzhou, 450001, Henan, People's Republic of China.
10
Colomer-Lahiguera S, Gentizon J, Christofis M, Darnac C, Serena A, Eicher M. Achieving Comprehensive, Patient-Centered Cancer Services: Optimizing the Role of Advanced Practice Nurses at the Core of Precision Health. Semin Oncol Nurs 2024; 40:151629. [PMID: 38584046 DOI: 10.1016/j.soncn.2024.151629] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/29/2024] [Revised: 03/11/2024] [Accepted: 03/13/2024] [Indexed: 04/09/2024]
Abstract
OBJECTIVES The field of oncology has been revolutionized by precision medicine, driven by advancements in molecular and genomic profiling. High-throughput genomic sequencing and non-invasive diagnostic methods have deepened our understanding of cancer biology, leading to personalized treatment approaches. Precision health expands on precision medicine, emphasizing holistic healthcare, integrating molecular profiling and genomics, physiology, behavioral, and social and environmental factors. Precision health encompasses traditional and emerging data, including electronic health records, patient-generated health data, and artificial intelligence-based health technologies. This article aims to explore the opportunities and challenges faced by advanced practice nurses (APNs) within the precision health paradigm. METHODS We searched for peer-reviewed and professional relevant studies and articles on advanced practice nursing, oncology, precision medicine and precision health, and symptom science. RESULTS APNs' roles and competencies align with the core principles of precision health, allowing for personalized interventions based on comprehensive patient characteristics. We identified educational needs and policy gaps as limitations faced by APNs in fully embracing precision health. CONCLUSION APNs, including nurse practitioners and clinical nurse specialists, are ideally positioned to advance precision health. Nevertheless, it is imperative to overcome a series of barriers to fully leverage APNs' potential in this context. IMPLICATIONS FOR NURSING PRACTICE APNs can significantly contribute to precision health through their competencies in predictive, preventive, and health promotion strategies, personalized and collaborative care plans, ethical considerations, and interdisciplinary collaboration. 
However, there is a need to foster education in genetics and genomics, encourage continuous professional development, and enhance understanding of artificial intelligence-related technologies and digital health. Furthermore, APNs' scope of practice needs to be reflected in policy making and legislation to enable effective contribution of APNs to precision health.
Affiliation(s)
- Sara Colomer-Lahiguera
  - Institute of Higher Education and Research in Healthcare, Faculty of Biology and Medicine, University of Lausanne and Lausanne University Hospital, Lausanne, Switzerland; Department of Oncology, Lausanne University Hospital, Lausanne, Switzerland
- Jenny Gentizon
  - Institute of Higher Education and Research in Healthcare, Faculty of Biology and Medicine, University of Lausanne and Lausanne University Hospital, Lausanne, Switzerland
- Melissa Christofis
  - Institute of Higher Education and Research in Healthcare, Faculty of Biology and Medicine, University of Lausanne and Lausanne University Hospital, Lausanne, Switzerland; Department of Oncology, Lausanne University Hospital, Lausanne, Switzerland
- Célia Darnac
  - Department of Oncology, Lausanne University Hospital, Lausanne, Switzerland
- Andrea Serena
  - Department of Oncology, Lausanne University Hospital, Lausanne, Switzerland
- Manuela Eicher
  - Institute of Higher Education and Research in Healthcare, Faculty of Biology and Medicine, University of Lausanne and Lausanne University Hospital, Lausanne, Switzerland; Department of Oncology, Lausanne University Hospital, Lausanne, Switzerland
|
11
|
Scharp D, Hobensack M, Davoudi A, Topaz M. Natural Language Processing Applied to Clinical Documentation in Post-acute Care Settings: A Scoping Review. J Am Med Dir Assoc 2024; 25:69-83. [PMID: 37838000 PMCID: PMC10792659 DOI: 10.1016/j.jamda.2023.09.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2023] [Revised: 09/05/2023] [Accepted: 09/07/2023] [Indexed: 10/16/2023]
Abstract
OBJECTIVES To determine the scope of the application of natural language processing to free-text clinical notes in post-acute care and provide a foundation for future natural language processing-based research in these settings. DESIGN Scoping review; reported according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews guidelines. SETTING AND PARTICIPANTS Post-acute care (ie, home health care, long-term care, skilled nursing facilities, and inpatient rehabilitation facilities). METHODS PubMed, Cumulative Index of Nursing and Allied Health Literature, and Embase were searched in February 2023. Eligible studies had quantitative designs that used natural language processing applied to clinical documentation in post-acute care settings. The quality of each study was appraised. RESULTS Twenty-one studies were included. Almost all studies were conducted in home health care settings. Most studies extracted data from electronic health records to examine the risk for negative outcomes, including acute care utilization, medication errors, and suicide mortality. About half of the studies did not report age, sex, race, or ethnicity data or use standardized terminologies. Only 8 studies included variables from socio-behavioral domains. Most studies fulfilled all quality appraisal indicators. CONCLUSIONS AND IMPLICATIONS The application of natural language processing is nascent in post-acute care settings. Future research should apply natural language processing using standardized terminologies to leverage free-text clinical notes in post-acute care to promote timely, comprehensive, and equitable care. Natural language processing could be integrated with predictive models to help identify patients who are at risk of negative outcomes. Future research should incorporate socio-behavioral determinants and diverse samples to improve health equity in informatics tools.
Affiliation(s)
- Anahita Davoudi
  - VNS Health, Center for Home Care Policy & Research, New York, NY, USA
- Maxim Topaz
  - Columbia University School of Nursing, New York, NY, USA
|
12
|
Trinh VQN, Zhang S, Kovoor J, Gupta A, Chan WO, Gilbert T, Bacchi S. The use of natural language processing in detecting and predicting falls within the healthcare setting: a systematic review. Int J Qual Health Care 2023; 35:mzad077. [PMID: 37758209 PMCID: PMC10585351 DOI: 10.1093/intqhc/mzad077] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2023] [Revised: 08/30/2023] [Accepted: 09/23/2023] [Indexed: 10/03/2023] Open
Abstract
Falls are a common problem associated with significant morbidity, mortality, and economic costs. Current fall prevention policies in local healthcare settings are often guided by information provided by fall risk assessment tools, incident reporting, and coding data. This review was conducted with the aim of identifying studies which utilized natural language processing (NLP) for the automated detection and prediction of falls in the healthcare setting. The databases Ovid Medline, Ovid Embase, Ovid Emcare, PubMed, CINAHL, IEEE Xplore, and Ei Compendex were searched from 2012 until April 2023. Retrospective derivation, validation, and implementation studies wherein patients experienced falls within a healthcare setting were identified for inclusion. The initial search yielded 2611 publications for title and abstract screening. Full-text screening was conducted on 105 publications, resulting in 26 unique studies that underwent qualitative analyses. Studies applied NLP towards falls risk factor identification, known falls detection, future falls prediction, and falls severity stratification with reasonable success. The NLP pipeline was reviewed in detail between studies and models utilizing rule-based, machine learning (ML), deep learning (DL), and hybrid approaches were examined. With a growing literature surrounding falls prediction in both inpatient and outpatient environments, the absence of studies examining the impact of these models on patient and system outcomes highlights the need for further implementation studies. Through an exploration of the application of NLP techniques, it may be possible to develop models with higher performance in automated falls prediction and detection.
Affiliation(s)
- Steven Zhang
  - University of Adelaide, Adelaide, South Australia 5005, Australia
- Joshua Kovoor
  - University of Adelaide, Adelaide, South Australia 5005, Australia
  - Queen Elizabeth Hospital, Adelaide, South Australia 5011, Australia
- Aashray Gupta
  - University of Adelaide, Adelaide, South Australia 5005, Australia
  - Gold Coast University Hospital, Gold Coast, Queensland 4215, Australia
- Weng Onn Chan
  - Queen Elizabeth Hospital, Adelaide, South Australia 5011, Australia
  - Discipline of Ophthalmology and Visual Sciences, The University of Adelaide, Adelaide, South Australia 5005, Australia
  - Royal Adelaide Hospital, Adelaide, South Australia 5000, Australia
- Toby Gilbert
  - University of Adelaide, Adelaide, South Australia 5005, Australia
  - Northern Adelaide Local Health Network, Adelaide, South Australia 5112, Australia
- Stephen Bacchi
  - Royal Adelaide Hospital, Adelaide, South Australia 5000, Australia
  - Flinders University, Adelaide, South Australia 5042, Australia
|
13
|
Mishra AK, Chappell MJ, Emerson S, Skubic M. Fall Risk Prediction in Older Adults Using Free-Text Nursing Notes and Medications in Electronic Health Records. Annual International Conference of the IEEE Engineering in Medicine and Biology Society 2023; 2023:1-4. [PMID: 38082830 DOI: 10.1109/embc40787.2023.10341127] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/18/2023]
Abstract
Nursing notes in Electronic Health Records (EHR) contain critical health information, including fall risk factors. However, an exploration of fall risk prediction using nursing notes is not well examined. In this study, we explored deep learning architectures to predict fall risk in older adults using text in nursing notes and medications in the EHR. EHR predictor data and fall events outcome data were obtained from 162 older adults living at TigerPlace, a senior living facility located in Columbia, MO. We used pre-trained BioWordVec embeddings to represent the words in the clinical notes and medications and trained multiple recurrent neural network-based natural language processing models to predict future fall events. Our final model predicted falls with an accuracy of 0.81, a sensitivity of 0.75, a specificity of 0.83, and an F1 score of 0.82. This preliminary exploratory analysis provides supporting evidence that fall risk can be predicted from clinical notes and medications. Future studies will utilize additional data modalities available in the EHR to potentially improve fall risk prediction from EHR data.
|
14
|
Mitha S, Schwartz J, Hobensack M, Cato K, Woo K, Smaldone A, Topaz M. Natural Language Processing of Nursing Notes: An Integrative Review. Comput Inform Nurs 2023; 41:377-384. [PMID: 36730744 DOI: 10.1097/cin.0000000000000967] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/04/2023]
Abstract
Natural language processing includes a variety of techniques that help to extract meaning from narrative data. In healthcare, medical natural language processing has been a growing field of study; however, little is known about its use in nursing. We searched PubMed, EMBASE, and CINAHL and found 689 studies, narrowed to 43 eligible studies using natural language processing in nursing notes. Data related to the study purpose, patient population, methodology, performance evaluation metrics, and quality indicators were extracted for each study. The majority (86%) of the studies were conducted from 2015 to 2021. Most of the studies (58%) used inpatient data. One of four studies used data from open-source databases. The most common standard terminologies used were the Unified Medical Language System and Systematized Nomenclature of Medicine, whereas nursing-specific standard terminologies were used only in eight studies. Full system performance metrics (eg, F score) were reported for 61% of applicable studies. The overall number of nursing natural language processing publications remains relatively small compared with the other medical literature. Future studies should evaluate and report appropriate performance metrics and use existing standard nursing terminologies to enable future scalability of the methods and findings.
Affiliation(s)
- Shazia Mitha
  - Columbia University School of Nursing, New York
|
15
|
Lituiev DS, Lacar B, Pak S, Abramowitsch PL, De Marchis EH, Peterson TA. Automatic extraction of social determinants of health from medical notes of chronic lower back pain patients. J Am Med Inform Assoc 2023:7133957. [PMID: 37080559 DOI: 10.1093/jamia/ocad054] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 12/05/2022] [Revised: 02/15/2023] [Accepted: 03/18/2023] [Indexed: 04/22/2023] Open
Abstract
OBJECTIVE We applied natural language processing and inference methods to extract social determinants of health (SDoH) information from clinical notes of patients with chronic low back pain (cLBP) to enhance future analyses of the associations between SDoH disparities and cLBP outcomes. MATERIALS AND METHODS Clinical notes for patients with cLBP were annotated for 7 SDoH domains, as well as depression, anxiety, and pain scores, resulting in 626 notes with at least one annotated entity for 364 patients. We used a 2-tier taxonomy with these 10 first-level classes (domains) and 52 second-level classes. We developed and validated named entity recognition (NER) systems based on both rule-based and machine learning approaches and validated an entailment model. RESULTS Annotators achieved a high interrater agreement (Cohen's kappa of 95.3% at document level). A rule-based system (cTAKES), RoBERTa NER, and a hybrid model (combining rules and logistic regression) achieved performance of F1 = 47.1%, 84.4%, and 80.3%, respectively, for first-level classes. DISCUSSION While the hybrid model had a lower F1 performance, it matched or outperformed RoBERTa NER model in terms of recall and had lower computational requirements. Applying an untuned RoBERTa entailment model, we detected many challenging wordings missed by NER systems. Still, the entailment model may be sensitive to hypothesis wording. CONCLUSION This study developed a corpus of annotated clinical notes covering a broad spectrum of SDoH classes. This corpus provides a basis for training machine learning models and serves as a benchmark for predictive models for NER for SDoH and knowledge extraction from clinical texts.
Affiliation(s)
- Dmytro S Lituiev
  - Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, California, USA
- Benjamin Lacar
  - Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, California, USA
  - Berkeley Institute for Data Science, University of California, Berkeley, California, USA
- Sang Pak
  - Department of Physical Therapy and Rehabilitation Science, University of California San Francisco, San Francisco, California, USA
- Peter L Abramowitsch
  - Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, California, USA
- Emilia H De Marchis
  - Department of Family & Community Medicine, University of California San Francisco, San Francisco, California, USA
- Thomas A Peterson
  - Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, California, USA
  - Department of Orthopaedic Surgery, University of California San Francisco, San Francisco, California, USA
|
16
|
Topaz M, Song J, Davoudi A, McDonald M, Taylor J, Sittig S, Bowles K. Home Health Care Clinicians' Use of Judgment Language for Black and Hispanic Patients: Natural Language Processing Study. JMIR Nurs 2023; 6:e42552. [PMID: 37067893 PMCID: PMC10152333 DOI: 10.2196/42552] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/08/2022] [Revised: 12/05/2022] [Accepted: 03/16/2023] [Indexed: 03/18/2023] Open
Abstract
BACKGROUND A clinician's biased behavior toward patients can affect the quality of care. Recent literature reviews report on widespread implicit biases among clinicians. Although emerging studies in hospital settings show racial biases in the language used in clinical documentation within electronic health records, no studies have yet investigated the extent of judgment language in home health care. OBJECTIVE We aimed to examine racial differences in judgment language use and the relationship between judgment language use and the amount of time clinicians spent on home visits as a reflection of care quality in home health care. METHODS This study is a retrospective observational cohort study. Study data were extracted from a large urban home health care organization in the Northeastern United States. Study data set included patients (N=45,384) who received home health care services between January 1 and December 31, 2019. The study applied a natural language processing algorithm to automatically detect the language of judgment in clinical notes. RESULTS The use of judgment language was observed in 38% (n=17,141) of the patients. The highest use of judgment language was found in Hispanic (7,167/66,282, 10.8% of all clinical notes), followed by Black (7,010/65,628, 10.7%), White (10,206/107,626, 9.5%), and Asian (1,756/22,548, 7.8%) patients. Black and Hispanic patients were 14% more likely to have notes with judgment language than White patients. The length of a home health care visit was reduced by 21 minutes when judgment language was used. CONCLUSIONS Racial differences were identified in judgment language use. When judgment language is used, clinicians spend less time at patients' homes. Because the language clinicians use in documentation is associated with the time spent providing care, further research is needed to study the impact of using judgment language on quality of home health care. 
Policy, education, and clinical practice improvements are needed to address the biases behind judgment language.
Affiliation(s)
- Maxim Topaz
  - Columbia University School of Nursing, New York, NY, United States
  - Data Science Institute, Columbia University, New York, NY, United States
  - Center for Home Care Policy & Research, VNS Health, New York, NY, United States
- Jiyoun Song
  - Columbia University School of Nursing, New York, NY, United States
- Anahita Davoudi
  - Center for Home Care Policy & Research, VNS Health, New York, NY, United States
- Margaret McDonald
  - Center for Home Care Policy & Research, VNS Health, New York, NY, United States
- Jacquelyn Taylor
  - Columbia University School of Nursing, New York, NY, United States
- Scott Sittig
  - Department of Health Sciences, University of Louisiana at Lafayette, Lafayette, LA, United States
- Kathryn Bowles
  - Center for Home Care Policy & Research, VNS Health, New York, NY, United States
  - Department of Biobehavioral Health Sciences, University of Pennsylvania School of Nursing, Philadelphia, PA, United States
|
17
|
Yang S, Varghese P, Stephenson E, Tu K, Gronsbell J. Machine learning approaches for electronic health records phenotyping: a methodical review. J Am Med Inform Assoc 2023; 30:367-381. [PMID: 36413056 PMCID: PMC9846699 DOI: 10.1093/jamia/ocac216] [Citation(s) in RCA: 30] [Impact Index Per Article: 30.0] [Received: 07/28/2022] [Revised: 09/27/2022] [Accepted: 10/27/2022] [Indexed: 11/23/2022] Open
Abstract
OBJECTIVE Accurate and rapid phenotyping is a prerequisite to leveraging electronic health records for biomedical research. While early phenotyping relied on rule-based algorithms curated by experts, machine learning (ML) approaches have emerged as an alternative to improve scalability across phenotypes and healthcare settings. This study evaluates ML-based phenotyping with respect to (1) the data sources used, (2) the phenotypes considered, (3) the methods applied, and (4) the reporting and evaluation methods used. MATERIALS AND METHODS We searched PubMed and Web of Science for articles published between 2018 and 2022. After screening 850 articles, we recorded 37 variables on 100 studies. RESULTS Most studies utilized data from a single institution and included information in clinical notes. Although chronic conditions were most commonly considered, ML also enabled the characterization of nuanced phenotypes such as social determinants of health. Supervised deep learning was the most popular ML paradigm, while semi-supervised and weakly supervised learning were applied to expedite algorithm development and unsupervised learning to facilitate phenotype discovery. ML approaches did not uniformly outperform rule-based algorithms, but deep learning offered a marginal improvement over traditional ML for many conditions. DISCUSSION Despite the progress in ML-based phenotyping, most articles focused on binary phenotypes and few articles evaluated external validity or used multi-institution data. Study settings were infrequently reported and analytic code was rarely released. CONCLUSION Continued research in ML-based phenotyping is warranted, with emphasis on characterizing nuanced phenotypes, establishing reporting and evaluation standards, and developing methods to accommodate misclassified phenotypes due to algorithm errors in downstream applications.
Affiliation(s)
- Siyue Yang
  - Department of Statistical Sciences, University of Toronto, Toronto, Ontario, Canada
- Ellen Stephenson
  - Department of Family & Community Medicine, University of Toronto, Toronto, Ontario, Canada
- Karen Tu
  - Department of Family & Community Medicine, University of Toronto, Toronto, Ontario, Canada
- Jessica Gronsbell
  - Department of Statistical Sciences, University of Toronto, Toronto, Ontario, Canada
  - Department of Family & Community Medicine, University of Toronto, Toronto, Ontario, Canada
  - Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
|
18
|
Bull NJ, Honan B, Spratt NJ, Quilty S. A method for rapid machine learning development for data mining with doctor-in-the-loop. PLoS One 2023; 18:e0284965. [PMID: 37163511 PMCID: PMC10171605 DOI: 10.1371/journal.pone.0284965] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2022] [Accepted: 04/13/2023] [Indexed: 05/12/2023] Open
Abstract
Classifying free text from historical databases into research-compatible formats is a barrier for clinicians undertaking audit and research projects. The aim of this study was to (a) develop an interactive active machine-learning model training methodology using readily available software that was (b) easily adaptable to a wide range of natural language databases and allowed customised researcher-defined categories, and then (c) evaluate the accuracy and speed of this model for classifying free text from two unique and unrelated sets of clinical notes into coded data. A user interface for medical experts to train and evaluate the algorithm was created. The data requiring coding comprised two independent databases of free-text clinical notes, each with a unique natural language structure. Medical experts defined categories relevant to research projects and performed 'label-train-evaluate' loops on the training data set. A separate dataset was used for validation, with the medical experts blinded to the label given by the algorithm. The first dataset was 32,034 death certificate records from Northern Territory Births Deaths and Marriages, which were coded into 3 categories: haemorrhagic stroke, ischaemic stroke or no stroke. The second dataset was 12,039 recorded episodes of aeromedical retrieval from two prehospital and retrieval services in Northern Territory, Australia, which were coded into 5 categories: medical, surgical, trauma, obstetric or psychiatric. For the first dataset, macro-accuracy of the algorithm was 94.7%. For the second dataset, macro-accuracy was 92.4%. The time taken to develop and train the algorithm was 124 minutes for the death certificate coding, and 144 minutes for the aeromedical retrieval coding. This machine-learning training method was able to classify free-text clinical notes quickly and accurately from two different health datasets into categories of relevance to clinicians undertaking health service research.
Affiliation(s)
- Neva J Bull
  - School of Psychological Sciences, University of Newcastle, Callaghan, NSW, Australia
  - Hunter Medical Research Institute, John Hunter Hospital, New Lambton Heights, NSW, Australia
- Bridget Honan
  - Alice Springs Hospital, Alice Springs, NT, Australia
- Neil J Spratt
  - Hunter Medical Research Institute, John Hunter Hospital, New Lambton Heights, NSW, Australia
  - School of Biomedical Sciences and Pharmacy, University of Newcastle, Callaghan, NSW, Australia
  - Department of Neurology, John Hunter Hospital, New Lambton Heights, NSW, Australia
- Simon Quilty
  - Alice Springs Hospital, Alice Springs, NT, Australia
  - National Centre of Epidemiology and Population Health, Australian National University, Canberra, ACT, Australia
|
19
|
Cusick M, Velupillai S, Downs J, Campion TR, Sholle ET, Dutta R, Pathak J. Portability of natural language processing methods to detect suicidality from clinical text in US and UK electronic health records. Journal of Affective Disorders Reports 2022; 10:100430. [PMID: 36644339 PMCID: PMC9835770 DOI: 10.1016/j.jadr.2022.100430] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/06/2022] Open
Abstract
Background In the global effort to prevent death by suicide, many academic medical institutions are implementing natural language processing (NLP) approaches to detect suicidality from unstructured clinical text in electronic health records (EHRs), with the hope of targeting timely, preventative interventions to individuals most at risk of suicide. Despite the international need, the development of these NLP approaches in EHRs has been largely local and not shared across healthcare systems. Methods In this study, we developed a process to share NLP approaches that were individually developed at King's College London (KCL), UK and Weill Cornell Medicine (WCM), US - two academic medical centers based in different countries with vastly different healthcare systems. We tested and compared the algorithms' performance on manually annotated clinical notes (KCL: n = 4,911 and WCM: n = 837). Results After a successful technical porting of the NLP approaches, our quantitative evaluation determined that independently developed NLP approaches can detect suicidality at another healthcare organization with a different EHR system, clinical documentation processes, and culture, yet do not achieve the same level of success as at the institution where the NLP algorithm was developed (KCL approach: F1-score 0.85 vs. 0.68, WCM approach: F1-score 0.87 vs. 0.72). Limitations Independent NLP algorithm development and patient cohort selection at the two institutions compromised direct comparability. Conclusions Shared use of these NLP approaches is a critical step forward towards improving data-driven algorithms for early suicide risk identification and timely prevention.
Affiliation(s)
- Marika Cusick
  - Weill Cornell Medicine, 402 E. 67th St., New York, NY 10065, USA
  - South London and Maudsley NHS Foundation Trust, London, UK
- Sumithra Velupillai
  - IoPPN, King’s College London, London, UK
  - South London and Maudsley NHS Foundation Trust, London, UK
- Johnny Downs
  - IoPPN, King’s College London, London, UK
  - South London and Maudsley NHS Foundation Trust, London, UK
- Thomas R. Campion
  - Weill Cornell Medicine, 402 E. 67th St., New York, NY 10065, USA
  - South London and Maudsley NHS Foundation Trust, London, UK
- Evan T. Sholle
  - Weill Cornell Medicine, 402 E. 67th St., New York, NY 10065, USA
  - South London and Maudsley NHS Foundation Trust, London, UK
- Rina Dutta
  - IoPPN, King’s College London, London, UK
  - South London and Maudsley NHS Foundation Trust, London, UK
- Jyotishman Pathak
  - Weill Cornell Medicine, 402 E. 67th St., New York, NY 10065, USA
  - South London and Maudsley NHS Foundation Trust, London, UK
|
20
|
Marchesin S, Giachelle F, Marini N, Atzori M, Boytcheva S, Buttafuoco G, Ciompi F, Di Nunzio GM, Fraggetta F, Irrera O, Müller H, Primov T, Vatrano S, Silvello G. Empowering digital pathology applications through explainable knowledge extraction tools. J Pathol Inform 2022; 13:100139. [PMID: 36268087 PMCID: PMC9577130 DOI: 10.1016/j.jpi.2022.100139] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2022] [Revised: 09/06/2022] [Accepted: 09/07/2022] [Indexed: 11/25/2022] Open
Abstract
Exa-scale volumes of medical data have been produced for decades. In most cases, the diagnosis is reported in free text, encoding medical knowledge that is still largely unexploited. In order to allow decoding medical knowledge included in reports, we propose an unsupervised knowledge extraction system combining a rule-based expert system with pre-trained Machine Learning (ML) models, namely the Semantic Knowledge Extractor Tool (SKET). Combining rule-based techniques and pre-trained ML models provides high accuracy results for knowledge extraction. This work demonstrates the viability of unsupervised Natural Language Processing (NLP) techniques to extract critical information from cancer reports, opening opportunities such as data mining for knowledge extraction purposes, precision medicine applications, structured report creation, and multimodal learning. SKET is a practical and unsupervised approach to extracting knowledge from pathology reports, which opens up unprecedented opportunities to exploit textual and multimodal medical information in clinical practice. We also propose SKET eXplained (SKET X), a web-based system providing visual explanations about the algorithmic decisions taken by SKET. SKET X is designed/developed to support pathologists and domain experts in understanding SKET predictions, possibly driving further improvements to the system.
Affiliation(s)
- Stefano Marchesin
  - Department of Information Engineering, University of Padua, Padua, Italy
- Fabio Giachelle
  - Department of Information Engineering, University of Padua, Padua, Italy
- Niccolò Marini
  - Information Systems Institute, University of Applied Sciences Western Switzerland, Delémont, Switzerland
- Manfredo Atzori
  - Information Systems Institute, University of Applied Sciences Western Switzerland, Delémont, Switzerland
  - Department of Neuroscience, University of Padua, Padua, Italy
- Francesco Ciompi
  - Department of Pathology, Radboud University Medical Center, Nijmegen, The Netherlands
- Ornella Irrera
  - Department of Information Engineering, University of Padua, Padua, Italy
- Henning Müller
  - Information Systems Institute, University of Applied Sciences Western Switzerland, Delémont, Switzerland
- Simona Vatrano
  - Pathology Unit Gravina Hospital Caltagirone ASP Catania, Italy
- Gianmaria Silvello
  - Department of Information Engineering, University of Padua, Padua, Italy
|
21
|
Abuzaid MM, Elshami W, Fadden SM. Integration of artificial intelligence into nursing practice. Health and Technology 2022; 12:1109-1115. [PMID: 36117522 PMCID: PMC9470236 DOI: 10.1007/s12553-022-00697-0] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 06/21/2022] [Revised: 08/15/2022] [Accepted: 08/16/2022] [Indexed: 10/31/2022]
Abstract
Background Methods Results Conclusions
|
22
|
Chae S, Song J, Ojo M, Bowles KH, McDonald MV, Barrón Y, Hobensack M, Kennedy E, Sridharan S, Evans L, Topaz M. Factors associated with poor self-management documented in home health care narrative notes for patients with heart failure. Heart Lung 2022; 55:148-154. [PMID: 35597164 PMCID: PMC11021173 DOI: 10.1016/j.hrtlng.2022.05.004] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 10/03/2021] [Revised: 05/03/2022] [Accepted: 05/07/2022] [Indexed: 11/04/2022]
Abstract
BACKGROUND Patients with heart failure (HF) who actively engage in their own self-management have better outcomes. Extracting data through natural language processing (NLP) holds great promise for identifying patients with or at risk of poor self-management. OBJECTIVE To identify home health care (HHC) patients with HF who have poor self-management using NLP of narrative notes, and to examine patient factors associated with poor self-management. METHODS An NLP algorithm was applied to extract poor self-management documentation using 353,718 HHC narrative notes of 9,710 patients with HF. Sociodemographic and structured clinical data were incorporated into multivariate logistic regression models to identify factors associated with poor self-management. RESULTS There were 758 (7.8%) patients in this sample identified as having notes with language describing poor HF self-management. Younger age (OR 0.982, 95% CI 0.976-0.987, p < .001), longer length of stay in HHC (OR 1.036, 95% CI 1.029-1.043, p < .001), diagnosis of diabetes (OR 1.47, 95% CI 1.3-1.67, p < .001) and depression (OR 1.36, 95% CI 1.09-1.68, p < .01), impaired decision-making (OR 1.64, 95% CI 1.37-1.95, p < .001), smoking (OR 1.7, 95% CI 1.4-2.04, p < .001), and shortness of breath with exertion (OR 1.25, 95% CI 1.1-1.42, p < .01) were associated with poor self-management. CONCLUSIONS Patients with HF who have poor self-management can be identified from the narrative notes in HHC using novel NLP methods. Meaningful information about the self-management of patients with HF can support HHC clinicians in developing individualized care plans to improve self-management and clinical outcomes.
Affiliation(s)
- Sena Chae
  - College of Nursing, University of Iowa, 50 Newton Rd, Iowa City, IA 52242, United States
- Jiyoun Song
  - Columbia University School of Nursing, New York, NY, United States
- Marietta Ojo
  - Center for Home Care Policy & Research, Visiting Nurse Service of New York, New York, NY, United States
- Kathryn H Bowles
  - Department of Biobehavioral Health Sciences, University of Pennsylvania School of Nursing, Philadelphia, PA, United States
  - Center for Home Care Policy & Research, Visiting Nurse Service of New York, New York, NY, United States
- Margaret V McDonald
  - Center for Home Care Policy & Research, Visiting Nurse Service of New York, New York, NY, United States
- Yolanda Barrón
  - Center for Home Care Policy & Research, Visiting Nurse Service of New York, New York, NY, United States
- Mollie Hobensack
  - Columbia University School of Nursing, New York, NY, United States
- Erin Kennedy
  - Department of Biobehavioral Health Sciences, University of Pennsylvania School of Nursing, Philadelphia, PA, United States
- Sridevi Sridharan
  - Center for Home Care Policy & Research, Visiting Nurse Service of New York, New York, NY, United States
- Lauren Evans
  - Center for Home Care Policy & Research, Visiting Nurse Service of New York, New York, NY, United States
- Maxim Topaz
  - Columbia University School of Nursing, New York, NY, United States
  - Data Science Institute, Columbia University, New York, NY, United States
  - Center for Home Care Policy & Research, Visiting Nurse Service of New York, New York, NY, United States
|
23
|
Bi Q, Kuang Z, Haihong E, Song M, Tan L, Tang X, Liu X. Research on early warning of renal damage in hypertensive patients based on the stacking strategy. BMC Med Inform Decis Mak 2022; 22:212. [PMID: 35945608 PMCID: PMC9361646 DOI: 10.1186/s12911-022-01889-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2020] [Accepted: 03/31/2022] [Indexed: 11/26/2022]
Abstract
Background Among the problems caused by hypertension, early renal damage is often ignored. It cannot be diagnosed until the condition is severe and irreversible damage has occurred. We therefore screened and explored risk factors for early renal damage in hypertensive patients and established an early-warning model of renal damage based on data mining to enable early diagnosis. Methods With the aid of an electronic information management system for hypertensive outpatients, we collected 513 cases of original, untreated hypertensive patients. We recorded their demographic data, ambulatory blood pressure parameters, routine blood indices, and blood biochemical indices to establish the clinical database. We then screened risk factors for early renal damage through feature engineering and used Random Forest, Extra-Trees, and XGBoost to build separate early-warning models. Finally, we built a new model by model fusion based on the Stacking strategy. We used cross-validation to evaluate the stability and reliability of each model and to determine the best risk assessment model. Results In descending order of importance, the features selected by feature engineering were the nocturnal drop rate of systolic blood pressure, red blood cell distribution width, blood pressure circadian rhythm, average daytime diastolic blood pressure, body surface area, smoking, age, and HDL. The average precision of the two-dimensional fusion model based on the Stacking strategy was 0.89685 with the full feature set and 0.93824 with the selected features, a marked improvement. Conclusions Through feature engineering and risk factor analysis, we selected the nocturnal drop rate of systolic blood pressure, red blood cell distribution width, blood pressure circadian rhythm, and average daytime diastolic blood pressure as early-warning factors of early renal damage in patients with hypertension. On this basis, the two-dimensional fusion model based on the Stacking strategy performs better than any single model and can be used for risk assessment of early renal damage in hypertensive patients.
Affiliation(s)
- Qiubo Bi
  - School of Computer Science, Beijing University of Posts and Telecommunications, Beijing, 100876, China
- Zemin Kuang
  - Department of Hypertension, Beijing Anzhen Hospital of Capital Medical University, Beijing, 100029, China
- E Haihong
  - School of Computer Science, Beijing University of Posts and Telecommunications, Beijing, 100876, China
- Meina Song
  - School of Computer Science, Beijing University of Posts and Telecommunications, Beijing, 100876, China
- Ling Tan
  - School of Computer Science, Beijing University of Posts and Telecommunications, Beijing, 100876, China
- Xinying Tang
  - Department of Cardiology, The First People's Hospital of Chenzhou, The University of South China, Chenzhou, 423000, China
- Xing Liu
  - Department of Anesthesiology, Third Xiangya Hospital, Central South University, Changsha, 410013, China
|
24
|
Paladino MS. Cuidado e inteligencia artificial: una reflexión necesaria. Persona y Bioética 2022. [DOI: 10.5294/pebi.2021.25.2.8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
Abstract
Nursing is no stranger to the revolutionary change brought about by the introduction of artificial intelligence into health care. In early 2021, the conclusions of the international think-tank on artificial intelligence and nursing were published, recognizing the relevance of these technologies for augmenting and extending the capabilities of the discipline, care among them. A considered assessment of whether these conclusions are sound necessarily requires an epistemological reflection on care. In this article we reflect on the impact of artificial intelligence on nursing care from the perspective of the ethics of care and in light of the main contributions of Samaritanus Bonus.
|
25
|
Song J, Hobensack M, Bowles KH, McDonald MV, Cato K, Rossetti SC, Chae S, Kennedy E, Barrón Y, Sridharan S, Topaz M. Clinical notes: An untapped opportunity for improving risk prediction for hospitalization and emergency department visit during home health care. J Biomed Inform 2022; 128:104039. [PMID: 35231649 PMCID: PMC9825202 DOI: 10.1016/j.jbi.2022.104039] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 09/20/2021] [Revised: 02/22/2022] [Accepted: 02/23/2022] [Indexed: 01/11/2023]
Abstract
BACKGROUND/OBJECTIVE Between 10 and 25% of patients are hospitalized or visit the emergency department (ED) during home healthcare (HHC). Given that up to 40% of these negative clinical outcomes are preventable, early and accurate prediction of hospitalization risk can be one strategy to prevent them. In recent years, machine learning-based predictive modeling has become widely used for building risk models. This study aimed to compare the predictive performance of four risk models built with various data sources for hospitalization and ED visits in HHC. METHODS Four risk models were built using different variables from two data sources: structured data (i.e., Outcome and Assessment Information Set (OASIS) and other assessment items from the electronic health record (EHR)) and unstructured narrative free-text clinical notes for patients who received HHC services from the largest non-profit HHC organization in New York between 2015 and 2017. Then, five machine learning algorithms (logistic regression, Random Forest, Bayesian network, support vector machine (SVM), and Naïve Bayes) were applied to each risk model. Risk model performance was evaluated using the F-score and Precision-Recall Curve (PRC) area metrics. RESULTS During the study period, 8373/86,823 (9.6%) HHC episodes resulted in hospitalization or ED visits. Among the five machine learning algorithms applied to each model, the SVM showed the highest F-score (0.82), while the Random Forest showed the highest PRC area (0.864). Adding information extracted from clinical notes significantly improved the risk prediction ability by up to 16.6% in F-score and 17.8% in PRC area. CONCLUSION All models showed relatively good hospitalization or ED visit risk predictive performance in HHC. Information from clinical notes integrated with the structured data improved the ability to identify patients at risk for these emergent care events.
Affiliation(s)
- Jiyoun Song
  - Columbia University School of Nursing, New York City, NY, USA
  - Center for Home Care Policy & Research, Visiting Nurse Service of New York, New York City, NY, USA
  - Corresponding author at: Columbia University School of Nursing, 560 West 168th Street, New York, NY 10032, USA (J. Song)
- Kathryn H. Bowles
  - Center for Home Care Policy & Research, Visiting Nurse Service of New York, New York City, NY, USA
  - University of Pennsylvania School of Nursing, Department of Biobehavioral Health Sciences, Philadelphia, PA, USA
- Margaret V. McDonald
  - Center for Home Care Policy & Research, Visiting Nurse Service of New York, New York City, NY, USA
- Kenrick Cato
  - Columbia University School of Nursing, New York City, NY, USA
  - Emergency Medicine, Columbia University Irving Medical Center, New York, NY, USA
- Sarah Collins Rossetti
  - Columbia University School of Nursing, New York City, NY, USA
  - Columbia University, Department of Biomedical Informatics, New York City, NY, USA
- Sena Chae
  - College of Nursing, University of Iowa, Iowa City, IA, USA
- Erin Kennedy
  - University of Pennsylvania School of Nursing, Department of Biobehavioral Health Sciences, Philadelphia, PA, USA
- Yolanda Barrón
  - Center for Home Care Policy & Research, Visiting Nurse Service of New York, New York City, NY, USA
- Sridevi Sridharan
  - Center for Home Care Policy & Research, Visiting Nurse Service of New York, New York City, NY, USA
- Maxim Topaz
  - Columbia University School of Nursing, New York City, NY, USA
  - Center for Home Care Policy & Research, Visiting Nurse Service of New York, New York City, NY, USA
  - Data Science Institute, Columbia University, New York City, NY, USA
|
26
|
Combining supervised and unsupervised named entity recognition to detect psychosocial risk factors in occupational health checks. Int J Med Inform 2022; 160:104695. [DOI: 10.1016/j.ijmedinf.2022.104695] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/13/2021] [Revised: 10/25/2021] [Accepted: 01/16/2022] [Indexed: 11/17/2022]
|
27
|
Zolnoori M, Song J, McDonald MV, Barrón Y, Cato K, Sockolow P, Sridharan S, Onorato N, Bowles KH, Topaz M. Exploring Reasons for Delayed Start-of-Care Nursing Visits in Home Health Care: Algorithm Development and Data Science Study. JMIR Nurs 2021; 4:e31038. [PMID: 34967749 PMCID: PMC8759020 DOI: 10.2196/31038] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 06/07/2021] [Revised: 08/31/2021] [Accepted: 10/28/2021] [Indexed: 01/27/2023]
Abstract
BACKGROUND Delayed start-of-care nursing visits in home health care (HHC) can result in negative outcomes, such as hospitalization. No previous studies have investigated why start-of-care HHC nursing visits are delayed, in part because most reasons for delayed visits are documented in free-text HHC nursing notes. OBJECTIVE The aims of this study were to (1) develop and test a natural language processing (NLP) algorithm that automatically identifies reasons for delayed visits in HHC free-text clinical notes and (2) describe reasons for delayed visits in a large patient sample. METHODS This study was conducted at the Visiting Nurse Service of New York (VNSNY). We examined data available at the VNSNY on all new episodes of care started in 2019 (N=48,497). An NLP algorithm was developed and tested to automatically identify and classify reasons for delayed visits. RESULTS The performance of the NLP algorithm was 0.8, 0.75, and 0.77 for precision, recall, and F-score, respectively. A total of one-third of HHC episodes (n=16,244) had delayed start-of-care HHC nursing visits. The most prevalent identified category of reasons for delayed start-of-care nursing visits was no answer at the door or phone (3728/8051, 46.3%), followed by patient/family request to postpone or refuse some HHC services (n=2858, 35.5%), and administrative or scheduling issues (n=1465, 18.2%). In 40% (n=16,244) of HHC episodes, 2 or more reasons were documented. CONCLUSIONS To avoid critical delays in start-of-care nursing visits, HHC organizations might examine and improve ways to effectively address the reasons for delayed visits, using effective interventions, such as educating patients or caregivers on the importance of a timely nursing visit and improving patients' intake procedures.
Affiliation(s)
- Maryam Zolnoori
  - School of Nursing, Columbia University, New York, NY, United States
- Jiyoun Song
  - School of Nursing, Columbia University, New York, NY, United States
- Margaret V McDonald
  - Center for Home Care Policy & Research, Visiting Nurse Service of New York, New York, NY, United States
- Yolanda Barrón
  - Center for Home Care Policy & Research, Visiting Nurse Service of New York, New York, NY, United States
- Kenrick Cato
  - School of Nursing, Columbia University, New York, NY, United States
- Paulina Sockolow
  - College of Nursing and Health Professions, Drexel University, Philadelphia, PA, United States
- Sridevi Sridharan
  - Center for Home Care Policy & Research, Visiting Nurse Service of New York, New York, NY, United States
- Nicole Onorato
  - Center for Home Care Policy & Research, Visiting Nurse Service of New York, New York, NY, United States
- Kathryn H Bowles
  - Center for Home Care Policy & Research, Visiting Nurse Service of New York, New York, NY, United States
  - Center for Transitions and Health, School of Nursing, University of Pennsylvania, Philadelphia, PA, United States
- Maxim Topaz
  - School of Nursing, Columbia University, New York, NY, United States
|
28
|
Von Gerich H, Moen H, Block LJ, Chu CH, DeForest H, Hobensack M, Michalowski M, Mitchell J, Nibber R, Olalia MA, Pruinelli L, Ronquillo CE, Topaz M, Peltonen LM. Artificial Intelligence-based technologies in nursing: A scoping literature review of the evidence. Int J Nurs Stud 2021; 127:104153. [DOI: 10.1016/j.ijnurstu.2021.104153] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 08/11/2021] [Revised: 11/23/2021] [Accepted: 12/01/2021] [Indexed: 12/20/2022]
|
29
|
Santus E, Schuster T, Tahmasebi AM, Li C, Yala A, Lanahan CR, Prinsen P, Thompson SF, Coons S, Mynderse L, Barzilay R, Hughes K. Exploiting Rules to Enhance Machine Learning in Extracting Information From Multi-Institutional Prostate Pathology Reports. JCO Clin Cancer Inform 2021; 4:865-874. [PMID: 33006906 DOI: 10.1200/cci.20.00028] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 12/13/2022]
Abstract
PURPOSE Literature on clinical note mining has highlighted the superiority of machine learning (ML) over hand-crafted rules. Nevertheless, most studies assume the availability of large training sets, which is rarely the case. For this reason, in the clinical setting, rules are still common. We suggest 2 methods to leverage the knowledge encoded in pre-existing rules to inform ML decisions and obtain high performance, even with scarce annotations. METHODS We collected 501 prostate pathology reports from 6 American hospitals. Reports were split into 2,711 core segments, annotated with 20 attributes describing the histology, grade, extension, and location of tumors. The data set was split by institutions to generate a cross-institutional evaluation setting. We assessed 4 systems, namely a rule-based approach, an ML model, and 2 hybrid systems integrating the previous methods: a Rule as Feature model and a Classifier Confidence model. Several ML algorithms were tested, including logistic regression (LR), support vector machine (SVM), and eXtreme gradient boosting (XGB). RESULTS When training on data from a single institution, LR lags behind the rules by 3.5% (F1 score: 92.2% v 95.7%). Hybrid models, instead, obtain competitive results, with Classifier Confidence outperforming the rules by +0.5% (96.2%). When a larger amount of data from multiple institutions is used, LR improves by +1.5% over the rules (97.2%), whereas hybrid systems obtain +2.2% for Rule as Feature (97.7%) and +2.6% for Classifier Confidence (98.3%). Replacing LR with SVM or XGB yielded similar performance gains. CONCLUSION We developed methods to use pre-existing handcrafted rules to inform ML algorithms. These hybrid systems obtain better performance than either rules or ML models alone, even when training data are limited.
Affiliation(s)
- Enrico Santus
  - Department of Electrical Engineering and Computer Science, CSAIL, MIT, Cambridge, MA
- Tal Schuster
  - Department of Electrical Engineering and Computer Science, CSAIL, MIT, Cambridge, MA
- Clara Li
  - Department of Electrical Engineering and Computer Science, CSAIL, MIT, Cambridge, MA
- Adam Yala
  - Department of Electrical Engineering and Computer Science, CSAIL, MIT, Cambridge, MA
- Conor R Lanahan
  - Department of Oncology, Massachusetts General Hospital, Boston, MA
- Regina Barzilay
  - Department of Electrical Engineering and Computer Science, CSAIL, MIT, Cambridge, MA
- Kevin Hughes
  - Department of Oncology, Massachusetts General Hospital, Boston, MA
|
30
|
Ronquillo CE, Peltonen LM, Pruinelli L, Chu CH, Bakken S, Beduschi A, Cato K, Hardiker N, Junger A, Michalowski M, Nyrup R, Rahimi S, Reed DN, Salakoski T, Salanterä S, Walton N, Weber P, Wiegand T, Topaz M. Artificial intelligence in nursing: Priorities and opportunities from an international invitational think-tank of the Nursing and Artificial Intelligence Leadership Collaborative. J Adv Nurs 2021; 77:3707-3717. [PMID: 34003504 PMCID: PMC7612744 DOI: 10.1111/jan.14855] [Citation(s) in RCA: 62] [Impact Index Per Article: 20.7] [Received: 02/04/2021] [Accepted: 03/21/2021] [Indexed: 01/23/2023]
Abstract
Aim To develop a consensus paper on the central points of an international invitational think‐tank on nursing and artificial intelligence (AI). Methods We established the Nursing and Artificial Intelligence Leadership (NAIL) Collaborative, comprising interdisciplinary experts in AI development, biomedical ethics, AI in primary care, AI legal aspects, philosophy of AI in health, nursing practice, implementation science, leaders in health informatics practice and international health informatics groups, a representative of patients and the public, and the Chair of the ITU/WHO Focus Group on Artificial Intelligence for Health. The NAIL Collaborative convened at a 3‐day invitational think tank in autumn 2019. Activities included a pre‐event survey, expert presentations and working sessions to identify priority areas for action, opportunities and recommendations to address these. In this paper, we summarize the key discussion points and notes from the aforementioned activities. Implications for nursing Nursing's limited current engagement with discourses on AI and health poses a risk that the profession is not part of the conversations that have potentially significant impacts on nursing practice. Conclusion There are numerous gaps and a timely need for the nursing profession to be among the leaders and drivers of conversations around AI in health systems. Impact We outline crucial gaps where focused effort is required for nursing to take a leadership role in shaping AI use in health systems. Three priorities were identified that need to be addressed in the near future: (a) Nurses must understand the relationship between the data they collect and AI technologies they use; (b) Nurses need to be meaningfully involved in all stages of AI, from development to implementation; and (c) There is substantial untapped and unexplored potential for nursing to contribute to the development of AI technologies for global health and humanitarian efforts.
Affiliation(s)
- Charlene Esteban Ronquillo
  - Daphne Cockwell School of Nursing, Faculty of Community Services, Ryerson University, Toronto, ON, Canada
  - School of Nursing, Faculty of Health and Social Development, University of British Columbia Okanagan, Kelowna, BC, Canada
  - International Medical Informatics Association, Student and Emerging Professionals Special Interest Group
- Laura-Maria Peltonen
  - International Medical Informatics Association, Student and Emerging Professionals Special Interest Group
  - Department of Nursing Science, University of Turku, Turku, Finland
- Charlene H Chu
  - Lawrence S. Bloomberg Faculty of Nursing, University of Toronto, Toronto, ON, Canada
- Suzanne Bakken
  - School of Nursing, Department of Biomedical Informatics, Data Science Institute, Columbia University, New York, NY, USA
  - Precision in Symptom Self-Management (PriSSM) Center, Reducing Health Disparities Through Informatics Training Program (RHeaDI), Columbia University, New York, NY, USA
- Kenrick Cato
  - School of Nursing, Department of Biomedical Informatics, Data Science Institute, Columbia University, New York, NY, USA
- Nicholas Hardiker
  - School of Human & Health Sciences, University of Huddersfield, Huddersfield, UK
- Alain Junger
  - Nursing Direction, Nursing Information System Unit, Centre Hospitalier Universitaire Vaudois (CHUV) Lausanne, Lausanne, Switzerland
- Rune Nyrup
  - Leverhulme Centre for the Future of Intelligence, University of Cambridge, Cambridge, UK
- Samira Rahimi
  - Department of Family Medicine, McGill University, Lady Davis Institute for Medical Research of Jewish General Hospital, Mila Quebec Artificial Intelligence Institute, Montreal, QC, Canada
- Tapio Salakoski
  - Department of Mathematics and Statistics, University of Turku, Turku, Finland
- Sanna Salanterä
  - Department of Nursing Science, University of Turku and Turku University Hospital, Turku, Finland
- Nancy Walton
  - Daphne Cockwell School of Nursing, Faculty of Community Services, Ryerson University, Toronto, ON, Canada
  - Research Ethics Board, Women's College Hospital, Toronto, ON, Canada
  - Health Canada and Public Health Agency of Canada's Research Ethics Board, Toronto, ON, Canada
- Patrick Weber
  - NICE Computing SA, Lausanne, Switzerland
  - European Federation for Medical Informatics (EFMI)
- Thomas Wiegand
  - ITU/WHO Focus Group on Artificial Intelligence for Health (FG-AI4H)
  - Fraunhofer Heinrich Hertz Institute, Berlin, Germany
  - Berlin Institute of Technology, Berlin, Germany
- Maxim Topaz
  - International Medical Informatics Association, Student and Emerging Professionals Special Interest Group
  - School of Nursing, Department of Biomedical Informatics, Data Science Institute, Columbia University, New York, NY, USA
|
31
|
Koleck TA, Tatonetti NP, Bakken S, Mitha S, Henderson MM, George M, Miaskowski C, Smaldone A, Topaz M. Identifying Symptom Information in Clinical Notes Using Natural Language Processing. Nurs Res 2021; 70:173-183. [PMID: 33196504 PMCID: PMC9109773 DOI: 10.1097/nnr.0000000000000488] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Indexed: 12/22/2022]
Abstract
BACKGROUND Symptoms are a core concept of nursing interest. Large-scale secondary data reuse of notes in electronic health records (EHRs) has the potential to increase the quantity and quality of symptom research. However, the symptom language used in clinical notes is complex. A need exists for methods designed specifically to identify and study symptom information from EHR notes. OBJECTIVES We aim to describe a method that combines standardized vocabularies, clinical expertise, and natural language processing to generate comprehensive symptom vocabularies and identify symptom information in EHR notes. We piloted this method with five diverse symptom concepts: constipation, depressed mood, disturbed sleep, fatigue, and palpitations. METHODS First, we obtained synonym lists for each pilot symptom concept from the Unified Medical Language System. Then, we used two large bodies of text (clinical notes from Columbia University Irving Medical Center and PubMed abstracts containing Medical Subject Headings or key words related to the pilot symptoms) to further expand our initial vocabulary of synonyms for each pilot symptom concept. We used NimbleMiner, an open-source natural language processing tool, to accomplish these tasks and evaluated NimbleMiner symptom identification performance by comparison to a manually annotated set of nurse- and physician-authored common EHR note types. RESULTS Compared to the baseline Unified Medical Language System synonym lists, we identified up to 11 times more additional synonym words or expressions, including abbreviations, misspellings, and unique multiword combinations, for each symptom concept. Natural language processing system symptom identification performance was excellent. DISCUSSION Using our comprehensive symptom vocabularies and NimbleMiner to label symptoms in clinical notes produced excellent performance metrics. The ability to extract symptom information from EHR notes in an accurate and scalable manner has the potential to greatly facilitate symptom science research.
|
32
|
Kulshrestha S, Dligach D, Joyce C, Gonzalez R, O'Rourke AP, Glazer JM, Stey A, Kruser JM, Churpek MM, Afshar M. Comparison and interpretability of machine learning models to predict severity of chest injury. JAMIA Open 2021; 4:ooab015. [PMID: 33709067 PMCID: PMC7935500 DOI: 10.1093/jamiaopen/ooab015] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/18/2020] [Revised: 02/08/2021] [Accepted: 02/12/2021] [Indexed: 11/15/2022]
Abstract
Objective Trauma quality improvement programs and registries improve care and outcomes for injured patients. Designated trauma centers calculate injury scores using dedicated trauma registrars; however, many injuries arrive at nontrauma centers, leaving a substantial amount of data uncaptured. We propose automated methods to identify severe chest injury using machine learning (ML) and natural language processing (NLP) methods from the electronic health record (EHR) for quality reporting. Materials and Methods A level I trauma center was queried for patients presenting after injury between 2014 and 2018. Prediction modeling was performed to classify severe chest injury using a reference dataset labeled by certified registrars. Clinical documents from trauma encounters were processed into concept unique identifiers for inputs to ML models: logistic regression with elastic net (EN) regularization, extreme gradient boosted (XGB) machines, and convolutional neural networks (CNN). The optimal model was identified by examining predictive and face validity metrics using global explanations. Results Of 8952 encounters, 542 (6.1%) had a severe chest injury. CNN and EN had the highest discrimination, with an area under the receiver operating characteristic curve of 0.93 and calibration slopes between 0.88 and 0.97. CNN had better performance across risk thresholds with fewer discordant cases. Examination of global explanations demonstrated the CNN model had better face validity, with top features including “contusion of lung” and “hemopneumothorax.” Discussion The CNN model featured optimal discrimination, calibration, and clinically relevant features selected. Conclusion NLP and ML methods to populate trauma registries for quality analyses are feasible.
Collapse
Affiliation(s)
- Sujay Kulshrestha
- Burn and Shock Trauma Research Institute, Loyola University Chicago, Maywood, Illinois, USA; Department of Surgery, Loyola University Medical Center, Maywood, Illinois, USA
- Dmitriy Dligach
- Center for Health Outcomes and Informatics Research, Health Sciences Division, Loyola University Chicago, Maywood, Illinois, USA; Department of Public Health Sciences, Stritch School of Medicine, Loyola University Chicago, Maywood, Illinois, USA; Department of Computer Science, Loyola University Chicago, Chicago, Illinois, USA
- Cara Joyce
- Center for Health Outcomes and Informatics Research, Health Sciences Division, Loyola University Chicago, Maywood, Illinois, USA; Department of Public Health Sciences, Stritch School of Medicine, Loyola University Chicago, Maywood, Illinois, USA
- Richard Gonzalez
- Burn and Shock Trauma Research Institute, Loyola University Chicago, Maywood, Illinois, USA; Department of Surgery, Loyola University Medical Center, Maywood, Illinois, USA
- Ann P O'Rourke
- Department of Surgery, University of Wisconsin, Madison, Wisconsin, USA
- Joshua M Glazer
- Department of Emergency Medicine, University of Wisconsin, Madison, Wisconsin, USA
- Anne Stey
- Department of Surgery, Northwestern University, Chicago, Illinois, USA
- Matthew M Churpek
- Department of Medicine, University of Wisconsin, Madison, Wisconsin, USA
- Majid Afshar
- Department of Medicine, University of Wisconsin, Madison, Wisconsin, USA
|
33
|
Parikh S, Davoudi A, Yu S, Giraldo C, Schriver E, Mowery D. Lexicon Development for COVID-19-related Concepts Using Open-source Word Embedding Sources: An Intrinsic and Extrinsic Evaluation. JMIR Med Inform 2021; 9:e21679. [PMID: 33544689 PMCID: PMC7901592 DOI: 10.2196/21679] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/21/2020] [Revised: 09/20/2020] [Accepted: 01/31/2021] [Indexed: 11/13/2022] Open
Abstract
BACKGROUND Scientists are developing new computational methods and prediction models to better clinically understand COVID-19 prevalence, treatment efficacy, and patient outcomes. These efforts could be improved by leveraging documented COVID-19-related symptoms, findings, and disorders from clinical text sources in an electronic health record. Word embeddings can identify terms related to these clinical concepts from both the biomedical and nonbiomedical domains, and are being shared with the open-source community at large. However, it's unclear how useful openly available word embeddings are for developing lexicons for COVID-19-related concepts. OBJECTIVE Given an initial lexicon of COVID-19-related terms, this study aims to characterize the returned terms by similarity across various open-source word embeddings and determine common semantic and syntactic patterns between the COVID-19 queried terms and returned terms specific to the word embedding source. METHODS We compared seven openly available word embedding sources. Using a series of COVID-19-related terms for associated symptoms, findings, and disorders, we conducted an interannotator agreement study to determine how accurately the most similar returned terms could be classified according to semantic types by three annotators. We conducted a qualitative study of COVID-19 queried terms and their returned terms to detect informative patterns for constructing lexicons. We demonstrated the utility of applying such learned synonyms to discharge summaries by reporting the proportion of patients identified by concept among three patient cohorts: pneumonia (n=6410), acute respiratory distress syndrome (n=8647), and COVID-19 (n=2397). RESULTS We observed high pairwise interannotator agreement (Cohen kappa) for symptoms (0.86-0.99), findings (0.93-0.99), and disorders (0.93-0.99). 
Word embedding sources generated based on characters tend to return more synonyms (mean count of 7.2 synonyms) compared to token-based embedding sources (mean counts range from 2.0 to 3.4). Word embedding sources queried using a qualifier term (eg, dry cough or muscle pain) more often returned qualifiers of a similar semantic type (eg, "dry" returns consistency qualifiers like "wet" and "runny") compared to single-term queries (eg, cough or pain). A higher proportion of patients had documented fever (0.61-0.84), cough (0.41-0.55), shortness of breath (0.40-0.59), and hypoxia (0.51-0.56) retrieved than other clinical features. Terms for dry cough returned a higher proportion of patients with COVID-19 (0.07) than the pneumonia (0.05) and acute respiratory distress syndrome (0.03) populations. CONCLUSIONS Word embeddings are valuable technology for learning related terms, including synonyms. When leveraging openly available word embedding sources, choices made for the construction of the word embeddings can significantly influence the words learned.
Affiliation(s)
- Soham Parikh
- School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, PA, United States
- Anahita Davoudi
- Department of Biostatistics, Epidemiology, & Informatics, University of Pennsylvania, Philadelphia, PA, United States
- Shun Yu
- Division of Hematology/Oncology, Department of Medicine, Hospital of the University of Pennsylvania, Philadelphia, PA, United States
- Carolina Giraldo
- Philadelphia College of Osteopathic Medicine, Philadelphia, PA, United States
- Emily Schriver
- Data Analytics Center, Penn Medicine, Philadelphia, PA, United States
- Danielle Mowery
- Department of Biostatistics, Epidemiology, & Informatics, Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, PA, United States
|
34
|
Home Healthcare Clinical Notes Predict Patient Hospitalization and Emergency Department Visits. Nurs Res 2021; 69:448-454. [PMID: 32852359 DOI: 10.1097/nnr.0000000000000470] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/02/2023]
Abstract
BACKGROUND About 30% of home healthcare patients are hospitalized or visit an emergency department (ED) during a home healthcare (HHC) episode. Novel data science methods are increasingly used to improve identification of patients at risk for negative outcomes. OBJECTIVES The aim of the study was to identify patients at heightened risk of hospitalization or ED visits using HHC narrative data (clinical notes). METHODS This study used a large database of HHC visit notes (n = 727,676) documented for 112,237 HHC episodes (89,459 unique patients) by clinicians of the largest nonprofit HHC agency in the United States. Text mining and machine learning algorithms (Naïve Bayes, decision tree, random forest) were implemented to predict patient hospitalization or ED visits using the content of clinical notes. Risk factors associated with hospitalization or ED visits were identified using a feature selection technique (gain ratio attribute evaluation). RESULTS The best-performing text mining method (random forest) achieved good predictive performance. Seven risk factor categories were identified, with clinical factors, coordination/communication, and service use being the most frequent categories. DISCUSSION This study was the first to explore the potential contribution of HHC clinical notes to identifying patients at risk for hospitalization or an ED visit. Our results suggest that HHC visit notes are highly informative and can contribute significantly to identification of patients at risk. Further studies are needed to explore ways to improve risk prediction by adding more data elements from additional data sources.
|
35
|
Woo K, Adams V, Wilson P, Fu LH, Cato K, Rossetti SC, McDonald M, Shang J, Topaz M. Identifying Urinary Tract Infection-Related Information in Home Care Nursing Notes. J Am Med Dir Assoc 2021; 22:1015-1021.e2. [PMID: 33434568 PMCID: PMC8106637 DOI: 10.1016/j.jamda.2020.12.010] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/22/2019] [Revised: 07/28/2020] [Accepted: 12/06/2020] [Indexed: 12/12/2022]
Abstract
Objectives: Urinary tract infection (UTI) is common in home care but not easily captured with standard assessment. This study aimed to examine the value of nursing notes in detecting UTI signs and symptoms in home care. Design: The study developed a natural language processing (NLP) algorithm to automatically identify UTI-related information in nursing notes. Setting and Participants: Home care visit notes (n = 1,149,586) and care coordination notes (n = 1,461,171) for 89,459 patients treated in the largest nonprofit home care agency in the United States during 2014. Measures: We generated 6 categories of UTI-related information from literature and used the Unified Medical Language System (UMLS) to identify a preliminary list of terms. The NLP algorithm was tested on a gold standard set of 300 clinical notes annotated by clinical experts. We used structured Outcome and Assessment Information Set data to extract the frequency of UTI-related emergency department (ED) visits or hospitalizations and explored time-patterns in documentation of UTI-related information. Results: The NLP system achieved very good overall performance (F measure = 0.9, 95% CI: 0.87–0.93) based on the test results obtained by using the notes for patients admitted to the ED or hospital due to UTI. UTI-related information was significantly more prevalent (P < .01 for all the tests) in home care episodes with UTI-related ED admission or hospitalization vs the general patient population; 81% of home care episodes with UTI-related hospitalization or ED admission had at least 1 category of UTI-related information vs 21.6% among episodes without UTI-related hospitalization or ED admission. Frequency of UTI-related information documentation increased in advance of UTI-related hospitalization or ED admission, peaking within a few days before the event. 
Conclusions and Implications: Information in nursing notes is often overlooked by stakeholders and not integrated into predictive modeling for decision-making support, but our findings highlight their value in early risk identification and care guidance. Health care administrators should consider using NLP to extract clinical data from nursing notes to improve early detection and treatment, which may lead to quality improvement and cost reduction.
Affiliation(s)
- Kyungmi Woo
- College of Nursing, Seoul National University, Seoul, Republic of Korea
- Victoria Adams
- Center for Home Care Policy & Research, Visiting Nurse Service of New York, New York, NY, USA
- Paula Wilson
- Center for Home Care Policy & Research, Visiting Nurse Service of New York, New York, NY, USA
- Li-Heng Fu
- Department of Biomedical Informatics, Columbia University, New York, NY, USA
- Kenrick Cato
- College of Nursing, Seoul National University, Seoul, Republic of Korea
- Sarah Collins Rossetti
- Department of Biomedical Informatics, Columbia University, New York, NY, USA; School of Nursing, Columbia University, New York, NY, USA
- Margaret McDonald
- Center for Home Care Policy & Research, Visiting Nurse Service of New York, New York, NY, USA
- Jingjing Shang
- School of Nursing, Columbia University, New York, NY, USA
- Maxim Topaz
- Center for Home Care Policy & Research, Visiting Nurse Service of New York, New York, NY, USA; School of Nursing, Columbia University, New York, NY, USA; Data Science Institute, Columbia University, New York, NY, USA
|
36
|
Sockolow PS, Bowles KH, Topaz M, Koru G, Hellesø R, O'Connor M, Bass EJ. The Time is Now: Informatics Research Opportunities in Home Health Care. Appl Clin Inform 2021; 12:100-106. [PMID: 33598906 PMCID: PMC7889426 DOI: 10.1055/s-0040-1722222] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/25/2020] [Accepted: 11/21/2020] [Indexed: 10/22/2022] Open
Affiliation(s)
- Paulina S. Sockolow
- College of Nursing and Health Professions, Drexel University, Philadelphia, Pennsylvania, United States
- Kathryn H. Bowles
- Department of Biobehavioral Health Science, NewCourtland Center for Transitions and Health, University of Pennsylvania School of Nursing, Philadelphia, Pennsylvania, United States
- Center for Home Care Policy and Research, Visiting Nurse Service of New York, New York, United States
- Maxim Topaz
- Columbia University School of Nursing, Columbia University Data Science Institute, Visiting Nurse Service of New York, New York, United States
- Gunes Koru
- Department of Information Systems, University of Maryland Baltimore County, Baltimore, Maryland, United States
- Ragnhild Hellesø
- Department of Nursing Science, Institute of Health and Society, University of Oslo, Oslo, Norway
- Melissa O'Connor
- M. Louise Fitzpatrick College of Nursing, Villanova University, Villanova, Pennsylvania, United States
- Ellen J. Bass
- College of Nursing and Health Professions, College of Computing and Informatics, Drexel University, Philadelphia, Pennsylvania, United States
|
37
|
Topaz M, Koleck TA, Onorato N, Smaldone A, Bakken S. Nursing documentation of symptoms is associated with higher risk of emergency department visits and hospitalizations in homecare patients. Nurs Outlook 2020; 69:435-446. [PMID: 33386145 DOI: 10.1016/j.outlook.2020.12.007] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/08/2020] [Revised: 10/23/2020] [Accepted: 12/11/2020] [Indexed: 10/22/2022]
Abstract
BACKGROUND Nurses often document patient symptoms in narrative notes. PURPOSE This study used a technique called natural language processing (NLP) to: (1) Automatically identify documentation of seven common symptoms (anxiety, cognitive disturbance, depressed mood, fatigue, sleep disturbance, pain, and well-being) in homecare narrative nursing notes, and (2) examine the association between symptoms and emergency department visits or hospital admissions from homecare. METHOD NLP was applied on a large subset of narrative notes (2.5 million notes) documented for 89,825 patients admitted to one large homecare agency in the Northeast United States. FINDINGS NLP accurately identified symptoms in narrative notes. Patients with more documented symptom categories had higher risk of emergency department visit or hospital admission. DISCUSSION Further research is needed to explore additional symptoms and implement NLP systems in the homecare setting to enable early identification of concerning patient trends leading to emergency department visit or hospital admission.
Affiliation(s)
- Maxim Topaz
- Center for Home Care Policy and Research, Visiting Nurse Service of New York, New York, NY; Columbia University School of Nursing, Columbia University Data Science Institute, New York, NY
- Nicole Onorato
- Center for Home Care Policy and Research, Visiting Nurse Service of New York, New York, NY
- Arlene Smaldone
- Columbia University School of Nursing, Columbia University College of Dental Medicine, New York, NY
- Suzanne Bakken
- Columbia University School of Nursing, Columbia University Department of Biomedical Informatics, Columbia University Data Science Institute, New York, NY
|
38
|
Topaz M, Adams V, Wilson P, Woo K, Ryvicker M. Free-Text Documentation of Dementia Symptoms in Home Healthcare: A Natural Language Processing Study. Gerontol Geriatr Med 2020; 6:2333721420959861. [PMID: 33029550 PMCID: PMC7520927 DOI: 10.1177/2333721420959861] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/30/2020] [Revised: 08/18/2020] [Accepted: 08/21/2020] [Indexed: 01/11/2023] Open
Abstract
Background Little is known about symptom documentation related to Alzheimer's disease and related dementias (ADRD) by home healthcare (HHC) clinicians. Objective This study: (1) developed a natural language processing (NLP) algorithm that identifies common neuropsychiatric symptoms of ADRD in HHC free-text clinical notes; (2) described symptom clusters and hospitalization or emergency department (ED) visit rates for patients with and without these symptoms. Method We examined a corpus of ~2.6 million free-text notes for 112,237 HHC episodes among 89,459 patients admitted to a non-profit HHC agency for post-acute care with any diagnosis. We used NLP software (NimbleMiner) to construct indicators of six neuropsychiatric symptoms. Structured HHC assessment data were used to identify known ADRD diagnoses and construct measures of hospitalization/ED use during HHC. Results Neuropsychiatric symptoms were documented for 40% of episodes. Common clusters included impaired memory, anxiety and/or depressed mood. One in three episodes without an ADRD diagnosis had documented symptoms. Hospitalization/ED rates increased with one or more symptoms present. Conclusion HHC providers should examine episodes with neuropsychiatric symptoms but no ADRD diagnoses to determine whether ADRD diagnosis was missed or to recommend ADRD evaluation. NLP-generated symptom indicators can help to identify high-risk patients for targeted interventions.
Affiliation(s)
- Maxim Topaz
- Columbia University, New York, NY, USA; Visiting Nurse Service of New York, New York, NY, USA
- Paula Wilson
- Visiting Nurse Service of New York, New York, NY, USA
- Miriam Ryvicker
- Visiting Nurse Service of New York, New York, NY, USA; Vital Statistics Consulting, New York, NY, USA
|
39
|
Sterckx L, Vandewiele G, Dehaene I, Janssens O, Ongenae F, De Backere F, De Turck F, Roelens K, Decruyenaere J, Van Hoecke S, Demeester T. Clinical information extraction for preterm birth risk prediction. J Biomed Inform 2020; 110:103544. [PMID: 32858168 DOI: 10.1016/j.jbi.2020.103544] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/07/2020] [Revised: 08/18/2020] [Accepted: 08/20/2020] [Indexed: 10/23/2022]
Abstract
This paper contributes to the pursuit of leveraging unstructured medical notes for structured clinical decision making. In particular, we present a pipeline for clinical information extraction from medical notes related to preterm birth, and discuss the main challenges as well as its potential for clinical practice. A large collection of medical notes, created by staff during hospitalizations of patients who were at risk of delivering preterm, was gathered and analyzed. Based on an annotated collection of notes, we trained and evaluated information extraction components to discover clinical entities such as symptoms, events, anatomical sites and procedures, as well as attributes linked to these clinical entities. In a retrospective study, we show that these are highly informative for clinical decision support models that are trained to predict whether delivery is likely to occur within specific time windows, in combination with structured information from electronic health records.
Affiliation(s)
- Lucas Sterckx
- IDLab, Ghent University - imec, Technologiepark-Zwijnaarde 126, Ghent, Belgium
- Gilles Vandewiele
- IDLab, Ghent University - imec, Technologiepark-Zwijnaarde 126, Ghent, Belgium
- Isabelle Dehaene
- Department of Gynaecology and Obstetrics, Ghent University Hospital, Corneel Heymanslaan 10, Ghent, Belgium
- Olivier Janssens
- IDLab, Ghent University - imec, Technologiepark-Zwijnaarde 126, Ghent, Belgium
- Femke Ongenae
- IDLab, Ghent University - imec, Technologiepark-Zwijnaarde 126, Ghent, Belgium
- Femke De Backere
- IDLab, Ghent University - imec, Technologiepark-Zwijnaarde 126, Ghent, Belgium
- Filip De Turck
- IDLab, Ghent University - imec, Technologiepark-Zwijnaarde 126, Ghent, Belgium
- Kristien Roelens
- Department of Gynaecology and Obstetrics, Ghent University Hospital, Corneel Heymanslaan 10, Ghent, Belgium
- Johan Decruyenaere
- Department of Intensive Care Medicine, Ghent University Hospital, Corneel Heymanslaan 10, Ghent, Belgium
- Sofie Van Hoecke
- IDLab, Ghent University - imec, Technologiepark-Zwijnaarde 126, Ghent, Belgium
- Thomas Demeester
- IDLab, Ghent University - imec, Technologiepark-Zwijnaarde 126, Ghent, Belgium
|
40
|
Paulin J, Kurola J, Salanterä S, Moen H, Guragain N, Koivisto M, Käyhkö N, Aaltonen V, Iirola T. Changing role of EMS -analyses of non-conveyed and conveyed patients in Finland. Scand J Trauma Resusc Emerg Med 2020; 28:45. [PMID: 32471460 PMCID: PMC7260794 DOI: 10.1186/s13049-020-00741-w] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/02/2020] [Accepted: 05/20/2020] [Indexed: 12/16/2022] Open
Abstract
Background Emergency Medical Services (EMS) and Emergency Departments (ED) have seen increasing attendance rates in the last decades. Currently, EMS are increasingly assessing and treating patients without the need to convey patients to a health care facility. The aim of this study was to describe and compare the patient case-mix between conveyed and non-conveyed patients and to analyze factors related to non-conveyance decision making. Methods This was a prospective study of EMS patients in Finland, and data was collected between 1st June and 30th November 2018. Adjusted ICPC2-classification was used as the reason for care. NEWS2-points were collected and analyzed both statistically and with a semi-supervised information extraction method. EMS patients’ geographic location and distance to health care facilities were analyzed by urban–rural classification. Results Of the EMS patients (40,263), 59.8% were over 65 years of age and 46.0% of the patients had zero NEWS2 points. The most common ICPC2 code was weakness/tiredness, general (A04), as seen in 13.5% of all patients. When comparing patients between the non-conveyance and conveyance group, a total of 35,454 EMS patients met the inclusion criteria and 14,874 patients (42.0%) were not conveyed to health care facilities. According to the multivariable logistic regression model, the non-conveyance decision was more likely made by ALS units, when the EMS arrival time was in the evening or night and when the distance to the health care facility was 21-40 km. Furthermore, younger patients, female gender, whether the patient had used alcohol and a rural area were also related to the non-conveyance decision. If the patient’s NEWS2 score increased by one or two points, the likelihood of conveyance increased. Having less than 1 h left to complete a shift was not associated with either non-conveyance or conveyance decisions. Conclusions The role of EMS might be changing. 
This warrants redesigning the chain-of-survival in EMS to include not only high-risk patient groups but also non-critical and general acute patients with non-specific reasons for care. Assessment and on-scene treatment without conveyance can be called the “stretched arm of the emergency department”, but should be planned carefully to ensure patient safety.
Affiliation(s)
- Jani Paulin
- FinnHEMS Research and Development Unit, FinnHEMS Ltd, Vantaa, Finland; University of Turku (Doctoral Programme in Clinical Research (DPCR) / Medicine), Turku, Finland; Turku University of Applied Sciences, Turku, Finland
- Jouni Kurola
- Centre for Prehospital Emergency Care, Kuopio University Hospital and University of Eastern Finland, Kuopio, Finland
- Sanna Salanterä
- Department of Nursing Science, University of Turku and Turku University Hospital, Turku, Finland
- Hans Moen
- Department of Future Technologies, University of Turku, Turku, Finland
- Nischal Guragain
- Department of Future Technologies, University of Turku, Turku, Finland
- Mari Koivisto
- Department of Biostatistics, University of Turku, Turku, Finland
- Niina Käyhkö
- Department of Geography and Geology, University of Turku, Turku, Finland
- Venla Aaltonen
- Department of Geography and Geology, University of Turku, Turku, Finland
- Timo Iirola
- Emergency Medical Services, Turku University Hospital and University of Turku, Turku, Finland
|
41
|
Uronen L, Moen H, Teperi S, Martimo KP, Hartiala J, Salanterä S. Towards automated detection of psychosocial risk factors with text mining. Occup Med (Lond) 2020; 70:203-206. [DOI: 10.1093/occmed/kqaa022] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open
Abstract
Background
Psychosocial risk factors influence early retirement and absence from work. Health checks by occupational health nurses (OHNs) may prevent deterioration of work ability. Health checks are documented electronically mostly as free text, and therefore the effect of psychological risk factors on working capacity is difficult to detect.
Aims
To evaluate the potential of text mining for automated early detection of psychosocial risk factors by examining health check free-text documentation, which may indicate medical statements recommending early retirement, prolonged sick leave or rehabilitation. Psychosocial risk factors were extracted from OHN documentation in a nationwide occupational health care registry.
Methods
Analysis of health check documentation and medical statements regarding pension, sick leave and rehabilitation. Annotations of 13 psychosocial factors based on the Prima-EF standard (PAS 1010) were used with a combination of unsupervised machine learning, a document search engine and manual filtering.
Results
Health check documentation was analysed for 7078 employees. In 83% of their health checks, psychosocial risk factors were mentioned. All of these occurred more frequently in the group that received medical statements for pension, rehabilitation or sick leave than in the group that did not receive a medical statement. Documentation of career development and work control indicated future loss of work ability.
Conclusions
This study showed that it was possible to detect risk factors for sick leave, rehabilitation and pension from free-text documentation of health checks. We suggest developing a text mining tool to automate the detection of psychosocial risk factors at an early stage.
Affiliation(s)
- L Uronen
- Department of Nursing Science, University of Turku, Turku, Finland
- H Moen
- Department of Future Technologies, University of Turku, Turku, Finland
- S Teperi
- Department of Mathematics and Statistics, University of Turku, Turku, Finland
- K-P Martimo
- Ilmarinen Mutual Pension Insurance Company, Helsinki, Finland
- J Hartiala
- Department of Medicine, University of Turku, Turku, Finland
- S Salanterä
- Department of Nursing Science, University of Turku, Turku, Finland
|