Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Sammani A, Bagheri A, van der Heijden PGM, Te Riele ASJM, Baas AF, Oosters CAJ, Oberski D, Asselbergs FW. Automatic multilabel detection of ICD10 codes in Dutch cardiology discharge letters using neural networks. NPJ Digit Med 2021;4:37. [PMID: 33637859 DOI: 10.1038/s41746-021-00404-9] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 06/16/2020] [Accepted: 01/26/2021] [Indexed: 12/02/2022] Open

Cited by Other Article(s)

Seinen TM, Kors JA, van Mulligen EM, Fridgeirsson EA, Verhamme KM, Rijnbeek PR. Using clinical text to refine unspecific condition codes in Dutch general practitioner EHR data. Int J Med Inform 2024;189:105506. [PMID: 38820647 DOI: 10.1016/j.ijmedinf.2024.105506] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2023] [Revised: 05/22/2024] [Accepted: 05/27/2024] [Indexed: 06/02/2024]

Abstract

OBJECTIVE

Observational studies using electronic health record (EHR) databases often face challenges due to unspecific clinical codes that can obscure detailed medical information, hindering precise data analysis. In this study, we aimed to assess the feasibility of refining these unspecific condition codes into more specific codes in a Dutch general practitioner (GP) EHR database by leveraging the available clinical free text.

METHODS

We utilized three approaches for text classification (search queries, semi-supervised learning, and supervised learning) to improve the specificity of ten unspecific International Classification of Primary Care (ICPC-1) codes. Two text representations and three machine learning algorithms were evaluated for the (semi-)supervised models. Additionally, we measured the improvement achieved by the refinement process on all code occurrences in the database.

RESULTS

The classification models performed well for most codes. In general, no single classification approach consistently outperformed the others. However, there were variations in the relative performance of the classification approaches within each code and in the use of different text representations and machine learning algorithms. Class imbalance and limited training data affected the performance of the (semi-)supervised models, yet the simple search queries remained particularly effective. Ultimately, the developed models improved the specificity of over half of all the unspecific code occurrences in the database.

CONCLUSIONS

Our findings show the feasibility of using information from clinical text to improve the specificity of unspecific condition codes in observational healthcare databases, even with a limited range of machine-learning techniques and modest annotated training sets. Future work could investigate transfer learning, integration of structured data, alternative semi-supervised methods, and validation of models across healthcare settings. The improved level of detail enriches the interpretation of medical information and can benefit observational research and patient care.

Wattanachayakul P, Yanpiset P, Suenghataiphorn T, Srikulmontri T, Danpanichkul P, Rujirachun P, Polpichai N, Saowapa S, Casipit BA, Suparan K, Amanullah A. Impact of COVID-19 infection among patients hospitalized for conventional pacemaker implantation: Analysis of the Nationwide Inpatient Sample (NIS) 2020. J Arrhythm 2024;40:905-912. [PMID: 39139863 PMCID: PMC11317689 DOI: 10.1002/joa3.13089] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2024] [Revised: 05/06/2024] [Accepted: 05/21/2024] [Indexed: 08/15/2024] Open

Abstract

Introduction

The cardiac pacemaker is indicated for treating various types of bradyarrhythmia, providing lifelong cardiovascular benefits. Recent data showed that COVID-19 has impacted procedure numbers and led to adverse long-term outcomes in patients with cardiac pacemakers. However, the impact of COVID-19 infection on the in-hospital outcome of patients undergoing conventional pacemaker implantation remains unclear.

Method

Patients aged above 18 years who were hospitalized for conventional pacemaker implantation in the Nationwide In-patient Sample (NIS) 2020 were identified using relevant ICD-10 CM and PCS codes. Multivariable logistic and linear regression models were used to analyze pre-specified outcomes, with the primary outcome being in-patient mortality and secondary outcomes including system-based and procedure-related complications.

Results

Of 108 020 patients hospitalized for conventional pacemaker implantation, 0.71% (765 out of 108 020) had a concurrent diagnosis of COVID-19 infection. Individuals with COVID-19 infection exhibited a lower mean age (73.7 years vs. 75.9 years, p = .027) and a lower female proportion (39.87% vs. 47.60%, p = .062) than those without COVID-19. In the multivariable logistic and linear regression models, adjusted for patient and hospital factors, COVID-19 infection was associated with higher in-hospital mortality (aOR 4.67; 95% CI 2.02 to 10.27, p < .001), extended length of stay (5.23 days vs. 1.04 days, p < .001), and linked with various in-hospital complications, including sepsis, acute respiratory failure, post-procedural pneumothorax, and venous thromboembolism.

Conclusion

Our study suggests that COVID-19 infection is associated with higher in-hospital mortality, extended hospital stays, and increased adverse in-hospital outcomes in patients undergoing conventional pacemaker implantation.

Tavabi N, Singh M, Pruneski J, Kiapour AM. Systematic evaluation of common natural language processing techniques to codify clinical notes. PLoS One 2024;19:e0298892. [PMID: 38451905 PMCID: PMC10919678 DOI: 10.1371/journal.pone.0298892] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2023] [Accepted: 01/31/2024] [Indexed: 03/09/2024] Open

Boonstra MJ, Weissenbacher D, Moore JH, Gonzalez-Hernandez G, Asselbergs FW. Artificial intelligence: revolutionizing cardiology with large language models. Eur Heart J 2024;45:332-345. [PMID: 38170821 PMCID: PMC10834163 DOI: 10.1093/eurheartj/ehad838] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2023] [Revised: 12/01/2023] [Accepted: 12/05/2023] [Indexed: 01/05/2024] Open

Desai R, Katukuri N, Goguri SR, Kothawala A, Alle NR, Bellamkonda MK, Dey D, Ganesan S, Biswas M, Sarkar K, Prattipati P, Chauhan S. Prediabetes: An overlooked risk factor for major adverse cardiac and cerebrovascular events in atrial fibrillation patients. World J Diabetes 2024;15:24-33. [PMID: 38313858 PMCID: PMC10835500 DOI: 10.4239/wjd.v15.i1.24] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/25/2023] [Revised: 10/22/2023] [Accepted: 12/15/2023] [Indexed: 01/12/2024] Open

Abstract

BACKGROUND

Prediabetes is a well-established risk factor for major adverse cardiac and cerebrovascular events (MACCE). However, the relationship between prediabetes and MACCE in atrial fibrillation (AF) patients has not been extensively studied. Therefore, this study aimed to establish a link between prediabetes and MACCE in AF patients.

AIM

To investigate a link between prediabetes and MACCE in AF patients.

METHODS

We used the National Inpatient Sample (2019) and relevant ICD-10 CM codes to identify hospitalizations with AF and categorized them into groups with and without prediabetes, excluding diabetics. The primary outcome was MACCE (all-cause inpatient mortality, cardiac arrest including ventricular fibrillation, and stroke) in AF-related hospitalizations.

RESULTS

Of the 2965875 AF-related hospitalizations for MACCE, 47505 (1.6%) were among patients with prediabetes. The prediabetes cohort was relatively younger (median 75 vs 78 years), and often consisted of males (56.3% vs 51.4%), blacks (9.8% vs 7.9%), Hispanics (7.3% vs 4.3%), and Asians (4.7% vs 1.6%) than the non-prediabetic cohort (P < 0.001). The prediabetes group had significantly higher rates of hypertension, hyperlipidemia, smoking, obesity, drug abuse, prior myocardial infarction, peripheral vascular disease, and hyperthyroidism (all P < 0.05). The prediabetes cohort was often discharged routinely (51.1% vs 41.1%), but more frequently required home health care (23.6% vs 21.0%) and had higher costs. After adjusting for baseline characteristics or comorbidities, the prediabetes cohort with AF admissions showed a higher rate and significantly higher odds of MACCE compared to the non-prediabetic cohort [18.6% vs 14.7%, odds ratio (OR) 1.34, 95% confidence interval 1.26-1.42, P < 0.001]. On subgroup analyses, males had a stronger association (aOR 1.43) compared to females (aOR 1.22), whereas on the race-wise comparison, Hispanics (aOR 1.43) and Asians (aOR 1.36) had a stronger association with MACCE with prediabetes vs whites (aOR 1.33) and blacks (aOR 1.21).
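As a reading aid for the odds ratio reported above, the sketch below converts two event rates into a crude (unadjusted) odds ratio. It assumes the quoted 18.6% and 14.7% are crude MACCE rates in the two cohorts, which the abstract does not state explicitly; the function names are illustrative and not taken from the study. The published figure of 1.34 is an adjusted estimate from a regression model, so the crude value is only expected to be in the same neighbourhood.

```python
def odds(probability: float) -> float:
    """Convert a probability strictly between 0 and 1 into odds."""
    if not 0.0 < probability < 1.0:
        raise ValueError("probability must lie strictly between 0 and 1")
    return probability / (1.0 - probability)


def crude_odds_ratio(rate_exposed: float, rate_unexposed: float) -> float:
    """Unadjusted odds ratio from two event rates (no covariate adjustment)."""
    return odds(rate_exposed) / odds(rate_unexposed)


# Rates quoted in the abstract, assumed here to be crude MACCE rates.
prediabetes_rate = 0.186
non_prediabetes_rate = 0.147

crude_or = crude_odds_ratio(prediabetes_rate, non_prediabetes_rate)
print(f"crude odds ratio: {crude_or:.2f}")
```

Under that assumption the crude ratio comes out at roughly 1.33, close to the adjusted value in the abstract; any remaining difference would reflect the covariate adjustment, which this sketch does not attempt to reproduce.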

CONCLUSION

This population-based study found a significant association between prediabetes and MACCE in AF patients. Therefore, there is a need for further research to actively screen and manage prediabetes in AF to prevent MACCE.

Hobensack M, Song J, Oh S, Evans L, Davoudi A, Bowles KH, McDonald MV, Barrón Y, Sridharan S, Wallace AS, Topaz M. Social Risk Factors are Associated with Risk for Hospitalization in Home Health Care: A Natural Language Processing Study. J Am Med Dir Assoc 2023;24:1874-1880.e4. [PMID: 37553081 PMCID: PMC10839109 DOI: 10.1016/j.jamda.2023.06.031] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/04/2023] [Revised: 06/23/2023] [Accepted: 06/25/2023] [Indexed: 08/10/2023]

Abstract

OBJECTIVE

This study aimed to develop a natural language processing (NLP) system that identified social risk factors in home health care (HHC) clinical notes and to examine the association between social risk factors and hospitalization or an emergency department (ED) visit.

DESIGN

Retrospective cohort study.

SETTING AND PARTICIPANTS

We used standardized assessments and clinical notes from one HHC agency located in the northeastern United States. This included 86,866 episodes of care for 65,593 unique patients. Patients received HHC services between 2015 and 2017.

METHODS

Guided by HHC experts, we created a vocabulary of social risk factors that influence hospitalization or ED visit risk in the HHC setting. We then developed an NLP system to automatically identify social risk factors documented in clinical notes. We used an adjusted logistic regression model to examine the association between the NLP-based social risk factors and hospitalization or an ED visit.

RESULTS

On the basis of expert consensus, the following social risk factors emerged: Social Environment, Physical Environment, Education and Literacy, Food Insecurity, Access to Care, and Housing and Economic Circumstances. Our NLP system performed "very good" with an F score of 0.91. Approximately 4% of clinical notes (33% episodes of care) documented a social risk factor. The most frequently documented social risk factors were Physical Environment and Social Environment. Except for Housing and Economic Circumstances, all NLP-based social risk factors were associated with higher odds of hospitalization and ED visits.

CONCLUSIONS AND IMPLICATIONS

HHC clinicians assess and document social risk factors associated with hospitalizations and ED visits in their clinical notes. Future studies can explore the social risk factors documented in HHC to improve communication across the health care system and to predict patients at risk for being hospitalized or visiting the ED.

Chen PF, He TL, Lin SC, Chu YC, Kuo CT, Lai F, Wang SM, Zhu WX, Chen KC, Kuo LC, Hung FM, Lin YC, Tsai IC, Chiu CH, Chang SC, Yang CY. Training a Deep Contextualized Language Model for International Classification of Diseases, 10th Revision Classification via Federated Learning: Model Development and Validation Study. JMIR Med Inform 2022;10:e41342. [DOI: 10.2196/41342] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2022] [Revised: 10/03/2022] [Accepted: 10/08/2022] [Indexed: 11/12/2022] Open

Abstract

Background

The automatic coding of clinical text documents by using the International Classification of Diseases, 10th Revision (ICD-10) can be performed for statistical analyses and reimbursements. With the development of natural language processing models, new transformer architectures with attention mechanisms have outperformed previous models. Although multicenter training may increase a model’s performance and external validity, the privacy of clinical documents should be protected. We used federated learning to train a model with multicenter data, without sharing data per se.

Objective

This study aims to train a classification model via federated learning for ICD-10 multilabel classification.

Methods

Text data from discharge notes in electronic medical records were collected from the following three medical centers: Far Eastern Memorial Hospital, National Taiwan University Hospital, and Taipei Veterans General Hospital. After comparing the performance of different variants of bidirectional encoder representations from transformers (BERT), PubMedBERT was chosen for the word embeddings. With regard to preprocessing, the nonalphanumeric characters were retained because the model’s performance decreased after the removal of these characters. To explain the outputs of our model, we added a label attention mechanism to the model architecture. The model was trained with data from each of the three hospitals separately and via federated learning. The models trained via federated learning and the models trained with local data were compared on a testing set that was composed of data from the three hospitals. The micro F1 score was used to evaluate model performance across all 3 centers.

Results

The F1 scores of PubMedBERT, RoBERTa (Robustly Optimized BERT Pretraining Approach), ClinicalBERT, and BioBERT (BERT for Biomedical Text Mining) were 0.735, 0.692, 0.711, and 0.721, respectively. The F1 score of the model that retained nonalphanumeric characters was 0.8120, whereas the F1 score after removing these characters was 0.7875, a decrease of 0.0245 (3.11%). The F1 scores on the testing set were 0.6142, 0.4472, 0.5353, and 0.2522 for the federated learning, Far Eastern Memorial Hospital, National Taiwan University Hospital, and Taipei Veterans General Hospital models, respectively. The explainable predictions were displayed with highlighted input words via the label attention architecture.

Conclusions

Federated learning was used to train the ICD-10 classification model on multicenter clinical text while protecting data privacy. The model’s performance was better than that of models that were trained locally.
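The abstract above reports micro F1 scores for multilabel ICD-10 classification. As a minimal sketch of how that metric is commonly computed, the function below pools true positives, false positives, and false negatives over all documents and labels before taking the harmonic mean of precision and recall. The function name, the toy documents, and the example label sets are illustrative assumptions and are not drawn from the study's data or code.

```python
from typing import Iterable, Sequence


def micro_f1(y_true: Sequence[Iterable[str]], y_pred: Sequence[Iterable[str]]) -> float:
    """Micro-averaged F1 for multilabel data given as per-document label collections."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same number of documents")
    tp = fp = fn = 0
    for true_labels, pred_labels in zip(y_true, y_pred):
        true_set, pred_set = set(true_labels), set(pred_labels)
        tp += len(true_set & pred_set)   # labels correctly predicted
        fp += len(pred_set - true_set)   # labels predicted but not present
        fn += len(true_set - pred_set)   # labels present but missed
    denominator = 2 * tp + fp + fn
    # Convention chosen here: return 0.0 when there are no labels at all.
    return 2 * tp / denominator if denominator else 0.0


# Toy example with made-up discharge notes (hypothetical label assignments).
gold = [["I10", "I50"], ["I48"], ["I21", "I10"]]
predicted = [["I10"], ["I48", "I50"], ["I21", "I10"]]

score = micro_f1(gold, predicted)
print(f"micro F1: {score:.2f}")
```

In this toy case there are 4 true positives, 1 false positive, and 1 false negative, giving 2·4 / (2·4 + 1 + 1) = 0.80. Because counts are pooled before averaging, frequent codes dominate the score, which is one reason micro F1 is often paired with a macro-averaged figure when rare codes matter.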

Sammani A, Jansen M, de Vries NM, de Jonge N, Baas AF, te Riele ASJM, Asselbergs FW, Oerlemans MIFJ. Automatic Identification of Patients With Unexplained Left Ventricular Hypertrophy in Electronic Health Record Data to Improve Targeted Treatment and Family Screening. Front Cardiovasc Med 2022;9:768847. [PMID: 35498038 PMCID: PMC9051030 DOI: 10.3389/fcvm.2022.768847] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 09/01/2021] [Accepted: 02/18/2022] [Indexed: 11/29/2022] Open

Blanco A, Remmer S, Pérez A, Dalianis H, Casillas A. Implementation of specialised attention mechanisms: ICD-10 classification of Gastrointestinal discharge summaries in English, Spanish and Swedish. J Biomed Inform 2022;130:104050. [DOI: 10.1016/j.jbi.2022.104050] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/27/2021] [Revised: 01/31/2022] [Accepted: 03/07/2022] [Indexed: 11/30/2022]

Siegersma KR, Evers M, Bots SH, Groepenhoff F, Appelman Y, Hofstra L, Tulevski II, Somsen GA, den Ruijter HM, Spruit M, Onland-Moret NC. Development of a Pipeline for Adverse Drug Reaction Identification in Clinical Notes: Word Embedding Models and String Matching. JMIR Med Inform 2022;10:e31063. [PMID: 35076407 PMCID: PMC8826143 DOI: 10.2196/31063] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/10/2021] [Revised: 11/02/2021] [Accepted: 11/14/2021] [Indexed: 12/02/2022] Open

Abstract

Background

Knowledge about adverse drug reactions (ADRs) in the population is limited because of underreporting, which hampers surveillance and assessment of drug safety. Therefore, gathering accurate information that can be retrieved from clinical notes about the incidence of ADRs is of great relevance. However, manual labeling of these notes is time-consuming, and automatization can improve the use of free-text clinical notes for the identification of ADRs. Furthermore, tools for language processing in languages other than English are not widely available.

Objective

The aim of this study is to design and evaluate a method for automatic extraction of medication and Adverse Drug Reaction Identification in Clinical Notes (ADRIN).

Methods

Dutch free-text clinical notes (N=277,398) and medication registrations (N=499,435) from the Cardiology Centers of the Netherlands database were used. All clinical notes were used to develop word embedding models. Vector representations of word embedding models and string matching with a medical dictionary (Medical Dictionary for Regulatory Activities [MedDRA]) were used for identification of ADRs and medication in a test set of clinical notes that were manually labeled. Several settings, including search area and punctuation, could be adjusted in the prototype to evaluate the optimal version of the prototype.
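The dictionary string-matching step described above can be illustrated with a small sketch. Everything in it is hypothetical: the function name, the three-term dictionary, the English example note (the study used Dutch notes and MedDRA), and the single `keep_punctuation` switch, which only loosely mirrors the adjustable punctuation setting mentioned in the abstract. It is not the ADRIN implementation.

```python
import re
from typing import Iterable, List


def find_dictionary_terms(note: str, dictionary: Iterable[str],
                          keep_punctuation: bool = False) -> List[str]:
    """Return dictionary terms that occur as whole words or phrases in a note."""
    text = note.lower()
    if not keep_punctuation:
        # Replace punctuation with spaces so that "cough," still matches "cough".
        text = re.sub(r"[^\w\s]", " ", text)
    text = " ".join(text.split())  # collapse repeated whitespace
    hits = []
    for term in sorted(dictionary):
        pattern = r"\b" + re.escape(term.lower()) + r"\b"
        if re.search(pattern, text):
            hits.append(term)
    return hits


# Hypothetical mini-dictionary and note, for illustration only.
adr_terms = {"dizziness", "dry cough", "ankle oedema"}
note = "Stopped ACE inhibitor: patient reports dry cough, no dizziness."

matches = find_dictionary_terms(note, adr_terms)
print(matches)
```

Note that plain string matching also returns "dizziness" here even though the note negates it, which illustrates one limitation of matching dictionary terms without any context handling.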

Results

The ADRIN method was evaluated using a test set of 988 clinical notes written on the stop date of a drug. Multiple versions of the prototype were evaluated for a variety of tasks. Binary classification of ADR presence achieved the highest accuracy of 0.84. Reduced search area and inclusion of punctuation improved performance, whereas incorporation of the MedDRA did not improve the performance of the pipeline.

Conclusions

The ADRIN method and prototype are effective in recognizing ADRs in Dutch clinical notes from cardiac diagnostic screening centers. Surprisingly, incorporation of the MedDRA did not result in improved identification on top of word embedding models. The implementation of the ADRIN tool may help increase the identification of ADRs, resulting in better care and saving substantial health care costs.

Asselbergs FW, Fraser AG. Artificial intelligence in cardiology: the debate continues. Eur Heart J Digit Health 2021;2:721-726. [PMID: 36713089 PMCID: PMC9708032 DOI: 10.1093/ehjdh/ztab090] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 09/28/2021] [Accepted: 10/12/2021] [Indexed: 02/01/2023]

Automatic Prediction of Recurrence of Major Cardiovascular Events: A Text Mining Study Using Chest X-Ray Reports. J Healthc Eng 2021;2021:6663884. [PMID: 34306597 PMCID: PMC8285182 DOI: 10.1155/2021/6663884] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/14/2020] [Revised: 05/29/2021] [Accepted: 06/29/2021] [Indexed: 11/17/2022]

Abstract

Methods

We used EHR data of patients included in the Second Manifestations of ARTerial disease (SMART) study. We propose a deep learning-based multimodal architecture for our text mining pipeline that integrates neural text representation with preprocessed clinical predictors for the prediction of recurrence of major cardiovascular events in cardiovascular patients. Text preprocessing, including cleaning and stemming, was first applied to filter out the unwanted texts from X-ray radiology reports. Thereafter, text representation methods were used to numerically represent unstructured radiology reports with vectors. Subsequently, these text representation methods were added to prediction models to assess their clinical relevance. In this step, we applied logistic regression, support vector machine (SVM), multilayer perceptron neural network, convolutional neural network, long short-term memory (LSTM), and bidirectional LSTM deep neural network (BiLSTM).

Results

We performed various experiments to evaluate the added value of the text in the prediction of major cardiovascular events. The two main scenarios were the integration of radiology reports (1) with classical clinical predictors and (2) with only age and sex in the case of unavailable clinical predictors. In total, data of 5603 patients were used with 5-fold cross-validation to train the models. In the first scenario, the multimodal BiLSTM (MI-BiLSTM) model achieved an area under the curve (AUC) of 84.7%, misclassification rate of 14.3%, and F1 score of 83.8%. In this scenario, the SVM model, trained on clinical variables and bag-of-words representation, achieved the lowest misclassification rate of 12.2%. In the case of unavailable clinical predictors, the MI-BiLSTM model trained on radiology reports and demographic (age and sex) variables reached an AUC, F1 score, and misclassification rate of 74.5%, 70.8%, and 20.4%, respectively.

Conclusions

Using the case study of routine care chest X-ray radiology reports, we demonstrated the clinical relevance of integrating text features and classical predictors in our text mining pipeline for cardiovascular risk prediction. The MI-BiLSTM model with word embedding representation appeared to have a desirable performance when trained on text data integrated with the clinical variables from the SMART study. Our results mined from chest X-ray reports showed that models using text data in addition to laboratory values outperform those using only known clinical predictors.
