1
Seinen TM, Kors JA, van Mulligen EM, Rijnbeek PR. Annotation-preserving machine translation of English corpora to validate Dutch clinical concept extraction tools. J Am Med Inform Assoc 2024; 31:1725-1734. [PMID: 38934643 PMCID: PMC11258409 DOI: 10.1093/jamia/ocae159] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/25/2024] [Revised: 05/24/2024] [Accepted: 06/10/2024] [Indexed: 06/28/2024] Open
Abstract
OBJECTIVE To explore the feasibility of validating Dutch concept extraction tools using annotated corpora translated from English, focusing on preserving annotations during translation and addressing the scarcity of non-English annotated clinical corpora. MATERIALS AND METHODS Three annotated corpora were standardized and translated from English to Dutch using 2 machine translation services, Google Translate and OpenAI GPT-4, with annotations preserved through a proposed method of embedding annotations in the text before translation. The performance of 2 concept extraction tools, MedSpaCy and MedCAT, was assessed across the corpora in both Dutch and English. RESULTS The translation process effectively generated Dutch annotated corpora and the concept extraction tools performed similarly in both English and Dutch. Although there were some differences in how annotations were preserved across translations, these did not affect extraction accuracy. Supervised MedCAT models consistently outperformed unsupervised models, whereas MedSpaCy demonstrated high recall but lower precision. DISCUSSION Our validation of Dutch concept extraction tools on corpora translated from English was successful, highlighting the efficacy of our annotation preservation method and the potential for efficiently creating multilingual corpora. Further improvements and comparisons of annotation preservation techniques and strategies for corpus synthesis could lead to more efficient development of multilingual corpora and accurate non-English concept extraction tools. CONCLUSION This study has demonstrated that translated English corpora can be used to validate non-English concept extraction tools. The annotation preservation method used during translation proved effective, and future research can apply this corpus translation method to additional languages and clinical settings.
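The idea of embedding annotations in the text before translation, as described in the abstract above, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the `[[index|mention]]` marker format, the function names `embed` and `extract`, and the hand-written Dutch sentence standing in for machine translation output are all assumptions made for this example. It also assumes annotation spans do not overlap.

```python
import re

# Hypothetical marker format: [[<annotation index>|<mention text>]]
MARKER = re.compile(r"\[\[(\d+)\|(.*?)\]\]")


def embed(text, spans):
    """Wrap each (start, end, label) span in an inline marker before translation."""
    parts, cursor = [], 0
    for index, (start, end, _label) in enumerate(sorted(spans)):
        parts.append(text[cursor:start])
        parts.append(f"[[{index}|{text[start:end]}]]")
        cursor = end
    parts.append(text[cursor:])
    return "".join(parts)


def extract(marked, labels):
    """Strip markers from (translated) text and rebuild character-offset spans."""
    parts, spans, cursor, length = [], [], 0, 0
    for match in MARKER.finditer(marked):
        before = marked[cursor:match.start()]
        mention = match.group(2)
        parts.extend([before, mention])
        start = length + len(before)
        spans.append((start, start + len(mention), labels[int(match.group(1))]))
        length = start + len(mention)
        cursor = match.end()
    parts.append(marked[cursor:])
    return "".join(parts), spans


text = "Patient has fever and cough."
spans = [(12, 17, "SYMPTOM"), (22, 27, "SYMPTOM")]
labels = [label for _, _, label in sorted(spans)]

marked = embed(text, spans)
# Stand-in for the output of a translation service (written by hand here).
translated = "Patiënt heeft [[0|koorts]] en [[1|hoest]]."

dutch_text, dutch_spans = extract(translated, labels)
for start, end, label in dutch_spans:
    print(label, dutch_text[start:end])
```

The design point is that the markers travel through translation with the mention they wrap, so offsets in the target language are recomputed from the translated text instead of being projected from the English offsets.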
Affiliation(s)
- Tom M Seinen
  - Department of Medical Informatics, Erasmus University Medical Center, 3015 GD Rotterdam, The Netherlands
- Jan A Kors
  - Department of Medical Informatics, Erasmus University Medical Center, 3015 GD Rotterdam, The Netherlands
- Erik M van Mulligen
  - Department of Medical Informatics, Erasmus University Medical Center, 3015 GD Rotterdam, The Netherlands
- Peter R Rijnbeek
  - Department of Medical Informatics, Erasmus University Medical Center, 3015 GD Rotterdam, The Netherlands
2
Gu S, Lee EW, Zhang W, Simpson RL, Hertzberg VS, Ho JC. Evaluating Natural Language Processing Packages for Predicting Hospital-Acquired Pressure Injuries From Clinical Notes. Comput Inform Nurs 2024; 42:184-192. [PMID: 37607706 PMCID: PMC10884344 DOI: 10.1097/cin.0000000000001053] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/24/2023]
Abstract
Incidence of hospital-acquired pressure injury, a key indicator of nursing quality, is directly proportional to adverse outcomes, increased hospital stays, and economic burdens on patients, caregivers, and society. Thus, predicting hospital-acquired pressure injury is important. Prediction models use structured data more often than unstructured notes, although the latter often contain useful patient information. We hypothesize that unstructured notes, such as nursing notes, can predict hospital-acquired pressure injury. We evaluate the impact of using various natural language processing packages to identify salient patient information from unstructured text. We use named entity recognition to identify keywords, which comprise the feature space of our classifier for hospital-acquired pressure injury prediction. We compare scispaCy and Stanza, two different named entity recognition models, using unstructured notes in Medical Information Mart for Intensive Care III, a publicly available ICU data set. To assess the impact of vocabulary size reduction, we compare the use of all clinical notes with only nursing notes. Our results suggest that named entity recognition extraction using nursing notes can yield accurate models. Moreover, the extracted keywords play a significant role in the prediction of hospital-acquired pressure injury.
Affiliation(s)
- Siyi Gu
  - Department of Computer Science, Center for Data Science (Ms Gu, Mr Lee, and Dr Ho), and Nell Hodgson Woodruff School of Nursing (Drs Zhang, Simpson, and Hertzberg), Emory University, Atlanta, GA
3
Han L, Gladkoff S, Erofeev G, Sorokina I, Galiano B, Nenadic G. Neural machine translation of clinical text: an empirical investigation into multilingual pre-trained language models and transfer-learning. Front Digit Health 2024; 6:1211564. [PMID: 38468693 PMCID: PMC10926203 DOI: 10.3389/fdgth.2024.1211564] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/24/2023] [Accepted: 01/12/2024] [Indexed: 03/13/2024] Open
Abstract
Clinical text and documents contain very rich information and knowledge in healthcare, and their processing using state-of-the-art language technology has become very important for building intelligent systems that support healthcare and social good. This processing includes creating language understanding models and translating resources into other natural languages to share domain-specific cross-lingual knowledge. In this work, we investigate clinical text machine translation by examining multilingual neural network models using deep learning, such as Transformer-based structures. Furthermore, to address the language resource imbalance issue, we also carry out experiments using a transfer learning methodology based on massive multilingual pre-trained language models (MMPLMs). The experimental results on three sub-tasks, (1) clinical case (CC), (2) clinical terminology (CT), and (3) ontological concept (OC), show that our models achieved top-level performance in the ClinSpEn-2022 shared task on English-Spanish clinical domain data. Furthermore, our expert-based human evaluations demonstrate that the small-sized pre-trained language model (PLM) outperformed the other two extra-large language models by a large margin in clinical domain fine-tuning, a finding that had not previously been reported in the field. Finally, the transfer learning method works well in our experimental setting, using the WMT21fb model to accommodate a new language space, Spanish, that was not seen at the pre-training stage within WMT21fb itself; this deserves further exploration for clinical knowledge transformation, e.g. by investigating more languages. These research findings can shed some light on domain-specific machine translation development, especially in clinical and healthcare fields. Further research projects can be carried out based on our work to improve healthcare text analytics and knowledge transformation.
Our data is openly available for research purposes at: https://github.com/HECTA-UoM/ClinicalNMT.
Affiliation(s)
- Lifeng Han
  - Department of Computer Science, The University of Manchester, Manchester, United Kingdom
- Serge Gladkoff
  - AI Lab, Logrus Global, Translation & Localization, Philadelphia, PA, United States
- Gleb Erofeev
  - AI Lab, Logrus Global, Translation & Localization, Philadelphia, PA, United States
- Irina Sorokina
  - AI Lab, Logrus Global, Translation & Localization, Philadelphia, PA, United States
- Betty Galiano
  - Management Department, Ocean Translations, Rosario, Argentina
- Goran Nenadic
  - Department of Computer Science, The University of Manchester, Manchester, United Kingdom
4
Chien A, Tang H, Jagessar B, Chang KW, Peng N, Nael K, Salamon N. AI-Assisted Summarization of Radiologic Reports: Evaluating GPT3davinci, BARTcnn, LongT5booksum, LEDbooksum, LEDlegal, and LEDclinical. AJNR Am J Neuroradiol 2024; 45:244-248. [PMID: 38238092 DOI: 10.3174/ajnr.a8102] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2023] [Accepted: 11/09/2023] [Indexed: 02/09/2024]
Abstract
BACKGROUND AND PURPOSE The review of clinical reports is an essential part of monitoring disease progression. Synthesizing multiple imaging reports is also important for clinical decisions. It is critical to aggregate information quickly and accurately. Machine learning natural language processing (NLP) models hold promise to address an unmet need for report summarization. MATERIALS AND METHODS We evaluated NLP methods to summarize longitudinal aneurysm reports. A total of 137 clinical reports and 100 PubMed case reports were used in this study. Models were 1) compared against expert-generated summary using longitudinal imaging notes collected in our institute and 2) compared using publicly accessible PubMed case reports. Five AI models were used to summarize the clinical reports, and a sixth model, the online GPT3davinci NLP large language model (LLM), was added for the summarization of PubMed case reports. We assessed the summary quality through comparison with expert summaries using quantitative metrics and quality reviews by experts. RESULTS In clinical summarization, BARTcnn had the best performance (BERTscore = 0.8371), followed by LongT5Booksum and LEDlegal. In the analysis using PubMed case reports, GPT3davinci demonstrated the best performance, followed by models BARTcnn and then LEDbooksum (BERTscore = 0.894, 0.872, and 0.867, respectively). CONCLUSIONS AI NLP summarization models demonstrated great potential in summarizing longitudinal aneurysm reports, though none yet reached the level of quality for clinical usage. We found the online GPT LLM outperformed the others; however, the BARTcnn model is potentially more useful because it can be implemented on-site. Future work to improve summarization, address other types of neuroimaging reports, and develop structured reports may allow NLP models to ease clinical workflow.
Affiliation(s)
- Aichi Chien
  - Department of Radiological Science (A.C., H.T., B.J., K.N., N.S.), David Geffen School of Medicine at UCLA, Los Angeles, California
- Hubert Tang
  - Department of Radiological Science (A.C., H.T., B.J., K.N., N.S.), David Geffen School of Medicine at UCLA, Los Angeles, California
- Bhavita Jagessar
  - Department of Radiological Science (A.C., H.T., B.J., K.N., N.S.), David Geffen School of Medicine at UCLA, Los Angeles, California
- Kai-Wei Chang
  - Department of Computer Science (K.C., N.P.), University of California, Los Angeles, Los Angeles, California
- Nanyun Peng
  - Department of Computer Science (K.C., N.P.), University of California, Los Angeles, Los Angeles, California
- Kambiz Nael
  - Department of Radiological Science (A.C., H.T., B.J., K.N., N.S.), David Geffen School of Medicine at UCLA, Los Angeles, California
- Noriko Salamon
  - Department of Radiological Science (A.C., H.T., B.J., K.N., N.S.), David Geffen School of Medicine at UCLA, Los Angeles, California
5
Tsai CH, Liu KH, Cheng DC. Remote Diagnosis on Upper Respiratory Tract Infections Based on a Neural Network with Few Symptom Words-A Feasibility Study. Diagnostics (Basel) 2024; 14:329. [PMID: 38337845 PMCID: PMC10855815 DOI: 10.3390/diagnostics14030329] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2023] [Revised: 01/23/2024] [Accepted: 01/30/2024] [Indexed: 02/12/2024] Open
Abstract
This study aims to explore the feasibility of using neural networks (NNs) and deep learning to diagnose three common respiratory diseases from a few symptom words. These three diseases are nasopharyngitis, upper respiratory infection, and bronchitis/bronchiolitis. Through natural language processing, the symptom word vectors are encoded by GPT-2 and classified by the last linear layer of the NN. The experimental results are promising, showing that this model achieves high performance in predicting all three diseases. The model reached 90% accuracy, which highlights its potential use in helping patients understand their conditions via remote diagnosis. Unlike previous studies that have focused on extracting various categories of information from medical records, this study directly extracts sequential features from unstructured text data, reducing the effort required for data pre-processing.
Affiliation(s)
- Chung-Hung Tsai
  - Institute of Allied Health Sciences, College of Medicine, National Cheng Kung University, Tainan 701, Taiwan
  - Department of Family Medicine, An Nan Hospital, China Medical University, Tainan 709, Taiwan
- Kuan-Hung Liu
  - School of Medicine, China Medical University, Taichung 404, Taiwan
- Da-Chuan Cheng
  - Department of Biomedical Imaging and Radiological Science, China Medical University, Taichung 404, Taiwan
6
Hacking C, Verbeek H, Hamers JPH, Aarts S. Comparing text mining and manual coding methods: Analysing interview data on quality of care in long-term care for older adults. PLoS One 2023; 18:e0292578. [PMID: 37939098 PMCID: PMC10631650 DOI: 10.1371/journal.pone.0292578] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2023] [Accepted: 09/24/2023] [Indexed: 11/10/2023] Open
Abstract
OBJECTIVES In long-term care for older adults, large amounts of text are collected relating to the quality of care, such as transcribed interviews. Researchers currently analyze textual data manually to gain insights, which is a time-consuming process. Text mining could provide a solution, as this methodology can be used to analyze large amounts of text automatically. This study aims to compare text mining to manual coding with regard to sentiment analysis and thematic content analysis. METHODS Data were collected from interviews with residents (n = 21), family members (n = 20), and care professionals (n = 20). Text mining models were developed and compared to the manual approach. The results of the manual and text mining approaches were evaluated based on three criteria: accuracy, consistency, and expert feedback. Accuracy assessed the similarity between the two approaches, while consistency determined whether each individual approach found the same themes in similar text segments. Expert feedback served as a representation of the perceived correctness of the text mining approach. RESULTS An accuracy analysis revealed that more than 80% of the text segments were assigned the same themes and sentiment using both text mining and manual approaches. Interviews coded with text mining demonstrated higher consistency compared to those coded manually. Expert feedback identified certain limitations in both the text mining and manual approaches. CONCLUSIONS AND IMPLICATIONS While these analyses highlighted the current limitations of text mining, they also exposed certain inconsistencies in manual analysis. This information suggests that text mining has the potential to be an effective and efficient tool for analysing large volumes of textual data in the context of long-term care for older adults.
Affiliation(s)
- Coen Hacking
  - Faculty of Health Medicine and Life Sciences, Department of Health Services Research, CAPHRI Care and Public Health Research Institute, Maastricht University, Maastricht, The Netherlands
  - The Living Lab in Ageing & Long-Term Care, Maastricht, The Netherlands
- Hilde Verbeek
  - Faculty of Health Medicine and Life Sciences, Department of Health Services Research, CAPHRI Care and Public Health Research Institute, Maastricht University, Maastricht, The Netherlands
  - The Living Lab in Ageing & Long-Term Care, Maastricht, The Netherlands
- Jan P. H. Hamers
  - Faculty of Health Medicine and Life Sciences, Department of Health Services Research, CAPHRI Care and Public Health Research Institute, Maastricht University, Maastricht, The Netherlands
  - The Living Lab in Ageing & Long-Term Care, Maastricht, The Netherlands
- Sil Aarts
  - Faculty of Health Medicine and Life Sciences, Department of Health Services Research, CAPHRI Care and Public Health Research Institute, Maastricht University, Maastricht, The Netherlands
  - The Living Lab in Ageing & Long-Term Care, Maastricht, The Netherlands
7
Szekér S, Fogarassy G, Vathy-Fogarassy Á. A general text mining method to extract echocardiography measurement results from echocardiography documents. Artif Intell Med 2023; 143:102584. [PMID: 37673570 DOI: 10.1016/j.artmed.2023.102584] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/03/2022] [Revised: 03/08/2023] [Accepted: 05/16/2023] [Indexed: 09/08/2023]
Abstract
BACKGROUND In everyday medical practice, the results of cardiac ultrasound examinations are generally recorded in unstructured text, from which extracting relevant information is an important and challenging task. This paper presents a generally applicable, language- and corpus-independent text mining method for extracting and structuring numerical measurement results and their descriptions from echocardiography reports. METHOD The developed method is based on generally applicable text mining preprocessing activities. It automatically identifies and standardizes the descriptions of the cardiac ultrasound measures, and it stores the extracted and standardized measurement descriptions with their measurement results in a structured form for later use. The method does not contain any regular expression-based search and does not rely on information about the structure of the document. RESULTS The method was tested on a document set containing more than 20,000 echocardiographic reports by examining the efficiency of extracting 12 echocardiography parameters considered important by experts. The method extracted and structured the echocardiography parameters under study with good sensitivity (lowest value: 0.775, highest value: 1.0, average: 0.904) and excellent specificity (1.0 in all cases). The F1 score ranged between 0.873 and 1.0, and its average value was 0.948. CONCLUSION The presented case study has shown that the proposed method can extract measurement results from echocardiography documents with high confidence without performing a direct search or having detailed information about the data recording habits. Furthermore, it effectively handles spelling errors, abbreviations, and the highly varied terminology used in descriptions. As it does not rely on any information related to the structure or the language of the documents or data recording habits, it can be applied to any free-text medical document.
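The reported figures hang together arithmetically. A specificity of 1.0 means no false positives, so precision is also 1.0 (assuming at least one true positive), and F1 reduces to 2 x sensitivity / (1 + sensitivity). The short sketch below checks this against the lowest reported sensitivity; the function name `f1_score` is chosen for this example and is not from the paper.

```python
def f1_score(sensitivity: float, precision: float = 1.0) -> float:
    """Harmonic mean of precision and recall (recall is the same as sensitivity)."""
    if sensitivity + precision == 0:
        return 0.0
    return 2 * precision * sensitivity / (precision + sensitivity)


# Lowest and highest sensitivities reported in the abstract, with precision 1.0
# implied by the reported specificity of 1.0.
for sensitivity in (0.775, 1.0):
    print(sensitivity, round(f1_score(sensitivity), 3))
```

With a sensitivity of 0.775 this gives an F1 of about 0.873, which matches the lower bound of the F1 range quoted in the abstract.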
Affiliation(s)
- Szabolcs Szekér
  - Department of Computer Science and Systems Technology, University of Pannonia, Veszprém, Hungary
- György Fogarassy
  - 1st Department of Cardiology, State Hospital for Cardiology, Balatonfüred, Hungary
- Ágnes Vathy-Fogarassy
  - Department of Computer Science and Systems Technology, University of Pannonia, Veszprém, Hungary
8
Frei J, Kramer F. German Medical Named Entity Recognition Model and Data Set Creation Using Machine Translation and Word Alignment: Algorithm Development and Validation. JMIR Form Res 2023; 7:e39077. [PMID: 36853741 PMCID: PMC10015355 DOI: 10.2196/39077] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/27/2022] [Revised: 09/11/2022] [Accepted: 11/03/2022] [Indexed: 11/06/2022] Open
Abstract
BACKGROUND Data mining in the field of medical data analysis often needs to rely solely on the processing of unstructured data to retrieve relevant data. For German natural language processing, few open medical neural named entity recognition (NER) models have been published before this work. A major issue can be attributed to the lack of German training data. OBJECTIVE We developed a synthetic data set and a novel German medical NER model for public access to demonstrate the feasibility of our approach. In order to bypass legal restrictions due to potential data leaks through model analysis, we did not make use of internal, proprietary data sets, which is a frequent veto factor for data set publication. METHODS The underlying German data set was retrieved by translation and word alignment of a public English data set. The data set served as a foundation for model training and evaluation. For demonstration purposes, our NER model follows a simple network architecture that is designed for low computational requirements. RESULTS The obtained data set consisted of 8599 sentences including 30,233 annotations. The model achieved a class frequency-averaged F1 score of 0.82 on the test set after training across 7 different NER types. Artifacts in the synthesized data set with regard to translation and alignment induced by the proposed method were exposed. The annotation performance was evaluated on an external data set and measured in comparison with an existing baseline model that has been trained on a dedicated German data set in a traditional fashion. We discussed the drop in annotation performance on an external data set for our simple NER model. Our model is publicly available. CONCLUSIONS We demonstrated the feasibility of obtaining a data set and training a German medical NER model by the exclusive use of public training data through our suggested method. 
The discussion on the limitations of our approach includes ways to further mitigate remaining problems in future work.
Affiliation(s)
- Johann Frei
  - IT Infrastructure for Translational Medical Research, University of Augsburg, Augsburg, Germany
- Frank Kramer
  - IT Infrastructure for Translational Medical Research, University of Augsburg, Augsburg, Germany
9
López B, Raya O, Baykova E, Saez M, Rigau D, Cunill R, Mayoral S, Carrion C, Serrano D, Castells X. APPRAISE-RS: Automated, updated, participatory, and personalized treatment recommender systems based on GRADE methodology. Heliyon 2023; 9:e13074. [PMID: 36798764 PMCID: PMC9925880 DOI: 10.1016/j.heliyon.2023.e13074] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2022] [Revised: 01/04/2023] [Accepted: 01/16/2023] [Indexed: 01/26/2023] Open
Abstract
Purpose Clinical practice guidelines (CPGs) have become fundamental tools for evidence-based medicine (EBM). However, CPGs suffer from several limitations, including obsolescence, lack of applicability to many patients, and limited patient participation. This paper presents APPRAISE-RS, a methodology that we developed to overcome these limitations by automating, extending, and iterating the methodology most commonly used for building CPGs: the GRADE methodology. Method APPRAISE-RS relies on updated information from clinical studies and adapts and automates the GRADE methodology to generate treatment recommendations. APPRAISE-RS provides personalized recommendations because they are based on the patient's individual characteristics. Moreover, both patients and clinicians express their personal preferences for treatment outcomes, which are considered when making the recommendation (participatory). Rule-based system approaches are used to manage heuristic knowledge. Results APPRAISE-RS has been implemented for attention deficit hyperactivity disorder (ADHD) and tested experimentally on 28 simulated patients. The resulting recommender system (APPRAISE-RS/TDApp) shows a higher degree of treatment personalization and patient participation than CPGs, while recommending the most frequent interventions in the largest body of evidence in the literature (EBM). Moreover, a comparison of the results with four blinded psychiatrist prescriptions supports the validation of the proposal. Conclusions APPRAISE-RS is a valid methodology for building recommender systems that manage updated, personalized, and participatory recommendations, which, in the case of ADHD, include at least one intervention that is identical or very similar to drugs prescribed by psychiatrists.
Affiliation(s)
- Beatriz López
  - Control Engineering and Intelligent Systems (eXiT), University of Girona, Spain (corresponding author)
- Oscar Raya
  - Control Engineering and Intelligent Systems (eXiT), University of Girona, Spain
- Marc Saez
  - Research Group on Statistics, Econometrics and Health, University of Girona, Spain
  - CIBER of Epidemiology and Public Health (CIBERESP), Madrid, Spain
- Ruth Cunill
  - Sant Joan de Deu-Numancia Health Park, Barcelona, Spain
- Carme Carrion
  - Health Lab Research Group, Universitat Oberta de Catalunya, Spain
- Xavier Castells
  - TransLab Research Group, Dept. of Medical Sciences, University of Girona, Spain
10
van Laar SA, Kapiteijn E, Gombert-Handoko KB, Guchelaar HJ, Zwaveling J. Application of Electronic Health Record Text Mining: Real-World Tolerability, Safety, and Efficacy of Adjuvant Melanoma Treatments. Cancers (Basel) 2022; 14:5426. [PMID: 36358844 PMCID: PMC9657798 DOI: 10.3390/cancers14215426] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 09/06/2022] [Revised: 10/31/2022] [Accepted: 11/02/2022] [Indexed: 08/13/2023] Open
Abstract
Introduction: Nivolumab (N), pembrolizumab (P), and dabrafenib plus trametinib (D + T) have been registered as adjuvant treatments for resected stage III and IV melanoma since 2018. Electronic health records (EHRs) are a real-world data source that can be used to review treatments in clinical practice. In this study, we applied EHR text-mining software to evaluate the real-world tolerability, safety, and efficacy of adjuvant melanoma treatments. Methods: Adult melanoma patients receiving adjuvant treatment between January 2019 and October 2021 at the Leiden University Medical Center, the Netherlands, were included. CTcue text-mining software (v3.1.0, CTcue B.V., Amsterdam, The Netherlands) was used to construct rule-based queries and perform context analysis for patient inclusion and data collection from structured and unstructured EHR data. Results: In total, 122 patients were included: 54 patients treated with nivolumab, 48 with pembrolizumab, and 20 with D + T. Significantly more patients discontinued treatment due to toxicity on D + T (N: 16%, P: 6%, D + T: 40%; χ²(6, n = 122) = 14.6, p = 0.024). Immune checkpoint inhibitors (ICIs) mainly showed immune-related treatment-limiting adverse events (AEs), and chronic thyroid-related AEs occurred frequently (hyperthyroidism: N: 15%, P: 13%; hypothyroidism: N: 20%, P: 19%). Treatment-limiting toxicity from D + T was primarily a combination of reversible AEs, including pyrexia and fatigue. The 1-year recurrence-free survival was 70.3% after nivolumab, 72.4% after pembrolizumab, and 83.0% after D + T. Conclusions: Text-mining EHR is a valuable method to collect real-world data to evaluate adjuvant melanoma treatments. ICIs were better tolerated than D + T, in line with RCT results. For BRAF+ patients, physicians must weigh the higher risk of reversible treatment-limiting AEs of D + T against the risk of long-term immune-related AEs.
Affiliation(s)
- Sylvia A. van Laar
  - Department of Clinical Pharmacy & Toxicology, Leiden University Medical Center, 2333 ZA Leiden, The Netherlands
- Ellen Kapiteijn
  - Department of Medical Oncology, Leiden University Medical Center, 2333 ZA Leiden, The Netherlands
- Kim B. Gombert-Handoko
  - Department of Clinical Pharmacy & Toxicology, Leiden University Medical Center, 2333 ZA Leiden, The Netherlands
- Henk-Jan Guchelaar
  - Department of Clinical Pharmacy & Toxicology, Leiden University Medical Center, 2333 ZA Leiden, The Netherlands
- Juliette Zwaveling
  - Department of Clinical Pharmacy & Toxicology, Leiden University Medical Center, 2333 ZA Leiden, The Netherlands
11
Frei J, Soto-Rey I, Kramer F. DrNote: An open medical annotation service. PLOS Digit Health 2022; 1:e0000086. [PMID: 36812581 PMCID: PMC9931362 DOI: 10.1371/journal.pdig.0000086] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/21/2021] [Accepted: 07/12/2022] [Indexed: 11/19/2022]
Abstract
In the context of clinical trials and medical research, medical text mining can provide broader insights for various research scenarios by tapping additional text data sources and extracting relevant information that is often present only in unstructured form. Although various works on data such as electronic health reports are available for English texts, only limited work has been published on tools for non-English text resources that offer immediate practicality in terms of flexibility and initial setup. We introduce DrNote, an open source text annotation service for medical text processing. Our work provides an entire annotation pipeline with a focus on a fast yet effective and easy-to-use software implementation. Further, the software allows its users to define a custom annotation scope by filtering only for relevant entities that should be included in its knowledge base. The approach is based on OpenTapioca and combines the publicly available datasets from WikiData and Wikipedia, and thus performs entity linking tasks. In contrast to other related work, our service can easily be built upon any language-specific Wikipedia dataset in order to be trained on a specific target language. We provide a public demo instance of our DrNote annotation service at https://drnote.misit-augsburg.de/.
Affiliation(s)
- Johann Frei
  - IT-Infrastructure for Translational Medical Research, Faculty of Applied Computer Science, University of Augsburg, Augsburg, Germany
- Iñaki Soto-Rey
  - Medical Data Integration Center, Institute for Digital Medicine, University Hospital Augsburg, Augsburg, Germany
- Frank Kramer
  - IT-Infrastructure for Translational Medical Research, Faculty of Applied Computer Science, University of Augsburg, Augsburg, Germany
12
Zhao Y, Ren B, Yu W, Zhang H, Zhao D, Lv J, Xie Z, Jiang K, Shang L, Yao H, Xu Y, Zhao G. Construction of an Assisted Model Based on Natural Language Processing for Automatic Early Diagnosis of Autoimmune Encephalitis. Neurol Ther 2022; 11:1117-1134. [PMID: 35543808 PMCID: PMC9338198 DOI: 10.1007/s40120-022-00355-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/09/2022] [Accepted: 04/07/2022] [Indexed: 11/25/2022] Open
Abstract
Introduction Early diagnosis and etiological treatment can effectively improve the prognosis of patients with autoimmune encephalitis (AE). However, anti-neuronal antibody tests, which provide the definitive diagnosis, require time and are not always abnormal. Using natural language processing (NLP) technology, our study proposes an assisted diagnostic method for early clinical diagnosis of AE and compares its sensitivity with that of previously established criteria. Methods Our model is based on a text classification model trained on the history of present illness (HPI) in electronic medical records (EMRs) that present a definite pathological diagnosis of AE or infectious encephalitis (IE). The definitive diagnosis of IE was based on the results of traditional etiological examinations. The definitive diagnosis of AE was based on the results of neuronal antibodies, and the diagnostic criteria of definite autoimmune limbic encephalitis proposed by Graus et al. were used as the reference standard for antibody-negative AE. First, we trained on a dataset of 552 cases to automatically recognize and extract symptoms from all HPI texts in EMRs. Second, four text classification models trained on a dataset of 199 cases were established for the differential diagnosis of AE and IE, based on a post-structuring text dataset of every HPI that was built from symptoms in the English language after normalization of synonyms. The optimal model was identified by evaluating and comparing the performance of the four models. Finally, an assisted early diagnostic model for AE was established on the basis of the best-performing text classification model, combined with three typical symptoms and the results of standard paraclinical tests such as cerebrospinal fluid (CSF), magnetic resonance imaging (MRI), or electroencephalogram (EEG), as proposed in the Graus criteria.
Results The comparison of the four models on the independent testing dataset showed that the naïve Bayesian classifier with bag of words achieved the best performance, with an area under the receiver operating characteristic curve of 0.85, accuracy of 84.5% (95% confidence interval [CI] 74.0–92.0%), sensitivity of 86.7% (95% CI 69.3–96.2%), and specificity of 82.9% (95% CI 67.9–92.8%). Compared with the previously proposed diagnostic criteria, the assisted diagnostic model improved the early diagnostic sensitivity for possible AE on the independent testing dataset from 73.3% (95% CI 54.1–87.7%) to 86.7% (95% CI 69.3–96.2%). Conclusions The assisted diagnostic model could effectively increase the early diagnostic sensitivity for AE compared with previous diagnostic criteria. It can assist physicians in establishing the diagnosis of AE automatically after they input the HPI, written according to their narrative habits for describing symptoms, and the results of standard paraclinical tests, helping to avoid misdiagnosis and allowing prompt initiation of specific treatment. Supplementary Information The online version contains supplementary material available at 10.1007/s40120-022-00355-7.
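For readers who want to see how a sensitivity value and its 95% CI of the kind reported above can be computed, the following is a minimal Python sketch. It rests on two assumptions that the abstract does not confirm: that the intervals are exact (Clopper-Pearson) binomial intervals, and that the sensitivity of 86.7% corresponds to 26 correctly identified cases out of 30. The counts are hypothetical, chosen only because they reproduce the reported percentage, and the function names are illustrative.

```python
from math import comb


def binom_cdf(k, n, p):
    """Return P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def clopper_pearson(x, n, alpha=0.05):
    """Exact (Clopper-Pearson) confidence interval for a proportion x/n.

    Assumption: the abstract does not say which interval method was used;
    this is one common choice for small samples.
    """

    def solve(g):
        # g is increasing in p on [0, 1]; locate its root by bisection.
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2
            if g(mid) < 0:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower bound: P(X >= x | p) = alpha / 2
    lower = 0.0 if x == 0 else solve(
        lambda p: (1 - binom_cdf(x - 1, n, p)) - alpha / 2
    )
    # Upper bound: P(X <= x | p) = alpha / 2
    upper = 1.0 if x == n else solve(
        lambda p: alpha / 2 - binom_cdf(x, n, p)
    )
    return lower, upper


# Hypothetical counts (not stated in the abstract): 26 true positives
# among 30 AE cases in the independent testing dataset.
true_positives, ae_cases = 26, 30
sensitivity = true_positives / ae_cases
ci_low, ci_high = clopper_pearson(true_positives, ae_cases)
print(f"sensitivity = {sensitivity:.1%}, 95% CI {ci_low:.1%} to {ci_high:.1%}")
```

Under these assumptions the interval comes out close to the reported 69.3–96.2%, which suggests, but does not prove, that an exact binomial method was used.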
Affiliation(s)
- Yunsong Zhao
- Department of Neurology, Xijing Hospital, Fourth Military Medical University, Xi'an, China
- Bin Ren
- Department of Information, Xijing Hospital, Fourth Military Medical University, Xi'an, China
- Wenjin Yu
- Department of Neurology, Xijing Hospital, Fourth Military Medical University, Xi'an, China
- Haijun Zhang
- Department of Neurology, Xijing Hospital, Fourth Military Medical University, Xi'an, China
- Di Zhao
- Department of Neurology, Xijing Hospital, Fourth Military Medical University, Xi'an, China
- Junchao Lv
- Department of Neurology, Xijing Hospital, Fourth Military Medical University, Xi'an, China
- Zhen Xie
- College of Life Sciences and Medicine, Northwest University, Xi'an, China
- Kun Jiang
- Department of Information, Xijing Hospital, Fourth Military Medical University, Xi'an, China
- Lei Shang
- Department of Health Statistics, Fourth Military Medical University, Xi'an, China
- Han Yao
- Department of Neurobiology, School of Basic Medicine, Fourth Military Medical University, Xi'an, China
- Yongyong Xu
- College of Life Sciences and Medicine, Northwest University, Xi'an, China
- Gang Zhao
- Department of Neurology, Xijing Hospital, Fourth Military Medical University, Xi'an, China
- College of Life Sciences and Medicine, Northwest University, Xi'an, China
|
13
|
Percha B, Pisapati K, Gao C, Schmidt H. Natural language inference for curation of structured clinical registries from unstructured text. J Am Med Inform Assoc 2021; 29:97-108. [PMID: 34791282 DOI: 10.1093/jamia/ocab243] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/18/2021] [Revised: 09/26/2021] [Accepted: 10/25/2021] [Indexed: 11/12/2022]
Abstract
OBJECTIVE Clinical registries (structured databases of demographic, diagnosis, and treatment information) play vital roles in retrospective studies, operational planning, and assessment of patient eligibility for research, including clinical trials. Registry curation, a manual and time-intensive process, is always costly and often impossible for rare or underfunded diseases. Our goal was to evaluate the feasibility of natural language inference (NLI) as a scalable solution for registry curation. MATERIALS AND METHODS We applied five state-of-the-art, pretrained, deep learning-based NLI models to clinical, laboratory, and pathology notes to infer information about 43 different breast oncology registry fields. Model inferences were evaluated against a manually curated, 7439-patient breast oncology research database. RESULTS NLI models showed considerable variation in performance, both within and across fields. One model, ALBERT, outperformed the others (BART, RoBERTa, XLNet, and ELECTRA) on 22 out of 43 fields. A detailed error analysis revealed that incorrect inferences primarily arose through models' tendency to misinterpret historical findings, as well as confusion based on abbreviations and subtle term variants common in clinical text. DISCUSSION AND CONCLUSION Traditional natural language processing methods require specially annotated training sets or the construction of a separate model for each registry field. In contrast, a single pretrained NLI model can curate dozens of different fields simultaneously. Surprisingly, NLI methods remain unexplored in the clinical domain outside the realm of shared tasks and benchmarks. Modern NLI models could increase the efficiency of registry curation, even when applied "out of the box" with no additional training.
Affiliation(s)
- Bethany Percha
- Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, New York, USA; Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Kereeti Pisapati
- Mount Sinai Innovation Partners, Mount Sinai Health System, New York, New York, USA; Breast Surgical Oncology, Icahn School of Medicine at Mount Sinai, New York, New York, USA; Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Cynthia Gao
- Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Hank Schmidt
- Breast Surgical Oncology, Icahn School of Medicine at Mount Sinai, New York, New York, USA; Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, New York, USA
|