1
Tomlin HR, Wissing M, Tanikella S, Kaur P, Tabas L. Challenges and Opportunities for Professional Medical Publications Writers to Contribute to Plain Language Summaries (PLS) in an AI/ML Environment - A Consumer Health Informatics Systematic Review. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2024; 2023:709-717. [PMID: 38222388 PMCID: PMC10785924] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/16/2024]
Abstract
Professional medical publications writers (PMWs) cover a wide range of biomedical writing activities, which have recently come to include translating biomedical publications into plain language summaries (PLS). The consumer health informatics (CHI) literature consistently describes the importance of incorporating health literacy principles in any natural language processing (NLP) app designed to communicate medical information to lay audiences, particularly patients. In this stepwise systematic review, we searched PubMed-indexed literature for CHI NLP-based apps that have the potential to assist PMWs in developing text-based PLS. Results showed that available apps are limited to patient portals and other technologies used to communicate medical text and reports from electronic health records. PMWs can apply the lessons learned from CHI NLP-based apps to supervise development of tools specific to text simplification and summarization for PLS from biomedical publications.
Affiliation(s)
- Holly R Tomlin
- Certara Synchrogenix, Wilmington, DE, USA
- Consumer Health Informatics Lab (CHIL), Section of Biostatistics and Data Sciences, Yale School of Medicine, New Haven, CT
- Weill Cornell Medicine, Department of Population Health Sciences, Division of Health Analytics, New York, NY
2
van Es B, Reteig LC, Tan SC, Schraagen M, Hemker MM, Arends SRS, Rios MAR, Haitjema S. Negation detection in Dutch clinical texts: an evaluation of rule-based and machine learning methods. BMC Bioinformatics 2023; 24:10. [PMID: 36624385 PMCID: PMC9830789 DOI: 10.1186/s12859-022-05130-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2022] [Accepted: 12/30/2022] [Indexed: 01/11/2023] Open
Abstract
When developing models for clinical information retrieval and decision support systems, the discrete outcomes required for training are often missing. These labels need to be extracted from free text in electronic health records. For this extraction process one of the most important contextual properties in clinical text is negation, which indicates the absence of findings. We aimed to improve large scale extraction of labels by comparing three methods for negation detection in Dutch clinical notes. We used the Erasmus Medical Center Dutch Clinical Corpus to compare a rule-based method based on ContextD, a biLSTM model using MedCAT, and (finetuned) RoBERTa-based models. We found that both the biLSTM and RoBERTa models consistently outperform the rule-based model in terms of F1 score, precision and recall. In addition, we systematically categorized the classification errors for each model, which can be used to further improve model performance in particular applications. Combining the three models naively was not beneficial in terms of performance. We conclude that the biLSTM and RoBERTa-based models in particular are highly accurate in detecting clinical negations, but that ultimately all three approaches can be viable depending on the use case at hand.
Affiliation(s)
- Bram van Es
- Central Diagnostic Laboratory, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands; MedxAI, Amsterdam, The Netherlands
- Leon C. Reteig
- Center for Translational Immunology, University Medical Center Utrecht, Utrecht, The Netherlands
- Sander C. Tan
- Department for Research & Data Technology, University Medical Center Utrecht, Utrecht, The Netherlands
- Marijn Schraagen
- Institute for Information and Computing Sciences, Utrecht University, Utrecht, The Netherlands
- Myrthe M. Hemker
- Utrecht Institute of Linguistics OTS & Department of Languages, Literature and Communication, Utrecht University, Utrecht, The Netherlands
- Sebastiaan R. S. Arends
- Department of Medical Informatics, University of Amsterdam, Amsterdam, The Netherlands
- Miguel A. R. Rios
- Centre for Translation Studies, University of Vienna, Vienna, Austria
- Saskia Haitjema
- Central Diagnostic Laboratory, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
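The rule-based family of methods compared in the entry above can be illustrated with a minimal trigger-and-window sketch. This is not ContextD or the authors' implementation: the trigger list, the window size, and the names `NEGATION_TRIGGERS`, `WINDOW`, and `is_negated` are assumptions chosen purely for illustration, and the three Dutch triggers are a tiny hand-picked sample.

```python
import re

# Hypothetical, deliberately tiny trigger list
# (Dutch: "geen" = no, "niet" = not, "zonder" = without).
NEGATION_TRIGGERS = {"geen", "niet", "zonder"}
WINDOW = 4  # assumed number of tokens a trigger may precede the concept by


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def is_negated(sentence, concept):
    """Return True if `concept` occurs within WINDOW tokens after a trigger."""
    tokens = tokenize(sentence)
    target = tokenize(concept)
    if not target:
        return False
    n = len(target)
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == target:
            # Look only at the tokens immediately before the concept mention.
            preceding = tokens[max(0, i - WINDOW):i]
            if any(tok in NEGATION_TRIGGERS for tok in preceding):
                return True
    return False
```

A call such as `is_negated("Patiënt heeft geen koorts", "koorts")` flags the finding as negated, while the same sentence without "geen" does not. A sketch like this ignores scope terminators and post-concept triggers, which real rule-based systems handle.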
3
Ondov B, Attal K, Demner-Fushman D. A survey of automated methods for biomedical text simplification. J Am Med Inform Assoc 2022; 29:1976-1988. [PMID: 36083212 PMCID: PMC10161533 DOI: 10.1093/jamia/ocac149] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/27/2022] [Revised: 07/26/2022] [Accepted: 08/16/2022] [Indexed: 11/12/2022] Open
Abstract
OBJECTIVE Plain language in medicine has long been advocated as a way to improve patient understanding and engagement. As the field of Natural Language Processing has progressed, increasingly sophisticated methods have been explored for the automatic simplification of existing biomedical text for consumers. We survey the literature in this area with the goals of characterizing approaches and applications, summarizing existing resources, and identifying remaining challenges. MATERIALS AND METHODS We search English language literature using lists of synonyms for both the task (eg, "text simplification") and the domain (eg, "biomedical"), and searching for all pairs of these synonyms using Google Scholar, Semantic Scholar, PubMed, ACL Anthology, and DBLP. We expand search terms based on results and further include any pertinent papers not in the search results but cited by those that are. RESULTS We find 45 papers that we deem relevant to the automatic simplification of biomedical text, with data spanning 7 natural languages. Of these (nonexclusively), 32 describe tools or methods, 13 present data sets or resources, and 9 describe impacts on human comprehension. Of the tools or methods, 22 are chiefly procedural and 10 are chiefly neural. CONCLUSIONS Though neural methods hold promise for this task, scarcity of parallel data has led to continued development of procedural methods. Various low-resource mitigations have been proposed to advance neural methods, including paragraph-level and unsupervised models and augmentation of neural models with procedural elements drawing from knowledge bases. However, high-quality parallel data will likely be crucial for developing fully automated biomedical text simplification.
Affiliation(s)
- Brian Ondov
- Computational Health Research Branch, National Library of Medicine, Bethesda, Maryland, USA
- Kush Attal
- Computational Health Research Branch, National Library of Medicine, Bethesda, Maryland, USA
- Dina Demner-Fushman
- Computational Health Research Branch, National Library of Medicine, Bethesda, Maryland, USA
4
Devaraj A, Wallace BC, Marshall IJ, Li JJ. Paragraph-level Simplification of Medical Texts. PROCEEDINGS OF THE CONFERENCE. ASSOCIATION FOR COMPUTATIONAL LINGUISTICS. NORTH AMERICAN CHAPTER. MEETING 2021; 2021:4972-4984. [PMID: 35663507 PMCID: PMC9161242 DOI: 10.18653/v1/2021.naacl-main.395] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 06/15/2023]
Abstract
We consider the problem of learning to simplify medical texts. This is important because most reliable, up-to-date information in biomedicine is dense with jargon and thus practically inaccessible to the lay audience. Furthermore, manual simplification does not scale to the rapidly growing body of biomedical literature, motivating the need for automated approaches. Unfortunately, there are no large-scale resources available for this task. In this work we introduce a new corpus of parallel texts in English comprising technical and lay summaries of all published evidence pertaining to different clinical topics. We then propose a new metric based on likelihood scores from a masked language model pretrained on scientific texts. We show that this automated measure better differentiates between technical and lay summaries than existing heuristics. We introduce and evaluate baseline encoder-decoder Transformer models for simplification and propose a novel augmentation to these in which we explicitly penalize the decoder for producing 'jargon' terms; we find that this yields improvements over baselines in terms of readability.
5
Exploring the impact of short-text complexity and structure on its quality in social media. JOURNAL OF ENTERPRISE INFORMATION MANAGEMENT 2020. [DOI: 10.1108/jeim-06-2019-0156] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 11/17/2022]
Abstract
Purpose: The purpose of this paper is to explore the extent to which the quality of social media short text can be investigated without extensions, and which predictors, if any, of such short text lead readers to trust its content. Design/methodology/approach: The paper applies a trust model to classify data collections based on metadata into four classes: Very Trusted, Trusted, Untrusted and Very Untrusted. These data are collected from the online communities Genius and Stack Overflow. To evaluate short texts in terms of their trust levels, the authors conducted two investigations: (1) a natural language processing (NLP) approach to extract relevant features (i.e. part-of-speech and various readability indexes), for which the authors report relatively good performance; and (2) a machine learning technique, more precisely a random forest (RF) classifier using a bag-of-words (BoW) model. Findings: The investigation of the RF classifier using BoW shows promising intermediate results (on average 62% accuracy across both online communities) in short-text quality identification that leads to trust. Practical implications: As social media becomes an increasingly attractive source of information, mostly provided in the form of short texts, businesses (e.g. in search engines for smart data) can filter content without having to apply complex approaches and continue to deal with information that is considered more trustworthy. Originality/value: Short-text classifications with regard to a criterion (e.g. quality, readability) are usually extended by an external source or its metadata. This enhancement either changes the original text, if it is additional text from an external source, or requires text metadata that is not always available. To this end, the originality of this study lies in investigating the quality of short text (i.e. social media text) without having to extend or modify it using external sources. Such modification alters the text and distorts the results of the investigation.
6
Electronic health records for the diagnosis of rare diseases. Kidney Int 2020; 97:676-686. [DOI: 10.1016/j.kint.2019.11.037] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 04/15/2019] [Revised: 11/15/2019] [Accepted: 11/22/2019] [Indexed: 01/13/2023]
7
Zeng Z, Deng Y, Li X, Naumann T, Luo Y. Natural Language Processing for EHR-Based Computational Phenotyping. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2019; 16:139-153. [PMID: 29994486 PMCID: PMC6388621 DOI: 10.1109/tcbb.2018.2849968] [Citation(s) in RCA: 90] [Impact Index Per Article: 18.0] [Indexed: 06/08/2023]
Abstract
This article reviews recent advances in applying natural language processing (NLP) to Electronic Health Records (EHRs) for computational phenotyping. NLP-based computational phenotyping has numerous applications including diagnosis categorization, novel phenotype discovery, clinical trial screening, pharmacogenomics, drug-drug interaction (DDI), and adverse drug event (ADE) detection, as well as genome-wide and phenome-wide association studies. Significant progress has been made in algorithm development and resource construction for computational phenotyping. Among the surveyed methods, well-designed keyword search and rule-based systems often achieve good performance. However, the construction of keyword and rule lists requires significant manual effort, which is difficult to scale. Supervised machine learning models have been favored because they are capable of acquiring both classification patterns and structures from data. Recently, deep learning and unsupervised learning have received growing attention, with the former favored for its performance and the latter for its ability to find novel phenotypes. Integrating heterogeneous data sources has become increasingly important and has shown promise in improving model performance. Often, better performance is achieved by combining multiple modalities of information. Despite these many advances, challenges and opportunities remain for NLP-based computational phenotyping, including better model interpretability and generalizability, and proper characterization of feature relations in clinical narratives.
Affiliation(s)
- Zexian Zeng
- Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL 60611.
- Yu Deng
- Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL 60611.
- Xiaoyu Li
- Department of Social and Behavioral Sciences, Harvard T.H. Chan School of Public Health, Boston, MA 02115.
- Tristan Naumann
- Science and Artificial Intelligence Lab, Massachusetts Institute of Technology, Cambridge, MA 02139.
- Yuan Luo
- Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL 60611.
8
Mukherjee P, Leroy G, Kauchak D. Using Lexical Chains to Identify Text Difficulty: A Corpus Statistics and Classification Study. IEEE J Biomed Health Inform 2018; 23:2164-2173. [PMID: 30530380 DOI: 10.1109/jbhi.2018.2885465] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 11/06/2022]
Abstract
Our goal is data-driven discovery of features for text simplification. In this paper, we investigate three types of lexical chains: exact, synonymous, and semantic. A lexical chain links semantically related words in a document. We examine their potential with a document-level corpus statistics study (914 texts) to estimate their overall capacity to differentiate between easy and difficult text and a classification task (11 000 sentences) to determine usefulness of features at sentence-level for simplification. For the corpus statistics study we tested five document-level features for each chain type: total number of chains, average chain length, average chain span, number of crossing chains, and the number of chains longer than half the document length. We found significant differences between easy and difficult text for average chain length and the average number of cross chains. For the sentence classification study, we compared the lexical chain features to standard bag-of-words features on a range of classifiers: logistic regression, naïve Bayes, decision trees, linear and RBF kernel SVM, and random forest. The lexical chain features performed significantly better than the bag-of-words baseline across all classifiers with the best classifier achieving an accuracy of ∼90% (compared to 78% for bag-of-words). Overall, we find several lexical chain features provide specific information useful for identifying difficult sentences of text, beyond what is available from standard lexical features.
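The document-level features named in this abstract can be made concrete with a small sketch for the simplest chain type, exact chains. The definitions here are assumptions rather than the authors' own: a chain is taken to be all occurrences of one repeated non-stopword, its length is the number of occurrences, its span is the token distance from first to last occurrence, and a "long" chain is one whose span exceeds half the document length. The function name `exact_chain_features` and the stopword list are hypothetical.

```python
import re
from collections import defaultdict

# Assumed, minimal stopword list for illustration only.
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "it"}


def exact_chain_features(text):
    """Compute simple exact-repetition chain features for one document."""
    tokens = re.findall(r"[a-z]+", text.lower())
    positions = defaultdict(list)
    for idx, tok in enumerate(tokens):
        if tok not in STOPWORDS:
            positions[tok].append(idx)

    # A word forms a chain only if it occurs at least twice.
    chains = [p for p in positions.values() if len(p) >= 2]
    if not chains:
        return {"num_chains": 0, "avg_length": 0.0,
                "avg_span": 0.0, "num_long_chains": 0}

    spans = [p[-1] - p[0] for p in chains]
    half_doc = len(tokens) / 2
    return {
        "num_chains": len(chains),
        "avg_length": sum(len(p) for p in chains) / len(chains),
        "avg_span": sum(spans) / len(chains),
        "num_long_chains": sum(1 for s in spans if s > half_doc),
    }
```

Synonymous and semantic chains would need a lexical resource to link different words, and the crossing-chain count is omitted here, so this sketch covers only part of the feature set described.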
9
Cross Disciplinary Consultancy to Bridge Public Health Technical Needs and Analytic Developers: Negation Detection Use Case. Online J Public Health Inform 2018; 10:e209. [PMID: 30349627 PMCID: PMC6194092 DOI: 10.5210/ojphi.v10i2.8944] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/24/2022] Open
Abstract
This paper describes a continuing initiative of the International Society for Disease Surveillance designed to bring together public health practitioners and analytics solution developers from both academia and industry. Funded by the Defense Threat Reduction Agency, a series of consultancies have been conducted on a range of topics of pressing concern to public health (e.g. developing methods to enhance prediction of asthma exacerbation, developing tools for asyndromic surveillance from chief complaints). The topic of this final consultancy, conducted at the University of Utah in January 2017, is focused on defining a roadmap for the development of algorithms, tools, and datasets for improving the capabilities of text processing algorithms to identify negated terms (i.e. negation detection) in free-text chief complaints and triage reports.
10
McCoy TH, Yu S, Hart KL, Castro VM, Brown HE, Rosenquist JN, Doyle AE, Vuijk PJ, Cai T, Perlis RH. High Throughput Phenotyping for Dimensional Psychopathology in Electronic Health Records. Biol Psychiatry 2018; 83:997-1004. [PMID: 29496195 PMCID: PMC5972065 DOI: 10.1016/j.biopsych.2018.01.011] [Citation(s) in RCA: 42] [Impact Index Per Article: 7.0] [Received: 08/10/2017] [Revised: 12/15/2017] [Accepted: 01/08/2018] [Indexed: 01/29/2023]
Abstract
BACKGROUND Relying on diagnostic categories of neuropsychiatric illness obscures the complexity of these disorders. Capturing multiple dimensional measures of neuropathology could facilitate the clinical and neurobiological investigation of cognitive and behavioral phenotypes. METHODS We developed a natural language processing-based approach to extract five symptom dimensions, based on the National Institute of Mental Health Research Domain Criteria definitions, from narrative clinical notes. Estimates of Research Domain Criteria loading were derived from a cohort of 3619 individuals with 4623 hospital admissions. We applied this tool to a large corpus of psychiatric inpatient admission and discharge notes (2010-2015), and using the same cohort we examined face validity, predictive validity, and convergent validity with gold standard annotations. RESULTS In mixed-effect models adjusted for sociodemographic and clinical features, greater negative and positive symptom domains were associated with a shorter length of stay (β = -.88, p = .001 and β = -1.22, p < .001, respectively), while greater social and arousal domain scores were associated with a longer length of stay (β = .93, p < .001 and β = .81, p = .007, respectively). In fully adjusted Cox regression models, a greater positive domain score at discharge was also associated with a significant increase in readmission risk (hazard ratio = 1.22, p < .001). Positive and negative valence domains were correlated with expert annotation (by analysis of variance [df = 3], R2 = .13 and .19, respectively). Likewise, in a subset of patients, neurocognitive testing was correlated with cognitive performance scores (p < .008 for three of six measures). CONCLUSIONS This shows that natural language processing can be used to efficiently and transparently score clinical notes in terms of cognitive and psychopathologic domains.
Affiliation(s)
- Thomas H. McCoy
- Center for Quantitative Health and Department of Psychiatry, Simches Research Building, 6th Floor, 185 Cambridge Street, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114. Correspondence: Thomas H. McCoy, MD, Massachusetts General Hospital, Simches Research Building, 6th Floor, Boston, MA 02114, 617-726-7426
- Sheng Yu
- Tsinghua University, 30 Shuangqing Rd, Haidian Qu, Beijing Shi, China, 100084; Harvard School of Public Health, 677 Huntington Ave, Boston, MA 02115
- Kamber L. Hart
- Center for Quantitative Health and Department of Psychiatry, Simches Research Building, 6th Floor, 185 Cambridge Street, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114
- Victor M. Castro
- Center for Quantitative Health and Department of Psychiatry, Simches Research Building, 6th Floor, 185 Cambridge Street, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114
- Hannah E. Brown
- Center for Quantitative Health and Department of Psychiatry, Simches Research Building, 6th Floor, 185 Cambridge Street, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114
- James N. Rosenquist
- Center for Quantitative Health and Department of Psychiatry, Simches Research Building, 6th Floor, 185 Cambridge Street, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114
- Alysa E. Doyle
- Center for Quantitative Health and Department of Psychiatry, Simches Research Building, 6th Floor, 185 Cambridge Street, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114
- Pieter J. Vuijk
- Center for Quantitative Health and Department of Psychiatry, Simches Research Building, 6th Floor, 185 Cambridge Street, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114
- Tianxi Cai
- Harvard School of Public Health, 677 Huntington Ave, Boston, MA 02115
- Roy H. Perlis
- Center for Quantitative Health and Department of Psychiatry, Simches Research Building, 6th Floor, 185 Cambridge Street, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114
11
Accuracy of using natural language processing methods for identifying healthcare-associated infections. Int J Med Inform 2018; 117:96-102. [PMID: 30032970 DOI: 10.1016/j.ijmedinf.2018.06.002] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 12/18/2017] [Revised: 04/27/2018] [Accepted: 06/03/2018] [Indexed: 01/09/2023]
Abstract
OBJECTIVE There is a growing interest in using natural language processing (NLP) for healthcare-associated infections (HAIs) monitoring. A French project consortium, SYNODOS, developed an NLP solution for detecting medical events in electronic medical records for epidemiological purposes. The objective of this study was to evaluate the performance of the SYNODOS data processing chain for detecting HAIs in clinical documents. MATERIALS AND METHODS Textual records were collected between October 2009 and December 2010 in three French university hospitals (Lyon, Rouen and Nice). The following medical specialties were included in the study: digestive surgery, neurosurgery, orthopedic surgery, and adult intensive-care units. Reference standard surveillance was compared with the results of automatic detection using NLP. Sensitivity on 56 HAI cases and specificity on 57 non-HAI cases were calculated. RESULTS The accuracy rate was 84% (n = 95/113). The overall sensitivity of automatic detection of HAIs was 83.9% (CI 95%: 71.7-92.4) and the specificity was 84.2% (CI 95%: 72.1-92.5). The sensitivity varies from one specialty to the other, from 69.2% (CI 95%: 38.6-90.9) for intensive care to 93.3% (CI 95%: 68.1-99.8) for orthopedic surgery. The manual review of classification errors showed that the most frequent cause was an inaccurate temporal labeling of medical events, which is an important factor for HAI detection. CONCLUSION This study confirmed the feasibility of using NLP for HAI detection in hospital facilities. Automatic HAI detection algorithms could offer better surveillance standardization for hospital comparisons.
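The headline figures in this abstract can be cross-checked with a short worked calculation. The abstract gives only the group sizes (56 HAI cases, 57 non-HAI cases) and the total of 95 correct classifications out of 113; the per-cell counts used below (47 true positives, 48 true negatives) are inferred from the reported percentages and are an assumption, not values stated in the source. The function name `rates` is hypothetical.

```python
def rates(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Group sizes 56 and 57 come from the abstract; 47 and 48 are inferred counts.
sens, spec, acc = rates(tp=47, fn=56 - 47, tn=48, fp=57 - 48)
print(f"sensitivity={sens:.1%} specificity={spec:.1%} accuracy={acc:.1%}")
# prints: sensitivity=83.9% specificity=84.2% accuracy=84.1%
```

Under these inferred counts, 47 + 48 = 95 correct classifications, which agrees with the reported 95/113 and with the stated sensitivity and specificity.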
12
Mukherjee P, Leroy G, Kauchak D, Navarrete BA, Diaz DY, Colina S. The Role of Surface, Semantic and Grammatical Features on Simplification of Spanish Medical Texts: A User Study. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2018; 2017:1322-1331. [PMID: 29854201 PMCID: PMC5977682] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/08/2023]
Abstract
Simplifying medical texts facilitates readability and comprehension. While most simplification work focuses on English, we investigate whether features important for simplifying English text are similarly helpful for simplifying Spanish text. We conducted a user study on 15 Spanish medical texts using Amazon Mechanical Turk and measured perceived and actual difficulty. Using the median of the difficulty scores, we split the texts into easy and difficult groups and extracted 10 surface, 2 semantic and 4 grammatical features. Using t-tests, we identified those features that significantly distinguish easy text from difficult text in Spanish and compared them with prior work in English. We found that easy Spanish texts use more repeated words and adverbs, fewer negations and more familiar words, similar to English. Also like English, difficult Spanish texts use more nouns and adjectives. However, in contrast to English, easier Spanish texts contained longer sentences and used grammatical structures that were more varied.