Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Koleck TA, Dreisbach C, Bourne PE, Bakken S. Natural language processing of symptoms documented in free-text narratives of electronic health records: a systematic review. J Am Med Inform Assoc 2019;26:364-379. [PMID: 30726935 DOI: 10.1093/jamia/ocy173] [Citation(s) in RCA: 200] [Impact Index Per Article: 50.0] [Received: 06/29/2018] [Revised: 11/20/2018] [Accepted: 11/27/2018] [Indexed: 12/26/2022] Open

Cited by Other Article(s)

Huang G, Li Y, Jameel S, Long Y, Papanastasiou G. From explainable to interpretable deep learning for natural language processing in healthcare: How far from reality? Comput Struct Biotechnol J 2024;24:362-373. [PMID: 38800693 PMCID: PMC11126530 DOI: 10.1016/j.csbj.2024.05.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2023] [Revised: 05/03/2024] [Accepted: 05/03/2024] [Indexed: 05/29/2024] Open

Bandyopadhyay A, Albashayreh A, Zeinali N, Fan W, Gilbertson-White S. Using real-world electronic health record data to predict the development of 12 cancer-related symptoms in the context of multimorbidity. JAMIA Open 2024;7:ooae082. [PMID: 39282082 PMCID: PMC11397936 DOI: 10.1093/jamiaopen/ooae082] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2024] [Revised: 08/09/2024] [Accepted: 09/05/2024] [Indexed: 09/18/2024] Open

Abstract

Objective

This study uses electronic health record (EHR) data to predict 12 common cancer symptoms, assessing the efficacy of machine learning (ML) models in identifying symptom influencers.

Materials and Methods

We analyzed EHR data of 8156 adults diagnosed with cancer who underwent cancer treatment from 2017 to 2020. Structured and unstructured EHR data were sourced from the Enterprise Data Warehouse for Research at the University of Iowa Hospital and Clinics. Several predictive models, including logistic regression, random forest (RF), and XGBoost, were employed to forecast symptom development. The performances of the models were evaluated by F1-score and area under the curve (AUC) on the testing set. The SHapley Additive exPlanations framework was used to interpret these models and identify the predictive risk factors associated with fatigue as an exemplar.

Results

The RF model exhibited superior performance with a macro average AUC of 0.755 and an F1-score of 0.729 in predicting a range of cancer-related symptoms. For instance, the RF model achieved an AUC of 0.954 and an F1-score of 0.914 for pain prediction. Key predictive factors identified included clinical history, cancer characteristics, treatment modalities, and patient demographics, depending on the symptom. For example, the odds of fatigue were significantly higher in patients with allergy (OR = 2.3, 95% CI: 1.8-2.9) and colitis (OR = 1.9, 95% CI: 1.5-2.4).

Discussion

Our research emphasizes the critical integration of multimorbidity and patient characteristics in modeling cancer symptoms, revealing the considerable influence of chronic conditions beyond cancer itself.

Conclusion

We highlight the potential of ML for predicting cancer symptoms, suggesting a pathway for integrating such models into clinical systems to enhance personalized care and symptom management.

Mathis WS, Zhao S, Pratt N, Weleff J, De Paoli S. Inductive thematic analysis of healthcare qualitative interviews using open-source large language models: How does it compare to traditional methods? Comput Methods Programs Biomed 2024;255:108356. [PMID: 39067136 DOI: 10.1016/j.cmpb.2024.108356] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2024] [Revised: 07/13/2024] [Accepted: 07/23/2024] [Indexed: 07/30/2024]

Hohenschurz-Schmidt D, Cherkin D, Rice ASC, Dworkin RH, Turk DC, McDermott MP, Bair MJ, DeBar LL, Edwards RR, Evans SR, Farrar JT, Kerns RD, Rowbotham MC, Wasan AD, Cowan P, Ferguson M, Freeman R, Gewandter JS, Gilron I, Grol-Prokopczyk H, Iyengar S, Kamp C, Karp BI, Kleykamp BA, Loeser JD, Mackey S, Malamut R, McNicol E, Patel KV, Schmader K, Simon L, Steiner DJ, Veasley C, Vollert J. Methods for pragmatic randomized clinical trials of pain therapies: IMMPACT statement. Pain 2024;165:2165-2183. [PMID: 38723171 PMCID: PMC11404339 DOI: 10.1097/j.pain.0000000000003249] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2023] [Accepted: 03/08/2024] [Indexed: 09/18/2024]

Abstract

Pragmatic, randomized, controlled trials hold the potential to directly inform clinical decision making and health policy regarding the treatment of people experiencing pain. Pragmatic trials are designed to replicate or are embedded within routine clinical care and are increasingly valued to bridge the gap between trial research and clinical practice, especially in multidimensional conditions, such as pain, and in nonpharmacological intervention research. To maximize the potential of pragmatic trials in pain research, the careful consideration of each methodological decision is required. Trials aligned with routine practice pose several challenges, such as determining and enrolling appropriate study participants, deciding on the appropriate level of flexibility in treatment delivery, integrating information on concomitant treatments and adherence, and choosing comparator conditions and outcome measures. Ensuring data quality in real-world clinical settings is another challenging goal. Furthermore, current trials in the field would benefit from analysis methods that allow for a differentiated understanding of effects across patient subgroups and improved reporting of methods and context, which is required to assess the generalizability of findings. At the same time, a range of novel methodological approaches provide opportunities for enhanced efficiency and relevance of pragmatic trials to stakeholders and clinical decision making. In this study, best-practice considerations for these and other concerns in pragmatic trials of pain treatments are offered and a number of promising solutions discussed. The basis of these recommendations was an Initiative on Methods, Measurement, and Pain Assessment in Clinical Trials (IMMPACT) meeting organized by the Analgesic, Anesthetic, and Addiction Clinical Trial Translations, Innovations, Opportunities, and Networks.

Affiliation(s)

David Hohenschurz-Schmidt Pain Research, Department of Surgery & Cancer, Faculty of Medicine, Imperial College London, United Kingdom; Research Department, University College of Osteopathy, London, United Kingdom
Dan Cherkin Osher Center for Integrative Health, Department of Family Medicine, University of Washington, Seattle, WA, United States
Andrew S C Rice Pain Research, Department of Surgery & Cancer, Faculty of Medicine, Imperial College London, United Kingdom
Robert H Dworkin Department of Anesthesiology and Perioperative Medicine, University of Rochester Medical Center, Rochester, NY, United States
Dennis C Turk Anesthesiology and Pain Medicine, University of Washington, Seattle, WA, United States
Michael P McDermott Department of Biostatistics and Computational Biology, University of Rochester, Rochester, NY, United States
Matthew J Bair VA Center for Health Information and Communication, Regenstrief Institute, and Indiana University School of Medicine, Indianapolis, IN, United States
Lynn L DeBar Kaiser Permanente Washington Health Research Institute, Seattle, WA, United States
Robert R Edwards Harvard Medical School, Boston, MA, United States
Scott R Evans Biostatistics Center and the Department of Biostatistics and Bioinformatics, Milken Institute School of Public Health, George Washington University, Rockville, MD, United States
John T Farrar Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania, Philadelphia, PA, United States
Robert D Kerns Department of Psychiatry, Yale School of Medicine, New Haven, CT, United States
Michael C Rowbotham Department of Anesthesia, University of California San Francisco School of Medicine, San Francisco, CA, United States
Ajay D Wasan Departments of Anesthesiology & Perioperative Medicine, and Psychiatry, University of Pittsburgh School of Medicine, Pittsburgh, PA, United States
Penney Cowan American Chronic Pain Association, Rocklin, CA, United States
McKenzie Ferguson Department of Pharmacy Practice, Southern Illinois University Edwardsville, Edwardsville, IL, United States
Roy Freeman Department of Neurology, Harvard Medical School, Boston, MA, United States
Jennifer S Gewandter Department of Anesthesiology and Perioperative Medicine, University of Rochester, Rochester, NY, United States
Ian Gilron Departments of Anesthesiology & Perioperative Medicine, Biomedical & Molecular Sciences, Centre for Neuroscience Studies, and School of Policy Studies, Queen's University, Kingston Health Sciences Centre, Kingston, ON, Canada
Hanna Grol-Prokopczyk Department of Sociology, University at Buffalo, State University of New York, Buffalo, NY, United States
Smriti Iyengar Eli Lilly and Company, Indianapolis, IN, United States
Cornelia Kamp Center for Health and Technology (CHeT), Clinical Materials Services Unit (CMSU), University of Rochester Medical Center, Rochester, NY, United States
Barbara I Karp National Institutes of Health, Bethesda, MD, United States
Bethea A Kleykamp University of Maryland, School of Medicine, Baltimore, MD, United States
John D Loeser Departments of Neurological Surgery and Anesthesia and Pain Medicine, University of Washington, Seattle, WA, United States
Sean Mackey Stanford University School of Medicine, Department of Anesthesiology, Perioperative, and Pain Medicine, Neurosciences and Neurology, Palo Alto, CA, United States
Richard Malamut Chief Medical Officer, MedinCell, Jacou, France
Ewan McNicol Department of Pharmacy Practice, Massachusetts College of Pharmacy and Health Sciences University, Boston, MA, United States
Kushang V Patel Department of Anesthesiology and Pain Medicine, University of Washington, Seattle, WA, United States
Kenneth Schmader Department of Medicine-Geriatrics, Center for the Study of Aging, Duke University Medical Center, and Geriatrics Research Education and Clinical Center, Durham VA Medical Center, Durham, NC, United States
Lee Simon SDG, LLC, Cambridge, MA, United States
Deborah J Steiner Eli Lilly and Company, Indianapolis, IN, United States
Christin Veasley Chronic Pain Research Alliance, Milwaukee, WI, United States
Jan Vollert Department of Clinical and Biomedical Sciences, Faculty of Health and Life Sciences, University of Exeter, Exeter, United Kingdom

Hughes JA, Wu Y, Jones L, Douglas C, Brown N, Hazelwood S, Lyrstedt AL, Jarugula R, Chu K, Nguyen A. Analyzing pain patterns in the emergency department: Leveraging clinical text deep learning models for real-world insights. Int J Med Inform 2024;190:105544. [PMID: 39003790 DOI: 10.1016/j.ijmedinf.2024.105544] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2024] [Revised: 06/09/2024] [Accepted: 07/06/2024] [Indexed: 07/16/2024]

Abstract

OBJECTIVE

To determine the incidence of patients presenting in pain to a large Australian inner-city emergency department (ED) using a clinical text deep learning algorithm.

MATERIALS AND METHODS

A fine-tuned, domain-specific, transformer-based clinical text deep learning model was used to interpret free-text nursing assessments in the electronic medical records of 235,789 adult presentations to the ED over a three-year period. The model classified presentations according to whether the patient had pain on arrival at the ED. Interrupted time series analysis was used to determine the incidence of pain in patients on arrival over time. We described the changes in the population characteristics and incidence of patients with pain on arrival occurring with the start of the Covid-19 pandemic.

RESULTS

55.16% (95% CI 54.95%-55.36%) of all patients presenting to this ED had pain on arrival. There were differences in demographics and arrival and departure patterns between patients with and without pain. The Covid-19 pandemic initially precipitated a decrease followed by a sharp, sustained rise in pain on arrival, with concurrent changes to the population arriving in pain and their treatment.

DISCUSSION

Applying a clinical text deep learning model has successfully identified the incidence of pain on arrival. It represents an automated, reproducible mechanism to identify pain from routinely collected medical records. The description of this population and their treatment forms the basis of intervention to improve care for patients with pain. The combination of the clinical text deep learning models and interrupted time series analysis has reported on the effects of the Covid-19 pandemic on pain care in the ED, outlining a methodology to assess the impact of significant events or interventions on pain care in the ED.

CONCLUSION

Applying a novel deep learning approach to identifying pain guides methodological approaches to evaluating pain care interventions in the ED, giving previously unavailable population-level insights.
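
The confidence interval quoted in the Results above can be roughly reproduced from the two figures the abstract gives (235,789 presentations, 55.16% with pain on arrival). The following is a minimal sketch that assumes a normal-approximation (Wald) interval; the abstract does not say which interval method the authors used, and the function name `wald_interval` is hypothetical.

```python
import math

def wald_interval(p_hat, n, z=1.96):
    """Normal-approximation (Wald) confidence interval for a proportion."""
    # Standard error of a sample proportion.
    se = math.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - z * se, p_hat + z * se

# Figures taken from the abstract: 55.16% of 235,789 presentations.
low, high = wald_interval(0.5516, 235_789)
print(f"{low:.2%} to {high:.2%}")
```

Under these assumptions the interval lands within a few hundredths of a percentage point of the reported one; the small gap is what one would expect from rounding the proportion to two decimals before recomputing.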

Payton EM, Graber ML, Bachiashvili V, Mehta T, Dissanayake PI, Berner ES. Impact of clinical note format on diagnostic accuracy and efficiency. Health Inf Manag J 2024;53:183-188. [PMID: 37129041 DOI: 10.1177/18333583231151979] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/03/2023]

Cao T, Brady V, Whisenant M, Wang X, Gu Y, Wu H. Toward Reliable Symptom Coding in Electronic Health Records for Symptom Assessment and Research: Identification and Categorization of International Classification of Diseases, Ninth Revision, Clinical Modification Symptom Codes. Comput Inform Nurs 2024;42:636-647. [PMID: 38968447 PMCID: PMC11377150 DOI: 10.1097/cin.0000000000001146] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/07/2024]

Danna G, Garg R, Buchheit J, Patel R, Zhan T, Ellyn A, Maqbool F, Yala L, Moklyak Y, Frydman J, Kho A, Kong N, Furmanchuk A, Lundberg A, Stey AM. Prediction of intra-abdominal injury using natural language processing of electronic medical record data. Surgery 2024;176:577-585. [PMID: 38972771 PMCID: PMC11330356 DOI: 10.1016/j.surg.2024.05.042] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2023] [Revised: 05/12/2024] [Accepted: 05/28/2024] [Indexed: 07/09/2024]

Abstract

BACKGROUND

This study aimed to use natural language processing to predict the presence of intra-abdominal injury using unstructured data from electronic medical records.

METHODS

This was a random-sample retrospective observational cohort study leveraging unstructured data from injured patients taken to one of 9 acute care hospitals in an integrated health system between 2015 and 2021. Patients with International Classification of Diseases External Cause of Morbidity codes were identified. History and physical, consult, progress, and radiology report text from the first 8 hours of care were abstracted. Annotator dyads independently annotated encounters' text files to establish ground truth regarding whether intra-abdominal injury occurred. Features were extracted from text using natural language processing techniques, bag of words, and principal component analysis. We tested logistic regression, random forests, and gradient boosting machine to determine accuracy, recall, and precision of natural language processing to predict intra-abdominal injury.

RESULTS

A random sample of 7,000 patient encounters of 177,127 was annotated. Only 2,951 had sufficient information to determine whether an intra-abdominal injury was present. Among those, 84 (2.9%) had an intra-abdominal injury. The concordance between annotators was 0.989. Logistic regression of features identified with bag of words and principal component analysis had the best predictive ability, with an area under the receiver operating characteristic curve of 0.9, recall of 0.73, and precision of 0.17. Text features with greatest importance included "abdomen," "pelvis," "spleen," and "hematoma."

CONCLUSION

Natural language processing could be a screening decision support tool, which, if paired with human clinical assessment, can maximize precision of intra-abdominal injury identification.
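
The abstract reports recall and precision but no combined score. As a small worked example, the F1 score (the harmonic mean of the two) can be derived from the reported values of 0.73 and 0.17. The function name `f1_score` below is an illustrative helper, not something taken from the study.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Values reported in the abstract for the best-performing model.
f1 = f1_score(precision=0.17, recall=0.73)
print(f"F1 = {f1:.3f}")
```

Because the harmonic mean is pulled toward the smaller of the two inputs, the low precision dominates the result, which is consistent with the abstract's framing of the model as a screening aid to be paired with clinical assessment.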

Albashayreh A, Bandyopadhyay A, Zeinali N, Zhang M, Fan W, Gilbertson-White S. Natural Language Processing Accurately Differentiates Cancer Symptom Information in Electronic Health Record Narratives. JCO Clin Cancer Inform 2024;8:e2300235. [PMID: 39116379 DOI: 10.1200/cci.23.00235] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2023] [Revised: 04/29/2024] [Accepted: 05/30/2024] [Indexed: 08/10/2024] Open

Abstract

PURPOSE

Identifying cancer symptoms in electronic health record (EHR) narratives is feasible with natural language processing (NLP). However, more efficient NLP systems are needed to detect various symptoms and distinguish observed symptoms from negated symptoms and medication-related side effects. We evaluated the accuracy of NLP in (1) detecting 14 symptom groups (ie, pain, fatigue, swelling, depressed mood, anxiety, nausea/vomiting, pruritus, headache, shortness of breath, constipation, numbness/tingling, decreased appetite, impaired memory, disturbed sleep) and (2) distinguishing observed symptoms in EHR narratives among patients with cancer.

METHODS

We extracted 902,508 notes for 11,784 unique patients diagnosed with cancer and developed a gold standard corpus of 1,112 notes labeled for presence or absence of 14 symptom groups. We trained an embeddings-augmented NLP system integrating human and machine intelligence and conventional machine learning algorithms. NLP metrics were calculated on a gold standard corpus subset for testing.

RESULTS

The interannotator agreement for labeling the gold standard corpus was excellent at 92%. The embeddings-augmented NLP model achieved the best performance (F1 score = 0.877). The highest NLP accuracy was observed in pruritus (F1 score = 0.937) while the lowest accuracy was in swelling (F1 score = 0.787). After classifying the entire data set with embeddings-augmented NLP, we found that 41% of the notes included symptom documentation. Pain was the most documented symptom (29% of all notes) while impaired memory was the least documented (0.7% of all notes).

CONCLUSION

We illustrated the feasibility of detecting 14 symptom groups in EHR narratives and showed that an embeddings-augmented NLP system outperforms conventional machine learning algorithms in detecting symptom information and differentiating observed symptoms from negated symptoms and medication-related side effects.

Mora S, Giacobbe DR, Bartalucci C, Viglietti G, Mikulska M, Vena A, Ball L, Robba C, Cappello A, Battaglini D, Brunetti I, Pelosi P, Bassetti M, Giacomini M. Towards the automatic calculation of the EQUAL Candida Score: Extraction of CVC-related information from EMRs of critically ill patients with candidemia in Intensive Care Units. J Biomed Inform 2024;156:104667. [PMID: 38848885 DOI: 10.1016/j.jbi.2024.104667] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/22/2023] [Revised: 06/01/2024] [Accepted: 06/03/2024] [Indexed: 06/09/2024]

Affiliation(s)

Sara Mora Department of Informatics, Bioengineering, Robotics and System Engineering (DIBRIS), University of Genoa, Genoa, Italy; UO Information and Communication Technologies (ICT), IRCCS Ospedale Policlinico San Martino, Genoa, Italy.
Daniele Roberto Giacobbe Department of Health Sciences (DISSAL), University of Genoa, Genoa, Italy; Clinica Malattie Infettive, IRCCS Ospedale Policlinico San Martino, Genoa, Italy
Claudia Bartalucci Department of Health Sciences (DISSAL), University of Genoa, Genoa, Italy; Clinica Malattie Infettive, IRCCS Ospedale Policlinico San Martino, Genoa, Italy
Giulia Viglietti Clinica Malattie Infettive, IRCCS Ospedale Policlinico San Martino, Genoa, Italy
Malgorzata Mikulska Department of Health Sciences (DISSAL), University of Genoa, Genoa, Italy; Clinica Malattie Infettive, IRCCS Ospedale Policlinico San Martino, Genoa, Italy
Antonio Vena Department of Health Sciences (DISSAL), University of Genoa, Genoa, Italy; Clinica Malattie Infettive, IRCCS Ospedale Policlinico San Martino, Genoa, Italy
Lorenzo Ball Department of Surgical Sciences and Integrated Diagnostics (DISC), University of Genoa, Genoa, Italy; Anesthesia and Intensive Care, IRCCS Ospedale Policlinico San Martino, Genoa, Italy
Chiara Robba Department of Surgical Sciences and Integrated Diagnostics (DISC), University of Genoa, Genoa, Italy; Anesthesia and Intensive Care, IRCCS Ospedale Policlinico San Martino, Genoa, Italy
Alice Cappello Clinica Malattie Infettive, IRCCS Ospedale Policlinico San Martino, Genoa, Italy
Denise Battaglini Anesthesia and Intensive Care, IRCCS Ospedale Policlinico San Martino, Genoa, Italy
Iole Brunetti Anesthesia and Intensive Care, IRCCS Ospedale Policlinico San Martino, Genoa, Italy
Paolo Pelosi Department of Surgical Sciences and Integrated Diagnostics (DISC), University of Genoa, Genoa, Italy; Anesthesia and Intensive Care, IRCCS Ospedale Policlinico San Martino, Genoa, Italy
Matteo Bassetti Department of Health Sciences (DISSAL), University of Genoa, Genoa, Italy; Clinica Malattie Infettive, IRCCS Ospedale Policlinico San Martino, Genoa, Italy
Mauro Giacomini Department of Informatics, Bioengineering, Robotics and System Engineering (DIBRIS), University of Genoa, Genoa, Italy

Borchert F, Llorca I, Schapranow MP. Improving biomedical entity linking for complex entity mentions with LLM-based text simplification. Database (Oxford) 2024;2024:baae067. [PMID: 39066514 PMCID: PMC11281847 DOI: 10.1093/database/baae067] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/07/2024] [Revised: 05/08/2024] [Accepted: 07/18/2024] [Indexed: 07/28/2024]

Kim A, Jeon E, Lee H, Heo H, Woo K. Risk factors for prediabetes in community-dwelling adults: A generalized estimating equation logistic regression approach with natural language processing insights. Res Nurs Health 2024. [PMID: 38961672 DOI: 10.1002/nur.22413] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2023] [Revised: 05/11/2024] [Accepted: 06/22/2024] [Indexed: 07/05/2024]

Park JI, Park JW, Zhang K, Kim D. Advancing equity in breast cancer care: natural language processing for analysing treatment outcomes in under-represented populations. BMJ Health Care Inform 2024;31:e100966. [PMID: 38955389 PMCID: PMC11218025 DOI: 10.1136/bmjhci-2023-100966] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2023] [Accepted: 06/21/2024] [Indexed: 07/04/2024] Open

Abstract

OBJECTIVE

The study aimed to develop natural language processing (NLP) algorithms to automate extracting patient-centred breast cancer treatment outcomes from clinical notes in electronic health records (EHRs), particularly for women from under-represented populations.

METHODS

The study used clinical notes from 2010 to 2021 from a tertiary hospital in the USA. The notes were processed through various NLP techniques, including vectorisation methods (term frequency-inverse document frequency (TF-IDF), Word2Vec, Doc2Vec) and classification models (support vector classification, K-nearest neighbours (KNN), random forest (RF)). Feature selection and optimisation through random search and fivefold cross-validation were also conducted.

RESULTS

The study annotated 100 out of 1000 clinical notes, using 970 notes to build the text corpus. TF-IDF and Doc2Vec combined with RF showed the highest performance, while Word2Vec was less effective. RF classifier demonstrated the best performance, although with lower recall rates, suggesting more false negatives. KNN showed lower recall due to its sensitivity to data noise.

DISCUSSION

The study highlights the significance of using NLP in analysing clinical notes to understand breast cancer treatment outcomes in under-represented populations. The TF-IDF and Doc2Vec models were more effective in capturing relevant information than Word2Vec. The study observed lower recall rates in RF models, attributed to the dataset's imbalanced nature and the complexity of clinical notes.

CONCLUSION

The study developed a high-performing NLP pipeline to capture treatment outcomes for breast cancer in under-represented populations, demonstrating the importance of document-level vectorisation and ensemble methods in clinical notes analysis. The findings provide insights for more equitable healthcare strategies and show the potential for broader NLP applications in clinical settings.

Gleason KT, Tran A, Fawzy A, Yan L, Farley H, Garibaldi B, Iwashyna TJ. Does nurse use of a standardized flowsheet to document communication with advanced providers provide a mechanism to detect pulse oximetry failures? A retrospective study of electronic health record data. Int J Nurs Stud 2024;155:104770. [PMID: 38676990 DOI: 10.1016/j.ijnurstu.2024.104770] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/10/2024] [Revised: 03/05/2024] [Accepted: 04/02/2024] [Indexed: 04/29/2024]

Abstract

BACKGROUND

Pulse oximetry guides clinical decisions, yet does not uniformly identify hypoxemia. We hypothesized that nursing documentation of notifying providers, facilitated by a standardized flowsheet for documenting communication to providers (physicians, nurse practitioners, and physician assistants), may increase when hypoxemia is present, but undetected by the pulse oximeter, in events termed "occult hypoxemia."

OBJECTIVE

To compare nurse documentation of provider notification in the 4 h preceding cases of occult hypoxemia, normal oxygenation, and evident hypoxemia confirmed by an arterial blood gas reading.

METHODS

We conducted a retrospective study using electronic health record data from patients with COVID-19 at five hospitals in a healthcare system with paired SpO2 and SaO2 readings (measurements within 10 min of oxygen saturation levels in arterial blood, SaO2, and by pulse oximetry, SpO2). We applied multivariate logistic regression to assess if having any nursing documentation of provider notification in the 4 h prior to a paired reading confirming occult hypoxemia was more likely compared to a paired reading confirming normal oxygen status, adjusting for characteristics significantly associated with nursing documentation. We applied conditional logistic regression to assess if having any nursing documentation of provider notification was more likely in the 4-hour window preceding a paired reading compared to the 4-hour window 24 h earlier separately for occult hypoxemia, evident hypoxemia, and normal oxygenation.

RESULTS

There were data from 1910 patients hospitalized with COVID-19 who had 44,972 paired readings and an average of 26.5 (34.5) nursing documentation of provider notification events. The mean age was 63.4 (16.2). Almost half (866/1910, 45.3 %) were White, 701 (36.7 %) were Black, and 239 (12.5 %) were Hispanic. Having any nursing documentation of provider notification was 46 % more common in the 4 h before an occult hypoxemia paired reading compared to a normal oxygen status paired reading (OR 1.46, 95 % CI: 1.28-1.67). Comparing the 4 h immediately before the reading to the 4 h one day preceding the paired reading, there was a higher likelihood of having any nursing documentation of provider notification for both evident (OR 1.45, 95 % CI 1.24-1.68) and occult paired readings (OR 1.26, 95 % CI 1.04-1.53).

CONCLUSION

This study finds that nursing documentation of provider notification significantly increases prior to confirmed occult hypoxemia, which has potential in proactively identifying occult hypoxemia and other clinical issues. There is potential value to encouraging standardized documentation of nurse concern, including communication to providers, to facilitate its inclusion in clinical decision-making.
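
Abstracts like this one report odds ratios with 95% confidence intervals but not standard errors. A common back-calculation, sketched below, assumes the interval was built symmetrically on the log scale (as logistic regression intervals usually are) and recovers the point estimate and the standard error of log(OR) from the bounds. The function name `log_or_summary` is hypothetical, and the input is the occult-versus-normal interval (1.28-1.67) quoted in the Results.

```python
import math

def log_or_summary(lower, upper, z=1.96):
    """Recover the OR point estimate and SE of log(OR) from a 95% CI,
    assuming the interval is symmetric on the log scale."""
    log_low, log_high = math.log(lower), math.log(upper)
    # The midpoint of the log-bounds is the log of the point estimate.
    log_or = (log_low + log_high) / 2
    # The full width of the interval spans 2 * z standard errors.
    se = (log_high - log_low) / (2 * z)
    return math.exp(log_or), se

or_hat, se = log_or_summary(1.28, 1.67)
print(f"OR ~ {or_hat:.2f}, SE(log OR) ~ {se:.3f}")
```

The recovered point estimate agrees with the reported OR of 1.46 to two decimals, which supports the log-symmetry assumption for this interval. Note that an odds ratio of 1.46 describes 46% higher odds, which matches a 46% higher probability only approximately, and only when the event is uncommon.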

Jackson AF, Burkom H. A Framework for Developing and Assessing Custom Case Definitions: A Demonstration Applied to Opioid Overdose in Maryland. J Public Health Manag Pract 2024;30:578-585. [PMID: 38870375 DOI: 10.1097/phh.0000000000001885] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2024]

Abstract

CONTEXT

Public health epidemiologists monitor data sources for disease outbreaks and other events of public health concern, but manual review of records to identify cases of interest is slow and labor-intensive and may not reflect evolving data practices. To automatically identify cases from electronic data sources, epidemiologists must use "case definitions" or formal logic that captures the criteria used to identify a record as a case of interest.

OBJECTIVE

To establish a methodology for development and evaluation of case definitions. A logical evaluation framework to approach case definitions will allow jurisdictions the flexibility to implement a case definition tailored to their goals and available data.

DESIGN

Case definition development is explained as a process with multiple logical components combining free-text and categorical data fields. The process is illustrated with the development of a case definition to identify emergency medical services (EMS) call records related to opioid overdoses in Maryland.

SETTING

The Maryland Department of Health (MDH) installation of the Electronic Surveillance System for Early Notification of Community-Based Epidemics (ESSENCE), which began capturing EMS call records in ESSENCE in 2019 to improve statewide coverage of all-hazards health issues.

RESULTS

We describe a case definition evaluation framework and demonstrate its application through development of an opioid overdose case definition to be used in MDH ESSENCE. We show the iterative process of development, from defining how a case can be identified conceptually to examining each component of the conceptual definition and then exploring how to capture that component using available data.

CONCLUSION

We present a framework for developing and qualitatively assessing case definitions and demonstrate an application of the framework to identifying opioid overdose incidents from MDH EMS data. We discuss guidelines to support jurisdictions in applying this framework to their own data and public health challenges to improve local surveillance capability.
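As a minimal sketch of what a case definition combining free-text and categorical fields could look like in code, consider the Python below. The field names (`narrative`, `primary_impression`, `naloxone_given`), the keyword lists, and the category values are hypothetical illustrations and are not the MDH ESSENCE definition described by the authors.

```python
import re
from dataclasses import dataclass

# Hypothetical free-text terms and category values, for illustration only.
OPIOID_TERMS = re.compile(r"\b(opioid|heroin|fentanyl|narcan|naloxone)\b", re.IGNORECASE)
NEGATION = re.compile(r"\b(denies|no evidence of|ruled out)\b", re.IGNORECASE)
OVERDOSE_IMPRESSIONS = {"overdose", "poisoning"}


@dataclass
class EmsRecord:
    narrative: str           # free-text field
    primary_impression: str  # categorical field
    naloxone_given: bool     # categorical field


def is_case(record: EmsRecord) -> bool:
    """Return True when the record meets the illustrative case definition."""
    # Free-text component: an opioid term is present and no negation phrase appears.
    text_hit = bool(OPIOID_TERMS.search(record.narrative)) and not NEGATION.search(record.narrative)
    # Categorical component: the coded impression indicates an overdose.
    category_hit = record.primary_impression.lower() in OVERDOSE_IMPRESSIONS
    # Combined logic: (free-text evidence AND overdose category) OR naloxone administered.
    return (text_hit and category_hit) or record.naloxone_given
```

Keeping each component as a named boolean makes it easy to evaluate the components separately, which is in the spirit of the iterative, component-by-component assessment the abstract describes.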

Wieland-Jorna Y, van Kooten D, Verheij RA, de Man Y, Francke AL, Oosterveld-Vlug MG. Natural language processing systems for extracting information from electronic health records about activities of daily living. A systematic review. JAMIA Open 2024;7:ooae044. [PMID: 38798774 PMCID: PMC11126158 DOI: 10.1093/jamiaopen/ooae044] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/03/2024] [Revised: 03/21/2024] [Accepted: 05/07/2024] [Indexed: 05/29/2024] Open

Smith SJ, Moorin R, Taylor K, Newton J, Smith S. Collecting routine and timely cancer stage at diagnosis by implementing a cancer staging tiered framework: the Western Australian Cancer Registry experience. BMC Health Serv Res 2024;24:770. [PMID: 38943091 PMCID: PMC11214229 DOI: 10.1186/s12913-024-11224-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2024] [Accepted: 06/20/2024] [Indexed: 07/01/2024] Open

Abstract

BACKGROUND

Current processes collecting cancer stage data in population-based cancer registries (PBCRs) lack standardisation, resulting in difficulty utilising diverse data sources and incomplete, low-quality data. Implementing a cancer staging tiered framework aims to improve stage collection and facilitate inter-PBCR benchmarking.

OBJECTIVE

Demonstrate the application of a cancer staging tiered framework in the Western Australian Cancer Staging Project to establish a standardised method for collecting cancer stage at diagnosis data in PBCRs.

METHODS

The tiered framework, developed in collaboration with a Project Advisory Group and applied to breast, colorectal, and melanoma cancers, provides business rules - procedures for stage collection. Tier 1 represents the highest staging level, involving complete American Joint Committee on Cancer (AJCC) tumour-node-metastasis (TNM) data collection and other critical staging information. Tier 2 (registry-derived stage) relies on supplementary data, including hospital admission data, to make assumptions based on data availability. Tier 3 (pathology stage) solely uses pathology reports.

FINDINGS

The tiered framework promotes flexible utilisation of staging data, recognising various levels of data completeness. Tier 1 is suitable for all purposes, including clinical and epidemiological applications. Tiers 2 and 3 are recommended for epidemiological analysis alone. Lower tiers provide valuable insights into disease patterns, risk factors, and overall disease burden for public health planning and policy decisions. Capture of staging at each tier depends on data availability, with potential shifts to higher tiers as new data sources are acquired.

CONCLUSIONS

The tiered framework offers a dynamic approach for PBCRs to record stage at diagnosis, promoting consistency in population-level staging data and enabling practical use for benchmarking across jurisdictions, public health planning, policy development, epidemiological analyses, and assessing cancer outcomes. Evolution with staging classifications and data variable changes will futureproof the tiered framework. Its adaptability fosters continuous refinement of data collection processes and encourages improvements in data quality.
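The tier logic outlined in the Methods can be read as a small rule cascade. The Python below is a sketch under stated assumptions: the three boolean inputs are hypothetical simplifications of "data availability", and the real business rules are those defined in the paper, not these.

```python
from typing import Optional


def assign_tier(
    has_full_tnm: bool,
    has_supplementary_data: bool,
    has_pathology_report: bool,
) -> Optional[int]:
    """Return the highest staging tier the available data supports, or None."""
    if has_full_tnm:
        return 1  # complete AJCC TNM data collection
    if has_supplementary_data:
        return 2  # registry-derived stage, e.g. using hospital admission data
    if has_pathology_report:
        return 3  # pathology stage, from pathology reports alone
    return None   # no stage can be captured with the available data
```

Because the function always returns the highest tier supported, re-running it after a new data source is acquired naturally models the shift to higher tiers mentioned in the Findings.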

Mun M, Kim A, Woo K. Natural Language Processing Application in Nursing Research: A Study Using Text Network Analysis and Topic Modeling. Comput Inform Nurs 2024:00024665-990000000-00202. [PMID: 38913983 DOI: 10.1097/cin.0000000000001158] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/26/2024]

Dos Santos VL, Sato KS, Maher CG, Vidal RVC, Grande GHD, Costa LOP, Machado GC, Ferreira GE, Buchbinder R, Oliveira CB. Clinical indicators to monitor health care in low back pain: a scoping review. Int J Qual Health Care 2024;36:mzae044. [PMID: 38814664 DOI: 10.1093/intqhc/mzae044] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2023] [Accepted: 05/27/2024] [Indexed: 05/31/2024] Open

Affiliation(s)

Vanessa L Dos Santos Faculty of Medicine, University of Western São Paulo (UNOESTE), Presidente Prudente, Sao Paulo 19050-920, Brazil
Karen S Sato Faculty of Medicine, University of Western São Paulo (UNOESTE), Presidente Prudente, Sao Paulo 19050-920, Brazil
Chris G Maher Institute for Musculoskeletal Health, Sydney Local Health District, King George V Building, Missenden Road, Camperdown, Sydney, New South Wales 2050, Australia Sydney Musculoskeletal Health, Faculty of Medicine and Health, The University of Sydney, King George V Building, Missenden Road, Camperdown, Sydney, New South Wales 2050, Australia
Rubens V C Vidal Faculty of Medicine, University of Western São Paulo (UNOESTE), Presidente Prudente, Sao Paulo 19050-920, Brazil
Guilherme H D Grande Faculty of Medicine, University of Western São Paulo (UNOESTE), Presidente Prudente, Sao Paulo 19050-920, Brazil Departamento de Educação Física, Faculdade de Ciências e Tecnologia, Universidade Estadual Paulista, Rua Roberto Simonsen, 305, Presidente Prudente, Sao Paulo 19060-900, Brazil
Leonardo O P Costa Masters and Doctoral Programs in Physical Therapy, Universidade Cidade de São Paulo, Rua Cesário Galeno, 448, Sao Paulo 03071-000, Brazil
Gustavo C Machado Institute for Musculoskeletal Health, Sydney Local Health District, King George V Building, Missenden Road, Camperdown, Sydney, New South Wales 2050, Australia Sydney Musculoskeletal Health, Faculty of Medicine and Health, The University of Sydney, King George V Building, Missenden Road, Camperdown, Sydney, New South Wales 2050, Australia
Giovanni E Ferreira Institute for Musculoskeletal Health, Sydney Local Health District, King George V Building, Missenden Road, Camperdown, Sydney, New South Wales 2050, Australia Sydney Musculoskeletal Health, Faculty of Medicine and Health, The University of Sydney, King George V Building, Missenden Road, Camperdown, Sydney, New South Wales 2050, Australia
Rachelle Buchbinder Musculoskeletal Health and Wiser Health Care Units, School of Public Health and Preventive Medicine, Monash University, 4 Drysdale St, Malvern, Melbourne, Victoria 3144, Australia
Crystian B Oliveira Faculty of Medicine, University of Western São Paulo (UNOESTE), Presidente Prudente, Sao Paulo 19050-920, Brazil Departamento de Educação Física, Faculdade de Ciências e Tecnologia, Universidade Estadual Paulista, Rua Roberto Simonsen, 305, Presidente Prudente, Sao Paulo 19060-900, Brazil

Le KDR, Tay SBP, Choy KT, Verjans J, Sasanelli N, Kong JCH. Applications of natural language processing tools in the surgical journey. Front Surg 2024;11:1403540. [PMID: 38826809 PMCID: PMC11140056 DOI: 10.3389/fsurg.2024.1403540] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/19/2024] [Accepted: 05/07/2024] [Indexed: 06/04/2024] Open

Abstract

Background

Natural language processing tools are increasingly being adopted in multiple industries worldwide. They have shown promising results; however, their use in the field of surgery is under-recognised. Many trials have assessed these benefits in small settings, with promising results, as a step before large-scale adoption can be considered in surgery. This study aims to review the current research and insights into the potential for implementation of natural language processing tools in surgery.

Methods

A narrative review was conducted following a computer-assisted literature search of the Medline, EMBASE and Google Scholar databases. Papers related to natural language processing tools and consideration of their use in surgery were included.

Results

Current applications of natural language processing tools within surgery are limited. From the literature, there is evidence of potential improvement in surgical capability and service delivery, such as through the use of these technologies to streamline processes including surgical triaging, data collection and auditing, surgical communication and documentation. Additionally, there is potential to extend these capabilities to surgical academia to improve processes in surgical research and allow innovation in the development of educational resources. Despite these outcomes, the evidence to support these findings is challenged by small sample sizes with limited applicability to broader settings.

Conclusion

With the increasing adoption of natural language processing technology, such as in popular forms like ChatGPT, there has been increasing research into the use of these tools within surgery to improve surgical workflow and efficiency. This review highlights multifaceted applications of natural language processing within surgery, albeit with clear limitations due to the infancy of the infrastructure available to leverage these technologies. There remains room for more rigorous research into the broader capability of natural language processing technology within the field of surgery, and a need for cross-sectoral collaboration to understand the ways in which these algorithms can best be integrated.

Roberts K, Chin AT, Loewy K, Pompeii L, Shin H, Rider NL. Natural language processing of clinical notes enables early inborn error of immunity risk ascertainment. J Allergy Clin Immunol Glob 2024;3:100224. [PMID: 38439946 PMCID: PMC10910118 DOI: 10.1016/j.jacig.2024.100224] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/06/2023] [Revised: 12/24/2023] [Accepted: 01/21/2024] [Indexed: 03/06/2024]

Harris C, Tang Y, Birnbaum E, Cherian C, Mendhe D, Chen MH. Digital Neuropsychology beyond Computerized Cognitive Assessment: Applications of Novel Digital Technologies. Arch Clin Neuropsychol 2024;39:290-304. [PMID: 38520381 DOI: 10.1093/arclin/acae016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2024] [Accepted: 02/16/2024] [Indexed: 03/25/2024] Open

Reading Turchioe M, Volodarskiy A, Guo W, Taylor B, Hobensack M, Pathak J, Slotwiner D. Characterizing atrial fibrillation symptom improvement following de novo catheter ablation. Eur J Cardiovasc Nurs 2024;23:241-250. [PMID: 37479225 PMCID: PMC11008952 DOI: 10.1093/eurjcn/zvad068] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/20/2023] [Revised: 07/05/2023] [Accepted: 07/18/2023] [Indexed: 07/23/2023]

Sim JA, Huang X, Horan MR, Baker JN, Huang IC. Using natural language processing to analyze unstructured patient-reported outcomes data derived from electronic health records for cancer populations: a systematic review. Expert Rev Pharmacoecon Outcomes Res 2024;24:467-475. [PMID: 38383308 PMCID: PMC11001514 DOI: 10.1080/14737167.2024.2322664] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/02/2023] [Accepted: 02/20/2024] [Indexed: 02/23/2024]

Leal I, Nogueira V, Matos DB, Araújo J, Berens O, Ribeiro M, Furtado MJ, Liverani M, Silva MI, Guedes M, Cordeiro M, Ribeiro M, José P, Barão R, Nunes Ferreira R, Fonseca S, Mano S, Pina S, Santos MJ, Fonseca JE, Fonseca C, Figueira L. Design and Development of a Web-Based Prospective Nationwide Registry for Ocular Inflammatory Diseases: UVEITE.PT - The Portuguese Ocular Inflammation Registry. Ocul Immunol Inflamm 2024;32:342-350. [PMID: 36780588 DOI: 10.1080/09273948.2023.2171891] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2022] [Revised: 01/11/2023] [Accepted: 01/18/2023] [Indexed: 02/15/2023]

Affiliation(s)

Inês Leal Ophthalmology Department, Hospital de Santa Maria, Centro Hospitalar Universitário Lisboa Norte, Centro Académico de Medicina de Lisboa, Lisbon, Portugal Centro de Estudos das Ciências da Visão, Faculdade de Medicina, Universidade de Lisboa, Lisbon, Portugal
Vanda Nogueira Instituto de Oftalmologia Dr. Gama Pinto, Lisbon, Portugal
Diogo Bernardo Matos Ophthalmology Department, Hospital de Santa Maria, Centro Hospitalar Universitário Lisboa Norte, Centro Académico de Medicina de Lisboa, Lisbon, Portugal Centro de Estudos das Ciências da Visão, Faculdade de Medicina, Universidade de Lisboa, Lisbon, Portugal
Joana Araújo Ophthalmology Department, Centro Hospitalar Universitário São João, Porto, Portugal Departamento de Cirurgia e Fisiologia, Faculdade de Medicina, Universidade do Porto, Porto, Portugal
Olga Berens Ophthalmology Department, Hospital do Espírito Santo, Évora, Portugal
Margarida Ribeiro Ophthalmology Department, Centro Hospitalar Universitário São João, Porto, Portugal Department of Biomedicine, Unit of Pharmacology and Therapeutics, Faculdade de Medicina, Universidade do Porto, Porto, Portugal
Maria João Furtado Ophthalmology Department, Centro Hospitalar Universitário do Porto, Porto, Portugal
Marco Liverani Ophthalmology Department, Hospital de Vila Franca de Xira, Vila Franca de Xira, Portugal
Marta Inês Silva Ophthalmology Department, Centro Hospitalar Universitário São João, Porto, Portugal
Marta Guedes Ophthalmology Department, Hospital Egas Moniz, Centro Hospitalar de Lisboa Ocidental, Lisbon, Portugal
Miguel Cordeiro Ophthalmology Department, Hospital Egas Moniz, Centro Hospitalar de Lisboa Ocidental, Lisbon, Portugal
Miguel Ribeiro Ophthalmology Department, Centro Hospitalar Tondela-Viseu, Viseu, Portugal
Patrícia José Ophthalmology Department, Hospital de Santa Maria, Centro Hospitalar Universitário Lisboa Norte, Centro Académico de Medicina de Lisboa, Lisbon, Portugal Centro de Estudos das Ciências da Visão, Faculdade de Medicina, Universidade de Lisboa, Lisbon, Portugal
Rafael Barão Ophthalmology Department, Hospital de Santa Maria, Centro Hospitalar Universitário Lisboa Norte, Centro Académico de Medicina de Lisboa, Lisbon, Portugal Centro de Estudos das Ciências da Visão, Faculdade de Medicina, Universidade de Lisboa, Lisbon, Portugal
Rui Nunes Ferreira Ophthalmology Department, Hospital de Santa Maria, Centro Hospitalar Universitário Lisboa Norte, Centro Académico de Medicina de Lisboa, Lisbon, Portugal Centro de Estudos das Ciências da Visão, Faculdade de Medicina, Universidade de Lisboa, Lisbon, Portugal
Sofia Fonseca Ophthalmology Department, Centro Hospitalar de Vila Nova de Gaia/Espinho, Vila Nova de Gaia, Portugal
Sofia Mano Ophthalmology Department, Hospital de Santa Maria, Centro Hospitalar Universitário Lisboa Norte, Centro Académico de Medicina de Lisboa, Lisbon, Portugal Centro de Estudos das Ciências da Visão, Faculdade de Medicina, Universidade de Lisboa, Lisbon, Portugal
Susana Pina Ophthalmology Department, Hospital Beatriz Ângelo, Loures, Portugal
Maria José Santos Rheumatology Department, Hospital Garcia de Orta, Almada, Portugal Rheumatology Research Unit, Instituto de Medicina Molecular, Faculdade de Medicina, Universidade de Lisboa, Centro Académico de Medicina de Lisboa, Lisbon, Portugal
João Eurico Fonseca Rheumatology Research Unit, Instituto de Medicina Molecular, Faculdade de Medicina, Universidade de Lisboa, Centro Académico de Medicina de Lisboa, Lisbon, Portugal Rheumatology Department, Hospital de Santa Maria, Centro Hospitalar Universitário Lisboa Norte, Lisbon, Portugal
Cristina Fonseca Ophthalmology Department, Centro de Responsabilidade Integrado de Oftalmologia, Centro Hospitalar e Universitário de Coimbra, Coimbra, Portugal
Luís Figueira Ophthalmology Department, Centro Hospitalar Universitário São João, Porto, Portugal Center for Drug Discovery and Innovative Medicines (MedInUP) of the University of Porto, Porto, Portugal

Mashima Y, Tanigawa M, Yokoi H. Information heterogeneity between progress notes by physicians and nurses for inpatients with digestive system diseases. Sci Rep 2024;14:7656. [PMID: 38561333 PMCID: PMC10984979 DOI: 10.1038/s41598-024-56324-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2023] [Accepted: 03/05/2024] [Indexed: 04/04/2024] Open

Irrera O, Marchesin S, Silvello G. MetaTron: advancing biomedical annotation empowering relation annotation and collaboration. BMC Bioinformatics 2024;25:112. [PMID: 38486137 PMCID: PMC10941452 DOI: 10.1186/s12859-024-05730-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2023] [Accepted: 03/04/2024] [Indexed: 03/17/2024] Open

Abstract

BACKGROUND

The constant growth of biomedical data is accompanied by the need for new methodologies to effectively and efficiently extract machine-readable knowledge for training and testing purposes. A crucial aspect in this regard is creating large annotated corpora, often built manually or semi-manually, which are vital for developing effective and efficient methods for tasks like relation extraction, topic recognition, and entity linking. However, manual annotation is expensive and time-consuming, especially if not assisted by interactive, intuitive, and collaborative computer-aided tools. To support healthcare experts in the annotation process and foster annotated corpora creation, we present MetaTron. MetaTron is an open-source and free-to-use web-based annotation tool to annotate biomedical data interactively and collaboratively; it supports both mention-level and document-level annotations and integrates automatic built-in predictions. Moreover, MetaTron enables relation annotation with the support of ontologies, functionalities often overlooked by off-the-shelf annotation tools.

RESULTS

We conducted a qualitative analysis to compare MetaTron with a set of manual annotation tools, including TeamTat, INCEpTION, LightTag, MedTAG, and brat, on three sets of criteria: technical, data, and functional. A quantitative evaluation allowed us to assess MetaTron's performance in terms of the time and number of clicks needed to annotate a set of documents. The results indicated that MetaTron fulfills almost all the selected criteria and achieves the best performance.

CONCLUSIONS

MetaTron stands out as one of the few annotation tools targeting the biomedical domain that support the annotation of relations and are fully customizable, handling documents in several formats (PDF included) as well as abstracts retrieved from PubMed, Semantic Scholar, and OpenAIRE. To meet any user need, we released MetaTron both as an online instance and as a locally deployable Docker image.

Hughes JA, Hazelwood S, Lyrstedt AL, Jones L, Brown NJ, Jarugula R, Douglas C, Chu K. Enhancing pain care with the American Pain Society Patient Outcome Questionnaire for use in the emergency department (APS-POQ-RED): validating a patient-reported outcome measure. BMJ Open Qual 2024;13:e002295. [PMID: 38448040 PMCID: PMC10916172 DOI: 10.1136/bmjoq-2023-002295] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2023] [Accepted: 12/02/2023] [Indexed: 03/08/2024] Open

Margetta J, Sale A. Distinguishing cardiac catheter ablation energy modalities by applying natural language processing to electronic health records. J Comp Eff Res 2024;13:e230053. [PMID: 38261335 PMCID: PMC10945417 DOI: 10.57264/cer-2023-0053] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2023] [Accepted: 01/03/2024] [Indexed: 01/24/2024] Open

Pankratz N, Cole BR, Beutel KM, Liao KP, Ashe J. Parkinson Disease Genetics Extended to African and Hispanic Ancestries in the VA Million Veteran Program. Neurol Genet 2024;10:e200110. [PMID: 38130828 PMCID: PMC10732342 DOI: 10.1212/nxg.0000000000200110] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/19/2023] [Accepted: 10/06/2023] [Indexed: 12/23/2023]

Abstract

Background and Objectives

Nearly all genetic analyses of Parkinson disease (PD) have been in populations of European ancestry. We sought to test the ability of a machine learning method to extract accurate PD diagnoses from an electronic medical record (EMR) system, to see whether genetic variants identified in European populations generalize to individuals of African and Hispanic ancestries, and to compare the rates of PD across ancestries.

Methods

A machine learning method using natural language processing was applied to EMRs of US veterans participating in the VA Million Veteran Program (MVP) to identify individuals with PD. These putative cases were vetted via blind chart review by a movement disorder specialist. A polygenic risk score (PRS) of 90 established genetic variants whose genotypes were imputed from a customized Axiom Biobank Array was evaluated in different case groups.

Results

The EMR prediction scores had a distinct trimodal distribution, with 97% of the high group and only 30% of the middle group having a credible diagnosis of PD. Using the 3,542 cases from the high group matched 4:1 to controls, the PRS was highly predictive in individuals of European ancestry (n = 3,137 cases; OR = 1.82; p = 8.01E-48), and nearly identical effect sizes were seen in individuals of African (n = 184; OR = 2.07; p = 3.4E-4) and Hispanic ancestries (n = 221; OR = 2.13; p = 3.9E-6). The PRS was much less predictive for the 2,757 European ancestry cases who had an ICD code for PD but for whom the machine learning method had a lower confidence in their diagnosis. No novel ancestry-specific genetic variants were identified. Individuals with African ancestry had one-quarter the rate of PD compared with European or Hispanic ancestries aged 60-70 years and one half the rate in the 70-80 years age range. African American cases had a higher proportion of their DNA originating in Europe compared with African American controls.

Discussion

Machine learning can reliably classify PD using data from a large EMR. Larger studies of non-European populations are required to confirm the generalizability of PD risk variants identified in populations of European ancestry and the increased risk coming from a higher proportion of European DNA in African Americans.
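A polygenic risk score is commonly computed as a weighted sum of risk-allele dosages. The Python sketch below assumes per-variant weights equal to the natural log of each variant's odds ratio, which is a common convention and not necessarily the weighting the authors used. The variant IDs, odds ratios, and dosages are made up for illustration.

```python
import math

# Hypothetical per-variant odds ratios (the study used 90 variants; three are shown).
ODDS_RATIOS = {"rs_A": 1.30, "rs_B": 0.85, "rs_C": 1.10}


def polygenic_risk_score(dosages: dict[str, float]) -> float:
    """Sum over variants of risk-allele dosage times log(odds ratio).

    Dosages range from 0 to 2 and may be fractional when genotypes are imputed.
    """
    return sum(math.log(ODDS_RATIOS[variant]) * dose for variant, dose in dosages.items())


# One hypothetical individual: two copies of rs_A, none of rs_B, one of rs_C.
score = polygenic_risk_score({"rs_A": 2.0, "rs_B": 0.0, "rs_C": 1.0})
```

In a case-control comparison such as the one reported above, scores like this would typically be standardized before estimating an odds ratio per unit of score; that step is omitted here.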

Affiliation(s)

Nathan Pankratz, Benjamin R Cole, Kathleen M Beutel, Katherine P Liao, James Ashe: From the Department of Laboratory Medicine and Pathology (N.P., B.R.C., K.M.B.), School of Medicine, University of Minnesota, Minneapolis; Division of Rheumatology (K.P.L.), Immunology, and Allergy, Brigham and Women's Hospital; Department of Biomedical Informatics (K.P.L.), Harvard Medical School; Division of Data Sciences (K.P.L.), VA Boston Healthcare System, MA; Department of Neurology (J.A.), University of Minnesota Medical School; and Department of Neurology (J.A.), Minneapolis Veterans Affairs Health Care System, MN

Abid R, Hussein AA, Guru KA. Artificial Intelligence in Urology: Current Status and Future Perspectives. Urol Clin North Am 2024;51:117-130. [PMID: 37945097 DOI: 10.1016/j.ucl.2023.06.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2023]

Kuck K, Lofgren L, Lybbert C. Anesthesia Patient Monitoring 2050. Anesth Analg 2024;138:273-283. [PMID: 38215707 DOI: 10.1213/ane.0000000000006660] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 01/14/2024]

Abstract

The monitoring of vital signs in patients undergoing anesthesia began with the very first case of anesthesia and has evolved alongside the development of anesthesiology ever since. Patient monitoring started out as a manually performed, intermittent, and qualitative assessment of the patient's general well-being in the operating room. In its evolution, patient monitoring development has responded to the clinical need, for example, when critical incident studies in the 1980s found that many anesthesia adverse events could be prevented by improved monitoring, especially respiratory monitoring. It also facilitated and perhaps even enabled increasingly complex surgeries in increasingly higher-risk patients. For example, it would be very challenging to perform and provide anesthesia care during some of the very complex cardiovascular surgeries that are almost routine today without being able to simultaneously and reliably monitor multiple pressures in a variety of places in the circulatory system. Of course, anesthesia patient monitoring itself is enabled by technological developments in the world outside of the operating room. Throughout its history, anesthesia patient monitoring has taken advantage of advancements in material science (when nonthrombogenic polymers allowed the design of intravascular catheters, for example), in electronics and transducers, in computers, in displays, in information technology, and so forth. Slower product life cycles in medical devices mean that by carefully observing technologies such as consumer electronics, including user interfaces, it is possible to peek ahead and estimate with confidence the foundational technologies that will be used by patient monitors in the near future. Just as the discipline of anesthesiology has, the patient monitoring that accompanies it has come a long way from its beginnings in the mid-19th century. 
Extrapolating from careful observations of the prevailing trends that have shaped anesthesia patient monitoring historically, patient monitoring in the future will use noncontact technologies, will predict the trajectory of a patient's vital signs, will add regional vital signs to the current systemic ones, and will facilitate directed and supervised anesthesia care over the broader scope that anesthesia will be responsible for.

Hassan E, Abd El-Hafeez T, Shams MY. Optimizing classification of diseases through language model analysis of symptoms. Sci Rep 2024;14:1507. [PMID: 38233458 PMCID: PMC10794698 DOI: 10.1038/s41598-024-51615-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/07/2023] [Accepted: 01/07/2024] [Indexed: 01/19/2024] Open

Xie F, Chang J, Luong T, Wu B, Lustigova E, Shrader E, Chen W. Identifying Symptoms Prior to Pancreatic Ductal Adenocarcinoma Diagnosis in Real-World Care Settings: Natural Language Processing Approach. JMIR AI 2024;3:e51240. [PMID: 38875566 PMCID: PMC11041417 DOI: 10.2196/51240] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2023] [Revised: 12/08/2023] [Accepted: 12/16/2023] [Indexed: 06/16/2024]

Abstract

BACKGROUND

Pancreatic cancer is the third leading cause of cancer deaths in the United States. Pancreatic ductal adenocarcinoma (PDAC) is the most common form of pancreatic cancer, accounting for up to 90% of all cases. Patient-reported symptoms are often the triggers of cancer diagnosis; therefore, understanding PDAC-associated symptoms and the timing of symptom onset could facilitate early detection of PDAC.

OBJECTIVE

This paper aims to develop a natural language processing (NLP) algorithm to capture symptoms associated with PDAC from clinical notes within a large integrated health care system.

METHODS

We used unstructured data within 2 years prior to PDAC diagnosis between 2010 and 2019 and among matched patients without PDAC to identify 17 PDAC-related symptoms. Related terms and phrases were first compiled from publicly available resources and then recursively reviewed and enriched with input from clinicians and chart review. A computerized NLP algorithm was iteratively developed and refined via multiple rounds of chart review followed by adjudication. Finally, the developed algorithm was applied to the validation data set to assess performance and to the study implementation notes.

RESULTS

A total of 408,147 and 709,789 notes were retrieved from 2611 patients with PDAC and 10,085 matched patients without PDAC, respectively. In descending order, the symptom distribution of the study implementation notes ranged from 4.98% for abdominal or epigastric pain to 0.05% for upper extremity deep vein thrombosis in the PDAC group, and from 1.75% for back pain to 0.01% for pale stool in the non-PDAC group. Validation of the NLP algorithm against adjudicated chart review results of 1000 notes showed that precision ranged from 98.9% (jaundice) to 84% (upper extremity deep vein thrombosis), recall ranged from 98.1% (weight loss) to 82.8% (epigastric bloating), and F1-scores ranged from 0.97 (jaundice) to 0.86 (depression).

CONCLUSIONS

The developed and validated NLP algorithm could be used for the early detection of PDAC.

Wei WI, Leung CLK, Tang A, McNeil EB, Wong SYS, Kwok KO. Extracting symptoms from free-text responses using ChatGPT among COVID-19 cases in Hong Kong. Clin Microbiol Infect 2024;30:142.e1-142.e3. [PMID: 37949111 DOI: 10.1016/j.cmi.2023.11.002] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/26/2023] [Revised: 11/01/2023] [Accepted: 11/03/2023] [Indexed: 11/12/2023]

Jadhav P, Sears T, Floan G, Joskowitz K, Nienow S, Cruz S, David M, de Cos V, Choi P, Ignacio RC. Application of a Machine Learning Algorithm in Prediction of Abusive Head Trauma in Children. J Pediatr Surg 2024;59:80-85. [PMID: 37858394 DOI: 10.1016/j.jpedsurg.2023.09.027] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 08/26/2023] [Accepted: 09/07/2023] [Indexed: 10/21/2023]

Abstract

PURPOSE

We explored the application of a machine learning algorithm for the timely detection of potential abusive head trauma (AHT) using the first free-text note of an encounter and demographic information.

METHODS

First free-text physician notes and demographic information were collected for children under 5 years of age at a Level 1 Trauma Center. The control group, which included patients with head/neck injury, was compared to those with AHT diagnosed by the Child Protective Team. Differential scores accounted for words overrepresented in AHT patient vs. control notes. Sentiment scores were reflective of note positivity/negativity and subjectivity scores accounted for note subjectivity/objectivity. The composite scores reflected the patient's differential score modified by the subjectivity score. Composite, sentiment, and subjectivity scores combined with demographic information trained a Random Forest (RF) machine learning algorithm to predict AHT.

RESULTS

Final composite scores with demographic information were highly associated with AHT in a test dataset. The control group included 587 patients and the test group included 193 patients. Combining composite scores with demographic information into the RF model improved AHT classification area under the curve (AUC) from 0.68 to 0.78, with an overall accuracy of 84%. Feature importance analysis of our RF model revealed that composite score, sentiment, age, and subjectivity were the most impactful predictors of AHT. The sentiment was not significantly different between control and AHT notes (p = 0.87), while subjectivity trended higher for AHT notes (p = 0.081).

CONCLUSION

We conclude that a machine learning algorithm can recognize patterns within free-text notes and demographic information that aid in AHT detection in children.

LEVEL OF EVIDENCE

III.

Scharp D, Hobensack M, Davoudi A, Topaz M. Natural Language Processing Applied to Clinical Documentation in Post-acute Care Settings: A Scoping Review. J Am Med Dir Assoc 2024;25:69-83. [PMID: 37838000 PMCID: PMC10792659 DOI: 10.1016/j.jamda.2023.09.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/29/2023] [Revised: 09/05/2023] [Accepted: 09/07/2023] [Indexed: 10/16/2023]

Abstract

OBJECTIVES

To determine the scope of the application of natural language processing to free-text clinical notes in post-acute care and provide a foundation for future natural language processing-based research in these settings.

DESIGN

Scoping review; reported according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews guidelines.

SETTING AND PARTICIPANTS

Post-acute care (ie, home health care, long-term care, skilled nursing facilities, and inpatient rehabilitation facilities).

METHODS

PubMed, Cumulative Index of Nursing and Allied Health Literature, and Embase were searched in February 2023. Eligible studies had quantitative designs that used natural language processing applied to clinical documentation in post-acute care settings. The quality of each study was appraised.

RESULTS

Twenty-one studies were included. Almost all studies were conducted in home health care settings. Most studies extracted data from electronic health records to examine the risk for negative outcomes, including acute care utilization, medication errors, and suicide mortality. About half of the studies did not report age, sex, race, or ethnicity data or use standardized terminologies. Only 8 studies included variables from socio-behavioral domains. Most studies fulfilled all quality appraisal indicators.

CONCLUSIONS AND IMPLICATIONS

The application of natural language processing is nascent in post-acute care settings. Future research should apply natural language processing using standardized terminologies to leverage free-text clinical notes in post-acute care to promote timely, comprehensive, and equitable care. Natural language processing could be integrated with predictive models to help identify patients who are at risk of negative outcomes. Future research should incorporate socio-behavioral determinants and diverse samples to improve health equity in informatics tools.

Sim JA, Huang X, Horan MR, Stewart CM, Robison LL, Hudson MM, Baker JN, Huang IC. Natural language processing with machine learning methods to analyze unstructured patient-reported outcomes derived from electronic health records: A systematic review. Artif Intell Med 2023;146:102701. [PMID: 38042599 PMCID: PMC10693655 DOI: 10.1016/j.artmed.2023.102701] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/06/2023] [Revised: 09/30/2023] [Accepted: 10/29/2023] [Indexed: 12/04/2023]

Abstract

OBJECTIVE

Natural language processing (NLP) combined with machine learning (ML) techniques is increasingly used to process unstructured/free-text patient-reported outcome (PRO) data available in electronic health records (EHRs). This systematic review summarizes the literature reporting NLP/ML systems/toolkits for analyzing PROs in clinical narratives of EHRs and discusses the future directions for the application of this modality in clinical care.

METHODS

We searched PubMed, Scopus, and Web of Science for studies written in English between 1/1/2000 and 12/31/2020. Seventy-nine studies meeting the eligibility criteria were included. We abstracted and summarized information related to the study purpose, patient population, type/source/amount of unstructured PRO data, linguistic features, and NLP systems/toolkits for processing unstructured PROs in EHRs.

RESULTS

Most of the studies used NLP/ML techniques to extract PROs from clinical narratives (n = 74) and mapped the extracted PROs into specific PRO domains for phenotyping or clustering purposes (n = 26). Some studies used NLP/ML to process PROs for predicting disease progression or onset of adverse events (n = 22) or developing/validating NLP/ML pipelines for analyzing unstructured PROs (n = 19). Studies used different linguistic features, including lexical, syntactic, semantic, and contextual features, to process unstructured PROs. Among the 25 NLP systems/toolkits we identified, 15 used rule-based NLP, 6 used hybrid NLP, and 4 used non-neural ML algorithms embedded in NLP.

CONCLUSIONS

This study supports the potential utility of different NLP/ML techniques in processing unstructured PROs available in EHRs for clinical care. Though using annotation rules for NLP/ML to analyze unstructured PROs is dominant, deploying novel neural ML-based methods is warranted.

Keszthelyi D, Gaudet-Blavignac C, Bjelogrlic M, Lovis C. Patient Information Summarization in Clinical Settings: Scoping Review. JMIR Med Inform 2023;11:e44639. [PMID: 38015588 DOI: 10.2196/44639] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/28/2022] [Revised: 03/15/2023] [Accepted: 07/25/2023] [Indexed: 11/29/2023] Open

Abstract

BACKGROUND

Information overflow, a common problem in the present clinical environment, can be mitigated by summarizing clinical data. Although there are several solutions for clinical summarization, there is a lack of a complete overview of the research relevant to this field.

OBJECTIVE

This study aims to identify state-of-the-art solutions for clinical summarization, to analyze their capabilities, and to identify their properties.

METHODS

A scoping review of articles published between 2005 and 2022 was conducted. With a clinical focus, PubMed and Web of Science were queried to find an initial set of reports, later extended by articles found through a chain of citations. The included reports were analyzed to answer the questions of where, what, and how medical information is summarized; whether summarization conserves temporality, uncertainty, and medical pertinence; and how the propositions are evaluated and deployed. To answer how information is summarized, methods were compared through a new framework "collect-synthesize-communicate" referring to information gathering from data, its synthesis, and communication to the end user.

RESULTS

Overall, 128 articles were included, representing various medical fields. Exclusively structured data were used as input in 46.1% (59/128) of papers, text in 41.4% (53/128) of articles, and both in 10.2% (13/128) of papers. Using the proposed framework, 42.2% (54/128) of the records contributed to information collection, 27.3% (35/128) contributed to information synthesis, and 46.1% (59/128) presented solutions for summary communication. Numerous summarization approaches have been presented, including extractive (n=13) and abstractive summarization (n=19); topic modeling (n=5); summary specification (n=11); concept and relation extraction (n=30); visual design considerations (n=59); and complete pipelines (n=7) using information extraction, synthesis, and communication. Graphical displays (n=53), short texts (n=41), static reports (n=7), and problem-oriented views (n=7) were the most common types in terms of summary communication. Although temporality and uncertainty information were usually not conserved in most studies (74/128, 57.8% and 113/128, 88.3%, respectively), some studies presented solutions to treat this information. Overall, 115 (89.8%) articles showed results of an evaluation, and methods included evaluations with human participants (median 15, IQR 24 participants): measurements in experiments with human participants (n=31), real situations (n=8), and usability studies (n=28). Methods without human involvement included intrinsic evaluation (n=24), performance on a proxy (n=10), or domain-specific tasks (n=11). Overall, 11 (8.6%) reports described a system deployed in clinical settings.

CONCLUSIONS

The scientific literature contains many propositions for summarizing patient information but reports very few comparisons of these proposals. This work proposes to compare these algorithms through how they conserve essential aspects of clinical information and through the "collect-synthesize-communicate" framework. We found that current propositions usually address these 3 steps only partially. Moreover, they conserve and use temporality, uncertainty, and pertinent medical aspects to varying extents, and solutions are often preliminary.

Stead WW, Flatley Brennan P. Celebrating Suzanne Bakken, 2023 Morris F. Collen Award winner and pioneer in health equity. J Am Med Inform Assoc 2023;30:1760-1761. [PMID: 37855452 PMCID: PMC10586030 DOI: 10.1093/jamia/ocad189] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/05/2023] [Accepted: 09/05/2023] [Indexed: 10/20/2023] Open

Shah AD, Subramanian A, Lewis J, Dhalla S, Ford E, Haroon S, Kuan V, Nirantharakumar K. Long Covid symptoms and diagnosis in primary care: A cohort study using structured and unstructured data in The Health Improvement Network primary care database. PLoS One 2023;18:e0290583. [PMID: 37751444 PMCID: PMC10521988 DOI: 10.1371/journal.pone.0290583] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/16/2023] [Accepted: 08/11/2023] [Indexed: 09/28/2023] Open

Nishioka S, Asano M, Yada S, Aramaki E, Yajima H, Yanagisawa Y, Sayama K, Kizaki H, Hori S. Adverse event signal extraction from cancer patients' narratives focusing on impact on their daily-life activities. Sci Rep 2023;13:15516. [PMID: 37726371 PMCID: PMC10509234 DOI: 10.1038/s41598-023-42496-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/29/2023] [Accepted: 09/11/2023] [Indexed: 09/21/2023] Open

Dhingra LS, Shen M, Mangla A, Khera R. Cardiovascular Care Innovation through Data-Driven Discoveries in the Electronic Health Record. Am J Cardiol 2023;203:136-148. [PMID: 37499593 PMCID: PMC10865722 DOI: 10.1016/j.amjcard.2023.06.104] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 02/28/2023] [Revised: 05/24/2023] [Accepted: 06/29/2023] [Indexed: 07/29/2023]

Adamson B, Waskom M, Blarre A, Kelly J, Krismer K, Nemeth S, Gippetti J, Ritten J, Harrison K, Ho G, Linzmayer R, Bansal T, Wilkinson S, Amster G, Estola E, Benedum CM, Fidyk E, Estévez M, Shapiro W, Cohen AB. Approach to machine learning for extraction of real-world data variables from electronic health records. Front Pharmacol 2023;14:1180962. [PMID: 37781703 PMCID: PMC10541019 DOI: 10.3389/fphar.2023.1180962] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/06/2023] [Accepted: 08/25/2023] [Indexed: 10/03/2023] Open

Abstract

BACKGROUND

As artificial intelligence (AI) continues to advance with breakthroughs in natural language processing (NLP) and machine learning (ML), such as the development of models like OpenAI's ChatGPT, new opportunities are emerging for efficient curation of electronic health records (EHR) into real-world data (RWD) for evidence generation in oncology. Our objective is to describe the research and development of industry methods to promote transparency and explainability.

METHODS

We applied NLP with ML techniques to train, validate, and test the extraction of information from unstructured documents (e.g., clinician notes, radiology reports, and lab reports) to output a set of structured variables required for RWD analysis. This research used a nationwide EHR-derived database. Models were selected based on performance. Variables curated with an approach using ML extraction are those where the value is determined solely based on an ML model (i.e., not confirmed by abstraction), which identifies key information from visit notes and documents. These models do not predict future events or infer missing information.

RESULTS

We developed an approach using NLP and ML for extraction of clinically meaningful information from unstructured EHR documents and found high performance of output variables compared with variables curated by manually abstracted data. These extraction methods resulted in research-ready variables including initial cancer diagnosis with date, advanced/metastatic diagnosis with date, disease stage, histology, smoking status, surgery status with date, biomarker test results with dates, and oral treatments with dates.

CONCLUSION

NLP and ML enable the extraction of retrospective clinical data in EHR with speed and scalability to help researchers learn from the experience of every person with cancer.

Fraile Navarro D, Ijaz K, Rezazadegan D, Rahimi-Ardabili H, Dras M, Coiera E, Berkovsky S. Clinical named entity recognition and relation extraction using natural language processing of medical free text: A systematic review. Int J Med Inform 2023;177:105122. [PMID: 37295138 DOI: 10.1016/j.ijmedinf.2023.105122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/09/2022] [Revised: 04/14/2023] [Accepted: 06/03/2023] [Indexed: 06/12/2023]

Abstract

BACKGROUND

Natural Language Processing (NLP) applications have developed over the past years in various fields, including clinical free text, where they are applied to named entity recognition and relation extraction. However, developments in the last few years have been so rapid that no current overview of them exists. Moreover, it is unclear how these models and tools have been translated into clinical practice. We aim to synthesize and review these developments.

METHODS

We reviewed literature from 2010 to date, searching PubMed, Scopus, the Association for Computational Linguistics (ACL), and Association for Computing Machinery (ACM) libraries for studies of NLP systems performing general-purpose (i.e., not disease- or treatment-specific) information extraction and relation extraction tasks in unstructured clinical text (e.g., discharge summaries).

RESULTS

We included 94 studies in the review, 30 of which were published in the last three years. Machine learning methods were used in 68 studies, rule-based in 5 studies, and both in 22 studies. Sixty-three studies focused on Named Entity Recognition, 13 on Relation Extraction, and 18 performed both. The most frequently extracted entities were "problem", "test" and "treatment". Seventy-two studies used public datasets and 22 studies used proprietary datasets alone. Only 14 studies clearly defined a clinical or information task to be addressed by the system, and just three studies reported its use outside the experimental setting. Only 7 studies shared a pre-trained model and only 8 made a software tool available.

DISCUSSION

Machine learning-based methods have dominated the NLP field on information extraction tasks. More recently, Transformer-based language models are taking the lead and showing the strongest performance. However, these developments are mostly based on a few datasets and generic annotations, with very few real-world use cases. This may raise questions about the generalizability of findings and their translation into practice, and it highlights the need for robust clinical evaluation.

Magoc T, Allen KS, McDonnell C, Russo JP, Cummins J, Vest JR, Harle CA. Generalizability and portability of natural language processing system to extract individual social risk factors. Int J Med Inform 2023;177:105115. [PMID: 37302362 PMCID: PMC11164320 DOI: 10.1016/j.ijmedinf.2023.105115] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/08/2023] [Revised: 05/15/2023] [Accepted: 05/30/2023] [Indexed: 06/13/2023]

Hobensack M, Zhao Y, Scharp D, Volodarskiy A, Slotwiner D, Reading Turchioe M. Characterising symptom clusters in patients with atrial fibrillation undergoing catheter ablation. Open Heart 2023;10:e002385. [PMID: 37541744 PMCID: PMC10407417 DOI: 10.1136/openhrt-2023-002385] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 06/09/2023] [Accepted: 07/11/2023] [Indexed: 08/06/2023] Open

Abstract

OBJECTIVE

This study aims to leverage natural language processing (NLP) and machine learning clustering analyses to (1) identify co-occurring symptoms of patients undergoing catheter ablation for atrial fibrillation (AF) and (2) describe clinical and sociodemographic correlates of symptom clusters.

METHODS

We conducted a cross-sectional retrospective analysis using electronic health records data. Adults who underwent AF ablation between 2010 and 2020 were included. Demographic, comorbidity and medication information was extracted using structured queries. Ten AF symptoms were extracted from unstructured clinical notes (n=13 416) using a validated NLP pipeline (F-score=0.81). We used the unsupervised machine learning approach known as Ward's hierarchical agglomerative clustering to characterise and identify subgroups of patients representing different clusters. Fisher's exact tests were used to investigate subgroup differences based on age, gender, race and heart failure (HF) status.

RESULTS

A total of 1293 patients were included in our analysis (mean age 65.5 years, 35.2% female, 58% white). The most frequently documented symptoms were dyspnoea (64%), oedema (62%) and palpitations (57%). We identified six symptom clusters: generally symptomatic, dyspnoea and oedema, chest pain, anxiety, fatigue and palpitations, and asymptomatic (reference). The asymptomatic cluster had a significantly higher prevalence of male, white and comorbid HF patients.

CONCLUSIONS

We applied NLP and machine learning to a large dataset to identify symptom clusters, which may signify latent biological underpinnings of symptom experiences and generate implications for clinical care. AF patients' symptom experiences vary widely. Given prior work showing that AF symptoms predict adverse outcomes, future work should investigate associations between symptom clusters and postablation outcomes.

Li H, Gerkin RC, Bakke A, Norel R, Cecchi G, Laudamiel C, Niv MY, Ohla K, Hayes JE, Parma V, Meyer P. Text-based predictions of COVID-19 diagnosis from self-reported chemosensory descriptions. Commun Med 2023;3:104. [PMID: 37500763 PMCID: PMC10374642 DOI: 10.1038/s43856-023-00334-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/19/2023] [Accepted: 07/19/2023] [Indexed: 07/29/2023] Open

Crowson MG, Alsentzer E, Fiskio J, Bates DW. Towards Medical Billing Automation: NLP for Outpatient Clinician Note Classification. medRxiv 2023:2023.07.07.23292367. [PMID: 37502975 PMCID: PMC10370228 DOI: 10.1101/2023.07.07.23292367] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 07/29/2023]

Abstract

Objectives

Our primary objective was to develop a natural language processing approach that accurately predicts outpatient Evaluation and Management (E/M) level of service (LoS) codes using clinicians' notes from a health system electronic health record. A secondary objective was to investigate the impact of clinic note de-identification on document classification performance.

Methods

We used retrospective outpatient office clinic notes from four medical and surgical specialties. Classification models were fine-tuned on the clinic notes datasets and stratified by subspecialty. The success criteria for the classification tasks were the classification accuracy and F1-scores on internal test data. For the secondary objective, the dataset was de-identified using Named Entity Recognition (NER) to remove protected health information (PHI), and models were retrained.

Results

The models demonstrated similar predictive performance across different specialties, except for internal medicine, which had the lowest classification accuracy across all model architectures. The models trained on the entire note corpus achieved an E/M LoS CPT code classification accuracy of 74.8% (CI 95: 74.1-75.6). However, the models trained on the de-identified note corpus showed a markedly lower classification accuracy of 48.2% (CI 95: 47.7-48.6) than the models trained on the identified notes.

Conclusion

The study demonstrates the potential of NLP-based document classifiers to accurately predict E/M LoS CPT codes using clinical notes from various medical and procedural specialties. The models' performance suggests that the classification task's complexity merits further investigation. The de-identification experiment demonstrated that de-identification may negatively impact classifier performance. Further research is needed to validate the performance of our NLP classifiers in different healthcare settings and patient populations and to investigate the potential implications of de-identification on model performance.

Sun Z, Shi H, Huang Z, Ding N. Learning Representations from Medical Text for Effective Diagnoses and Knowledge Discovery. Annu Int Conf IEEE Eng Med Biol Soc 2023;2023:1-7. [PMID: 38083156 DOI: 10.1109/embc40787.2023.10340797] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/18/2023]