1
Zhou Z, Qin P, Cheng X, Shao M, Ren Z, Zhao Y, Li Q, Liu L. ChatGPT in Oncology Diagnosis and Treatment: Applications, Legal and Ethical Challenges. Curr Oncol Rep 2025:10.1007/s11912-025-01649-3. [PMID: 39998782 DOI: 10.1007/s11912-025-01649-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 02/01/2025] [Indexed: 02/27/2025]
Abstract
PURPOSE OF REVIEW This study aims to systematically review the trajectory of artificial intelligence (AI) development in the medical field, with a particular emphasis on ChatGPT, a cutting-edge tool that is transforming oncology diagnosis and treatment practices. RECENT FINDINGS Recent advancements have demonstrated that ChatGPT can be effectively utilized in various areas, including collecting medical histories, conducting radiological and pathological diagnoses, generating electronic medical records (EMRs), providing nutritional support, participating in multidisciplinary teams (MDTs), and formulating personalized, multidisciplinary treatment plans. However, significant challenges related to data privacy and legal issues need to be addressed for the safe and effective integration of ChatGPT into clinical practice. ChatGPT, an emerging AI technology, opens up new avenues and viewpoints for oncology diagnosis and treatment. If current technological and legal challenges can be overcome, ChatGPT is expected to play a more significant role in oncology diagnosis and treatment in the future, providing better treatment options and improving the quality of medical services.
Affiliation(s)
- Zihan Zhou
  - The First Clinical Medical College of Nanjing Medical University, Nanjing, 211166, China
- Peng Qin
  - The First Clinical Medical College of Nanjing Medical University, Nanjing, 211166, China
- Xi Cheng
  - The First Clinical Medical College of Nanjing Medical University, Nanjing, 211166, China
- Maoxuan Shao
  - The First Clinical Medical College of Nanjing Medical University, Nanjing, 211166, China
- Zhaozheng Ren
  - The First Clinical Medical College of Nanjing Medical University, Nanjing, 211166, China
- Yiting Zhao
  - Stomatological College of Nanjing Medical University, Nanjing, 211166, China
- Qiunuo Li
  - The First Clinical Medical College of Nanjing Medical University, Nanjing, 211166, China
- Lingxiang Liu
  - Department of Oncology, The First Affiliated Hospital of Nanjing Medical University, 300 Guangzhou Road, Nanjing, 210029, Jiangsu, China
2
Watkins H, Gray R, Julius A, Mah YH, Teo J, Pinaya WHL, Wright P, Jha A, Engleitner H, Cardoso J, Ourselin S, Rees G, Jaeger R, Nachev P. Neuradicon: Operational representation learning of neuroimaging reports. Computer Methods and Programs in Biomedicine 2025; 262:108638. [PMID: 39951958 DOI: 10.1016/j.cmpb.2025.108638] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2024] [Revised: 01/18/2025] [Accepted: 02/01/2025] [Indexed: 02/17/2025]
Abstract
BACKGROUND AND OBJECTIVE Radiological reports typically summarize the content and interpretation of imaging studies in unstructured form that precludes quantitative analysis. This limits the monitoring of radiological services to throughput undifferentiated by content, impeding specific, targeted operational optimization. Here we present Neuradicon, a natural language processing (NLP) framework for quantitative analysis of neuroradiological reports. METHODS Our framework is a hybrid of rule-based and machine-learning models to represent neurological reports in succinct, quantitative form optimally suited to operational guidance. These include probabilistic models for text classification and tagging tasks, alongside auto-encoders for learning latent representations and statistical mapping of the latent space. RESULTS We demonstrate the application of Neuradicon to operational phenotyping of a corpus of 336,569 reports, and report excellent generalizability across time and two independent healthcare institutions. In particular, we report pathology classification metrics with F1 scores of 0.96 on prospective data, and semantic means of interrogating the phenotypes surfaced via latent space representations. CONCLUSION Neuradicon allows the segmentation, analysis, classification, representation and interrogation of the structure and content of neuroradiological reports. It offers a blueprint for the extraction of rich, quantitative, actionable signals from unstructured text data in an operational context.
Affiliation(s)
- Henry Watkins
  - Queen Square Institute of Neurology, University College London, London, United Kingdom
- Robert Gray
  - Queen Square Institute of Neurology, University College London, London, United Kingdom
- Adam Julius
  - Queen Square Institute of Neurology, University College London, London, United Kingdom
- Yee-Haur Mah
  - School of Biomedical Engineering & Imaging Sciences, King's College London, London, United Kingdom
- James Teo
  - School of Biomedical Engineering & Imaging Sciences, King's College London, London, United Kingdom
- Walter H L Pinaya
  - School of Biomedical Engineering & Imaging Sciences, King's College London, London, United Kingdom
- Paul Wright
  - School of Biomedical Engineering & Imaging Sciences, King's College London, London, United Kingdom
- Ashwani Jha
  - Queen Square Institute of Neurology, University College London, London, United Kingdom
- Holger Engleitner
  - Queen Square Institute of Neurology, University College London, London, United Kingdom
- Jorge Cardoso
  - School of Biomedical Engineering & Imaging Sciences, King's College London, London, United Kingdom
- Sebastien Ourselin
  - School of Biomedical Engineering & Imaging Sciences, King's College London, London, United Kingdom
- Geraint Rees
  - University College London, London, United Kingdom
- Rolf Jaeger
  - Queen Square Institute of Neurology, University College London, London, United Kingdom
- Parashkev Nachev
  - Queen Square Institute of Neurology, University College London, London, United Kingdom
3
McCaffrey P, Jackups R, Seheult J, Zaydman MA, Balis U, Thaker HM, Rashidi H, Gullapalli RR. Evaluating Use of Generative Artificial Intelligence in Clinical Pathology Practice: Opportunities and the Way Forward. Arch Pathol Lab Med 2025; 149:130-141. [PMID: 39384182 DOI: 10.5858/arpa.2024-0208-ra] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 09/05/2024] [Indexed: 10/11/2024]
Abstract
CONTEXT.— Generative artificial intelligence (GAI) technologies are likely to dramatically impact health care workflows in clinical pathology (CP). Applications in CP include education, data mining, decision support, result summaries, and patient trend assessments. OBJECTIVE.— To review use cases of GAI in CP, with a particular focus on large language models. Specific examples are provided for the applications of GAI in the subspecialties of clinical chemistry, microbiology, hematopathology, and molecular diagnostics. Additionally, the review addresses potential pitfalls of GAI paradigms. DATA SOURCES.— Current literature on GAI in health care was reviewed broadly. The use case scenarios for each CP subspecialty review common data sources generated in each subspecialty. The potential for utilization of CP data in the GAI context was subsequently assessed, focusing on issues such as future reporting paradigms, impact on quality metrics, and potential for translational research activities. CONCLUSIONS.— GAI is a powerful tool with the potential to revolutionize health care for patients and practitioners alike. However, GAI must be implemented with much caution considering various shortcomings of the technology such as biases, hallucinations, practical challenges of implementing GAI in existing CP workflows, and end-user acceptance. Human-in-the-loop models of GAI implementation have the potential to revolutionize CP by delivering deeper, meaningful insights into patient outcomes both at an individual and a population level.
Affiliation(s)
- Peter McCaffrey
  - From the Departments of Pathology (McCaffrey, Thaker) and Radiology (McCaffrey), University of Texas Medical Branch, Galveston
- Ronald Jackups
  - the Department of Pathology and Immunology, Washington University School of Medicine, St Louis, Missouri (Jackups, Zaydman)
- Jansen Seheult
  - the Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, Minnesota (Seheult)
- Mark A Zaydman
  - the Department of Pathology and Immunology, Washington University School of Medicine, St Louis, Missouri (Jackups, Zaydman)
- Ulysses Balis
  - the Department of Pathology, University of Michigan, Ann Arbor (Balis)
- Harshwardhan M Thaker
  - From the Departments of Pathology (McCaffrey, Thaker) and Radiology (McCaffrey), University of Texas Medical Branch, Galveston
- Hooman Rashidi
  - Computational Pathology & AI Center of Excellence, University of Pittsburgh, School of Medicine & UPMC, Pittsburgh, Pennsylvania (Rashidi)
- Rama R Gullapalli
  - the Department of Pathology, Department of Chemical and Biological Engineering, University of New Mexico, Albuquerque (Gullapalli)
4
Mahdavifar S, Fakhrahmad SM, Ansarifard E. Estimating the Severity of Oral Lesions Via Analysis of Cone Beam Computed Tomography Reports: A Proposed Deep Learning Model. Int Dent J 2025; 75:135-143. [PMID: 39068121 PMCID: PMC11806341 DOI: 10.1016/j.identj.2024.06.015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2024] [Revised: 06/03/2024] [Accepted: 06/06/2024] [Indexed: 07/30/2024] Open
Abstract
OBJECTIVES Several factors, such as unavailability of specialists, dental phobia, and financial difficulties, may lead to a delay between receiving an oral radiology report and consulting a dentist. The primary aim of this study was to distinguish between high-risk and low-risk oral lesions according to the radiologist's reports of cone beam computed tomography (CBCT) images. Such a facility may be employed by a dentist or their assistant to make the patient aware of the severity and grade of the oral lesion and to refer the patient for immediate treatment or other follow-up care. METHODS A total of 1134 CBCT radiography reports owned by Shiraz University of Medical Sciences were collected. The severity level of each sample was specified by three experts, and an annotation was carried out accordingly. After preprocessing the data, a deep learning model, referred to as CNN-LSTM, was developed, which aims to detect the degree of severity of the problem based on analysis of the radiologist's report. Unlike traditional models, which usually use a simple collection of words, the proposed deep model uses words embedded in dense vector representations, which enables it to capture semantic similarities effectively. RESULTS The results indicated that the proposed model outperformed its counterparts in terms of precision, recall, and F1 criteria. This suggests its potential as a reliable tool for early estimation of the severity of oral lesions. CONCLUSIONS This study shows the effectiveness of deep learning in analysing textual reports and accurately distinguishing between high-risk and low-risk lesions. Employing the proposed model, which can provide timely warnings about the need for follow-up and prompt treatment, can shield the patient from the risks associated with delays. CLINICAL SIGNIFICANCE Our collaboratively collected and expert-annotated dataset serves as a valuable resource for exploratory research. The results demonstrate the pivotal role our deep learning model could play in assessing the severity of oral lesions in dental reports.
Affiliation(s)
- Sare Mahdavifar
  - Dept. of Computer Science and Engineering and IT, Shiraz University, Shiraz, Iran
- Elham Ansarifard
  - Dept. of Prosthodontics, School of Dentistry, Shiraz University of Medical Sciences, Shiraz, Iran; Biomaterials Research Center, School of Dentistry, Shiraz University of Medical Sciences, Shiraz, Iran
5
Edwards PJ, Finnikin S, Wilson F, Bennett-Britton I, Carson-Stevens A, Barnes RK, Payne RA. Safety-netting advice documentation in out-of-hours primary care: a retrospective cohort from 2013 to 2020. Br J Gen Pract 2025; 75:e80-e89. [PMID: 38950945 PMCID: PMC11694318 DOI: 10.3399/bjgp.2024.0057] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2024] [Accepted: 06/26/2024] [Indexed: 07/03/2024] Open
Abstract
BACKGROUND Providing safety-netting advice (SNA) in out-of-hours (OOH) primary care is a recognised standard of safe care, but it is not known how frequently this occurs in practice. AIM Assess the frequency and type of SNA documented in OOH primary care and explore factors associated with its presence. DESIGN AND SETTING This was a retrospective cohort study using the Birmingham Out-of-hours general practice Research Database. METHOD A stratified sample of 30 adult consultation records per month from July 2013 to February 2020 was assessed using a safety-netting coding tool. Associations were tested using linear and logistic regression. RESULTS The overall frequency of SNA per consultation was 78.0% (1472/1886), increasing from 75.7% (224/296) in 2014 to 81.5% (220/270) in 2019. The proportion of specific SNA and the average number of symptoms patients were told to look out for increased with time. The most common symptom to look out for was if the patients' condition worsened followed by if their symptoms persisted, but only one in five consultations included a timeframe to reconsult for persistent symptoms. SNA was more frequently documented in face-to-face treatment-centre encounters compared with telephone consultations (odds ratio [OR] 1.77, 95% confidence interval [CI] = 1.09 to 2.85, P = 0.02), for possible infections (OR 1.53, 95% CI = 1.13 to 2.07, P = 0.006), and less frequently for mental (versus physical) health consultations (OR 0.33, 95% CI = 0.17 to 0.66, P = 0.002) and where follow-up was planned (OR 0.34, 95% CI = 0.25 to 0.46, P<0.001). CONCLUSION The frequency of SNA documented in OOH primary care was higher than previously reported during in-hours care. Over time, the frequency of SNA and the proportion that contained specific advice increased; however, this study highlights consultation types where SNA could be improved, such as mental health and telephone consultations.
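The odds ratios above are reported together with 95% confidence intervals and P values. As a rough consistency check, a P value can be approximated from the interval alone. The sketch below assumes a normal (Wald) approximation on the log-odds scale, which the abstract does not state; the function name `p_from_or_ci` is illustrative only.

```python
import math

def p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Approximate two-sided P value from an odds ratio and its 95% CI.

    Assumption (not stated in the abstract): the interval is a Wald
    interval on the log scale, exp(log(OR) +/- z_crit * SE).
    """
    # Back out the standard error of log(OR) from the interval width.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    # Wald z statistic for the null hypothesis OR = 1.
    z = math.log(odds_ratio) / se
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))

# Values copied from the abstract.
print(round(p_from_or_ci(1.77, 1.09, 2.85), 3))  # face-to-face vs telephone
print(round(p_from_or_ci(1.53, 1.13, 2.07), 3))  # possible infections
```

Because the published estimates are rounded to two decimal places, values recovered this way are only approximate and should not be read as the authors' own calculation.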
Affiliation(s)
- Peter J Edwards
  - Centre for Academic Primary Care, Bristol Medical School, University of Bristol, Bristol and honorary research associate, Institute of Applied Health Research, University of Birmingham, Birmingham
- Samuel Finnikin
  - Institute of Applied Health Research, University of Birmingham, Birmingham
- Ian Bennett-Britton
  - Centre for Academic Primary Care, Bristol Medical School, University of Bristol, Bristol
- Andrew Carson-Stevens
  - Primary and Emergency Care Research (PRIME) Centre, Division of Population Medicine, School of Medicine, Cardiff University, Cardiff
- Rebecca K Barnes
  - Nuffield Department of Primary Care Health Sciences, University of Oxford, Oxford
- Rupert A Payne
  - Exeter Collaboration for Academic Primary Care, University of Exeter Medical School, Exeter
6
Kanda E. Development of Artificial Intelligence Systems for Chronic Kidney Disease. JMA J 2025; 8:48-56. [PMID: 39926055 PMCID: PMC11799718 DOI: 10.31662/jmaj.2024-0090] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2024] [Accepted: 06/03/2024] [Indexed: 02/11/2025] Open
Abstract
Chronic kidney disease (CKD) is a complex disease that is related not only to dialysis but also to the onset of cardiovascular disease and life prognosis. As renal function declines with age and depending on lifestyle, the number of patients with CKD is rapidly increasing in Japan. Accurate prognosis prediction for patients with CKD in clinical settings is important for selecting treatment methods and screening high-risk patients. In recent years, large databases on CKD and dialysis have been constructed through the use of data science technology, and the pathology of CKD is being elucidated. Therefore, we developed an artificial intelligence (AI) system that can accurately predict the prognosis of CKD, such as its progression, the timing of dialysis introduction, and death. Aiming for its social implementation, the prognosis prediction system developed for patients with CKD was released on a website. We then developed Doctor K, an AI system that supports the creation of clinical practice guidelines. When creating clinical practice guidelines, huge amounts of manpower and time are required to conduct a systematic review of thousands of papers. Therefore, we developed a natural language processing (NLP) AI system to significantly improve work efficiency. Doctor K was used in the preparation of the guidelines of the Japanese Society of Nephrology. Furthermore, by comparing and analyzing the medical word virtual space constructed by the NLP AI system based on patient big data, we proved using the latest mathematical theory (category theory) that this system reflects the pathology of CKD. This suggests the possibility that the NLP AI system can predict patient prognosis. We hope that, through these studies, the use of AI based on big data will lead to the development of new treatments and improvement in patient prognosis.
Affiliation(s)
- Eiichiro Kanda
  - Department of Health Data Science, Kawasaki Medical School, Kurashiki, Japan
7
Katz A, Ekuma O, Enns JE, Cavett T, Singer A, Sanchez-Ramirez DC, Keynan Y, Lix L, Walld R, Yogendran M, Nickel NC, Urquia M, Star L, Olafson K, Logsetty S, Spiwak R, Waruk J, Matharaarachichi S. Identifying people with post-COVID condition using linked, population-based administrative health data from Manitoba, Canada: prevalence and predictors in a cohort of COVID-positive individuals. BMJ Open 2025; 15:e087920. [PMID: 39788761 PMCID: PMC11751946 DOI: 10.1136/bmjopen-2024-087920] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/22/2024] [Accepted: 12/05/2024] [Indexed: 01/12/2025] Open
Abstract
OBJECTIVE Many individuals exposed to SARS-CoV-2 experience long-term symptoms as part of a syndrome called post-COVID condition (PCC). Research on PCC is still emerging but is urgently needed to support diagnosis, clinical treatment guidelines and health system resource allocation. In this study, we developed a method to identify PCC cases using administrative health data and report PCC prevalence and predictive factors in Manitoba, Canada. DESIGN Cohort study. SETTING Manitoba, Canada. PARTICIPANTS All Manitobans who tested positive for SARS-CoV-2 during population-wide PCR testing from March 2020 to December 2021 (n=66 365) and were subsequently deemed to have PCC based on International Classification of Disease-9/10 diagnostic codes and prescription drug codes (n=11 316). Additional PCC cases were identified using predictive modelling to assess patterns of health service use, including physician visits, emergency department visits and hospitalisation for any reason (n=4155). OUTCOMES We measured PCC prevalence as % PCC cases among Manitobans with positive tests and identified predictive factors associated with PCC by calculating odds ratios with 95% confidence intervals, adjusted for sociodemographic and clinical characteristics (aOR). RESULTS Among 66 365 Manitobans with positive tests, we identified 15 471 (23%) as having PCC. Being female (aOR 1.64, 95% CI 1.58 to 1.71), being age 60-79 (aOR 1.33, 95% CI 1.25 to 1.41) or age 80+ (aOR 1.62, 95% CI 1.46 to 1.80), being hospitalised within 14 days of COVID-19 infection (aOR 1.95, 95% CI 1.80 to 2.10) and having a Charlson Comorbidity Index of 1+ (aOR 1.95, 95% CI 1.78 to 2.14) were predictive of PCC. Receiving 1+ doses of the COVID-19 vaccine (one dose, aOR 0.80, 95% CI 0.74 to 0.86; two doses, aOR 0.29, 95% CI 0.22 to 0.31) decreased the odds of PCC. 
CONCLUSIONS This data-driven approach expands our understanding of the prevalence and epidemiology of PCC and may be applied in other jurisdictions with population-based data. The study provides additional insights into risk and protective factors for PCC to inform health system planning and service delivery.
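The prevalence figure in this abstract follows from simple arithmetic on the reported counts: cases found through diagnostic and prescription codes plus cases added by predictive modelling, divided by all positive tests. A minimal sketch, with variable and function names chosen here for illustration:

```python
# Counts copied from the abstract.
tested_positive = 66_365   # Manitobans with a positive SARS-CoV-2 PCR test
cases_by_codes = 11_316    # PCC identified via ICD-9/10 and prescription drug codes
cases_by_model = 4_155     # additional PCC cases identified via predictive modelling

def prevalence_percent(cases, population):
    """Return cases as a percentage of the population."""
    return 100 * cases / population

total_pcc = cases_by_codes + cases_by_model
print(total_pcc)                                                  # total PCC cases
print(round(prevalence_percent(total_pcc, tested_positive), 1))   # prevalence in %
```

The two case counts sum to the 15 471 reported in the results, and the resulting share rounds to the 23% stated there.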
Affiliation(s)
- Alan Katz
  - Manitoba Centre for Health Policy, Department of Community Health Sciences, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, Manitoba, Canada
  - Department of Family Medicine, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, Manitoba, Canada
- Okechukwu Ekuma
  - Manitoba Centre for Health Policy, Department of Community Health Sciences, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, Manitoba, Canada
- Jennifer E Enns
  - Manitoba Centre for Health Policy, Department of Community Health Sciences, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, Manitoba, Canada
- Teresa Cavett
  - Department of Family Medicine, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, Manitoba, Canada
- Alexander Singer
  - Department of Family Medicine, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, Manitoba, Canada
- Diana C Sanchez-Ramirez
  - College of Rehabilitation Sciences, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, Manitoba, Canada
- Yoav Keynan
  - Department of Internal Medicine, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, Manitoba, Canada
- Lisa Lix
  - Manitoba Centre for Health Policy, Department of Community Health Sciences, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, Manitoba, Canada
- Randy Walld
  - Manitoba Centre for Health Policy, Department of Community Health Sciences, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, Manitoba, Canada
- Marina Yogendran
  - Manitoba Centre for Health Policy, Department of Community Health Sciences, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, Manitoba, Canada
- Nathan C Nickel
  - Manitoba Centre for Health Policy, Department of Community Health Sciences, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, Manitoba, Canada
- Marcelo Urquia
  - Manitoba Centre for Health Policy, Department of Community Health Sciences, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, Manitoba, Canada
- Leona Star
  - First Nations Health and Social Secretariat of Manitoba, Winnipeg, Manitoba, Canada
- Kendiss Olafson
  - Department of Internal Medicine, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, Manitoba, Canada
- Sarvesh Logsetty
  - Department of Surgery, University of Manitoba, Winnipeg, Manitoba, Canada
- Rae Spiwak
  - Department of Surgery, University of Manitoba, Winnipeg, Manitoba, Canada
- Jillian Waruk
  - First Nations Health and Social Secretariat of Manitoba, Winnipeg, Manitoba, Canada
8
Shen Y, Yu J, Zhou J, Hu G. Twenty-Five Years of Evolution and Hurdles in Electronic Health Records and Interoperability in Medical Research: Comprehensive Review. J Med Internet Res 2025; 27:e59024. [PMID: 39787599 PMCID: PMC11757985 DOI: 10.2196/59024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2024] [Revised: 10/02/2024] [Accepted: 12/05/2024] [Indexed: 01/12/2025] Open
Abstract
BACKGROUND Electronic health records (EHRs) facilitate the accessibility and sharing of patient data among various health care providers, contributing to more coordinated and efficient care. OBJECTIVE This study aimed to summarize the evolution of secondary use of EHRs and their interoperability in medical research over the past 25 years. METHODS We conducted an extensive literature search in the PubMed, Scopus, and Web of Science databases using the keywords Electronic health record and Electronic medical record in the title or abstract and Medical research in all fields from 2000 to 2024. Specific terms were applied to different time periods. RESULTS The review yielded 2212 studies, all of which were then screened and processed in a structured manner. Of these 2212 studies, 2102 (93.03%) were included in the review analysis, of which 1079 (51.33%) studies were from 2000 to 2009, 582 (27.69%) were from 2010 to 2019, 251 (11.94%) were from 2020 to 2023, and 190 (9.04%) were from 2024. CONCLUSIONS The evolution of EHRs marks an important milestone in health care's journey toward integrating technology and medicine. From early documentation practices to the sophisticated use of artificial intelligence and big data analytics today, EHRs have become central to improving patient care, enhancing public health surveillance, and advancing medical research.
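The percentages in the results can be recomputed from the counts given in the abstract. The sketch below does so; the names `share`, `period_counts`, and `period_shares` are illustrative. The four period shares agree with the printed values when 2102 is used as the denominator. The inclusion share recomputes to about 95.03% rather than the printed 93.03%, so one of those two source figures may contain a typo; the abstract alone does not show which.

```python
# Counts copied from the abstract.
identified = 2212   # studies yielded by the search
included = 2102     # studies included in the review analysis
period_counts = {
    "2000-2009": 1079,
    "2010-2019": 582,
    "2020-2023": 251,
    "2024": 190,
}

def share(part, whole):
    """Percentage of `whole` represented by `part`, to two decimals."""
    return round(100 * part / whole, 2)

# Share of included studies falling in each publication period.
period_shares = {period: share(n, included) for period, n in period_counts.items()}
print(period_shares)

# Share of identified studies that were included.
print(share(included, identified))
```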
Affiliation(s)
- Yun Shen
  - Chronic Disease Epidemiology, Population and Public Health, Pennington Biomedical Research Center, Baton Rouge, LA, United States
- Jiamin Yu
  - Department of Endocrinology and Metabolism, Shanghai Sixth People's Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Jian Zhou
  - Department of Endocrinology and Metabolism, Shanghai Sixth People's Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Gang Hu
  - Chronic Disease Epidemiology, Population and Public Health, Pennington Biomedical Research Center, Baton Rouge, LA, United States
9
Menezes MCS, Hoffmann AF, Tan ALM, Nalbandyan M, Omenn GS, Mazzotti DR, Hernández-Arango A, Visweswaran S, Venkatesh S, Mandl KD, Bourgeois FT, Lee JWK, Makmur A, Hanauer DA, Semanik MG, Kerivan LT, Hill T, Forero J, Restrepo C, Vigna M, Ceriana P, Abu-El-Rub N, Avillach P, Bellazzi R, Callaci T, Gutiérrez-Sacristán A, Malovini A, Mathew JP, Morris M, Murthy VL, Buonocore TM, Parimbelli E, Patel LP, Sáez C, Samayamuthu MJ, Thompson JA, Tibollo V, Xia Z, Kohane IS. The potential of Generative Pre-trained Transformer 4 (GPT-4) to analyse medical notes in three different languages: a retrospective model-evaluation study. Lancet Digit Health 2025; 7:e35-e43. [PMID: 39722251 DOI: 10.1016/s2589-7500(24)00246-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2024] [Revised: 08/09/2024] [Accepted: 10/28/2024] [Indexed: 12/28/2024]
Abstract
BACKGROUND Patient notes contain substantial information but are difficult for computers to analyse due to their unstructured format. Large-language models (LLMs), such as Generative Pre-trained Transformer 4 (GPT-4), have changed our ability to process text, but we do not know how effectively they handle medical notes. We aimed to assess the ability of GPT-4 to answer predefined questions after reading medical notes in three different languages. METHODS For this retrospective model-evaluation study, we included eight university hospitals from four countries (ie, the USA, Colombia, Singapore, and Italy). Each site submitted seven de-identified medical notes related to seven separate patients to the coordinating centre between June 1, 2023, and Feb 28, 2024. Medical notes were written between Feb 1, 2020, and June 1, 2023. One site provided medical notes in Spanish, one site provided notes in Italian, and the remaining six sites provided notes in English. We included admission notes, progress notes, and consultation notes. No discharge summaries were included in this study. We advised participating sites to choose medical notes that, at time of hospital admission, were for patients who were male or female, aged 18-65 years, had a diagnosis of obesity, had a diagnosis of COVID-19, and had submitted an admission note. Adherence to these criteria was optional and participating sites randomly chose which medical notes to submit. When entering information into GPT-4, we prepended each medical note with an instruction prompt and a list of 14 questions that had been chosen a priori. Each medical note was individually given to GPT-4 in its original language and in separate sessions; the questions were always given in English. At each site, two physicians independently validated responses by GPT-4 and responded to all 14 questions. Each pair of physicians evaluated responses from GPT-4 to the seven medical notes from their own site only. 
Physicians were not masked to responses from GPT-4 before providing their own answers, but were masked to responses from the other physician. FINDINGS We collected 56 medical notes, of which 42 (75%) were in English, seven (13%) were in Italian, and seven (13%) were in Spanish. For each medical note, GPT-4 responded to 14 questions, resulting in 784 responses. In 622 (79%, 95% CI 76-82) of 784 responses, both physicians agreed with GPT-4. In 82 (11%, 8-13) responses, only one physician agreed with GPT-4. In the remaining 80 (10%, 8-13) responses, neither physician agreed with GPT-4. Both physicians agreed with GPT-4 more often for medical notes written in Spanish (86 [88%, 95% CI 79-93] of 98 responses) and Italian (82 [84%, 75-90] of 98 responses) than in English (454 [77%, 74-80] of 588 responses). INTERPRETATION The results of our model-evaluation study suggest that GPT-4 is accurate when analysing medical notes in three different languages. In the future, research should explore how LLMs can be integrated into clinical workflows to maximise their use in health care. FUNDING None.
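The agreement figures above are proportions with 95% confidence intervals. The abstract does not say which interval method was used; the sketch below assumes the Wilson score interval, a common choice for binomial proportions, and the function name `wilson_ci` is illustrative. For the headline result (622 of 784 responses) it gives bounds that round to the reported 76-82%. For the smaller subgroups the bounds can differ by about a percentage point from the printed ones, which is consistent with the authors having used a different method.

```python
import math

def wilson_ci(successes, n, z=1.959964):
    """Wilson score interval for a binomial proportion (default 95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    # Centre of the interval is pulled slightly toward 0.5.
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width

# Both physicians agreed with GPT-4 in 622 of 784 responses.
low, high = wilson_ci(622, 784)
print(f"{100 * 622 / 784:.0f}% ({100 * low:.0f}-{100 * high:.0f})")
```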
Affiliation(s)
- Maria Clara Saad Menezes
  - Department of Biomedical Informatics, Medical School, Harvard University, Boston, MA, USA; Department of Internal Medicine, University of Texas at Southwestern, Dallas, TX, USA
- Alexander F Hoffmann
  - Department of Biomedical Informatics, Medical School, Harvard University, Boston, MA, USA
- Amelia L M Tan
  - Department of Biomedical Informatics, Medical School, Harvard University, Boston, MA, USA
- Mariné Nalbandyan
  - Office of Informatics and Information Technology, School of Medicine and Public Health, University of Wisconsin-Madison, Madison, WI, USA
- Gilbert S Omenn
  - Computational Medicine and Bioinformatics, Internal Medicine, Human Genetics, Environmental Health, University of Michigan, Ann Arbor, MI, USA
- Diego R Mazzotti
  - Division of Medical Informatics and Division of Pulmonary Critical Care and Sleep Medicine, Department of Internal Medicine, University of Kansas Medical Center, Kansas City, KS, USA
- Alejandro Hernández-Arango
  - Department of Internal Medicine, University of Antioquia, Hospital Alma Máter de Antioquia, Medellín, Colombia
- Shyam Visweswaran
  - Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, USA
- Shruthi Venkatesh
  - Department of Neurology, University of Pittsburgh, Pittsburgh, PA, USA
- Kenneth D Mandl
  - Computational Health Informatics Program, Boston Children's Hospital, Boston, MA, USA
- James W K Lee
  - Department of Surgery, National University Health System, Singapore
- Andrew Makmur
  - Department of Diagnostic Imaging, National University Health System, Singapore
- David A Hanauer
  - Department of Learning Health Sciences, University of Michigan, Ann Arbor, MI, USA
- Michael G Semanik
  - Office of Informatics and Information Technology, School of Medicine and Public Health, University of Wisconsin-Madison, Madison, WI, USA
- Lauren T Kerivan
  - Department of Surgery, University of Kansas Medical Center, Kansas City, KS, USA
- Terra Hill
  - Department of Surgery, University of Kansas Medical Center, Kansas City, KS, USA
- Julian Forero
  - Department of Internal Medicine, University of Antioquia, Hospital Alma Máter de Antioquia, Medellín, Colombia
- Carlos Restrepo
  - Department of Internal Medicine, University of Antioquia, Hospital Alma Máter de Antioquia, Medellín, Colombia
- Matteo Vigna
  - Respiratory Rehabilitation Unit, Istituti Clinici Scientifici Maugeri Istituto di Ricovero e Cura a Carattere Scientifico, Pavia, Italy
- Piero Ceriana
  - Respiratory Rehabilitation Unit, Istituti Clinici Scientifici Maugeri Istituto di Ricovero e Cura a Carattere Scientifico, Pavia, Italy
- Noor Abu-El-Rub
  - Research Informatics, University of Kansas Medical Center, Kansas City, KS, USA
- Paul Avillach
  - Department of Biomedical Informatics, Medical School, Harvard University, Boston, MA, USA
- Riccardo Bellazzi
  - Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Pavia, Italy
- Thomas Callaci
  - Office of Informatics and Information Technology, School of Medicine and Public Health, University of Wisconsin-Madison, Madison, WI, USA
| | | | - Alberto Malovini
- Laboratory of Medical Informatics and Artificial Intelligence, Istituti Clinici Scientifici Maugeri Istituto di Ricovero e Cura a Carattere Scientifico, Pavia, Italy
| | - Jomol P Mathew
- Office of Informatics and Information Technology, School of Medicine and Public Health, University of Wisconsin-Madison, Madison, WI, USA
| | - Michele Morris
- Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, USA
| | - Venkatesh L Murthy
- Department of Internal Medicine and Frankel Cardiovascular Center, University of Michigan, Ann Arbor, MI, USA
| | - Tommaso M Buonocore
- Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Pavia, Italy
| | - Enea Parimbelli
- Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Pavia, Italy
| | - Lav P Patel
- Research Informatics, University of Kansas Medical Center, Kansas City, KS, USA
| | - Carlos Sáez
- Biomedical Data Science Lab, Instituto Universitario de Tecnologías de la Información y Comunicaciones, Universitat Politècnica de València, Valencia, Spain
| | | | - Jeffrey A Thompson
- Department of Biostatistics and Data Science, University of Kansas Medical Center, Kansas City, KS, USA
| | - Valentina Tibollo
- Laboratory of Medical Informatics and Artificial Intelligence, Istituti Clinici Scientifici Maugeri Istituto di Ricovero e Cura a Carattere Scientifico, Pavia, Italy
| | - Zongqi Xia
- Department of Neurology, University of Pittsburgh, Pittsburgh, PA, USA
| | - Isaac S Kohane
- Department of Biomedical Informatics, Medical School, Harvard University, Boston, MA, USA.
| |
|
10
|
Nargesi AA, Adejumo P, Dhingra LS, Rosand B, Hengartner A, Coppi A, Benigeri S, Sen S, Ahmad T, Nadkarni GN, Lin Z, Ahmad FS, Krumholz HM, Khera R. Automated Identification of Heart Failure With Reduced Ejection Fraction Using Deep Learning-Based Natural Language Processing. JACC. HEART FAILURE 2025; 13:75-87. [PMID: 39453355 DOI: 10.1016/j.jchf.2024.08.012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2024] [Revised: 07/02/2024] [Accepted: 08/16/2024] [Indexed: 10/26/2024]
Abstract
BACKGROUND The lack of automated tools for measuring care quality limits the implementation of a national program to assess guideline-directed care in heart failure with reduced ejection fraction (HFrEF). OBJECTIVES The authors aimed to automate the identification of patients with HFrEF at hospital discharge, an opportunity to evaluate and improve the quality of care. METHODS The authors developed a novel deep-learning language model for identifying patients with HFrEF from discharge summaries of hospitalizations with heart failure at Yale New Haven Hospital during 2015 to 2019. HFrEF was defined by left ventricular ejection fraction <40% on antecedent echocardiography. The authors externally validated the model at Northwestern Medicine, community hospitals of Yale, and the MIMIC-III (Medical Information Mart for Intensive Care III) database. RESULTS A total of 13,251 notes from 5,392 unique individuals (age 73 ± 14 years, 48% women), including 2,487 patients with HFrEF (46.1%), were used for model development (train/held-out: 70%/30%). The model achieved an area under receiver-operating characteristic curve (AUROC) of 0.97 and area under precision recall curve (AUPRC) of 0.97 in detecting HFrEF on the held-out set. The model had high performance in identifying HFrEF with AUROC = 0.94 and AUPRC = 0.91 on 19,242 notes from Northwestern Medicine, AUROC = 0.95 and AUPRC = 0.96 on 139 manually abstracted notes from Yale community hospitals, and AUROC = 0.91 and AUPRC = 0.92 on 146 manually reviewed notes from MIMIC-III. Model-based predictions of HFrEF corresponded to a net reclassification improvement of 60.2% ± 1.9% compared with diagnosis codes (P < 0.001). CONCLUSIONS The authors developed a language model that identifies HFrEF from clinical notes with high precision and accuracy, representing a key element in automating quality assessment for individuals with HFrEF.
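The AUROC and AUPRC figures quoted above summarise how well the model's scores rank HFrEF notes above non-HFrEF notes. The sketch below shows one common way to compute both from labels and scores. The data are made-up toy values, not study data, and average precision is used here as an assumed estimator of AUPRC; the abstract does not state how the authors computed it.

```python
def auroc(labels, scores):
    """Probability that a random positive outranks a random negative (ties count as half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def average_precision(labels, scores):
    """Mean of the precision values at each rank where a positive is retrieved."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, running = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            running += hits / rank
    return running / hits

# Toy example: 1 = HFrEF, 0 = not HFrEF; scores are hypothetical model outputs.
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]

print(f"AUROC = {auroc(toy_labels, toy_scores):.3f}")
print(f"AUPRC (average precision) = {average_precision(toy_labels, toy_scores):.3f}")
```

AUPRC depends on the prevalence of the positive class, so values from cohorts with different HFrEF rates are not directly comparable in the way AUROC values are.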
Affiliation(s)
- Arash A Nargesi
- Heart and Vascular Center, Brigham and Women's Hospital, Harvard School of Medicine, Boston, Massachusetts, USA
| | - Philip Adejumo
- Section of Cardiovascular Medicine, Department of Internal Medicine, Yale University, New Haven, Connecticut, USA
| | - Lovedeep Singh Dhingra
- Section of Cardiovascular Medicine, Department of Internal Medicine, Yale University, New Haven, Connecticut, USA
| | - Benjamin Rosand
- Section of Cardiovascular Medicine, Department of Internal Medicine, Yale University, New Haven, Connecticut, USA
| | - Astrid Hengartner
- Section of Cardiovascular Medicine, Department of Internal Medicine, Yale University, New Haven, Connecticut, USA
| | - Andreas Coppi
- Section of Cardiovascular Medicine, Department of Internal Medicine, Yale University, New Haven, Connecticut, USA; Center for Outcomes Research and Evaluation (CORE), Yale New Haven Hospital, New Haven, Connecticut, USA
| | - Simon Benigeri
- Division of Cardiology, Department of Medicine, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, USA
| | - Sounok Sen
- Section of Cardiovascular Medicine, Department of Internal Medicine, Yale University, New Haven, Connecticut, USA
| | - Tariq Ahmad
- Section of Cardiovascular Medicine, Department of Internal Medicine, Yale University, New Haven, Connecticut, USA
| | - Girish N Nadkarni
- Division of Nephrology, Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, New York, USA
| | - Zhenqiu Lin
- Center for Outcomes Research and Evaluation (CORE), Yale New Haven Hospital, New Haven, Connecticut, USA
| | - Faraz S Ahmad
- Division of Cardiology, Department of Medicine, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, USA
| | - Harlan M Krumholz
- Section of Cardiovascular Medicine, Department of Internal Medicine, Yale University, New Haven, Connecticut, USA; Center for Outcomes Research and Evaluation (CORE), Yale New Haven Hospital, New Haven, Connecticut, USA; Department of Health Policy and Management, Yale School of Public Health, New Haven, Connecticut, USA; Section of Health Informatics, Department of Biostatistics, Yale School of Public Health, New Haven, Connecticut, USA
| | - Rohan Khera
- Section of Cardiovascular Medicine, Department of Internal Medicine, Yale University, New Haven, Connecticut, USA; Center for Outcomes Research and Evaluation (CORE), Yale New Haven Hospital, New Haven, Connecticut, USA.
| |
|
11
|
Parasa S, Berzin T, Leggett C, Gross S, Repici A, Ahmad OF, Chiang A, Coelho-Prabhu N, Cohen J, Dekker E, Keswani RN, Kahn CE, Hassan C, Petrick N, Mountney P, Ng J, Riegler M, Mori Y, Saito Y, Thakkar S, Waxman I, Wallace MB, Sharma P. Consensus statements on the current landscape of artificial intelligence applications in endoscopy, addressing roadblocks, and advancing artificial intelligence in gastroenterology. Gastrointest Endosc 2025; 101:2-9.e1. [PMID: 38639679 DOI: 10.1016/j.gie.2023.12.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2023] [Accepted: 12/02/2023] [Indexed: 04/20/2024]
Abstract
BACKGROUND AND AIMS The American Society for Gastrointestinal Endoscopy (ASGE) AI Task Force along with experts in endoscopy, technology space, regulatory authorities, and other medical subspecialties initiated a consensus process that analyzed the current literature, highlighted potential areas, and outlined the necessary research in artificial intelligence (AI) to allow a clearer understanding of AI as it pertains to endoscopy currently. METHODS A modified Delphi process was used to develop these consensus statements. RESULTS Statement 1: Current advances in AI allow for the development of AI-based algorithms that can be applied to endoscopy to augment endoscopist performance in detection and characterization of endoscopic lesions. Statement 2: Computer vision-based algorithms provide opportunities to redefine quality metrics in endoscopy using AI, which can be standardized and can reduce subjectivity in reporting quality metrics. Natural language processing-based algorithms can help with the data abstraction needed for reporting current quality metrics in GI endoscopy effortlessly. Statement 3: AI technologies can support smart endoscopy suites, which may help optimize workflows in the endoscopy suite, including automated documentation. Statement 4: Using AI and machine learning helps in predictive modeling, diagnosis, and prognostication. High-quality data with multidimensionality are needed for risk prediction, prognostication of specific clinical conditions, and their outcomes when using machine learning methods. Statement 5: Big data and cloud-based tools can help advance clinical research in gastroenterology. Multimodal data are key to understanding the maximal extent of the disease state and unlocking treatment options. Statement 6: Understanding how to evaluate AI algorithms in the gastroenterology literature and clinical trials is important for gastroenterologists, trainees, and researchers, and hence education efforts by GI societies are needed. 
Statement 7: Several challenges regarding integrating AI solutions into the clinical practice of endoscopy exist, including understanding the role of human-AI interaction. Transparency, interpretability, and explainability of AI algorithms play a key role in their clinical adoption in GI endoscopy. Developing appropriate AI governance, data procurement, and tools needed for the AI lifecycle are critical for the successful implementation of AI into clinical practice. Statement 8: For payment of AI in endoscopy, a thorough evaluation of the potential value proposition for AI systems may help guide purchasing decisions in endoscopy. Reliable cost-effectiveness studies to guide reimbursement are needed. Statement 9: Relevant clinical outcomes and performance metrics for AI in gastroenterology are currently not well defined. To improve the quality and interpretability of research in the field, steps need to be taken to define these evidence standards. Statement 10: A balanced view of AI technologies and active collaboration between the medical technology industry, computer scientists, gastroenterologists, and researchers are critical for the meaningful advancement of AI in gastroenterology. CONCLUSIONS The consensus process led by the ASGE AI Task Force and experts from various disciplines has shed light on the potential of AI in endoscopy and gastroenterology. AI-based algorithms have shown promise in augmenting endoscopist performance, redefining quality metrics, optimizing workflows, and aiding in predictive modeling and diagnosis. However, challenges remain in evaluating AI algorithms, ensuring transparency and interpretability, addressing governance and data procurement, determining payment models, defining relevant clinical outcomes, and fostering collaboration between stakeholders. Addressing these challenges while maintaining a balanced perspective is crucial for the meaningful advancement of AI in gastroenterology.
Affiliation(s)
| | | | | | - Seth Gross
- NYU Langone Health, New York, New York, USA
| | - Alessandro Repici
- Department of Biomedical Sciences, Humanitas University, Via Rita Levi Montalcini 4 20072 Pieve Emanuele, Milan, Italy; IRCCS Humanitas Research Hospital, via Manzoni 56 20089 Rozzano, Milan, Italy
| | | | - Austin Chiang
- Medtronic Gastrointestinal, Santa Clara, California, USA
| | | | | | | | | | - Charles E Kahn
- University of Pennsylvania, Philadelphia, Pennsylvania, USA
| | - Cesare Hassan
- Department of Biomedical Sciences, Humanitas University, Via Rita Levi Montalcini 4 20072 Pieve Emanuele, Milan, Italy; IRCCS Humanitas Research Hospital, via Manzoni 56 20089 Rozzano, Milan, Italy
| | - Nicholas Petrick
- Center for Devices and Radiological Health, U.S. Food and Drug Administration
| | | | - Jonathan Ng
- Iterative Health, Boston, Massachusetts, USA
| | | | | | | | - Shyam Thakkar
- West Virginia University Medicine, Morgantown, West Virginia, USA
| | - Irving Waxman
- Rush University Medical Center, Chicago, Illinois, USA
| | | | | |
|
12
|
Rawson TM, Zhu N, Galiwango R, Cocker D, Islam MS, Myall A, Vasikasin V, Wilson R, Shafiq N, Das S, Holmes AH. Using digital health technologies to optimise antimicrobial use globally. Lancet Digit Health 2024; 6:e914-e925. [PMID: 39547912 DOI: 10.1016/s2589-7500(24)00198-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2024] [Revised: 06/22/2024] [Accepted: 09/09/2024] [Indexed: 11/17/2024]
Abstract
Digital health technology (DHT) describes tools and devices that generate or process health data. The application of DHTs could improve the diagnosis, treatment, and surveillance of bacterial infection and the prevention of antimicrobial resistance (AMR). DHTs to optimise antimicrobial use are rapidly being developed. To support the global adoption of DHTs and the opportunities they offer to optimise antimicrobial use, consensus is needed on what data are required to support antimicrobial decision making. This Series paper will explore bacterial AMR in humans and the need to optimise antimicrobial use in response to this global threat. It will also describe state-of-the-art DHTs to optimise antimicrobial prescribing in high-income and low-income and middle-income countries, and consider what fundamental data are ideally required for and from such technologies to support optimised antimicrobial use.
Affiliation(s)
- Timothy M Rawson
- Centre for Antimicrobial Optimisation, Imperial College London, London, UK; Health Protection Research Unit in Healthcare Associated Infections and Antimicrobial Resistance, Imperial College London, London, UK; The David Price Evans Global Health & Infectious Diseases Group, The University of Liverpool, Liverpool, UK.
| | - Nina Zhu
- Centre for Antimicrobial Optimisation, Imperial College London, London, UK; Health Protection Research Unit in Healthcare Associated Infections and Antimicrobial Resistance, Imperial College London, London, UK; The David Price Evans Global Health & Infectious Diseases Group, The University of Liverpool, Liverpool, UK
| | - Ronald Galiwango
- The African Centre of Excellence in Bioinformatics and Data Intensive Sciences, The Infectious Diseases Institute, College of Health Sciences, Makerere University, Kampala, Uganda
| | - Derek Cocker
- The David Price Evans Global Health & Infectious Diseases Group, The University of Liverpool, Liverpool, UK
| | | | - Ashleigh Myall
- Centre for Antimicrobial Optimisation, Imperial College London, London, UK; Health Protection Research Unit in Healthcare Associated Infections and Antimicrobial Resistance, Imperial College London, London, UK; Centre for Mathematics of Precision Healthcare, Imperial College London, London, UK
| | - Vasin Vasikasin
- Centre for Antimicrobial Optimisation, Imperial College London, London, UK; Health Protection Research Unit in Healthcare Associated Infections and Antimicrobial Resistance, Imperial College London, London, UK; Division of Infectious Diseases, Department of Internal Medicine, Phramongkutklao Hospital and College of Medicine, Bangkok, Thailand
| | - Richard Wilson
- Centre for Antimicrobial Optimisation, Imperial College London, London, UK; Health Protection Research Unit in Healthcare Associated Infections and Antimicrobial Resistance, Imperial College London, London, UK; The David Price Evans Global Health & Infectious Diseases Group, The University of Liverpool, Liverpool, UK
| | - Nusrat Shafiq
- Clinical Pharmacology Unit, Department of Pharmacology, Postgraduate Institute of Medical Education and Research, Chandigarh, India
| | - Shampa Das
- Antimicrobial Pharmacodynamics and Therapeutics, Department of Pharmacology, The University of Liverpool, Liverpool Health Partners, Liverpool, UK
| | - Alison H Holmes
- Centre for Antimicrobial Optimisation, Imperial College London, London, UK; Health Protection Research Unit in Healthcare Associated Infections and Antimicrobial Resistance, Imperial College London, London, UK; The David Price Evans Global Health & Infectious Diseases Group, The University of Liverpool, Liverpool, UK
| |
|
13
|
Mendizabal-Ruiz G, Paredes O, Álvarez Á, Acosta-Gómez F, Hernández-Morales E, González-Sandoval J, Mendez-Zavala C, Borrayo E, Chavez-Badiola A. Artificial Intelligence in Human Reproduction. Arch Med Res 2024; 55:103131. [PMID: 39615376 DOI: 10.1016/j.arcmed.2024.103131] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/18/2024] [Revised: 11/04/2024] [Accepted: 11/12/2024] [Indexed: 01/04/2025]
Abstract
The use of artificial intelligence (AI) in human reproduction is a rapidly evolving field with both exciting possibilities and ethical considerations. This technology has the potential to improve success rates and reduce the emotional and financial burden of infertility. However, it also raises ethical and privacy concerns. This paper presents an overview of the current and potential applications of AI in human reproduction. It explores the use of AI in various aspects of reproductive medicine, including fertility tracking, assisted reproductive technologies, management of pregnancy complications, and laboratory automation. In addition, we discuss the need for robust ethical frameworks and regulations to ensure the responsible and equitable use of AI in reproductive medicine.
Affiliation(s)
- Gerardo Mendizabal-Ruiz
- Conceivable Life Sciences, Department of Research and Development, Guadalajara, Jalisco, Mexico; Laboratorio de Percepción Computacional, Departamento de Bioingeniería Traslacional, Universidad de Guadalajara, Guadalajara, Jalisco, Mexico.
| | - Omar Paredes
- Laboratorio de Innovación Biodigital, Departamento de Bioingeniería Traslacional, Universidad de Guadalajara, Guadalajara, Jalisco, Mexico; IVF 2.0 Limited, Department of Research and Development, London, UK
| | - Ángel Álvarez
- Conceivable Life Sciences, Department of Research and Development, Guadalajara, Jalisco, Mexico; Laboratorio de Percepción Computacional, Departamento de Bioingeniería Traslacional, Universidad de Guadalajara, Guadalajara, Jalisco, Mexico
| | - Fátima Acosta-Gómez
- Conceivable Life Sciences, Department of Research and Development, Guadalajara, Jalisco, Mexico; Laboratorio de Percepción Computacional, Departamento de Bioingeniería Traslacional, Universidad de Guadalajara, Guadalajara, Jalisco, Mexico
| | - Estefanía Hernández-Morales
- Conceivable Life Sciences, Department of Research and Development, Guadalajara, Jalisco, Mexico; Laboratorio de Percepción Computacional, Departamento de Bioingeniería Traslacional, Universidad de Guadalajara, Guadalajara, Jalisco, Mexico
| | - Josué González-Sandoval
- Laboratorio de Percepción Computacional, Departamento de Bioingeniería Traslacional, Universidad de Guadalajara, Guadalajara, Jalisco, Mexico
| | - Celina Mendez-Zavala
- Laboratorio de Percepción Computacional, Departamento de Bioingeniería Traslacional, Universidad de Guadalajara, Guadalajara, Jalisco, Mexico
| | - Ernesto Borrayo
- Laboratorio de Innovación Biodigital, Departamento de Bioingeniería Traslacional, Universidad de Guadalajara, Guadalajara, Jalisco, Mexico
| | - Alejandro Chavez-Badiola
- Conceivable Life Sciences, Department of Research and Development, Guadalajara, Jalisco, Mexico; IVF 2.0 Limited, Department of Research and Development, London, UK; New Hope Fertility Center, Department of Research, Ciudad de México, Mexico
| |
|
14
|
Cook N, Biel FM, Cartwright N, Hoopes M, Al Bataineh A, Rivera P. Assessing the use of unstructured electronic health record data to identify exposure to firearm violence. JAMIA Open 2024; 7:ooae120. [PMID: 39498385 PMCID: PMC11534176 DOI: 10.1093/jamiaopen/ooae120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/12/2024] [Revised: 09/20/2024] [Accepted: 10/16/2024] [Indexed: 11/07/2024] Open
Abstract
OBJECTIVES Research on firearm violence is largely limited to people who experienced acute bodily trauma and death, information that is readily gathered from inpatient and emergency department settings and mortality data. Exposures to firearm violence, such as witnessing firearm violence or losing a loved one to firearm violence, are not routinely collected in health care. As a result, the true public health burden of firearm violence is underestimated. Clinical notes from electronic health records (EHRs) are a promising source of data that may expand our understanding of the impact of firearm violence on health. Pilot work was conducted on a sample of clinical notes to assess how firearm terms appear in unstructured clinical notes, as part of a larger initiative to develop a natural language processing (NLP) model to identify firearm exposure and injury in ambulatory care data. MATERIALS AND METHODS We used EHR data from 2012 to 2022 from a large multistate network of primary care and behavioral health clinics. A text string search of broad, gun-only, and shooting terms was applied to 9,598 patients with an ICD-10 code, an OCHIN-developed structured data field indicating exposure to firearm violence, or both. A sample of clinical notes from 90 patients was reviewed to ascertain the meaning of terms. RESULTS Among the 90 clinical patient notes, 13 (14%) reflected documentation of exposure to firearm violence or injury from firearms. Results from this study identified refinements that should be considered for NLP text classification. CONCLUSION Unstructured clinical notes from primary and behavioral health clinics have the potential to expand understanding of firearm violence.
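A text string search over grouped terms, as described in the methods above, might look like the following sketch. The three group names come from the abstract, but the term lists, the `find_terms` helper, and the sample notes are all hypothetical; the abstract does not list the actual search terms.

```python
import re

# Hypothetical term lists; the study's actual terms are not given in the abstract.
TERM_GROUPS = {
    "broad": ["firearm", "weapon"],
    "gun_only": ["gun", "handgun", "rifle"],
    "shooting": ["shot", "shooting", "gunshot"],
}

# One case-insensitive, whole-word pattern per term group.
PATTERNS = {
    group: re.compile(r"\b(?:" + "|".join(map(re.escape, terms)) + r")\b", re.IGNORECASE)
    for group, terms in TERM_GROUPS.items()
}

def find_terms(note):
    """Return the matched terms in a note, keyed by term group."""
    hits = {}
    for group, pattern in PATTERNS.items():
        found = [match.group(0).lower() for match in pattern.finditer(note)]
        if found:
            hits[group] = found
    return hits

# Invented example notes, for illustration only.
print(find_terms("Patient witnessed a shooting last year."))
print(find_terms("Received flu shot today."))
```

The second note illustrates why a string match alone is not enough: a term such as "shot" can appear in an unrelated context, which is consistent with the manual review step the authors describe.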
Affiliation(s)
- Nicole Cook
- OCHIN Inc, Portland, OR 97228-5426, United States
| | | | - Natalie Cartwright
- Department of Mathematics, Norwich University, Northfield, VT 05663, United States
| | - Megan Hoopes
- OCHIN Inc, Portland, OR 97228-5426, United States
| | - Ali Al Bataineh
- David Crawford School of Engineering, Norwich University, Northfield, VT 05663, United States
| | - Pedro Rivera
- OCHIN Inc, Portland, OR 97228-5426, United States
| |
|
15
|
Chatterjee S, Fruhling A, Kotiadis K, Gartner D. Towards new frontiers of healthcare systems research using artificial intelligence and generative AI. Health Syst (Basingstoke) 2024; 13:263-273. [PMID: 39584173 PMCID: PMC11580149 DOI: 10.1080/20476965.2024.2402128] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2024] Open
|
16
|
Lumbiganon S, Abou Chawareb E, Moukhtar Hammad MA, Azad B, Shah D, Yafi FA. Artificial Intelligence as a Tool for Creating Patient Visit Summary: A Scoping Review and Guide to Implementation in an Erectile Dysfunction Clinic. Curr Urol Rep 2024; 26:20. [PMID: 39556140 DOI: 10.1007/s11934-024-01237-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 08/14/2024] [Indexed: 11/19/2024]
Abstract
PURPOSE OF REVIEW In modern healthcare, the integration of artificial intelligence (AI) has revolutionized clinical practices, particularly in data management and patient visit summary creation. Manual creation of patient summaries is repetitive, time-consuming, prone to errors, and increases clinicians' workload. AI, through voice recognition and Natural Language Processing (NLP), can automate this task more accurately and efficiently. Erectile dysfunction (ED) clinics, which deal with a specific pattern of conditions alongside broader systemic issues, can greatly benefit from AI-driven patient summaries. This scoping review examined the evidence on AI-generated patient summaries and evaluated their implementation in ED clinics. RECENT FINDINGS A total of 381 articles were initially identified, of which 11 studies were included in the analysis. These studies showcased various methodologies, such as AI-assisted clinical notes and NLP algorithms. Most studies demonstrated that AI can be used in real-life clinical scenarios. Major electronic health record platforms are also integrating AI into their systems. However, to date, no studies have specifically addressed AI for patient summary creation in ED clinics.
Affiliation(s)
- Supanut Lumbiganon
- Department of Urology, University of California, Irvine, CA, USA
- Department of Surgery, Faculty of Medicine, Khon Kaen University, Khon Kaen, Thailand
| | | | | | - Babak Azad
- Department of Urology, University of California, Irvine, CA, USA
| | - Dillan Shah
- Department of Urology, University of California, Irvine, CA, USA
| | - Faysal A Yafi
- Department of Urology, University of California, Irvine, CA, USA.
| |
|
17
|
Du X, Novoa-Laurentiev J, Plasek JM, Chuang YW, Wang L, Marshall GA, Mueller SK, Chang F, Datta S, Paek H, Lin B, Wei Q, Wang X, Wang J, Ding H, Manion FJ, Du J, Bates DW, Zhou L. Enhancing early detection of cognitive decline in the elderly: a comparative study utilizing large language models in clinical notes. EBioMedicine 2024; 109:105401. [PMID: 39396423 DOI: 10.1016/j.ebiom.2024.105401] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/16/2024] [Revised: 09/28/2024] [Accepted: 09/30/2024] [Indexed: 10/15/2024] Open
Abstract
BACKGROUND Large language models (LLMs) have shown promising performance in various healthcare domains, but their effectiveness in identifying specific clinical conditions in real medical records is less explored. This study evaluates LLMs for detecting signs of cognitive decline in real electronic health record (EHR) clinical notes, comparing their error profiles with traditional models. The insights gained will inform strategies for performance enhancement. METHODS This study, conducted at Mass General Brigham in Boston, MA, analysed clinical notes from the four years prior to a 2019 diagnosis of mild cognitive impairment in patients aged 50 and older. We developed prompts for two LLMs, Llama 2 and GPT-4, on Health Insurance Portability and Accountability Act (HIPAA)-compliant cloud-computing platforms using multiple approaches (e.g., hard prompting, retrieval augmented generation, and error analysis-based instructions) to select the optimal LLM-based method. Baseline models included a hierarchical attention-based neural network and XGBoost. Subsequently, we constructed an ensemble of the three models using a majority vote approach. Confusion-matrix-based scores were used for model evaluation. FINDINGS We used a randomly annotated sample of 4949 note sections from 1969 patients (women: 1046 [53.1%]; age: mean, 76.0 [SD, 13.3] years), filtered with keywords related to cognitive functions, for model development. For testing, a random annotated sample of 1996 note sections from 1161 patients (women: 619 [53.3%]; age: mean, 76.5 [SD, 10.2] years) without keyword filtering was utilised. GPT-4 demonstrated superior accuracy and efficiency compared to Llama 2, but did not outperform traditional models. 
The ensemble model outperformed the individual models in terms of all evaluation metrics with statistical significance (p < 0.01), achieving a precision of 90.2% [95% CI: 81.9%-96.8%], a recall of 94.2% [95% CI: 87.9%-98.7%], and an F1-score of 92.1% [95% CI: 86.8%-96.4%]. Notably, the ensemble model showed a significant improvement in precision, increasing from a range of 70%-79% to above 90%, compared to the best-performing single model. Error analysis revealed that 63 samples were incorrectly predicted by at least one model; however, only 2 cases (3.2%) were mutual errors across all models, indicating diverse error profiles among them. INTERPRETATION LLMs and traditional machine learning models trained using local EHR data exhibited diverse error profiles. The ensemble of these models was found to be complementary, enhancing diagnostic performance. Future research should investigate integrating LLMs with smaller, localised models and incorporating medical data and domain knowledge to enhance performance on specific tasks. FUNDING This research was supported by the National Institute on Aging grants (R44AG081006, R01AG080429) and National Library of Medicine grant (R01LM014239).
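The majority-vote ensemble and the confusion-matrix-based scores described above can be illustrated with a small sketch. The labels and per-model predictions below are invented toy values chosen so that each model errs on different samples; the variable names (`llm_pred`, `attention_nn_pred`, `xgboost_pred`) are hypothetical stand-ins for the three model types named in the abstract, not the authors' code.

```python
def majority_vote(*model_predictions):
    """Label a sample positive when more than half of the models predict positive."""
    return [int(sum(votes) * 2 > len(votes)) for votes in zip(*model_predictions)]

def precision_recall_f1(y_true, y_pred):
    """Precision, recall and F1 from true-positive, false-positive and false-negative counts."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Toy data: 1 = cognitive decline documented, 0 = not documented.
y_true            = [1, 1, 1, 0, 0, 0, 1, 0]
llm_pred          = [1, 1, 0, 0, 1, 0, 1, 0]  # errs on samples 2 and 4
attention_nn_pred = [1, 0, 1, 0, 0, 1, 1, 0]  # errs on samples 1 and 5
xgboost_pred      = [1, 1, 1, 1, 0, 0, 0, 0]  # errs on samples 3 and 6

ensemble_pred = majority_vote(llm_pred, attention_nn_pred, xgboost_pred)

for name, pred in [("LLM", llm_pred), ("attention NN", attention_nn_pred),
                   ("XGBoost", xgboost_pred), ("ensemble", ensemble_pred)]:
    p, r, f = precision_recall_f1(y_true, pred)
    print(f"{name}: precision={p:.2f} recall={r:.2f} F1={f:.2f}")
```

Because no two toy models are wrong on the same sample, the vote corrects every individual error. That is the idealised version of the "diverse error profiles" effect the abstract reports, where only 2 of 63 errors were shared by all models.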
Affiliation(s)
- Xinsong Du
- Division of General Internal Medicine and Primary Care, Brigham and Women's Hospital, Boston, MA, 02115, USA; Department of Medicine, Harvard Medical School, Boston, MA, 02115, USA.
| | - John Novoa-Laurentiev
- Division of General Internal Medicine and Primary Care, Brigham and Women's Hospital, Boston, MA, 02115, USA
| | - Joseph M Plasek
- Division of General Internal Medicine and Primary Care, Brigham and Women's Hospital, Boston, MA, 02115, USA; Department of Medicine, Harvard Medical School, Boston, MA, 02115, USA
| | - Ya-Wen Chuang
- Division of General Internal Medicine and Primary Care, Brigham and Women's Hospital, Boston, MA, 02115, USA; Department of Medicine, Harvard Medical School, Boston, MA, 02115, USA; Division of Nephrology, Taichung Veterans General Hospital, Taichung, 407219, Taiwan; Department of Post-Baccalaureate Medicine, College of Medicine, National Chung Hsing University, Taichung, 402202, Taiwan; School of Medicine, College of Medicine, China Medical University, Taichung, 406040, Taiwan
| | - Liqin Wang
- Division of General Internal Medicine and Primary Care, Brigham and Women's Hospital, Boston, MA, 02115, USA; Department of Medicine, Harvard Medical School, Boston, MA, 02115, USA
| | - Gad A Marshall
- Department of Medicine, Harvard Medical School, Boston, MA, 02115, USA; Department of Neurology, Brigham and Women's Hospital, Boston, MA, 02115, USA
| | - Stephanie K Mueller
- Division of General Internal Medicine and Primary Care, Brigham and Women's Hospital, Boston, MA, 02115, USA; Department of Medicine, Harvard Medical School, Boston, MA, 02115, USA
| | - Frank Chang
- Division of General Internal Medicine and Primary Care, Brigham and Women's Hospital, Boston, MA, 02115, USA
| | - Surabhi Datta
- Intelligent Medical Objects, Rosemont, Illinois, 60018, USA
| | - Hunki Paek
- Intelligent Medical Objects, Rosemont, Illinois, 60018, USA
| | - Bin Lin
- Intelligent Medical Objects, Rosemont, Illinois, 60018, USA
| | - Qiang Wei
- Intelligent Medical Objects, Rosemont, Illinois, 60018, USA
| | - Xiaoyan Wang
- Intelligent Medical Objects, Rosemont, Illinois, 60018, USA
| | - Jingqi Wang
- Intelligent Medical Objects, Rosemont, Illinois, 60018, USA
| | - Hao Ding
- Intelligent Medical Objects, Rosemont, Illinois, 60018, USA
| | - Frank J Manion
- Intelligent Medical Objects, Rosemont, Illinois, 60018, USA
| | - Jingcheng Du
- Intelligent Medical Objects, Rosemont, Illinois, 60018, USA
| | - David W Bates
- Division of General Internal Medicine and Primary Care, Brigham and Women's Hospital, Boston, MA, 02115, USA; Department of Medicine, Harvard Medical School, Boston, MA, 02115, USA
| | - Li Zhou
- Division of General Internal Medicine and Primary Care, Brigham and Women's Hospital, Boston, MA, 02115, USA; Department of Medicine, Harvard Medical School, Boston, MA, 02115, USA
| |
|
18
|
Lee K, Paek H, Huang LC, Hilton CB, Datta S, Higashi J, Ofoegbu N, Wang J, Rubinstein SM, Cowan AJ, Kwok M, Warner JL, Xu H, Wang X. SEETrials: Leveraging large language models for safety and efficacy extraction in oncology clinical trials. Informatics in Medicine Unlocked 2024; 50:101589. [PMID: 39493413 PMCID: PMC11530223 DOI: 10.1016/j.imu.2024.101589] [Indexed: 11/05/2024]
Abstract
Background Initial insights into oncology clinical trial outcomes are often gleaned manually from conference abstracts. We aimed to develop an automated system to extract safety and efficacy information from study abstracts with high precision and fine granularity, transforming them into computable data for timely clinical decision-making. Methods We collected clinical trial abstracts from key conferences and PubMed (2012-2023). The SEETrials system was developed with three modules: preprocessing, prompt engineering with knowledge ingestion, and postprocessing. We evaluated the system's performance qualitatively and quantitatively and assessed its generalizability across different cancer types: multiple myeloma (MM), breast, lung, lymphoma, and leukemia. Furthermore, the efficacy and safety of innovative therapies, including CAR-T, bispecific antibodies, and antibody-drug conjugates (ADC), in MM were analyzed across a large scale of clinical trial studies. Results SEETrials achieved high precision (0.964), recall (sensitivity) (0.988), and F1 score (0.974) across 70 data elements present in the MM trial studies. Generalizability tests on four additional cancers yielded precision, recall, and F1 scores within the 0.979-0.992 range. Variation in the distribution of safety and efficacy-related entities was observed across diverse therapies, with certain adverse events more common in specific treatments. Comparative performance analysis using overall response rate (ORR) and complete response (CR) highlighted differences among therapies: CAR-T (ORR: 88%, 95% CI: 84-92%; CR: 95%, 95% CI: 53-66%), bispecific antibodies (ORR: 64%, 95% CI: 55-73%; CR: 27%, 95% CI: 16-37%), and ADC (ORR: 51%, 95% CI: 37-65%; CR: 26%, 95% CI: 1-51%). Notable study heterogeneity was identified (>75% I² heterogeneity index scores) across several outcome entities analyzed within therapy subgroups.
Conclusion SEETrials demonstrated highly accurate data extraction and versatility across different therapeutics and various cancer domains. Its automated processing of large datasets facilitates nuanced data comparisons, promoting the swift and effective dissemination of clinical insights.
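The I² heterogeneity index mentioned in this abstract is conventionally derived from Cochran's Q. The abstract does not say which estimator the authors used, so the standard Higgins-Thompson definition is given here as an assumption, with k studies, effect estimates \hat{\theta}_i, and inverse-variance weights w_i = 1/v_i:

```latex
I^2 = \max\!\left(0,\ \frac{Q - (k - 1)}{Q}\right) \times 100\%,
\qquad
Q = \sum_{i=1}^{k} w_i \left(\hat{\theta}_i - \hat{\theta}\right)^2,
\qquad
\hat{\theta} = \frac{\sum_{i=1}^{k} w_i\, \hat{\theta}_i}{\sum_{i=1}^{k} w_i}
```

Under this definition, I² is the share of total variation across studies attributable to between-study differences rather than sampling error, which is why values above 75% are commonly read as high heterogeneity.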
Affiliation(s)
| | | | | | - C Beau Hilton
- Division of Hematology and Oncology, Vanderbilt University, Nashville, TN, USA
| | | | | | | | | | | | - Andrew J. Cowan
- Division of Hematology and Oncology, University of Washington, Seattle, WA, USA
| | - Mary Kwok
- Division of Hematology and Oncology, University of Washington, Seattle, WA, USA
| | - Jeremy L. Warner
- Lifespan Cancer Institute, Rhode Island Hospital, Providence, RI, USA
- Center for Clinical Cancer Informatics and Data Science, Legorreta Cancer Center, Brown University, Providence, RI, USA
| | - Hua Xu
- Biomedical Informatics and Data Science, Yale University, New Haven, CT, USA
| | | |
|
19
|
Lázaro E, Moscardó V. Qualitative Health-Related Quality of Life and Natural Language Processing: Characteristics, Implications, and Challenges. Healthcare (Basel) 2024; 12:2008. [PMID: 39408187 PMCID: PMC11475930 DOI: 10.3390/healthcare12192008] [Received: 09/17/2024] [Revised: 10/02/2024] [Accepted: 10/04/2024] [Indexed: 10/20/2024]
Abstract
OBJECTIVES This article focuses on describing the main characteristics of the application of NLP in the qualitative assessment of quality of life, as well as its implications and challenges. METHODS The qualitative methodology allows analysing patient comments in unstructured free text and obtaining valuable information through manual analysis of these data. However, large amounts of data are a healthcare challenge since it would require a high number of staff and time resources that are not available in most healthcare organizations. RESULTS One potential solution to mitigate the resource constraints of qualitative analysis is the use of machine learning and artificial intelligence, specifically methodologies based on natural language processing.
Affiliation(s)
- Esther Lázaro
- Faculty of Health Sciences, Valencian International University, Calle Pintor Sorolla 21, 46002 Valencia, Spain;
| | | |
|
20
|
Walsh CG, Wilimitis D, Chen Q, Wright A, Kolli J, Robinson K, Ripperger MA, Johnson KB, Carrell D, Desai RJ, Mosholder A, Dharmarajan S, Adimadhyam S, Fabbri D, Stojanovic D, Matheny ME, Bejan CA. Scalable incident detection via natural language processing and probabilistic language models. Sci Rep 2024; 14:23429. [PMID: 39379449 PMCID: PMC11461638 DOI: 10.1038/s41598-024-72756-7] [Received: 12/21/2023] [Accepted: 09/10/2024] [Indexed: 10/10/2024]
Abstract
Post-marketing safety surveillance depends in part on the ability to detect concerning clinical events at scale. Spontaneous reporting might be an effective component of safety surveillance, but it requires awareness and understanding among healthcare professionals to achieve its potential. Reliance on readily available structured data such as diagnostic codes risks under-coding and imprecision. Clinical textual data might bridge these gaps, and natural language processing (NLP) has been shown to aid in scalable phenotyping across healthcare records in multiple clinical domains. In this study, we developed and validated a novel incident phenotyping approach using unstructured clinical textual data agnostic to Electronic Health Record (EHR) and note type. It is based on a published, validated approach (PheRe) used to ascertain social determinants of health and suicidality across entire healthcare records. To demonstrate generalizability, we validated this approach on two separate phenotypes that share common challenges with respect to accurate ascertainment: (1) suicide attempt; (2) sleep-related behaviors. With samples of 89,428 records and 35,863 records for suicide attempt and sleep-related behaviors, respectively, we conducted silver standard (diagnostic coding) and gold standard (manual chart review) validation. We showed an Area Under the Precision-Recall Curve (AUPR) of ~0.77 (95% CI 0.75-0.78) for suicide attempt and an AUPR of ~0.31 (95% CI 0.28-0.34) for sleep-related behaviors. We also evaluated performance by coded race and demonstrated that differences in performance by race varied across phenotypes. Scalable phenotyping models, like most healthcare AI, require algorithmovigilance and debiasing prior to implementation.
Affiliation(s)
- Colin G Walsh
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA.
- Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA.
- Department of Psychiatry and Behavioral Sciences, Vanderbilt University Medical Center, Nashville, TN, USA.
- Vanderbilt University Medical Center, Nashville, USA.
| | - Drew Wilimitis
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Qingxia Chen
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
- Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Aileen Wright
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Jhansi Kolli
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Katelyn Robinson
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Michael A Ripperger
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Kevin B Johnson
- Department of Biostatistics, Epidemiology and Informatics, and Pediatrics, University of Pennsylvania, Pennsylvania, USA
- Department of Computer and Information Science, Bioengineering, University of Pennsylvania, Pennsylvania, USA
- Department of Science Communication, University of Pennsylvania, Pennsylvania, USA
| | - David Carrell
- Washington Health Research Institute, Kaiser Permanente Washington, Washington, USA
| | - Rishi J Desai
- Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, USA
| | - Andrew Mosholder
- Center for Drug Evaluation and Research, United States Food and Drug Administration, Maryland, USA
- Office of Surveillance and Epidemiology, United States Food and Drug Administration, Maryland, USA
| | - Sai Dharmarajan
- Center for Drug Evaluation and Research, United States Food and Drug Administration, Maryland, USA
- Office of Translational Science, United States Food and Drug Administration, Maryland, USA
| | - Sruthi Adimadhyam
- Department of Population Medicine, Harvard Medical School, Harvard Pilgrim Health Care Institute, Boston, USA
| | - Daniel Fabbri
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Danijela Stojanovic
- Center for Drug Evaluation and Research, United States Food and Drug Administration, Maryland, USA
- Office of Surveillance and Epidemiology, United States Food and Drug Administration, Maryland, USA
| | - Michael E Matheny
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Cosmin A Bejan
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
| |
|
21
|
Cheligeer K, Wu G, Laws A, Quan ML, Li A, Brisson AM, Xie J, Xu Y. Validation of large language models for detecting pathologic complete response in breast cancer using population-based pathology reports. BMC Med Inform Decis Mak 2024; 24:283. [PMID: 39363322 PMCID: PMC11447988 DOI: 10.1186/s12911-024-02677-y] [Received: 03/01/2024] [Accepted: 09/09/2024] [Indexed: 10/05/2024]
Abstract
AIMS The primary goal of this study is to evaluate the capabilities of Large Language Models (LLMs) in understanding and processing complex medical documentation. We chose to focus on the identification of pathologic complete response (pCR) in narrative pathology reports. This approach aims to contribute to the advancement of comprehensive reporting, health research, and public health surveillance, thereby enhancing patient care and breast cancer management strategies. METHODS The study utilized two analytical pipelines, developed with open-source LLMs within the healthcare system's computing environment. First, we extracted embeddings from pathology reports using 15 different transformer-based models and then employed logistic regression on these embeddings to classify the presence or absence of pCR. Secondly, we fine-tuned the Generative Pre-trained Transformer-2 (GPT-2) model by attaching a simple feed-forward neural network (FFNN) layer to improve the detection performance of pCR from pathology reports. RESULTS In a cohort of 351 female breast cancer patients who underwent neoadjuvant chemotherapy (NAC) and subsequent surgery between 2010 and 2017 in Calgary, the optimized method displayed a sensitivity of 95.3% (95%CI: 84.0-100.0%), a positive predictive value of 90.9% (95%CI: 76.5-100.0%), and an F1 score of 93.0% (95%CI: 83.7-100.0%). The results, achieved through diverse LLM integration, surpassed traditional machine learning models, underscoring the potential of LLMs in clinical pathology information extraction. CONCLUSIONS The study successfully demonstrates the efficacy of LLMs in interpreting and processing digital pathology data, particularly for determining pCR in breast cancer patients post-NAC. The superior performance of LLM-based pipelines over traditional models highlights their significant potential in extracting and analyzing key clinical data from narrative reports. 
While promising, these findings highlight the need for future external validation to confirm the reliability and broader applicability of these methods.
Affiliation(s)
- Ken Cheligeer
- The Centre for Health Informatics, Cumming School of Medicine, University of Calgary, Calgary, Canada
- Provincial Research Data Services, Alberta Health Services, Calgary, Canada
| | - Guosong Wu
- The Centre for Health Informatics, Cumming School of Medicine, University of Calgary, Calgary, Canada
- Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, Canada
| | - Alison Laws
- Department of Surgery, Cumming School of Medicine, University of Calgary, Calgary, Canada
- Department of Oncology, Cumming School of Medicine, University of Calgary, Calgary, Canada
| | - May Lynn Quan
- Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, Canada
- Department of Surgery, Cumming School of Medicine, University of Calgary, Calgary, Canada
- Department of Oncology, Cumming School of Medicine, University of Calgary, Calgary, Canada
| | - Andrea Li
- The Centre for Health Informatics, Cumming School of Medicine, University of Calgary, Calgary, Canada
| | - Anne-Marie Brisson
- Department of Radiology, Cumming School of Medicine, University of Calgary, Calgary, Canada
| | - Jason Xie
- The Centre for Health Informatics, Cumming School of Medicine, University of Calgary, Calgary, Canada
| | - Yuan Xu
- The Centre for Health Informatics, Cumming School of Medicine, University of Calgary, Calgary, Canada.
- Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, Canada.
- Department of Surgery, Cumming School of Medicine, University of Calgary, Calgary, Canada.
- Department of Oncology, Cumming School of Medicine, University of Calgary, Calgary, Canada.
| |
|
22
|
Frankenberger WD, Zorc JJ, Cato KD. Prioritizing Pediatric Emergency Triage-Sorting Out the Challenges. JAMA Pediatr 2024; 178:972-973. [PMID: 39133494 DOI: 10.1001/jamapediatrics.2024.2677] [Indexed: 08/13/2024]
Affiliation(s)
- Warren D Frankenberger
- Center for Pediatric Nursing Research and Evidence-Based Practice, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania
| | - Joseph J Zorc
- Division of Emergency Medicine, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania
- Department of Pediatrics, Perelman School of Medicine, University of Pennsylvania, Philadelphia
| | - Kenrick D Cato
- Center for Pediatric Nursing Research and Evidence-Based Practice, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania
- University of Pennsylvania School of Nursing, Philadelphia
| |
|
23
|
Guralnik E. US public health surveillance, reimagined. Learn Health Syst 2024; 8:e10445. [PMID: 39444500 PMCID: PMC11493541 DOI: 10.1002/lrh2.10445] [Received: 05/09/2024] [Revised: 06/24/2024] [Accepted: 07/25/2024] [Indexed: 10/25/2024]
Abstract
Introduction This study presents two novel concepts for standardizing electronic health records (EHR)-based public health surveillance through utilization of existing informatics methods and data platforms. Methods Drawing from the collective experience in applied epidemiology, health services research and health informatics, the author presents a vision for an alternative path to public health surveillance by repurposing existing tools and resources, such as (1) computable phenotypes which have already been created and validated for a variety of chronic diseases of interest to public health and (2) large data platforms/collaboratives, such as All of Us Research Program and National COVID Cohort Collaborative. Opportunities and challenges are discussed regarding EHR-based chronic disease surveillance, as well as the concept of phenotype definitions and large data platforms reuse for public health needs. Results/Framework Reusing of computable phenotypes for EHR-based public health surveillance would require secure data platforms and nationally representative data. Standardization metrics for reuse of previously developed and validated computable phenotypes are also necessary and are currently being developed by the author. This study presents a reimagined Learning Health System framework by incorporating Public Health and two novel concept sets of solutions into the healthcare ecosystem. Conclusion/Next Steps Alternative approaches to limited resources and current infrastructure of the US Public Health System, especially as applied to disease surveillance, are needed and may be possible when repurposing the resources and methodologies across the Learning Health System.
Affiliation(s)
- Elina Guralnik
- Department of Health Administration and Policy, College of Public Health, George Mason University, Fairfax, VA, USA
| |
|
24
|
Xu D, Xu Z. Machine learning applications in preventive healthcare: A systematic literature review on predictive analytics of disease comorbidity from multiple perspectives. Artif Intell Med 2024; 156:102950. [PMID: 39163727 DOI: 10.1016/j.artmed.2024.102950] [Received: 10/25/2023] [Revised: 06/17/2024] [Accepted: 08/13/2024] [Indexed: 08/22/2024]
Abstract
Artificial intelligence is constantly revolutionizing biomedical research and healthcare management. Disease comorbidity is a major threat to the quality of life for susceptible groups, especially middle-aged and elderly patients. The presence of multiple chronic diseases makes precision diagnosis challenging to realize and imposes a heavy burden on the healthcare system and economy. Given an enormous amount of accumulated health data, machine learning techniques show their capability in handling this puzzle. The present study conducts a review to uncover current research efforts in applying these methods to understanding comorbidity mechanisms and making clinical predictions considering these complex patterns. A descriptive metadata analysis of 791 unique publications aims to capture the overall research progression between January 2012 and June 2023. To delve into comorbidity-focused research, 61 of these scientific papers are systematically assessed. Four predictive analytics tasks are identified: disease comorbidity data extraction, clustering, network, and risk prediction. It is observed that some machine learning-driven applications address inherent data deficiencies in healthcare datasets and provide a model interpretation that identifies significant risk factors of comorbidity development. Based on insights, both technical and practical, gained from relevant literature, this study intends to guide future interests in comorbidity research and draw conclusions about chronic disease prevention and diagnosis with managerial implications.
Affiliation(s)
- Duo Xu
- School of Economics and Management, Southeast University, Nanjing 211189, China.
| | - Zeshui Xu
- School of Economics and Management, Southeast University, Nanjing 211189, China; Business School, Sichuan University, Chengdu 610064, China.
| |
|
25
|
Veras Florentino PT, Araújo VDO, Zatti H, Luis CV, Cavalcanti CRS, de Oliveira MHC, Leão AHFF, Bertoldo Junior J, Barbosa GGC, Ravera E, Cebukin A, David RB, de Melo DBV, Machado TM, Bellei NCJ, Boaventura V, Barral-Netto M, Smaili SS. Text mining method to unravel long COVID's clinical condition in hospitalized patients. Cell Death Dis 2024; 15:671. [PMID: 39271699 PMCID: PMC11399332 DOI: 10.1038/s41419-024-07043-4] [Received: 04/13/2024] [Revised: 08/28/2024] [Accepted: 08/29/2024] [Indexed: 09/15/2024]
Abstract
Long COVID is characterized by persistent symptoms that extend beyond established timeframes. Its varied presentation across different populations and healthcare systems poses significant challenges in understanding its clinical manifestations and implications. In this study, we present a novel application of a text mining technique to automatically extract unstructured data from a long COVID survey conducted at a prominent university hospital in São Paulo, Brazil. Our phonetic text clustering (PTC) method enables the exploration of unstructured Electronic Healthcare Records (EHR) data to unify different written forms of similar terms into a single phonemic representation. We used n-gram text analysis to detect compound words and negated terms in Portuguese-BR, focusing on medical conditions and symptoms related to long COVID. By leveraging text mining, we aim to contribute to a deeper understanding of this chronic condition and its implications for healthcare systems globally. The model developed in this study has the potential for scalability and applicability in other healthcare settings, thereby supporting broader research efforts and informing clinical decision-making for long COVID patients.
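The two ideas named in this abstract, unifying spelling variants under one phonetic key and using bigrams to spot negated terms, can be sketched roughly as follows. This is not the authors' PTC method: the normalisation rules, the negation cue list, and the function names `phonetic_key`, `cluster_terms`, and `negated_terms` are simplified assumptions chosen only to show the general shape of such a pipeline.

```python
import unicodedata
from collections import defaultdict

# Assumed, deliberately tiny list of Portuguese negation cues (accent-free keys).
NEGATION_CUES = {"nao", "sem", "nega"}


def phonetic_key(word):
    """Map a word to a crude phonetic key (hypothetical rules, not the paper's)."""
    text = unicodedata.normalize("NFC", word.lower()).replace("ç", "s")
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))  # drop accents
    text = text.replace("ph", "f")
    if text.startswith("h"):  # treat a leading "h" as silent
        text = text[1:]
    collapsed = []
    for ch in text:  # collapse doubled letters, e.g. "ss" -> "s"
        if not collapsed or collapsed[-1] != ch:
            collapsed.append(ch)
    return "".join(collapsed)


def cluster_terms(words):
    """Group written variants that share the same phonetic key."""
    clusters = defaultdict(set)
    for word in words:
        clusters[phonetic_key(word)].add(word)
    return {key: sorted(forms) for key, forms in clusters.items()}


def negated_terms(sentence):
    """Return tokens that directly follow a negation cue (bigram scan)."""
    tokens = [tok.strip(".,;:") for tok in sentence.lower().split()]
    return [nxt for cur, nxt in zip(tokens, tokens[1:])
            if phonetic_key(cur) in NEGATION_CUES]


print(cluster_terms(["cansaço", "cansasso", "tosse"]))
print(negated_terms("Paciente nega febre, com cansasso e sem tosse"))
```

A production system would need a proper Portuguese grapheme-to-phoneme mapping and a negation scope wider than one token; the bigram window here is the smallest case of the n-gram analysis the abstract mentions.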
Affiliation(s)
- Pilar Tavares Veras Florentino
- Laboratório de Medicina e Saúde Pública de Precisão (MeSP2), Instituto Gonçalo Moniz, Fundação Oswaldo Cruz, Salvador, Brazil
- Centro de Integração de Dados e Conhecimentos para a Saúde (CIDACS), Instituto Gonçalo Moniz, Fundação Oswaldo Cruz, Salvador, Brazil
| | - Vinícius de Oliveira Araújo
- Centro de Integração de Dados e Conhecimentos para a Saúde (CIDACS), Instituto Gonçalo Moniz, Fundação Oswaldo Cruz, Salvador, Brazil
- Faculdade de Medicina da Bahia, Universidade Federal da Bahia, Salvador, Brazil
| | - Henrique Zatti
- Centro de Integração de Dados e Conhecimentos para a Saúde (CIDACS), Instituto Gonçalo Moniz, Fundação Oswaldo Cruz, Salvador, Brazil
| | - Caio Vinícius Luis
- Departamento de Farmacologia, Escola Paulista de Medicina, Universidade Federal de São Paulo, São Paulo, Brazil
| | | | | | | | - Juracy Bertoldo Junior
- Centro de Integração de Dados e Conhecimentos para a Saúde (CIDACS), Instituto Gonçalo Moniz, Fundação Oswaldo Cruz, Salvador, Brazil
| | - George G Caique Barbosa
- Centro de Integração de Dados e Conhecimentos para a Saúde (CIDACS), Instituto Gonçalo Moniz, Fundação Oswaldo Cruz, Salvador, Brazil
| | - Ernesto Ravera
- Departamento de Farmacologia, Escola Paulista de Medicina, Universidade Federal de São Paulo, São Paulo, Brazil
| | - Alberto Cebukin
- Departamento de Farmacologia, Escola Paulista de Medicina, Universidade Federal de São Paulo, São Paulo, Brazil
| | - Renata Bernardes David
- Departamento de Farmacologia, Escola Paulista de Medicina, Universidade Federal de São Paulo, São Paulo, Brazil
| | | | - Tales Mota Machado
- Centro de Integração de Dados e Conhecimentos para a Saúde (CIDACS), Instituto Gonçalo Moniz, Fundação Oswaldo Cruz, Salvador, Brazil
- Diretoria de Tecnologia da Informação, Universidade Federal de Ouro Preto, Ouro Preto, Brazil
| | - Nancy C J Bellei
- Disciplina de Moléstias Infecciosas, Escola Paulista de Medicina, Universidade Federal de São Paulo, São Paulo, Brazil
| | - Viviane Boaventura
- Laboratório de Medicina e Saúde Pública de Precisão (MeSP2), Instituto Gonçalo Moniz, Fundação Oswaldo Cruz, Salvador, Brazil
- Faculdade de Medicina da Bahia, Universidade Federal da Bahia, Salvador, Brazil
| | - Manoel Barral-Netto
- Laboratório de Medicina e Saúde Pública de Precisão (MeSP2), Instituto Gonçalo Moniz, Fundação Oswaldo Cruz, Salvador, Brazil.
- Faculdade de Medicina da Bahia, Universidade Federal da Bahia, Salvador, Brazil.
| | - Soraya S Smaili
- Departamento de Farmacologia, Escola Paulista de Medicina, Universidade Federal de São Paulo, São Paulo, Brazil.
| |
|
26
|
Kalra N, Verma P, Verma S. Advancements in AI based healthcare techniques with FOCUS ON diagnostic techniques. Comput Biol Med 2024; 179:108917. [PMID: 39059212 DOI: 10.1016/j.compbiomed.2024.108917] [Received: 04/16/2024] [Revised: 07/15/2024] [Accepted: 07/15/2024] [Indexed: 07/28/2024]
Abstract
Over the past decade, interest in more precise and efficient healthcare techniques, with special emphasis on diagnostic techniques, has increased. Artificial Intelligence has proved instrumental in the development of various such techniques. Various types of AI, such as ML, NLP, and RPA, are being used; they have streamlined and organised Electronic Health Records (EHR) and aid healthcare providers with decision making and with sample and data analysis. This article also deals with the 3 major categories of diagnostic techniques (imaging-based, pathology-based, and preventive diagnostic techniques) and the changes and modifications that the use of AI has brought to them. Due to such high demand, investment in AI-based healthcare techniques has increased substantially, with a predicted market size of almost 188 billion USD by 2030. In India itself, AI in healthcare is expected to raise the GDP by 25 billion USD by 2028. But there are also several challenges associated with this, such as the unavailability of quality data and the black box issue. One of the major challenges is the ethical considerations and issues arising during the use of medical records, as they are very sensitive documents. Due to this, there are several trust issues associated with the adoption of AI by many organizations. These challenges are also discussed in this article, as is the need for further development of AI-based diagnostic techniques. Alongside this, the production of techniques and devices that are easy to use and simple to incorporate into daily workflows has immense scope in the upcoming times. The increasing scope of Clinical Decision Support Systems, Telemedicine, etc. makes AI a promising field in the healthcare and diagnostics arena. 
In conclusion, despite the various challenges to implementation and usage, the future prospects for AI in healthcare are immense, and work needs to be done to ensure the availability of resources so that a high level of accuracy can be achieved and better health outcomes can be provided to patients. Ethical concerns need to be addressed for smooth implementation and to reduce the burden on developers, as discussed in this narrative review article.
Affiliation(s)
- Nishita Kalra
- Department of Pharmaceutical Chemistry/Analysis, Delhi Pharmaceutical Sciences & Research University, Pushp Vihar, Sector 3, New Delhi, 110017, India
| | - Prachi Verma
- Department of Pharmaceutical Chemistry/Analysis, Delhi Pharmaceutical Sciences & Research University, Pushp Vihar, Sector 3, New Delhi, 110017, India
| | - Surajpal Verma
- Department of Pharmaceutical Chemistry/Analysis, Delhi Pharmaceutical Sciences & Research University, Pushp Vihar, Sector 3, New Delhi, 110017, India.
| |
|
27
|
Cao T, Brady V, Whisenant M, Wang X, Gu Y, Wu H. Toward Reliable Symptom Coding in Electronic Health Records for Symptom Assessment and Research: Identification and Categorization of International Classification of Diseases, Ninth Revision, Clinical Modification Symptom Codes. Comput Inform Nurs 2024; 42:636-647. [PMID: 38968447 PMCID: PMC11377150 DOI: 10.1097/cin.0000000000001146] [Indexed: 07/07/2024]
Abstract
To date, symptom documentation has mostly relied on clinical notes in electronic health records or patient-reported outcomes using disease-specific symptom inventories. To provide a common and precise language for symptom recording, assessment, and research, a comprehensive list of symptom codes is needed. The International Classification of Diseases, Ninth Revision or its clinical modification (International Classification of Diseases, Ninth Revision, Clinical Modification) has a range of codes designated for symptoms, but it does not contain codes for all possible symptoms, and not all codes in that range are symptom related. This study aimed to identify and categorize the first list of International Classification of Diseases, Ninth Revision, Clinical Modification symptom codes for a general population and demonstrate their use to characterize symptoms of patients with type 2 diabetes mellitus in the Cerner database. A list of potential symptom codes was automatically extracted from the Unified Medical Language System Metathesaurus. Two clinical experts in symptom science and diabetes manually reviewed this list to identify and categorize codes as symptoms. A total of 1888 International Classification of Diseases, Ninth Revision, Clinical Modification symptom codes were identified and categorized into 65 categories. The symptom characterization using the newly obtained symptom codes and categories was found to be more reasonable than that using the previous symptom codes and categories on the same Cerner diabetes cohort.
Affiliation(s)
- Tru Cao
  - Author Affiliations: UTHealth Houston School of Public Health (Drs Cao, Wang, and Wu and Mr Gu), UTHealth Houston Cizik School of Nursing (Dr Brady), and The University of Texas MD Anderson Cancer Center (Dr Whisenant)
28
Askar M, Tafavvoghi M, Småbrekke L, Bongo LA, Svendsen K. Using machine learning methods to predict all-cause somatic hospitalizations in adults: A systematic review. PLoS One 2024; 19:e0309175. [PMID: 39178283 PMCID: PMC11343463 DOI: 10.1371/journal.pone.0309175] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2024] [Accepted: 08/06/2024] [Indexed: 08/25/2024] Open
Abstract
AIM In this review, we investigated how Machine Learning (ML) was utilized to predict all-cause somatic hospital admissions and readmissions in adults. METHODS We searched eight databases (PubMed, Embase, Web of Science, CINAHL, ProQuest, OpenGrey, WorldCat, and MedNar) from their inception date to October 2023, and included records that predicted all-cause somatic hospital admissions and readmissions of adults using ML methodology. We used the CHARMS checklist for data extraction, PROBAST for bias and applicability assessment, and TRIPOD for reporting quality. RESULTS We screened 7,543 studies of which 163 full-text records were read and 116 met the review inclusion criteria. Among these, 45 predicted admission, 70 predicted readmission, and one study predicted both. There was a substantial variety in the types of datasets, algorithms, features, data preprocessing steps, evaluation, and validation methods. The most used types of features were demographics, diagnoses, vital signs, and laboratory tests. Area Under the ROC curve (AUC) was the most used evaluation metric. Models trained using boosting tree-based algorithms often performed better compared to others. ML algorithms commonly outperformed traditional regression techniques. Sixteen studies used natural language processing (NLP) of clinical notes for prediction, and all of them yielded good results. The overall adherence to reporting quality was poor in the reviewed studies. Only five percent of models were implemented in clinical practice. The most frequently inadequately addressed methodological aspects were: providing model interpretations on the individual patient level, full code availability, performing external validation, calibrating models, and handling class imbalance. CONCLUSION This review has identified considerable concerns regarding methodological issues and reporting quality in studies investigating ML to predict hospitalizations. To ensure the acceptability of these models in clinical settings, it is crucial to improve the quality of future studies.
Affiliation(s)
- Mohsen Askar
  - Faculty of Health Sciences, Department of Pharmacy, UiT-The Arctic University of Norway, Tromsø, Norway
- Masoud Tafavvoghi
  - Faculty of Science and Technology, Department of Computer Science, UiT-The Arctic University of Norway, Tromsø, Norway
- Lars Småbrekke
  - Faculty of Health Sciences, Department of Pharmacy, UiT-The Arctic University of Norway, Tromsø, Norway
- Lars Ailo Bongo
  - Faculty of Science and Technology, Department of Computer Science, UiT-The Arctic University of Norway, Tromsø, Norway
- Kristian Svendsen
  - Faculty of Health Sciences, Department of Pharmacy, UiT-The Arctic University of Norway, Tromsø, Norway
29
Swinckels L, Bennis FC, Ziesemer KA, Scheerman JFM, Bijwaard H, de Keijzer A, Bruers JJ. The Use of Deep Learning and Machine Learning on Longitudinal Electronic Health Records for the Early Detection and Prevention of Diseases: Scoping Review. J Med Internet Res 2024; 26:e48320. [PMID: 39163096 PMCID: PMC11372333 DOI: 10.2196/48320] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2023] [Revised: 09/29/2023] [Accepted: 04/29/2024] [Indexed: 08/21/2024] Open
Abstract
BACKGROUND Electronic health records (EHRs) contain patients' health information over time, including possible early indicators of disease. However, the increasing amount of data hinders clinicians from using them. There is accumulating evidence suggesting that machine learning (ML) and deep learning (DL) can assist clinicians in analyzing these large-scale EHRs, as algorithms thrive on high volumes of data. Although ML has become well developed, studies mainly focus on engineering but lack medical outcomes. OBJECTIVE This study aims for a scoping review of the evidence on how the use of ML on longitudinal EHRs can support the early detection and prevention of disease. The medical insights and clinical benefits that have been generated were investigated by reviewing applications in a variety of diseases. METHODS This study was conducted according to the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines. A literature search was performed in 2022 in collaboration with a medical information specialist in the following databases: PubMed, Embase, Web of Science Core Collection (Clarivate Analytics), and IEEE Xplore Digital Library and computer science bibliography. Studies were eligible when longitudinal EHRs were used that aimed for the early detection of disease via ML in a prevention context. Studies with a technical focus or using imaging or hospital admission data were beyond the scope of this review. Study screening and selection and data extraction were performed independently by 2 researchers. RESULTS In total, 20 studies were included, mainly published between 2018 and 2022. They showed that a variety of diseases could be detected or predicted, particularly diabetes; kidney diseases; diseases of the circulatory system; and mental, behavioral, and neurodevelopmental disorders. 
Demographics, symptoms, procedures, laboratory test results, diagnoses, medications, and BMI were frequently used EHR data in basic recurrent neural network or long short-term memory techniques. By developing and comparing ML and DL models, medical insights such as a high diagnostic performance, an earlier detection, the most important predictors, and additional health indicators were obtained. A clinical benefit that has been evaluated positively was preliminary screening. If these models are applied in practice, patients might also benefit from personalized health care and prevention, with practical benefits such as workload reduction and policy insights. CONCLUSIONS Longitudinal EHRs proved to be helpful for support in health care. Current ML models on EHRs can support the detection of diseases in terms of accuracy and offer preliminary screening benefits. Regarding the prevention of diseases, ML and specifically DL models can accurately predict or detect diseases earlier than current clinical diagnoses. Adding personally responsible factors allows targeted prevention interventions. While ML models based on textual EHRs are still in the developmental stage, they have high potential to support clinicians and the health care system and improve patient outcomes.
Affiliation(s)
- Laura Swinckels
  - Department of Oral Public Health, Academic Centre for Dentistry Amsterdam (ACTA), University of Amsterdam and Vrije Universiteit, Amsterdam, Netherlands
  - Department Oral Hygiene, Cluster Health, Sports and Welfare, Inholland University of Applied Sciences, Amsterdam, Netherlands
  - Medical Technology Research Group, Cluster Health, Sport and Welfare, Inholland University of Applied Sciences, Haarlem, Netherlands
  - Data Driven Smart Society Research Group, Faculty of Engineering, Design & Computing, Inholland University of Applied Sciences, Alkmaar, Netherlands
- Frank C Bennis
  - Quantitative Data Analytics Group, Department of Computer Science, Vrije Universiteit, Amsterdam, Netherlands
  - Department of Pediatrics, Emma Neuroscience Group, Emma Children's Hospital, Amsterdam UMC, Amsterdam, Netherlands
  - Amsterdam Reproduction and Development Research Institute, Amsterdam, Netherlands
- Kirsten A Ziesemer
  - Medical Library, University Library, Vrije Universiteit, Amsterdam, Netherlands
- Janneke F M Scheerman
  - Department Oral Hygiene, Cluster Health, Sports and Welfare, Inholland University of Applied Sciences, Amsterdam, Netherlands
  - Medical Technology Research Group, Cluster Health, Sport and Welfare, Inholland University of Applied Sciences, Haarlem, Netherlands
- Harmen Bijwaard
  - Medical Technology Research Group, Cluster Health, Sport and Welfare, Inholland University of Applied Sciences, Haarlem, Netherlands
- Ander de Keijzer
  - Data Driven Smart Society Research Group, Faculty of Engineering, Design & Computing, Inholland University of Applied Sciences, Alkmaar, Netherlands
  - Applied Responsible Artificial Intelligence, Avans University of Applied Sciences, Breda, Netherlands
- Josef Jan Bruers
  - Department of Oral Public Health, Academic Centre for Dentistry Amsterdam (ACTA), University of Amsterdam and Vrije Universiteit, Amsterdam, Netherlands
  - Royal Dutch Dental Association (KNMT), Utrecht, Netherlands
30
Albashayreh A, Bandyopadhyay A, Zeinali N, Zhang M, Fan W, Gilbertson White S. Natural Language Processing Accurately Differentiates Cancer Symptom Information in Electronic Health Record Narratives. JCO Clin Cancer Inform 2024; 8:e2300235. [PMID: 39116379 DOI: 10.1200/cci.23.00235] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2023] [Revised: 04/29/2024] [Accepted: 05/30/2024] [Indexed: 08/10/2024] Open
Abstract
PURPOSE Identifying cancer symptoms in electronic health record (EHR) narratives is feasible with natural language processing (NLP). However, more efficient NLP systems are needed to detect various symptoms and distinguish observed symptoms from negated symptoms and medication-related side effects. We evaluated the accuracy of NLP in (1) detecting 14 symptom groups (ie, pain, fatigue, swelling, depressed mood, anxiety, nausea/vomiting, pruritus, headache, shortness of breath, constipation, numbness/tingling, decreased appetite, impaired memory, disturbed sleep) and (2) distinguishing observed symptoms in EHR narratives among patients with cancer. METHODS We extracted 902,508 notes for 11,784 unique patients diagnosed with cancer and developed a gold standard corpus of 1,112 notes labeled for presence or absence of 14 symptom groups. We trained an embeddings-augmented NLP system integrating human and machine intelligence and conventional machine learning algorithms. NLP metrics were calculated on a gold standard corpus subset for testing. RESULTS The interannotator agreement for labeling the gold standard corpus was excellent at 92%. The embeddings-augmented NLP model achieved the best performance (F1 score = 0.877). The highest NLP accuracy was observed in pruritus (F1 score = 0.937) while the lowest accuracy was in swelling (F1 score = 0.787). After classifying the entire data set with embeddings-augmented NLP, we found that 41% of the notes included symptom documentation. Pain was the most documented symptom (29% of all notes) while impaired memory was the least documented (0.7% of all notes). CONCLUSION We illustrated the feasibility of detecting 14 symptom groups in EHR narratives and showed that an embeddings-augmented NLP system outperforms conventional machine learning algorithms in detecting symptom information and differentiating observed symptoms from negated symptoms and medication-related side effects.
Affiliation(s)
- Min Zhang
  - School of Economics and Management, Communication University of China, Beijing, China
- Weiguo Fan
  - Tippie College of Business, University of Iowa, Iowa City, IA
31
Agrawal S, Vagha S. A Comprehensive Review of Artificial Intelligence in Prostate Cancer Care: State-of-the-Art Diagnostic Tools and Future Outlook. Cureus 2024; 16:e66225. [PMID: 39238711 PMCID: PMC11374581 DOI: 10.7759/cureus.66225] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2024] [Accepted: 08/05/2024] [Indexed: 09/07/2024] Open
Abstract
Prostate cancer remains a significant global health challenge, characterized by high incidence and substantial morbidity and mortality rates. Early detection is critical for improving patient outcomes, yet current diagnostic methods have limitations in accuracy and reliability. Artificial intelligence (AI) has emerged as a promising tool to address these challenges in prostate cancer care. AI technologies, including machine learning algorithms and advanced imaging techniques, offer potential solutions to enhance diagnostic accuracy, optimize treatment strategies, and personalize patient care. This review explores the current landscape of AI applications in prostate cancer diagnostics, highlighting state-of-the-art tools and their clinical implications. By synthesizing recent advancements and discussing future directions, the review underscores the transformative potential of AI in revolutionizing prostate cancer diagnosis and management. Ultimately, integrating AI into clinical practice can potentially improve outcomes and quality of life for patients affected by prostate cancer.
Affiliation(s)
- Somya Agrawal
  - Pathology, Jawaharlal Nehru Medical College, Datta Meghe Institute of Higher Education and Research, Wardha, IND
- Sunita Vagha
  - Pathology, Jawaharlal Nehru Medical College, Datta Meghe Institute of Higher Education and Research, Wardha, IND
32
Luo X, Deng Z, Yang B, Luo MY. Pre-trained language models in medicine: A survey. Artif Intell Med 2024; 154:102904. [PMID: 38917600 DOI: 10.1016/j.artmed.2024.102904] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2023] [Revised: 04/15/2024] [Accepted: 06/03/2024] [Indexed: 06/27/2024]
Abstract
With the rapid progress in Natural Language Processing (NLP), Pre-trained Language Models (PLMs) such as BERT, BioBERT, and ChatGPT have shown great potential in various medical NLP tasks. This paper surveys the cutting-edge achievements in applying PLMs to various medical NLP tasks. Specifically, we first briefly introduce PLMs and outline the research on PLMs in medicine. Next, we categorise and discuss the types of tasks in medical NLP, covering text summarisation, question-answering, machine translation, sentiment analysis, named entity recognition, information extraction, medical education, relation extraction, and text mining. For each type of task, we first provide an overview of the basic concepts, the main methodologies, the advantages of applying PLMs, the basic steps of applying PLMs, the datasets for training and testing, and the metrics for task evaluation. Subsequently, a summary of recent important research findings is presented, analysing their motivations, strengths vs weaknesses, similarities vs differences, and discussing potential limitations. Also, we assess the quality and influence of the research reviewed in this paper by comparing the citation count of the papers reviewed and the reputation and impact of the conferences and journals where they are published. Through these indicators, we further identify the research topics that currently attract the most attention. Finally, we look forward to future research directions, including enhancing models' reliability, explainability, and fairness, to promote the application of PLMs in clinical practice. In addition, this survey also collects download links for some model code and the relevant datasets, which are valuable references for researchers applying NLP techniques in medicine and medical professionals seeking to enhance their expertise and healthcare service through AI technology.
Affiliation(s)
- Xudong Luo
  - School of Computer Science and Engineering, Guangxi Normal University, Guilin 541004, China; Guangxi Key Lab of Multi-source Information Mining, Guangxi Normal University, Guilin 541004, China; Key Laboratory of Education Blockchain and Intelligent Technology, Ministry of Education, Guangxi Normal University, Guilin 541004, China.
- Zhiqi Deng
  - School of Computer Science and Engineering, Guangxi Normal University, Guilin 541004, China; Guangxi Key Lab of Multi-source Information Mining, Guangxi Normal University, Guilin 541004, China; Key Laboratory of Education Blockchain and Intelligent Technology, Ministry of Education, Guangxi Normal University, Guilin 541004, China.
- Binxia Yang
  - School of Computer Science and Engineering, Guangxi Normal University, Guilin 541004, China; Guangxi Key Lab of Multi-source Information Mining, Guangxi Normal University, Guilin 541004, China; Key Laboratory of Education Blockchain and Intelligent Technology, Ministry of Education, Guangxi Normal University, Guilin 541004, China.
- Michael Y Luo
  - Emmanuel College, Cambridge University, Cambridge, CB2 3AP, UK.
33
Mora S, Giacobbe DR, Bartalucci C, Viglietti G, Mikulska M, Vena A, Ball L, Robba C, Cappello A, Battaglini D, Brunetti I, Pelosi P, Bassetti M, Giacomini M. Towards the automatic calculation of the EQUAL Candida Score: Extraction of CVC-related information from EMRs of critically ill patients with candidemia in Intensive Care Units. J Biomed Inform 2024; 156:104667. [PMID: 38848885 DOI: 10.1016/j.jbi.2024.104667] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/22/2023] [Revised: 06/01/2024] [Accepted: 06/03/2024] [Indexed: 06/09/2024]
Abstract
OBJECTIVES Candidemia is the most frequent invasive fungal disease and the fourth most frequent bloodstream infection in hospitalized patients. Its optimal management is crucial for improving patients' survival. The quality of candidemia management can be assessed with the EQUAL Candida Score. The objective of this work is to support its automatic calculation by extracting central venous catheter (CVC)-related information from Italian text in clinical notes of electronic medical records. MATERIALS AND METHODS The sample includes 4,787 clinical notes of 108 patients hospitalized between January 2018 and December 2020 in the Intensive Care Units of the IRCCS San Martino Polyclinic Hospital in Genoa (Italy). The devised pipeline exploits natural language processing (NLP) to produce numerical representations of clinical notes used as input of machine learning (ML) algorithms to identify CVC presence and removal. It compares the performances of (i) a rule-based method, (ii) a count-based method together with an ML algorithm, and (iii) a transformer-based model. RESULTS Results, obtained with three different approaches, were evaluated in terms of weighted F1 Score. The random forest classifier showed the highest performance in both tasks, reaching 82.35%. CONCLUSION The present work constitutes a first step towards the automatic calculation of the EQUAL Candida Score from unstructured daily collected data by combining ML and NLP methods. The automatic calculation of the EQUAL Candida Score could provide crucial real-time feedback on the quality of candidemia management, aimed at further improving patients' health.
Affiliation(s)
- Sara Mora
  - Department of Informatics, Bioengineering, Robotics and System Engineering (DIBRIS), University of Genoa, Genoa, Italy; UO Information and Communication Technologies (ICT), IRCCS Ospedale Policlinico San Martino, Genoa, Italy.
- Daniele Roberto Giacobbe
  - Department of Health Sciences (DISSAL), University of Genoa, Genoa, Italy; Clinica Malattie Infettive, IRCCS Ospedale Policlinico San Martino, Genoa, Italy
- Claudia Bartalucci
  - Department of Health Sciences (DISSAL), University of Genoa, Genoa, Italy; Clinica Malattie Infettive, IRCCS Ospedale Policlinico San Martino, Genoa, Italy
- Giulia Viglietti
  - Clinica Malattie Infettive, IRCCS Ospedale Policlinico San Martino, Genoa, Italy
- Malgorzata Mikulska
  - Department of Health Sciences (DISSAL), University of Genoa, Genoa, Italy; Clinica Malattie Infettive, IRCCS Ospedale Policlinico San Martino, Genoa, Italy
- Antonio Vena
  - Department of Health Sciences (DISSAL), University of Genoa, Genoa, Italy; Clinica Malattie Infettive, IRCCS Ospedale Policlinico San Martino, Genoa, Italy
- Lorenzo Ball
  - Department of Surgical Sciences and Integrated Diagnostics (DISC), University of Genoa, Genoa, Italy; Anesthesia and Intensive Care, IRCCS Ospedale Policlinico San Martino, Genoa, Italy
- Chiara Robba
  - Department of Surgical Sciences and Integrated Diagnostics (DISC), University of Genoa, Genoa, Italy; Anesthesia and Intensive Care, IRCCS Ospedale Policlinico San Martino, Genoa, Italy
- Alice Cappello
  - Clinica Malattie Infettive, IRCCS Ospedale Policlinico San Martino, Genoa, Italy
- Denise Battaglini
  - Anesthesia and Intensive Care, IRCCS Ospedale Policlinico San Martino, Genoa, Italy
- Iole Brunetti
  - Anesthesia and Intensive Care, IRCCS Ospedale Policlinico San Martino, Genoa, Italy
- Paolo Pelosi
  - Department of Surgical Sciences and Integrated Diagnostics (DISC), University of Genoa, Genoa, Italy; Anesthesia and Intensive Care, IRCCS Ospedale Policlinico San Martino, Genoa, Italy
- Matteo Bassetti
  - Department of Health Sciences (DISSAL), University of Genoa, Genoa, Italy; Clinica Malattie Infettive, IRCCS Ospedale Policlinico San Martino, Genoa, Italy
- Mauro Giacomini
  - Department of Informatics, Bioengineering, Robotics and System Engineering (DIBRIS), University of Genoa, Genoa, Italy
34
Feher B, Tussie C, Giannobile WV. Applied artificial intelligence in dentistry: emerging data modalities and modeling approaches. Front Artif Intell 2024; 7:1427517. [PMID: 39109324 PMCID: PMC11300434 DOI: 10.3389/frai.2024.1427517] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 05/03/2024] [Accepted: 07/02/2024] [Indexed: 12/01/2024] Open
Abstract
Artificial intelligence (AI) is increasingly applied across all disciplines of medicine, including dentistry. Oral health research is experiencing a rapidly increasing use of machine learning (ML), the branch of AI that identifies inherent patterns in data similarly to how humans learn. In contemporary clinical dentistry, ML supports computer-aided diagnostics, risk stratification, individual risk prediction, and decision support to ultimately improve clinical oral health care efficiency, outcomes, and reduce disparities. Further, ML is progressively used in dental and oral health research, from basic and translational science to clinical investigations. With an ML perspective, this review provides a comprehensive overview of how dental medicine leverages AI for diagnostic, prognostic, and generative tasks. The spectrum of available data modalities in dentistry and their compatibility with various methods of applied AI are presented. Finally, current challenges and limitations as well as future possibilities and considerations for AI application in dental medicine are summarized.
Affiliation(s)
- Balazs Feher
  - Department of Oral Medicine, Infection, and Immunity, Harvard School of Dental Medicine, Boston, MA, United States
  - ITU/WHO/WIPO Global Initiative on Artificial Intelligence for Health, Geneva, Switzerland
  - Department of Oral Surgery, University Clinic of Dentistry, Medical University of Vienna, Vienna, Austria
  - Department of Oral Biology, University Clinic of Dentistry, Medical University of Vienna, Vienna, Austria
- Camila Tussie
  - Department of Oral Medicine, Infection, and Immunity, Harvard School of Dental Medicine, Boston, MA, United States
- William V. Giannobile
  - Department of Oral Medicine, Infection, and Immunity, Harvard School of Dental Medicine, Boston, MA, United States
35
Chen J, Liu L, Huang J, Jiang Y, Yin C, Zhang L, Li Z, Lu H. LSTM-Based Prediction Model for Tuberculosis Among HIV-Infected Patients Using Structured Electronic Medical Records: A Retrospective Machine Learning Study. J Multidiscip Healthc 2024; 17:3557-3573. [PMID: 39070689 PMCID: PMC11283178 DOI: 10.2147/jmdh.s467877] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2024] [Accepted: 07/17/2024] [Indexed: 07/30/2024] Open
Abstract
Background Both HIV and TB are chronic infectious diseases requiring long-term treatment and follow-up, resulting in extensive electronic medical records. With the exponential growth of health and medical big data, effectively extracting and analyzing these data has become the research hotspot. As a fundamental aspect of artificial intelligence, machine learning has been extensively applied in medical research, encompassing diagnosis, treatment, patient monitoring, drug development, and epidemiological investigations. This significantly enhances medical information systems and facilitates the interoperability of medical data. Methods In our study, we analyzed longitudinal data from the electronic health records of 4540 patients, gathered from the National Clinical Research Center for Infectious Diseases in Shenzhen, China, spanning from 2017 to 2021. Initially, we employed the fine-tuned ChatGLM to structure the electronic medical records. Subsequently, we utilized a multi-layer perceptron to classify each patient and determined the presence of tuberculosis in HIV patients. Using machine learning-based natural language processing, we structured these records to build a specialized database for HIV and TB co-infection. We studied the epidemiological characteristics, focusing on incidence patterns, patient characteristics, and influencing factors, to uncover the transmission characteristics of these diseases in Shenzhen. Additionally, we used Long Short-Term Memory to create a predictive model for TB co-infection among HIV patients, based on their medical records. This model predicted the risk of TB co-infection, providing scientific evidence for clinical decision-making and enabling early detection and precise intervention. Results Based on the refined ChatGLM model tailored for structured electronic health records, the accuracy of symptom extraction consistently surpassed 0.95 precision. 
Key symptoms such as diarrhea and normal showed precision rates exceeding 0.90. High scores were also achieved in recall and F1 scores. Among 4540 HIV patients, 758 were diagnosed with concurrent tuberculosis, indicating a 16.7% co-infection rate, while syphilis co-infection affected 25.1%, underscoring the prevalence of concurrent infections among HIV patients. Utilizing electronic health records, a Multilayer Perceptron classifier was developed as a benchmark against Long Short-Term Memory to predict high-risk groups for HIV and tuberculosis co-infections. The Multilayer Perceptron classifier demonstrated predictive ability with AUROC values ranging from 0.616 to 0.682 on the test set, suggesting opportunities for further optimization and generalization despite its accuracy in identifying HIV-TB co-infections. In tuberculosis intelligent diagnosis based on laboratory results, the Long Short-Term Memory showed consistent performance across 5-fold cross-validation, with AUROC values ranging from 0.827 to 0.850, indicating reliability and consistency in tuberculosis prediction. Furthermore, by optimizing classification thresholds, the model achieved an overall accuracy of 81.18% in distinguishing HIV co-infected tuberculosis from simple HIV infection. Conclusion Combining the Multilayer Perceptron classifier with Long Short-Term Memory represented an advanced approach for effectively extracting electronic health records and utilizing it for disease prediction. This underscored the superior performance of deep learning techniques in managing both structured and unstructured medical data. Models leveraging laboratory time-series data demonstrated notably better performance compared to those relying solely on electronic health records for predicting tuberculosis incidence. 
This emphasized the benefits of deep learning in handling intricate medical data and provided valuable insights for healthcare providers exploring the use of deep learning in disease prediction and management.
Affiliation(s)
- Jingfang Chen
  - Faculty of Medicine, Macau University of Science and Technology, Macau, 999078, People’s Republic of China
  - Department of Research and Teaching, The Third People’s Hospital of Shenzhen, Shenzhen, 518112, People’s Republic of China
- Linlin Liu
  - Hengyang Medical School, School of Nursing, University of South China, Hengyang, 421001, People’s Republic of China
- Junxiong Huang
  - Faculty of Medicine, Macau University of Science and Technology, Macau, 999078, People’s Republic of China
- Youli Jiang
  - Department of Neurology, The People’s Hospital of Longhua, Shenzhen, 518109, People’s Republic of China
- Chengliang Yin
  - Faculty of Medicine, Macau University of Science and Technology, Macau, 999078, People’s Republic of China
- Lukun Zhang
  - Department of Infectious Diseases, National Clinical Research Center for Infectious Diseases, The Third People’s Hospital of Shenzhen, Shenzhen, 518112, People’s Republic of China
- Zhihuan Li
  - Faculty of Medicine, Macau University of Science and Technology, Macau, 999078, People’s Republic of China
- Hongzhou Lu
  - Faculty of Medicine, Macau University of Science and Technology, Macau, 999078, People’s Republic of China
  - Department of Infectious Diseases, National Clinical Research Center for Infectious Diseases, The Third People’s Hospital of Shenzhen, Shenzhen, 518112, People’s Republic of China
36
Osman M, Cooper R, Sayer AA, Witham MD. The use of natural language processing for the identification of ageing syndromes including sarcopenia, frailty and falls in electronic healthcare records: a systematic review. Age Ageing 2024; 53:afae135. [PMID: 38970549 PMCID: PMC11227113 DOI: 10.1093/ageing/afae135] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2023] [Indexed: 07/08/2024] Open
Abstract
BACKGROUND Recording and coding of ageing syndromes in hospital records is known to be suboptimal. Natural Language Processing algorithms may be useful to identify diagnoses in electronic healthcare records to improve the recording and coding of these ageing syndromes, but the feasibility and diagnostic accuracy of such algorithms are unclear. METHODS We conducted a systematic review according to a predefined protocol and in line with Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) guidelines. Searches were run from the inception of each database to the end of September 2023 in PubMed, Medline, Embase, CINAHL, ACM digital library, IEEE Xplore and Scopus. Eligible studies were identified via independent review of search results by two coauthors and data extracted from each study to identify the computational method, source of text, testing strategy and performance metrics. Data were synthesised narratively by ageing syndrome and computational method in line with the Studies Without Meta-analysis guidelines. RESULTS From 1030 titles screened, 22 studies were eligible for inclusion. One study focussed on identifying sarcopenia, one frailty, twelve falls, five delirium, five dementia and four incontinence. Sensitivity (57.1%-100%) of algorithms compared with a reference standard was reported in 20 studies, and specificity (84.0%-100%) was reported in only 12 studies. Study design quality was variable with results relevant to diagnostic accuracy not always reported, and few studies undertaking external validation of algorithms. CONCLUSIONS Current evidence suggests that Natural Language Processing algorithms can identify ageing syndromes in electronic health records. However, algorithms require testing in rigorously designed diagnostic accuracy studies with appropriate metrics reported.
Affiliation(s)
- Mo Osman
- AGE Research Group, Translational and Clinical Research Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne, UK
- NIHR Newcastle Biomedical Research Centre, Newcastle upon Tyne NHS Foundation Trust, Cumbria Northumberland Tyne and Wear NHS Foundation Trust and Newcastle University, Newcastle upon Tyne, UK
| | - Rachel Cooper
- AGE Research Group, Translational and Clinical Research Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne, UK
- NIHR Newcastle Biomedical Research Centre, Newcastle upon Tyne NHS Foundation Trust, Cumbria Northumberland Tyne and Wear NHS Foundation Trust and Newcastle University, Newcastle upon Tyne, UK
| | - Avan A Sayer
- AGE Research Group, Translational and Clinical Research Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne, UK
- NIHR Newcastle Biomedical Research Centre, Newcastle upon Tyne NHS Foundation Trust, Cumbria Northumberland Tyne and Wear NHS Foundation Trust and Newcastle University, Newcastle upon Tyne, UK
| | - Miles D Witham
- AGE Research Group, Translational and Clinical Research Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne, UK
- NIHR Newcastle Biomedical Research Centre, Newcastle upon Tyne NHS Foundation Trust, Cumbria Northumberland Tyne and Wear NHS Foundation Trust and Newcastle University, Newcastle upon Tyne, UK
|
37
|
Goryachev SD, Yildirim C, DuMontier C, La J, Dharne M, Gaziano JM, Brophy MT, Munshi NC, Driver JA, Do NV, Fillmore NR. Natural Language Processing Algorithm to Extract Multiple Myeloma Stage From Oncology Notes in the Veterans Affairs Healthcare System. JCO Clin Cancer Inform 2024; 8:e2300197. [PMID: 39038255 PMCID: PMC11371094 DOI: 10.1200/cci.23.00197] [Received: 09/29/2023] [Revised: 03/14/2024] [Accepted: 05/06/2024] [Indexed: 07/24/2024]
Abstract
PURPOSE Stage in multiple myeloma (MM) is an essential measure of disease risk, but its measurement in large databases is often lacking. We aimed to develop and validate a natural language processing (NLP) algorithm to extract oncologists' documentation of stage in the national Veterans Affairs (VA) Healthcare System. METHODS Using nationwide electronic health record (EHR) and cancer registry data from the VA Corporate Data Warehouse, we developed and validated a rule-based NLP algorithm to extract oncologist-determined MM stage. To that end, a clinician annotated MM stage in more than 5,000 short snippets of clinical notes, and annotated MM stage at MM treatment initiation for 200 patients. These were allocated into snippet- and patient-level development and validation sets. We developed MM stage extraction and roll-up algorithms within the development sets. After the algorithms were finalized, we validated them using standard measures in held-out validation sets. RESULTS We developed algorithms for three different MM staging systems that have been in widespread use (Revised International Staging System [R-ISS], International Staging System [ISS], and Durie-Salmon [DS]) and for stage reported without a clearly defined system. Precision and recall were uniformly high for MM stage at the snippet level, ranging from 0.92 to 0.99 for the different MM staging systems. Performance in identifying MM stage at treatment initiation at the patient level was also excellent, with precision of 0.92, 0.96, 0.90, and 0.86 and recall of 0.99, 0.98, 0.94, and 0.92 for R-ISS, ISS, DS, and unclear stage, respectively. CONCLUSION Our MM stage extraction algorithm uses rule-based NLP and data aggregation to accurately measure MM stage documented in oncology notes and pathology reports in VA's national EHR system. It may be adapted to other systems where MM stage is recorded in clinical notes.
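To make the idea of rule-based stage extraction concrete, the following is a minimal sketch of how a snippet-level rule might look. It is not the authors' algorithm: the abstract does not give their rules, so the regular expressions, the function name `extract_stage`, and the `"unclear"` label are all assumptions made for illustration.

```python
import re
from typing import Optional, Tuple

# Hypothetical patterns for illustration only; the published rules are not
# described in the abstract. Stage may be a Roman or Arabic numeral, with an
# optional A/B suffix as used in Durie-Salmon staging.
_STAGE = r"(?P<stage>III|II|I|[123])[AB]?\b"
_PATTERNS = [
    # More specific systems are tried first so "R-ISS" is not read as "ISS".
    ("R-ISS", re.compile(r"\b(?:R-ISS|revised\s+ISS)\s*(?:stage\s*)?" + _STAGE, re.I)),
    ("ISS", re.compile(r"(?<![\w-])ISS\s*(?:stage\s*)?" + _STAGE, re.I)),
    ("DS", re.compile(r"\b(?:Durie[-\s]Salmon|DS)\s*(?:stage\s*)?" + _STAGE, re.I)),
    # Stage mentioned without a named staging system.
    ("unclear", re.compile(r"\bstage\s*" + _STAGE, re.I)),
]
_ROMAN = {"I": 1, "II": 2, "III": 3}


def extract_stage(snippet: str) -> Optional[Tuple[str, int]]:
    """Return (staging system, stage number) for the first rule that matches."""
    for system, pattern in _PATTERNS:
        match = pattern.search(snippet)
        if match:
            raw = match.group("stage").upper()
            stage = _ROMAN[raw] if raw in _ROMAN else int(raw)
            return system, stage
    return None


print(extract_stage("R-ISS stage II multiple myeloma"))
print(extract_stage("no staging documented"))
```

A sketch like this would still misread sentences such as "at this stage I recommend", which is one reason a real system needs annotated development and validation sets of the kind the abstract describes.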
Affiliation(s)
- Sergey D. Goryachev
- Massachusetts Veterans Epidemiology Research and Information Center (MAVERIC), Boston, MA
- VA Boston Healthcare System, Boston, MA
- VA Boston Cooperative Studies Program, Boston, MA
| | - Cenk Yildirim
- Massachusetts Veterans Epidemiology Research and Information Center (MAVERIC), Boston, MA
- VA Boston Healthcare System, Boston, MA
- VA Boston Cooperative Studies Program, Boston, MA
| | - Clark DuMontier
- New England Geriatrics Research, Education and Clinical Center, VA Boston Healthcare System, Boston, MA
- Division of Aging, Brigham and Women's Hospital, Boston, MA
- Division of Population Sciences, Dana-Farber Cancer Institute, Boston, MA
- Harvard Medical School, Boston, MA
| | - Jennifer La
- Massachusetts Veterans Epidemiology Research and Information Center (MAVERIC), Boston, MA
- VA Boston Healthcare System, Boston, MA
- Harvard Medical School, Boston, MA
| | | | - J. Michael Gaziano
- Massachusetts Veterans Epidemiology Research and Information Center (MAVERIC), Boston, MA
- VA Boston Healthcare System, Boston, MA
- Division of Aging, Brigham and Women's Hospital, Boston, MA
- Harvard Medical School, Boston, MA
| | - Mary T. Brophy
- Massachusetts Veterans Epidemiology Research and Information Center (MAVERIC), Boston, MA
- VA Boston Healthcare System, Boston, MA
- VA Boston Cooperative Studies Program, Boston, MA
- Boston University School of Medicine, Boston, MA
| | - Nikhil C. Munshi
- VA Boston Healthcare System, Boston, MA
- Harvard Medical School, Boston, MA
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA
| | - Jane A. Driver
- New England Geriatrics Research, Education and Clinical Center, VA Boston Healthcare System, Boston, MA
- Division of Aging, Brigham and Women's Hospital, Boston, MA
- Harvard Medical School, Boston, MA
| | - Nhan V. Do
- Massachusetts Veterans Epidemiology Research and Information Center (MAVERIC), Boston, MA
- VA Boston Healthcare System, Boston, MA
- VA Boston Cooperative Studies Program, Boston, MA
- Boston University School of Medicine, Boston, MA
| | - Nathanael R. Fillmore
- Massachusetts Veterans Epidemiology Research and Information Center (MAVERIC), Boston, MA
- VA Boston Healthcare System, Boston, MA
- Harvard Medical School, Boston, MA
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA
|
38
|
Wang H, Alanis N, Haygood L, Swoboda TK, Hoot N, Phillips D, Knowles H, Stinson SA, Mehta P, Sambamoorthi U. Using natural language processing in emergency medicine health service research: A systematic review and meta-analysis. Acad Emerg Med 2024; 31:696-706. [PMID: 38757352 PMCID: PMC11246236 DOI: 10.1111/acem.14937] [Received: 02/28/2024] [Revised: 04/15/2024] [Accepted: 04/17/2024] [Indexed: 05/18/2024]
Abstract
OBJECTIVES Natural language processing (NLP) represents one of the adjunct technologies within artificial intelligence and machine learning, creating structure out of unstructured data. This study aims to assess the performance of employing NLP to identify and categorize unstructured data within the emergency medicine (EM) setting. METHODS We systematically searched publications related to EM research and NLP across databases including MEDLINE, Embase, Scopus, CENTRAL, and ProQuest Dissertations & Theses Global. Independent reviewers screened, reviewed, and evaluated article quality and bias. NLP usage was categorized into syndromic surveillance, radiologic interpretation, and identification of specific diseases/events/syndromes, with respective sensitivity analysis reported. Performance metrics for NLP usage were calculated and the overall area under the summary of receiver operating characteristic curve (SROC) was determined. RESULTS A total of 27 studies underwent meta-analysis. Findings indicated an overall mean sensitivity (recall) of 82%-87%, specificity of 95%, with the area under the SROC at 0.96 (95% CI 0.94-0.98). Optimal performance using NLP was observed in radiologic interpretation, demonstrating an overall mean sensitivity of 93% and specificity of 96%. CONCLUSIONS Our analysis revealed a generally favorable performance accuracy in using NLP within EM research, particularly in the realm of radiologic interpretation. Consequently, we advocate for the adoption of NLP-based research to augment EM health care management.
Affiliation(s)
- Hao Wang
- Department of Emergency Medicine, JPS Health Network, 1500 S. Main St., Fort Worth, TX 76104
| | - Naomi Alanis
- Department of Emergency Medicine, JPS Health Network, 1500 S. Main St., Fort Worth, TX 76104
| | - Laura Haygood
- Health Sciences Librarian for Public Health, Brown University, 69 Brown St., Providence, RI 02912
| | - Thomas K. Swoboda
- Department of Emergency Medicine, The Valley Health System, Touro University Nevada School of Osteopathic Medicine, 657 N. Town Center Drive, Las Vegas, NV 89144
| | - Nathan Hoot
- Department of Emergency Medicine, JPS Health Network, 1500 S. Main St., Fort Worth, TX 76104
| | - Daniel Phillips
- Department of Emergency Medicine, JPS Health Network, 1500 S. Main St., Fort Worth, TX 76104
| | - Heidi Knowles
- Department of Emergency Medicine, JPS Health Network, 1500 S. Main St., Fort Worth, TX 76104
| | - Sara Ann Stinson
- Mary Couts Burnett Library, Burnett School of Medicine at Texas Christian University, 2800 S. University Dr., Fort Worth, TX 76109
| | - Prachi Mehta
- Department of Emergency Medicine, JPS Health Network, 1500 S. Main St., Fort Worth, TX 76104
| | - Usha Sambamoorthi
- College of Pharmacy, University of North Texas Health Science Center, 3500 Camp Bowie Blvd, Fort Worth, TX 76107
|
39
|
Parsa S, Somani S, Dudum R, Jain SS, Rodriguez F. Artificial Intelligence in Cardiovascular Disease Prevention: Is it Ready for Prime Time? Curr Atheroscler Rep 2024; 26:263-272. [PMID: 38780665 PMCID: PMC11457745 DOI: 10.1007/s11883-024-01210-w] [Accepted: 05/08/2024] [Indexed: 05/25/2024]
Abstract
PURPOSE OF REVIEW This review evaluates how Artificial Intelligence (AI) enhances atherosclerotic cardiovascular disease (ASCVD) risk assessment, allows for opportunistic screening, and improves adherence to guidelines through the analysis of unstructured clinical data and patient-generated data. Additionally, it discusses strategies for integrating AI into clinical practice in preventive cardiology. RECENT FINDINGS AI models have shown superior performance in personalized ASCVD risk evaluations compared to traditional risk scores. These models now support automated detection of ASCVD risk markers, including coronary artery calcium (CAC), across various imaging modalities such as dedicated ECG-gated CT scans, chest X-rays, mammograms, coronary angiography, and non-gated chest CT scans. Moreover, large language model (LLM) pipelines are effective in identifying and addressing gaps and disparities in ASCVD preventive care, and can also enhance patient education. AI applications are proving invaluable in preventing and managing ASCVD and are primed for clinical use, provided they are implemented within well-regulated, iterative clinical pathways.
Affiliation(s)
- Shyon Parsa
- Department of Medicine, Stanford University, Stanford, California, USA
| | - Sulaiman Somani
- Department of Medicine, Stanford University, Stanford, California, USA
| | - Ramzi Dudum
- Division of Cardiovascular Medicine and Cardiovascular Institute, Stanford University, Stanford, CA, USA
| | - Sneha S Jain
- Division of Cardiovascular Medicine and Cardiovascular Institute, Stanford University, Stanford, CA, USA
| | - Fatima Rodriguez
- Division of Cardiovascular Medicine and Cardiovascular Institute, Stanford University, Stanford, CA, USA.
- Center for Digital Health, Stanford University, Stanford, California, USA.
|
40
|
Darer JD, Pesa J, Choudhry Z, Batista AE, Parab P, Yang X, Govindarajan R. Characterizing Myasthenia Gravis Symptoms, Exacerbations, and Crises From Neurologist's Clinical Notes Using Natural Language Processing. Cureus 2024; 16:e65792. [PMID: 39219871 PMCID: PMC11361825 DOI: 10.7759/cureus.65792] [Accepted: 07/29/2024] [Indexed: 09/04/2024]
Abstract
Background Myasthenia gravis (MG) is a rare, autoantibody neuromuscular disorder characterized by fatigable weakness. Real-world evidence based on administrative and structured datasets regarding MG may miss important details related to the clinical encounter. Examination of free-text clinical progress notes has the potential to illuminate aspects of MG care. Objective The primary objective was to examine and characterize neurologist progress notes in the care of individuals with MG regarding the prevalence of documentation of clinical subtypes, antibody status, symptomatology, and MG deteriorations, including exacerbations and crises. The secondary objectives were to categorize MG deteriorations into practical, objective states as well as examine potential sources of clinical inertia in MG care. Methods We performed a retrospective, cross-sectional analysis of de-identified neurologist clinical notes from 2017 to 2022. A qualitative analysis of physician descriptions of MG deteriorations and a discussion of risks in MG care (risk for adverse effects, risk for clinical decompensation, etc.) was performed. Results Of the 3,085 individuals with MG, clinical subtypes and antibody status identified included gMG (n = 400; 13.0%), ocular MG (n = 253; 8.2%), MG unspecified (2,432; 78.8%), seropositivity for acetylcholine receptor antibody (n = 441; 14.3%), and MuSK antibody (n = 29; 0.9%). The most common gMG manifestations were dysphagia (n = 712; 23.0%), dyspnea (n = 626; 20.3%), and dysarthria (n = 514; 16.7%). In MG crisis patients, documentation of difficulties with MG standard therapies was common (n = 62; 45.2%). The qualitative analysis of MG deterioration types includes symptom fluctuation, symptom worsening with treatment intensification, MG deterioration with rescue therapy, and MG crisis. Qualitative analysis of MG-related risks included the toxicity of new therapies and concern for worsening MG because of changing therapies. 
Conclusions This study of neurologist progress notes demonstrates the potential for real-world evidence generation in the care of individuals with MG. MG patients suffer fluctuating symptomatology and a spectrum of clinical deteriorations. Adverse effects of MG therapies are common, highlighting the need for effective, less toxic treatments.
Affiliation(s)
| | - Jacqueline Pesa
- Real World Value and Evidence, Immunology, Janssen Scientific Affairs, Titusville, USA
| | - Zia Choudhry
- Rare Antibody Diseases, Janssen Scientific Affairs, Titusville, USA
| | | | - Purva Parab
- Biostatistics, Health Analytics, Clarksville, USA
| | - Xiaoyun Yang
- Biostatistics, Health Analytics, Clarksville, USA
|
41
|
Wieland-Jorna Y, van Kooten D, Verheij RA, de Man Y, Francke AL, Oosterveld-Vlug MG. Natural language processing systems for extracting information from electronic health records about activities of daily living. A systematic review. JAMIA Open 2024; 7:ooae044. [PMID: 38798774 PMCID: PMC11126158 DOI: 10.1093/jamiaopen/ooae044] [Received: 01/03/2024] [Revised: 03/21/2024] [Accepted: 05/07/2024] [Indexed: 05/29/2024]
Abstract
Objective Natural language processing (NLP) can enhance research on activities of daily living (ADL) by extracting structured information from unstructured electronic health record (EHR) notes. This review aims to give insight into the state of the art, usability, and performance of NLP systems to extract information on ADL from EHRs. Materials and Methods A systematic review was conducted based on searches in PubMed, Embase, CINAHL, Web of Science, and Scopus. Studies published between 2017 and 2022 were selected based on predefined eligibility criteria. Results The review identified 22 studies. Most studies (65%) used NLP for classifying unstructured EHR data on 1 or 2 ADL. Deep learning, combined with a rule-based method or machine learning, was the approach most commonly used. NLP systems varied widely in terms of the pre-processing and algorithms. Common performance evaluation methods were cross-validation and train/test datasets, with F1, precision, and sensitivity as the most frequently reported evaluation metrics. Most studies reported relatively high overall scores on the evaluation metrics. Discussion NLP systems are valuable for the extraction of unstructured EHR data on ADL. However, comparing the performance of NLP systems is difficult due to the diversity of the studies and challenges related to the dataset, including restricted access to EHR data, inadequate documentation, lack of granularity, and small datasets. Conclusion This systematic review indicates that NLP is promising for deriving information on ADL from unstructured EHR notes. However, which NLP system performs best depends on the characteristics of the dataset, the research question, and the type of ADL.
Affiliation(s)
- Yvonne Wieland-Jorna
- Netherlands Institute for Health Services Research (Nivel), Utrecht, Postbus 1568, 3500 BN, The Netherlands
- Tranzo, School of Social Sciences and Behavioural Research, Tilburg University, Tilburg, Postbus 90153, 5000 LE, The Netherlands
| | - Daan van Kooten
- Netherlands Institute for Health Services Research (Nivel), Utrecht, Postbus 1568, 3500 BN, The Netherlands
| | - Robert A Verheij
- Netherlands Institute for Health Services Research (Nivel), Utrecht, Postbus 1568, 3500 BN, The Netherlands
- Tranzo, School of Social Sciences and Behavioural Research, Tilburg University, Tilburg, Postbus 90153, 5000 LE, The Netherlands
| | - Yvonne de Man
- Netherlands Institute for Health Services Research (Nivel), Utrecht, Postbus 1568, 3500 BN, The Netherlands
| | - Anneke L Francke
- Netherlands Institute for Health Services Research (Nivel), Utrecht, Postbus 1568, 3500 BN, The Netherlands
- Department of Public and Occupational Health, Location Vrije Universiteit Amsterdam, Amsterdam UMC, Amsterdam, Postbus 7057, 1007 MB, The Netherlands
| | - Mariska G Oosterveld-Vlug
- Netherlands Institute for Health Services Research (Nivel), Utrecht, Postbus 1568, 3500 BN, The Netherlands
|
42
|
Smith SJ, Moorin R, Taylor K, Newton J, Smith S. Collecting routine and timely cancer stage at diagnosis by implementing a cancer staging tiered framework: the Western Australian Cancer Registry experience. BMC Health Serv Res 2024; 24:770. [PMID: 38943091 PMCID: PMC11214229 DOI: 10.1186/s12913-024-11224-4] [Received: 03/22/2024] [Accepted: 06/20/2024] [Indexed: 07/01/2024]
Abstract
BACKGROUND Current processes collecting cancer stage data in population-based cancer registries (PBCRs) lack standardisation, resulting in difficulty utilising diverse data sources and incomplete, low-quality data. Implementing a cancer staging tiered framework aims to improve stage collection and facilitate inter-PBCR benchmarking. OBJECTIVE Demonstrate the application of a cancer staging tiered framework in the Western Australian Cancer Staging Project to establish a standardised method for collecting cancer stage at diagnosis data in PBCRs. METHODS The tiered framework, developed in collaboration with a Project Advisory Group and applied to breast, colorectal, and melanoma cancers, provides business rules - procedures for stage collection. Tier 1 represents the highest staging level, involving complete American Joint Committee on Cancer (AJCC) tumour-node-metastasis (TNM) data collection and other critical staging information. Tier 2 (registry-derived stage) relies on supplementary data, including hospital admission data, to make assumptions based on data availability. Tier 3 (pathology stage) solely uses pathology reports. FINDINGS The tiered framework promotes flexible utilisation of staging data, recognising various levels of data completeness. Tier 1 is suitable for all purposes, including clinical and epidemiological applications. Tiers 2 and 3 are recommended for epidemiological analysis alone. Lower tiers provide valuable insights into disease patterns, risk factors, and overall disease burden for public health planning and policy decisions. Capture of staging at each tier depends on data availability, with potential shifts to higher tiers as new data sources are acquired. 
CONCLUSIONS The tiered framework offers a dynamic approach for PBCRs to record stage at diagnosis, promoting consistency in population-level staging data and enabling practical use for benchmarking across jurisdictions, public health planning, policy development, epidemiological analyses, and assessing cancer outcomes. Evolution with staging classifications and data variable changes will futureproof the tiered framework. Its adaptability fosters continuous refinement of data collection processes and encourages improvements in data quality.
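The tier definitions in this abstract can be read as a simple decision rule over the data sources available for a case. The sketch below shows one way to express that reading. The class and field names are hypothetical, and treating the tiers as a strict "highest available wins" cascade is an assumption, since the framework's actual business rules are not given in the abstract.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class CaseSources:
    """Data available for one registry case (hypothetical field names)."""
    complete_ajcc_tnm: bool   # complete AJCC TNM data and other critical staging information
    supplementary_data: bool  # e.g. hospital admission data
    pathology_report: bool    # pathology report only


def assign_tier(case: CaseSources) -> Optional[int]:
    """Return the highest tier the available data supports, or None if unstageable."""
    if case.complete_ajcc_tnm:
        return 1  # Tier 1: full staging
    if case.supplementary_data:
        return 2  # Tier 2: registry-derived stage
    if case.pathology_report:
        return 3  # Tier 3: pathology stage
    return None


def recommended_uses(tier: int) -> Set[str]:
    """Uses named in the abstract; Tier 1 is described as suitable for all purposes."""
    if tier == 1:
        return {"clinical", "epidemiological"}
    if tier in (2, 3):
        return {"epidemiological"}
    raise ValueError(f"unknown tier: {tier}")


case = CaseSources(complete_ajcc_tnm=False, supplementary_data=True, pathology_report=True)
print(assign_tier(case), recommended_uses(assign_tier(case)))
```

Because the tier is recomputed from the available sources, a case would move to a higher tier when a new data source is acquired, which matches the shift the abstract anticipates.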
Affiliation(s)
- Shantelle J Smith
- School of Population Health, Curtin University, Perth, WA, Australia.
- Faculty of Health Sciences, Curtin Health Innovation Research Institute, Curtin University, Bentley, WA, Australia.
| | - Rachael Moorin
- School of Population Health, Curtin University, Perth, WA, Australia
- Faculty of Health Sciences, Curtin Health Innovation Research Institute, Curtin University, Bentley, WA, Australia
- School of Population and Global Health, The University of Western Australia, Crawley, WA, Australia
| | - Karen Taylor
- Cancer Network WA, North Metropolitan Health Service, Perth, WA, Australia
| | - Jade Newton
- School of Population Health, Curtin University, Perth, WA, Australia
- Faculty of Health Sciences, Curtin Health Innovation Research Institute, Curtin University, Bentley, WA, Australia
| | - Stephanie Smith
- School of Population Health, Curtin University, Perth, WA, Australia
- Curtin Medical School, Curtin University, Perth, WA, Australia
|
43
|
Wang M, Vijayaraghavan A, Beck T, Posma JM. Vocabulary Matters: An Annotation Pipeline and Four Deep Learning Algorithms for Enzyme Named Entity Recognition. J Proteome Res 2024; 23:1915-1925. [PMID: 38733346 PMCID: PMC11165580 DOI: 10.1021/acs.jproteome.3c00367] [Received: 06/20/2023] [Revised: 01/30/2024] [Accepted: 04/29/2024] [Indexed: 05/13/2024]
Abstract
Enzymes are indispensable in many biological processes, and with biomedical literature growing exponentially, effective literature review becomes increasingly challenging. Natural language processing methods offer solutions to streamline this process. This study aims to develop an annotated enzyme corpus for training and evaluating enzyme named entity recognition (NER) models. A novel pipeline, combining dictionary matching and rule-based keyword searching, automatically annotated enzyme entities in >4800 full-text publications. Four deep learning NER models were created with different vocabularies (BioBERT/SciBERT) and architectures (BiLSTM/transformer) and evaluated on 526 manually annotated full-text publications. The annotation pipeline achieved an F1-score of 0.86 (precision = 1.00, recall = 0.76), surpassed by fine-tuned transformers for F1-score (BioBERT: 0.89, SciBERT: 0.88) and recall (0.86) with BiLSTM models having higher precision (0.94) than transformers (0.92). The annotation pipeline runs in seconds on standard laptops with almost perfect precision, but was outperformed by fine-tuned transformers in terms of F1-score and recall, demonstrating generalizability beyond the training data. In comparison, SciBERT-based models exhibited higher precision, and BioBERT-based models exhibited higher recall, highlighting the importance of vocabulary and architecture. These models, representing the first enzyme NER algorithms, enable more effective enzyme text mining and information extraction. Codes for automated annotation and model generation are available from https://github.com/omicsNLP/enzymeNER and https://zenodo.org/doi/10.5281/zenodo.10581586.
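The reported F1-score of the annotation pipeline follows directly from its precision and recall, since F1 is their harmonic mean. The short sketch below reproduces that arithmetic; the function name `f1_score` is ours and is not taken from the authors' code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract for the annotation pipeline.
pipeline_f1 = f1_score(precision=1.00, recall=0.76)
print(round(pipeline_f1, 2))  # 0.86
```

The harmonic mean is pulled toward the lower of the two values, which is why perfect precision with a recall of 0.76 still yields an F1 below the fine-tuned transformers.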
Affiliation(s)
- Meiqi Wang
- Section of Bioinformatics, Division of Systems Medicine, Department of Metabolism, Digestion and Reproduction, Imperial College London, London W12 0NN, U.K.
| | - Avish Vijayaraghavan
- Section of Bioinformatics, Division of Systems Medicine, Department of Metabolism, Digestion and Reproduction, Imperial College London, London W12 0NN, U.K.
- UKRI Centre for Doctoral Training in AI for Healthcare, Department of Computing, Imperial College London, London SW7 2AZ, U.K.
| | - Tim Beck
- School of Medicine, University of Nottingham, Biodiscovery Institute, Nottingham NG7 2RD, U.K.
- Health Data Research (HDR) U.K., London NW1 2BE, U.K.
| | - Joram M. Posma
- Section of Bioinformatics, Division of Systems Medicine, Department of Metabolism, Digestion and Reproduction, Imperial College London, London W12 0NN, U.K.
- Health Data Research (HDR) U.K., London NW1 2BE, U.K.
|
44
|
Iscoe M, Socrates V, Gilson A, Chi L, Li H, Huang T, Kearns T, Perkins R, Khandjian L, Taylor RA. Identifying signs and symptoms of urinary tract infection from emergency department clinical notes using large language models. Acad Emerg Med 2024; 31:599-610. [PMID: 38567658 DOI: 10.1111/acem.14883] [Received: 10/12/2023] [Revised: 01/24/2024] [Accepted: 01/24/2024] [Indexed: 04/04/2024]
Abstract
BACKGROUND Natural language processing (NLP) tools including recently developed large language models (LLMs) have myriad potential applications in medical care and research, including the efficient labeling and classification of unstructured text such as electronic health record (EHR) notes. This opens the door to large-scale projects that rely on variables that are not typically recorded in a structured form, such as patient signs and symptoms. OBJECTIVES This study is designed to acquaint the emergency medicine research community with the foundational elements of NLP, highlighting essential terminology, annotation methodologies, and the intricacies involved in training and evaluating NLP models. Symptom characterization is critical to urinary tract infection (UTI) diagnosis, but identification of symptoms from the EHR has historically been challenging, limiting large-scale research, public health surveillance, and EHR-based clinical decision support. We therefore developed and compared two NLP models to identify UTI symptoms from unstructured emergency department (ED) notes. METHODS The study population consisted of patients aged ≥ 18 who presented to an ED in a northeastern U.S. health system between June 2013 and August 2021 and had a urinalysis performed. We annotated a random subset of 1250 ED clinician notes from these visits for a list of 17 UTI symptoms. We then developed two task-specific LLMs to perform the task of named entity recognition: a convolutional neural network-based model (SpaCy) and a transformer-based model designed to process longer documents (Clinical Longformer). Models were trained on 1000 notes and tested on a holdout set of 250 notes. We compared model performance (precision, recall, F1 measure) at identifying the presence or absence of UTI symptoms at the note level. RESULTS A total of 8135 entities were identified in 1250 notes; 83.6% of notes included at least one entity. 
Overall F1 measure for note-level symptom identification weighted by entity frequency was 0.84 for the SpaCy model and 0.88 for the Longformer model. F1 measure for identifying presence or absence of any UTI symptom in a clinical note was 0.96 (232/250 correctly classified) for the SpaCy model and 0.98 (240/250 correctly classified) for the Longformer model. CONCLUSIONS The study demonstrated the utility of LLMs and transformer-based models in particular for extracting UTI symptoms from unstructured ED clinical notes; models were highly accurate for detecting the presence or absence of any UTI symptom on the note level, with variable performance for individual symptoms.
Affiliation(s)
- Mark Iscoe
- Department of Emergency Medicine, Yale School of Medicine, New Haven, Connecticut, USA
- Section for Biomedical Informatics and Data Science, Yale University School of Medicine, New Haven, Connecticut, USA
| | - Vimig Socrates
- Section for Biomedical Informatics and Data Science, Yale University School of Medicine, New Haven, Connecticut, USA
- Program of Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut, USA
| | - Aidan Gilson
- Yale School of Medicine, New Haven, Connecticut, USA
| | - Ling Chi
- Department of Biostatistics, Yale School of Public Health, New Haven, Connecticut, USA
| | - Huan Li
- Program of Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut, USA
| | - Thomas Huang
- Yale School of Medicine, New Haven, Connecticut, USA
| | - Thomas Kearns
- Department of Emergency Medicine, Yale School of Medicine, New Haven, Connecticut, USA
| | - Rachelle Perkins
- Department of Emergency Medicine, Yale School of Medicine, New Haven, Connecticut, USA
| | - Laura Khandjian
- Department of Emergency Medicine, Yale School of Medicine, New Haven, Connecticut, USA
| | - R Andrew Taylor
- Department of Emergency Medicine, Yale School of Medicine, New Haven, Connecticut, USA
- Section for Biomedical Informatics and Data Science, Yale University School of Medicine, New Haven, Connecticut, USA
|
45
|
Petit-Jean T, Gérardin C, Berthelot E, Chatellier G, Frank M, Tannier X, Kempf E, Bey R. Collaborative and privacy-enhancing workflows on a clinical data warehouse: an example developing natural language processing pipelines to detect medical conditions. J Am Med Inform Assoc 2024; 31:1280-1290. [PMID: 38573195 PMCID: PMC11105139 DOI: 10.1093/jamia/ocae069] [Received: 09/11/2023] [Revised: 02/28/2024] [Accepted: 03/13/2024] [Indexed: 04/05/2024]
Abstract
OBJECTIVE To develop and validate a natural language processing (NLP) pipeline that detects 18 conditions in French clinical notes, including 16 comorbidities of the Charlson index, while exploring a collaborative and privacy-enhancing workflow. MATERIALS AND METHODS The detection pipeline relied on rule-based and machine learning algorithms for named entity recognition and entity qualification, respectively. We used a large language model pre-trained on millions of clinical notes along with annotated clinical notes in the context of 3 cohort studies related to oncology, cardiology, and rheumatology. The overall workflow was conceived to foster collaboration between studies while respecting the privacy constraints of the data warehouse. We estimated the added values of the advanced technologies and of the collaborative setting. RESULTS The pipeline reached macro-averaged F1-score, positive predictive value, sensitivity, and specificity of 95.7 (95%CI 94.5-96.3), 95.4 (95%CI 94.0-96.3), 96.0 (95%CI 94.0-96.7), and 99.2 (95%CI 99.0-99.4), respectively. F1-scores were superior to those observed using alternative technologies or non-collaborative settings. The models were shared through a secured registry. CONCLUSIONS We demonstrated that a community of investigators working on a common clinical data warehouse could efficiently and securely collaborate to develop, validate and use sensitive artificial intelligence models. In particular, we provided an efficient and robust NLP pipeline that detects conditions mentioned in clinical notes.
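The results above are macro-averaged, meaning each condition's score is computed separately and the scores are then averaged with equal weight, so rare conditions count as much as common ones. The sketch below illustrates this for F1. The counts are toy values invented for the example and are not data from the study, and the function names are ours.

```python
from typing import Dict, Tuple


def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """F1 for one condition from true positives, false positives, false negatives."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(counts: Dict[str, Tuple[int, int, int]]) -> float:
    """Unweighted mean of the per-condition F1-scores."""
    if not counts:
        raise ValueError("at least one condition is required")
    scores = [f1_from_counts(*c) for c in counts.values()]
    return sum(scores) / len(scores)


# Toy (tp, fp, fn) counts, invented for illustration only.
toy_counts = {
    "diabetes": (90, 10, 10),  # common condition, per-condition F1 = 0.9
    "dementia": (8, 2, 2),     # rare condition, per-condition F1 = 0.8
}
print(macro_f1(toy_counts))  # approximately 0.85
```

Pooling the same toy counts before computing F1 (micro-averaging) would give about 0.89, because the common condition would dominate; this is the difference that the word "macro-averaged" signals.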
Affiliation(s)
- Thomas Petit-Jean
- Innovation and Data Unit, IT Department, Assistance Publique-Hôpitaux de Paris, Paris, 75012, France
| | - Christel Gérardin
- Innovation and Data Unit, IT Department, Assistance Publique-Hôpitaux de Paris, Paris, 75012, France
- Institut Pierre-Louis d’Epidémiologie et de Santé Publique, INSERM, Sorbonne Université, Paris, 75012, France
| | - Emmanuelle Berthelot
- Department of Cardiology, Hôpital Bicêtre, Assistance Publique-Hôpitaux de Paris, Le Kremlin Bicêtre, 94270, France
| | - Gilles Chatellier
- Innovation and Data Unit, IT Department, Assistance Publique-Hôpitaux de Paris, Paris, 75012, France
- Department of Medical Informatics, Assistance Publique-Hôpitaux de Paris, Centre-Université de Paris (APHP-CUP), Université de Paris, Paris, 75015, France
| | - Marie Frank
- Department of Medical Informatics, Hôpitaux Universitaires Paris-Saclay, Assistance Publique-Hôpitaux de Paris, Le Kremlin-Bicêtre, 94270, France
| | - Xavier Tannier
- Laboratoire d'Informatique Médicale et d'Ingénierie des Connaissances pour la e-Santé (LIMICS), INSERM, Université Sorbonne Paris Nord, Sorbonne Université, Paris, 75005, France
| | - Emmanuelle Kempf
- Laboratoire d'Informatique Médicale et d'Ingénierie des Connaissances pour la e-Santé (LIMICS), INSERM, Université Sorbonne Paris Nord, Sorbonne Université, Paris, 75005, France
- Department of Medical Oncology, Henri Mondor and Albert Chenevier Teaching Hospital, Assistance Publique-Hôpitaux de Paris, Créteil, 94000, France
| | - Romain Bey
- Innovation and Data Unit, IT Department, Assistance Publique-Hôpitaux de Paris, Paris, 75012, France
|
46
|
Lee K, Paek H, Huang LC, Hilton CB, Datta S, Higashi J, Ofoegbu N, Wang J, Rubinstein SM, Cowan AJ, Kwok M, Warner JL, Xu H, Wang X. SEETrials: Leveraging Large Language Models for Safety and Efficacy Extraction in Oncology Clinical Trials. medRxiv 2024:2024.01.18.24301502. [PMID: 38798420 PMCID: PMC11118548 DOI: 10.1101/2024.01.18.24301502] [Indexed: 05/29/2024]
Abstract
Background Initial insights into oncology clinical trial outcomes are often gleaned manually from conference abstracts. We aimed to develop an automated system to extract safety and efficacy information from study abstracts with high precision and fine granularity, transforming them into computable data for timely clinical decision-making. Methods We collected clinical trial abstracts from key conferences and PubMed (2012-2023). The SEETrials system was developed with four modules: preprocessing, prompt modeling, knowledge ingestion and postprocessing. We evaluated the system's performance qualitatively and quantitatively and assessed its generalizability across different cancer types: multiple myeloma (MM), breast, lung, lymphoma, and leukemia. Furthermore, the efficacy and safety of innovative therapies, including CAR-T, bispecific antibodies, and antibody-drug conjugates (ADC), in MM were analyzed across a large scale of clinical trial studies. Results SEETrials achieved high precision (0.958), recall (sensitivity) (0.944), and F1 score (0.951) across 70 data elements present in the MM trial studies. Generalizability tests on four additional cancers yielded precision, recall, and F1 scores within the 0.966-0.986 range. Variation in the distribution of safety and efficacy-related entities was observed across diverse therapies, with certain adverse events more common in specific treatments. Comparative performance analysis using overall response rate (ORR) and complete response (CR) highlighted differences among therapies: CAR-T (ORR: 88%, 95% CI: 84-92%; CR: 95%, 95% CI: 53-66%), bispecific antibodies (ORR: 64%, 95% CI: 55-73%; CR: 27%, 95% CI: 16-37%), and ADC (ORR: 51%, 95% CI: 37-65%; CR: 26%, 95% CI: 1-51%). Notable study heterogeneity was identified (>75% I² heterogeneity index scores) across several outcome entities analyzed within therapy subgroups.
Conclusion SEETrials demonstrated highly accurate data extraction and versatility across different therapeutics and various cancer domains. Its automated processing of large datasets facilitates nuanced data comparisons, promoting the swift and effective dissemination of clinical insights.
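The precision, recall, and F1 figures reported above can be cross-checked, since F1 is the harmonic mean of precision and recall. The following is a minimal sketch of that check; the function name `f1_score` is illustrative and is not taken from the SEETrials system.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (the F1 score)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract for the multiple myeloma trial studies.
f1 = f1_score(0.958, 0.944)
print(round(f1, 3))  # 0.951, matching the reported F1 score
```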
|
47
|
Maciejewski C, Ozierański K, Barwiołek A, Basza M, Bożym A, Ciurla M, Janusz Krajsman M, Maciejewska M, Lodziński P, Opolski G, Grabowski M, Cacko A, Balsam P. AssistMED project: Transforming cardiology cohort characterisation from electronic health records through natural language processing - Algorithm design, preliminary results, and field prospects. Int J Med Inform 2024; 185:105380. [PMID: 38447318 DOI: 10.1016/j.ijmedinf.2024.105380] [Received: 10/12/2023] [Revised: 02/15/2024] [Accepted: 02/16/2024] [Indexed: 03/08/2024]
Abstract
INTRODUCTION Electronic health records (EHR) are of great value for clinical research. However, EHRs consist primarily of unstructured text, which must be analysed by a human and coded into a database before data analysis, a time-consuming and costly process that limits research efficiency. Natural language processing (NLP) can facilitate data retrieval from unstructured text. During the AssistMED project, we developed a practical NLP tool that automatically provides comprehensive clinical characteristics of patients from EHR and is tailored to clinical researchers' needs. MATERIAL AND METHODS AssistMED retrieves patient characteristics regarding clinical conditions, medications with dosage, and echocardiographic parameters with a clinically oriented data structure and provides researcher-friendly database output. We validate the algorithm performance against manual data retrieval and provide critical quantitative and qualitative analysis. RESULTS AssistMED analysed the presence of 56 clinical conditions, medications from 16 drug groups with dosage and 15 numeric echocardiographic parameters in a sample of 400 patients hospitalized in the cardiology unit. No statistically significant differences between algorithm and human retrieval were noted. Qualitative analysis revealed that disagreements with manual annotation were primarily attributable to random algorithm errors, erroneous human annotation and the lack of advanced context awareness of our tool. CONCLUSIONS Current NLP approaches make it feasible to acquire accurate and detailed patient characteristics tailored to clinical researchers' needs from EHR. We present an in-depth description of an algorithm development and validation process, discuss obstacles and pinpoint potential solutions, including opportunities arising with recent advancements in the field of NLP, such as large language models.
Affiliation(s)
- Cezary Maciejewski
- 1st Chair and Department of Cardiology, Medical University of Warsaw, 02-091 Warszawa, Poland; Doctoral School, Medical University of Warsaw, 02-091 Warszawa, Poland; Department of Medical Informatics and Telemedicine, Medical University of Warsaw, 02-091 Warszawa, Poland
| | - Krzysztof Ozierański
- 1st Chair and Department of Cardiology, Medical University of Warsaw, 02-091 Warszawa, Poland.
| | - Adam Barwiołek
- Codifive sp. z o.o., Lindleya 16, 02-013 Warszawa, Poland
| | - Mikołaj Basza
- Medical University of Silesia in Katowice, 40-055 Katowice, Poland
| | - Aleksandra Bożym
- 1st Chair and Department of Cardiology, Medical University of Warsaw, 02-091 Warszawa, Poland
| | - Michalina Ciurla
- 1st Chair and Department of Cardiology, Medical University of Warsaw, 02-091 Warszawa, Poland
| | - Maciej Janusz Krajsman
- Department of Medical Informatics and Telemedicine, Medical University of Warsaw, 02-091 Warszawa, Poland
| | - Piotr Lodziński
- 1st Chair and Department of Cardiology, Medical University of Warsaw, 02-091 Warszawa, Poland
| | - Grzegorz Opolski
- 1st Chair and Department of Cardiology, Medical University of Warsaw, 02-091 Warszawa, Poland
| | - Marcin Grabowski
- 1st Chair and Department of Cardiology, Medical University of Warsaw, 02-091 Warszawa, Poland
| | - Andrzej Cacko
- 1st Chair and Department of Cardiology, Medical University of Warsaw, 02-091 Warszawa, Poland; Department of Medical Informatics and Telemedicine, Medical University of Warsaw, 02-091 Warszawa, Poland
| | - Paweł Balsam
- 1st Chair and Department of Cardiology, Medical University of Warsaw, 02-091 Warszawa, Poland
|
48
|
Falter M, Godderis D, Scherrenberg M, Kizilkilic SE, Xu L, Mertens M, Jansen J, Legroux P, Kindermans H, Sinnaeve P, Neven F, Dendale P. Using natural language processing for automated classification of disease and to identify misclassified ICD codes in cardiac disease. Eur Heart J Digit Health 2024; 5:229-234. [PMID: 38774372 PMCID: PMC11104467 DOI: 10.1093/ehjdh/ztae008] [Received: 08/30/2023] [Revised: 01/30/2024] [Accepted: 02/05/2024] [Indexed: 05/24/2024]
Abstract
Aims ICD codes are used for classification of hospitalizations. The codes are used for administrative, financial, and research purposes. It is known, however, that errors occur. Natural language processing (NLP) offers promising solutions for optimizing the process. The aim was to investigate methods for automatic classification of disease in unstructured medical records using NLP and to compare these to conventional ICD coding. Methods and results Two datasets were used: the open-source Medical Information Mart for Intensive Care (MIMIC)-III dataset (n = 55 177) and a dataset from a hospital in Belgium (n = 12 706). Automated searches using NLP algorithms were performed for the diagnoses 'atrial fibrillation (AF)' and 'heart failure (HF)'. Four methods were used: rule-based search, logistic regression, term frequency-inverse document frequency (TF-IDF), Extreme Gradient Boosting (XGBoost), and Bio-Bidirectional Encoder Representations from Transformers (BioBERT). All algorithms were developed on the MIMIC-III dataset. The best performing algorithm was then deployed on the Belgian dataset. After preprocessing, a total of 1438 reports was retained in the Belgian dataset. XGBoost on TF-IDF matrix resulted in an accuracy of 0.94 and 0.92 for AF and HF, respectively. There were 211 mismatches between algorithm and ICD codes. One hundred and three were due to a difference in data availability or differing definitions. In the remaining 108 mismatches, 70% were due to incorrect labelling by the algorithm and 30% were due to erroneous ICD coding (2% of total hospitalizations). Conclusion A newly developed NLP algorithm attained a high accuracy for classifying disease in medical records. XGBoost outperformed the deep learning technique BioBERT. NLP algorithms could be used to identify ICD-coding errors and optimize and support the ICD-coding process.
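The mismatch breakdown in the abstract above can be followed with simple arithmetic. The sketch below reproduces it under one labeled assumption: that "total hospitalizations" refers to the 1438 reports retained in the Belgian dataset, which the abstract does not state explicitly. Variable names are illustrative.

```python
# Counts taken from the abstract.
total_mismatches = 211
explained_by_data_or_definitions = 103
retained_reports = 1438  # ASSUMPTION: used as the denominator for "total hospitalizations"

# Mismatches left after removing those explained by data availability or definitions.
remaining = total_mismatches - explained_by_data_or_definitions

# The abstract attributes 70% of the remainder to the algorithm and 30% to ICD coding.
algorithm_errors = round(0.70 * remaining)
icd_coding_errors = round(0.30 * remaining)

# Share of retained reports affected by erroneous ICD coding, in percent.
icd_error_share = round(100 * icd_coding_errors / retained_reports, 1)

print(remaining, algorithm_errors, icd_coding_errors, icd_error_share)
```

Under that assumption the share of erroneous ICD codes comes out close to the roughly 2% quoted in the abstract.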
Affiliation(s)
- Maarten Falter
- Faculty of Medicine and Life Sciences, Hasselt University, Agoralaan gebouw D, 3590 Diepenbeek, Hasselt, Belgium
- Heart Centre Hasselt, Jessa Hospital, Stadsomvaart 11, 3500 Hasselt, Belgium
- Department of Cardiology, KULeuven, Faculty of Medicine, Herestraat 49, 3000 Leuven, Belgium
| | - Dries Godderis
- Data Science Institute, Hasselt University, Agoralaan gebouw D, 3590 Diepenbeek, Hasselt, Belgium
| | - Martijn Scherrenberg
- Faculty of Medicine and Life Sciences, Hasselt University, Agoralaan gebouw D, 3590 Diepenbeek, Hasselt, Belgium
- Heart Centre Hasselt, Jessa Hospital, Stadsomvaart 11, 3500 Hasselt, Belgium
- Faculty of Medicine and Health Sciences, Antwerp University, Universiteitsplein 1, 2610 Antwerp, Belgium
| | - Sevda Ece Kizilkilic
- Faculty of Medicine and Life Sciences, Hasselt University, Agoralaan gebouw D, 3590 Diepenbeek, Hasselt, Belgium
- Heart Centre Hasselt, Jessa Hospital, Stadsomvaart 11, 3500 Hasselt, Belgium
- Faculty of Medicine and Health Sciences, Ghent University, Corneel Heymanslaan 10, 9000 Gent, Belgium
| | - Linqi Xu
- Faculty of Medicine and Life Sciences, Hasselt University, Agoralaan gebouw D, 3590 Diepenbeek, Hasselt, Belgium
- Heart Centre Hasselt, Jessa Hospital, Stadsomvaart 11, 3500 Hasselt, Belgium
| | - Marc Mertens
- Department of Information and Communications Technology, Jessa Hospital, Stadsomvaart 11, 3500 Hasselt, Belgium
| | - Jan Jansen
- Department of Information and Communications Technology, Jessa Hospital, Stadsomvaart 11, 3500 Hasselt, Belgium
| | - Pascal Legroux
- Department of Information and Communications Technology, Jessa Hospital, Stadsomvaart 11, 3500 Hasselt, Belgium
| | - Hanne Kindermans
- Faculty of Medicine and Life Sciences, Hasselt University, Agoralaan gebouw D, 3590 Diepenbeek, Hasselt, Belgium
| | - Peter Sinnaeve
- Department of Cardiology, KULeuven, Faculty of Medicine, Herestraat 49, 3000 Leuven, Belgium
| | - Frank Neven
- Data Science Institute, Hasselt University, Agoralaan gebouw D, 3590 Diepenbeek, Hasselt, Belgium
| | - Paul Dendale
- Faculty of Medicine and Life Sciences, Hasselt University, Agoralaan gebouw D, 3590 Diepenbeek, Hasselt, Belgium
- Heart Centre Hasselt, Jessa Hospital, Stadsomvaart 11, 3500 Hasselt, Belgium
|
49
|
Eyre H, Alba PR, Gibson CJ, Gatsby E, Lynch KE, Patterson OV, DuVall SL. Bridging information gaps in menopause status classification through natural language processing. JAMIA Open 2024; 7:ooae013. [PMID: 38419670 PMCID: PMC10901606 DOI: 10.1093/jamiaopen/ooae013] [Received: 04/03/2023] [Revised: 01/22/2024] [Accepted: 02/06/2024] [Indexed: 03/02/2024]
Abstract
Objective To use natural language processing (NLP) of clinical notes to augment existing structured electronic health record (EHR) data for classification of a patient's menopausal status. Materials and methods A rule-based NLP system was designed to capture evidence of a patient's menopause status including dates of a patient's last menstrual period, reproductive surgeries, and postmenopause diagnosis as well as their use of birth control and menstrual interruptions. NLP-derived output was used in combination with structured EHR data to classify a patient's menopausal status. NLP processing and patient classification were performed on a cohort of 307 512 female Veterans receiving healthcare at the US Department of Veterans Affairs (VA). Results NLP was validated at 99.6% precision. Incorporating the NLP-derived data into a menopause phenotype increased the number of patients with data relevant to their menopausal status by 118%. Using structured codes alone, 81 173 (27.0%) patients could be classified as postmenopausal or premenopausal. However, with the inclusion of NLP, this number increased to 167 804 (54.6%) patients. The premenopausal category grew by 532.7% with the inclusion of NLP data. Discussion By employing NLP, it became possible to identify documented data elements that predate VA care, originate outside VA networks, or have no corresponding structured field in the VA EHR that would be otherwise inaccessible for further analysis. Conclusion NLP can be used to identify concepts relevant to a patient's menopausal status in clinical notes. Adding NLP-derived data to an algorithm classifying a patient's menopausal status significantly increases the number of patients classified using EHR data, ultimately enabling more detailed assessments of the impact of menopause on health outcomes.
Affiliation(s)
- Hannah Eyre
- VA Informatics and Computing Infrastructure, VA Salt Lake City Health Care System, Salt Lake City, UT 84113, United States
- Department of Internal Medicine, School of Medicine, University of Utah, Salt Lake City, UT 84112, United States
| | - Patrick R Alba
- VA Informatics and Computing Infrastructure, VA Salt Lake City Health Care System, Salt Lake City, UT 84113, United States
- Department of Internal Medicine, School of Medicine, University of Utah, Salt Lake City, UT 84112, United States
| | - Carolyn J Gibson
- San Francisco VA Healthcare System, San Francisco, CA 94121, United States
- University of California, San Francisco, San Francisco, CA 94115, United States
| | - Elise Gatsby
- VA Informatics and Computing Infrastructure, VA Salt Lake City Health Care System, Salt Lake City, UT 84113, United States
| | - Kristine E Lynch
- VA Informatics and Computing Infrastructure, VA Salt Lake City Health Care System, Salt Lake City, UT 84113, United States
- Department of Internal Medicine, School of Medicine, University of Utah, Salt Lake City, UT 84112, United States
| | - Olga V Patterson
- VA Informatics and Computing Infrastructure, VA Salt Lake City Health Care System, Salt Lake City, UT 84113, United States
- Department of Internal Medicine, School of Medicine, University of Utah, Salt Lake City, UT 84112, United States
| | - Scott L DuVall
- VA Informatics and Computing Infrastructure, VA Salt Lake City Health Care System, Salt Lake City, UT 84113, United States
- Department of Internal Medicine, School of Medicine, University of Utah, Salt Lake City, UT 84112, United States
|
50
|
Zupanc SN, Durieux BN, Walling AM, Lindvall C. Bolstering Advance Care Planning Measurement Using Natural Language Processing. J Palliat Med 2024; 27:447-450. [PMID: 38324042 DOI: 10.1089/jpm.2023.0528] [Indexed: 02/08/2024]
Abstract
Despite its growth as a clinical activity and research topic, the complex dynamic nature of advance care planning (ACP) has posed serious challenges for researchers hoping to quantitatively measure it. Methods for measurement have traditionally depended on lengthy manual chart abstractions or static documents (e.g., advance directive forms) even though completion of such documents is only one aspect of ACP. Natural language processing (NLP), in the form of an assisted electronic health record (EHR) review, is a technological advancement that may help researchers better measure ACP activity. In this article, we aim to show how NLP-assisted EHR review supports more accurate and robust measurement of ACP. We do so by presenting three example applications that illustrate how using NLP for this purpose supports (1) measurement in research, (2) detailed insights into ACP in quality improvement, and (3) identification of current limitations of ACP in clinical settings.
Affiliation(s)
- Sophia N Zupanc
- Department of Psychosocial Oncology and Palliative Care, Dana-Farber Cancer Institute, Boston, Massachusetts, USA
- UCSF School of Medicine, San Francisco, California, USA
| | - Brigitte N Durieux
- Department of Psychosocial Oncology and Palliative Care, Dana-Farber Cancer Institute, Boston, Massachusetts, USA
| | - Anne M Walling
- Department of Medicine, University of California Los Angeles, Los Angeles, California, USA
- Department of Medicine, VA Greater Los Angeles Health System, Los Angeles, California, USA
| | - Charlotta Lindvall
- Department of Psychosocial Oncology and Palliative Care, Dana-Farber Cancer Institute, Boston, Massachusetts, USA
- Department of Medicine, Brigham and Women's Hospital, Boston, Massachusetts, USA
- Department of Medicine, Harvard Medical School, Harvard University, Boston, Massachusetts, USA
|