101
Sochat V, Culquicondor A, Ojea A, Milroy D. The Flux Operator. F1000Res 2024; 13:203. [PMID: 38868668] [PMCID: PMC11167326] [DOI: 10.12688/f1000research.147989.1]
Abstract
Converged computing is an emerging area of computing that brings together the best of both worlds for high performance computing (HPC) and cloud-native communities. The economic influence of cloud computing and the need for workflow portability, flexibility, and manageability are driving this emergence. Navigating the uncharted territory and building an effective space for both HPC and cloud require collaborative technological development and research. In this work, we focus on developing components for the converged workload manager, the central component of batch workflows running in any environment. From the cloud we base our work on Kubernetes, the de facto standard batch workload orchestrator. From HPC the orchestrator counterpart is Flux Framework, a fully hierarchical resource management and graph-based scheduler with a modular architecture that supports sophisticated scheduling and job management. Bringing these managers together consists of implementing Flux inside of Kubernetes, enabling hierarchical resource management and scheduling that scales without burdening the Kubernetes scheduler. This paper introduces the Flux Operator - an on-demand HPC workload manager deployed in Kubernetes. Our work describes design decisions, mapping components between environments, and experimental features. We perform experiments that compare application performance when deployed by the Flux Operator and the MPI Operator and present the results. Finally, we review remaining challenges and describe our vision of the future for improved technological innovation and collaboration through converged computing.
Affiliation(s)
- Vanessa Sochat
- Lawrence Livermore National Laboratory, Livermore, California, 94550, USA
- Antonio Ojea
- Google, Inc., Mountain View, California, 94040, USA
- Daniel Milroy
- Lawrence Livermore National Laboratory, Livermore, California, 94550, USA
102
Salam B, Kravchenko D, Nowak S, Sprinkart AM, Weinhold L, Odenthal A, Mesropyan N, Bischoff LM, Attenberger U, Kuetting DL, Luetkens JA, Isaak A. Generative Pre-trained Transformer 4 makes cardiovascular magnetic resonance reports easy to understand. J Cardiovasc Magn Reson 2024; 26:101035. [PMID: 38460841] [PMCID: PMC10981113] [DOI: 10.1016/j.jocmr.2024.101035]
Abstract
BACKGROUND Patients are increasingly using Generative Pre-trained Transformer 4 (GPT-4) to better understand their own radiology findings. PURPOSE To evaluate the performance of GPT-4 in transforming cardiovascular magnetic resonance (CMR) reports into text that is comprehensible to medical laypersons. METHODS ChatGPT with GPT-4 architecture was used to generate three different explained versions of 20 various CMR reports (n = 60) using the same prompt: "Explain the radiology report in a language understandable to a medical layperson". Two cardiovascular radiologists evaluated understandability, factual correctness, completeness of relevant findings, and lack of potential harm, while 13 medical laypersons evaluated the understandability of the original and the GPT-4 reports on a Likert scale (1 "strongly disagree", 5 "strongly agree"). Readability was measured using the Automated Readability Index (ARI). Linear mixed-effects models (values given as median [interquartile range]) and intraclass correlation coefficient (ICC) were used for statistical analysis. RESULTS GPT-4 reports were generated on average in 52 s ± 13. GPT-4 reports achieved a lower ARI score (10 [9-12] vs 5 [4-6]; p < 0.001) and were subjectively easier to understand for laypersons than original reports (1 [1] vs 4 [4,5]; p < 0.001). Eighteen out of 20 (90%) standard CMR reports and 2/60 (3%) GPT-generated reports had an ARI score corresponding to the 8th grade level or higher. Radiologists' ratings of the GPT-4 reports reached high levels for correctness (5 [4, 5]), completeness (5 [5]), and lack of potential harm (5 [5]); with "strong agreement" for factual correctness in 94% (113/120) and completeness of relevant findings in 81% (97/120) of reports. Test-retest agreement for layperson understandability ratings between the three simplified reports generated from the same original report was substantial (ICC: 0.62; p < 0.001). Interrater agreement between radiologists was almost perfect for lack of potential harm (ICC: 0.93, p < 0.001) and moderate to substantial for completeness (ICC: 0.76, p < 0.001) and factual correctness (ICC: 0.55, p < 0.001). CONCLUSION GPT-4 can reliably transform complex CMR reports into more understandable, layperson-friendly language while largely maintaining factual correctness and completeness, and can thus help convey patient-relevant radiology information in an easy-to-understand manner.
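The Automated Readability Index used above is a fixed formula over character, word, and sentence counts. As a minimal sketch of how such a score can be computed (illustrative only, not the authors' code; the crude whitespace tokenization and the sample sentence are assumptions):

```python
import re

def automated_readability_index(text: str) -> float:
    """ARI = 4.71*(characters/words) + 0.5*(words/sentences) - 21.43."""
    words = text.split()                                   # crude whitespace tokenization
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    chars = sum(len(w) for w in words)                     # counts punctuation attached to words too
    if not words or not sentences:
        return 0.0
    return 4.71 * (chars / len(words)) + 0.5 * (len(words) / len(sentences)) - 21.43

# Example: a lower score indicates text readable at a lower grade level.
print(round(automated_readability_index(
    "The left ventricle is mildly dilated. Ejection fraction is preserved."), 1))
```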
Affiliation(s)
- Babak Salam
- Department of Diagnostic and Interventional Radiology, University Hospital Bonn, Venusberg-Campus 1, 53127 Bonn, Germany; Quantitative Imaging Lab Bonn (QILaB), University Hospital Bonn, Venusberg-Campus 1, 53127 Bonn, Germany
- Dmitrij Kravchenko
- Department of Diagnostic and Interventional Radiology, University Hospital Bonn, Venusberg-Campus 1, 53127 Bonn, Germany; Quantitative Imaging Lab Bonn (QILaB), University Hospital Bonn, Venusberg-Campus 1, 53127 Bonn, Germany
- Sebastian Nowak
- Department of Diagnostic and Interventional Radiology, University Hospital Bonn, Venusberg-Campus 1, 53127 Bonn, Germany; Quantitative Imaging Lab Bonn (QILaB), University Hospital Bonn, Venusberg-Campus 1, 53127 Bonn, Germany
- Alois M Sprinkart
- Department of Diagnostic and Interventional Radiology, University Hospital Bonn, Venusberg-Campus 1, 53127 Bonn, Germany; Quantitative Imaging Lab Bonn (QILaB), University Hospital Bonn, Venusberg-Campus 1, 53127 Bonn, Germany
- Leonie Weinhold
- University Hospital Bonn, Department of Medical Biometry, Informatics, and Epidemiology, Venusberg-Campus 1, 53127 Bonn, Germany
- Anna Odenthal
- Department of Diagnostic and Interventional Radiology, University Hospital Bonn, Venusberg-Campus 1, 53127 Bonn, Germany
- Narine Mesropyan
- Department of Diagnostic and Interventional Radiology, University Hospital Bonn, Venusberg-Campus 1, 53127 Bonn, Germany; Quantitative Imaging Lab Bonn (QILaB), University Hospital Bonn, Venusberg-Campus 1, 53127 Bonn, Germany
- Leon M Bischoff
- Department of Diagnostic and Interventional Radiology, University Hospital Bonn, Venusberg-Campus 1, 53127 Bonn, Germany; Quantitative Imaging Lab Bonn (QILaB), University Hospital Bonn, Venusberg-Campus 1, 53127 Bonn, Germany
- Ulrike Attenberger
- Department of Diagnostic and Interventional Radiology, University Hospital Bonn, Venusberg-Campus 1, 53127 Bonn, Germany
- Daniel L Kuetting
- Department of Diagnostic and Interventional Radiology, University Hospital Bonn, Venusberg-Campus 1, 53127 Bonn, Germany; Quantitative Imaging Lab Bonn (QILaB), University Hospital Bonn, Venusberg-Campus 1, 53127 Bonn, Germany
- Julian A Luetkens
- Department of Diagnostic and Interventional Radiology, University Hospital Bonn, Venusberg-Campus 1, 53127 Bonn, Germany; Quantitative Imaging Lab Bonn (QILaB), University Hospital Bonn, Venusberg-Campus 1, 53127 Bonn, Germany
- Alexander Isaak
- Department of Diagnostic and Interventional Radiology, University Hospital Bonn, Venusberg-Campus 1, 53127 Bonn, Germany; Quantitative Imaging Lab Bonn (QILaB), University Hospital Bonn, Venusberg-Campus 1, 53127 Bonn, Germany.
103
Ambrosio L, Schol J, La Pietra VA, Russo F, Vadalà G, Sakai D. Threats and opportunities of using ChatGPT in scientific writing-The risk of getting spine less. JOR Spine 2024; 7:e1296. [PMID: 38222818] [PMCID: PMC10782071] [DOI: 10.1002/jsp2.1296]
Abstract
ChatGPT and AI chatbots are revolutionizing several science fields, including medical writing. However, the inadequate use of such advantageous tools can raise numerous methodological and ethical issues.
Affiliation(s)
- Luca Ambrosio
- Operative Research Unit of Orthopaedic and Trauma Surgery, Fondazione Policlinico Universitario Campus Bio-Medico, Rome, Italy
- Research Unit of Orthopaedic and Trauma Surgery, Department of Medicine and Surgery, Università Campus Bio-Medico di Roma, Rome, Italy
- Department of Orthopaedic Surgery, Tokai University School of Medicine, Isehara, Japan
- Jordy Schol
- Department of Orthopaedic Surgery, Tokai University School of Medicine, Isehara, Japan
- Fabrizio Russo
- Operative Research Unit of Orthopaedic and Trauma Surgery, Fondazione Policlinico Universitario Campus Bio-Medico, Rome, Italy
- Research Unit of Orthopaedic and Trauma Surgery, Department of Medicine and Surgery, Università Campus Bio-Medico di Roma, Rome, Italy
- Gianluca Vadalà
- Operative Research Unit of Orthopaedic and Trauma Surgery, Fondazione Policlinico Universitario Campus Bio-Medico, Rome, Italy
- Research Unit of Orthopaedic and Trauma Surgery, Department of Medicine and Surgery, Università Campus Bio-Medico di Roma, Rome, Italy
- Daisuke Sakai
- Department of Orthopaedic Surgery, Tokai University School of Medicine, Isehara, Japan
104
Horgan R, Martins JG, Saade G, Abuhamad A, Kawakita T. ChatGPT in maternal-fetal medicine practice: a primer for clinicians. Am J Obstet Gynecol MFM 2024; 6:101302. [PMID: 38281582] [DOI: 10.1016/j.ajogmf.2024.101302]
Abstract
ChatGPT (Generative Pre-trained Transformer), a language model that was developed by OpenAI and launched in November 2022, generates human-like responses to prompts using deep-learning technology. The integration of large language processing models into healthcare has the potential to improve the accessibility of medical information for both patients and health professionals alike. In this commentary, we demonstrated the ability of ChatGPT to produce patient information sheets. Four board-certified, maternal-fetal medicine attending physicians rated the accuracy and humanness of the information according to 2 predefined scales of accuracy and completeness. The median score for accuracy of information was rated 4.8 on a 6-point scale and the median score for completeness of information was 2.2 on a 3-point scale for the 5 patient information leaflets generated by ChatGPT. Concerns raised included the omission of clinically important information for patient counseling in some patient information leaflets and the inability to verify the source of information because ChatGPT does not provide references. ChatGPT is a powerful tool that has the potential to enhance patient care, but such a tool requires extensive validation and is perhaps best considered as an adjunct to clinical practice rather than as a tool to be used freely by the public for healthcare information.
Affiliation(s)
- Rebecca Horgan
- Division of Maternal-Fetal Medicine, Department of Obstetrics and Gynecology, Eastern Virginia Medical School, Norfolk, VA.
- Juliana G Martins
- Division of Maternal-Fetal Medicine, Department of Obstetrics and Gynecology, Eastern Virginia Medical School, Norfolk, VA
- George Saade
- Division of Maternal-Fetal Medicine, Department of Obstetrics and Gynecology, Eastern Virginia Medical School, Norfolk, VA
- Alfred Abuhamad
- Division of Maternal-Fetal Medicine, Department of Obstetrics and Gynecology, Eastern Virginia Medical School, Norfolk, VA
- Tetsuya Kawakita
- Division of Maternal-Fetal Medicine, Department of Obstetrics and Gynecology, Eastern Virginia Medical School, Norfolk, VA
105
Kangiszer G, Mahtani AU, Pintea M, Jacobs C, Sragovicz H, Nguyen T, Yeturu S, Lieberman M, Waldman C, Bhavnani SP, Hermel M. Low Performance of ChatGPT on Echocardiography Board Review Questions. JACC Cardiovasc Imaging 2024; 17:330-332. [PMID: 37943230] [DOI: 10.1016/j.jcmg.2023.09.004]
106
Inam M, Sheikh S, Minhas AMK, Vaughan EM, Krittanawong C, Samad Z, Lavie CJ, Khoja A, D'Cruze M, Slipczuk L, Alarakhiya F, Naseem A, Haider AH, Virani SS. A review of top cardiology and cardiovascular medicine journal guidelines regarding the use of generative artificial intelligence tools in scientific writing. Curr Probl Cardiol 2024; 49:102387. [PMID: 38185435] [DOI: 10.1016/j.cpcardiol.2024.102387]
Abstract
BACKGROUND Generative Artificial Intelligence (AI) tools have experienced rapid development over the last decade and are gaining increasing popularity as assistive models in academic writing. However, the ability of AI to generate reliable and accurate research articles is a topic of debate. Major scientific journals have issued policies regarding the contribution of AI tools in scientific writing. METHODS We conducted a review of the author and peer reviewer guidelines of the top 25 Cardiology and Cardiovascular Medicine journals as per the 2023 SCImago rankings. Data were obtained through reviewing journal websites and directly emailing the editorial office. Descriptive data regarding journal characteristics were coded in SPSS. Subgroup analyses of the journal guidelines were conducted based on the publishing company policies. RESULTS Our analysis revealed that all scientific journals in our study permitted the documented use of AI in scientific writing with certain limitations as per ICMJE recommendations. We found that AI tools cannot be included in the authorship or be used for image generation, and that all authors are required to assume full responsibility for their submitted and published work. The use of generative AI tools in the peer review process is strictly prohibited. CONCLUSION Guidelines regarding the use of generative AI in scientific writing are standardized, detailed, and unanimously followed by all journals in our study according to the recommendations set forth by international forums. It is imperative to ensure that these policies are carefully followed and updated to maintain scientific integrity.
Affiliation(s)
- Maha Inam
- Office of the Vice Provost, Research, Aga Khan University, Karachi, Pakistan
- Sana Sheikh
- Department of Medicine, Aga Khan University, Karachi, Pakistan
- Abdul Mannan Khan Minhas
- Section of Cardiovascular Research, Department of Medicine, Baylor College of Medicine, Houston, TX, United States
- Elizabeth M Vaughan
- Section of Cardiovascular Research, Department of Medicine, Baylor College of Medicine, Houston, TX, United States; Department of Internal Medicine, UTMB, Galveston, TX, United States
- Chayakrit Krittanawong
- Leon H. Charney Division of Cardiology, New York University Langone Health, New York, NY, United States
- Zainab Samad
- Section of Cardiology, Department of Medicine, Aga Khan University Hospital, Karachi, Pakistan
- Carl J Lavie
- Department of Cardiovascular Diseases, John Ochsner Heart and Vascular Institute, Ochsner Clinical School, The University of Queensland School of Medicine, New Orleans, LA, United States
- Adeel Khoja
- Department of Medicine, Aga Khan University, Karachi, Pakistan; Adelaide Medical School, Faculty of Health and Medical Sciences, The University of Adelaide, Adelaide, South Australia, Australia
- Melaine D'Cruze
- Institute for Educational Development, Aga Khan University Hospital, Karachi, Pakistan
- Leandro Slipczuk
- Cardiology Division, Montefiore Medical Center, Bronx, NY, United States; Albert Einstein College of Medicine, Bronx, NY, United States
- Azra Naseem
- Institute for Educational Development, Aga Khan University Hospital, Karachi, Pakistan
- Adil H Haider
- Dean's Office, Medical College, Aga Khan University Hospital, Karachi, Pakistan
- Salim S Virani
- Office of the Vice Provost, Research, Aga Khan University, Karachi, Pakistan; Section of Cardiovascular Research, Department of Medicine, Baylor College of Medicine, Houston, TX, United States; Section of Cardiology, Department of Medicine, Aga Khan University Hospital, Karachi, Pakistan; The Texas Heart Institute, Houston, TX, United States.
107
Bera K, O'Connor G, Jiang S, Tirumani SH, Ramaiya N. Analysis of ChatGPT publications in radiology: Literature so far. Curr Probl Diagn Radiol 2024; 53:215-225. [PMID: 37891083] [DOI: 10.1067/j.cpradiol.2023.10.013]
Abstract
OBJECTIVE To perform a detailed qualitative and quantitative analysis of the published literature on ChatGPT and radiology in the nine months since its public release, detailing the scope of the work in the short timeframe. METHODS A systematic literature search was carried out of the MEDLINE, EMBASE databases through August 15, 2023 for articles that were focused on ChatGPT and imaging/radiology. Articles were classified into original research and reviews/perspectives. Quantitative analysis was carried out by two experienced radiologists using objective scoring systems for evaluating original and non-original research. RESULTS 51 articles were published involving ChatGPT and radiology/imaging dating from 26 Jan 2023 to the last article published on 14 Aug 2023. 23 articles were original research while the rest included reviews/perspectives or brief communications. For quantitative analysis scored by two readers, we included 23 original research and 17 non-original research articles (after excluding 11 letters as responses to previous articles). Mean score for original research was 3.20 out of 5 (across five questions), while mean score for non-original research was 1.17 out of 2 (across six questions). Mean score grading performance of ChatGPT in original research was 3.20 out of five (across two questions). DISCUSSION While it is early days for ChatGPT and its impact in radiology, there has already been a plethora of articles talking about the multifaceted nature of the tool and how it can impact every aspect of radiology from patient education, pre-authorization, protocol selection, generating differentials, to structuring radiology reports. Most articles show impressive performance of ChatGPT which can only improve with more research and improvements in the tool itself. There have also been several articles which have highlighted the limitations of ChatGPT in its current iteration, which will allow radiologists and researchers to improve these areas.
Affiliation(s)
- Kaustav Bera
- Department of Radiology, University Hospitals Cleveland Medical Center, 11000 Euclid Avenue, Cleveland, OH, 44106, USA.
- Gregory O'Connor
- Department of Radiology, University Hospitals Cleveland Medical Center, 11000 Euclid Avenue, Cleveland, OH, 44106, USA
- Sirui Jiang
- Department of Radiology, University Hospitals Cleveland Medical Center, 11000 Euclid Avenue, Cleveland, OH, 44106, USA
- Sree Harsha Tirumani
- Department of Radiology, University Hospitals Cleveland Medical Center, 11000 Euclid Avenue, Cleveland, OH, 44106, USA
- Nikhil Ramaiya
- Department of Radiology, University Hospitals Cleveland Medical Center, 11000 Euclid Avenue, Cleveland, OH, 44106, USA
108
Balas M, Janic A, Daigle P, Nijhawan N, Hussain A, Gill H, Lahaie GL, Belliveau MJ, Crawford SA, Arjmand P, Ing EB. Evaluating ChatGPT on Orbital and Oculofacial Disorders: Accuracy and Readability Insights. Ophthalmic Plast Reconstr Surg 2024; 40:217-222. [PMID: 37989540] [DOI: 10.1097/iop.0000000000002552]
Abstract
PURPOSE To assess the accuracy and readability of responses generated by the artificial intelligence model, ChatGPT (version 4.0), to questions related to 10 essential domains of orbital and oculofacial disease. METHODS A set of 100 questions related to the diagnosis, treatment, and interpretation of orbital and oculofacial diseases was posed to ChatGPT 4.0. Responses were evaluated by a panel of 7 experts based on appropriateness and accuracy, with performance scores measured on a 7-item Likert scale. Inter-rater reliability was determined via the intraclass correlation coefficient. RESULTS The artificial intelligence model demonstrated accurate and consistent performance across all 10 domains of orbital and oculofacial disease, with an average appropriateness score of 5.3/6.0 ("mostly appropriate" to "completely appropriate"). Domains of cavernous sinus fistula, retrobulbar hemorrhage, and blepharospasm had the highest domain scores (average scores of 5.5 to 5.6), while the proptosis domain had the lowest (average score of 5.0/6.0). The intraclass correlation coefficient was 0.64 (95% CI: 0.52 to 0.74), reflecting moderate inter-rater reliability. The responses exhibited a high reading-level complexity, representing the comprehension levels of a college or graduate education. CONCLUSIONS This study demonstrates the potential of ChatGPT 4.0 to provide accurate information in the field of ophthalmology, specifically orbital and oculofacial disease. However, challenges remain in ensuring accurate and comprehensive responses across all disease domains. Future improvements should focus on refining the model's correctness and eventually expanding the scope to visual data interpretation. Our results highlight the vast potential for artificial intelligence in educational and clinical ophthalmology contexts.
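The intraclass correlation coefficient used above to quantify inter-rater reliability can be computed with standard statistics packages. A minimal sketch with the open-source pingouin library (assumed tooling for illustration; the ratings below are made-up placeholders, not the study data):

```python
# pip install pandas pingouin
import pandas as pd
import pingouin as pg

# Long-format table: every rater scores every question on the Likert scale.
ratings = pd.DataFrame({
    "question": [1, 1, 1, 2, 2, 2, 3, 3, 3],
    "rater":    ["A", "B", "C"] * 3,
    "score":    [6, 5, 6, 5, 5, 4, 6, 6, 5],
})

icc = pg.intraclass_corr(data=ratings, targets="question",
                         raters="rater", ratings="score")
print(icc[["Type", "ICC", "CI95%"]])  # lists ICC variants, e.g. ICC2: two-way random, absolute agreement
```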
Affiliation(s)
- Patrick Daigle
- Department of Ophthalmology and Vision Sciences, University of Toronto, Toronto, Ontario, Canada
- Navdeep Nijhawan
- Department of Ophthalmology and Vision Sciences, University of Toronto, Toronto, Ontario, Canada
- Ahsen Hussain
- Department of Ophthalmology and Visual Sciences, Dalhousie University, Halifax, Nova Scotia, Canada
- Harmeet Gill
- Department of Ophthalmology and Vision Sciences, University of Toronto, Toronto, Ontario, Canada
- Gabriela L Lahaie
- Department of Ophthalmology, Queen's University, Kingston, Ontario, Canada
- Michel J Belliveau
- Department of Ophthalmology, University of Ottawa and The Ottawa Hospital Research Institute, Ottawa, Ontario, Canada
- Sean A Crawford
- Temerty Faculty of Medicine
- Division of Vascular Surgery, Department of Surgery, University of Toronto, Toronto, Ontario, Canada
- Edsel B Ing
- Department of Ophthalmology and Vision Sciences, University of Toronto, Toronto, Ontario, Canada
- Department of Ophthalmology and Vision Sciences, University of Alberta, Edmonton, Alberta, Canada
109
Schmidt KW, Lechner F. [ChatGPT: aid to medical ethics decision making?]. Die Anaesthesiologie 2024; 73:186-192. [PMID: 38315183] [DOI: 10.1007/s00101-024-01385-6]
Abstract
BACKGROUND Physicians have to make countless decisions every day. The medical, ethical, and legal aspects are often intertwined and subject to change over time. Involving an ethics committee or arranging an ethics consultation are examples of potential aids to decision making. Whether and how artificial intelligence (AI) and the large language model (LLM) of the company OpenAI (San Francisco, CA, USA), known as ChatGPT, can also help and support ethical decision making is increasingly becoming a matter of controversial debate. MATERIAL AND METHODS Based on a case example in which a female physician is confronted with ethical and legal issues and presents these to ChatGPT for answers, initial indications of the model's strengths and weaknesses are ascertained. CONCLUSION Given the rapid technical development and access to ever-increasing quantities of data, the use of such tools for ethical decision support should be closely observed and evaluated.
Affiliation(s)
- Kurt W Schmidt
- Zentrum für Ethik in der Medizin, Agaplesion Markus Krankenhaus, Wilhelm-Epstein-Str. 4, 60431 Frankfurt am Main, Germany.
- Fabian Lechner
- Institut für Künstliche Intelligenz, Universitätsklinikum Gießen und Marburg, Marburg, Germany
110
Haver HL, Gupta AK, Ambinder EB, Bahl M, Oluyemi ET, Jeudy J, Yi PH. Evaluating the Use of ChatGPT to Accurately Simplify Patient-centered Information about Breast Cancer Prevention and Screening. Radiol Imaging Cancer 2024; 6:e230086. [PMID: 38305716] [PMCID: PMC10988327] [DOI: 10.1148/rycan.230086]
Abstract
Purpose To evaluate the use of ChatGPT as a tool to simplify answers to common questions about breast cancer prevention and screening. Materials and Methods In this retrospective, exploratory study, ChatGPT was requested to simplify responses to 25 questions about breast cancer to a sixth-grade reading level in March and August 2023. Simplified responses were evaluated for clinical appropriateness. All original and simplified responses were assessed for reading ease on the Flesch Reading Ease Index and for readability on five scales: Flesch-Kincaid Grade Level, Gunning Fog Index, Coleman-Liau Index, Automated Readability Index, and the Simple Measure of Gobbledygook (ie, SMOG) Index. Mean reading ease, readability, and word count were compared between original and simplified responses using paired t tests. McNemar test was used to compare the proportion of responses with adequate reading ease (score of 60 or greater) and readability (sixth-grade level). Results ChatGPT improved mean reading ease (original responses, 46 vs simplified responses, 70; P < .001) and readability (original, grade 13 vs simplified, grade 8.9; P < .001) and decreased word count (original, 193 vs simplified, 173; P < .001). Ninety-two percent (23 of 25) of simplified responses were considered clinically appropriate. All 25 (100%) simplified responses met criteria for adequate reading ease, compared with only two of 25 original responses (P < .001). Two of the 25 simplified responses (8%) met criteria for adequate readability. Conclusion ChatGPT simplified answers to common breast cancer screening and prevention questions by improving the readability by four grade levels, though the potential to produce incorrect information necessitates physician oversight when using this tool. Keywords: Mammography, Screening, Informatics, Breast, Education, Health Policy and Practice, Oncology, Technology Assessment Supplemental material is available for this article. © RSNA, 2023.
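Several of the readability measures compared above (Flesch Reading Ease, Flesch-Kincaid Grade Level, Gunning Fog, Coleman-Liau, ARI, SMOG) are implemented in off-the-shelf libraries. A minimal sketch using the open-source textstat package (assumed tooling for illustration; the two sentences are invented examples, not study responses):

```python
# pip install textstat
import textstat

original = ("Screening mammography is recommended annually for individuals of "
            "average risk beginning at age forty, per current guidelines.")
simplified = "Most women should get a breast x-ray every year starting at age 40."

for label, text in [("original", original), ("simplified", simplified)]:
    print(label,
          "Flesch Reading Ease:", textstat.flesch_reading_ease(text),   # >= 60 counted as adequate above
          "Flesch-Kincaid grade:", textstat.flesch_kincaid_grade(text),
          "ARI:", textstat.automated_readability_index(text),
          "SMOG:", textstat.smog_index(text))
```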
Affiliation(s)
- Hana L. Haver
- University of Maryland Medical Intelligent Imaging (UM2ii) Center, Department of Diagnostic Radiology and Nuclear Medicine, University of Maryland School of Medicine, 670 W Baltimore St, First Floor, Rm 1172, Baltimore, MD 21201
- Anuj K. Gupta
- University of Maryland Medical Intelligent Imaging (UM2ii) Center, Department of Diagnostic Radiology and Nuclear Medicine, University of Maryland School of Medicine, Baltimore, MD
- Emily B. Ambinder
- The Russell H. Morgan Department of Radiology and Radiological Science, Johns Hopkins University School of Medicine, Baltimore, MD
- Manisha Bahl
- Department of Radiology, Division of Breast Imaging, Massachusetts General Hospital, Boston, MA
- Eniola T. Oluyemi
- The Russell H. Morgan Department of Radiology and Radiological Science, Johns Hopkins University School of Medicine, Baltimore, MD
- Jean Jeudy
- University of Maryland Medical Intelligent Imaging (UM2ii) Center, Department of Diagnostic Radiology and Nuclear Medicine, University of Maryland School of Medicine, Baltimore, MD
- Paul H. Yi
- University of Maryland Medical Intelligent Imaging (UM2ii) Center, Department of Diagnostic Radiology and Nuclear Medicine, University of Maryland School of Medicine, Baltimore, MD; Malone Center for Engineering in Healthcare, Whiting School of Engineering, Johns Hopkins University, Baltimore, MD; Fischell Department of Bioengineering, A. James Clark School of Engineering, University of Maryland-College Park, College Park, MD
111
Hu Y, Hu Z, Liu W, Gao A, Wen S, Liu S, Lin Z. Exploring the potential of ChatGPT as an adjunct for generating diagnosis based on chief complaint and cone beam CT radiologic findings. BMC Med Inform Decis Mak 2024; 24:55. [PMID: 38374067] [PMCID: PMC10875853] [DOI: 10.1186/s12911-024-02445-y]
Abstract
AIM This study aimed to assess the performance of OpenAI's ChatGPT in generating diagnoses based on the chief complaint and cone beam computed tomography (CBCT) radiologic findings. MATERIALS AND METHODS 102 CBCT reports (48 with dental diseases (DD) and 54 with neoplastic/cystic diseases (N/CD)) were collected. ChatGPT was provided with the chief complaint and CBCT radiologic findings. Diagnostic outputs from ChatGPT were scored on a five-point Likert scale. For diagnosis accuracy, the scoring was based on the accuracy of the chief complaint-related diagnosis and chief complaint-unrelated diagnoses (1-5 points); for diagnosis completeness, the scoring was based on how many accurate diagnoses were included in ChatGPT's output for one case (1-5 points); for text quality, the scoring was based on how many text errors were included in ChatGPT's output for one case (1-5 points). For the 54 N/CD cases, the consistency of the diagnosis generated by ChatGPT with the pathological diagnosis was also calculated. The composition of text errors in ChatGPT's outputs was evaluated. RESULTS After subjective ratings by expert reviewers on a five-point Likert scale, the final scores for diagnosis accuracy, diagnosis completeness, and text quality of ChatGPT were 3.7, 4.5, and 4.6 for the 102 cases. For diagnostic accuracy, it performed significantly better on N/CD (3.8/5) compared to DD (3.6/5). Of the 54 N/CD cases, 21 (38.9%) had a first diagnosis completely consistent with the pathological diagnosis. No text errors were observed in 88.7% of all 390 text items. CONCLUSION ChatGPT showed potential in generating radiographic diagnoses based on the chief complaint and radiologic findings. However, the performance of ChatGPT varied with task complexity, necessitating professional oversight due to a certain error rate.
Affiliation(s)
- Yanni Hu
- Department of Dentomaxillofacial Radiology, Nanjing Stomatological Hospital, Affiliated Hospital of Medical School, Institute of Stomatology, Nanjing University, Nanjing, Jiangsu, People's Republic of China
- Ziyang Hu
- Department of Dentomaxillofacial Radiology, Nanjing Stomatological Hospital, Affiliated Hospital of Medical School, Institute of Stomatology, Nanjing University, Nanjing, Jiangsu, People's Republic of China
- Department of Stomatology, Shenzhen Longhua District Central Hospital, Shenzhen, People's Republic of China
- Wenjing Liu
- Department of Dentomaxillofacial Radiology, Nanjing Stomatological Hospital, Affiliated Hospital of Medical School, Institute of Stomatology, Nanjing University, Nanjing, Jiangsu, People's Republic of China
- Antian Gao
- Department of Dentomaxillofacial Radiology, Nanjing Stomatological Hospital, Affiliated Hospital of Medical School, Institute of Stomatology, Nanjing University, Nanjing, Jiangsu, People's Republic of China
- Shanhui Wen
- Department of Dentomaxillofacial Radiology, Nanjing Stomatological Hospital, Affiliated Hospital of Medical School, Institute of Stomatology, Nanjing University, Nanjing, Jiangsu, People's Republic of China
- Shu Liu
- Department of Dentomaxillofacial Radiology, Nanjing Stomatological Hospital, Affiliated Hospital of Medical School, Institute of Stomatology, Nanjing University, Nanjing, Jiangsu, People's Republic of China
- Zitong Lin
- Department of Dentomaxillofacial Radiology, Nanjing Stomatological Hospital, Affiliated Hospital of Medical School, Institute of Stomatology, Nanjing University, Nanjing, Jiangsu, People's Republic of China.
112
Zhang L, Li W, Wang Z. Sub-Diffraction Readout Method of High-Capacity Optical Data Storage Based on Polarization Modulation. Nanomaterials (Basel) 2024; 14:364. [PMID: 38392737] [PMCID: PMC10892038] [DOI: 10.3390/nano14040364]
Abstract
The big data era demands an efficient and permanent data storage technology with capacity at the PB to EB scale. Optical data storage (ODS) offers a good candidate for long-lifetime storage, as the developing far-field super-resolution nanoscale writing technology improves its capacity to the PB scale. However, methods to efficiently read out such high-density ODS data are still lacking. In this paper, we demonstrate a sub-diffraction readout method based on polarization modulation, which experimentally achieves sub-diffraction readout on a Disperse Red 13 thin film with a resolution of 500 nm, exceeding the diffraction limit by 1.2 times (NA = 0.5). Differing from conventional binary encoding, we propose a specific polarization encoding method that enhances the capacity of ODS by 1.5 times. In simulation, our method provides an optical data storage readout resolution of 150 nm, potentially down to 70 nm, equivalent to 1.1 PB on a DVD-sized disk. This sub-diffraction readout method has great potential as a powerful readout tool for next-generation optical data storage.
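The reported 1.2-fold improvement is consistent with the Abbe diffraction limit; assuming a readout wavelength of about 600 nm (an assumed value for illustration only; the paper specifies the actual optical parameters), the arithmetic is:

```latex
% Abbe limit for NA = 0.5 and an assumed wavelength of 600 nm
d_{\min} = \frac{\lambda}{2\,\mathrm{NA}} = \frac{600\ \text{nm}}{2 \times 0.5} = 600\ \text{nm},
\qquad
\frac{d_{\min}}{d_{\text{readout}}} = \frac{600\ \text{nm}}{500\ \text{nm}} = 1.2 .
```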
Affiliation(s)
- Li Zhang
- Shanghai Advanced Research Institute, Chinese Academy of Sciences, Shanghai 201210, China;
- School of Microelectronics, University of Chinese Academy of Sciences, Beijing 100049, China
- Wenwen Li
- Hefei National Laboratory, University of Science and Technology of China, Hefei 230088, China
- Zhongyang Wang
- Shanghai Advanced Research Institute, Chinese Academy of Sciences, Shanghai 201210, China;
- School of Microelectronics, University of Chinese Academy of Sciences, Beijing 100049, China
113
Abi-Rafeh J, Xu HH, Kazan R, Tevlin R, Furnas H. Large Language Models and Artificial Intelligence: A Primer for Plastic Surgeons on the Demonstrated and Potential Applications, Promises, and Limitations of ChatGPT. Aesthet Surg J 2024; 44:329-343. [PMID: 37562022] [DOI: 10.1093/asj/sjad260]
Abstract
BACKGROUND The rapidly evolving field of artificial intelligence (AI) holds great potential for plastic surgeons. ChatGPT, a recently released AI large language model (LLM), promises applications across many disciplines, including healthcare. OBJECTIVES The aim of this article was to provide a primer for plastic surgeons on AI, LLMs, and ChatGPT, including an analysis of current demonstrated and proposed clinical applications. METHODS A systematic review was performed identifying medical and surgical literature on ChatGPT's proposed clinical applications. Variables assessed included applications investigated, command tasks provided, user input information, AI-emulated human skills, output validation, and reported limitations. RESULTS The analysis included 175 articles reporting on 13 plastic surgery applications and 116 additional clinical applications, categorized by field and purpose. Thirty-four applications within plastic surgery are thus proposed, with relevance to different target audiences, including attending plastic surgeons (n = 17, 50%), trainees/educators (n = 8, 24%), researchers/scholars (n = 7, 21%), and patients (n = 2, 6%). The 15 identified limitations of ChatGPT were categorized by training data, algorithm, and ethical considerations. CONCLUSIONS Widespread use of ChatGPT in plastic surgery will depend on rigorous research of proposed applications to validate performance and address limitations. This systematic review aims to guide research, development, and regulation to safely adopt AI in plastic surgery.
114
Brandão M, Mendes F, Martins M, Cardoso P, Macedo G, Mascarenhas T, Mascarenhas Saraiva M. Revolutionizing Women's Health: A Comprehensive Review of Artificial Intelligence Advancements in Gynecology. J Clin Med 2024; 13:1061. [PMID: 38398374] [PMCID: PMC10889757] [DOI: 10.3390/jcm13041061]
Abstract
Artificial intelligence has yielded remarkably promising results in several medical fields, particularly those with a strong imaging component. Gynecology relies heavily on imaging since it offers useful visual data on the female reproductive system, leading to a deeper understanding of pathophysiological concepts. The applicability of artificial intelligence technologies has not been as noticeable in gynecologic imaging as in other medical fields so far. However, due to growing interest in this area, some studies have been performed with exciting results. From urogynecology to oncology, artificial intelligence algorithms, particularly machine learning and deep learning, have shown huge potential to revolutionize the overall healthcare experience for women's reproductive health. In this review, we aim to establish the current status of AI in gynecology and its upcoming developments, and to discuss the challenges facing its clinical implementation, namely the technological and ethical concerns surrounding technology development, implementation, and accountability.
Affiliation(s)
- Marta Brandão
- Faculty of Medicine, University of Porto, Alameda Professor Hernâni Monteiro, 4200-427 Porto, Portugal; (M.B.); (P.C.); (G.M.); (T.M.)
- Francisco Mendes
- Department of Gastroenterology, São João University Hospital, Alameda Professor Hernâni Monteiro, 4200-427 Porto, Portugal; (F.M.); (M.M.)
- WGO Gastroenterology and Hepatology Training Center, 4200-427 Porto, Portugal
- Miguel Martins
- Department of Gastroenterology, São João University Hospital, Alameda Professor Hernâni Monteiro, 4200-427 Porto, Portugal; (F.M.); (M.M.)
- WGO Gastroenterology and Hepatology Training Center, 4200-427 Porto, Portugal
- Pedro Cardoso
- Faculty of Medicine, University of Porto, Alameda Professor Hernâni Monteiro, 4200-427 Porto, Portugal; (M.B.); (P.C.); (G.M.); (T.M.)
- Department of Gastroenterology, São João University Hospital, Alameda Professor Hernâni Monteiro, 4200-427 Porto, Portugal; (F.M.); (M.M.)
- WGO Gastroenterology and Hepatology Training Center, 4200-427 Porto, Portugal
- Guilherme Macedo
- Faculty of Medicine, University of Porto, Alameda Professor Hernâni Monteiro, 4200-427 Porto, Portugal; (M.B.); (P.C.); (G.M.); (T.M.)
- Department of Gastroenterology, São João University Hospital, Alameda Professor Hernâni Monteiro, 4200-427 Porto, Portugal; (F.M.); (M.M.)
- WGO Gastroenterology and Hepatology Training Center, 4200-427 Porto, Portugal
- Teresa Mascarenhas
- Faculty of Medicine, University of Porto, Alameda Professor Hernâni Monteiro, 4200-427 Porto, Portugal; (M.B.); (P.C.); (G.M.); (T.M.)
- Department of Obstetrics and Gynecology, São João University Hospital, Alameda Professor Hernâni Monteiro, 4200-427 Porto, Portugal
- Miguel Mascarenhas Saraiva
- Faculty of Medicine, University of Porto, Alameda Professor Hernâni Monteiro, 4200-427 Porto, Portugal; (M.B.); (P.C.); (G.M.); (T.M.)
- Department of Gastroenterology, São João University Hospital, Alameda Professor Hernâni Monteiro, 4200-427 Porto, Portugal; (F.M.); (M.M.)
- WGO Gastroenterology and Hepatology Training Center, 4200-427 Porto, Portugal
115
Meyer A, Riese J, Streichert T. Comparison of the Performance of GPT-3.5 and GPT-4 With That of Medical Students on the Written German Medical Licensing Examination: Observational Study. JMIR Med Educ 2024; 10:e50965. [PMID: 38329802] [PMCID: PMC10884900] [DOI: 10.2196/50965]
Abstract
BACKGROUND The potential of artificial intelligence (AI)-based large language models, such as ChatGPT, has gained significant attention in the medical field. This enthusiasm is driven not only by recent breakthroughs and improved accessibility, but also by the prospect of democratizing medical knowledge and promoting equitable health care. However, the performance of ChatGPT is substantially influenced by the input language, and given the growing public trust in this AI tool compared to that in traditional sources of information, investigating its medical accuracy across different languages is of particular importance. OBJECTIVE This study aimed to compare the performance of GPT-3.5 and GPT-4 with that of medical students on the written German medical licensing examination. METHODS To assess GPT-3.5's and GPT-4's medical proficiency, we used 937 original multiple-choice questions from 3 written German medical licensing examinations in October 2021, April 2022, and October 2022. RESULTS GPT-4 achieved an average score of 85% and ranked in the 92.8th, 99.5th, and 92.6th percentiles among medical students who took the same examinations in October 2021, April 2022, and October 2022, respectively. This represents a substantial improvement of 27% compared to GPT-3.5, which only passed 1 out of the 3 examinations. While GPT-3.5 performed well in psychiatry questions, GPT-4 exhibited strengths in internal medicine and surgery but showed weakness in academic research. CONCLUSIONS The study results highlight ChatGPT's remarkable improvement from moderate (GPT-3.5) to high competency (GPT-4) in answering medical licensing examination questions in German. While its predecessor (GPT-3.5) was imprecise and inconsistent, GPT-4 demonstrates considerable potential to improve medical education and patient care, provided that medically trained users critically evaluate its results. As the replacement of search engines by AI tools seems possible in the future, further studies with nonprofessional questions are needed to assess the safety and accuracy of ChatGPT for the general population.
Affiliation(s)
- Annika Meyer
- Institute for Clinical Chemistry, University Hospital Cologne, Cologne, Germany
- Janik Riese
- Department of General Surgery, Visceral, Thoracic and Vascular Surgery, University Hospital Greifswald, Greifswald, Germany
- Thomas Streichert
- Institute for Clinical Chemistry, University Hospital Cologne, Cologne, Germany
116
Chien A, Tang H, Jagessar B, Chang KW, Peng N, Nael K, Salamon N. AI-Assisted Summarization of Radiologic Reports: Evaluating GPT3davinci, BARTcnn, LongT5booksum, LEDbooksum, LEDlegal, and LEDclinical. AJNR Am J Neuroradiol 2024; 45:244-248. [PMID: 38238092] [DOI: 10.3174/ajnr.a8102]
Abstract
BACKGROUND AND PURPOSE The review of clinical reports is an essential part of monitoring disease progression. Synthesizing multiple imaging reports is also important for clinical decisions. It is critical to aggregate information quickly and accurately. Machine learning natural language processing (NLP) models hold promise to address an unmet need for report summarization. MATERIALS AND METHODS We evaluated NLP methods to summarize longitudinal aneurysm reports. A total of 137 clinical reports and 100 PubMed case reports were used in this study. Models were 1) compared against expert-generated summaries using longitudinal imaging notes collected at our institute and 2) compared using publicly accessible PubMed case reports. Five AI models were used to summarize the clinical reports, and a sixth model, the online GPT3davinci NLP large language model (LLM), was added for the summarization of PubMed case reports. We assessed summary quality through comparison with expert summaries using quantitative metrics and quality reviews by experts. RESULTS In clinical summarization, BARTcnn had the best performance (BERTscore = 0.8371), followed by LongT5booksum and LEDlegal. In the analysis using PubMed case reports, GPT3davinci demonstrated the best performance, followed by BARTcnn and then LEDbooksum (BERTscore = 0.894, 0.872, and 0.867, respectively). CONCLUSIONS AI NLP summarization models demonstrated great potential in summarizing longitudinal aneurysm reports, though none yet reached the level of quality required for clinical use. We found that the online GPT LLM outperformed the others; however, the BARTcnn model is potentially more useful because it can be implemented on-site. Future work to improve summarization, address other types of neuroimaging reports, and develop structured reports may allow NLP models to ease clinical workflow.
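BERTScore, the metric reported above, scores a candidate summary against a reference using contextual embeddings rather than exact word overlap. A minimal sketch with the open-source bert-score package (assumed tooling for illustration; the two sentences are invented placeholders, not study data):

```python
# pip install bert-score
from bert_score import score

candidates = ["Stable 4 mm left MCA aneurysm, unchanged across follow-up imaging."]
references = ["Serial imaging shows a 4 mm left middle cerebral artery aneurysm without interval growth."]

# Returns per-pair precision, recall, and F1 tensors; F1 is the value usually reported.
P, R, F1 = score(candidates, references, lang="en", verbose=False)
print(f"BERTScore F1: {F1.mean().item():.3f}")
```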
Affiliation(s)
- Aichi Chien
- From the Department of Radiological Science (A.C., H.T., B.J., K.N., N.S.), David Geffen School of Medicine at UCLA, Los Angeles, California
- Hubert Tang
- From the Department of Radiological Science (A.C., H.T., B.J., K.N., N.S.), David Geffen School of Medicine at UCLA, Los Angeles, California
- Bhavita Jagessar
- From the Department of Radiological Science (A.C., H.T., B.J., K.N., N.S.), David Geffen School of Medicine at UCLA, Los Angeles, California
- Kai-Wei Chang
- Department of Computer Science (K.C., N.P.), University of California, Los Angeles, Los Angeles, California
- Nanyun Peng
- Department of Computer Science (K.C., N.P.), University of California, Los Angeles, Los Angeles, California
- Kambiz Nael
- From the Department of Radiological Science (A.C., H.T., B.J., K.N., N.S.), David Geffen School of Medicine at UCLA, Los Angeles, California
- Noriko Salamon
- From the Department of Radiological Science (A.C., H.T., B.J., K.N., N.S.), David Geffen School of Medicine at UCLA, Los Angeles, California
117
Schmidt S, Zimmerer A, Cucos T, Feucht M, Navas L. Simplifying radiologic reports with natural language processing: a novel approach using ChatGPT in enhancing patient understanding of MRI results. Arch Orthop Trauma Surg 2024; 144:611-618. [PMID: 37950763] [DOI: 10.1007/s00402-023-05113-4]
Abstract
PURPOSE The aim of this prospective cohort study was to assess, through ratings by medical professionals, the factual accuracy, completeness of medical information, and potential harmfulness of incorrect conclusions in automatically generated texts of varying complexity produced using ChatGPT (1). Furthermore, patients without a medical background were asked to evaluate comprehensibility, information density, and conclusion possibilities (2). METHODS In the study, five different simplified versions of MRI findings of the knee of different complexity (A: simple, B: moderate, C: complex) were each created using ChatGPT. Subsequently, a group of four medical professionals (two orthopedic surgeons and two radiologists) and a group of 20 consecutive patients evaluated the created reports. For this purpose, all participants received a group of simplified reports (simple, moderate, and complex) at intervals of 1 week each for their respective evaluation using a specific questionnaire. Each questionnaire consisted of the original report, the simplified report, and a series of statements to assess the quality of the simplified reports. Participants were asked to rate their level of agreement on a five-point Likert scale. RESULTS The evaluation of the medical specialists showed that the findings produced were consistent in quality depending on their complexity. Factual correctness, reproduction of relevant information, and comprehensibility for patients were rated on average as "Agree". The question about possible harm resulted in an average of "Disagree". The evaluation of patients also revealed consistent quality of reports, depending on complexity. Simplicity of word choice and sentence structure was rated "Agree" on average, with significant differences between simple and complex findings (p = 0.0039) as well as between moderate and complex findings (p = 0.0222). Participants reported being significantly better at knowing what the text was about (p = 0.001) and drawing the correct conclusions the more simplified the report of findings was (p = 0.013829). The question of whether the text informed them as well as a healthcare professional was answered as "Neutral" across all findings. CONCLUSION By using ChatGPT, MRI reports can be simplified automatically with consistent quality so that the relevant information is understandable to patients. However, a report generated in this way does not replace a thorough discussion between specialist and patient.
Affiliation(s)
- Sebastian Schmidt
- Department of Orthopaedic and Trauma Surgery, Orthopädische Klinik Paulinenhilfe, Diakonieklinikum, Rosenbergstrasse 38, 70192, Stuttgart, Germany.
- Alexander Zimmerer
- Department of Orthopaedic and Trauma Surgery, Orthopädische Klinik Paulinenhilfe, Diakonieklinikum, Rosenbergstrasse 38, 70192, Stuttgart, Germany
- Department of Orthopaedics and Orthopaedic Surgery, University Medicine Greifswald, Ferdinand-Sauerbruch-Straße, 17475, Greifswald, Germany
- Tudor Cucos
- Department of Radiology, ViDia Christliche Kliniken Karlsruhe, Steinhäuser Straße 18, 76135, Karlsruhe, Germany
| | - Matthias Feucht
- Department of Orthopaedic and Trauma Surgery, Orthopädische Klinik Paulinenhilfe, Diakonieklinikum, Rosenbergstrasse 38, 70192, Stuttgart, Germany
| | - Luis Navas
- Department of Orthopaedic and Trauma Surgery, Orthopädische Klinik Paulinenhilfe, Diakonieklinikum, Rosenbergstrasse 38, 70192, Stuttgart, Germany
| |
Collapse
|
118
|
Fontenot J. Spotlight on Leadership: What Nurse Leaders Need to Know About Artificial Intelligence. J Nurs Adm 2024; 54:74-76. [PMID: 38261639 DOI: 10.1097/nna.0000000000001384] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/25/2024]
Abstract
Artificial intelligence (AI) is not a new concept. Since the 2022 release of a popular large language model, AI has become readily accessible to the general population, brought transformational shifts in healthcare, and created significant implications for nurse leaders. Specifically, AI has major implications in the area of evidence-based practice. Historically, new evidence takes years to reach the bedside. Nurse leaders are instrumental in closing the research-to-practice gap and, in doing so, promote optimal patient safety and care delivery methods. This article provides an overview of using AI in the context of nursing leadership in healthcare settings, including appropriate use cases. In addition, this article covers the ethical challenges of using AI in clinical settings.
Collapse
Affiliation(s)
- Justin Fontenot
- Author Affiliation: Associate Professor, School of Medicine, Tulane University and Editor-in-Chief, Teaching and Learning in Nursing, New Orleans, LA
| |
Collapse
|
119
|
Wang WH, Wang SY, Huang JY, Liu XD, Yang J, Liao M, Lu Q, Wu Z. An investigation study on the interpretation of ultrasonic medical reports using OpenAI's GPT-3.5-turbo model. JOURNAL OF CLINICAL ULTRASOUND : JCU 2024; 52:105-111. [PMID: 37930057 DOI: 10.1002/jcu.23590] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/31/2023] [Revised: 09/23/2023] [Accepted: 10/04/2023] [Indexed: 11/07/2023]
Abstract
OBJECTIVES Ultrasound medical reports are an important means of diagnosing diseases and assessing treatment effectiveness. However, their professional terms and complex sentences often make them difficult for ordinary people to understand. Therefore, this study explores the clinical value of using artificial intelligence systems based on ChatGPT to interpret ultrasound medical reports. METHODS In this study, a combination of online and offline questionnaires was used to survey both physicians and non-medical individuals. The questionnaires evaluated ChatGPT's interpretation of ultrasound reports from the perspectives of professional accuracy and comprehensibility, and the results were analyzed using Excel spreadsheets. Additionally, a portion of the research content was evaluated in the questionnaire using a 5-point Likert scale. RESULTS According to the survey results, 67.4% of surveyed doctors believe that using ChatGPT for interpreting ultrasound medical reports can help improve work efficiency. At the same time, 69.72% of non-medical professionals believe it is necessary to enhance their understanding of medical ultrasound reports through ChatGPT interpretation, and 62.58% support the application of ChatGPT to ultrasound medical reports. The non-medical group's understanding of ultrasound medical reports significantly improved (p < 0.01) after ChatGPT was introduced. However, 67.49% of the general public are concerned that ChatGPT's imperfect functionality may produce misleading information. This reflects that public trust in the new technology is not yet high, with additional worries about possible privacy leaks and security issues with ChatGPT technology. CONCLUSIONS The higher acceptance and support of non-medical individuals for the interpretation of medical reports by ChatGPT might be due to the system's natural language processing abilities, which allow them to better understand and evaluate report contents. However, the expertise and experience of physicians remain irreplaceable. This suggests that the ChatGPT-based ultrasound medical report interpretation system has certain clinical value and application prospects, but further optimization is necessary to address its shortcomings in data quality and professionalism. This study provides a reference and inspiration for promoting the application and development of ultrasound technology and artificial intelligence systems in the medical field.
Collapse
Affiliation(s)
- Wen Hui Wang
- Department of Ultrasound, West China Hospital of Sichuan University, Chengdu, China
| | - Shi Yu Wang
- School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
| | - Jia Yan Huang
- Department of Ultrasound, West China Hospital of Sichuan University, Chengdu, China
| | - Xiao di Liu
- Department of Ultrasound, West China Hospital of Sichuan University, Chengdu, China
| | - Jie Yang
- Department of Ultrasound, West China Hospital of Sichuan University, Chengdu, China
| | - Min Liao
- Department of Ultrasound, West China Hospital of Sichuan University, Chengdu, China
| | - Qiang Lu
- Department of Ultrasound, West China Hospital of Sichuan University, Chengdu, China
| | - Zhe Wu
- School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
| |
Collapse
|
120
|
Wagner MW, Ertl-Wagner BB. Accuracy of Information and References Using ChatGPT-3 for Retrieval of Clinical Radiological Information. Can Assoc Radiol J 2024; 75:69-73. [PMID: 37078489 DOI: 10.1177/08465371231171125] [Citation(s) in RCA: 20] [Impact Index Per Article: 20.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 04/21/2023] Open
Abstract
Purpose: To assess the accuracy of answers provided by ChatGPT-3 when prompted with questions from the daily routine of radiologists and to evaluate the text response when ChatGPT-3 was prompted to provide references for a given answer. Methods: ChatGPT-3 (San Francisco, OpenAI) is an artificial intelligence chatbot based on a large language model (LLM) that has been designed to generate human-like text. A total of 88 questions were submitted to ChatGPT-3 using textual prompts. These 88 questions were equally dispersed across 8 subspecialty areas of radiology. The responses provided by ChatGPT-3 were assessed for correctness by cross-checking them with peer-reviewed, PubMed-listed references. In addition, the references provided by ChatGPT-3 were evaluated for authenticity. Results: A total of 59 of 88 responses (67%) to radiological questions were correct, while 29 responses (33%) had errors. Out of 343 references provided, only 124 references (36.2%) were available through internet search, while 219 references (63.8%) appeared to be generated by ChatGPT-3. When examining the 124 identified references, only 47 references (37.9%) were considered to provide enough background to correctly answer 24 questions (37.5%). Conclusion: In this pilot study, ChatGPT-3 provided correct responses to questions from the daily clinical routine of radiologists in only about two-thirds of cases, while the remainder of responses contained errors. The majority of provided references were not found, and only a minority of the provided references contained the correct information to answer the question. Caution is advised when using ChatGPT-3 to retrieve radiological information.
Collapse
Affiliation(s)
- Matthias W Wagner
- Department of Diagnostic Imaging, Division of Neuroradiology, The Hospital for Sick Children, Toronto, Canada
- Department of Medical Imaging, University of Toronto, Toronto, Canada
| | - Birgit B Ertl-Wagner
- Department of Diagnostic Imaging, Division of Neuroradiology, The Hospital for Sick Children, Toronto, Canada
- Department of Medical Imaging, University of Toronto, Toronto, Canada
| |
Collapse
|
121
|
Sebro R. Advancing Diagnostics and Patient Care: The Role of Biomarkers in Radiology. Semin Musculoskelet Radiol 2024; 28:3-13. [PMID: 38330966 DOI: 10.1055/s-0043-1776426] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/10/2024]
Abstract
The integration of biomarkers into medical practice has revolutionized the field of radiology, allowing for enhanced diagnostic accuracy, personalized treatment strategies, and improved patient care outcomes. This review offers radiologists a comprehensive understanding of the diverse applications of biomarkers in medicine. By elucidating the fundamental concepts, challenges, and recent advancements in biomarker utilization, it will serve as a bridge between the disciplines of radiology and epidemiology. Through an exploration of various biomarker types, such as imaging biomarkers, molecular biomarkers, and genetic markers, I outline their roles in disease detection, prognosis prediction, and therapeutic monitoring. I also discuss the significance of robust study designs, blinding, power and sample size calculations, performance metrics, and statistical methodologies in biomarker research. By fostering collaboration between radiologists, statisticians, and epidemiologists, I hope to accelerate the translation of biomarker discoveries into clinical practice, ultimately leading to improved patient care.
Collapse
Affiliation(s)
- Ronnie Sebro
- Department of Radiology, Center for Augmented Intelligence, Mayo Clinic, Jacksonville, Florida
- Department of Biostatistics, Center for Quantitative Health Sciences, Mayo Clinic, Jacksonville, Florida
- Department of Orthopedic Surgery, Mayo Clinic, Jacksonville, Florida
| |
Collapse
|
122
|
Ray PP, Majumder P. ChatGPT in Radiology: A Deeper Look Into its Limitations and Potential Pathways for Improvement. Can Assoc Radiol J 2024; 75:202. [PMID: 37171079 DOI: 10.1177/08465371231177674] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 05/13/2023] Open
Affiliation(s)
| | - Poulami Majumder
- Maulana Abul Kalam Azad University of Technology, Kolkata, India
| |
Collapse
|
123
|
Eppler M, Ganjavi C, Ramacciotti LS, Piazza P, Rodler S, Checcucci E, Gomez Rivas J, Kowalewski KF, Belenchón IR, Puliatti S, Taratkin M, Veccia A, Baekelandt L, Teoh JYC, Somani BK, Wroclawski M, Abreu A, Porpiglia F, Gill IS, Murphy DG, Canes D, Cacciamani GE. Awareness and Use of ChatGPT and Large Language Models: A Prospective Cross-sectional Global Survey in Urology. Eur Urol 2024; 85:146-153. [PMID: 37926642 DOI: 10.1016/j.eururo.2023.10.014] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/29/2023] [Revised: 09/27/2023] [Accepted: 10/24/2023] [Indexed: 11/07/2023]
Abstract
BACKGROUND Since its release in November 2022, ChatGPT has captivated society and shown potential for various aspects of health care. OBJECTIVE To investigate potential use of ChatGPT, a large language model (LLM), in urology by gathering opinions from urologists worldwide. DESIGN, SETTING, AND PARTICIPANTS An open web-based survey was distributed via social media and e-mail chains to urologists between April 20, 2023 and May 5, 2023. Participants were asked to answer questions related to their knowledge and experience with artificial intelligence, as well as their opinions of potential use of ChatGPT/LLMs in research and clinical practice. OUTCOME MEASUREMENTS AND STATISTICAL ANALYSIS Data are reported as the mean and standard deviation for continuous variables, and the frequency and percentage for categorical variables. Charts and tables are used as appropriate, with descriptions of the chart types and the measures used. The data are reported in accordance with the Checklist for Reporting Results of Internet E-Surveys (CHERRIES). RESULTS AND LIMITATIONS A total of 456 individuals completed the survey (64% completion rate). Nearly half (47.7%) reported that they use ChatGPT/LLMs in their academic practice, with fewer using the technology in clinical practice (19.8%). More than half (62.2%) believe there are potential ethical concerns when using ChatGPT for scientific or academic writing, and 53% reported that they have experienced limitations when using ChatGPT in academic practice. CONCLUSIONS Urologists recognise the potential of ChatGPT/LLMs in research but have concerns regarding ethics and patient acceptance. There is a desire for regulations and guidelines to ensure appropriate use. In addition, measures should be taken to establish rules and guidelines to maximise safety and efficiency when using this novel technology. PATIENT SUMMARY A survey asked 456 urologists from around the world about using an artificial intelligence tool called ChatGPT in their work. Almost half of them use ChatGPT for research, but not many use it for patient care. The responders think ChatGPT could be helpful, but they worry about problems like ethics and want rules to make sure it's used safely.
Collapse
Affiliation(s)
- Michael Eppler
- USC Institute of Urology and Catherine and Joseph Aresty Department of Urology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA; AI Center, USC Institute of Urology, University of Southern California, Los Angeles, CA, USA
| | - Conner Ganjavi
- USC Institute of Urology and Catherine and Joseph Aresty Department of Urology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA; AI Center, USC Institute of Urology, University of Southern California, Los Angeles, CA, USA
| | - Lorenzo Storino Ramacciotti
- USC Institute of Urology and Catherine and Joseph Aresty Department of Urology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA; AI Center, USC Institute of Urology, University of Southern California, Los Angeles, CA, USA
| | - Pietro Piazza
- Division of Urology, IRCCS Azienda Ospedaliero-Universitaria di Bologna, Bologna, Italy
| | - Severin Rodler
- Department of Urology, Klinikum der Universität München, Munich, Germany
| | - Enrico Checcucci
- Department of Surgery, FPO-IRCCS Candiolo Cancer Institute, Candiolo, Italy
| | - Juan Gomez Rivas
- Department of Urology, Clinico San Carlos University Hospital, Madrid, Spain
| | - Karl F Kowalewski
- Department of Urology, University Medical Center Mannheim, Heidelberg University, Mannheim, Germany
| | - Ines Rivero Belenchón
- Urology and Nephrology Department, Virgen del Rocío University Hospital, Seville, Spain
| | - Stefano Puliatti
- Urology Department, University of Modena and Reggio Emilia, Modena, Italy
| | - Mark Taratkin
- Institute for Urology and Reproductive Health, Sechenov University, Moscow, Russia
| | - Alessandro Veccia
- Department of Urology, Azienda Ospedaliera Universitaria Integrata Verona, Verona, Italy
| | - Loïc Baekelandt
- Department of Urology, University Hospitals Leuven, Leuven, Belgium
| | - Jeremy Y-C Teoh
- Department of Surgery, S.H. Ho Urology Centre, The Chinese University of Hong Kong, Hong Kong, China
| | - Bhaskar K Somani
- University Hospital Southampton NHS Foundation Trust, Southampton, UK
| | - Marcelo Wroclawski
- Hospital Israelita Albert Einstein, São Paulo, Brazil; Beneficência Portuguesa de São Paulo, São Paulo, Brazil; Faculdade de Medicina do ABC, Santo Andre, Brazil
| | - Andre Abreu
- USC Institute of Urology and Catherine and Joseph Aresty Department of Urology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA; AI Center, USC Institute of Urology, University of Southern California, Los Angeles, CA, USA
| | | | - Inderbir S Gill
- USC Institute of Urology and Catherine and Joseph Aresty Department of Urology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA; AI Center, USC Institute of Urology, University of Southern California, Los Angeles, CA, USA
| | - Declan G Murphy
- Division of Cancer Surgery, Peter MacCallum Cancer Centre, University of Melbourne, Melbourne, Australia
| | - David Canes
- Division of Urology, Lahey Hospital & Medical Center, Burlington, MA, USA
| | - Giovanni E Cacciamani
- USC Institute of Urology and Catherine and Joseph Aresty Department of Urology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA; AI Center, USC Institute of Urology, University of Southern California, Los Angeles, CA, USA.
| |
Collapse
|
124
|
Brady AP, Allen B, Chong J, Kotter E, Kottler N, Mongan J, Oakden-Rayner L, Pinto Dos Santos D, Tang A, Wald C, Slavotinek J. Developing, purchasing, implementing and monitoring AI tools in radiology: Practical considerations. A multi-society statement from the ACR, CAR, ESR, RANZCR & RSNA. J Med Imaging Radiat Oncol 2024; 68:7-26. [PMID: 38259140 DOI: 10.1111/1754-9485.13612] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/23/2023] [Accepted: 11/23/2023] [Indexed: 01/24/2024]
Abstract
Artificial Intelligence (AI) carries the potential for unprecedented disruption in radiology, with possible positive and negative consequences. The integration of AI in radiology holds the potential to revolutionize healthcare practices by advancing diagnosis, quantification, and management of multiple medical conditions. Nevertheless, the ever-growing availability of AI tools in radiology highlights an increasing need to critically evaluate claims for its utility and to differentiate safe product offerings from potentially harmful, or fundamentally unhelpful ones. This multi-society paper, presenting the views of Radiology Societies in the USA, Canada, Europe, Australia, and New Zealand, defines the potential practical problems and ethical issues surrounding the incorporation of AI into radiological practice. In addition to delineating the main points of concern that developers, regulators, and purchasers of AI tools should consider prior to their introduction into clinical practice, this statement also suggests methods to monitor their stability and safety in clinical use, and their suitability for possible autonomous function. This statement is intended to serve as a useful summary of the practical issues which should be considered by all parties involved in the development of radiology AI resources, and their implementation as clinical tools.
Collapse
Affiliation(s)
| | - Bibb Allen
- Department of Radiology, Grandview Medical Center, Birmingham, Alabama, USA
- American College of Radiology Data Science Institute, Reston, Virginia, USA
| | - Jaron Chong
- Department of Medical Imaging, Schulich School of Medicine and Dentistry, Western University, London, Ontario, Canada
| | - Elmar Kotter
- Department of Diagnostic and Interventional Radiology, Medical Center, Faculty of Medicine, University of Freiburg, Freiburg, Germany
| | - Nina Kottler
- Radiology Partners, El Segundo, California, USA
- Stanford Center for Artificial Intelligence in Medicine & Imaging, Palo Alto, California, USA
| | - John Mongan
- Department of Radiology and Biomedical Imaging, University of California, San Francisco, San Francisco, California, USA
| | - Lauren Oakden-Rayner
- Australian Institute for Machine Learning, University of Adelaide, Adelaide, South Australia, Australia
| | - Daniel Pinto Dos Santos
- Department of Radiology, University Hospital of Cologne, Cologne, Germany
- Department of Radiology, University Hospital of Frankfurt, Frankfurt, Germany
| | - An Tang
- Department of Radiology, Radiation Oncology, and Nuclear Medicine, Université de Montréal, Montreal, Quebec, Canada
| | - Christoph Wald
- Department of Radiology, Lahey Hospital & Medical Center, Burlington, Massachusetts, USA
- Tufts University Medical School, Boston, Massachusetts, USA
- Commission On Informatics, and Member, Board of Chancellors, American College of Radiology, Reston, Virginia, USA
| | - John Slavotinek
- South Australia Medical Imaging, Flinders Medical Centre Adelaide, Adelaide, South Australia, Australia
- College of Medicine and Public Health, Flinders University, Adelaide, South Australia, Australia
| |
Collapse
|
125
|
Büttner M, Leser U, Schneider L, Schwendicke F. Natural Language Processing: Chances and Challenges in Dentistry. J Dent 2024; 141:104796. [PMID: 38072335 DOI: 10.1016/j.jdent.2023.104796] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/17/2023] [Revised: 11/25/2023] [Accepted: 11/27/2023] [Indexed: 12/21/2023] Open
Abstract
INTRODUCTION Natural language processing (NLP) is an intersection of Computer Science and Linguistics that aims to enable machines to process and understand human language. Here we summarize applications and limitations of NLP in dentistry. DATA AND SOURCES Narrative review. FINDINGS NLP has evolved increasingly fast. For the dental domain, relevant NLP applications are text classification (e.g., symptom classification) and natural language generation and understanding (e.g., clinical chatbots assisting professionals in office work and patient communication). Analyzing large quantities of text will allow a better understanding of diseases and their trajectories and support more precise and personalized care. Speech recognition systems may serve as virtual assistants and facilitate automated documentation. However, to date, NLP has rarely been applied in dentistry. Existing research focuses mainly on rule-based solutions for narrow tasks. Technologies such as Recurrent Neural Networks and Transformers have been shown to surpass the language processing capabilities of such rule-based solutions in many fields, but are data-hungry (i.e., rely on large amounts of training data), which limits their application in the dental domain at present. Technologies such as federated or transfer learning or data-sharing concepts may allow this limitation to be overcome, while challenges in terms of explainability, reproducibility, generalizability, and evaluation of NLP in dentistry remain to be resolved before such technologies can be approved in medical devices and services. CONCLUSIONS NLP will become a cornerstone of a number of applications in dentistry. The community is called to action to address the current limitations and foster reliable, high-quality dental NLP. CLINICAL SIGNIFICANCE NLP for text classification (e.g., dental symptom classification) and language generation and understanding (e.g., clinical chatbots, speech recognition) will support administrative tasks in dentistry, provide deeper insights for clinicians, and support research and education.
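As a concrete illustration of the text-classification use case named in this abstract (e.g., symptom classification), the following Python sketch trains a minimal TF-IDF plus logistic-regression baseline. The example sentences and labels are hypothetical, and a real dental NLP system would require curated domain-specific data and evaluation.

```python
# A minimal sketch of symptom text classification, assuming hypothetical
# example sentences and labels; not a production dental NLP system.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

train_texts = [
    "sharp pain when chewing on the lower left molar",
    "gums bleed while brushing",
    "tooth is sensitive to cold drinks",
    "swelling and redness around the wisdom tooth",
]
train_labels = ["pulpitis", "gingivitis", "hypersensitivity", "pericoronitis"]

# Vectorize free-text symptom descriptions and fit a linear classifier
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
model.fit(train_texts, train_labels)

print(model.predict(["my tooth hurts when I drink something cold"]))
```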
Collapse
Affiliation(s)
- Martha Büttner
- Department of Oral Diagnostics, Digital Health and Health Services Research, Charité - Universitätsmedizin Berlin, Germany.
| | - Ulf Leser
- Department of Computer Science, Humboldt-Universität zu Berlin, Berlin, Germany
| | - Lisa Schneider
- Department of Oral Diagnostics, Digital Health and Health Services Research, Charité - Universitätsmedizin Berlin, Germany
| | - Falk Schwendicke
- Clinic for Operative, Preventive and Pediatric Dentistry and Periodontology, Ludwig-Maximilians-University, Munich, Germany
| |
Collapse
|
126
|
Alanezi F. Assessing the Effectiveness of ChatGPT in Delivering Mental Health Support: A Qualitative Study. J Multidiscip Healthc 2024; 17:461-471. [PMID: 38314011 PMCID: PMC10838501 DOI: 10.2147/jmdh.s447368] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/30/2023] [Accepted: 01/08/2024] [Indexed: 02/06/2024] Open
Abstract
Background Artificial Intelligence (AI) applications are widely researched for their potential to effectively improve healthcare operations and disease management. However, the research trend shows that these applications also have significant negative implications for service delivery. Purpose To assess the use of ChatGPT for mental health support. Methods Due to the novelty and unfamiliarity of the ChatGPT technology, a quasi-experimental design was chosen for this study. Outpatients from a public hospital were included in the sample. A two-week experiment followed by semi-structured interviews was conducted in which participants used ChatGPT for mental health support. Semi-structured interviews were conducted with 24 individuals with mental health conditions. Results Eight positive factors (psychoeducation, emotional support, goal setting and motivation, referral and resource information, self-assessment and monitoring, cognitive behavioral therapy, crisis interventions, and psychotherapeutic exercises) and four negative factors (ethical and legal considerations, accuracy and reliability, limited assessment capabilities, and cultural and linguistic considerations) were associated with the use of ChatGPT for mental health support. Conclusion It is important to carefully consider the ethical, reliability, accuracy, and legal challenges and to develop appropriate strategies to mitigate them in order to ensure the safe and effective use of AI-based applications like ChatGPT in mental health support.
Collapse
Affiliation(s)
- Fahad Alanezi
- College of Business Administration, Department Management Information Systems, Imam Abdulrahman Bin Faisal University, Dammam, 31441, Saudi Arabia
| |
Collapse
|
127
|
Moore R, Al-Tamimi AK, Freeman E. Investigating the Potential of a Conversational Agent (Phyllis) to Support Adolescent Health and Overcome Barriers to Physical Activity: Co-Design Study. JMIR Form Res 2024; 8:e51571. [PMID: 38294857 PMCID: PMC10867744 DOI: 10.2196/51571] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/03/2023] [Revised: 11/08/2023] [Accepted: 11/22/2023] [Indexed: 02/01/2024] Open
Abstract
BACKGROUND Conversational agents (CAs) are a promising solution to support people in improving physical activity (PA) behaviors. However, there is a lack of CAs targeted at adolescents that aim to provide support to overcome barriers to PA. This study reports the results of the co-design, development, and evaluation of a prototype CA called "Phyllis" to support adolescents in overcoming barriers to PA with the aim of improving PA behaviors. The study presents one of the first theory-driven CAs that use existing research, a theoretical framework, and a behavior change model. OBJECTIVE The aim of the study is to use a mixed methods approach to investigate the potential of a CA to support adolescents in overcoming barriers to PA and enhance their confidence and motivation to engage in PA. METHODS The methodology involved co-designing with 8 adolescents to create a relational and persuasive CA with a suitable persona and dialogue. The CA was evaluated to determine its acceptability, usability, and effectiveness, with 46 adolescents participating in the study via a web-based survey. RESULTS The co-design participants were students aged 11 to 13 years, with a sex distribution of 56% (5/9) female and 44% (4/9) male, representing diverse ethnic backgrounds. Participants reported 37 specific barriers to PA, and the most common barriers included a "lack of confidence," "fear of failure," and a "lack of motivation." The CA's persona, named "Phyllis," was co-designed with input from the students, reflecting their preferences for a friendly, understanding, and intelligent personality. Users engaged in 61 conversations with Phyllis and reported a positive user experience, and 73% of them expressed a definite intention to use the fully functional CA in the future, with a net promoter score indicating a high likelihood of recommendation. Phyllis also performed well, being able to recognize a range of different barriers to PA. The CA's persuasive capacity was evaluated in modules focusing on confidence and motivation, with a significant increase in students' agreement in feeling confident and motivated to engage in PA after interacting with Phyllis. Adolescents also expect to have a personalized experience and be able to personalize all aspects of the CA. CONCLUSIONS The results showed high acceptability and a positive user experience, indicating the CA's potential. Promising outcomes were observed, with increasing confidence and motivation for PA. Further research and development are needed to create further interventions to address other barriers to PA and assess long-term behavior change. Addressing concerns regarding bias and privacy is crucial for achieving acceptability in the future. The CA's potential extends to health care systems and multimodal support, providing valuable insights for designing digital health interventions including tackling global inactivity issues among adolescents.
Collapse
Affiliation(s)
- Richard Moore
- Sheffield Hallam University, Sport and Physical Activity Research Centre / Advanced Wellbeing Research Centre, Sheffield, United Kingdom
| | | | - Elizabeth Freeman
- Department of Psychology, Sociology & Politics, Sheffield Hallam University, Sheffield, United Kingdom
| |
Collapse
|
128
|
Brady AP, Allen B, Chong J, Kotter E, Kottler N, Mongan J, Oakden-Rayner L, Dos Santos DP, Tang A, Wald C, Slavotinek J. Developing, purchasing, implementing and monitoring AI tools in radiology: practical considerations. A multi-society statement from the ACR, CAR, ESR, RANZCR & RSNA. Insights Imaging 2024; 15:16. [PMID: 38246898 PMCID: PMC10800328 DOI: 10.1186/s13244-023-01541-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/23/2024] Open
Abstract
Artificial Intelligence (AI) carries the potential for unprecedented disruption in radiology, with possible positive and negative consequences. The integration of AI in radiology holds the potential to revolutionize healthcare practices by advancing diagnosis, quantification, and management of multiple medical conditions. Nevertheless, the ever-growing availability of AI tools in radiology highlights an increasing need to critically evaluate claims for its utility and to differentiate safe product offerings from potentially harmful, or fundamentally unhelpful ones. This multi-society paper, presenting the views of Radiology Societies in the USA, Canada, Europe, Australia, and New Zealand, defines the potential practical problems and ethical issues surrounding the incorporation of AI into radiological practice. In addition to delineating the main points of concern that developers, regulators, and purchasers of AI tools should consider prior to their introduction into clinical practice, this statement also suggests methods to monitor their stability and safety in clinical use, and their suitability for possible autonomous function. This statement is intended to serve as a useful summary of the practical issues which should be considered by all parties involved in the development of radiology AI resources, and their implementation as clinical tools. Key points: • The incorporation of artificial intelligence (AI) in radiological practice demands increased monitoring of its utility and safety. • Cooperation between developers, clinicians, and regulators will allow all involved to address ethical issues and monitor AI performance. • AI can fulfil its promise to advance patient well-being if all steps from development to integration in healthcare are rigorously evaluated.
Collapse
Affiliation(s)
| | - Bibb Allen
- Department of Radiology, Grandview Medical Center, Birmingham, AL, USA
- American College of Radiology Data Science Institute, Reston, VA, USA
| | - Jaron Chong
- Department of Medical Imaging, Schulich School of Medicine and Dentistry, Western University, London, ON, Canada
| | - Elmar Kotter
- Department of Diagnostic and Interventional Radiology, Medical Center, Faculty of Medicine, University of Freiburg, Freiburg, Germany
| | - Nina Kottler
- Radiology Partners, El Segundo, CA, USA
- Stanford Center for Artificial Intelligence in Medicine & Imaging, Palo Alto, CA, USA
| | - John Mongan
- Department of Radiology and Biomedical Imaging, University of California, San Francisco, USA
| | - Lauren Oakden-Rayner
- Australian Institute for Machine Learning, University of Adelaide, Adelaide, Australia
| | - Daniel Pinto Dos Santos
- Department of Radiology, University Hospital of Cologne, Cologne, Germany
- Department of Radiology, University Hospital of Frankfurt, Frankfurt, Germany
| | - An Tang
- Department of Radiology, Radiation Oncology, and Nuclear Medicine, Université de Montréal, Montréal, Québec, Canada
| | - Christoph Wald
- Department of Radiology, Lahey Hospital & Medical Center, Burlington, MA, USA
- Tufts University Medical School, Boston, MA, USA
- Commission On Informatics, and Member, Board of Chancellors, American College of Radiology, Virginia, USA
| | - John Slavotinek
- South Australia Medical Imaging, Flinders Medical Centre Adelaide, Adelaide, Australia
- College of Medicine and Public Health, Flinders University, Adelaide, Australia
| |
Collapse
|
129
|
Liao M, Zhu K, Wang G. Can human-machine feedback in a smart learning environment enhance learners' learning performance? A meta-analysis. Front Psychol 2024; 14:1288503. [PMID: 38268803 PMCID: PMC10805823 DOI: 10.3389/fpsyg.2023.1288503] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/04/2023] [Accepted: 12/11/2023] [Indexed: 01/26/2024] Open
Abstract
Objective Human-machine feedback in a smart learning environment can influence learners' learning styles, ability enhancement, and affective interactions. However, the findings of many empirical studies are contradictory as to whether it reliably improves learning performance and learning processes. This study aimed to analyze the effect of human-machine feedback on learning performance and the potential boundary conditions that produce the effect in a smart learning environment. Methods Web of Science, EBSCO, PsycINFO, and Science Direct were searched for publications from 2010 to 2022. We included randomized controlled trials with learning performance as the outcome. The random-effects model was used in the meta-analysis. Main effect tests and heterogeneity tests were used to evaluate the effect of the human-machine feedback mechanism on learning performance, and the boundary conditions of the effect were tested through moderation analyses. Moreover, the validity of the meta-analysis was supported by a publication bias test. Results Out of 35 articles identified, 2,222 participants were included in this study. Human-machine interaction feedback had significant effects on learners' learning process (d = 0.594, k = 26) and learning outcomes (d = 0.407, k = 42). Also, the positive effects of human-machine interaction feedback were moderated by the direction of feedback, the form of feedback, and the type of feedback technique. Conclusion To enhance learning performance through human-machine interactive feedback, we should focus on using two-way and multi-subject feedback. Technology that can provide emotional feedback and feedback loops should be used as a priority. Attention should also be paid to the feedback process and mechanism, avoiding an increase in students' dependence on machines and strengthening learners' agency within the feedback mechanism.
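To make the random-effects pooling step described in this abstract concrete, here is a minimal Python sketch using the DerSimonian-Laird estimator. The per-study effect sizes and variances are invented for illustration and are not the data analyzed in the study above.

```python
# Minimal sketch of random-effects pooling (DerSimonian-Laird), assuming
# hypothetical per-study standardized mean differences (d) and variances.
import numpy as np

d = np.array([0.45, 0.62, 0.30, 0.75])      # per-study effect sizes (Cohen's d)
v = np.array([0.020, 0.035, 0.015, 0.050])  # per-study sampling variances

w_fixed = 1.0 / v                            # inverse-variance (fixed-effect) weights
d_fixed = np.sum(w_fixed * d) / np.sum(w_fixed)

# Between-study heterogeneity (tau^2) via the DerSimonian-Laird estimator
Q = np.sum(w_fixed * (d - d_fixed) ** 2)
df = len(d) - 1
c = np.sum(w_fixed) - np.sum(w_fixed ** 2) / np.sum(w_fixed)
tau2 = max(0.0, (Q - df) / c)

w_random = 1.0 / (v + tau2)                  # random-effects weights
d_random = np.sum(w_random * d) / np.sum(w_random)
se_random = np.sqrt(1.0 / np.sum(w_random))

print(f"pooled d (random effects) = {d_random:.3f} +/- {1.96 * se_random:.3f}")
```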
Collapse
Affiliation(s)
- Mengyi Liao
- School of Education, Pingdingshan University, Pingdingshan, Henan, China
| | - Kaige Zhu
- School of Education, Pingdingshan University, Pingdingshan, Henan, China
| | - Guangshuai Wang
- National Engineering Research Center of Educational Big Data, Faculty of Artificial Intelligence in Education, Central China Normal University, Wuhan, Hubei, China
| |
Collapse
|
130
|
Farhat F, Silva ES, Hassani H, Madsen DØ, Sohail SS, Himeur Y, Alam MA, Zafar A. The scholarly footprint of ChatGPT: a bibliometric analysis of the early outbreak phase. Front Artif Intell 2024; 6:1270749. [PMID: 38249789 PMCID: PMC10797012 DOI: 10.3389/frai.2023.1270749] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/01/2023] [Accepted: 12/08/2023] [Indexed: 01/23/2024] Open
Abstract
This paper presents a comprehensive analysis of the scholarly footprint of ChatGPT, an AI language model, using bibliometric and scientometric methods. The study zooms in on the early outbreak phase from when ChatGPT was launched in November 2022 to early June 2023. It aims to understand the evolution of research output, citation patterns, collaborative networks, application domains, and future research directions related to ChatGPT. By retrieving data from the Scopus database, 533 relevant articles were identified for analysis. The findings reveal the prominent publication venues, influential authors, and countries contributing to ChatGPT research. Collaborative networks among researchers and institutions are visualized, highlighting patterns of co-authorship. The application domains of ChatGPT, such as customer support and content generation, are examined. Moreover, the study identifies emerging keywords and potential research areas for future exploration. The methodology employed includes data extraction, bibliometric analysis using various indicators, and visualization techniques such as Sankey diagrams. The analysis provides valuable insights into ChatGPT's early footprint in academia and offers researchers guidance for further advancements. This study stimulates discussions, collaborations, and innovations to enhance ChatGPT's capabilities and impact across domains.
Collapse
Affiliation(s)
- Faiza Farhat
- Department of Zoology, Aligarh Muslim University, Aligarh, India
| | - Emmanuel Sirimal Silva
- Department of Economics and Law, Glasgow School for Business and Society, Glasgow Caledonian University, Glasgow, United Kingdom
| | - Hossein Hassani
- The Research Institute of Energy Management and Planning (RIEMP), University of Tehran, Tehran, Iran
| | - Dag Øivind Madsen
- USN School of Business, University of South-Eastern Norway, Hønefoss, Norway
| | - Shahab Saquib Sohail
- Department of Computer Science and Engineering, School of Engineering Sciences and Technology, Jamia Hamdard, New Delhi, India
| | - Yassine Himeur
- College of Engineering and Information Technology, University of Dubai, Dubai, United Arab Emirates
| | - M. Afshar Alam
- Department of Computer Science and Engineering, School of Engineering Sciences and Technology, Jamia Hamdard, New Delhi, India
| | - Aasim Zafar
- Department of Computer Science, Aligarh Muslim University, Aligarh, India
| |
Collapse
|
131
|
Jain N, Gottlich C, Fisher J, Campano D, Winston T. Assessing ChatGPT's orthopedic in-service training exam performance and applicability in the field. J Orthop Surg Res 2024; 19:27. [PMID: 38167093 PMCID: PMC10762835 DOI: 10.1186/s13018-023-04467-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 08/28/2023] [Accepted: 12/12/2023] [Indexed: 01/05/2024] Open
Abstract
BACKGROUND ChatGPT has gained widespread attention for its ability to understand and provide human-like responses to inputs. However, few works have focused on its use in Orthopedics. This study assessed ChatGPT's performance on the Orthopedic In-Service Training Exam (OITE) and evaluated its decision-making process to determine whether adoption as a resource in the field is practical. METHODS ChatGPT's performance on three OITE exams was evaluated through inputting multiple choice questions. Questions were classified by their orthopedic subject area. Yearly OITE technical reports were used to gauge scores against those of resident physicians. ChatGPT's rationales were compared with testmaker explanations using six different groups denoting answer accuracy and logic consistency. Variables were analyzed using contingency table construction and Chi-squared analyses. RESULTS Of 635 questions, 360 were usable as inputs (56.7%). ChatGPT-3.5 scored 55.8%, 47.7%, and 54% for the years 2020, 2021, and 2022, respectively. Of 190 correct outputs, 179 provided a consistent logic (94.2%). Of 170 incorrect outputs, 133 provided an inconsistent logic (78.2%). Significant associations were found between test topic and correct answer (p = 0.011), and between type of logic used and tested topic (p < 0.001). Basic Science and Sports had adjusted residuals greater than 1.96. Basic Science and correct, no logic; Basic Science and incorrect, inconsistent logic; Sports and correct, no logic; and Sports and incorrect, inconsistent logic; had adjusted residuals greater than 1.96. CONCLUSIONS Based on annual OITE technical reports for resident physicians, ChatGPT-3.5 performed around the PGY-1 level. When answering correctly, it displayed congruent reasoning with testmakers. When answering incorrectly, it exhibited some understanding of the correct answer. It outperformed in Basic Science and Sports, likely due to its ability to output rote facts. These findings suggest that it lacks the fundamental capabilities to be a comprehensive tool in Orthopedic Surgery in its current form. LEVEL OF EVIDENCE II.
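For readers unfamiliar with the adjusted-residual criterion cited in this abstract (values beyond ±1.96 flag cells that drive an association), the following Python sketch runs a chi-squared test on a topic-by-correctness contingency table and computes adjusted residuals. The counts are hypothetical, not those from the study.

```python
# Hedged sketch of a contingency-table analysis with adjusted residuals,
# assuming hypothetical counts (rows: topics; columns: correct / incorrect).
import numpy as np
from scipy.stats import chi2_contingency

table = np.array([
    [30, 10],     # e.g., Basic Science
    [28, 12],     # e.g., Sports
    [132, 148],   # e.g., all other topics combined
])

chi2, p, dof, expected = chi2_contingency(table)

# Adjusted (standardized) residuals; |value| > 1.96 marks influential cells
n = table.sum()
row_tot = table.sum(axis=1, keepdims=True)
col_tot = table.sum(axis=0, keepdims=True)
adj_resid = (table - expected) / np.sqrt(
    expected * (1 - row_tot / n) * (1 - col_tot / n)
)

print(f"chi2 = {chi2:.2f}, p = {p:.4f}, dof = {dof}")
print("adjusted residuals:\n", np.round(adj_resid, 2))
```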
Collapse
Affiliation(s)
- Neil Jain
- Department of Orthopedic Surgery, Texas Tech University Health Sciences Center Lubbock, 3601 4th St, Lubbock, TX, 79430, USA.
| | - Caleb Gottlich
- Department of Orthopedic Surgery, Texas Tech University Health Sciences Center Lubbock, 3601 4th St, Lubbock, TX, 79430, USA
| | - John Fisher
- Department of Orthopedic Surgery, Texas Tech University Health Sciences Center Lubbock, 3601 4th St, Lubbock, TX, 79430, USA
| | - Dominic Campano
- Department of Orthopedic Surgery, Texas Tech University Health Sciences Center Lubbock, 3601 4th St, Lubbock, TX, 79430, USA
| | - Travis Winston
- Department of Orthopedic Surgery, Texas Tech University Health Sciences Center Lubbock, 3601 4th St, Lubbock, TX, 79430, USA
| |
Collapse
|
132
|
Alì M, Fantesini A, Morcella MT, Ibba S, D'Anna G, Fazzini D, Papa S. Adoption of AI in Oncological Imaging: Ethical, Regulatory, and Medical-Legal Challenges. Crit Rev Oncog 2024; 29:29-35. [PMID: 38505879 DOI: 10.1615/critrevoncog.2023050584] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 03/21/2024]
Abstract
Artificial Intelligence (AI) algorithms have shown great promise in oncological imaging, outperforming or matching radiologists in retrospective studies, signifying their potential for advanced screening capabilities. These AI tools offer valuable support to radiologists, assisting them in critical tasks such as prioritizing reporting, early cancer detection, and precise measurements, thereby bolstering clinical decision-making. With the healthcare landscape witnessing a surge in imaging requests and a decline in available radiologists, the integration of AI has become increasingly appealing. By streamlining workflow efficiency and enhancing patient care, AI presents a transformative solution to the challenges faced by oncological imaging practices. Nevertheless, successful AI integration necessitates navigating various ethical, regulatory, and medical-legal challenges. This review endeavors to provide a comprehensive overview of these obstacles, aiming to foster a responsible and effective implementation of AI in oncological imaging.
Collapse
Affiliation(s)
- Marco Alì
- Radiology Unit, CDI, Centro Diagnostico Italiano, Via Simone Saint Bon, 20, 20147 Milan, Italy
| | - Arianna Fantesini
- Suor Orsola Benincasa University, Corso Vittorio Emanuele 292, Naples, Italy; RE:LAB s.r.l., Via Tamburini, 5, 42122 Reggio Emilia, Italy
| | | | - Simona Ibba
- CDI Centro Diagnostico Italiano, Via Saint Bon 20, Milan, Italy
| | - Gennaro D'Anna
- Neuroimaging Unit, ASST Ovest Milanese, Via Papa Giovanni Paolo II, Legnano (Milan), Italy
| | - Deborah Fazzini
- CDI Centro Diagnostico Italiano, Via Saint Bon 20, Milan, Italy
| | - Sergio Papa
- Radiology Unit, CDI, Centro Diagnostico Italiano, Via Simone Saint Bon, 20, 20147 Milan, Italy
| |
Collapse
|
133
|
Moy L. Editor's Note: 2023-The Year in Review for Radiology. Radiology 2024; 310:e233537. [PMID: 38289216 DOI: 10.1148/radiol.233537] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/01/2024]
|
134
|
Ray PP, Majumder P. Evaluating the Limitations of ChatGPT in Generating Competent Radiology Reports for Distal Radius Fractures. Curr Probl Diagn Radiol 2024; 53:166-167. [PMID: 37925239 DOI: 10.1067/j.cpradiol.2023.10.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/05/2023] [Revised: 08/29/2023] [Accepted: 10/18/2023] [Indexed: 11/06/2023]
|
135
|
Sumbal A, Sumbal R, Amir A. Can ChatGPT-3.5 Pass a Medical Exam? A Systematic Review of ChatGPT's Performance in Academic Testing. JOURNAL OF MEDICAL EDUCATION AND CURRICULAR DEVELOPMENT 2024; 11:23821205241238641. [PMID: 38487300 PMCID: PMC10938614 DOI: 10.1177/23821205241238641] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 10/31/2023] [Accepted: 02/25/2024] [Indexed: 03/17/2024]
Abstract
OBJECTIVE We aimed to conduct a systematic review to assess the academic potential of ChatGPT-3.5, along with its strengths and limitations, when taking medical exams. METHOD Following PRISMA guidelines, a systematic search of the literature was performed using the electronic databases PUBMED/MEDLINE, Google Scholar, and Cochrane. Articles from database inception until April 4, 2023, were queried. A formal narrative analysis was conducted by systematically arranging similarities and differences between individual findings. RESULTS After rigorous screening, 12 articles were included in this review. All the selected papers assessed the academic performance of ChatGPT-3.5. One study compared the performance of ChatGPT-3.5 with that of ChatGPT-4 on a medical exam. Overall, ChatGPT performed well in 4 tests, was average in 4 tests, and performed poorly in 4 tests. ChatGPT's performance was directly proportional to the difficulty level of the questions but did not differ notably by whether the questions were binary, descriptive, or MCQ-based. ChatGPT's explanation, reasoning, memory, and accuracy were remarkably good, whereas it failed to understand image-based questions and lacked insight and critical thinking. CONCLUSION ChatGPT-3.5 performed satisfactorily in the exams it took as an examinee. However, future studies are needed to fully explore the potential of ChatGPT in medical education.
Collapse
Affiliation(s)
- Anusha Sumbal
- Dow University of Health Sciences, Karachi, Pakistan
| | - Ramish Sumbal
- Dow University of Health Sciences, Karachi, Pakistan
| | - Alina Amir
- Dow University of Health Sciences, Karachi, Pakistan
| |
Collapse
|
136
|
Brady AP, Allen B, Chong J, Kotter E, Kottler N, Mongan J, Oakden-Rayner L, dos Santos DP, Tang A, Wald C, Slavotinek J. Developing, Purchasing, Implementing and Monitoring AI Tools in Radiology: Practical Considerations. A Multi-Society Statement from the ACR, CAR, ESR, RANZCR and RSNA. Radiol Artif Intell 2024; 6:e230513. [PMID: 38251899 PMCID: PMC10831521 DOI: 10.1148/ryai.230513] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/23/2024]
Abstract
Artificial Intelligence (AI) carries the potential for unprecedented disruption in radiology, with possible positive and negative consequences. The integration of AI in radiology holds the potential to revolutionize healthcare practices by advancing diagnosis, quantification, and management of multiple medical conditions. Nevertheless, the ever-growing availability of AI tools in radiology highlights an increasing need to critically evaluate claims for its utility and to differentiate safe product offerings from potentially harmful, or fundamentally unhelpful ones. This multi-society paper, presenting the views of Radiology Societies in the USA, Canada, Europe, Australia, and New Zealand, defines the potential practical problems and ethical issues surrounding the incorporation of AI into radiological practice. In addition to delineating the main points of concern that developers, regulators, and purchasers of AI tools should consider prior to their introduction into clinical practice, this statement also suggests methods to monitor their stability and safety in clinical use, and their suitability for possible autonomous function. This statement is intended to serve as a useful summary of the practical issues which should be considered by all parties involved in the development of radiology AI resources, and their implementation as clinical tools. This article is simultaneously published in Insights into Imaging (DOI 10.1186/s13244-023-01541-3), Journal of Medical Imaging and Radiation Oncology (DOI 10.1111/1754-9485.13612), Canadian Association of Radiologists Journal (DOI 10.1177/08465371231222229), Journal of the American College of Radiology (DOI 10.1016/j.jacr.2023.12.005), and Radiology: Artificial Intelligence (DOI 10.1148/ryai.230513). Keywords: Artificial Intelligence, Radiology, Automation, Machine Learning Published under a CC BY 4.0 license. ©The Author(s) 2024. Editor's Note: The RSNA Board of Directors has endorsed this article. It has not undergone review or editing by this journal.
Collapse
Affiliation(s)
| | - Bibb Allen
- Department of Radiology, Grandview Medical Center, Birmingham, AL, USA
- American College of Radiology Data Science Institute, Reston, VA, USA
| | - Jaron Chong
- Department of Medical Imaging, Schulich School of Medicine and Dentistry, Western University, London, ON, Canada
| | - Elmar Kotter
- Department of Diagnostic and Interventional Radiology, Medical Center, Faculty of Medicine, University of Freiburg, Freiburg, Germany
| | - Nina Kottler
- Radiology Partners, El Segundo, CA, USA
- Stanford Center for Artificial Intelligence in Medicine & Imaging, Palo Alto, CA, USA
| | - John Mongan
- Department of Radiology and Biomedical Imaging, University of California, San Francisco, USA
| | - Lauren Oakden-Rayner
- Australian Institute for Machine Learning, University of Adelaide, Adelaide, Australia
| | - Daniel Pinto dos Santos
- Department of Radiology, University Hospital of Cologne, Cologne, Germany
- Department of Radiology, University Hospital of Frankfurt, Frankfurt, Germany
| | - An Tang
- Department of Radiology, Radiation Oncology, and Nuclear Medicine, Université de Montréal, Montréal, Québec, Canada
| | - Christoph Wald
- Department of Radiology, Lahey Hospital & Medical Center, Burlington, MA, USA
- Tufts University Medical School, Boston, MA, USA
- Commission On Informatics, and Member, Board of Chancellors, American College of Radiology, Virginia, USA
| | - John Slavotinek
- South Australia Medical Imaging, Flinders Medical Centre Adelaide, Adelaide, Australia
- College of Medicine and Public Health, Flinders University, Adelaide, Australia
| |
Collapse
|
137
|
Zhang X, Zhong Y, Jin C, Hu D, Tian M, Zhang H. Medical image Generative Pre-Trained Transformer (MI-GPT): future direction for precision medicine. Eur J Nucl Med Mol Imaging 2024; 51:332-335. [PMID: 37803245 DOI: 10.1007/s00259-023-06450-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/08/2023]
Affiliation(s)
- Xiaohui Zhang
- Department of Nuclear Medicine and PET Center, The Second Affiliated Hospital of Zhejiang University School of Medicine, 88 Jiefang Road, Hangzhou, 310009, Zhejiang, China
- Institute of Nuclear Medicine and Molecular Imaging of Zhejiang University, Hangzhou, China
- Key Laboratory of Medical Molecular Imaging of Zhejiang Province, Hangzhou, China
| | - Yan Zhong
- Department of Nuclear Medicine and PET Center, The Second Affiliated Hospital of Zhejiang University School of Medicine, 88 Jiefang Road, Hangzhou, 310009, Zhejiang, China
- Institute of Nuclear Medicine and Molecular Imaging of Zhejiang University, Hangzhou, China
- Key Laboratory of Medical Molecular Imaging of Zhejiang Province, Hangzhou, China
| | - Chentao Jin
- Department of Nuclear Medicine and PET Center, The Second Affiliated Hospital of Zhejiang University School of Medicine, 88 Jiefang Road, Hangzhou, 310009, Zhejiang, China
- Institute of Nuclear Medicine and Molecular Imaging of Zhejiang University, Hangzhou, China
- Key Laboratory of Medical Molecular Imaging of Zhejiang Province, Hangzhou, China
| | - Daoyan Hu
- Department of Nuclear Medicine and PET Center, The Second Affiliated Hospital of Zhejiang University School of Medicine, 88 Jiefang Road, Hangzhou, 310009, Zhejiang, China
- Institute of Nuclear Medicine and Molecular Imaging of Zhejiang University, Hangzhou, China
- Key Laboratory of Medical Molecular Imaging of Zhejiang Province, Hangzhou, China
| | - Mei Tian
- Department of Nuclear Medicine and PET Center, The Second Affiliated Hospital of Zhejiang University School of Medicine, 88 Jiefang Road, Hangzhou, 310009, Zhejiang, China.
- Institute of Nuclear Medicine and Molecular Imaging of Zhejiang University, Hangzhou, China.
- Key Laboratory of Medical Molecular Imaging of Zhejiang Province, Hangzhou, China.
- Human Phenome Institute, Fudan University, 825 Zhangheng Road, Shanghai, 201203, China.
| | - Hong Zhang
- Department of Nuclear Medicine and PET Center, The Second Affiliated Hospital of Zhejiang University School of Medicine, 88 Jiefang Road, Hangzhou, 310009, Zhejiang, China.
- Institute of Nuclear Medicine and Molecular Imaging of Zhejiang University, Hangzhou, China.
- Key Laboratory of Medical Molecular Imaging of Zhejiang Province, Hangzhou, China.
- College of Biomedical Engineering & Instrument Science, Zhejiang University, Hangzhou, China.
- Key Laboratory for Biomedical Engineering of Ministry of Education, Zhejiang University, Hangzhou, China.
| |
Collapse
|
138
|
Almeida LC, Farina EMJM, Kuriki PEA, Abdala N, Kitamura FC. Performance of ChatGPT on the Brazilian Radiology and Diagnostic Imaging and Mammography Board Examinations. Radiol Artif Intell 2024; 6:e230103. [PMID: 38294325 PMCID: PMC10831524 DOI: 10.1148/ryai.230103] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/01/2023] [Revised: 09/06/2023] [Accepted: 10/23/2023] [Indexed: 02/01/2024]
Abstract
This prospective exploratory study conducted from January 2023 through May 2023 evaluated the ability of ChatGPT to answer questions from Brazilian radiology board examinations, exploring how different prompt strategies can influence performance using GPT-3.5 and GPT-4. Three multiple-choice board examinations that did not include image-based questions were evaluated: (a) radiology and diagnostic imaging, (b) mammography, and (c) neuroradiology. Five different styles of zero-shot prompting were tested: (a) raw question, (b) brief instruction, (c) long instruction, (d) chain-of-thought, and (e) question-specific automatic prompt generation (QAPG). The QAPG and brief instruction prompt strategies performed best for all examinations (P < .05), obtaining passing scores (≥60%) on the radiology and diagnostic imaging examination when testing both versions of ChatGPT. The QAPG style achieved a score of 60% for the mammography examination using GPT-3.5 and 76% using GPT-4. GPT-4 achieved a score of up to 65% on the neuroradiology examination. The long instruction style consistently underperformed, implying that excessive detail might harm performance. GPT-4's scores were less sensitive to prompt style changes. The QAPG prompt style showed a high volume of the "A" option, suggesting possible bias, although no statistically significant difference was found. GPT-4 passed all three radiology board examinations, and GPT-3.5 passed two of three examinations when using an optimal prompt style. Keywords: ChatGPT, Artificial Intelligence, Board Examinations, Radiology and Diagnostic Imaging, Mammography, Neuroradiology © RSNA, 2023 See also the commentary by Trivedi and Gichoya in this issue.
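As an illustration of how such zero-shot prompt styles could be compared programmatically, the sketch below queries the OpenAI chat API with a few of the styles named in this abstract. The question text, prompt wordings, and model name are assumptions for illustration; the study's exact prompts, including its QAPG procedure, are not reproduced here.

```python
# Hedged sketch comparing zero-shot prompt styles via the OpenAI Python client.
# Prompt wordings, the sample question, and the model name are assumptions.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

QUESTION = ("Which MRI sequence is most sensitive for acute ischemic stroke? "
            "(A) T1 (B) T2 (C) DWI (D) GRE")

PROMPT_STYLES = {
    "raw_question": QUESTION,
    "brief_instruction": f"Answer the following multiple-choice question with a single letter.\n\n{QUESTION}",
    "chain_of_thought": f"Think step by step, then give the final answer as a single letter.\n\n{QUESTION}",
}

def ask(prompt: str, model: str = "gpt-4") -> str:
    # Send one zero-shot prompt and return the model's text answer
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return response.choices[0].message.content

for style, prompt in PROMPT_STYLES.items():
    print(style, "->", ask(prompt))
```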
Collapse
Affiliation(s)
- Leonardo C. Almeida
- From the Department of Artificial Intelligence and Management (L.C.A., E.M.J.M.F., N.A., F.C.K.), Graduate Program in Medicine (Clinical Radiology), Universidade Federal de São Paulo (UNIFESP), Rua Botucatu, 740, 04023-062, São Paulo, São Paulo, Brazil; AI Lab (L.C.A., E.M.J.M.F., P.E.A.K., F.C.K.), Dasa, São Paulo, São Paulo, Brazil
| | - Eduardo M. J. M. Farina
- From the Department of Artificial Intelligence and Management (L.C.A., E.M.J.M.F., N.A., F.C.K.), Graduate Program in Medicine (Clinical Radiology), Universidade Federal de São Paulo (UNIFESP), Rua Botucatu, 740, 04023-062, São Paulo, São Paulo, Brazil; AI Lab (L.C.A., E.M.J.M.F., P.E.A.K., F.C.K.), Dasa, São Paulo, São Paulo, Brazil
| | - Paulo E. A. Kuriki
- From the Department of Artificial Intelligence and Management (L.C.A., E.M.J.M.F., N.A., F.C.K.), Graduate Program in Medicine (Clinical Radiology), Universidade Federal de São Paulo (UNIFESP), Rua Botucatu, 740, 04023-062, São Paulo, São Paulo, Brazil; AI Lab (L.C.A., E.M.J.M.F., P.E.A.K., F.C.K.), Dasa, São Paulo, São Paulo, Brazil
| | | | | |
Collapse
|
139
|
Dutruel SP, Hentel KD, Hecht EM, Kadom N. Patient-Centered Radiology Communications: Engaging Patients as Partners. J Am Coll Radiol 2024; 21:7-18. [PMID: 37863150 DOI: 10.1016/j.jacr.2023.10.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/21/2023] [Revised: 10/12/2023] [Accepted: 10/13/2023] [Indexed: 10/22/2023]
Abstract
Patient-centered care is a model in which, by bringing the patient's perspective to the design and delivery of health care, we can better meet patients' needs, enhancing the quality of care. Patient-centered care requires finding ways to communicate effectively with a diverse patient population that has various levels of health literacy, cultural backgrounds, and unique needs and preferences. Moreover, multimedia resources have the potential to inform and educate patients, promoting greater independence. In this review, we discuss the fundamentals of communication, the different modes used in radiology, and the key elements of effective communication. We then highlight five opportunities along the continuum of care in radiology practice in which communication can be improved to empower patients and families and strengthen this partnership. Lastly, we discuss the importance of communication training for the workforce, of optimizing and seamlessly integrating technology solutions into workflows, and of patient feedback in the design and delivery of care.
Collapse
Affiliation(s)
- Silvina P Dutruel
- Department of Radiology, Weill Cornell Medical Center, New York, New York.
| | - Keith D Hentel
- Professor, Clinical Radiology, Executive Vice Chairman, Department of Radiology; Vice President, Weill Cornell Imaging at New York-Presbyterian, New York, New York
| | - Elizabeth M Hecht
- Vice Chair for Academic Affairs, Department of Radiology, Weill Cornell Medical Center, New York, New York. https://twitter.com/ehecht_md
| | - Nadja Kadom
- Department of Radiology and Imaging Sciences, Emory University School of Medicine, Atlanta, Georgia; Director of Quality, Department of Radiology, Children's Healthcare of Atlanta, Georgia; Interim Director of Quality, Department of Radiology, Emory Healthcare, Atlanta, Georgia; Chair, Practice and Performance Improvement Committee, ARRS; and Chair, Metrics Committee, ACR
| |
Collapse
|
140
|
Zhang JS, Yoon C, Williams DKA, Pinkas A. Exploring the Usage of ChatGPT Among Medical Students in the United States. JOURNAL OF MEDICAL EDUCATION AND CURRICULAR DEVELOPMENT 2024; 11:23821205241264695. [PMID: 39092290 PMCID: PMC11292693 DOI: 10.1177/23821205241264695] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 03/07/2024] [Accepted: 06/08/2024] [Indexed: 08/04/2024]
Abstract
OBJECTIVES Chat Generative Pretrained Transformer (ChatGPT) is a large language model developed by OpenAI that has gained widespread interest. It has been cited for its potential impact on health care and its beneficial role in medical education. However, there is limited investigation into its use among medical students. In this study, we evaluated the frequency of ChatGPT use, motivations for use, and preference for ChatGPT over existing resources among medical students in the United States. METHODS Data were collected from an original survey consisting of 14 questions assessing the frequency and usage of ChatGPT in various contexts within medical education. The survey was distributed via email lists, group messaging applications, and classroom lectures to medical students across the United States. Responses were collected between August and October 2023. RESULTS One hundred thirty-one participants completed the survey and were included in the analysis. Overall, 48.9% of respondents reported that they had used ChatGPT in their medical studies. Among ChatGPT users, 43.7% reported using ChatGPT weekly, several times per week, or daily. ChatGPT was most used for writing, revising, editing, and summarizing purposes: 37.5% and 41.3% of respondents, respectively, reported using ChatGPT for more than 25% of their working time on these tasks. Among respondents who had not used ChatGPT, more than 50% reported being unlikely or extremely unlikely to use ChatGPT across all surveyed scenarios. ChatGPT users reported being more likely to use ChatGPT over directly asking professors or attendings (45.3%), textbooks (42.2%), and lectures (31.7%), and least likely to use it over the popular flashcard application Anki (11.1%) and medical education videos (9.5%). CONCLUSIONS ChatGPT is an increasingly popular resource among medical students, with many preferring ChatGPT over other traditional resources such as professors, textbooks, and lectures. Its impact on medical education will only continue to grow as its capabilities improve.
Collapse
Affiliation(s)
| | - Christine Yoon
- Albert Einstein College of Medicine, Bronx, New York, USA
| | | | - Adi Pinkas
- Albert Einstein College of Medicine, Bronx, New York, USA
| |
Collapse
|
141
|
Musheyev D, Pan A, Loeb S, Kabarriti AE. How Well Do Artificial Intelligence Chatbots Respond to the Top Search Queries About Urological Malignancies? Eur Urol 2024; 85:13-16. [PMID: 37567827 DOI: 10.1016/j.eururo.2023.07.004] [Citation(s) in RCA: 26] [Impact Index Per Article: 26.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/20/2023] [Revised: 06/18/2023] [Accepted: 07/17/2023] [Indexed: 08/13/2023]
Abstract
Artificial intelligence (AI) chatbots are becoming a popular source of information, but there are limited data on the quality of information on urological malignancies that they provide. Our objective was to characterize the quality of information and detect misinformation about prostate, bladder, kidney, and testicular cancers from four AI chatbots: ChatGPT, Perplexity, Chat Sonic, and Microsoft Bing AI. We used the top five search queries related to prostate, bladder, kidney, and testicular cancers according to Google Trends from January 2021 to January 2023 and input them into the AI chatbots. Responses were evaluated for quality, understandability, actionability, misinformation, and readability using published instruments. AI chatbot responses had moderate to high information quality (median DISCERN score 4 out of 5, range 2-5) and lacked misinformation. Understandability was moderate (median Patient Education Material Assessment Tool for Printable Materials [PEMAT-P] understandability 66.7%, range 44.4-90.9%) and actionability was moderate to poor (median PEMAT-P actionability 40%, range 0-40%). The responses were written at a fairly difficult reading level. AI chatbots produce information that is generally accurate and of moderate to high quality in response to the top urological malignancy-related search queries, but the responses lack clear, actionable instructions and exceed the reading level recommended for consumer health information. PATIENT SUMMARY: Artificial intelligence chatbots produce information that is generally accurate and of moderately high quality in response to popular Google searches about urological cancers. However, their responses are fairly difficult to read, are moderately hard to understand, and lack clear instructions for users to act on.
Collapse
Affiliation(s)
- David Musheyev
- Department of Urology, State University of New York Downstate Health Sciences University, New York, NY, USA
| | - Alexander Pan
- Department of Urology, State University of New York Downstate Health Sciences University, New York, NY, USA
| | - Stacy Loeb
- Department of Urology, New York University and Manhattan Veterans Affairs, New York, NY, USA; Department of Population Health, New York University, New York, NY, USA
| | - Abdo E Kabarriti
- Department of Urology, State University of New York Downstate Health Sciences University, New York, NY, USA.
| |
Collapse
|
142
|
Malik S, Zaheer S. ChatGPT as an aid for pathological diagnosis of cancer. Pathol Res Pract 2024; 253:154989. [PMID: 38056135 DOI: 10.1016/j.prp.2023.154989] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 08/23/2023] [Revised: 11/26/2023] [Accepted: 11/27/2023] [Indexed: 12/08/2023]
Abstract
Diagnostic workup of cancer patients is highly reliant on the science of pathology using cytopathology, histopathology, and other ancillary techniques like immunohistochemistry and molecular cytogenetics. Data processing and learning by means of artificial intelligence (AI) have become a spearhead for the advancement of medicine, with pathology and laboratory medicine being no exceptions. ChatGPT, an AI-based chatbot recently launched by OpenAI, is currently the talk of the town, and its role in cancer diagnosis is also being explored meticulously. A pathology workflow that integrates digital slides, advanced algorithms, and computer-aided diagnostic techniques extends the frontiers of the pathologist's view beyond the microscopic slide and enables effective integration, assimilation, and utilization of knowledge that is beyond human limits and boundaries. Despite its numerous advantages in the pathological diagnosis of cancer, it comes with several challenges, such as the integration of digital slides with input language parameters, problems of bias, and legal issues, which must be addressed soon so that we as pathologists diagnosing malignancies stay on the bandwagon and do not miss the train.
Collapse
Affiliation(s)
- Shaivy Malik
- Department of Pathology, Vardhman Mahavir Medical College and Safdarjung Hospital, New Delhi, India
| | - Sufian Zaheer
- Department of Pathology, Vardhman Mahavir Medical College and Safdarjung Hospital, New Delhi, India.
| |
Collapse
|
143
|
Adhikari K, Naik N, Hameed BZ, Raghunath SK, Somani BK. Exploring the Ethical, Legal, and Social Implications of ChatGPT in Urology. Curr Urol Rep 2024; 25:1-8. [PMID: 37735339 DOI: 10.1007/s11934-023-01185-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 09/05/2023] [Indexed: 09/23/2023]
Abstract
PURPOSE OF THE REVIEW ChatGPT is programmed to generate responses based on pattern recognition. With its vast popularity and exponential growth, questions arise regarding moral issues, security, and legitimacy. In this review article, we aim to analyze the ethical and legal implications of using ChatGPT in urology and explore potential solutions addressing these concerns. RECENT FINDINGS There are many potential applications of ChatGPT in urology, and the extent to which it might improve healthcare may cause a profound shift in the way we deliver our services to patients and the overall healthcare system. These applications encompass diagnosis and treatment planning, clinical workflow, patient education, augmenting consultations, and urological research. The ethical and legal considerations include patient autonomy and informed consent, privacy and confidentiality, bias and fairness, human oversight and accountability, trust and transparency, liability and malpractice, intellectual property rights, and the regulatory framework. The application of ChatGPT in urology has shown great potential to improve patient care and assist urologists in various aspects of clinical practice, research, and education. Complying with data security and privacy regulations and ensuring human oversight and accountability are some potential solutions to these legal and ethical concerns. Overall, the benefits and risks of using ChatGPT in urology must be weighed carefully, and a cautious approach must be taken to ensure that its use aligns with human values and advances patient care ethically and responsibly.
Collapse
Affiliation(s)
- Kinju Adhikari
- Department of Urology, HCG Cancer Centre, Bengaluru, India
| | - Nithesh Naik
- Department of Mechanical and Industrial Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, Karnataka, India
| | - Bm Zeeshan Hameed
- Department of Urology, Father Muller Medical College, Mangalore, Karnataka, India
| | - S K Raghunath
- Department of Urology, HCG Cancer Centre, Bengaluru, India
| | - Bhaskar K Somani
- Department of Urology, University Hospital Southampton NHS Trust, Southampton, SO16 6YD, UK.
| |
Collapse
|
144
|
Wei K, Fritz C, Rajasekaran K. Answering head and neck cancer questions: An assessment of ChatGPT responses. Am J Otolaryngol 2024; 45:104085. [PMID: 37844413 DOI: 10.1016/j.amjoto.2023.104085] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/23/2023] [Accepted: 10/01/2023] [Indexed: 10/18/2023]
Abstract
PURPOSE To examine and compare ChatGPT versus Google websites in answering common head and neck cancer questions. MATERIALS AND METHODS Commonly asked questions about head and neck cancer were obtained and inputted into both ChatGPT-4 and the Google search engine. For each question, the ChatGPT response and first website search result were compiled and examined. Content quality was assessed by independent reviewers using standardized grading criteria and the modified Ensuring Quality Information for Patients (EQIP) tool. Readability was determined using the Flesch reading ease scale. RESULTS In total, 49 questions related to head and neck cancer were included. Google sources were on average of significantly higher quality than ChatGPT responses (4.2 vs 3.6, p = 0.005). According to the EQIP tool, Google and ChatGPT had on average similar response rates per criterion (24.4 vs 20.5, p = 0.09), while Google had a significantly higher average score per question than ChatGPT (13.8 vs 11.7, p < 0.001). According to the Flesch reading ease scale, ChatGPT and Google sources were both considered similarly difficult to read (33.1 vs 37.0, p = 0.180) and at a college level (14.3 vs 14.2, p = 0.820). CONCLUSION ChatGPT responses were as challenging to read as Google sources but of poorer quality, owing to lower reliability and accuracy in answering questions. Though promising, ChatGPT in its current form should not be considered dependable. Google sources are a preferred resource for patient educational materials.
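As a point of reference for the readability figures above, the Flesch reading ease score is a fixed formula over sentence length and syllable density. The sketch below is only an approximation: the syllable counter is a simple vowel-group heuristic, so its output will not exactly match validated readability tools.

    # Approximate Flesch reading ease and Flesch-Kincaid grade level.
    # The syllable counter is a crude vowel-group heuristic, so treat the
    # results as rough estimates, not validated instrument scores.
    import re

    def count_syllables(word: str) -> int:
        return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

    def readability(text: str) -> tuple[float, float]:
        sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
        words = re.findall(r"[A-Za-z']+", text)
        syllables = sum(count_syllables(w) for w in words)
        wps = len(words) / max(1, len(sentences))  # words per sentence
        spw = syllables / max(1, len(words))       # syllables per word
        ease = 206.835 - 1.015 * wps - 84.6 * spw
        grade = 0.39 * wps + 11.8 * spw - 15.59
        return ease, grade

    sample = "Treatment of head and neck cancer depends on the tumour site and stage."
    ease, grade = readability(sample)
    print(f"Flesch reading ease: {ease:.1f}, grade level: {grade:.1f}")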
Collapse
Affiliation(s)
- Kimberly Wei
- Department of Otorhinolaryngology - Head and Neck Surgery, University of Pennsylvania, Philadelphia, PA, USA
| | - Christian Fritz
- Department of Otorhinolaryngology - Head and Neck Surgery, University of Pennsylvania, Philadelphia, PA, USA
| | - Karthik Rajasekaran
- Department of Otorhinolaryngology - Head and Neck Surgery, University of Pennsylvania, Philadelphia, PA, USA; Leonard Davis Institute of Health Economics, University of Pennsylvania, Philadelphia, PA, USA.
| |
Collapse
|
145
|
Alsadhan A, Al-Anezi F, Almohanna A, Alnaim N, Alzahrani H, Shinawi R, AboAlsamh H, Bakhshwain A, Alenazy M, Arif W, Alyousef S, Alhamidi S, Alghamdi A, AlShrayfi N, Rubaian NB, Alanzi T, AlSahli A, Alturki R, Herzallah N. The opportunities and challenges of adopting ChatGPT in medical research. Front Med (Lausanne) 2023; 10:1259640. [PMID: 38188345 PMCID: PMC10766839 DOI: 10.3389/fmed.2023.1259640] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/16/2023] [Accepted: 12/07/2023] [Indexed: 01/09/2024] Open
Abstract
Purpose This study aims to investigate the opportunities and challenges of adopting ChatGPT in medical research. Methods A qualitative approach with focus groups was adopted in this study. A total of 62 participants, including academic researchers from different streams in medicine and eHealth, took part in this study. Results Five themes with 16 sub-themes related to the opportunities, and five themes with 12 sub-themes related to the challenges, were identified. The major opportunities include improved data collection and analysis, improved communication and accessibility, and support for researchers in multiple streams of medical research. The major challenges identified were limitations of training data leading to bias, ethical issues, technical limitations, and limitations in data collection and analysis. Conclusion Although ChatGPT can be used as a potential tool in medical research, there is a need for further evidence to generalize its impact on different research activities.
Collapse
Affiliation(s)
- Abeer Alsadhan
- Imam Abdulrahman Bin Faisal University, Dammam, Saudi Arabia
| | - Fahad Al-Anezi
- Imam Abdulrahman Bin Faisal University, Dammam, Saudi Arabia
| | - Asmaa Almohanna
- Princess Nourah Bint Abdulrahman University, Riyadh, Saudi Arabia
| | - Norah Alnaim
- Imam Abdulrahman Bin Faisal University, Dammam, Saudi Arabia
| | | | | | - Hoda AboAlsamh
- Imam Abdulrahman Bin Faisal University, Dammam, Saudi Arabia
| | | | - Maha Alenazy
- King Saud University, Riyadh, Riyadh, Saudi Arabia
| | - Wejdan Arif
- King Saud University, Riyadh, Riyadh, Saudi Arabia
| | | | | | | | - Nour AlShrayfi
- Public Authority for Applied Education and Training, Kuwait City, Kuwait
| | | | - Turki Alanzi
- Imam Abdulrahman Bin Faisal University, Dammam, Saudi Arabia
| | - Alaa AlSahli
- King Saud bin Abdulaziz University for Health Sciences, Riyadh, Saudi Arabia
| | - Rasha Alturki
- Imam Abdulrahman Bin Faisal University, Dammam, Saudi Arabia
| | | |
Collapse
|
146
|
Ferreira RM. New evidence-based practice: Artificial intelligence as a barrier breaker. World J Methodol 2023; 13:384-389. [PMID: 38229944 PMCID: PMC10789101 DOI: 10.5662/wjm.v13.i5.384] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 10/17/2023] [Revised: 10/24/2023] [Accepted: 11/08/2023] [Indexed: 12/20/2023] Open
Abstract
The concept of evidence-based practice has persisted over several years and remains a cornerstone in clinical practice, representing the gold standard for optimal patient care. However, despite widespread recognition of its significance, practical application faces various challenges and barriers, including a lack of skills in interpreting studies, limited resources, time constraints, linguistic competencies, and more. Recently, we have witnessed the emergence of a groundbreaking technological revolution known as artificial intelligence. Although artificial intelligence has become increasingly integrated into our daily lives, some reluctance persists among certain segments of the public. This article explores the potential of artificial intelligence as a solution to some of the main barriers encountered in the application of evidence-based practice. It highlights how artificial intelligence can assist in staying updated with the latest evidence, enhancing clinical decision-making, addressing patient misinformation, and mitigating time constraints in clinical practice. The integration of artificial intelligence into evidence-based practice has the potential to revolutionize healthcare, leading to more precise diagnoses, personalized treatment plans, and improved doctor-patient interactions. This proposed synergy between evidence-based practice and artificial intelligence may necessitate adjustments to its core concept, heralding a new era in healthcare.
Collapse
Affiliation(s)
- Ricardo Maia Ferreira
- Department of Sports and Exercise, Polytechnic Institute of Maia (N2i), Maia 4475-690, Porto, Portugal
- Department of Physioterapy, Polytechnic Institute of Coimbra, Coimbra Health School, Coimbra 3046-854, Coimbra, Portugal
- Department of Physioterapy, Polytechnic Institute of Castelo Branco, Dr. Lopes Dias Health School, Castelo Branco 6000-767, Castelo Branco, Portugal
- Sport Physical Activity and Health Research & Innovation Center, Polytechnic Institute of Viana do Castelo, Melgaço, 4960-320, Viana do Castelo, Portugal
| |
Collapse
|
147
|
Zawiah M, Al-Ashwal FY, Gharaibeh L, Abu Farha R, Alzoubi KH, Abu Hammour K, Qasim QA, Abrah F. ChatGPT and Clinical Training: Perception, Concerns, and Practice of Pharm-D Students. J Multidiscip Healthc 2023; 16:4099-4110. [PMID: 38116306 PMCID: PMC10729768 DOI: 10.2147/jmdh.s439223] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/09/2023] [Accepted: 12/04/2023] [Indexed: 12/21/2023] Open
Abstract
Background The emergence of the Chat Generative Pre-trained Transformer (ChatGPT) by OpenAI has revolutionized AI technology, demonstrating significant potential in healthcare and pharmaceutical education, yet its real-world applicability in clinical training warrants further investigation. Methods A cross-sectional study was conducted between April and May 2023 to assess PharmD students' perceptions, concerns, and experiences regarding the integration of ChatGPT into clinical pharmacy education. The study utilized a convenience sampling method through online platforms and involved a questionnaire with sections on demographics, perceived benefits, concerns, and experience with ChatGPT. Statistical analysis was performed using SPSS, including descriptive and inferential analyses. Results The study involved 211 PharmD students, the majority of whom were male (77.3%) and had prior experience with artificial intelligence (68.2%). Over two-thirds were aware of ChatGPT. Most students (n = 139, 65.9%) perceived potential benefits in using ChatGPT for various clinical tasks, with concerns including over-reliance, accuracy, and ethical considerations. Adoption of ChatGPT in clinical training varied, with some students not using it at all, while others utilized it for tasks like evaluating drug-drug interactions and developing care plans. Previous users tended to have higher perceived benefits and lower concerns, but the differences were not statistically significant. Conclusion Utilizing ChatGPT in clinical training offers opportunities, but students' lack of trust in it for clinical decisions highlights the need for collaborative human-ChatGPT decision-making. It should complement healthcare professionals' expertise and be used strategically to compensate for human limitations. Further research is essential to optimize ChatGPT's effective integration.
Collapse
Affiliation(s)
- Mohammed Zawiah
- Department of Clinical Pharmacy, College of Pharmacy, Northern Border University, Rafha, 91911, Saudi Arabia
- Department of Pharmacy Practice, College of Clinical Pharmacy, Hodeidah University, Al Hodeidah, Yemen
| | - Fahmi Y Al-Ashwal
- Department of Clinical Pharmacy, College of Pharmacy, Al-Ayen University, Thi-Qar, Iraq
| | - Lobna Gharaibeh
- Pharmacological and Diagnostic Research Center, Faculty of Pharmacy, Al-Ahliyya Amman University, Amman, Jordan
| | - Rana Abu Farha
- Clinical Pharmacy and Therapeutics Department, Faculty of Pharmacy, Applied Science Private University, Amman, Jordan
| | - Karem H Alzoubi
- Department of Pharmacy Practice and Pharmacotherapeutics, University of Sharjah, Sharjah, 27272, United Arab Emirates
- Department of Clinical Pharmacy, Faculty of Pharmacy, Jordan University of Science and Technology, Irbid, 22110, Jordan
| | - Khawla Abu Hammour
- Department of Clinical Pharmacy and Biopharmaceutics, Faculty of Pharmacy, University of Jordan, Amman, Jordan
| | - Qutaiba A Qasim
- Department of Clinical Pharmacy, College of Pharmacy, Al-Ayen University, Thi-Qar, Iraq
| | - Fahd Abrah
- Discipline of Social and Administrative Pharmacy, School of Pharmaceutical Sciences, Universiti Sains Malaysia, Penang, Malaysia
| |
Collapse
|
148
|
Wang G, Gao K, Liu Q, Wu Y, Zhang K, Zhou W, Guo C. Potential and Limitations of ChatGPT 3.5 and 4.0 as a Source of COVID-19 Information: Comprehensive Comparative Analysis of Generative and Authoritative Information. J Med Internet Res 2023; 25:e49771. [PMID: 38096014 PMCID: PMC10755661 DOI: 10.2196/49771] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/08/2023] [Revised: 10/01/2023] [Accepted: 11/16/2023] [Indexed: 12/18/2023] Open
Abstract
BACKGROUND The COVID-19 pandemic, caused by the SARS-CoV-2 virus, has necessitated reliable and authoritative information for public guidance. The World Health Organization (WHO) has been a primary source of such information, disseminating it through a question and answer format on its official website. Concurrently, ChatGPT 3.5 and 4.0, deep learning-based natural language generation systems, have shown potential in generating diverse text types based on user input. OBJECTIVE This study evaluates the accuracy of COVID-19 information generated by ChatGPT 3.5 and 4.0, assessing its potential as a supplementary public information source during the pandemic. METHODS We extracted 487 COVID-19-related questions from the WHO's official website and used ChatGPT 3.5 and 4.0 to generate corresponding answers. These generated answers were then compared against the official WHO responses for evaluation. Two clinical experts scored the generated answers on a scale of 0-5 across 4 dimensions (accuracy, comprehensiveness, relevance, and clarity), with higher scores indicating better performance in each dimension. The WHO responses served as the reference for this assessment. Additionally, we used the BERT (Bidirectional Encoder Representations from Transformers) model to generate similarity scores (0-1) between the generated and official answers, providing a dual validation mechanism. RESULTS The mean (SD) scores for ChatGPT 3.5-generated answers were 3.47 (0.725) for accuracy, 3.89 (0.719) for comprehensiveness, 4.09 (0.787) for relevance, and 3.49 (0.809) for clarity. For ChatGPT 4.0, the mean (SD) scores were 4.15 (0.780), 4.47 (0.641), 4.56 (0.600), and 4.09 (0.698), respectively. All differences were statistically significant (P<.001), with ChatGPT 4.0 outperforming ChatGPT 3.5. The BERT model verification showed mean (SD) similarity scores of 0.83 (0.07) for ChatGPT 3.5 and 0.85 (0.07) for ChatGPT 4.0 compared with the official WHO answers. CONCLUSIONS ChatGPT 3.5 and 4.0 can generate accurate and relevant COVID-19 information to a certain extent. However, compared with official WHO responses, gaps and deficiencies exist. Thus, users of ChatGPT 3.5 and 4.0 should also reference other reliable information sources to mitigate potential misinformation risks. Notably, ChatGPT 4.0 outperformed ChatGPT 3.5 across all evaluated dimensions, a finding corroborated by BERT model validation.
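The abstract above reports BERT-derived similarity scores between generated and official answers. A common way to obtain such scores is cosine similarity between sentence embeddings; the sketch below uses the sentence-transformers library with an assumed model name, since the authors' exact embedding and scoring pipeline is not specified here.

    # Illustrative sketch: score how close a chatbot answer is to a reference
    # answer via sentence-embedding cosine similarity. The model choice and the
    # pipeline are assumptions, not the exact setup used in the study.
    from sentence_transformers import SentenceTransformer, util

    model = SentenceTransformer("all-MiniLM-L6-v2")  # small general-purpose encoder

    def similarity(generated: str, reference: str) -> float:
        emb = model.encode([generated, reference], convert_to_tensor=True)
        return util.cos_sim(emb[0], emb[1]).item()

    generated = "Wash your hands often and keep your distance from people who are sick."
    reference = "Clean your hands regularly and maintain physical distance from anyone who is coughing or sneezing."
    print(f"similarity: {similarity(generated, reference):.2f}")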
Collapse
Affiliation(s)
- Guoyong Wang
- Children's Hospital, Chongqing Medical University, Chongqing, China
- Women and Children's Hospital, Chongqing Medical University, Chongqing, China
| | - Kai Gao
- Guangzhou Women and Children's Medical Center, Guangzhou Medical University, Guangzhou, China
| | - Qianyang Liu
- Women and Children's Hospital, Chongqing Medical University, Chongqing, China
| | - Yuxin Wu
- Children's Hospital, Chongqing Medical University, Chongqing, China
| | - Kaijun Zhang
- Children's Hospital, Chongqing Medical University, Chongqing, China
| | - Wei Zhou
- Women and Children's Hospital, Chongqing Medical University, Chongqing, China
| | - Chunbao Guo
- Women and Children's Hospital, Chongqing Medical University, Chongqing, China
| |
Collapse
|
149
|
Cardona G, Argiles M, Pérez-Mañá L. Accuracy of a Large Language Model as a new tool for optometry education. Clin Exp Optom 2023:1-4. [PMID: 38044041 DOI: 10.1080/08164622.2023.2288174] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/07/2023] [Accepted: 09/18/2023] [Indexed: 12/05/2023] Open
Abstract
CLINICAL RELEVANCE The unsupervised introduction of certain artificial intelligence tools in optometry education may challenge the proper acquisition of accurate clinical knowledge and skills proficiency. BACKGROUND Large language models like ChatGPT (Generative Pretrained Transformer) are increasingly being used by researchers and students for work and academic assignments. The authoritative and conversationally correct language provided by these tools may mask their inherent limitations when presented with specific scientific and clinical queries. METHODS Three sets of 10 queries related to contact lenses & anterior eye, low vision and binocular vision & vision therapy were presented to ChatGPT, with instructions to provide five relevant references to support each response. Three experts and 53 undergraduate and post-graduate students graded the accuracy of the responses from 0 to 10, and the references were evaluated for precision and relevance. Students also graded, from 0 to 10, the potential usefulness of ChatGPT for their academic coursework. RESULTS Median scores were 7, 8 and 6 (experts) and 8, 9 and 7.5 (students) for the contact lenses & anterior eye, low vision and binocular vision & vision therapy categories, respectively. Responses to more specific queries were awarded lower scores by both experts (ρ = -0.612; P < 0.001) and students (ρ = -0.578; P = 0.001). Of 150 references, 24% were accurate and 19.3% relevant. Students graded the usefulness of ChatGPT at 7.5 (2 to 9), 7 (3 to 9) and 8.5 (3 to 10) for contact lenses & anterior eye, low vision and binocular vision & vision therapy, respectively. CONCLUSION Careful expert appraisal of the responses and, particularly, of the references provided by ChatGPT is required in research and academic settings. As the use of these tools becomes widespread, it is essential to take proactive steps to address their limitations and ensure their responsible use.
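The negative association between query specificity and score reported above is a Spearman rank correlation; a minimal sketch of how such a coefficient is computed is shown below, using SciPy and small placeholder ratings rather than the study's data.

    # Minimal sketch: Spearman rank correlation between query specificity
    # (1 = broad, 5 = highly specific) and the grade awarded to the response.
    # The ratings below are illustrative placeholders, not data from the study.
    from scipy.stats import spearmanr

    specificity = [1, 2, 2, 3, 3, 4, 4, 5, 5, 5]
    grades      = [9, 8, 9, 7, 8, 6, 7, 5, 6, 4]

    rho, p_value = spearmanr(specificity, grades)
    print(f"rho = {rho:.3f}, p = {p_value:.4f}")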
Collapse
Affiliation(s)
- Genis Cardona
- Department of Optics and Optometry, Universitat Politècnica de Catalunya, Terrassa, Spain
| | - Marc Argiles
- Department of Optics and Optometry, Universitat Politècnica de Catalunya, Terrassa, Spain
| | - Lluis Pérez-Mañá
- Department of Optics and Optometry, Universitat Politècnica de Catalunya, Terrassa, Spain
| |
Collapse
|
150
|
Liu F, Zhu T, Wu X, Yang B, You C, Wang C, Lu L, Liu Z, Zheng Y, Sun X, Yang Y, Clifton L, Clifton DA. A medical multimodal large language model for future pandemics. NPJ Digit Med 2023; 6:226. [PMID: 38042919 PMCID: PMC10693607 DOI: 10.1038/s41746-023-00952-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/08/2023] [Accepted: 10/24/2023] [Indexed: 12/04/2023] Open
Abstract
Deep neural networks have been integrated into the whole clinical decision procedure, which can improve the efficiency of diagnosis and alleviate the heavy workload of physicians. Since most neural networks are supervised, their performance heavily depends on the volume and quality of available labels. However, few such labels exist for rare diseases (e.g., new pandemics). Here we report a medical multimodal large language model (Med-MLLM) for radiograph representation learning, which can learn broad medical knowledge (e.g., image understanding, text semantics, and clinical phenotypes) from unlabelled data. As a result, when encountering a rare disease, our Med-MLLM can be rapidly deployed and easily adapted to it with limited labels. Furthermore, our model supports medical data across the visual modality (e.g., chest X-ray and CT) and the textual modality (e.g., medical reports and free-text clinical notes); therefore, it can be used for clinical tasks that involve both visual and textual data. We demonstrate the effectiveness of our Med-MLLM by showing how it would perform using the COVID-19 pandemic "in replay". In the retrospective setting, we test the model on the early COVID-19 datasets; and in the prospective setting, we test the model on the new variant COVID-19-Omicron. The experiments are conducted on 1) three kinds of input data; 2) three kinds of downstream tasks, including disease reporting, diagnosis, and prognosis; 3) five COVID-19 datasets; and 4) three different languages, including English, Chinese, and Spanish. All experiments show that our model can provide accurate and robust COVID-19 decision support with little labelled data.
Collapse
Affiliation(s)
- Fenglin Liu
- Institute of Biomedical Engineering, Department of Engineering Science, University of Oxford, Oxford, UK.
| | - Tingting Zhu
- Institute of Biomedical Engineering, Department of Engineering Science, University of Oxford, Oxford, UK
| | - Xian Wu
- Jarvis Research Center, Tencent YouTu Lab, Beijing, China
| | - Bang Yang
- School of Computer Science, Peking University, Beijing, China
| | | | - Chenyang Wang
- Institute of Biomedical Engineering, Department of Engineering Science, University of Oxford, Oxford, UK
| | - Lei Lu
- Institute of Biomedical Engineering, Department of Engineering Science, University of Oxford, Oxford, UK
| | - Zhangdaihong Liu
- Institute of Biomedical Engineering, Department of Engineering Science, University of Oxford, Oxford, UK
- Oxford-Suzhou Centre for Advanced Research, Suzhou, China
| | - Yefeng Zheng
- Jarvis Research Center, Tencent YouTu Lab, Beijing, China
| | - Xu Sun
- School of Computer Science, Peking University, Beijing, China
| | - Yang Yang
- School of Public Health, Shanghai Jiao Tong University School of Medicine, Shanghai, China
| | - Lei Clifton
- Nuffield Department of Population Health, University of Oxford, Oxford, UK
| | - David A Clifton
- Institute of Biomedical Engineering, Department of Engineering Science, University of Oxford, Oxford, UK.
- Oxford-Suzhou Centre for Advanced Research, Suzhou, China.
| |
Collapse
|