1. Ho RA, Shaari AL, Cowan PT, Yan K. ChatGPT Responses to Frequently Asked Questions on Ménière's Disease: A Comparison to Clinical Practice Guideline Answers. OTO Open 2024; 8:e163. PMID: 38974175; PMCID: PMC11225079; DOI: 10.1002/oto2.163.
Abstract
Objective Evaluate the quality of responses from Chat Generative Pre-Trained Transformer (ChatGPT) models compared to the answers for "Frequently Asked Questions" (FAQs) from the American Academy of Otolaryngology-Head and Neck Surgery (AAO-HNS) Clinical Practice Guidelines (CPG) for Ménière's disease (MD). Study Design Comparative analysis. Setting The AAO-HNS CPG for MD includes FAQs that clinicians can give to patients for MD-related questions. The ability of ChatGPT to properly educate patients regarding MD is unknown. Methods ChatGPT-3.5 and 4.0 were each prompted with 16 questions from the MD FAQs. Each response was rated in terms of (1) comprehensiveness, (2) extensiveness, (3) presence of misleading information, and (4) quality of resources. Readability was assessed using Flesch-Kincaid Grade Level (FKGL) and Flesch Reading Ease Score (FRES). Results ChatGPT-3.5 was comprehensive in 5 responses whereas ChatGPT-4.0 was comprehensive in 9 (31.3% vs 56.3%, P = .2852). ChatGPT-3.5 and 4.0 were extensive in all responses (P = 1.0000). ChatGPT-3.5 was misleading in 5 responses whereas ChatGPT-4.0 was misleading in 3 (31.3% vs 18.75%, P = .6851). ChatGPT-3.5 had quality resources in 10 responses whereas ChatGPT-4.0 had quality resources in 16 (62.5% vs 100%, P = .0177). AAO-HNS CPG FRES (62.4 ± 16.6) demonstrated an appropriate readability score of at least 60, while both ChatGPT-3.5 (39.1 ± 7.3) and 4.0 (42.8 ± 8.5) failed to meet this standard. All platforms had FKGL means that exceeded the recommended level of 6 or lower. Conclusion While ChatGPT-4.0 had significantly better resource reporting, both models have room for improvement in being more comprehensive, more readable, and less misleading for patients.
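The FRES and FKGL thresholds cited above (a Flesch Reading Ease of at least 60, a grade level of 6 or lower) come from two standard formulas over average sentence length and syllables per word. A minimal Python sketch of both follows; note the syllable counter is a crude vowel-group heuristic of my own, not the dictionary-based counting used by published readability tools, so exact scores will differ from theirs:

```python
import re

def count_syllables(word: str) -> int:
    """Rough heuristic: count vowel groups, dropping a typical silent final 'e'."""
    word = word.lower()
    n = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and n > 1:
        n -= 1
    return max(n, 1)

def flesch_scores(text: str) -> tuple[float, float]:
    """Return (FRES, FKGL) using the standard Flesch formulas."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    wps = len(words) / len(sentences)          # words per sentence
    spw = syllables / len(words)               # syllables per word
    fres = 206.835 - 1.015 * wps - 84.6 * spw  # Flesch Reading Ease (higher = easier)
    fkgl = 0.39 * wps + 11.8 * spw - 15.59     # Flesch-Kincaid Grade Level
    return fres, fkgl

fres, fkgl = flesch_scores("The cat sat on the mat. It was happy.")
print(f"FRES={fres:.1f}, FKGL={fkgl:.1f}")
```

Short, simple sentences score well above the 60-point FRES threshold and below a sixth-grade FKGL; the dense ChatGPT responses in the study fall on the other side of both cutoffs.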
Affiliation(s)
- Rebecca A. Ho
- Department of Otolaryngology–Head and Neck Surgery, Rutgers New Jersey Medical School, Newark, New Jersey, USA
- Ariana L. Shaari
- Department of Otolaryngology–Head and Neck Surgery, Rutgers New Jersey Medical School, Newark, New Jersey, USA
- Paul T. Cowan
- Department of Otolaryngology–Head and Neck Surgery, Rutgers New Jersey Medical School, Newark, New Jersey, USA
- Kenneth Yan
- Department of Otolaryngology–Head and Neck Surgery, Rutgers New Jersey Medical School, Newark, New Jersey, USA
2. Brown EDL, Ward M, Maity A, Mittler MA, Larry Lo SF, D'Amico RS. Enhancing Diagnostic Support for Chiari Malformation and Syringomyelia: A Comparative Study of Contextualized ChatGPT Models. World Neurosurg 2024:S1878-8750(24)00935-5. PMID: 38830507; DOI: 10.1016/j.wneu.2024.05.172.
Abstract
OBJECTIVES The rapidly increasing adoption of large language models in medicine has drawn attention to potential applications within the field of neurosurgery. This study evaluates the effects of various contextualization methods on ChatGPT's ability to provide expert-consensus aligned recommendations on the diagnosis and management of Chiari Malformation and Syringomyelia. METHODS Native GPT4 and GPT4 models contextualized using various strategies were asked questions revised from the 2022 Chiari and Syringomyelia Consortium International Consensus Document. ChatGPT-provided responses were then compared to consensus statements using reviewer assessments of 1) responding to the prompt, 2) agreement of ChatGPT response with consensus statements, 3) recommendation to consult with a medical professional, and 4) presence of supplementary information. Flesch-Kincaid, SMOG, word count, and Gunning-Fog readability scores were calculated for each model using the quanteda package in R. RESULTS Relative to GPT4, all contextualized GPTs demonstrated increased agreement with consensus statements. PDF+Prompting and Prompting models provided the most elevated agreement scores of 19 of 24 and 23 of 24, respectively, versus 9 of 24 for GPT4 (p=.021, p=.001). A trend toward improved readability was observed when comparing contextualized models at large to ChatGPT4, with significant decreases in average word count (180.7 vs 382.3, p<.001) and Flesch-Kincaid Reading Ease score (11.7 vs 17.2, p=.033). CONCLUSIONS The enhanced performance observed in response to ChatGPT4 contextualization suggests broader applications of large language models in neurosurgery than what the current literature indicates. This study provides proof of concept for the use of contextualized GPT models in neurosurgical contexts and showcases the easy accessibility of improved model performance.
Affiliation(s)
- Ethan D L Brown
- Department of Neurologic Surgery, Donald and Barbara Zucker School of Medicine at Hofstra/Northwell, New York, USA
- Max Ward
- Department of Neurologic Surgery, Donald and Barbara Zucker School of Medicine at Hofstra/Northwell, New York, USA
- Apratim Maity
- Department of Neurologic Surgery, Donald and Barbara Zucker School of Medicine at Hofstra/Northwell, New York, USA
- Mark A Mittler
- Department of Neurologic Surgery, Donald and Barbara Zucker School of Medicine at Hofstra/Northwell, New York, USA
- Sheng-Fu Larry Lo
- Department of Neurologic Surgery, Donald and Barbara Zucker School of Medicine at Hofstra/Northwell, New York, USA
- Randy S D'Amico
- Department of Neurologic Surgery, Donald and Barbara Zucker School of Medicine at Hofstra/Northwell, New York, USA
3. Pressman SM, Borna S, Gomez-Cabello CA, Haider SA, Haider CR, Forte AJ. Clinical and Surgical Applications of Large Language Models: A Systematic Review. J Clin Med 2024; 13:3041. PMID: 38892752; PMCID: PMC11172607; DOI: 10.3390/jcm13113041.
Abstract
Background: Large language models (LLMs) represent a recent advancement in artificial intelligence with medical applications across various healthcare domains. The objective of this review is to highlight how LLMs can be utilized by clinicians and surgeons in their everyday practice. Methods: A systematic review was conducted following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines. Six databases were searched to identify relevant articles. Eligibility criteria emphasized articles focused primarily on clinical and surgical applications of LLMs. Results: The literature search yielded 333 results, with 34 meeting eligibility criteria. All articles were from 2023. There were 14 original research articles, four letters, one interview, and 15 review articles. These articles covered a wide variety of medical specialties, including various surgical subspecialties. Conclusions: LLMs have the potential to enhance healthcare delivery. In clinical settings, LLMs can assist in diagnosis, treatment guidance, patient triage, physician knowledge augmentation, and administrative tasks. In surgical settings, LLMs can assist surgeons with documentation, surgical planning, and intraoperative guidance. However, addressing their limitations and concerns, particularly those related to accuracy and biases, is crucial. LLMs should be viewed as tools to complement, not replace, the expertise of healthcare professionals.
Affiliation(s)
- Sahar Borna
- Division of Plastic Surgery, Mayo Clinic, Jacksonville, FL 32224, USA
- Syed Ali Haider
- Division of Plastic Surgery, Mayo Clinic, Jacksonville, FL 32224, USA
- Clifton R. Haider
- Department of Physiology and Biomedical Engineering, Mayo Clinic, Rochester, MN 55905, USA
- Antonio Jorge Forte
- Division of Plastic Surgery, Mayo Clinic, Jacksonville, FL 32224, USA
- Center for Digital Health, Mayo Clinic, Rochester, MN 55905, USA
4. Chalhoub R, Mouawad A, Aoun M, Daher M, El-Sett P, Kreichati G, Kharrat K, Sebaaly A. Will ChatGPT be Able to Replace a Spine Surgeon in the Clinical Setting? World Neurosurg 2024; 185:e648-e652. PMID: 38417624; DOI: 10.1016/j.wneu.2024.02.101.
Abstract
OBJECTIVE This study evaluates ChatGPT's performance in diagnosing and managing spinal pathologies. METHODS Patients were evaluated both by two spine surgeons, who discussed each case and reached a consensus, and by ChatGPT. Patient data, including demographics, symptoms, and available imaging reports, were collected using a standardized form. This information was then processed by ChatGPT for diagnosis and management recommendations. The study assessed ChatGPT's diagnostic and management accuracy through descriptive statistics, comparing its performance to that of experienced spine specialists. RESULTS A total of 97 patients with various spinal pathologies participated in the study (40 males and 57 females). ChatGPT achieved a 70% diagnostic accuracy rate and provided suitable management recommendations for 95% of patients. However, it struggled with certain pathologies, misdiagnosing 100% of vertebral trauma and facet joint syndrome cases, 40% of spondylolisthesis, stenosis, and scoliosis cases, and 22% of disc-related pathologies. Furthermore, ChatGPT's management recommendations were poor in 53% of cases, often failing to suggest the most appropriate treatment options and occasionally providing incomplete advice. CONCLUSIONS While helpful in the medical field, ChatGPT falls short of providing reliable management recommendations, with a 30% misdiagnosis rate and a 53% mismanagement rate in our study. Its limitations, including reliance on outdated data and the inability to interactively gather patient information, must be acknowledged. Surgeons should use ChatGPT cautiously as a supplementary tool rather than a substitute for their clinical expertise, as the complexities of healthcare demand human judgment and interaction.
Affiliation(s)
- Ralph Chalhoub
- Saint Joseph University, Faculty of Medicine, Beirut, Lebanon
- Antoine Mouawad
- Saint Joseph University, Faculty of Medicine, Beirut, Lebanon
- Marven Aoun
- Saint Joseph University, Faculty of Medicine, Beirut, Lebanon
- Mohammad Daher
- Saint Joseph University, Faculty of Medicine, Beirut, Lebanon; Department of Orthopedic Surgery, Brown University, Providence, Rhode Island, USA
- Pierre El-Sett
- Saint Joseph University, Faculty of Medicine, Beirut, Lebanon; Department of Orthopedic Surgery, Hotel Dieu de France Hospital, Beirut, Lebanon
- Gaby Kreichati
- Saint Joseph University, Faculty of Medicine, Beirut, Lebanon; Department of Orthopedic Surgery, Hotel Dieu de France Hospital, Beirut, Lebanon
- Khalil Kharrat
- Saint Joseph University, Faculty of Medicine, Beirut, Lebanon; Department of Orthopedic Surgery, Hotel Dieu de France Hospital, Beirut, Lebanon
- Amer Sebaaly
- Saint Joseph University, Faculty of Medicine, Beirut, Lebanon; Department of Orthopedic Surgery, Hotel Dieu de France Hospital, Beirut, Lebanon
5. Zhang S, Liau ZQG, Tan KLM, Chua WL. Evaluating the accuracy and relevance of ChatGPT responses to frequently asked questions regarding total knee replacement. Knee Surg Relat Res 2024; 36:15. PMID: 38566254; PMCID: PMC10986046; DOI: 10.1186/s43019-024-00218-5.
Abstract
BACKGROUND Chat Generative Pretrained Transformer (ChatGPT), a generative artificial intelligence chatbot, may have broad applications in healthcare delivery and patient education due to its ability to provide human-like responses to a wide range of patient queries. However, there is limited evidence regarding its ability to provide reliable and useful information on orthopaedic procedures. This study seeks to evaluate the accuracy and relevance of responses provided by ChatGPT to frequently asked questions (FAQs) regarding total knee replacement (TKR). METHODS A list of 50 clinically relevant FAQs regarding TKR was collated. Each question was individually entered as a prompt to ChatGPT (version 3.5), and the first response generated was recorded. Responses were then reviewed by two independent orthopaedic surgeons and graded on a Likert scale for their factual accuracy and relevance. These responses were then classified into accurate versus inaccurate and relevant versus irrelevant responses using preset thresholds on the Likert scale. RESULTS Most responses were accurate, while all responses were relevant. Of the 50 FAQs, 44 (88%) of ChatGPT's responses were classified as accurate, achieving a mean Likert grade of 4.6/5 for factual accuracy, while 50 (100%) were classified as relevant, achieving a mean Likert grade of 4.9/5 for relevance. CONCLUSION ChatGPT performed well in providing accurate and relevant responses to FAQs regarding TKR, demonstrating great potential as a tool for patient education. However, it is not infallible and can occasionally provide inaccurate medical information. Patients and clinicians intending to utilize this technology should be mindful of its limitations and ensure adequate supervision and verification of the information provided.
Affiliation(s)
- Siyuan Zhang
- Department of Orthopaedic Surgery, National University Health System, Level 11, NUHS Tower Block, 1E Kent Ridge Road, Singapore, 119228, Singapore
- Zi Qiang Glen Liau
- Department of Orthopaedic Surgery, National University Health System, Level 11, NUHS Tower Block, 1E Kent Ridge Road, Singapore, 119228, Singapore
- Kian Loong Melvin Tan
- Department of Orthopaedic Surgery, National University Health System, Level 11, NUHS Tower Block, 1E Kent Ridge Road, Singapore, 119228, Singapore
- Wei Liang Chua
- Department of Orthopaedic Surgery, National University Health System, Level 11, NUHS Tower Block, 1E Kent Ridge Road, Singapore, 119228, Singapore
6. Ozgor F, Caglar U, Halis A, Cakir H, Aksu UC, Ayranci A, Sarilar O. Urological Cancers and ChatGPT: Assessing the Quality of Information and Possible Risks for Patients. Clin Genitourin Cancer 2024; 22:454-457.e4. PMID: 38246831; DOI: 10.1016/j.clgc.2023.12.017.
Abstract
INTRODUCTION OpenAI has created ChatGPT, an artificial intelligence language model that has gained considerable recognition for its capacity to produce text responses resembling human language. This study seeks to evaluate the effectiveness of ChatGPT's responses in addressing publicly accessible queries related to prostate, kidney, bladder, and testicular cancers. MATERIAL AND METHODS A comprehensive compilation of frequently asked questions (FAQs) pertaining to prostate, bladder, kidney, and testicular cancers was gathered from diverse sources, and the recommendations outlined in the European Association of Urology (EAU) 2023 oncology guidelines were consulted. The questions chosen for evaluation were presented to ChatGPT 4.0 (premium version). The quality of ChatGPT's responses was appraised using the global quality score (GQS): each response was independently reviewed by a panel of physicians, who assigned a GQS score to assess its overall quality. RESULTS For prostate cancer, 64.6% of the questions had a GQS score of 5, compared to 62.9% for bladder, 68.1% for kidney, and 63.9% for testicular cancers, whereas none of the responses had a GQS score of 1. The category with the lowest proportion of responses scoring 5 for each disease was prognosis and follow-up. The mean GQS score of the answers to EAU guideline questions was statistically significantly lower than the mean score of the answers to FAQs. CONCLUSION ChatGPT is a valuable tool for addressing general inquiries regarding urological cancers, boasting commendable accuracy rates. Nonetheless, its performance in responding to questions aligned with the EAU guideline was deemed unsatisfactory.
Affiliation(s)
- Faruk Ozgor
- Department of Urology, Haseki Training and Research Hospital, Istanbul, Turkey
- Ufuk Caglar
- Department of Urology, Haseki Training and Research Hospital, Istanbul, Turkey
- Ahmet Halis
- Department of Urology, Haseki Training and Research Hospital, Istanbul, Turkey
- Hakan Cakir
- Department of Urology, Fulya Acibadem Hospital, Istanbul, Turkey
- Ufuk Can Aksu
- Department of Urology, Haseki Training and Research Hospital, Istanbul, Turkey
- Ali Ayranci
- Department of Urology, Haseki Training and Research Hospital, Istanbul, Turkey
- Omer Sarilar
- Department of Urology, Haseki Training and Research Hospital, Istanbul, Turkey
7. Dyckhoff-Shen S, Koedel U, Brouwer MC, Bodilsen J, Klein M. ChatGPT fails challenging the recent ESCMID brain abscess guideline. J Neurol 2024; 271:2086-2101. PMID: 38279999; PMCID: PMC10972965; DOI: 10.1007/s00415-023-12168-1.
Abstract
BACKGROUND With artificial intelligence (AI) on the rise, it remains unclear whether AI can professionally evaluate medical research and give scientifically valid recommendations. AIM This study aimed to assess the accuracy of ChatGPT's responses to ten key questions on brain abscess diagnostics and treatment in comparison to the guideline recently published by the European Society of Clinical Microbiology and Infectious Diseases (ESCMID). METHODS All ten PECO (Population, Exposure, Comparator, Outcome) questions developed during the guideline process were presented directly to ChatGPT. Next, ChatGPT was additionally fed data from the studies selected for each PECO question by the ESCMID committee. The AI's responses were subsequently compared with the recommendations of the ESCMID guideline. RESULTS For 17 of 20 challenges, ChatGPT was able to give recommendations on the management of patients with brain abscess, including grade of evidence and strength of recommendation. Without data prompting, 70% of questions were answered very similarly to the guideline recommendation, and where answers differed from the guideline, no patient hazard was present. Data input slightly improved the clarity of ChatGPT's recommendations but led to fewer correct answers, including two recommendations that directly contradicted the guideline and carried a possible hazard to the patient. CONCLUSION ChatGPT seems able to rapidly gather information on brain abscesses and, in most cases, give recommendations on key questions about their management. Nevertheless, individual responses could potentially harm patients. Thus, the expertise of an expert committee remains indispensable.
Affiliation(s)
- Susanne Dyckhoff-Shen
- Department of Neurology with Friedrich-Baur-Institute, LMU University Hospital, LMU Munich, Klinikum Grosshadern of the Ludwig Maximilians University of Munich, Marchioninistr. 15, 81377, Munich, Germany
- Uwe Koedel
- Department of Neurology with Friedrich-Baur-Institute, LMU University Hospital, LMU Munich, Klinikum Grosshadern of the Ludwig Maximilians University of Munich, Marchioninistr. 15, 81377, Munich, Germany
- Matthijs C Brouwer
- Department of Neurology, Amsterdam UMC, University of Amsterdam, Amsterdam Neuroscience, Amsterdam, The Netherlands
- European Society for Clinical Microbiology and Infectious Diseases (ESCMID) Study Group for Infections of the Brain (ESGIB), Basel, Switzerland
- Jacob Bodilsen
- Department of Infectious Diseases, Aalborg University Hospital, Aalborg, Denmark
- European Society for Clinical Microbiology and Infectious Diseases (ESCMID) Study Group for Infections of the Brain (ESGIB), Basel, Switzerland
- Matthias Klein
- Department of Neurology with Friedrich-Baur-Institute, LMU University Hospital, LMU Munich, Klinikum Grosshadern of the Ludwig Maximilians University of Munich, Marchioninistr. 15, 81377, Munich, Germany
- Emergency Department, LMU University Hospital, LMU Munich, Munich, Germany
- European Society for Clinical Microbiology and Infectious Diseases (ESCMID) Study Group for Infections of the Brain (ESGIB), Basel, Switzerland
8. Mejia MR, Arroyave JS, Saturno M, Ndjonko LCM, Zaidat B, Rajjoub R, Ahmed W, Zapolsky I, Cho SK. Use of ChatGPT for Determining Clinical and Surgical Treatment of Lumbar Disc Herniation With Radiculopathy: A North American Spine Society Guideline Comparison. Neurospine 2024; 21:149-158. PMID: 38291746; PMCID: PMC10992643; DOI: 10.14245/ns.2347052.526.
Abstract
OBJECTIVE Large language models like chat generative pre-trained transformer (ChatGPT) have found success in various sectors, but their application in the medical field remains limited. This study aimed to assess the feasibility of using ChatGPT to provide accurate medical information to patients, specifically evaluating how well ChatGPT versions 3.5 and 4 aligned with the 2012 North American Spine Society (NASS) guidelines for lumbar disk herniation with radiculopathy. METHODS ChatGPT's responses to questions based on the NASS guidelines were analyzed for accuracy. Three new categories (overconclusiveness, supplementary information, and incompleteness) were introduced to deepen the analysis: overconclusiveness referred to recommendations not mentioned in the NASS guidelines, supplementary information denoted additional relevant details, and incompleteness indicated crucial information from the NASS guidelines that was omitted. RESULTS Of the 29 clinical guidelines evaluated, ChatGPT-3.5 was accurate in 15 responses (52%) and ChatGPT-4 in 17 (59%). ChatGPT-3.5 was overconclusive in 14 responses (48%) and ChatGPT-4 in 13 (45%). ChatGPT-3.5 provided supplementary information in 24 responses (83%) and ChatGPT-4 in 27 (93%). ChatGPT-3.5 was incomplete in 11 responses (38%) and ChatGPT-4 in 8 (23%). CONCLUSION ChatGPT shows promise for clinical decision-making, but both patients and healthcare providers should exercise caution to ensure safety and quality of care. While these results are encouraging, further research is necessary to validate the use of large language models in clinical settings.
Affiliation(s)
- Mateo Restrepo Mejia
- Department of Orthopedic Surgery, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Juan Sebastian Arroyave
- Department of Orthopedic Surgery, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Michael Saturno
- Department of Orthopedic Surgery, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Bashar Zaidat
- Department of Orthopedic Surgery, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Rami Rajjoub
- Department of Orthopedic Surgery, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Wasil Ahmed
- Department of Orthopedic Surgery, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Ivan Zapolsky
- Department of Orthopedic Surgery, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Samuel K. Cho
- Department of Orthopedic Surgery, Icahn School of Medicine at Mount Sinai, New York, NY, USA
9. Wang X, Liu XQ. Potential and limitations of ChatGPT and generative artificial intelligence in medical safety education. World J Clin Cases 2023; 11:7935-7939. DOI: 10.12998/wjcc.v11.i32.7935.
Abstract
The primary objectives of medical safety education are to provide the public with essential knowledge about medications and to foster a scientific approach to drug usage. The era of using artificial intelligence to revolutionize medical safety education has already dawned, and ChatGPT and other generative artificial intelligence models have immense potential in this domain. Notably, they offer a wealth of knowledge, anonymity, continuous availability, and personalized services. However, the practical implementation of generative artificial intelligence models such as ChatGPT in medical safety education still faces several challenges, including concerns about the accuracy of information, legal responsibilities, and ethical obligations. Moving forward, it is crucial to intelligently upgrade ChatGPT by leveraging the strengths of existing medical practices. This task involves further integrating the model with real-life scenarios and proactively addressing ethical and security issues with the ultimate goal of providing the public with comprehensive, convenient, efficient, and personalized medical services.
Affiliation(s)
- Xin Wang
- School of Education, Tianjin University, Tianjin 300350, China
- Xin-Qiao Liu
- School of Education, Tianjin University, Tianjin 300350, China