1
Chen JS, Reddy AJ, Al-Sharif E, Shoji MK, Kalaw FGP, Eslani M, Lang PZ, Arya M, Koretz ZA, Bolo KA, Arnett JJ, Roginiel AC, Do JL, Robbins SL, Camp AS, Scott NL, Rudell JC, Weinreb RN, Baxter SL, Granet DB. Analysis of ChatGPT Responses to Ophthalmic Cases: Can ChatGPT Think like an Ophthalmologist? Ophthalmology Science 2025; 5:100600. [PMID: 39346575 PMCID: PMC11437840 DOI: 10.1016/j.xops.2024.100600]
Abstract
Objective: Large language models such as ChatGPT have demonstrated significant potential in question-answering within ophthalmology, but there is a paucity of literature evaluating their ability to generate clinical assessments and discussions. The objectives of this study were to (1) assess the accuracy of assessments and plans generated by ChatGPT and (2) evaluate ophthalmologists' abilities to distinguish between responses generated by clinicians versus ChatGPT. Design: Cross-sectional mixed-methods study. Subjects: Sixteen ophthalmologists from a single academic center, of whom 10 were board-eligible and 6 were board-certified, were recruited to participate in this study. Methods: Prompt engineering was used to ensure ChatGPT output discussions in the style of the ophthalmologist author of the Medical College of Wisconsin Ophthalmic Case Studies. Cases where ChatGPT accurately identified the primary diagnoses were included and then paired. Masked human-generated and ChatGPT-generated discussions were sent to participating ophthalmologists to identify the author of the discussions. Response confidence was assessed using a 5-point Likert scale score, and subjective feedback was manually reviewed. Main Outcome Measures: Accuracy of ophthalmologist identification of discussion author, as well as subjective perceptions of human-generated versus ChatGPT-generated discussions. Results: Overall, ChatGPT correctly identified the primary diagnosis in 15 of 17 (88.2%) cases. Two cases were excluded from the paired comparison due to hallucinations or fabrications of non-user-provided data. Ophthalmologists correctly identified the author in 77.9% ± 26.6% of the 13 included cases, with a mean Likert scale confidence rating of 3.6 ± 1.0. No significant differences in performance or confidence were found between board-certified and board-eligible ophthalmologists. Subjectively, ophthalmologists found that discussions written by ChatGPT tended to contain more generic responses and irrelevant information, hallucinated more frequently, and had distinct syntactic patterns (all P < 0.01). Conclusions: Large language models have the potential to synthesize clinical data and generate ophthalmic discussions. While these findings have exciting implications for artificial intelligence-assisted health care delivery, more rigorous real-world evaluation of these models is necessary before clinical deployment. Financial Disclosures: The author(s) have no proprietary or commercial interest in any materials discussed in this article.
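As a concrete illustration of the scoring used in this kind of masked author-identification evaluation, the short Python sketch below computes per-rater accuracy against the known author of each case and summarizes mean accuracy and Likert confidence. The arrays are made-up placeholders, not the study's data, and the abstract does not specify the statistical procedure behind the between-group comparison.

```python
import numpy as np

# Illustrative (made-up) data: 4 raters x 13 paired cases.
# true_author[c] is the actual author of case c; guesses[r][c] is rater r's call;
# confidence[r][c] is the corresponding 1-5 Likert confidence rating.
true_author = np.array(["human", "chatgpt"] * 6 + ["human"])       # 13 cases
rng = np.random.default_rng(0)
guesses = rng.choice(["human", "chatgpt"], size=(4, 13))           # hypothetical guesses
confidence = rng.integers(1, 6, size=(4, 13))                      # hypothetical ratings

per_rater_accuracy = (guesses == true_author).mean(axis=1) * 100   # % correct per rater
print(f"accuracy: {per_rater_accuracy.mean():.1f}% +/- {per_rater_accuracy.std(ddof=1):.1f}%")
print(f"confidence: {confidence.mean():.1f} +/- {confidence.std(ddof=1):.1f}")
```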
Affiliation(s)
- Jimmy S Chen, Fritz Gerald P Kalaw, Robert N Weinreb, Sally L Baxter: Viterbi Family Department of Ophthalmology, Shiley Eye Institute, University of California, San Diego, La Jolla, California; UCSD Health Department of Biomedical Informatics, University of California San Diego, La Jolla, California
- Akshay J Reddy: School of Medicine, California University of Science and Medicine, Colton, California
- Eman Al-Sharif: Viterbi Family Department of Ophthalmology, Shiley Eye Institute, University of California, San Diego, La Jolla, California; Surgery Department, College of Medicine, Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia
- Marissa K Shoji, Medi Eslani, Paul Z Lang, Malvika Arya, Zachary A Koretz, Kyle A Bolo, Justin J Arnett, Aliya C Roginiel, Jiun L Do, Shira L Robbins, Andrew S Camp, Nathan L Scott, Jolene C Rudell, David B Granet: Viterbi Family Department of Ophthalmology, Shiley Eye Institute, University of California, San Diego, La Jolla, California
2
Di Paolo LD, White B, Guénin-Carlut A, Constant A, Clark A. Active inference goes to school: the importance of active learning in the age of large language models. Philos Trans R Soc Lond B Biol Sci 2024; 379:20230148. [PMID: 39155715 PMCID: PMC11391319 DOI: 10.1098/rstb.2023.0148]
Abstract
Human learning essentially involves embodied interactions with the material world. But our worlds now include increasing numbers of powerful and (apparently) disembodied generative artificial intelligence (AI). In what follows we ask how best to understand these new (somewhat 'alien', because of their disembodied nature) resources and how to incorporate them in our educational practices. We focus on methodologies that encourage exploration and embodied interactions with 'prepared' material environments, such as the carefully organized settings of Montessori education. Using the active inference framework, we approach our questions by thinking about human learning as epistemic foraging and prediction error minimization. We end by arguing that generative AI should figure naturally as new elements in prepared learning environments by facilitating sequences of precise prediction error enabling trajectories of self-correction. In these ways, we anticipate new synergies between (apparently) disembodied and (essentially) embodied forms of intelligence. This article is part of the theme issue 'Minds in movement: embodied cognition in the age of artificial intelligence'.
Affiliation(s)
- Laura Desirèe Di Paolo: Department of Engineering and Informatics, The University of Sussex, Brighton, UK; School of Psychology, Children & Technology Lab, The University of Sussex, Falmer (Brighton), UK
- Ben White: Department of Philosophy, The University of Sussex, Sussex, UK
- Avel Guénin-Carlut: Department of Engineering and Informatics, The University of Sussex, Brighton, UK
- Axel Constant: Department of Engineering and Informatics, The University of Sussex, Brighton, UK
- Andy Clark: Department of Engineering and Informatics, The University of Sussex, Brighton, UK; Department of Philosophy, The University of Sussex, Sussex, UK; Department of Philosophy, Macquarie University, Sydney, New South Wales, Australia
3
Wang A, Zhou J, Zhang P, Cao H, Xin H, Xu X, Zhou H. Large language model answers medical questions about standard pathology reports. Front Med (Lausanne) 2024; 11:1402457. [PMID: 39359921 PMCID: PMC11445125 DOI: 10.3389/fmed.2024.1402457]
Abstract
This study aims to evaluate the feasibility of a large language model (LLM) in answering pathology questions based on pathology reports (PRs) of colorectal cancer (CRC). Four common questions (CQs) and corresponding answers about pathology were retrieved from public webpages. These questions were input as prompts for Chat Generative Pretrained Transformer (ChatGPT) (gpt-3.5-turbo). The quality indicators (understanding, scientificity, satisfaction) of all answers were evaluated by gastroenterologists. Standard PRs from 5 CRC patients who received radical surgeries in Shanghai Changzheng Hospital were selected. Six report questions (RQs) and corresponding answers were generated by a gastroenterologist and a pathologist. We developed an interactive PR interpretation system that allows users to upload standard PRs as JPG images, and ChatGPT's responses to the RQs were then generated. The quality indicators of all answers were evaluated by gastroenterologists and outpatients. For the CQs, gastroenterologists rated AI answers similarly to non-AI answers in understanding, scientificity, and satisfaction. For RQ1-3, gastroenterologists and patients rated the AI mean scores higher than the non-AI scores across the quality indicators. However, for RQ4-6, gastroenterologists rated the AI mean scores lower than the non-AI scores in understanding and satisfaction. In RQ4, gastroenterologists rated the AI scores lower than the non-AI scores in scientificity (P = 0.011); patients rated the AI scores lower than the non-AI scores in understanding (P = 0.004) and satisfaction (P = 0.011). In conclusion, the LLM could generate credible answers to common pathology questions and conceptual questions on the PRs. It holds great potential for improving doctor-patient communication.
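The abstract does not state which statistical test produced the reported p-values; for ordinal Likert-type quality ratings, a nonparametric comparison such as the Mann-Whitney U test is one common choice. The sketch below, with made-up ratings rather than the study's data, shows how such a comparison might be run in Python.

```python
from scipy.stats import mannwhitneyu

# Hypothetical 1-5 quality ratings for the same question set (illustrative only).
ai_scores     = [5, 4, 4, 5, 3, 4, 5, 4, 3, 4]   # ChatGPT-generated answers
non_ai_scores = [3, 4, 3, 3, 4, 3, 2, 4, 3, 3]   # clinician-written answers

stat, p_value = mannwhitneyu(ai_scores, non_ai_scores, alternative="two-sided")
print(f"Mann-Whitney U = {stat:.1f}, p = {p_value:.3f}")
```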
Affiliation(s)
- Anqi Wang: Division of Colorectal Surgery, Changzheng Hospital, Navy Medical University, Shanghai, China
- Jieli Zhou: UM-SJTU Joint Institute, Shanghai Jiao Tong University, Shanghai, China
- Peng Zhang: Division of Colorectal Surgery, Changzheng Hospital, Navy Medical University, Shanghai, China
- Haotian Cao: Department of Pathology, Changzheng Hospital, Navy Medical University, Shanghai, China
- Hongyi Xin: UM-SJTU Joint Institute, Shanghai Jiao Tong University, Shanghai, China
- Xinyun Xu: Division of Breast and Thyroid Surgery, Changzheng Hospital, Navy Medical University, Shanghai, China
- Haiyang Zhou: Division of Colorectal Surgery, Changzheng Hospital, Navy Medical University, Shanghai, China
4
Filetti S, Fenza G, Gallo A. Research design and writing of scholarly articles: new artificial intelligence tools available for researchers. Endocrine 2024; 85:1104-1116. [PMID: 39085566 DOI: 10.1007/s12020-024-03977-z]
5
Kim SE, Lee JH, Choi BS, Han HS, Lee MC, Ro DH. Performance of ChatGPT on Solving Orthopedic Board-Style Questions: A Comparative Analysis of ChatGPT 3.5 and ChatGPT 4. Clin Orthop Surg 2024; 16:669-673. [PMID: 39092297 PMCID: PMC11262944 DOI: 10.4055/cios23179]
Abstract
Background: The application of artificial intelligence and large language models in the medical field requires an evaluation of their accuracy in providing medical information. This study aimed to assess the performance of Chat Generative Pre-trained Transformer (ChatGPT) models 3.5 and 4 in solving orthopedic board-style questions. Methods: A total of 160 text-only questions from the Orthopedic Surgery Department at Seoul National University Hospital, conforming to the format of the Korean Orthopedic Association board certification examinations, were input into the ChatGPT 3.5 and ChatGPT 4 programs. The questions were divided into 11 subcategories. The accuracy rates of the initial answers provided by ChatGPT 3.5 and ChatGPT 4 were analyzed. In addition, inconsistency rates of answers were evaluated by regenerating the responses. Results: ChatGPT 3.5 answered 37.5% of the questions correctly, while ChatGPT 4 showed an accuracy rate of 60.0% (p < 0.001). ChatGPT 4 demonstrated superior performance across most subcategories, except for the tumor-related questions. The rates of inconsistency in answers were 47.5% for ChatGPT 3.5 and 9.4% for ChatGPT 4. Conclusions: ChatGPT 4 showed the ability to pass orthopedic board-style examinations, outperforming ChatGPT 3.5 in accuracy rate. However, inconsistencies in response generation and instances of incorrect answers with misleading explanations require caution when applying ChatGPT in clinical settings or for educational purposes.
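Because both models answered the same 160 questions, their accuracy rates form paired binary outcomes, and McNemar's test is one standard way to compare them; the abstract does not state which test was actually used. In the sketch below the marginal totals are chosen to match the reported 37.5% and 60.0%, but the split between the discordant cells is an assumption for illustration only.

```python
from statsmodels.stats.contingency_tables import mcnemar

# Hypothetical 2x2 table of question-level outcomes (160 questions):
# rows = ChatGPT 3.5 correct / incorrect, cols = ChatGPT 4 correct / incorrect
table = [[55,  5],    # both correct / only ChatGPT 3.5 correct
         [41, 59]]    # only ChatGPT 4 correct / both incorrect

result = mcnemar(table, exact=True)   # exact binomial test on the discordant cells
print(f"statistic = {result.statistic}, p = {result.pvalue:.4f}")
```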
Affiliation(s)
- Sung Eun Kim, Ji Han Lee, Byung Sun Choi, Hyuk-Soo Han, Myung Chul Lee, Du Hyun Ro: Department of Orthopedic Surgery, Seoul National University Hospital, Seoul National University College of Medicine, Seoul, Korea
6
Yaïci R, Cieplucha M, Bock R, Moayed F, Bechrakis NE, Berens P, Feltgen N, Friedburg D, Gräf M, Guthoff R, Hoffmann EM, Hoerauf H, Hintschich C, Kohnen T, Messmer EM, Nentwich MM, Pleyer U, Schaudig U, Seitz B, Geerling G, Roth M. [ChatGPT and the German board examination for ophthalmology: an evaluation]. Die Ophthalmologie 2024; 121:554-564. [PMID: 38801461 DOI: 10.1007/s00347-024-02046-0]
Abstract
PURPOSE In recent years, artificial intelligence (AI), as a new segment of computer science, has become increasingly important in medicine. The aim of this project was to investigate whether the current version of ChatGPT (ChatGPT 4.0) is able to answer open questions that could be asked in the context of a German board examination in ophthalmology. METHODS After excluding image-based questions, 10 questions from each of 15 different chapters/topics were selected from the textbook 1000 questions in ophthalmology (1000 Fragen Augenheilkunde, 2nd edition, 2014). ChatGPT was instructed by means of a prompt to assume the role of a board-certified ophthalmologist and to concentrate on the essentials when answering. A human expert with considerable expertise in the respective topic evaluated the answers regarding their correctness, relevance, and internal coherence. Additionally, the overall performance was rated with school grades, and it was assessed whether the answers would have been sufficient to pass the ophthalmology board examination. RESULTS ChatGPT would have passed the board examination in 12 out of 15 topics. The overall performance, however, was limited, with only 53.3% completely correct answers. While the correctness of the results in the different topics was highly variable (uveitis and lens/cataract 100%; optics and refraction 20%), the answers consistently had a high thematic fit (70%) and internal coherence (71%). CONCLUSION The fact that ChatGPT 4.0 would have passed the specialist examination in 12 out of 15 topics is remarkable considering that this AI was not specifically trained for medical questions; however, there is considerable performance variability between topics, with some serious shortcomings that currently rule out its safe use in clinical practice.
Affiliation(s)
- Rémi Yaïci, M Cieplucha, R Bock, F Moayed, R Guthoff, G Geerling, M Roth: Klinik für Augenheilkunde, Medizinische Fakultät, Universitätsklinikum Düsseldorf, Heinrich-Heine Universität Düsseldorf, Moorenstr. 5, 40225, Düsseldorf, Deutschland
- N E Bechrakis: Augenklinik, Universitätsklinikum Essen, Essen, Deutschland
- P Berens: Hertie Institute for AI in Brain Health (Hertie AI), Tübingen, Deutschland
- N Feltgen: Augenklinik, Universitätsspital Basel, Basel, Schweiz
- M Gräf: Universitätsklinikum Gießen und Marburg, Marburg, Gießen, Deutschland
- E M Hoffmann: Augenklinik, Universitätsklinikum Mainz, Mainz, Deutschland
- H Hoerauf: Augenklinik, Universitätsklinikum Göttingen, Göttingen, Deutschland
- C Hintschich, E M Messmer: Augenklinik und Poliklinik, LMU Klinikum, Ludwigs-Maximilians-Universität München, München, Deutschland
- T Kohnen: Augenklinik, Universitätsklinikum Frankfurt, Frankfurt, Deutschland
- M M Nentwich: Augenklinik, Universitätsklinikum Würzburg, Würzburg, Deutschland
- U Pleyer: Charité - Universitätsmedizin Berlin, Berlin, Deutschland
- U Schaudig: Asklepios Klinik Barmbek, Hamburg, Deutschland
- B Seitz: Klinik für Augenheilkunde, Universitätsklinikum des Saarlandes, Homburg, Deutschland
7
Ahmed W, Zaidat B, Duey A, Saturno M, Cho S. Answer to the Letter to the Editor of G. Shen, et al. concerning "ChatGPT versus NASS clinical guidelines for degenerative spondylolisthesis: a comparative analysis" by Ahmed W, et al. (Eur Spine J [2024]: doi:10.1007/s00586-024-08198-6). Eur Spine J 2024; 33:2920. [PMID: 38695950 DOI: 10.1007/s00586-024-08282-x]
Affiliation(s)
- Wasil Ahmed, Bashar Zaidat, Akiro Duey, Michael Saturno, Samuel Cho: Department of Orthopedics, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY, 10029, USA
8
Shin E, Yu Y, Bies RR, Ramanathan M. Evaluation of ChatGPT and Gemini large language models for pharmacometrics with NONMEM. J Pharmacokinet Pharmacodyn 2024; 51:187-197. [PMID: 38656706 DOI: 10.1007/s10928-024-09921-y]
Abstract
To assess the ChatGPT 4.0 (ChatGPT) and Gemini Ultra 1.0 (Gemini) large language models on NONMEM coding tasks relevant to pharmacometrics and clinical pharmacology. ChatGPT and Gemini were assessed on tasks mimicking real-world applications of NONMEM. The tasks ranged from providing a curriculum for learning NONMEM and an overview of NONMEM code structure to generating code. Prompts in lay language were used to elicit NONMEM code for a linear pharmacokinetic (PK) model with oral administration and for a more complex model with two parallel first-order absorption mechanisms. Reproducibility and the impact of "temperature" hyperparameter settings were assessed. The code was reviewed by two NONMEM experts. ChatGPT and Gemini provided NONMEM curriculum structures combining foundational knowledge with advanced concepts (e.g., covariate modeling and Bayesian approaches) and practical skills, including NONMEM code structure and syntax. ChatGPT provided an informative summary of the NONMEM control stream structure and outlined the key NONMEM Translator (NM-TRAN) records needed. ChatGPT and Gemini were able to generate code blocks for the NONMEM control stream from the lay language prompts for the two coding tasks. The control streams contained focal structural and syntax errors that required revision before they could be executed without errors and warnings. The code output from ChatGPT and Gemini was not reproducible, and varying the temperature hyperparameter did not reduce the errors and omissions substantively. Large language models may be useful in pharmacometrics for efficiently generating an initial coding template for modeling projects. However, the output can contain errors and omissions that require correction.
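For orientation, the "linear pharmacokinetic model with oral administration" referred to above corresponds to the standard one-compartment model with first-order absorption and elimination, whose concentration-time course is C(t) = F*Dose*ka / (V*(ka - ke)) * (exp(-ke*t) - exp(-ka*t)). The Python sketch below evaluates that profile for illustrative parameter values; it is not the NONMEM code the study elicited, and all numbers are assumptions.

```python
import numpy as np

def oral_one_compartment(t, dose=100.0, F=1.0, ka=1.5, ke=0.2, V=30.0):
    """Concentration-time profile (mg/L) for a one-compartment model with
    first-order absorption and elimination; parameters are illustrative."""
    return (F * dose * ka) / (V * (ka - ke)) * (np.exp(-ke * t) - np.exp(-ka * t))

t = np.linspace(0, 24, 49)            # hours
conc = oral_one_compartment(t)
print(f"Cmax ~ {conc.max():.2f} mg/L at t ~ {t[conc.argmax()]:.1f} h")
```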
Affiliation(s)
- Euibeom Shin, Yifan Yu, Robert R Bies, Murali Ramanathan: Department of Pharmaceutical Sciences, University at Buffalo, The State University of New York, Buffalo, NY, 14214-8033, USA
9
Paul S, Govindaraj S, Jk J. ChatGPT Versus National Eligibility cum Entrance Test for Postgraduate (NEET PG). Cureus 2024; 16:e63048. [PMID: 39050297 PMCID: PMC11268980 DOI: 10.7759/cureus.63048]
Abstract
Introduction With both suspicion and excitement, artificial intelligence tools are being integrated into nearly every aspect of human existence, including medical sciences and medical education. The newest large language model (LLM) in the class of autoregressive language models is ChatGPT. While ChatGPT's potential to revolutionize clinical practice and medical education is under investigation, further research is necessary to understand its strengths and limitations in this field comprehensively. Methods Two hundred National Eligibility cum Entrance Test for Postgraduate 2023 questions were gathered from various public education websites and individually entered into Microsoft Bing (GPT-4 Version 2.2.1). Microsoft Bing Chatbot is currently the only platform incorporating all of GPT-4's multimodal features, including image recognition. The results were subsequently analyzed. Results Out of 200 questions, ChatGPT-4 answered 129 correctly. The most tested specialties were medicine (15%), obstetrics and gynecology (15%), general surgery (14%), and pathology (10%), respectively. Conclusion This study sheds light on how well the GPT-4 performs in addressing the NEET-PG entrance test. ChatGPT has potential as an adjunctive instrument within medical education and clinical settings. Its capacity to react intelligently and accurately in complicated clinical settings demonstrates its versatility.
Affiliation(s)
- Sam Paul: General Surgery, St John's Medical College Hospital, Bengaluru, IND
- Sridar Govindaraj: Surgical Gastroenterology and Laparoscopy, St John's Medical College Hospital, Bengaluru, IND
- Jerisha Jk: Pediatrics and Neonatology, Christian Medical College Ludhiana, Ludhiana, IND
10
Uygun İlikhan S, Özer M, Tanberkan H, Bozkurt V. How to mitigate the risks of deployment of artificial intelligence in medicine? Turk J Med Sci 2024; 54:483-492. [PMID: 39050000 PMCID: PMC11265878 DOI: 10.55730/1300-0144.5814]
Abstract
The aim of this study is to examine the risks associated with the use of artificial intelligence (AI) in medicine and to offer policy suggestions to reduce these risks and optimize the benefits of AI technology. AI is a multifaceted technology. If harnessed effectively, it has the capacity to significantly impact the future of humanity in the field of health, as well as in several other areas. However, the rapid spread of this technology also raises significant ethical, legal, and social issues. This study examines the potential dangers of AI integration in medicine by reviewing current scientific work and exploring strategies to mitigate these risks. Biases in the data sets used by AI systems can lead to inequities in health care: training data that narrowly represents a particular demographic group can lead to biased results from AI systems for those who do not belong to that group. In addition, the concepts of explainability and accountability in AI systems could create challenges for healthcare professionals in understanding and evaluating AI-generated diagnoses or treatment recommendations. This could jeopardize patient safety and lead to the selection of inappropriate treatments. Ensuring the security of personal health information will be critical as AI systems become more widespread. Therefore, improving patient privacy and security protocols for AI systems is imperative. This study offers suggestions for reducing the risks associated with the increasing use of AI systems in the medical sector. These include increasing AI literacy, implementing a participatory society-in-the-loop management strategy, and creating ongoing education and auditing systems. Integrating ethical principles and cultural values into the design of AI systems can help reduce healthcare disparities and improve patient care. Implementing these recommendations will help ensure the efficient and equitable use of AI systems in medicine, improve the quality of healthcare services, and ensure patient safety.
Affiliation(s)
- Sevil Uygun İlikhan: Department of Internal Medicine Sciences, Gülhane Faculty of Medicine, University of Health Sciences, Ankara, Turkiye
- Mahmut Özer: Commission of National Education, Culture, Youth and Sports of the Parliament, Ankara, Turkiye
- Veysel Bozkurt: Department of Economic Sociology, Faculty of Economics, İstanbul University, İstanbul, Turkiye
11
Khan AA, Yunus R, Sohail M, Rehman TA, Saeed S, Bu Y, Jackson CD, Sharkey A, Mahmood F, Matyal R. Artificial Intelligence for Anesthesiology Board-Style Examination Questions: Role of Large Language Models. J Cardiothorac Vasc Anesth 2024; 38:1251-1259. [PMID: 38423884 DOI: 10.1053/j.jvca.2024.01.032]
Abstract
New artificial intelligence tools have been developed that have implications for medical usage. Large language models (LLMs), such as the widely used ChatGPT developed by OpenAI, have not been explored in the context of anesthesiology education. Understanding the reliability of various publicly available LLMs for medical specialties could offer insight into their understanding of the physiology, pharmacology, and practical applications of anesthesiology. An exploratory prospective review was conducted using 3 commercially available LLMs--OpenAI's ChatGPT GPT-3.5 version (GPT-3.5), OpenAI's ChatGPT GPT-4 (GPT-4), and Google's Bard--on questions from a widely used anesthesia board examination review book. Of the 884 eligible questions, the overall correct answer rates were 47.9% for GPT-3.5, 69.4% for GPT-4, and 45.2% for Bard. GPT-4 exhibited significantly higher performance than both GPT-3.5 and Bard (p = 0.001 and p < 0.001, respectively). None of the LLMs met the criteria required to secure American Board of Anesthesiology certification, according to the 70% passing score approximation. GPT-4 significantly outperformed GPT-3.5 and Bard in terms of overall performance, but lacked consistency in providing explanations that aligned with scientific and medical consensus. Although GPT-4 shows promise, current LLMs are not sufficiently advanced to answer anesthesiology board examination questions with passing success. Further iterations and domain-specific training may enhance their utility in medical education.
Affiliation(s)
- Adnan A Khan, Rayaan Yunus, Mahad Sohail, Taha A Rehman, Shirin Saeed, Yifan Bu, Cullen D Jackson, Aidan Sharkey, Feroze Mahmood, Robina Matyal: Department of Anesthesia, Critical Care, and Pain Medicine, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA
12
Shin E, Ramanathan M. Evaluation of prompt engineering strategies for pharmacokinetic data analysis with the ChatGPT large language model. J Pharmacokinet Pharmacodyn 2024; 51:101-108. [PMID: 37952004 DOI: 10.1007/s10928-023-09892-6]
Abstract
To systematically assess the ChatGPT large language model on diverse tasks relevant to pharmacokinetic data analysis. ChatGPT was evaluated with prototypical tasks related to report writing, code generation, non-compartmental analysis, and pharmacokinetic word problems. The writing task consisted of writing an introduction for this paper from a draft title. The coding tasks consisted of generating R code for semi-logarithmic graphing of concentration-time profiles and calculating the area under the curve and the area under the moment curve from time zero to infinity. Pharmacokinetics word problems on single intravenous, extravascular bolus, and multiple dosing were taken from a pharmacokinetics textbook. Chain-of-thought and problem separation were assessed as prompt engineering strategies when errors occurred. ChatGPT showed satisfactory performance on the report writing and code generation tasks and provided accurate information on the principles and methods underlying pharmacokinetic data analysis. However, ChatGPT had high error rates in numerical calculations involving exponential functions. The outputs generated by ChatGPT were not reproducible: the precise content of the output was variable, albeit not necessarily erroneous, for different instances of the same prompt. Incorporation of prompt engineering strategies reduced but did not eliminate errors in numerical calculations. ChatGPT has the potential to become a powerful productivity tool for writing, knowledge encapsulation, and coding tasks in pharmacokinetic data analysis. The poor accuracy of ChatGPT in numerical calculations requires resolution before it can be reliably used for PK and pharmacometrics data analysis.
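As context for the non-compartmental analysis task mentioned above, AUC and AUMC from time zero to infinity are typically obtained by trapezoidal integration up to the last sampling time plus an extrapolated tail based on the terminal rate constant: AUC_inf = AUC_last + C_last/lambda_z and AUMC_inf = AUMC_last + t_last*C_last/lambda_z + C_last/lambda_z**2. The Python sketch below uses illustrative data, not the prompts or outputs from the study.

```python
import numpy as np

# Illustrative concentration-time data (hours, mg/L)
t = np.array([0.0, 0.5, 1, 2, 4, 8, 12, 24])
c = np.array([0.0, 2.2, 3.5, 3.1, 2.0, 0.9, 0.45, 0.06])

# Linear trapezoidal rule up to the last observation
auc_last = np.sum(np.diff(t) * (c[1:] + c[:-1]) / 2)
aumc_last = np.sum(np.diff(t) * (c[1:] * t[1:] + c[:-1] * t[:-1]) / 2)

# Terminal rate constant from log-linear regression over the last 3 points
slope, _ = np.polyfit(t[-3:], np.log(c[-3:]), 1)
lz = -slope

auc_inf = auc_last + c[-1] / lz
aumc_inf = aumc_last + t[-1] * c[-1] / lz + c[-1] / lz**2
print(f"AUC_inf = {auc_inf:.2f} mg*h/L, MRT = {aumc_inf / auc_inf:.2f} h")
```

The linear rule is used here only for brevity; log-linear or mixed trapezoidal rules are often preferred for the declining phase.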
Affiliation(s)
- Euibeom Shin, Murali Ramanathan: Department of Pharmaceutical Sciences, University at Buffalo, The State University of New York, 355 Pharmacy, Buffalo, NY, 14214-8033, USA
13
Javid M, Bhandari M, Parameshwari P, Reddiboina M, Prasad S. Evaluation of ChatGPT for Patient Counseling in Kidney Stone Clinic: A Prospective Study. J Endourol 2024; 38:377-383. [PMID: 38411835 DOI: 10.1089/end.2023.0571]
Abstract
Introduction: Large language models (LLMs) have the potential to improve clinical workflow and make patient care more efficient. We prospectively evaluated the performance of the LLM ChatGPT as a patient counseling tool in the urology stone clinic and compared the generated responses with those of urologists. Methods: We collected 61 questions from 12 kidney stone patients and posed them to ChatGPT and to a panel of experienced urologists (Level 1). Subsequently, the blinded responses of the urologists and ChatGPT were presented to two expert urologists (Level 2) for comparative evaluation on preset domains: accuracy, relevance, empathy, completeness, and practicality. All responses were rated on a Likert scale of 1 to 10 for psychometric response evaluation. The mean difference in the scores given by the Level 2 urologists was analyzed, and interrater reliability (IRR), the level of agreement between the Level 2 urologists, was assessed with Cohen's kappa. Results: The mean scores of the responses from ChatGPT and the urologists differed significantly in accuracy (p < 0.001), empathy (p < 0.001), completeness (p < 0.001), and practicality (p < 0.001), but not in the relevance domain (p = 0.051), with ChatGPT's responses being rated higher. The IRR analysis revealed significant agreement only in the empathy domain [k = 0.163 (0.059-0.266)]. Conclusion: We believe the introduction of ChatGPT into the clinical workflow could further optimize the information provided to patients in a busy stone clinic. In this preliminary study, ChatGPT supplemented the answers provided by the urologists, adding value to the conversation. However, in its current state, it is still not ready to be a direct source of authentic information for patients. We recommend its use as a source to build a comprehensive Frequently Asked Questions bank as a prelude to developing an LLM chatbot for patient counseling.
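For readers unfamiliar with the IRR metric used above, Cohen's kappa measures agreement between two raters beyond chance; with ordinal Likert scores a weighted variant is often preferred. A small sketch using scikit-learn with made-up ratings, not the study's data:

```python
from sklearn.metrics import cohen_kappa_score

# Hypothetical 1-10 Likert ratings from two expert reviewers for ten responses
rater_a = [8, 7, 9, 6, 7, 8, 5, 9, 6, 7]
rater_b = [7, 7, 8, 6, 6, 8, 6, 9, 5, 7]

kappa = cohen_kappa_score(rater_a, rater_b)                          # unweighted
kappa_w = cohen_kappa_score(rater_a, rater_b, weights="quadratic")   # weighted for ordinal data
print(f"kappa = {kappa:.3f}, quadratic-weighted kappa = {kappa_w:.3f}")
```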
Affiliation(s)
- Mohamed Javid: Department of Urology, Chengalpattu Medical College, Chengalpattu, Tamil Nadu, India
- Mahendra Bhandari: Vattikuti Urology Institute, Henry Ford Hospital, Detroit, Michigan, USA
- P Parameshwari: Department of Community Medicine, Chengalpattu Medical College, Chengalpattu, Tamil Nadu, India
- Srikala Prasad: Department of Urology, Chengalpattu Medical College, Chengalpattu, Tamil Nadu, India
14
Abou-Abdallah M, Dar T, Mahmudzade Y, Michaels J, Talwar R, Tornari C. The quality and readability of patient information provided by ChatGPT: can AI reliably explain common ENT operations? Eur Arch Otorhinolaryngol 2024. [PMID: 38530460 DOI: 10.1007/s00405-024-08598-w]
Abstract
PURPOSE Access to high-quality and comprehensible patient information is crucial. However, information provided by increasingly prevalent Artificial Intelligence tools has not been thoroughly investigated. This study assesses the quality and readability of information from ChatGPT regarding three index ENT operations: tonsillectomy, adenoidectomy, and grommets. METHODS We asked ChatGPT standard and simplified questions. Readability was calculated using Flesch-Kincaid Reading Ease Score (FRES), Flesch-Kincaid Grade Level (FKGL), Gunning Fog Index (GFI) and Simple Measure of Gobbledygook (SMOG) scores. We assessed quality using the DISCERN instrument and compared these with ENT UK patient leaflets. RESULTS ChatGPT readability was poor, with mean FRES of 38.9 and 55.1 pre- and post-simplification, respectively. Simplified information from ChatGPT was 43.6% more readable (FRES) but scored 11.6% lower for quality. ENT UK patient information readability and quality were consistently higher. CONCLUSIONS ChatGPT can simplify information at the expense of quality, resulting in shorter answers with important omissions. Limitations in knowledge and insight curb its reliability for healthcare information. Patients should use reputable sources from professional organisations alongside clear communication with their clinicians for well-informed consent and decision-making.
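For context, the Flesch Reading Ease Score and Flesch-Kincaid Grade Level reported above are simple functions of sentence length and syllable counts: FRES = 206.835 - 1.015*(words/sentences) - 84.6*(syllables/words), and FKGL = 0.39*(words/sentences) + 11.8*(syllables/words) - 15.59. The sketch below uses a naive vowel-group syllable counter, so its values will only approximate those from dedicated readability tools (for example, the textstat package also covers GFI and SMOG).

```python
import re

def count_syllables(word: str) -> int:
    # Crude heuristic: count groups of consecutive vowels (minimum 1 per word).
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def readability(text: str):
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    wps, spw = len(words) / sentences, syllables / len(words)
    fres = 206.835 - 1.015 * wps - 84.6 * spw
    fkgl = 0.39 * wps + 11.8 * spw - 15.59
    return fres, fkgl

sample = "A grommet is a tiny tube placed in the eardrum. It lets air enter the middle ear."
fres, fkgl = readability(sample)
print(f"FRES = {fres:.1f}, FKGL = {fkgl:.1f}")
```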
Affiliation(s)
- Michel Abou-Abdallah, Talib Dar, Joshua Michaels, Rishi Talwar, Chrysostomos Tornari: Ear, Nose and Throat Department, Luton and Dunstable University Hospital, Lewsey Rd, Luton, LU4 0DZ, UK
- Yasamin Mahmudzade: Foundation Programme, East and North Hertfordshire NHS Trust, Stevenage, UK
15
Sandmann S, Riepenhausen S, Plagwitz L, Varghese J. Systematic analysis of ChatGPT, Google search and Llama 2 for clinical decision support tasks. Nat Commun 2024; 15:2050. [PMID: 38448475 PMCID: PMC10917796 DOI: 10.1038/s41467-024-46411-8]
Abstract
It is likely that individuals are turning to Large Language Models (LLMs) to seek health advice, much like searching for diagnoses on Google. We evaluate the clinical accuracy of GPT-3.5 and GPT-4 for suggesting initial diagnosis, examination steps and treatment of 110 medical cases across diverse clinical disciplines. Moreover, two model configurations of the Llama 2 open source LLMs are assessed in a sub-study. For benchmarking the diagnostic task, we conduct a naïve Google search for comparison. Overall, GPT-4 performed best, with superior performance over GPT-3.5 for diagnosis and examination and superior performance over Google for diagnosis. Except for treatment, better performance on frequent vs rare diseases is evident for all three approaches. The sub-study indicates slightly lower performances for Llama models. In conclusion, the commercial LLMs show growing potential for medical question answering in two successive major releases. However, some weaknesses underscore the need for robust and regulated AI models in health care. Open source LLMs can be a viable option to address specific needs regarding data privacy and transparency of training.
Affiliation(s)
- Sarah Sandmann, Sarah Riepenhausen, Lucas Plagwitz, Julian Varghese: Institute of Medical Informatics, University of Münster, Münster, Germany
16
Romano MF, Shih LC, Paschalidis IC, Au R, Kolachalama VB. Large Language Models in Neurology Research and Future Practice. Neurology 2023; 101:1058-1067. [PMID: 37816646 PMCID: PMC10752640 DOI: 10.1212/wnl.0000000000207967]
Abstract
Recent advancements in generative artificial intelligence, particularly using large language models (LLMs), are gaining increased public attention. We provide a perspective on the potential of LLMs to analyze enormous amounts of data from medical records and gain insights on specific topics in neurology. In addition, we explore use cases for LLMs, such as early diagnosis, supporting patients and caregivers, and acting as an assistant for clinicians. We point to the potential ethical and technical challenges raised by LLMs, such as concerns about privacy and data security, potential biases in the data for model training, and the need for careful validation of results. Researchers must consider these challenges and take steps to address them to ensure that their work is conducted in a safe and responsible manner. Despite these challenges, LLMs offer promising opportunities for improving care and treatment of various neurologic disorders.
Affiliation(s)
- Michael F Romano, Ludy C Shih, Ioannis C Paschalidis, Rhoda Au, Vijaya B Kolachalama: From the Department of Medicine (M.F.R., R.A., V.B.K.), Boston University Chobanian & Avedisian School of Medicine, MA; Department of Radiology and Biomedical Imaging (M.F.R.), University of California, San Francisco; Department of Neurology (L.C.S., R.A.), Boston University Chobanian & Avedisian School of Medicine; Department of Electrical and Computer Engineering (I.C.P.), Division of Systems Engineering, and Department of Biomedical Engineering; Faculty of Computing and Data Sciences (I.C.P., V.B.K.), Boston University; Department of Anatomy and Neurobiology (R.A.); The Framingham Heart Study, Boston University Chobanian & Avedisian School of Medicine; Department of Epidemiology, Boston University School of Public Health; Boston University Alzheimer's Disease Research Center (R.A.); and Department of Computer Science (V.B.K.), Boston University, MA
17
Tay JQ. Re: Online patient education in body contouring: A comparison between Google and ChatGPT. J Plast Reconstr Aesthet Surg 2023; 87:440-441. [PMID: 37944454 DOI: 10.1016/j.bjps.2023.10.121]
Affiliation(s)
- Jing Qin Tay: Plastic, Burns and Reconstructive Surgery Department, Salisbury District Hospital, Thames Valley/Wessex Deanery, UK
18
Gilvaz VJ, Reginato AM. Artificial intelligence in rheumatoid arthritis: potential applications and future implications. Front Med (Lausanne) 2023; 10:1280312. [PMID: 38034534 PMCID: PMC10687464 DOI: 10.3389/fmed.2023.1280312]
Abstract
The widespread adoption of digital health records, coupled with the rise of advanced diagnostic testing, has resulted in an explosion of patient data, comparable in scope to genomic datasets. This vast information repository offers significant potential for improving patient outcomes and decision-making, provided one can extract meaningful insights from it. This is where artificial intelligence (AI) tools like machine learning (ML) and deep learning come into play, helping us leverage these enormous datasets to predict outcomes and make informed decisions. AI models can be trained to analyze and interpret patient data, including physician notes, laboratory testing, and imaging, to aid in the management of patients with rheumatic diseases. As one of the most common autoimmune diseases, rheumatoid arthritis (RA) has attracted considerable attention, particularly concerning the evolution of diagnostic techniques and therapeutic interventions. Our aim is to underscore those areas where AI, according to recent research, demonstrates promising potential to enhance the management of patients with RA.
Affiliation(s)
- Vinit J. Gilvaz: Division of Rheumatology, Department of Medicine, Rhode Island Hospital, Warren Alpert Medical School of Brown University, Providence, RI, United States
- Anthony M. Reginato: Division of Rheumatology, Department of Medicine, Rhode Island Hospital, Warren Alpert Medical School of Brown University, Providence, RI, United States; Department of Dermatology, Rhode Island Hospital, Warren Alpert Medical School of Brown University, Providence, RI, United States
19
Wu RT, Dang RR. ChatGPT in head and neck scientific writing: A precautionary anecdote. Am J Otolaryngol 2023; 44:103980. [PMID: 37459740 DOI: 10.1016/j.amjoto.2023.103980]
Abstract
PURPOSE To evaluate the accuracy of ChatGPT references in scientific writing relevant to head and neck surgery. MATERIALS AND METHODS Five commonly researched keywords relevant to head and neck surgery were selected (osteoradionecrosis of the jaws, oral cancer, adjuvant therapy for oral cancer, TORS, and free flap reconstruction in oral cancer). The AI chatbot was then asked to provide ten complete citations for each of the keywords. Two independent authors reviewed the results for accuracy and assigned each article a numerical score based on pre-selected criteria. RESULTS Among 50 total references provided by ChatGPT, only five (10%) were found to have the correct title, journal, authors, year of publication, and DOI. Merely 14% of the presented references had a correct DOI. References regarding free flap reconstruction for oral cancer were the least accurate of the five categories, with no correct DOIs. Complete inter-rater agreement was noted while evaluating the citations. CONCLUSION Only 10% of the articles provided by ChatGPT relevant to head and neck surgery were correct. A high degree of academic hallucination was noted.
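One practical way to screen model-generated references like those evaluated above is to look each DOI up in a bibliographic registry. The sketch below assumes the public Crossref REST API (https://api.crossref.org) and the requests package; it simply checks whether a DOI resolves and whether the registered title loosely matches the claimed one, and is an illustration rather than the scoring procedure used in the study.

```python
import requests

def check_doi(doi: str, claimed_title: str) -> bool:
    """Return True if the DOI is registered in Crossref and its title loosely matches."""
    resp = requests.get(f"https://api.crossref.org/works/{doi}", timeout=10)
    if resp.status_code != 200:
        return False                        # DOI not registered
    titles = resp.json()["message"].get("title") or []
    if not titles:
        return False
    registered, claimed = titles[0].lower(), claimed_title.lower()
    return claimed in registered or registered in claimed

# Example: verify the citation for this very entry
print(check_doi("10.1016/j.amjoto.2023.103980",
                "ChatGPT in head and neck scientific writing: A precautionary anecdote"))
```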
Affiliation(s)
- Robin T Wu: Department of Plastic and Reconstruction Surgery, Chang Gung Memorial Hospital, Linkou, Taiwan; Division of Plastic and Reconstructive Surgery, Department of Surgery, Stanford University Hospital, Stanford, CA, USA
- Rushil R Dang: Department of Plastic and Reconstruction Surgery, Chang Gung Memorial Hospital, Linkou, Taiwan; Former fellow, Maxillofacial Oncology and Reconstructive Surgery, Department of Oral and Maxillofacial Surgery, Boston Medical Center, Boston, MA, USA
20
Ordak M. ChatGPT's Skills in Statistical Analysis Using the Example of Allergology: Do We Have Reason for Concern? Healthcare (Basel) 2023; 11:2554. [PMID: 37761751 PMCID: PMC10530997 DOI: 10.3390/healthcare11182554]
Abstract
BACKGROUND Content generated by artificial intelligence is sometimes not truthful. To date, there have been a number of medical studies related to the validity of ChatGPT's responses; however, there is a lack of studies addressing various aspects of statistical analysis. The aim of this study was to assess the validity of the answers provided by ChatGPT in relation to statistical analysis, as well as to identify recommendations to be implemented in the future in connection with the results obtained. METHODS The study was divided into four parts and was based on the exemplary medical field of allergology. The first part consisted of asking ChatGPT 30 different questions related to statistical analysis. The next five questions included a request for ChatGPT to perform the relevant statistical analyses, and another five requested ChatGPT to indicate which statistical test should be applied to articles accepted for publication in Allergy. The final part of the survey involved asking ChatGPT the same statistical question three times. RESULTS ChatGPT did not fully answer half of the 40 general questions on broad statistical analysis. The assumptions necessary for the application of specific statistical tests were not included. ChatGPT also gave completely divergent answers to one question about which test should be used. CONCLUSION The answers provided by ChatGPT to various statistical questions may give rise to the use of inappropriate statistical tests and, consequently, to misinterpretation of the research results obtained. Questions asked in this regard need to be framed more precisely.
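The point about omitted assumptions can be made concrete: before comparing two groups, one typically checks, for example, normality, and falls back to a nonparametric test when the assumption fails. A minimal Python sketch of that decision with simulated data; the specific tests and thresholds are illustrative choices, not a reconstruction of the questions posed to ChatGPT.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
group_a = rng.normal(50, 10, size=30)        # simulated, roughly normal measurements
group_b = rng.lognormal(3.9, 0.4, size=30)   # simulated, skewed measurements

# Check the normality assumption in each group before choosing the test
normal = all(stats.shapiro(g).pvalue > 0.05 for g in (group_a, group_b))
if normal:
    name, result = "Welch t-test", stats.ttest_ind(group_a, group_b, equal_var=False)
else:
    name, result = "Mann-Whitney U", stats.mannwhitneyu(group_a, group_b)
print(f"{name}: statistic = {result.statistic:.2f}, p = {result.pvalue:.4f}")
```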
Collapse
Affiliation(s)
- Michal Ordak
- Department of Pharmacotherapy and Pharmaceutical Care, Faculty of Pharmacy, Medical University of Warsaw, Banacha 1 Str., 02-097 Warsaw, Poland
| |
Collapse
|
21
|
Jiao C, Edupuganti NR, Patel PA, Bui T, Sheth V. Evaluating the Artificial Intelligence Performance Growth in Ophthalmic Knowledge. Cureus 2023; 15:e45700. [PMID: 37868408 PMCID: PMC10590143 DOI: 10.7759/cureus.45700] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 09/20/2023] [Indexed: 10/24/2023] Open
Abstract
OBJECTIVE We aim to compare the capabilities of Chat Generative Pre-Trained Transformer (ChatGPT)-3.5 and ChatGPT-4.0 (OpenAI, San Francisco, CA, USA) in addressing multiple-choice ophthalmic case challenges. METHODS AND ANALYSIS The accuracy of both models was compared across ophthalmology subspecialties using multiple-choice ophthalmic clinical cases from the American Academy of Ophthalmology (AAO) "Diagnose This" questions. Additional analyses considered image content, question difficulty, character length of the models' responses, and each model's alignment with responses from human respondents. χ2 test, Fisher's exact test, Student's t-test, and one-way analysis of variance (ANOVA) were conducted where appropriate, with p<0.05 considered significant. RESULTS GPT-4.0 significantly outperformed GPT-3.5 (75% versus 46%, p<0.01), with the most noticeable improvement in neuro-ophthalmology (100% versus 38%, p=0.03). While both models struggled with uveitis and refractive questions, GPT-4.0 excelled in other areas, such as pediatric questions (82%). In image-related questions, GPT-4.0 also displayed superior accuracy that trended toward significance (73% versus 46%, p=0.07). GPT-4.0 performed better on easier questions (93.8% (least difficult) versus 76.2% (middle) versus 53.3% (most difficult), p=0.03) and generated more concise answers than GPT-3.5 (651.7±342.9 versus 1,112.9±328.8 characters, p<0.01). Moreover, GPT-4.0's answers were more in line with those of AAO respondents (57.3% versus 41.4%, p<0.01), with a strong correlation between its accuracy and the proportion of AAO respondents who selected GPT-4.0's answer (ρ=0.713, p<0.01). CONCLUSION AND RELEVANCE Our study demonstrates that GPT-4.0 significantly outperforms GPT-3.5 in addressing ophthalmic case challenges, especially in neuro-ophthalmology, with improved accuracy even in image-related questions. These findings underscore the potential of advancing artificial intelligence (AI) models to enhance ophthalmic diagnostics and medical education.
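The two headline analyses in this abstract, the accuracy difference between models and the correlation between accuracy and respondent agreement, can be reproduced in outline with scipy. All counts and agreement values below are placeholders for illustration, not the study's data.

# Illustrative re-creation of the abstract's two comparisons with placeholder numbers.
import numpy as np
from scipy.stats import fisher_exact, spearmanr

# 2x2 table of [correct, incorrect] answers for each model on a hypothetical 48-question set.
table = np.array([[36, 12],    # GPT-4.0: 75% correct
                  [22, 26]])   # GPT-3.5: ~46% correct
odds_ratio, p_prop = fisher_exact(table)

# Per-question correctness of GPT-4.0 (1/0) versus the share of AAO respondents
# choosing the same answer (placeholder values).
gpt4_correct = np.array([1, 1, 0, 1, 1, 0, 1, 1, 1, 0])
aao_agreement = np.array([0.71, 0.64, 0.22, 0.58, 0.80, 0.31, 0.66, 0.73, 0.55, 0.28])
rho, p_rho = spearmanr(gpt4_correct, aao_agreement)

print(f"accuracy difference: OR={odds_ratio:.2f}, p={p_prop:.3f}")
print(f"Spearman rho={rho:.2f}, p={p_rho:.3f}")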
Collapse
Affiliation(s)
- Cheng Jiao
- Ophthalmology, Augusta University Medical College of Georgia, Augusta, USA
| | - Neel R Edupuganti
- Ophthalmology, Augusta University Medical College of Georgia, Augusta, USA
| | - Parth A Patel
- Neurology, Augusta University Medical College of Georgia, Augusta, USA
| | - Tommy Bui
- Ophthalmology, Augusta University Medical College of Georgia, Augusta, USA
| | - Veeral Sheth
- Ophthalmology, University Retina and Macula Associates, Oak Forest, USA
| |
Collapse
|
22
|
Kumar M, Mani UA, Tripathi P, Saalim M, Roy S. Artificial Hallucinations by Google Bard: Think Before You Leap. Cureus 2023; 15:e43313. [PMID: 37700993 PMCID: PMC10492900 DOI: 10.7759/cureus.43313] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 08/10/2023] [Indexed: 09/14/2023] Open
Abstract
One of the critical challenges posed by artificial intelligence (AI) tools like Google Bard (Google LLC, Mountain View, California, United States) is the potential for "artificial hallucinations." These refer to instances where an AI chatbot generates fictional, erroneous, or unsubstantiated information in response to queries. In research, such inaccuracies can lead to the propagation of misinformation and undermine the credibility of scientific literature. The experience presented here highlights the importance of cross-checking the information provided by AI tools with reliable sources and maintaining a cautious approach when utilizing these tools in research writing.
Collapse
Affiliation(s)
- Mukesh Kumar
- Emergency Medicine, King George's Medical University, Lucknow, IND
| | - Utsav Anand Mani
- Emergency Medicine, King George's Medical University, Lucknow, IND
| | | | - Mohd Saalim
- Emergency Medicine, King George's Medical University, Lucknow, IND
| | - Sneha Roy
- Medicine, King George's Medical University, Lucknow, IND
| |
Collapse
|
23
|
Li H, Moon JT, Iyer D, Balthazar P, Krupinski EA, Bercu ZL, Newsome JM, Banerjee I, Gichoya JW, Trivedi HM. Decoding radiology reports: Potential application of OpenAI ChatGPT to enhance patient understanding of diagnostic reports. Clin Imaging 2023; 101:137-141. [PMID: 37336169 DOI: 10.1016/j.clinimag.2023.06.008] [Citation(s) in RCA: 15] [Impact Index Per Article: 15.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/13/2023] [Revised: 05/26/2023] [Accepted: 06/06/2023] [Indexed: 06/21/2023]
Abstract
PURPOSE To evaluate the complexity of diagnostic radiology reports across major imaging modalities and the ability of ChatGPT (Early March 2023 Version, OpenAI, California, USA) to simplify these reports to the 8th grade reading level of the average U.S. adult. METHODS We randomly sampled 100 radiographs (XR), 100 ultrasound (US), 100 CT, and 100 MRI radiology reports from our institution's database dated between 2022 and 2023 (N = 400). These were processed by ChatGPT using the prompt "Explain this radiology report to a patient in layman's terms in second person: <Report Text>". Mean report length, Flesch reading ease score (FRES), and Flesch-Kincaid reading level (FKRL) were calculated for each report and ChatGPT output. T-tests were used to determine significance. RESULTS Mean report length was 164 ± 117 words, FRES was 38.0 ± 11.8, and FKRL was 10.4 ± 1.9. FKRL was significantly higher for CT and MRI than for US and XR. Only 60/400 (15%) had a FKRL <8.5. The mean simplified ChatGPT output length was 103 ± 36 words, FRES was 83.5 ± 5.6, and FKRL was 5.8 ± 1.1. This reflects a mean decrease of 61 words (p < 0.01), increase in FRES of 45.5 (p < 0.01), and decrease in FKRL of 4.6 (p < 0.01). All simplified outputs had FKRL <8.5. DISCUSSION Our study demonstrates the effective use of ChatGPT when tasked with simplifying radiology reports to below the 8th grade reading level. We report significant improvements in FRES, FKRL, and word count, the last of which requires modality-specific context.
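For readers unfamiliar with the two readability metrics, the following Python sketch computes them with the open-source textstat package on a fabricated report excerpt. The formulas in the comments are the standard Flesch equations; the abstract does not specify the authors' exact tooling, so this is one plausible way to obtain the same scores.

# Sketch: compute the readability metrics named in the abstract with textstat.
# Underlying formulas:
#   FRES = 206.835 - 1.015*(words/sentences) - 84.6*(syllables/words)
#   FKRL = 0.39*(words/sentences) + 11.8*(syllables/words) - 15.59
# The report text below is a fabricated placeholder, not data from the study.
import textstat

original = ("Impression: No acute intracranial hemorrhage, mass effect, or midline shift. "
            "Chronic microvascular ischemic changes are noted in the periventricular white matter.")
simplified = ("Your brain scan does not show any bleeding or swelling. "
              "There are some small changes from aging blood vessels, which is common.")

for label, text in [("original", original), ("simplified", simplified)]:
    print(label,
          "FRES:", textstat.flesch_reading_ease(text),
          "FKRL:", textstat.flesch_kincaid_grade(text),
          "words:", textstat.lexicon_count(text))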
Collapse
Affiliation(s)
- Hanzhou Li
- Emory University School of Medicine, Department of Radiology and Imaging Science, 1364 Clifton Rd, Atlanta, GA 30322, United States of America.
| | - John T Moon
- Emory University School of Medicine, Department of Radiology and Imaging Science, 1364 Clifton Rd, Atlanta, GA 30322, United States of America. https://twitter.com/johntmoon
| | - Deepak Iyer
- Emory University School of Medicine, Department of Radiology and Imaging Science, 1364 Clifton Rd, Atlanta, GA 30322, United States of America. https://twitter.com/d_iyer7
| | - Patricia Balthazar
- Emory University School of Medicine, Department of Radiology and Imaging Science, 1364 Clifton Rd, Atlanta, GA 30322, United States of America. https://twitter.com/PBalthazarMD
| | - Elizabeth A Krupinski
- Emory University School of Medicine, Department of Radiology and Imaging Science, 1364 Clifton Rd, Atlanta, GA 30322, United States of America. https://twitter.com/EAKrup
| | - Zachary L Bercu
- Emory University School of Medicine, Department of Radiology and Imaging Science, 1364 Clifton Rd, Atlanta, GA 30322, United States of America. https://twitter.com/ZachBercuMD
| | - Janice M Newsome
- Emory University School of Medicine, Department of Radiology and Imaging Science, 1364 Clifton Rd, Atlanta, GA 30322, United States of America. https://twitter.com/angiowoman
| | - Imon Banerjee
- Mayo Clinic, Department of Radiology, Phoenix, AZ, United States of America. https://twitter.com/ImonBanerjee6
| | - Judy W Gichoya
- Emory University School of Medicine, Department of Radiology and Imaging Science, 1364 Clifton Rd, Atlanta, GA 30322, United States of America. https://twitter.com/judywawira
| | - Hari M Trivedi
- Emory University School of Medicine, Department of Radiology and Imaging Science, 1364 Clifton Rd, Atlanta, GA 30322, United States of America. https://twitter.com/HariTrivediMD
| |
Collapse
|
24
|
Li Y, Li Z, Zhang K, Dan R, Jiang S, Zhang Y. ChatDoctor: A Medical Chat Model Fine-Tuned on a Large Language Model Meta-AI (LLaMA) Using Medical Domain Knowledge. Cureus 2023; 15:e40895. [PMID: 37492832 PMCID: PMC10364849 DOI: 10.7759/cureus.40895] [Citation(s) in RCA: 15] [Impact Index Per Article: 15.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 06/24/2023] [Indexed: 07/27/2023] Open
Abstract
Objective The primary aim of this research was to address the limitations observed in the medical knowledge of prevalent large language models (LLMs) such as ChatGPT by creating a specialized language model with enhanced accuracy in medical advice. Methods We achieved this by adapting and refining the large language model meta-AI (LLaMA) using a large dataset of 100,000 patient-doctor dialogues sourced from a widely used online medical consultation platform. These conversations were cleaned and anonymized to respect privacy concerns. In addition to the model refinement, we incorporated a self-directed information retrieval mechanism, allowing the model to access and utilize real-time information from online sources such as Wikipedia and from curated offline medical databases. Results Fine-tuning the model with real-world patient-doctor interactions significantly improved its ability to understand patient needs and provide informed advice. By equipping the model with self-directed information retrieval from reliable online and offline sources, we observed substantial improvements in the accuracy of its responses. Conclusion Our proposed ChatDoctor represents a significant advancement in medical LLMs, demonstrating a marked improvement in understanding patient inquiries and providing accurate advice. Given the high stakes and low error tolerance of the medical field, such enhancements in providing accurate and reliable information are not only beneficial but essential.
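To make the fine-tuning setup concrete, the following Python sketch shows one plausible way to convert anonymized patient-doctor exchanges into instruction-tuning records. The field names, instruction text, file layout, and example dialogue are assumptions for illustration, not the ChatDoctor authors' exact pipeline.

# Sketch: turn anonymized patient-doctor exchanges into instruction-tuning
# records (one JSON object per line). The schema and the sample dialogue are
# illustrative assumptions, not the published ChatDoctor data format.
import json

raw_dialogues = [
    {"patient": "I have had a dry cough and mild fever for three days. Should I worry?",
     "doctor": "A short viral illness is most likely. Rest, fluids, and fever control are "
               "reasonable; seek care if you develop breathlessness or the fever lasts beyond five days."},
]

with open("chatdoctor_style_train.jsonl", "w", encoding="utf-8") as f:
    for d in raw_dialogues:
        record = {
            "instruction": "If you are a doctor, please answer the medical question based on the patient's description.",
            "input": d["patient"],
            "output": d["doctor"],
        }
        f.write(json.dumps(record, ensure_ascii=False) + "\n")

Each line of the resulting file can then be fed to a standard supervised fine-tuning loop for a LLaMA-style model; the retrieval component described in the abstract would be layered on top at inference time.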
Collapse
Affiliation(s)
- Yunxiang Li
- Department of Radiation Oncology, University of Texas Southwestern Medical Center, Dallas, USA
| | - Zihan Li
- Department of Computer Science, University of Illinois at Urbana-Champaign, Illinois, USA
| | - Kai Zhang
- Department of Computer Science and Engineering, The Ohio State University, Columbus, USA
| | - Ruilong Dan
- College of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou, CHN
| | - Steve Jiang
- Department of Radiation Oncology, University of Texas Southwestern Medical Center, Dallas, USA
| | - You Zhang
- Department of Radiation Oncology, University of Texas Southwestern Medical Center, Dallas, USA
| |
Collapse
|