1. Bellanda VCF, Santos MLD, Ferraz DA, Jorge R, Melo GB. Applications of ChatGPT in the diagnosis, management, education, and research of retinal diseases: a scoping review. Int J Retina Vitreous 2024; 10:79. [PMID: 39420407; PMCID: PMC11487877; DOI: 10.1186/s40942-024-00595-9]
Abstract
PURPOSE: This scoping review aims to explore the current applications of ChatGPT in the retina field, highlighting its potential, challenges, and limitations.
METHODS: A comprehensive literature search was conducted across multiple databases, including PubMed, Scopus, MEDLINE, and Embase, to identify relevant articles published from 2022 onwards. The inclusion criteria focused on studies evaluating the use of ChatGPT in retinal healthcare. Data were extracted and synthesized to map the scope of ChatGPT's applications in retinal care, categorizing articles into various practical application areas such as academic research, charting, coding, diagnosis, disease management, and patient counseling.
RESULTS: A total of 68 articles were included in the review, distributed across several categories: 8 related to academics and research, 5 to charting, 1 to coding and billing, 44 to diagnosis, 49 to disease management, 2 to literature consulting, 23 to medical education, and 33 to patient counseling. Many articles were classified into multiple categories due to overlapping topics. The findings indicate that while ChatGPT shows significant promise in areas such as medical education and diagnostic support, concerns regarding accuracy, reliability, and the potential for misinformation remain prevalent.
CONCLUSION: ChatGPT offers substantial potential in advancing retinal healthcare by supporting clinical decision-making, enhancing patient education, and automating administrative tasks. However, its current limitations, particularly in clinical accuracy and the risk of generating misinformation, necessitate cautious integration into practice, with continuous oversight from healthcare professionals. Future developments should focus on improving accuracy, incorporating up-to-date medical guidelines, and minimizing the risks associated with AI-driven healthcare tools.
Affiliation(s)
- Victor C F Bellanda
- Ribeirão Preto Medical School, University of São Paulo, 3900 Bandeirantes Ave, Ribeirão Preto, SP, 14049-900, Brazil.
- Rodrigo Jorge
- Ribeirão Preto Medical School, University of São Paulo, 3900 Bandeirantes Ave, Ribeirão Preto, SP, 14049-900, Brazil
| | - Gustavo Barreto Melo
- Sergipe Eye Hospital, Aracaju, SE, Brazil
- Paulista School of Medicine, Federal University of São Paulo, São Paulo, SP, Brazil
2. Rojas-Carabali W, Cifuentes-González C, Wei X, Putera I, Sen A, Thng ZX, Agrawal R, Elze T, Sobrin L, Kempen JH, Lee B, Biswas J, Nguyen QD, Gupta V, de-la-Torre A, Agrawal R. Response to the Comment on "Evaluating the Diagnostic Accuracy and Management Recommendations of ChatGPT in Uveitis". Ocul Immunol Inflamm 2024; 32:1905-1906. [PMID: 38133945; DOI: 10.1080/09273948.2023.2293924]
Affiliation(s)
- William Rojas-Carabali
- National Healthcare Group Eye Institute, Tan Tock Seng Hospital, Singapore, Singapore
- Department of Bioinformatics, Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore, Singapore
- Carlos Cifuentes-González
- National Healthcare Group Eye Institute, Tan Tock Seng Hospital, Singapore, Singapore
- Neuroscience Research Group (NEUROS), Neurovitae Center for Neuroscience, Institute of Translational Medicine (IMT), Escuela de Medicina y Ciencias de la Salud, Universidad del Rosario, Bogotá, Colombia
- Xin Wei
- National Healthcare Group Eye Institute, Tan Tock Seng Hospital, Singapore, Singapore
- Ikhwanuliman Putera
- Department of Ophthalmology, Faculty of Medicine Universitas Indonesia - Cipto Mangunkusumo Kirana Eye Hospital, Jakarta, Indonesia
- Laboratory Medical Immunology, Department of Immunology, Erasmus MC, University Medical Center, Rotterdam, The Netherlands
- Department of Internal Medicine, Division of Clinical Immunology, Erasmus MC, University Medical Center, Rotterdam, The Netherlands
- Department of Ophthalmology, Erasmus MC, University Medical Center, Rotterdam, The Netherlands
- Alok Sen
- Department of Vitreoretinal and Uveitis, Sadguru Netra Chikitsalaya, Chitrakoot, India
- Zheng Xian Thng
- National Healthcare Group Eye Institute, Tan Tock Seng Hospital, Singapore, Singapore
- Rajdeep Agrawal
- Department of Bioinformatics, Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore, Singapore
- Tobias Elze
- Department of Ophthalmology, Massachusetts Eye and Ear/Harvard Medical School, and Schepens Eye Research Institute, Boston, Massachusetts, USA
- Lucia Sobrin
- Department of Ophthalmology, Massachusetts Eye and Ear/Harvard Medical School, and Schepens Eye Research Institute, Boston, Massachusetts, USA
- John H Kempen
- Department of Ophthalmology, Massachusetts Eye and Ear/Harvard Medical School, and Schepens Eye Research Institute, Boston, Massachusetts, USA
- Community Ophthalmology, Sight for Souls, Bellevue, Washington, USA
- Department of Ophthalmology, Myungsung Medical College/MCM Comprehensive Specialized Hospital, Addis Ababa, Ethiopia
- Bernett Lee
- Department of Bioinformatics, Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore, Singapore
- Jyotirmay Biswas
- Department of Ocular Pathology and Uveitis, Medical Research Foundation, Sankara Nethralaya, Chennai, India
- Quan Dong Nguyen
- Byers Eye Institute, Stanford University, Palo Alto, California, USA
- Vishali Gupta
- Post Graduate Institute of Medical Education and Research (PGIMER), Advanced Eye Centre, Chandigarh, India
- Alejandra de-la-Torre
- Neuroscience Research Group (NEUROS), Neurovitae Center for Neuroscience, Institute of Translational Medicine (IMT), Escuela de Medicina y Ciencias de la Salud, Universidad del Rosario, Bogotá, Colombia
- Rupesh Agrawal
- National Healthcare Group Eye Institute, Tan Tock Seng Hospital, Singapore, Singapore
- Department of Bioinformatics, Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore, Singapore
- Department of Ophthalmology and Visual Sciences, Academic Clinical Program, Duke-NUS Medical School, Singapore, Singapore
- Moorfields Eye Hospital NHS Foundation Trust, London, UK
- Singapore Eye Research Institute, The Academia, Singapore, Singapore
3. Chotcomwongse P, Ruamviboonsuk P, Grzybowski A. Utilizing Large Language Models in Ophthalmology: The Current Landscape and Challenges. Ophthalmol Ther 2024; 13:2543-2558. [PMID: 39180701; PMCID: PMC11408418; DOI: 10.1007/s40123-024-01018-6]
Abstract
A large language model (LLM) is an artificial intelligence (AI) model that uses natural language processing (NLP) to understand, interpret, and generate human-like language responses from unstructured text input. Its real-time response capability and fluent dialogue enhance the interactive user experience of human-AI communication like never before. By drawing on numerous sources from across the internet, LLM chatbots can interact with and respond to a wide range of queries, including problem solving, text summarization, and the creation of informative notes. Since ophthalmology is one of the medical fields integrating image analysis, telemedicine, AI, and other technologies, LLMs are likely to play an important role in eye care in the near future. This review summarizes the performance and potential applicability of LLMs in ophthalmology according to currently available publications.
Affiliation(s)
- Peranut Chotcomwongse
- Vitreoretina Unit, Department of Ophthalmology, Rajavithi Hospital, Rangsit University, Bangkok, Thailand
- Paisan Ruamviboonsuk
- Vitreoretina Unit, Department of Ophthalmology, Rajavithi Hospital, Rangsit University, Bangkok, Thailand
- Andrzej Grzybowski
- University of Warmia and Mazury, Olsztyn, Poland.
- Institute for Research in Ophthalmology, Foundation for Ophthalmology Development, 61-553, Poznan, Poland.
4. Schumacher I, Bühler VMM, Jaggi D, Roth J. Artificial intelligence derived large language model in decision-making process in uveitis. Int J Retina Vitreous 2024; 10:63. [PMID: 39261870; PMCID: PMC11389245; DOI: 10.1186/s40942-024-00581-1]
Abstract
BACKGROUND: Uveitis is the ophthalmic subfield dealing with a broad range of intraocular inflammatory diseases. With the rising importance of LLMs such as ChatGPT and their potential use in medicine, this study explores the strengths and weaknesses of their applicability in the subfield of uveitis.
METHODS: A series of highly clinically relevant questions about current uveitis cases was posed to the LLM three consecutive times (attempts 1, 2, and 3). The answers were classified as accurate and sufficient, partially accurate and sufficient, or inaccurate and insufficient. Statistical analysis included descriptive analysis, testing for normality of distribution, non-parametric tests, and reliability tests. References were checked for correctness in several medical databases.
RESULTS: The data showed a non-normal distribution. Data were comparable between subgroups (attempts 1, 2, and 3; Kruskal-Wallis H test, p = 0.7338). There was moderate agreement between attempts 1 and 2 (Cohen's kappa, κ = 0.5172) and between attempts 2 and 3 (Cohen's kappa, κ = 0.4913), and fair agreement between attempts 1 and 3 (Cohen's kappa, κ = 0.3647); the average pairwise agreement was moderate (κ = 0.4577). Across all three attempts together, agreement was moderate (Fleiss' kappa, κ = 0.4534). The LLM generated a total of 52 references: 22 (42.3%) were accurate and correctly cited, another 22 (42.3%) could not be located in any of the searched databases, and the remaining 8 (15.4%) were found to exist but were either misinterpreted or incorrectly cited by the LLM.
CONCLUSION: Our results demonstrate the significant potential of LLMs in uveitis. However, their implementation requires rigorous training and comprehensive testing for specific medical tasks. We also found that the references generated by ChatGPT-4o were in most cases incorrect. LLMs are likely to become invaluable tools in shaping the future of ophthalmology, enhancing clinical decision-making and patient care.
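For context, the two agreement statistics reported above are standard chance-corrected inter-rater measures. A minimal Python sketch, using invented ratings rather than the study's data, shows how they are typically computed:

```python
# Hedged sketch of the agreement statistics named in the abstract; the
# ratings below are hypothetical. Each answer is coded 2 = accurate and
# sufficient, 1 = partially accurate and sufficient, 0 = inaccurate and
# insufficient, mirroring the study's three-level classification.
from sklearn.metrics import cohen_kappa_score
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

attempt1 = [2, 2, 1, 0, 2, 1, 2, 0, 1, 2]
attempt2 = [2, 1, 1, 0, 2, 2, 2, 0, 1, 2]
attempt3 = [2, 1, 0, 0, 2, 2, 1, 0, 1, 2]

# Pairwise, chance-corrected agreement between two attempts.
print("attempts 1 vs 2:", cohen_kappa_score(attempt1, attempt2))

# Agreement across all three attempts: fleiss_kappa expects a
# subjects x categories count table, built here with aggregate_raters
# from the subjects x raters matrix.
table, _ = aggregate_raters(list(zip(attempt1, attempt2, attempt3)))
print("all three attempts:", fleiss_kappa(table))
```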
Affiliation(s)
- Inès Schumacher
- Department of Ophthalmology, Inselspital, University Hospital of Bern, Bern, Switzerland
- Damian Jaggi
- Department of Ophthalmology, Inselspital, University Hospital of Bern, Bern, Switzerland
- Janice Roth
- Department of Ophthalmology, Inselspital, University Hospital of Bern, Bern, Switzerland.
- Moorfields Eye Hospital NHS Foundation Trust, City Road, London EC1V 2PD, UK.
5. Shah-Mohammadi F, Finkelstein J. Accuracy Evaluation of GPT-Assisted Differential Diagnosis in Emergency Department. Diagnostics (Basel) 2024; 14:1779. [PMID: 39202267; PMCID: PMC11354035; DOI: 10.3390/diagnostics14161779]
Abstract
In emergency department (ED) settings, rapid and precise diagnostic evaluations are critical to ensuring better patient outcomes and efficient healthcare delivery. This study assesses the accuracy of differential diagnosis lists generated by the third-generation ChatGPT (ChatGPT-3.5) and the fourth-generation ChatGPT (ChatGPT-4) from electronic health record notes recorded within the first 24 h of ED admission. These models process unstructured text to formulate a ranked list of potential diagnoses, and their accuracy was benchmarked against the actual discharge diagnoses to evaluate their utility as diagnostic aids. Results indicated that both GPT-3.5 and GPT-4 predicted diagnoses reasonably accurately at the body-system level, with GPT-4 slightly outperforming its predecessor. However, their performance at the more granular category level was inconsistent, often showing decreased precision. Notably, GPT-4 demonstrated improved accuracy in several critical categories, underscoring its advanced capabilities in managing complex clinical scenarios.
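The benchmarking design described here, scoring a ranked differential list against the actual discharge diagnosis, is essentially a top-k accuracy measurement. A rough sketch under that assumption follows; the cases and the exact-string matching rule are invented for illustration, and a real evaluation would map synonymous diagnoses onto a shared ontology:

```python
# Hypothetical top-k scoring of ranked differential-diagnosis lists
# against discharge diagnoses; all case data below are invented.
from typing import List

def top_k_accuracy(ranked_lists: List[List[str]], truths: List[str], k: int) -> float:
    """Fraction of cases whose discharge diagnosis appears in the top k."""
    hits = sum(truth in ranked[:k] for ranked, truth in zip(ranked_lists, truths))
    return hits / len(truths)

predictions = [
    ["sepsis", "pneumonia", "pulmonary embolism"],
    ["appendicitis", "gastroenteritis", "diverticulitis"],
]
discharge = ["pneumonia", "diverticulitis"]

print(top_k_accuracy(predictions, discharge, k=1))  # 0.0: neither true dx ranked first
print(top_k_accuracy(predictions, discharge, k=3))  # 1.0: both appear in the top three
```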
Affiliation(s)
- Joseph Finkelstein
- Department of Biomedical Informatics, School of Medicine, University of Utah, Salt Lake City, UT 84112, USA
6. Garg N, Campbell DJ, Yang A, McCann A, Moroco AE, Estephan LE, Palmer WJ, Krein H, Heffelfinger R. Chatbots as Patient Education Resources for Aesthetic Facial Plastic Surgery: Evaluation of ChatGPT and Google Bard Responses. Facial Plast Surg Aesthet Med 2024. [PMID: 38946595; DOI: 10.1089/fpsam.2023.0368]
Abstract
Background: ChatGPT and Google Bard™ are popular artificial intelligence chatbots with utility for patients, including those undergoing aesthetic facial plastic surgery.
Objective: To compare the accuracy and readability of chatbot-generated responses to patient education questions regarding aesthetic facial plastic surgery using a response accuracy scale and readability testing.
Methods: ChatGPT and Google Bard™ were asked 28 identical questions using four prompts: none, patient friendly, eighth-grade level, and references. Accuracy was assessed using the Global Quality Scale (range: 1-5). Flesch-Kincaid grade level was calculated, and chatbot-provided references were analyzed for veracity.
Results: Although 59.8% of responses were of good quality (Global Quality Scale ≥4), ChatGPT generated more accurate responses than Google Bard™ under patient-friendly prompting (p < 0.001). Google Bard™ responses were written at a significantly lower grade level than ChatGPT's for all prompts (p < 0.05). Despite eighth-grade prompting, the response grade level for both chatbots remained high: ChatGPT (10.5 ± 1.8) and Google Bard™ (9.6 ± 1.3). Prompting for references yielded 108 chatbot-generated references; 41 citations (38.0%) were legitimate, and 20 (18.5%) accurately reported information from the cited source.
Conclusion: Although ChatGPT produced more accurate responses, and at a higher education level, than Google Bard™, both chatbots provided responses above recommended reading levels for patients and failed to provide accurate references.
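The Flesch-Kincaid grade level used in this study is a closed-form readability formula: 0.39 × (words/sentences) + 11.8 × (syllables/words) − 15.59. A minimal sketch is below; its vowel-group syllable counter is a crude stand-in for the dictionary-based counting a production readability tool would use:

```python
# Rough Flesch-Kincaid grade-level calculator; the syllable heuristic
# (count vowel groups, minimum one per word) is approximate.
import re

def count_syllables(word: str) -> int:
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def fk_grade_level(text: str) -> float:
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * (len(words) / sentences) + 11.8 * (syllables / len(words)) - 15.59

sample = "Rhinoplasty reshapes the nose. Swelling usually fades over several weeks."
print(round(fk_grade_level(sample), 1))
```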
Affiliation(s)
- Neha Garg
- Department of Otolaryngology - Head and Neck Surgery, Thomas Jefferson University Hospitals, Philadelphia, Pennsylvania, USA
- Daniel J Campbell
- Department of Otolaryngology - Head and Neck Surgery, Thomas Jefferson University Hospitals, Philadelphia, Pennsylvania, USA
- Angela Yang
- Sidney Kimmel Medical College, Philadelphia, Pennsylvania, USA
- Adam McCann
- Department of Otolaryngology - Head and Neck Surgery, Thomas Jefferson University Hospitals, Philadelphia, Pennsylvania, USA
- Annie E Moroco
- Department of Otolaryngology - Head and Neck Surgery, Thomas Jefferson University Hospitals, Philadelphia, Pennsylvania, USA
- Leonard E Estephan
- Department of Otolaryngology - Head and Neck Surgery, Thomas Jefferson University Hospitals, Philadelphia, Pennsylvania, USA
- William J Palmer
- Department of Otolaryngology - Head and Neck Surgery, Thomas Jefferson University Hospitals, Philadelphia, Pennsylvania, USA
- Howard Krein
- Department of Otolaryngology - Head and Neck Surgery, Thomas Jefferson University Hospitals, Philadelphia, Pennsylvania, USA
- Ryan Heffelfinger
- Department of Otolaryngology - Head and Neck Surgery, Thomas Jefferson University Hospitals, Philadelphia, Pennsylvania, USA
7. Mandalos A, Tsouris D. Artificial Versus Human Intelligence in the Diagnostic Approach of Ophthalmic Case Scenarios: A Qualitative Evaluation of Performance and Consistency. Cureus 2024; 16:e62471. [PMID: 39015855; PMCID: PMC11251728; DOI: 10.7759/cureus.62471]
Abstract
PURPOSE: To evaluate the efficiency of three artificial intelligence (AI) chatbots (ChatGPT-3.5 (OpenAI, San Francisco, California, United States), Bing Copilot (Microsoft Corporation, Redmond, Washington, United States), and Google Gemini (Google LLC, Mountain View, California, United States)) in assisting the ophthalmologist in the diagnostic approach and management of challenging ophthalmic cases, and to compare their performance with that of a practicing human ophthalmic specialist. The secondary aim was to assess the short- and medium-term consistency of ChatGPT's responses.
METHODS: Eleven ophthalmic case scenarios of variable complexity were presented to the AI chatbots and to an ophthalmic specialist in a stepwise fashion. Advice was requested regarding the initial differential diagnosis, the final diagnosis, further investigation, and management. One month later, the same process was repeated twice on the same day for ChatGPT only.
RESULTS: The individual diagnostic performance of all three AI chatbots was inferior to that of the ophthalmic specialist; however, they provided useful complementary input in the diagnostic algorithm. This was especially true for ChatGPT and Bing Copilot. ChatGPT exhibited reasonable short- and medium-term consistency, with the mean Jaccard similarity coefficient of responses varying between 0.58 and 0.76.
CONCLUSION: AI chatbots may act as useful assisting tools in the diagnosis and management of challenging ophthalmic cases; however, their responses should be scrutinized for potential inaccuracies, and by no means can they replace consultation with an ophthalmic specialist.
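The Jaccard similarity coefficient used above to quantify response consistency is simply |A ∩ B| / |A ∪ B| for two sets. A small sketch, treating each response as a set of lowercased words (the paper's exact preprocessing is not stated, so this tokenization is an assumption):

```python
# Hedged sketch: Jaccard similarity between two chatbot responses,
# tokenized as word sets; the example responses are invented.
def jaccard(a: str, b: str) -> float:
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    return len(set_a & set_b) / len(set_a | set_b)

first = "consider giant cell arteritis and order ESR CRP and temporal artery biopsy"
second = "order ESR and CRP then consider temporal artery biopsy for giant cell arteritis"
print(round(jaccard(first, second), 2))  # high for largely overlapping answers
```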
8. Ahimaz P, Bergner AL, Florido ME, Harkavy N, Bhattacharyya S. Genetic counselors' utilization of ChatGPT in professional practice: A cross-sectional study. Am J Med Genet A 2024; 194:e63493. [PMID: 38066714; DOI: 10.1002/ajmg.a.63493]
Abstract
PURPOSE: The precision medicine era has seen increased utilization of artificial intelligence (AI) in the field of genetics. We sought to explore the ways that genetic counselors (GCs) currently use the publicly accessible AI tool Chat Generative Pre-trained Transformer (ChatGPT) in their work.
METHODS: GCs in North America were surveyed about how ChatGPT is used in different aspects of their work. Descriptive statistics were reported through frequencies and means.
RESULTS: Of 118 GCs who completed the survey, 33.8% (40) reported using ChatGPT in their work; 47.5% (19) use it in clinical practice, 35% (14) in education, and 32.5% (13) in research. Most GCs (62.7%; 74) felt that it saves time on administrative tasks, but the majority (82.2%; 97) felt that a paramount challenge was the risk of obtaining incorrect information. The majority of GCs not using ChatGPT (58.9%; 46) felt it was not necessary for their work.
CONCLUSION: A considerable number of GCs in the field are using ChatGPT in different ways, but it is primarily helpful with tasks that involve writing. It has the potential to streamline workflow issues encountered in clinical genetics, but practitioners need to be informed and uniformly trained about its limitations.
Affiliation(s)
- Priyanka Ahimaz
- Genetic Counseling Graduate Program, Vagelos College of Physicians and Surgeons, Columbia University, New York, New York, USA
- Department of Pediatrics, Vagelos College of Physicians and Surgeons, Columbia University, New York, New York, USA
- Amanda L Bergner
- Genetic Counseling Graduate Program, Vagelos College of Physicians and Surgeons, Columbia University, New York, New York, USA
- Department of Genetics and Development, Vagelos College of Physicians and Surgeons, Columbia University, New York, New York, USA
- Department of Neurology, Vagelos College of Physicians and Surgeons, Columbia University, New York, New York, USA
- Michelle E Florido
- Genetic Counseling Graduate Program, Vagelos College of Physicians and Surgeons, Columbia University, New York, New York, USA
- Department of Genetics and Development, Vagelos College of Physicians and Surgeons, Columbia University, New York, New York, USA
- Nina Harkavy
- Genetic Counseling Graduate Program, Vagelos College of Physicians and Surgeons, Columbia University, New York, New York, USA
- Department of Obstetrics and Gynecology, Vagelos College of Physicians and Surgeons, Columbia University, New York, New York, USA
- Sriya Bhattacharyya
- Genetic Counseling Graduate Program, Vagelos College of Physicians and Surgeons, Columbia University, New York, New York, USA
- Department of Psychiatry, Vagelos College of Physicians and Surgeons, Columbia University, New York, New York, USA