1. Aykut A, Sezenoz AS. Exploring the Potential of Code-Free Custom GPTs in Ophthalmology: An Early Analysis of GPT Store and User-Creator Guidance. Ophthalmol Ther 2024; 13:2697-2713. [PMID: 39141071] [PMCID: PMC11408450] [DOI: 10.1007/s40123-024-01014-w]
Abstract
INTRODUCTION: OpenAI recently introduced the ability to create custom generative pre-trained transformers (cGPTs) using text-based instructions and/or external documents via a retrieval-augmented generation (RAG) architecture, without coding knowledge. This study aimed to analyze the features of ophthalmology-related cGPTs and explore their potential utilities.
METHODS: Data collection took place on January 20 and 21, 2024; custom GPTs were found by entering ophthalmology keywords into the "Explore GPTs" section of the website. General and specific features of the cGPTs were recorded, such as knowledge sources beyond the GPT-4 training data. The instruction and description sections were analyzed for compatibility using a Likert scale, and the two custom GPTs with the highest Likert scores were analyzed in detail. We also attempted to create a convincingly presented yet potentially harmful cGPT to test safety features.
RESULTS: We analyzed 22 ophthalmic cGPTs, of which 55% were for general use; the most common subspecialty was glaucoma (18%). Over half (55%) drew on knowledge beyond the GPT-4 training data. The representation of the instructions through the description fell between "Moderately representative" and "Very representative," with a median Likert score of 3.5 (IQR 3.0-4.0). The instruction word count was significantly associated with Likert scores (P = 0.03). Tested cGPTs demonstrated potential for a specific conversational tone, information retrieval, and combining knowledge from an uploaded source. Even with the current safety settings, creating a malicious cGPT was possible.
CONCLUSIONS: To our knowledge, this is the first study to examine the GPT store for a medical field. Our findings suggest that these cGPTs can be implemented in practice immediately and may offer more targeted and effective solutions than standard GPT-4; however, further research is necessary to evaluate their capabilities and limitations comprehensively. The safety features currently appear rather limited, and users may find it helpful to review the instruction section before using a cGPT.
Affiliation(s)
- Aslan Aykut
- Department of Ophthalmology and Visual Sciences, Kellogg Eye Center, University of Michigan, 1000 Wall St, Rm 641, Ann Arbor, MI, 48105, USA
- Department of Ophthalmology, School of Medicine, Marmara University, Istanbul, 34854, Turkey
- Almila Sarigul Sezenoz
- Department of Ophthalmology and Visual Sciences, Kellogg Eye Center, University of Michigan, 1000 Wall St, Rm 641, Ann Arbor, MI, 48105, USA
- Department of Ophthalmology, Faculty of Medicine, Başkent University, Ankara, 06790, Turkey
2. Yang K, Zeb L, Bae S, Pavlidakey PG. Diagnostic Accuracy of ChatGPT for Textbook Descriptions of Epidermal Tumors: An Exploratory Study. Am J Dermatopathol 2024; 46:632-634. [PMID: 38842301] [DOI: 10.1097/dad.0000000000002767]
Affiliation(s)
- Kevin Yang
- Department of Dermatology, University of Alabama at Birmingham, Birmingham, AL
- Lawangeen Zeb
- Department of Dermatology, University of Alabama at Birmingham, Birmingham, AL
- Sejong Bae
- Department of Medicine, University of Alabama at Birmingham, Birmingham, AL
- O'Neal Comprehensive Cancer Center, University of Alabama at Birmingham, Birmingham, AL
- Peter G Pavlidakey
- Department of Dermatology, University of Alabama at Birmingham, Birmingham, AL
3. Apornvirat S, Thinpanja W, Damrongkiet K, Benjakul N, Laohawetwanit T. ChatGPT for histopathologic diagnosis. Ann Diagn Pathol 2024; 73:152365. [PMID: 39098307] [DOI: 10.1016/j.anndiagpath.2024.152365]
Affiliation(s)
- Sompon Apornvirat
- Division of Pathology, Chulabhorn International College of Medicine, Thammasat University, Pathum Thani, Thailand; Division of Pathology, Thammasat University Hospital, Pathum Thani, Thailand
- Warut Thinpanja
- Division of Pathology, Thammasat University Hospital, Pathum Thani, Thailand
- Khampee Damrongkiet
- Department of Pathology, King Chulalongkorn Memorial Hospital, Bangkok, Thailand; Department of Anatomical Pathology, Faculty of Medicine Vajira Hospital, Navamindradhiraj University, Bangkok, Thailand
- Nontawat Benjakul
- Department of Anatomical Pathology, Faculty of Medicine Vajira Hospital, Navamindradhiraj University, Bangkok, Thailand; Vajira Pathology-Clinical-Correlation Target Research Interest Group, Faculty of Medicine Vajira Hospital, Navamindradhiraj University, Bangkok, Thailand
- Thiyaphat Laohawetwanit
- Division of Pathology, Chulabhorn International College of Medicine, Thammasat University, Pathum Thani, Thailand; Division of Pathology, Thammasat University Hospital, Pathum Thani, Thailand
4. Laohawetwanit T, Pinto DG, Bychkov A. A survey analysis of the adoption of large language models among pathologists. Am J Clin Pathol 2024:aqae093. [PMID: 39076014] [DOI: 10.1093/ajcp/aqae093]
Abstract
OBJECTIVES: We sought to investigate the adoption and perception of large language model (LLM) applications among pathologists.
METHODS: A cross-sectional survey was conducted, gathering data from pathologists on their usage of and views concerning LLM tools. The survey, distributed globally through various digital platforms, included quantitative and qualitative questions. Patterns in the respondents' adoption of and perspectives on these artificial intelligence tools were analyzed.
RESULTS: Of 215 respondents, 100 (46.5%) reported using LLMs, particularly ChatGPT (OpenAI), for professional purposes, predominantly for information retrieval, proofreading, academic writing, and drafting pathology reports, highlighting a significant time-saving benefit. Academic pathologists demonstrated a better understanding of LLMs than their peers. Although chatbots sometimes provided incorrect general-domain information, they were considered moderately proficient in pathology-specific knowledge. The technology was mainly used for drafting educational materials and programming tasks. The most sought-after feature in LLMs was image analysis capability. Participants expressed concerns about information accuracy, privacy, and the need for regulatory approval.
CONCLUSIONS: Large language model applications are gaining notable acceptance among pathologists, with nearly half of respondents indicating adoption less than a year after the tools' introduction to the market. They see the benefits but also worry about these tools' reliability, ethical implications, and security.
Affiliation(s)
- Thiyaphat Laohawetwanit
- Division of Pathology, Chulabhorn International College of Medicine, Thammasat University, Pathum Thani, Thailand
- Division of Pathology, Thammasat University Hospital, Pathum Thani, Thailand
- Daniel Gomes Pinto
- Department of Pathology, Hospital Garcia de Orta, Almada, Portugal
- Nova Medical School, Lisbon, Portugal
- Andrey Bychkov
- Department of Pathology, Kameda Medical Center, Kamogawa, Japan
5. Maniaci A, Chiesa-Estomba CM, Lechien JR. ChatGPT-4 Consistency in Interpreting Laryngeal Clinical Images of Common Lesions and Disorders. Otolaryngol Head Neck Surg 2024. [PMID: 39045737] [DOI: 10.1002/ohn.897]
Abstract
OBJECTIVE: To investigate the consistency of Chatbot Generative Pretrained Transformer (ChatGPT)-4 in the analysis of clinical pictures of common laryngological conditions.
STUDY DESIGN: Prospective uncontrolled study.
SETTING: Multicenter study.
METHODS: Patient history and clinical videolaryngostroboscopic images were presented to ChatGPT-4 for differential diagnoses, management, and treatment(s). ChatGPT-4 responses were assessed by 3 blinded laryngologists with the artificial intelligence performance instrument (AIPI). The complexity of cases and the consistency between practitioners and ChatGPT-4 in interpreting clinical images were evaluated with a 5-point Likert scale. The intraclass correlation coefficient (ICC) was used to measure the strength of interrater agreement.
RESULTS: Forty patients with a mean complexity score of 2.60 ± 1.15 were included. The mean consistency score for ChatGPT-4 image interpretation was 2.46 ± 1.42. ChatGPT-4 perfectly analyzed the clinical images in 6 cases (15%; 5/5), while the consistency between ChatGPT-4 and the judges was high in 5 cases (12.5%; 4/5). Judges reported an ICC of 0.965 for the consistency score (P = .001). ChatGPT-4 erroneously documented vocal fold irregularity (mass or lesion), glottic insufficiency, and vocal cord paralysis in 21 (52.5%), 2 (5.0%), and 5 (12.5%) cases, respectively. ChatGPT-4 and practitioners indicated 153 and 63 additional examinations, respectively (P = .001). The ChatGPT-4 primary diagnosis was correct in 20.0% to 25.0% of cases. The clinical image consistency score was significantly associated with the AIPI score (rs = 0.830; P = .001).
CONCLUSION: ChatGPT-4 was more effective in primary diagnosis than in image analysis or in selecting the most appropriate additional examinations and treatments.
Affiliation(s)
- Antonino Maniaci
- Research Committee of Young Otolaryngologists of the International Federation of Otorhinolaryngological Societies (IFOS), Paris, France
- Department of Medicine and Surgery, Kore University, Enna, Italy
- Carlos M Chiesa-Estomba
- Research Committee of Young Otolaryngologists of the International Federation of Otorhinolaryngological Societies (IFOS), Paris, France
- Division of Laryngology and Broncho-esophagology, Department of Otolaryngology-Head Neck Surgery, EpiCURA Hospital, UMONS Research Institute for Health Sciences and Technology, University of Mons (UMons), Mons, Belgium
- Department of Otorhinolaryngology-Head and Neck Surgery, Donostia University Hospital, Donostia-San Sebastián, Spain
- Jérôme R Lechien
- Research Committee of Young Otolaryngologists of the International Federation of Otorhinolaryngological Societies (IFOS), Paris, France
- Department of Otorhinolaryngology and Head and Neck Surgery, Foch Hospital, Phonetics and Phonology Laboratory (UMR 7018 CNRS, Université Sorbonne Nouvelle/Paris 3), Paris Saclay University, Paris, France
- Department of Otorhinolaryngology and Head and Neck Surgery, CHU Saint-Pierre, Brussels, Belgium
6. Apornvirat S, Thinpanja W, Damrongkiet K, Benjakul N, Laohawetwanit T. Comparing customized ChatGPT and pathology residents in histopathologic description and diagnosis of common diseases. Ann Diagn Pathol 2024; 73:152359. [PMID: 38972166] [DOI: 10.1016/j.anndiagpath.2024.152359]
Abstract
This study aimed to evaluate the performance of a customized Chat Generative Pre-Trained Transformer (ChatGPT), known as a GPT, against pathology residents in providing microscopic descriptions and diagnosing diseases from histopathological images. A dataset of representative photomicrographs from 70 diseases across 14 organ systems was analyzed by a customized version of ChatGPT-4 (GPT-4) and by pathology residents. Two pathologists independently evaluated the microscopic descriptions and diagnoses using a predefined scoring system (0-4 for microscopic descriptions and 0-2 for pathological diagnoses), with higher scores indicating greater accuracy. Microscopic descriptions that received perfect scores, which included all relevant keywords and findings, were then presented to the standard version of ChatGPT to assess its diagnostic capabilities based on these descriptions. GPT-4 showed consistency in microscopic description and diagnosis scores across five rounds, achieving median scores of 50% and 48.6%, respectively. However, its performance was still inferior to that of junior and senior pathology residents (73.9% and 93.9% description scores and 63.9% and 87.9% diagnosis scores, respectively). When the standard ChatGPT's understanding of microscopic descriptions provided by residents was analyzed, it correctly diagnosed 35 (87.5%) of the cases from junior residents and 44 (68.8%) from senior residents, given that the initial descriptions contained the keywords and relevant findings. While GPT-4 can accurately interpret some histopathological images, its overall performance is currently inferior to that of pathology residents. However, ChatGPT's ability to accurately interpret and diagnose diseases from the descriptions provided by residents suggests that this technology could serve as a valuable support tool in pathology diagnostics.
Affiliation(s)
- Sompon Apornvirat
- Division of Pathology, Chulabhorn International College of Medicine, Thammasat University, Pathum Thani, Thailand; Division of Pathology, Thammasat University Hospital, Pathum Thani, Thailand
- Warut Thinpanja
- Division of Pathology, Thammasat University Hospital, Pathum Thani, Thailand
- Khampee Damrongkiet
- Department of Pathology, King Chulalongkorn Memorial Hospital, Bangkok, Thailand; Department of Anatomical Pathology, Faculty of Medicine Vajira Hospital, Navamindradhiraj University, Bangkok, Thailand
- Nontawat Benjakul
- Department of Anatomical Pathology, Faculty of Medicine Vajira Hospital, Navamindradhiraj University, Bangkok, Thailand; Vajira Pathology-Clinical-Correlation Target Research Interest Group, Faculty of Medicine Vajira Hospital, Navamindradhiraj University, Bangkok, Thailand
- Thiyaphat Laohawetwanit
- Division of Pathology, Chulabhorn International College of Medicine, Thammasat University, Pathum Thani, Thailand; Division of Pathology, Thammasat University Hospital, Pathum Thani, Thailand
7. Koga S, Du W. Integrating AI in medicine: Lessons from Chat-GPT's limitations in medical imaging. Dig Liver Dis 2024; 56:1114-1115. [PMID: 38429138] [DOI: 10.1016/j.dld.2024.02.014]
Affiliation(s)
- Shunsuke Koga
- Department of Pathology and Laboratory Medicine, Hospital of the University of Pennsylvania, 3400 Spruce Street, Philadelphia, PA 19104, United States
- Wei Du
- Department of Pathology and Laboratory Medicine, Hospital of the University of Pennsylvania, 3400 Spruce Street, Philadelphia, PA 19104, United States
8. Daungsupawong H, Wiwanitkit V. GPT-4 and histopathological image detection and classification of colorectal adenomas. J Clin Pathol 2024; 77:383. [PMID: 38286610] [DOI: 10.1136/jcp-2024-209405]
9. Cheng J. Applications of Large Language Models in Pathology. Bioengineering (Basel) 2024; 11:342. [PMID: 38671764] [PMCID: PMC11047860] [DOI: 10.3390/bioengineering11040342]
Abstract
Large language models (LLMs) are transformer-based neural networks that can provide human-like responses to questions and instructions. LLMs can generate educational material, summarize text, extract structured data from free text, create reports, write programs, and potentially assist in case sign-out. LLMs combined with vision models can assist in interpreting histopathology images. LLMs have immense potential for transforming pathology practice and education, but these models are not infallible, so any artificial intelligence-generated content must be verified against reputable sources. Caution must be exercised in how these models are integrated into clinical practice, as they can produce hallucinations and incorrect results, and over-reliance on artificial intelligence may lead to de-skilling and automation bias. This review provides a brief history of LLMs and highlights several use cases for LLMs in the field of pathology.
Affiliation(s)
- Jerome Cheng
- Department of Pathology, University of Michigan, Ann Arbor, MI 48105, USA