1.
Fiedler AK, Zhang K, Lal TS, Jiang X, Fraser SM. Generative Pre-trained Transformer for Pediatric Stroke Research: A Pilot Study. Pediatr Neurol 2024;160:54-59. PMID: 39191085. DOI: 10.1016/j.pediatrneurol.2024.07.001.
Abstract
BACKGROUND: Pediatric stroke is an important cause of morbidity in children. Although research can be challenging, large amounts of data have been captured through collaborative efforts in the International Pediatric Stroke Study (IPSS). This study explores the use of an advanced artificial intelligence program, the Generative Pre-trained Transformer (GPT), to enter pediatric stroke data into the IPSS.

METHODS: The most recent 50 clinical notes of patients with ischemic stroke or cerebral venous sinus thrombosis at the UTHealth Pediatric Stroke Clinic were deidentified. Domain-specific prompts were engineered for an offline artificial intelligence program (GPT) to answer IPSS questions. GPT responses were compared with those of a human rater. Percent agreement was assessed across 50 patients for each of the 114 queries developed from the IPSS database outcome questionnaire.

RESULTS: GPT demonstrated strong performance on several questions but showed variability overall. In its early iterations it occasionally matched human judgment, with an accuracy score of 1.00 (n = 20, 17.5%), but it scored as low as 0.26 in some patients. Prompts were adjusted over four subsequent iterations to increase accuracy. In the fourth iteration, agreement was 93.6%, with a maximum of 100% and a minimum of 62%. Of 2400 individual items assessed, our model entered 2247 (93.6%) correctly and 153 (6.4%) incorrectly.

CONCLUSIONS: Although our tailored generative model with domain-specific prompt engineering and ontological guidance shows promise for research applications, further refinement is needed to enhance its accuracy. It cannot enter data entirely independently, but it can be employed in tandem with human oversight, contributing to a collaborative approach that reduces overall effort.
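The percent-agreement metric reported above is straightforward to reproduce. A minimal sketch (the items and answers below are invented for illustration, not IPSS data):

```python
# Percent agreement: the fraction of items on which two annotators
# (here, a GPT-generated entry and a human rater) give the same answer.

def percent_agreement(rater_a, rater_b):
    """Share of positions where two equal-length answer lists match."""
    if len(rater_a) != len(rater_b):
        raise ValueError("raters must answer the same items")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)

gpt_answers = ["yes", "no", "unknown", "yes", "no"]
human_answers = ["yes", "no", "no", "yes", "no"]
print(percent_agreement(gpt_answers, human_answers))  # -> 0.8
```

At the study's scale, 2247 matching items out of 2400 gives 2247/2400 ≈ 93.6%, consistent with the reported agreement.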
Affiliation(s)
- Anna K Fiedler
- Division of Child Neurology, Department of Pediatrics, The University of Texas Health Science Center at Houston, Houston, Texas
- Kai Zhang
- Department of Health Data Science and Artificial Intelligence, McWilliams School of Biomedical Informatics at UTHealth Houston, Houston, Texas; UTHealth Houston Institute of Stroke and Cerebrovascular Diseases, Houston, Texas
- Tia S Lal
- UTHealth Houston Institute of Stroke and Cerebrovascular Diseases, Houston, Texas
- Xiaoqian Jiang
- Department of Health Data Science and Artificial Intelligence, McWilliams School of Biomedical Informatics at UTHealth Houston, Houston, Texas; UTHealth Houston Institute of Stroke and Cerebrovascular Diseases, Houston, Texas
- Stuart M Fraser
- Division of Child Neurology, Department of Pediatrics, The University of Texas Health Science Center at Houston, Houston, Texas; UTHealth Houston Institute of Stroke and Cerebrovascular Diseases, Houston, Texas.
2.
Shamil E, Ko TK, Fan KS, Schuster-Bruce J, Jaafar M, Khwaja S, Eynon-Lewis N, D'Souza A, Andrews P. Assessing the Quality and Readability of Online Patient Information: ENT UK Patient Information e-Leaflets versus Responses by a Generative Artificial Intelligence. Facial Plast Surg 2024. PMID: 39260421. DOI: 10.1055/a-2413-3675.
Abstract
BACKGROUND: The evolution of artificial intelligence has introduced new ways to disseminate health information, including natural language processing models like ChatGPT. However, the quality and readability of such digitally generated information remain understudied. This study is the first to compare the quality and readability of digitally generated health information against leaflets produced by professionals.

METHODOLOGY: Five ENT UK patient information leaflets and their corresponding ChatGPT responses were extracted from the Internet. Assessors with varying degrees of medical knowledge evaluated the content using the Ensuring Quality Information for Patients (EQIP) tool and readability measures including the Flesch-Kincaid Grade Level (FKGL). Statistical analysis was performed to identify differences between leaflets, assessors, and sources of information.

RESULTS: ENT UK leaflets were of moderate quality, scoring a median EQIP of 23. Statistically significant differences in overall EQIP score were identified between ENT UK leaflets, whereas ChatGPT responses were of uniform quality. Nonspecialist doctors gave the highest EQIP scores and medical students the lowest. The mean readability of ENT UK leaflets was higher than that of ChatGPT responses. The information metrics of ENT UK leaflets were moderate and varied between topics. Equivalent ChatGPT information provided comparable content quality, but with reduced readability.

CONCLUSION: ChatGPT patient information and professionally produced leaflets had comparable content, but large language model content required a higher reading age. With the increasing use of online health resources, this study highlights the need for a balanced approach that considers both the quality and readability of patient education materials.
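The Flesch-Kincaid Grade Level used here is a closed-form readability formula: 0.39 × (words per sentence) + 11.8 × (syllables per word) − 15.59. A rough sketch, with a deliberately crude vowel-group syllable counter (production readability tools use pronunciation dictionaries or better heuristics):

```python
import re

def fkgl(text):
    """Flesch-Kincaid Grade Level:
    0.39 * (words/sentences) + 11.8 * (syllables/words) - 15.59.
    Syllables are approximated by counting vowel groups per word."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(
        max(1, len(re.findall(r"[aeiouyAEIOUY]+", w))) for w in words
    )
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words) - 15.59)

sample = "The doctor will examine your throat. This is not painful."
print(round(fkgl(sample), 1))  # -> 4.1 (roughly a 4th-grade reading level)
```

Higher scores mean a higher required reading age, which is how the study quantifies ChatGPT's reduced readability relative to the ENT UK leaflets.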
Affiliation(s)
- Eamon Shamil
- The Royal National ENT Hospital, University College London Hospitals NHS Foundation Trust, London, England, United Kingdom
- Tsz Ki Ko
- Royal Stoke University Hospital, United Kingdom
- Ka Siu Fan
- Royal Surrey County Hospital, Guildford, Surrey, United Kingdom
- James Schuster-Bruce
- Department of ENT, Kings College Hospital Foundation Trust, London, England, United Kingdom
- Mustafa Jaafar
- UCL Artificial Intelligence Centre for Doctoral Training, London, England, United Kingdom
- Sadie Khwaja
- Department of ENT, Manchester University NHS Foundation Trust, England, United Kingdom
- Alwyn D'Souza
- Institute of Medical Sciences, Canterbury Christ Church University, England, United Kingdom
- Peter Andrews
- The Royal National ENT Hospital, University College London Hospitals NHS Foundation Trust, London, England, United Kingdom
3.
Lechien JR. Generative AI and Otolaryngology-Head & Neck Surgery. Otolaryngol Clin North Am 2024;57:753-765. PMID: 38839556. DOI: 10.1016/j.otc.2024.04.006.
Abstract
The growing development of generative artificial intelligence (AI) models in otolaryngology-head and neck surgery will progressively change practice. Practitioners and patients now have access to AI resources that improve information, knowledge, and the practice of patient care. This article summarizes the currently investigated applications of generative AI models, particularly Chatbot Generative Pre-trained Transformer (ChatGPT), in otolaryngology-head and neck surgery.
Affiliation(s)
- Jérôme R Lechien
- Research Committee of Young Otolaryngologists of the International Federation of Otorhinolaryngological Societies (IFOS), Paris, France; Division of Laryngology and Broncho-esophagology, Department of Otolaryngology-Head Neck Surgery, EpiCURA Hospital, UMONS Research Institute for Health Sciences and Technology, University of Mons (UMons), Mons, Belgium; Department of Otorhinolaryngology and Head and Neck Surgery, Foch Hospital, Paris Saclay University, Phonetics and Phonology Laboratory (UMR 7018 CNRS, Université Sorbonne Nouvelle/Paris 3), Paris, France; Department of Otorhinolaryngology and Head and Neck Surgery, CHU Saint-Pierre, Brussels, Belgium.
4.
Peters M, Leclercq M, Yanni A, Vanden Eynden X, Martin L, Vanden Haute N, Tancredi S, De Passe C, Boutremans E, Lechien J, Dequanter D. ChatGPT and trainee performances in the management of maxillofacial patients. J Stomatol Oral Maxillofac Surg 2024:102090. PMID: 39332706. DOI: 10.1016/j.jormas.2024.102090.
Abstract
INTRODUCTION: ChatGPT is an artificial intelligence-based large language model able to generate human-like responses to text input; its performance has already been studied in several fields. The aim of this study was to evaluate the performance of ChatGPT in the management of maxillofacial clinical cases.

MATERIALS AND METHODS: A total of 38 clinical cases consulting at the Stomatology-Maxillofacial Surgery Department were prospectively recruited and presented to ChatGPT, which was interrogated for diagnosis, differential diagnosis, management, and treatment. The performance of trainees and ChatGPT was compared by three blinded board-certified maxillofacial surgeons using the AIPI score.

RESULTS: The average total AIPI score was 18.71 for the practitioners versus 16.39 for ChatGPT, which was significantly lower (p < 0.001). According to the experts, ChatGPT was significantly less effective for diagnosis and treatment (p < 0.001). According to two of the three experts, ChatGPT was significantly less effective in considering patient data (p = 0.001) and suggesting additional examinations (p < 0.0001). The primary diagnosis proposed by ChatGPT was judged not plausible and/or incomplete in 2.63% to 18% of cases; the suggested workup included inadequate examinations in 2.63% to 21.05% of cases; the proposed treatment was pertinent but incomplete in 18.42% to 47.37% of cases; and the therapeutic findings were considered inadequate in 18.42% of cases.

CONCLUSIONS: ChatGPT appears less efficient in establishing the diagnosis, selecting the most adequate additional examinations, and proposing pertinent and necessary therapeutic approaches.
Affiliation(s)
- Mélissa Peters
- Department of Stomatology, Oral & Maxillofacial Surgery, CHU Saint Pierre, Brussels, Belgium.
- Maxime Leclercq
- Department of Stomatology, Oral & Maxillofacial Surgery, CHU Saint Pierre, Brussels, Belgium
- Antoine Yanni
- Department of Stomatology, Oral & Maxillofacial Surgery, CHU Saint Pierre, Brussels, Belgium
- Xavier Vanden Eynden
- Department of Stomatology, Oral & Maxillofacial Surgery, CHU Saint Pierre, Brussels, Belgium
- Lalmand Martin
- Department of Stomatology, Oral & Maxillofacial Surgery, CHU Saint Pierre, Brussels, Belgium
- Noémie Vanden Haute
- Department of Stomatology, Oral & Maxillofacial Surgery, CHU Saint Pierre, Brussels, Belgium
- Szonja Tancredi
- Department of Stomatology, Oral & Maxillofacial Surgery, CHU Saint Pierre, Brussels, Belgium
- Céline De Passe
- Department of Stomatology, Oral & Maxillofacial Surgery, CHU Saint Pierre, Brussels, Belgium
- Edward Boutremans
- Department of Stomatology, Oral & Maxillofacial Surgery, CHU Saint Pierre, Brussels, Belgium
- Jerome Lechien
- Faculty of Medicine, Department of Human Anatomy and Experimental Oncology UMONS, Mons, Belgium; Phonetics and Phonology Laboratory (UMR 7018 CNRS, Université Sorbonne Nouvelle/Paris 3), Department of Otorhinolaryngology and Head and Neck Surgery, Foch Hospital, School of Medicine, UFR Simone Veil, Université Versailles Saint-Quentin-en-Yvelines (Paris Saclay University), Paris, France; Department of Otorhinolaryngology and Head and Neck Surgery, CHU Saint-Pierre, Brussels, Belgium; Research Committee of Young Otolaryngologists of the International Federation of Otorhinolaryngological Societies (IFOS), Paris, France; Young Confederation of the European Oto-Rhino-Laryngological Head and Neck Surgery Societies (Y-CEORLHNS), Dublin, Ireland; Division of Laryngology and Broncho-Esophagology, Department of Otolaryngology-Head Neck Surgery, EpiCURA Hospital, UMONS Research Institute for Health Sciences and Technology, University of Mons (UMons), Mons, Belgium
- Didier Dequanter
- Department of Stomatology, Oral & Maxillofacial Surgery, CHU Saint Pierre, Brussels, Belgium; Faculty of Medicine, Department of Human Anatomy and Experimental Oncology UMONS, Mons, Belgium
5.
Tomo S, Lechien JR, Bueno HS, Cantieri-Debortoli DF, Simonato LE. Accuracy and consistency of ChatGPT-3.5 and -4 in providing differential diagnoses in oral and maxillofacial diseases: a comparative diagnostic performance analysis. Clin Oral Investig 2024;28:544. PMID: 39316174. DOI: 10.1007/s00784-024-05939-1.
Abstract
OBJECTIVE: To investigate the performance of ChatGPT in the differential diagnosis of oral and maxillofacial diseases.

METHODS: Findings from 37 oral and maxillofacial lesions were presented for differential diagnosis to ChatGPT-3.5 and -4, 18 dental surgeons trained in oral medicine/pathology (OMP), 23 general dental surgeons (DDS), and 16 dental students (DS). Additionally, a group of 15 general dentists was asked to describe 11 cases to both ChatGPT versions. The primary and alternative diagnoses from ChatGPT-3.5, ChatGPT-4, and the human groups were rated by two independent investigators on a 4-point Likert scale. The consistency of ChatGPT-3.5 and -4 was evaluated with regenerated inputs.

RESULTS: Moderate consistency of outputs was observed for ChatGPT-3.5 and -4 in providing primary (κ = 0.532 and κ = 0.533, respectively) and alternative (κ = 0.337 and κ = 0.367, respectively) hypotheses. The mean rate of correct diagnoses was 64.86% for ChatGPT-3.5, 80.18% for ChatGPT-4, 86.64% for OMP, 24.32% for DDS, and 16.67% for DS. The mean correct primary hypothesis rates were 45.95% for ChatGPT-3.5, 61.80% for ChatGPT-4, 82.28% for OMP, 22.72% for DDS, and 15.77% for DS. The mean correct diagnosis rate for ChatGPT-3.5 was 64.86% with standard descriptions versus 45.95% with participants' descriptions; for ChatGPT-4, it was 80.18% with standard descriptions and 61.80% with participants' descriptions.

CONCLUSION: ChatGPT-4 demonstrates accuracy comparable to specialists in providing differential diagnoses for oral and maxillofacial diseases. The consistency of ChatGPT in providing diagnostic hypotheses for oral disease cases is moderate, a weakness for clinical application. The quality of case documentation and descriptions significantly impacts the performance of ChatGPT.

CLINICAL RELEVANCE: General dentists, dental students, and specialists in oral medicine and pathology may benefit from ChatGPT-4 as an auxiliary method for defining differential diagnoses for oral and maxillofacial lesions, but its accuracy depends on precise case descriptions.
Affiliation(s)
- Saygo Tomo
- Department of Pathology, School of Dentistry, University of São Paulo, Av. Professor Lineu Prestes 2227, São Paulo, CEP 05508-000, Brazil.
- Jérôme R Lechien
- Research Committee of Young-Otolaryngologists of the International Federations of Oto-rhino-laryngological Societies (YO-IFOS), Paris, France
- Department of Otorhinolaryngology and Head and Neck Surgery, CHU de Bruxelles, CHU Saint-Pierre, Brussels, Belgium
- Department of Otorhinolaryngology and Head and Neck Surgery, School of Medicine, Foch Hospital, UFR Simone Veil, Université Versailles Saint-Quentin-en-Yvelines (Paris Saclay University), Paris, France
- Department of Surgery, Faculty of Medicine, UMONS Research Institute for Health Sciences and Technology, University of Mons (UMons), Mons, Belgium
- Luciana Estevam Simonato
- Dental School, University Brasil, Fernandópolis, Brazil
- Medical School, University Brasil, Fernandópolis, Brazil
- Instituto Científico e Tecnológico, Programas de Bioengenharia e Ciências Ambientais, Universidade Brasil, Fernandópolis, Brazil
6.
Villarreal-Espinosa JB, Berreta RS, Allende F, Garcia JR, Ayala S, Familiari F, Chahla J. Accuracy assessment of ChatGPT responses to frequently asked questions regarding anterior cruciate ligament surgery. Knee 2024;51:84-92. PMID: 39241674. DOI: 10.1016/j.knee.2024.08.014.
Abstract
BACKGROUND: The emergence of artificial intelligence (AI) has given users chat-like access to large sources of information. We therefore sought to evaluate the accuracy of ChatGPT-4 responses to the 10 most frequently asked patient questions (FAQs) regarding anterior cruciate ligament (ACL) surgery.

METHODS: A list of the top 10 FAQs pertaining to ACL surgery was created after searching all Sports Medicine Fellowship Institutions listed on the Arthroscopy Association of North America (AANA) and American Orthopaedic Society for Sports Medicine (AOSSM) websites. A Likert scale was used by two sports medicine fellowship-trained surgeons to grade response accuracy, and Cohen's kappa was used to assess inter-rater agreement. Reproducibility of the responses over time was also assessed.

RESULTS: Five of the 10 responses were graded 'completely accurate' by both fellowship-trained surgeons, with three additional replies receiving 'completely accurate' status from at least one. Inter-rater reliability assessment revealed moderate agreement between the fellowship-trained attending physicians (weighted kappa = 0.57, 95% confidence interval 0.15-0.99). Additionally, 80% of the responses were reproducible over time.

CONCLUSION: ChatGPT can be considered an accurate additional tool for answering general patient questions regarding ACL surgery. Nonetheless, patient-surgeon interaction should not be deferred and must remain the driving force for information retrieval; the general recommendation is to address any questions in the presence of a qualified specialist.
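The weighted kappa reported above penalizes disagreements between ordinal grades in proportion to their distance, unlike plain percent agreement. A minimal sketch of linear-weighted Cohen's kappa (the ratings below are invented for illustration, not the study's data):

```python
# Linear-weighted Cohen's kappa for ordinal ratings, e.g. Likert accuracy
# grades from two reviewers. Disagreement weight grows linearly with the
# distance between the two assigned categories.

def weighted_kappa(r1, r2, categories):
    k = len(categories)
    idx = {c: i for i, c in enumerate(categories)}
    n = len(r1)
    # observed joint distribution of (rater-1, rater-2) grades, as proportions
    obs = [[0.0] * k for _ in range(k)]
    for a, b in zip(r1, r2):
        obs[idx[a]][idx[b]] += 1 / n
    p1 = [sum(row) for row in obs]                        # rater-1 marginals
    p2 = [sum(row[j] for row in obs) for j in range(k)]   # rater-2 marginals
    num = den = 0.0
    for i in range(k):
        for j in range(k):
            w = abs(i - j) / (k - 1)      # linear disagreement weight
            num += w * obs[i][j]          # observed weighted disagreement
            den += w * p1[i] * p2[j]      # chance-expected weighted disagreement
    return 1 - num / den

r1 = [1, 2, 3, 4, 4, 2, 3, 1]
r2 = [1, 2, 4, 4, 3, 2, 3, 2]
print(round(weighted_kappa(r1, r2, [1, 2, 3, 4]), 2))  # -> 0.68
```

On common benchmarks, values of 0.41-0.60 are read as moderate agreement, the band containing the study's 0.57.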
Affiliation(s)
- Felicitas Allende
- Department of Orthopedics, Rush University Medical Center, Chicago, IL, USA
- José Rafael Garcia
- Department of Orthopedics, Rush University Medical Center, Chicago, IL, USA
- Salvador Ayala
- Department of Orthopedics, Rush University Medical Center, Chicago, IL, USA
- Jorge Chahla
- Department of Orthopedics, Rush University Medical Center, Chicago, IL, USA.
7.
Bellamkonda N, Farlow JL, Haring CT, Sim MW, Seim NB, Cannon RB, Monroe MM, Agrawal A, Rocco JW, McCrary HC. Evaluating the Accuracy of ChatGPT in Common Patient Questions Regarding HPV+ Oropharyngeal Carcinoma. Ann Otol Rhinol Laryngol 2024;133:814-819. PMID: 39075853. DOI: 10.1177/00034894241259137.
Abstract
OBJECTIVES: Large language model (LLM)-based chatbots such as ChatGPT have been publicly available and increasingly utilized by the general public since late 2022. This study investigated ChatGPT responses to common patient questions regarding human papillomavirus (HPV)-positive oropharyngeal cancer (OPC).

METHODS: This was a prospective, multi-institutional study, with data collected from high-volume institutions performing >50 transoral robotic surgery cases per year. The 100 most recent discussion threads including the term "HPV" on the American Cancer Society's Cancer Survivors Network Head and Neck Cancer public discussion board were reviewed. The 11 most common questions were serially queried to ChatGPT 3.5 and the answers recorded. A survey was distributed to fellowship-trained head and neck oncologic surgeons at three institutions to evaluate the responses.

RESULTS: A total of 8 surgeons participated in the study. For questions regarding HPV contraction and transmission, ChatGPT answers were scored as clinically accurate and aligned with consensus in the head and neck surgical oncology community 84.4% and 90.6% of the time, respectively. For questions involving treatment of HPV+ OPC, ChatGPT was clinically accurate and aligned with consensus 87.5% and 91.7% of the time, respectively. For questions regarding the HPV vaccine, ChatGPT was clinically accurate and aligned with consensus 62.5% and 75% of the time, respectively. When asked about circulating tumor DNA testing, only 12.5% of surgeons thought the responses were accurate or consistent with consensus.

CONCLUSION: ChatGPT 3.5 performed poorly on questions involving evolving therapies and diagnostics; caution should therefore be used when relying on a platform like ChatGPT 3.5 to assess the use of advanced technology. Patients should be counseled on the importance of consulting their surgeons for accurate and up-to-date recommendations, using LLMs only to augment their understanding of these important health-related topics.
Affiliation(s)
- Nikhil Bellamkonda
- Department of Otolaryngology-Head and Neck Surgery, University of Utah, Salt Lake City, UT, USA
- Janice L Farlow
- Department of Otolaryngology-Head and Neck Surgery, Indiana University, Indianapolis, IN, USA
- Catherine T Haring
- Department of Otolaryngology-Head and Neck Surgery, The Ohio State University Wexner Medical Center, Columbus, OH, USA
- Michael W Sim
- Department of Otolaryngology-Head and Neck Surgery, Indiana University, Indianapolis, IN, USA
- Nolan B Seim
- Department of Otolaryngology-Head and Neck Surgery, The Ohio State University Wexner Medical Center, Columbus, OH, USA
- Richard B Cannon
- Department of Otolaryngology-Head and Neck Surgery, University of Utah, Salt Lake City, UT, USA
- Marcus M Monroe
- Department of Otolaryngology-Head and Neck Surgery, University of Utah, Salt Lake City, UT, USA
- Amit Agrawal
- Department of Otolaryngology-Head and Neck Surgery, The Ohio State University Wexner Medical Center, Columbus, OH, USA
- James W Rocco
- Department of Otolaryngology-Head and Neck Surgery, The Ohio State University Wexner Medical Center, Columbus, OH, USA
- Hilary C McCrary
- Department of Otolaryngology-Head and Neck Surgery, University of Utah, Salt Lake City, UT, USA
8.
Alami K, Willemse E, Quiriny M, Lipski S, Laurent C, Donquier V, Digonnet A. Evaluation of ChatGPT-4's Performance in Therapeutic Decision-Making During Multidisciplinary Oncology Meetings for Head and Neck Squamous Cell Carcinoma. Cureus 2024;16:e68808. PMID: 39376890. PMCID: PMC11456411. DOI: 10.7759/cureus.68808.
Abstract
OBJECTIVES: First reports suggest that artificial intelligence (AI) tools such as ChatGPT-4 (OpenAI, San Francisco, USA) might be reliable aids for therapeutic decisions in some medical conditions. This study aims to assess the decisional capacity of ChatGPT-4 in patients with head and neck carcinomas, using the multidisciplinary oncology meeting (MOM) and National Comprehensive Cancer Network (NCCN) decisions as references.

METHODS: This retrospective study included 263 patients with squamous cell carcinoma of the oral cavity, oropharynx, hypopharynx, or larynx who were followed at our institution between January 1, 2016, and December 31, 2021. ChatGPT-4's recommendations for first- and second-line treatments were compared with the MOM decisions and NCCN guidelines. Degrees of agreement were calculated using the kappa statistic, which measures the degree of agreement between two evaluators.

RESULTS: Compared with the MOM decisions, ChatGPT-4 demonstrated moderate agreement for first-line treatment recommendations (kappa = 0.48) and substantial agreement for second-line recommendations (kappa = 0.78). Substantial agreement with the NCCN guidelines was observed for both first- and second-line treatments (kappa = 0.72 and 0.66, respectively). The degree of agreement decreased when the decision involved gastrostomy, patients over 70, or patients with comorbidities.

CONCLUSIONS: While ChatGPT-4 can significantly support clinical decision-making in oncology by aligning closely with expert recommendations and established guidelines, ongoing enhancement and training are crucial. The findings advocate continued evolution of AI tools to better handle the nuanced aspects of patient health profiles, broadening their applicability and reliability in clinical practice.
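The kappa statistic used above corrects raw agreement for the agreement expected by chance: κ = (p_o − p_e) / (1 − p_e). A minimal sketch of unweighted Cohen's kappa (the treatment labels below are invented for illustration, not the tumor-board data):

```python
# Cohen's kappa: chance-corrected agreement between two raters assigning
# one of several categories (here, hypothetical treatment recommendations).

from collections import Counter

def cohens_kappa(r1, r2):
    n = len(r1)
    p_o = sum(a == b for a, b in zip(r1, r2)) / n  # observed agreement
    c1, c2 = Counter(r1), Counter(r2)
    # chance agreement expected from each rater's marginal frequencies
    p_e = sum(c1[cat] * c2[cat] for cat in c1) / (n * n)
    return (p_o - p_e) / (1 - p_e)

mom = ["surgery", "radio", "chemo", "surgery", "radio", "surgery"]
gpt = ["surgery", "radio", "radio", "surgery", "chemo", "surgery"]
print(round(cohens_kappa(mom, gpt), 2))  # -> 0.45
```

Common benchmarks label 0.41-0.60 as moderate and 0.61-0.80 as substantial agreement, the bands the abstract uses for its kappa values.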
Affiliation(s)
- Kenza Alami
- Otolaryngology, Jules Bordet Institute, Bruxelles, BEL
- Marie Quiriny
- Surgical Oncology, Jules Bordet Institute, Bruxelles, BEL
- Samuel Lipski
- Surgical Oncology, Jules Bordet Institute, Bruxelles, BEL
- Celine Laurent
- Otolaryngology - Head and Neck Surgery, Hôpital Ambroise-Paré, Mons, BEL
- Otolaryngology - Head and Neck Surgery, Hôpital Universitaire de Bruxelles (HUB) Erasme Hospital, Bruxelles, BEL
9.
Lechien JR, Rameau A. Applications of ChatGPT in Otolaryngology-Head Neck Surgery: A State of the Art Review. Otolaryngol Head Neck Surg 2024;171:667-677. PMID: 38716790. DOI: 10.1002/ohn.807.
Abstract
OBJECTIVE: To review the current literature on the application, accuracy, and performance of Chatbot Generative Pre-Trained Transformer (ChatGPT) in otolaryngology-head and neck surgery.

DATA SOURCES: PubMed, Cochrane Library, and Scopus.

REVIEW METHODS: A comprehensive review of the literature on the applications of ChatGPT in otolaryngology was conducted according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement.

CONCLUSIONS: ChatGPT provides imperfect patient information and general knowledge about diseases encountered in otolaryngology-head and neck surgery. In clinical practice, despite suboptimal performance, studies report that the model is more accurate in providing diagnoses than in suggesting the most adequate additional examinations and treatments for clinical vignettes or real clinical cases. ChatGPT has been used as an adjunct tool to improve scientific reports (referencing, spelling correction), to elaborate study protocols, or to take student or resident exams, with varying levels of accuracy. The stability of ChatGPT responses across repeated questions appears high, but many studies report hallucination events, particularly in providing scientific references.

IMPLICATIONS FOR PRACTICE: To date, most applications of ChatGPT are limited to generating disease or treatment information and to improving the management of clinical cases. The lack of comparison of ChatGPT's performance with other large language models is the main limitation of current research. Its ability to analyze clinical images has not yet been investigated in otolaryngology, although upper airway tract and ear images are an important step in the diagnosis of most common ear, nose, and throat conditions. This review may help otolaryngologists conceive new applications in further research.
Affiliation(s)
- Jérôme R Lechien
- Research Committee of Young Otolaryngologists of the International Federation of Otorhinolaryngological Societies (IFOS), Paris, France
- Division of Laryngology and Broncho-Esophagology, Department of Otolaryngology-Head Neck Surgery, EpiCURA Hospital, UMONS Research Institute for Health Sciences and Technology, University of Mons (UMons), Mons, Belgium
- Department of Otorhinolaryngology and Head and Neck Surgery, Foch Hospital, Phonetics and Phonology Laboratory (UMR 7018 CNRS, Université Sorbonne Nouvelle/Paris 3), Paris Saclay University, Paris, France
- Department of Otorhinolaryngology and Head and Neck Surgery, CHU Saint-Pierre, Brussels, Belgium
- Anais Rameau
- Department of Otolaryngology-Head and Neck Surgery, Sean Parker Institute for the Voice, Weill Cornell Medicine, New York City, New York, USA
10.
Kutbi M. Artificial Intelligence-Based Applications for Bone Fracture Detection Using Medical Images: A Systematic Review. Diagnostics (Basel) 2024;14:1879. PMID: 39272664. PMCID: PMC11394268. DOI: 10.3390/diagnostics14171879.
Abstract
Artificial intelligence (AI) is making notable advancements in the medical field, particularly in bone fracture detection. This systematic review compiles and assesses existing research on AI applications aimed at identifying bone fractures through medical imaging, encompassing studies from 2010 to 2023. It evaluates the performance of various AI models, such as convolutional neural networks (CNNs), in diagnosing bone fractures, highlighting their superior accuracy, sensitivity, and specificity compared to traditional diagnostic methods. Furthermore, the review explores the integration of advanced imaging techniques like 3D CT and MRI with AI algorithms, which has led to enhanced diagnostic accuracy and improved patient outcomes. The potential of Generative AI and Large Language Models (LLMs), such as OpenAI's GPT, to enhance diagnostic processes through synthetic data generation, comprehensive report creation, and clinical scenario simulation is also discussed. The review underscores the transformative impact of AI on diagnostic workflows and patient care, while also identifying research gaps and suggesting future research directions to enhance data quality, model robustness, and ethical considerations.
Affiliation(s)
- Mohammed Kutbi
- College of Computing and Informatics, Saudi Electronic University, Riyadh 13316, Saudi Arabia
11.
De Vito A, Colpani A, Moi G, Babudieri S, Calcagno A, Calvino V, Ceccarelli M, Colpani G, d'Ettorre G, Di Biagio A, Farinella M, Falaguasta M, Focà E, Giupponi G, Habed AJ, Isenia WJ, Lo Caputo S, Marchetti G, Modesti L, Mussini C, Nunnari G, Rusconi S, Russo D, Saracino A, Serra PA, Madeddu G. Assessing ChatGPT's Potential in HIV Prevention Communication: A Comprehensive Evaluation of Accuracy, Completeness, and Inclusivity. AIDS Behav 2024;28:2746-2754. PMID: 38836986. PMCID: PMC11286632. DOI: 10.1007/s10461-024-04391-2.
Abstract
With the advancement of artificial intelligence (AI), platforms like ChatGPT have gained traction in different fields, including medicine. This study aims to evaluate the potential of ChatGPT in addressing questions related to HIV prevention and to assess its accuracy, completeness, and inclusivity. A team consisting of 15 physicians, six members from HIV communities, and three experts in gender and queer studies designed an assessment of ChatGPT. Queries were categorized into five thematic groups: general HIV information, behaviors increasing HIV acquisition risk, HIV and pregnancy, HIV testing, and prophylaxis use. The medical doctors developed the questions to be submitted to ChatGPT, and the other members critically assessed the generated responses regarding level of expertise, accuracy, completeness, and inclusivity. The median accuracy score was 5.5 out of 6, with 88.4% of responses achieving a score ≥ 5. Completeness had a median of 3 out of 3, while the median for inclusivity was 2 out of 3. Some thematic groups, like behaviors associated with HIV transmission and prophylaxis, exhibited higher accuracy, indicating variable performance across topics. Issues of inclusivity were identified, notably the use of outdated terms and a lack of representation for some communities. ChatGPT demonstrates significant potential in providing accurate information on HIV-related topics. However, while responses were often scientifically accurate, they sometimes lacked the socio-political context and inclusivity essential for effective health communication. This underlines the importance of aligning AI-driven platforms with contemporary health communication strategies and of balancing accuracy with inclusivity.
Affiliation(s)
- Andrea De Vito
- Unit of Infectious Diseases, Department of Medicine, Surgery, and Pharmacy, University of Sassari, Sassari, 07100, Italy.
- PhD School in Biomedical Science, Biomedical Science Department, University of Sassari, Sassari, Italy.
- Agnese Colpani
- Unit of Infectious Diseases, Department of Medicine, Surgery, and Pharmacy, University of Sassari, Sassari, 07100, Italy
- Giulia Moi
- Unit of Infectious Diseases, Department of Medicine, Surgery, and Pharmacy, University of Sassari, Sassari, 07100, Italy
- Sergio Babudieri
- Unit of Infectious Diseases, Department of Medicine, Surgery, and Pharmacy, University of Sassari, Sassari, 07100, Italy
- Andrea Calcagno
- Unit of Infectious Diseases, Department of Medical Sciences, University of Turin, Torino, Italy
- Valeria Calvino
- Associazione Nazionale per la Lotta contro l'AIDS (ANLAIDS), Rome, Italy
- Manuela Ceccarelli
- Unit of Infectious Diseases, School of Medicine and Surgery, "Kore" University of Enna, Enna, Italy
- Gianmaria Colpani
- Department of Media and Culture Studies, Utrecht University, Utrecht, Netherlands
- Gabriella d'Ettorre
- Unit of Infectious Diseases, Department of Public Health and Infectious Diseases, Azienda Policlinico Umberto I, Rome, Italy
- Antonio Di Biagio
- Infectious Diseases, San Martino Hospital Genoa, University of Genoa, Genoa, Italy
- Marco Falaguasta
- Associazione Nazionale per la Lotta contro l'AIDS (ANLAIDS), Padova, Italy
- Emanuele Focà
- Unit of Infectious and Tropical Diseases, Department of Clinical and Experimental Sciences, University of Brescia and ASST Spedali Civili di Brescia, Brescia, Italy
- Giusi Giupponi
- Lega italiana per la lotta contro l'AIDS (LILA), Brescia, Italy
- Adriano José Habed
- Department of Media and Culture Studies, Utrecht University, Utrecht, Netherlands
- Sergio Lo Caputo
- S.C. Malattie Infettive, Dipartimento di Scienze Mediche e Chirurgiche, University of Foggia, Foggia, Italy
- Giulia Marchetti
- Clinic of Infectious Diseases, Department of Health Sciences, ASST Santi Paolo e Carlo, University of Milan, Milan, Italy
- Luca Modesti
- Conigli Bianchi, Artivists against Serophobia, Italy
- Giuseppe Nunnari
- Unit of Infectious Diseases, Department of Clinical and Experimental Medicine, ARNAS Garibaldi Hospital, University of Catania, Catania, Italy
- Stefano Rusconi
- Infectious Diseases Unit, Ospedale Civile di Legnano, ASST Ovest Milanese, DIBIC Luigi Sacco, Università degli Studi di Milano, Legnano, 20025, Italy
- Daria Russo
- Network Persone Sieropositive (NPS), Rome, Italy
- Annalisa Saracino
- Clinic of Infectious Diseases, Department of Precision and Regenerative Medicine and Ionian Area-(DiMePRe-J), University of Bari "Aldo Moro", Bari, Italy
- Pier Andrea Serra
- Department of Medicine, Surgery and Pharmacy, University of Sassari, Sassari, Italy
- Giordano Madeddu
- Unit of Infectious Diseases, Department of Medicine, Surgery, and Pharmacy, University of Sassari, Sassari, 07100, Italy
12
Maniaci A, Lazzeroni M, Cozzi A, Fraccaroli F, Gaffuri M, Chiesa-Estomba C, Capaccio P. Can chatbots enhance the management of pediatric sialadenitis in clinical practice? Eur Arch Otorhinolaryngol 2024. [PMID: 38955859 DOI: 10.1007/s00405-024-08798-4]
Abstract
OBJECTIVE The purpose of this study was to assess how well ChatGPT, an AI-powered chatbot, performed in helping to manage pediatric sialadenitis and in identifying when sialendoscopy was necessary. METHODS 49 clinical cases of pediatric sialadenitis were retrospectively reviewed. ChatGPT was given patient data and offered differential diagnoses, proposed further tests, and suggested treatments. Its answers were compared with the decisions made by the treating otolaryngologists, and ChatGPT's response consistency and interrater reliability were analyzed. RESULTS ChatGPT achieved 78.57% accuracy in primary diagnosis, and its diagnosis was considered likely in a further 17.35% of cases. ChatGPT recommended more additional examinations than the otolaryngologists did (111 vs. 60, p < 0.001), and agreement between ChatGPT and the otolaryngologists on additional examinations was poor. Only 28.57% of cases received a pertinent and essential treatment plan from ChatGPT, indicating that its treatment recommendations were frequently lacking. Interrater reliability among the judges was highest for treatment ratings (Kendall's tau = 0.824, p < 0.001). For the most part, ChatGPT's response consistency was high. CONCLUSIONS Although ChatGPT has the potential to correctly diagnose pediatric sialadenitis, it has noteworthy limitations in suggesting further testing and treatment regimens. More research and validation are required before widespread clinical use. A critical viewpoint is needed to guarantee that chatbots are used properly and effectively to supplement human expertise rather than replace it.
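The interrater agreement reported above can be illustrated with a small worked example. This is a minimal sketch of Kendall's tau-a (the variant without a tie correction; the abstract does not state which variant or software the authors used), applied to hypothetical ratings from two judges:

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / total pairs.
    Tied pairs count as neither concordant nor discordant."""
    assert len(x) == len(y)
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    n = len(x)
    return (concordant - discordant) / (n * (n - 1) / 2)

# Hypothetical: two judges scoring the same 5 treatment plans on a 1-5 scale
rater_a = [5, 3, 4, 2, 1]
rater_b = [4, 3, 5, 2, 1]
print(kendall_tau_a(rater_a, rater_b))  # 9 concordant, 1 discordant -> 0.8
```

Most statistics packages report tau-b, which corrects the denominator for ties and is usually what rating studies compute.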
Affiliation(s)
- Antonino Maniaci
- Faculty of Medicine and Surgery, University of Enna Kore, 94100, Enna, Italy.
- Matteo Lazzeroni
- Department of Otolaryngology and Head and Neck Surgery, Fondazione IRCCS Ca' Granda Ospedale Maggiore Policlinico, Milan, Italy
- Department of Clinical Sciences and Community, Fondazione IRCCS Ca' Granda Ospedale Maggiore Policlinico, Milan, Italy
- Anna Cozzi
- Otolaryngology Unit, Santi Paolo e Carlo Hospital, Department of Health Sciences, Università Degli Studi di Milano, 20142, Milan, Italy
- Francesca Fraccaroli
- Otolaryngology Unit, Santi Paolo e Carlo Hospital, Department of Health Sciences, Università Degli Studi di Milano, 20142, Milan, Italy
- Michele Gaffuri
- Otolaryngology Unit, Santi Paolo e Carlo Hospital, Department of Health Sciences, Università Degli Studi di Milano, 20142, Milan, Italy
- Carlos Chiesa-Estomba
- Department of Otolaryngology-Head and Neck Surgery, San Sebastian University Hospital, San Sebastián, Spain
- Pasquale Capaccio
- Otolaryngology Unit, Santi Paolo e Carlo Hospital, Department of Health Sciences, Università Degli Studi di Milano, 20142, Milan, Italy
13
Ho RA, Shaari AL, Cowan PT, Yan K. ChatGPT Responses to Frequently Asked Questions on Ménière's Disease: A Comparison to Clinical Practice Guideline Answers. OTO Open 2024; 8:e163. [PMID: 38974175 PMCID: PMC11225079 DOI: 10.1002/oto2.163]
Abstract
Objective Evaluate the quality of responses from Chat Generative Pre-Trained Transformer (ChatGPT) models compared to the answers for "Frequently Asked Questions" (FAQs) from the American Academy of Otolaryngology-Head and Neck Surgery (AAO-HNS) Clinical Practice Guidelines (CPG) for Ménière's disease (MD). Study Design Comparative analysis. Setting The AAO-HNS CPG for MD includes FAQs that clinicians can give to patients for MD-related questions. The ability of ChatGPT to properly educate patients regarding MD is unknown. Methods ChatGPT-3.5 and 4.0 were each prompted with 16 questions from the MD FAQs. Each response was rated in terms of (1) comprehensiveness, (2) extensiveness, (3) presence of misleading information, and (4) quality of resources. Readability was assessed using Flesch-Kincaid Grade Level (FKGL) and Flesch Reading Ease Score (FRES). Results ChatGPT-3.5 was comprehensive in 5 responses whereas ChatGPT-4.0 was comprehensive in 9 (31.3% vs 56.3%, P = .2852). ChatGPT-3.5 and 4.0 were extensive in all responses (P = 1.0000). ChatGPT-3.5 was misleading in 5 responses whereas ChatGPT-4.0 was misleading in 3 (31.3% vs 18.75%, P = .6851). ChatGPT-3.5 had quality resources in 10 responses whereas ChatGPT-4.0 had quality resources in 16 (62.5% vs 100%, P = .0177). AAO-HNS CPG FRES (62.4 ± 16.6) demonstrated an appropriate readability score of at least 60, while both ChatGPT-3.5 (39.1 ± 7.3) and 4.0 (42.8 ± 8.5) failed to meet this standard. All platforms had FKGL means that exceeded the recommended level of 6 or lower. Conclusion While ChatGPT-4.0 had significantly better resource reporting, both models have room for improvement in being more comprehensive, more readable, and less misleading for patients.
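The two readability measures used in this study are standard closed-form formulas over word, sentence, and syllable counts. A minimal sketch (the counts in the example are hypothetical; readability tools differ mainly in how they count syllables):

```python
def fres(words, sentences, syllables):
    # Flesch Reading Ease: higher = easier; >= 60 is roughly plain English
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)

def fkgl(words, sentences, syllables):
    # Flesch-Kincaid Grade Level: approximate US school grade
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59

# Hypothetical passage: 120 words, 8 sentences, 180 syllables
print(round(fres(120, 8, 180), 1))  # -> 64.7 (meets the >= 60 target)
print(round(fkgl(120, 8, 180), 1))  # -> 8.0 (above the recommended grade 6)
```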
Affiliation(s)
- Rebecca A. Ho
- Department of Otolaryngology–Head and Neck Surgery, Rutgers New Jersey Medical School, Newark, New Jersey, USA
- Ariana L. Shaari
- Department of Otolaryngology–Head and Neck Surgery, Rutgers New Jersey Medical School, Newark, New Jersey, USA
- Paul T. Cowan
- Department of Otolaryngology–Head and Neck Surgery, Rutgers New Jersey Medical School, Newark, New Jersey, USA
- Kenneth Yan
- Department of Otolaryngology–Head and Neck Surgery, Rutgers New Jersey Medical School, Newark, New Jersey, USA
14
Lahat A, Sharif K, Zoabi N, Shneor Patt Y, Sharif Y, Fisher L, Shani U, Arow M, Levin R, Klang E. Assessing Generative Pretrained Transformers (GPT) in Clinical Decision-Making: Comparative Analysis of GPT-3.5 and GPT-4. J Med Internet Res 2024; 26:e54571. [PMID: 38935937 PMCID: PMC11240076 DOI: 10.2196/54571]
Abstract
BACKGROUND Artificial intelligence, particularly chatbot systems, is becoming an instrumental tool in health care, aiding clinical decision-making and patient engagement. OBJECTIVE This study aims to analyze the performance of ChatGPT-3.5 and ChatGPT-4 in addressing complex clinical and ethical dilemmas, and to illustrate their potential role in health care decision-making while comparing seniors' and residents' ratings, and specific question types. METHODS A total of 4 specialized physicians formulated 176 real-world clinical questions. A total of 8 senior physicians and residents assessed responses from GPT-3.5 and GPT-4 on a 1-5 scale across 5 categories: accuracy, relevance, clarity, utility, and comprehensiveness. Evaluations were conducted within internal medicine, emergency medicine, and ethics. Comparisons were made globally, between seniors and residents, and across classifications. RESULTS Both GPT models received high mean scores (4.4, SD 0.8 for GPT-4 and 4.1, SD 1.0 for GPT-3.5). GPT-4 outperformed GPT-3.5 across all rating dimensions, with seniors consistently rating responses higher than residents for both models. Specifically, seniors rated GPT-4 as more beneficial and complete (mean 4.6 vs 4.0 and 4.6 vs 4.1, respectively; P<.001), and GPT-3.5 similarly (mean 4.1 vs 3.7 and 3.9 vs 3.5, respectively; P<.001). Ethical queries received the highest ratings for both models, with mean scores reflecting consistency across accuracy and completeness criteria. Distinctions among question types were significant, particularly for the GPT-4 mean scores in completeness across emergency, internal, and ethical questions (4.2, SD 1.0; 4.3, SD 0.8; and 4.5, SD 0.7, respectively; P<.001), and for GPT-3.5's accuracy, beneficial, and completeness dimensions. CONCLUSIONS ChatGPT's potential to assist physicians with medical issues is promising, with prospects to enhance diagnostics, treatments, and ethics. While integration into clinical workflows may be valuable, it must complement, not replace, human expertise. Continued research is essential to ensure safe and effective implementation in clinical environments.
Affiliation(s)
- Adi Lahat
- Department of Gastroenterology, Chaim Sheba Medical Center, Affiliated with Tel Aviv University, Ramat Gan, Israel
- Department of Gastroenterology, Samson Assuta Ashdod Medical Center, Affiliated with Ben Gurion University of the Negev, Be'er Sheva, Israel
- Kassem Sharif
- Department of Gastroenterology, Chaim Sheba Medical Center, Affiliated with Tel Aviv University, Ramat Gan, Israel
- Department of Internal Medicine B, Sheba Medical Centre, Tel Aviv, Israel
- Narmin Zoabi
- Department of Gastroenterology, Chaim Sheba Medical Center, Affiliated with Tel Aviv University, Ramat Gan, Israel
- Yousra Sharif
- Department of Internal Medicine C, Hadassah Medical Center, Jerusalem, Israel
- Lior Fisher
- Department of Internal Medicine B, Sheba Medical Centre, Tel Aviv, Israel
- Uria Shani
- Department of Internal Medicine B, Sheba Medical Centre, Tel Aviv, Israel
- Mohamad Arow
- Department of Internal Medicine B, Sheba Medical Centre, Tel Aviv, Israel
- Roni Levin
- Department of Internal Medicine B, Sheba Medical Centre, Tel Aviv, Israel
- Eyal Klang
- Division of Data-Driven and Digital Medicine (D3M), Icahn School of Medicine at Mount Sinai, New York, NY, United States
15
Sahin S, Erkmen B, Duymaz YK, Bayram F, Tekin AM, Topsakal V. Evaluating ChatGPT-4's performance as a digital health advisor for otosclerosis surgery. Front Surg 2024; 11:1373843. [PMID: 38903865 PMCID: PMC11188327 DOI: 10.3389/fsurg.2024.1373843]
Abstract
Purpose This study aims to evaluate the effectiveness of ChatGPT-4, an artificial intelligence (AI) chatbot, in providing accurate and comprehensible information to patients regarding otosclerosis surgery. Methods On October 20, 2023, 15 hypothetical questions were posed to ChatGPT-4 to simulate physician-patient interactions about otosclerosis surgery. Responses were evaluated by three independent ENT specialists using the DISCERN scoring system. The readability was evaluated using multiple indices: Flesch Reading Ease (FRE), Flesch-Kincaid Grade Level (FKGL), Gunning Fog Index (Gunning FOG), Simple Measure of Gobbledygook (SMOG), Coleman-Liau Index (CLI), and Automated Readability Index (ARI). Results The responses from ChatGPT-4 received DISCERN scores ranging from poor to excellent, with an overall score of 50.7 ± 8.2. The readability analysis indicated that the texts were above the 6th-grade level, suggesting they may not be easily comprehensible to the average reader. There was a significant positive correlation between the referees' scores. Despite providing correct information in over 90% of the cases, the study highlights concerns regarding the potential for incomplete or misleading answers and the high readability level of the responses. Conclusion While ChatGPT-4 shows potential in delivering health information accurately, its utility is limited by the level of readability of its responses. The study underscores the need for continuous improvement in AI systems to ensure the delivery of information that is both accurate and accessible to patients with varying levels of health literacy. Healthcare professionals should supervise the use of such technologies to enhance patient education and care.
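Three of the indices reported above are closed-form functions of simple text counts. The formulas are the standard published ones; the counts in the example are hypothetical:

```python
import math

def gunning_fog(words, sentences, complex_words):
    # 0.4 * (mean sentence length + percentage of 3+-syllable words)
    return 0.4 * (words / sentences + 100 * complex_words / words)

def smog(sentences, polysyllables):
    # SMOG grade from the polysyllable count, normalized to 30 sentences
    return 1.0430 * math.sqrt(polysyllables * 30 / sentences) + 3.1291

def ari(characters, words, sentences):
    # Automated Readability Index from character and sentence lengths
    return 4.71 * characters / words + 0.5 * words / sentences - 21.43

# Hypothetical sample: 100 words, 5 sentences, 15 complex words, 500 letters
print(round(gunning_fog(100, 5, 15), 1))  # -> 14.0
print(round(smog(5, 12), 1))              # -> 12.0
print(round(ari(500, 100, 5), 1))         # -> 12.1
```

All three return an approximate US school-grade level, which is why scores well above 6 indicate text that is hard for the average patient.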
Affiliation(s)
- Yaşar Kemal Duymaz
- Umraniye Research and Training Hospital, University of Health Sciences, Istanbul, Türkiye
- Furkan Bayram
- Umraniye Research and Training Hospital, University of Health Sciences, Istanbul, Türkiye
- Ahmet Mahmut Tekin
- Department of Otolaryngology and Head & Neck Surgery, Vrije Universiteit Brussel, Brussels Health Care Center, Brussels, Belgium
- Vedat Topsakal
- Department of Otolaryngology and Head & Neck Surgery, Vrije Universiteit Brussel, Brussels Health Care Center, Brussels, Belgium
16
Menshawey R, Menshawey E. Quid Pro Quo Doctor, I tell you things, you tell me things: ChatGPT's thoughts on a killer. Forensic Sci Med Pathol 2024; 20:751-755. [PMID: 37594609 DOI: 10.1007/s12024-023-00696-1]
Affiliation(s)
- Rahma Menshawey
- Kasr al Ainy Hospital, Faculty of Medicine, Kasr Al Ainy, Cairo University, Geziret Elroda, Manial, Cairo, 11562, Egypt.
- Esraa Menshawey
- Kasr al Ainy Hospital, Faculty of Medicine, Kasr Al Ainy, Cairo University, Geziret Elroda, Manial, Cairo, 11562, Egypt
17
Dallari V, Liberale C, De Cecco F, Nocini R, Arietti V, Monzani D, Sacchetto L. The role of artificial intelligence in training ENT residents: a survey on ChatGPT, a new method of investigation. Acta Otorhinolaryngol Ital 2024; 44:161-168. [PMID: 38712520 PMCID: PMC11166211 DOI: 10.14639/0392-100x-n2806]
Abstract
Objective The primary focus of this study was to analyze the adoption of ChatGPT among Ear, Nose, and Throat (ENT) trainees, encompassing its role in scientific research and personal study. We also examined the year of training in which ENT trainees became involved in clinical research and how many scientific investigations they had been engaged in. Methods An online survey was distributed to ENT residents employed in Italian University Hospitals. Results Of 609 Italian ENT trainees, 181 (29.7%) responded to the survey. Among these, 67.4% were familiar with ChatGPT, and 18.9% of them used artificial intelligence as a tool for research and study; 32.6% were not familiar with ChatGPT and its functions. Within our sample, there was an increasing trend of participation by ENT trainees in scientific publications throughout their training. Conclusions ChatGPT remains relatively unfamiliar and underutilized in Italy, even though it could be a valuable and efficient tool for ENT trainees, providing quick access to study and research support through both personal computers and smartphones.
Affiliation(s)
- Virginia Dallari
- Unit of Otorhinolaryngology, Head & Neck Department, University of Verona, Verona, Italy
- Member of the Young Confederation of European ORL-HNS
- Carlotta Liberale
- Unit of Otorhinolaryngology, Head & Neck Department, University of Verona, Verona, Italy
- Francesca De Cecco
- Unit of Otorhinolaryngology, Head & Neck Department, University of Verona, Verona, Italy
- Riccardo Nocini
- Unit of Otorhinolaryngology, Head & Neck Department, University of Verona, Verona, Italy
- Member of the Young Confederation of European ORL-HNS
- Valerio Arietti
- Unit of Otorhinolaryngology, Head & Neck Department, University of Verona, Verona, Italy
- Daniele Monzani
- Unit of Otorhinolaryngology, Head & Neck Department, University of Verona, Verona, Italy
- Luca Sacchetto
- Unit of Otorhinolaryngology, Head & Neck Department, University of Verona, Verona, Italy
18
Kim H, Park H, Kang S, Kim J, Kim J, Jung J, Taira R. Evaluating the validity of the nursing statements algorithmically generated based on the International Classifications of Nursing Practice for respiratory nursing care using large language models. J Am Med Inform Assoc 2024; 31:1397-1403. [PMID: 38630586 PMCID: PMC11105147 DOI: 10.1093/jamia/ocae070]
Abstract
OBJECTIVE This study aims to facilitate the creation of quality standardized nursing statements in South Korea's hospitals using algorithmic generation based on the International Classifications of Nursing Practice (ICNP) and evaluation through large language models. MATERIALS AND METHODS We algorithmically generated 15,972 statements related to acute respiratory care using 117 concepts and the concept composition models of the ICNP. Human reviewers, Generative Pre-trained Transformer 4.0 (GPT-4.0), and Bio_Clinical Bidirectional Encoder Representations from Transformers (Bio_ClinicalBERT) evaluated the generated statements for validity. The evaluations by GPT-4.0 and Bio_ClinicalBERT were conducted with and without contextual information and training. RESULTS Of the generated statements, 2207 were deemed valid by expert reviewers. GPT-4.0 showed a zero-shot AUC of 0.857, which worsened with the addition of contextual information. Bio_ClinicalBERT, after training, improved significantly, reaching an AUC of 0.998. CONCLUSION Bio_ClinicalBERT effectively validates auto-generated nursing statements, offering a promising solution to enhance and streamline healthcare documentation processes.
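The AUC values reported for the validators need no ML framework to understand: ROC AUC equals the probability that a randomly chosen valid statement receives a higher score than a randomly chosen invalid one (ties counting half). A minimal sketch over hypothetical labels and scores:

```python
def auc(labels, scores):
    """ROC AUC by direct pair counting: fraction of (positive, negative)
    pairs where the positive is scored higher, with ties counting 0.5."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy validity labels (1 = valid statement) and hypothetical model scores
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
print(round(auc(labels, scores), 3))  # 8 of 9 pairs ranked correctly -> 0.889
```

Pair counting is O(n^2); production implementations sort the scores once, but the result is identical.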
Affiliation(s)
- Hyeoneui Kim
- College of Nursing, Seoul National University, Seoul, 03080, Republic of Korea
- The Research Institute of Nursing Science, Seoul National University, Seoul, 03080, Republic of Korea
- Center for Human-Caring Nurse Leaders for the Future by Brain Korea 21 (BK 21) Four Project, College of Nursing, Seoul National University, Seoul, 03080, Republic of Korea
- Hyewon Park
- College of Nursing, Seoul National University, Seoul, 03080, Republic of Korea
- Samsung Medical Center, Seoul, 06351, Republic of Korea
- Sunghoon Kang
- The Department of Science Studies, Seoul National University, Seoul, 08826, Republic of Korea
- Jinsol Kim
- College of Nursing, Seoul National University, Seoul, 03080, Republic of Korea
- Center for Human-Caring Nurse Leaders for the Future by Brain Korea 21 (BK 21) Four Project, College of Nursing, Seoul National University, Seoul, 03080, Republic of Korea
- Jeongha Kim
- College of Nursing, Seoul National University, Seoul, 03080, Republic of Korea
- Asan Medical Center, Seoul, 05505, Republic of Korea
- Jinsun Jung
- College of Nursing, Seoul National University, Seoul, 03080, Republic of Korea
- Center for Human-Caring Nurse Leaders for the Future by Brain Korea 21 (BK 21) Four Project, College of Nursing, Seoul National University, Seoul, 03080, Republic of Korea
- Ricky Taira
- The Department of Radiological Science, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA 90095, United States
19
Lopez-Gonzalez R, Sanchez-Cordero S, Pujol-Gebellí J, Castellvi J. Evaluation of the Impact of ChatGPT on the Selection of Surgical Technique in Bariatric Surgery. Obes Surg 2024. [PMID: 38760650 DOI: 10.1007/s11695-024-07279-1]
Abstract
PURPOSE With the growing interest in artificial intelligence (AI) applications in medicine, this study explores ChatGPT's potential to influence surgical technique selection in metabolic and bariatric surgery (MBS), contrasting AI recommendations with established clinical guidelines and expert consensus. MATERIALS AND METHODS Conducting a single-center retrospective analysis, the study involved 161 patients who underwent MBS between January 2022 and December 2023. ChatGPT-4 was used to analyze patient data, including demographics, pathological history, and BMI, to recommend the most suitable surgical technique. These AI recommendations were then compared with the hospital's algorithm-based decisions. RESULTS ChatGPT recommended Roux-en-Y gastric bypass in over half of the cases. However, a significant difference was observed between AI suggestions and the surgical techniques actually applied, with only a 34.16% match rate. Further analysis did not reveal any significant correlation between ChatGPT's recommendations and the established surgical algorithm. CONCLUSION Despite ChatGPT's ability to process and analyze large datasets, its recommendations for MBS techniques do not align closely with those determined by expert surgical teams using an algorithm with a high success rate. Consequently, the study concludes that ChatGPT-4 should not replace expert consultation in selecting MBS techniques.
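The reported 34.16% match rate is a plain agreement fraction (55 of 161 cases). A minimal sketch with hypothetical technique labels (the abbreviations RYGB and SG are illustrative, not taken from the paper):

```python
def match_rate(recommended, performed):
    """Fraction of cases where the AI-recommended technique
    matches the technique actually performed."""
    assert len(recommended) == len(performed)
    return sum(r == p for r, p in zip(recommended, performed)) / len(recommended)

# Hypothetical labels for six cases
ai = ["RYGB", "RYGB", "SG", "RYGB", "SG", "RYGB"]
surgeons = ["SG", "RYGB", "SG", "SG", "RYGB", "RYGB"]
print(match_rate(ai, surgeons))  # 3 of 6 agree -> 0.5
```

Against 161 cases, 55 agreements give 55/161 = 34.16%, consistent with the figure in the abstract.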
Affiliation(s)
- Ruth Lopez-Gonzalez
- General and Digestive Surgery, Moises Broggi University Hospital, C Oriol Martorell 12, 08970, Barcelona, Spain.
- Sergi Sanchez-Cordero
- General and Digestive Surgery, Moises Broggi University Hospital, C Oriol Martorell 12, 08970, Barcelona, Spain
- Jordi Pujol-Gebellí
- General and Digestive Surgery, Moises Broggi University Hospital, C Oriol Martorell 12, 08970, Barcelona, Spain
- Jordi Castellvi
- General and Digestive Surgery, Moises Broggi University Hospital, C Oriol Martorell 12, 08970, Barcelona, Spain
20
Vaira LA, Lechien JR, Abbate V, Allevi F, Audino G, Beltramini GA, Bergonzani M, Boscolo-Rizzo P, Califano G, Cammaroto G, Chiesa-Estomba CM, Committeri U, Crimi S, Curran NR, di Bello F, di Stadio A, Frosolini A, Gabriele G, Gengler IM, Lonardi F, Maglitto F, Mayo-Yáñez M, Petrocelli M, Pucci R, Saibene AM, Saponaro G, Tel A, Trabalzini F, Trecca EMC, Vellone V, Salzano G, De Riu G. Validation of the Quality Analysis of Medical Artificial Intelligence (QAMAI) tool: a new tool to assess the quality of health information provided by AI platforms. Eur Arch Otorhinolaryngol 2024. [PMID: 38703195 DOI: 10.1007/s00405-024-08710-0]
Abstract
BACKGROUND The widespread diffusion of Artificial Intelligence (AI) platforms is revolutionizing how health-related information is disseminated, thereby highlighting the need for tools to evaluate the quality of such information. This study aimed to propose and validate the Quality Assessment of Medical Artificial Intelligence (QAMAI), a tool specifically designed to assess the quality of health information provided by AI platforms. METHODS The QAMAI tool has been developed by a panel of experts following guidelines for the development of new questionnaires. A total of 30 responses from ChatGPT4, addressing patient queries, theoretical questions, and clinical head and neck surgery scenarios were assessed by 27 reviewers from 25 academic centers worldwide. Construct validity, internal consistency, inter-rater and test-retest reliability were assessed to validate the tool. RESULTS The validation was conducted on the basis of 792 assessments for the 30 responses given by ChatGPT4. The results of the exploratory factor analysis revealed a unidimensional structure of the QAMAI with a single factor comprising all the items that explained 51.1% of the variance with factor loadings ranging from 0.449 to 0.856. Overall internal consistency was high (Cronbach's alpha = 0.837). The Intraclass Correlation Coefficient was 0.983 (95% CI 0.973-0.991; F (29,542) = 68.3; p < 0.001), indicating excellent reliability. Test-retest reliability analysis revealed a moderate-to-strong correlation with a Pearson's coefficient of 0.876 (95% CI 0.859-0.891; p < 0.001). CONCLUSIONS The QAMAI tool demonstrated significant reliability and validity in assessing the quality of health information provided by AI platforms. Such a tool might become particularly useful for physicians as patients increasingly seek medical information on AI platforms.
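Cronbach's alpha, used here for internal consistency, has a simple closed form: alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). A minimal sketch on hypothetical ratings (the result is the same whether population or sample variances are used, since the normalization cancels in the ratio):

```python
def cronbach_alpha(items):
    """items: list of k lists, one per questionnaire item,
    each holding one score per respondent."""
    k = len(items)
    n = len(items[0])

    def var(xs):  # population variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    # Each respondent's total score across all items
    totals = [sum(item[i] for item in items) for i in range(n)]
    return k / (k - 1) * (1 - sum(var(it) for it in items) / var(totals))

# Hypothetical: 3 items rated by 4 respondents
ratings = [[3, 4, 5, 2],
           [3, 5, 5, 1],
           [2, 4, 4, 2]]
print(round(cronbach_alpha(ratings), 3))  # -> 0.944
```

Values above roughly 0.8, like the 0.837 reported here, are conventionally read as good internal consistency.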
Affiliation(s)
- Luigi Angelo Vaira
- Maxillofacial Surgery Operative Unit, Department of Medicine, Surgery and Pharmacy, University of Sassari, Viale San Pietro 43/B, 07100, Sassari, Italy.
- PhD School of Biomedical Science, Biomedical Sciences Department, University of Sassari, Sassari, Italy.
- Jerome R Lechien
- Department of Laryngology and Bronchoesophagology, EpiCURA Hospital, Mons School of Medicine, UMONS. Research Institute for Health Sciences and Technology, University of Mons (UMons), Mons, Belgium
- Department of Otolaryngology-Head Neck Surgery, Elsan Polyclinic of Poitiers, Poitiers, France
- Vincenzo Abbate
- Head and Neck Section, Department of Neurosciences, Reproductive and Odontostomatological Science, Federico II University of Naples, Naples, Italy
- Fabiana Allevi
- Maxillofacial Surgery Department, ASST Santi Paolo e Carlo, University of Milan, Milan, Italy
- Giovanni Audino
- Head and Neck Section, Department of Neurosciences, Reproductive and Odontostomatological Science, Federico II University of Naples, Naples, Italy
- Giada Anna Beltramini
- Department of Biomedical, Surgical and Dental Sciences, University of Milan, Milan, Italy
- Maxillofacial and Dental Unit, Fondazione IRCCS Cà Granda Ospedale Maggiore Policlinico, Milan, Italy
- Michela Bergonzani
- Maxillo-Facial Surgery Division, Head and Neck Department, University Hospital of Parma, Parma, Italy
- Paolo Boscolo-Rizzo
- Department of Medical, Surgical and Health Sciences, Section of Otolaryngology, University of Trieste, Trieste, Italy
- Gianluigi Califano
- Department of Neurosciences, Reproductive and Odontostomatological Science, Federico II University of Naples, Naples, Italy
- Giovanni Cammaroto
- ENT Department, Morgagni Pierantoni Hospital, AUSL Romagna, Forlì, Italy
- Carlos M Chiesa-Estomba
- Department of Otorhinolaryngology-Head and Neck Surgery, Hospital Universitario Donostia, San Sebastian, Spain
- Umberto Committeri
- Head and Neck Section, Department of Neurosciences, Reproductive and Odontostomatological Science, Federico II University of Naples, Naples, Italy
- Salvatore Crimi
- Operative Unit of Maxillofacial Surgery, Policlinico San Marco, University of Catania, Catania, Italy
- Nicholas R Curran
- Department of Otolaryngology-Head and Neck Surgery, University of Cincinnati Medical Center, Cincinnati, OH, USA
- Francesco di Bello
- Department of Neurosciences, Reproductive and Odontostomatological Science, Federico II University of Naples, Naples, Italy
- Arianna di Stadio
- Otolaryngology Unit, GF Ingrassia Department, University of Catania, Catania, Italy
- Andrea Frosolini
- Department of Maxillofacial Surgery, University of Siena, Siena, Italy
- Guido Gabriele
- Department of Maxillofacial Surgery, University of Siena, Siena, Italy
- Isabelle M Gengler
- Department of Otolaryngology-Head and Neck Surgery, University of Cincinnati Medical Center, Cincinnati, OH, USA
- Fabio Lonardi
- Department of Maxillofacial Surgery, University of Verona, Verona, Italy
- Fabio Maglitto
- Maxillo-Facial Surgery Unit, University of Bari "Aldo Moro", Bari, Italy
- Miguel Mayo-Yáñez
- Otorhinolaryngology, Head and Neck Surgery Department, Complexo Hospitalario Universitario A Coruña (CHUAC), A Coruña, Galicia, Spain
- Marzia Petrocelli
- Maxillofacial Surgery Operative Unit, Bellaria and Maggiore Hospital, Bologna, Italy
- Resi Pucci
- Maxillofacial Surgery Unit, San Camillo-Forlanini Hospital, Rome, Italy
- Alberto Maria Saibene
- Otolaryngology Unit, Santi Paolo e Carlo Hospital, Department of Health Sciences, University of Milan, Milan, Italy
- Gianmarco Saponaro
- Maxillo-Facial Surgery Unit, IRCSS "A. Gemelli" Foundation-Catholic University of the Sacred Heart, Rome, Italy
- Alessandro Tel
- Clinic of Maxillofacial Surgery, Department of Head and Neck Surgery and Neuroscience, University Hospital of Udine, Udine, Italy
| | - Franco Trabalzini
- Department of Otorhinolaryngology, Head and Neck Surgery, Meyer Children's Hospital, Florence, Italy
| | - Eleonora M C Trecca
- Department of Otorhinolaryngology and Maxillofacial Surgery, IRCCS Hospital Casa Sollievo Della Sofferenza, San Giovanni Rotondo, Foggia, Italy
- Department of Otorhinolaryngology, University Hospital of Foggia, Foggia, Italy
| | | | - Giovanni Salzano
- Head and Neck Section, Department of Neurosciences, Reproductive and Odontostomatological Science, Federico II University of Naples, Naples, Italy
| | - Giacomo De Riu
- Maxillofacial Surgery Operative Unit, Department of Medicine, Surgery and Pharmacy, University of Sassari, Viale San Pietro 43/B, 07100, Sassari, Italy
| |
Collapse
21
Lechien JR, Carroll TL, Huston MN, Naunheim MR. ChatGPT-4 accuracy for patient education in laryngopharyngeal reflux. Eur Arch Otorhinolaryngol 2024; 281:2547-2552. [PMID: 38492008 DOI: 10.1007/s00405-024-08560-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/12/2024] [Accepted: 02/13/2024] [Indexed: 03/18/2024]
Abstract
INTRODUCTION Chatbot Generative Pre-trained Transformer (ChatGPT) is an artificial intelligence-powered language model chatbot able to help otolaryngologists in practice and research. The ability of ChatGPT to generate patient-centered information related to laryngopharyngeal reflux disease (LPRD) was evaluated. METHODS Twenty-five questions dedicated to the definition, clinical presentation, diagnosis, and treatment of LPRD were developed from the Dubai definition and management of LPRD consensus and recent reviews. Questions about the four aforementioned categories were entered into ChatGPT-4. Four board-certified laryngologists evaluated the accuracy of ChatGPT-4 with a 5-point Likert scale. Interrater reliability was evaluated. RESULTS The mean scores (SD) of ChatGPT-4 answers for definition, clinical presentation, additional examination, and treatments were 4.13 (0.52), 4.50 (0.72), 3.75 (0.61), and 4.18 (0.47), respectively. Experts reported high interrater reliability for sub-scores (ICC = 0.973). The lowest performances of ChatGPT-4 were on answers about the most prevalent LPR signs, the most reliable objective diagnostic tool (hypopharyngeal-esophageal multichannel intraluminal impedance-pH monitoring, HEMII-pH), and the criteria for the diagnosis of LPR using HEMII-pH. CONCLUSION ChatGPT-4 may provide adequate information on the definition of LPR, its differences from GERD (gastroesophageal reflux disease), and its clinical presentation. Information provided on extra-laryngeal manifestations and HEMII-pH may need further optimization. Given recent trends of increasing patient use of internet sources for self-education, the findings of the present study may help draw attention to ChatGPT-4's accuracy on the topic of LPR.
Affiliation(s)
- Jerome R Lechien
- Research Committee, Young Otolaryngologists of the International Federation of Otorhinolaryngological Societies (IFOS), Paris, France.
- Division of Laryngology and Broncho-Esophagology, Department of Otolaryngology-Head Neck Surgery, EpiCURA Hospital, UMONS Research Institute for Health Sciences and Technology, University of Mons (UMons), Mons, Belgium.
- Department of Otorhinolaryngology and Head and Neck Surgery, Foch Hospital, School of Medicine, Phonetics and Phonology Laboratory (UMR 7018 CNRS, Université Sorbonne Nouvelle/Paris 3), Paris, France.
- Polyclinique Elsan de Poitiers, Poitiers, France.
- Thomas L Carroll
- Division of Otolaryngology-Head and Neck Surgery, Brigham and Women's Hospital, Department of Otolaryngology-Head and Neck Surgery, Harvard Medical School, Boston, MA, USA
- Molly N Huston
- Department of Otolaryngology, Washington University School of Medicine in St. Louis, St. Louis, MO, USA
- Matthew R Naunheim
- Research Committee, Young Otolaryngologists of the International Federation of Otorhinolaryngological Societies (IFOS), Paris, France
- Department of Otolaryngology-Head and Neck Surgery, Harvard Medical School, Boston, MA, USA
- Division of Laryngology, Massachusetts Eye and Ear, Boston, MA, USA
22
Starke SJ, Martinez Rivera MB, Krishnan S, Shah M. Randomized Controlled Trial of Clinical Guidelines Versus Interactive Decision-Support for Improving Medical Trainees' Confidence with Latent Tuberculosis Care. J Gen Intern Med 2024; 39:951-959. [PMID: 38062221 PMCID: PMC11074081 DOI: 10.1007/s11606-023-08551-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2023] [Accepted: 11/17/2023] [Indexed: 05/08/2024]
Abstract
BACKGROUND In order to eliminate tuberculosis (TB) in the USA, primary care providers must take on an expanded role in the diagnosis and management of latent tuberculosis infection (LTBI). Clinical practice guidelines and recommendations exist for LTBI management, but there is a need for innovative tools to improve medical students' and residents' knowledge of evidence-based practices for LTBI testing and treatment. OBJECTIVE To assess the impact of LTBI-ASSIST, a free online decision support aid, as a novel educational tool and mechanism of delivering clinical practice guidelines for medical trainees. DESIGN A single-site, randomized controlled trial of trainees delivered by electronic survey. PARTICIPANTS Medical students and Internal Medicine residents at the Johns Hopkins University School of Medicine. INTERVENTIONS Participants were randomized in a 1:1 ratio to receive the US clinical practice guidelines and recommendations for latent TB management (control arm) or the guidelines plus an introduction to LTBI-ASSIST (LTBI-ASSIST arm) as they completed a case-based knowledge assessment and reported confidence with domains of LTBI care. MAIN MEASURES (1) Proportion of questions answered correctly on a case-based knowledge assessment; (2) change in reported confidence with domains of LTBI care. KEY RESULTS One hundred thirty participants completed the knowledge assessment. Those randomized to receive the LTBI-ASSIST tool performed better on the case-based knowledge assessment, with a mean score of 75.9% (95% CI: 70.6-81.1), compared to 57.4% (52.8-62.0) in the group that received the guidelines only (p < 0.001). Similarly, the LTBI-ASSIST group reported a higher change in confidence (measured as post-assessment confidence minus pre-assessment confidence) compared to the control group in six of the seven domains of LTBI care. CONCLUSIONS LTBI-ASSIST can be an effective supplement to existing guidelines in educating medical trainees and helping providers find evidence-based, guideline-supported answers to questions encountered in clinical practice. TRIAL REGISTRATION NIH Clinical Trial Registry No. NCT05772065.
Affiliation(s)
- Samuel J Starke
- Department of Medicine, Johns Hopkins University, Baltimore, MD, USA.
- Marina B Martinez Rivera
- Division of Infectious Diseases, Department of Medicine, Johns Hopkins University, Baltimore, MD, USA
- Sonya Krishnan
- Division of Infectious Diseases, Department of Medicine, Johns Hopkins University, Baltimore, MD, USA
- Maunank Shah
- Division of Infectious Diseases, Department of Medicine, Johns Hopkins University, Baltimore, MD, USA
23
Wu J, Ma Y, Wang J, Xiao M. The Application of ChatGPT in Medicine: A Scoping Review and Bibliometric Analysis. J Multidiscip Healthc 2024; 17:1681-1692. [PMID: 38650670 PMCID: PMC11034560 DOI: 10.2147/jmdh.s463128] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2024] [Accepted: 03/25/2024] [Indexed: 04/25/2024]
Abstract
Purpose ChatGPT has a wide range of applications in the medical field. This review therefore aims to define the key issues and provide a comprehensive view of the literature on the application of ChatGPT in medicine. Methods This scoping review follows Arksey and O'Malley's five-stage framework. A comprehensive literature search of publications (30 November 2022 to 16 August 2023) was conducted. Six databases were searched and relevant references were systematically catalogued. Attention was focused on the general characteristics of the articles, their fields of application, and the advantages and disadvantages of using ChatGPT. Descriptive statistics and narrative synthesis methods were used for data analysis. Results Of the 3426 studies, 247 met the criteria for inclusion in this review. The majority of articles (31.17%) were from the United States. Editorials (43.32%) ranked first, followed by experimental studies (11.74%). The potential applications of ChatGPT in medicine are varied, with the largest number of studies (45.75%) exploring clinical practice, including assisting with clinical decision support and providing disease information and medical advice. This was followed by medical education (27.13%) and scientific research (16.19%). In the discipline statistics, radiology, surgery, and dentistry were particularly noteworthy at the top of the list. However, ChatGPT in medicine also faces issues of data privacy, inaccuracy, and plagiarism. Conclusion The application of ChatGPT in medicine spans different disciplines and general application scenarios. ChatGPT has a paradoxical nature: it offers significant advantages but at the same time raises great concerns about its application in healthcare settings. It is therefore imperative to develop theoretical frameworks that not only address its widespread use in healthcare but also facilitate a comprehensive assessment. In addition, these frameworks should contribute to the development of strict and effective guidelines and regulatory measures.
Affiliation(s)
- Jie Wu
- Department of Nursing, the First Affiliated Hospital of Chongqing Medical University, Chongqing, People’s Republic of China
- Yingzhuo Ma
- Department of Nursing, the First Affiliated Hospital of Chongqing Medical University, Chongqing, People’s Republic of China
- Jun Wang
- Department of Nursing, the First Affiliated Hospital of Chongqing Medical University, Chongqing, People’s Republic of China
- Mingzhao Xiao
- Department of Urology, the First Affiliated Hospital of Chongqing Medical University, Chongqing, People’s Republic of China
24
Moise A, Centomo-Bozzo A, Orishchak O, Alnoury MK, Daniel SJ. Can ChatGPT Replace an Otolaryngologist in Guiding Parents on Tonsillectomy? Ear Nose Throat J 2024:1455613241230841. [PMID: 38563440 DOI: 10.1177/01455613241230841] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/04/2024]
Abstract
Background: ChatGPT is an artificial intelligence tool that utilizes machine learning to analyze and generate human-like text. The user-friendly accessibility of this tool enables patients to conveniently access medical information without the challenge of intricate terminology. The objective of this study was to assess the accuracy of ChatGPT in providing insights into the indications and management of complications after tonsillectomy, a common pediatric otolaryngology procedure. Methods: The responses generated by ChatGPT were compared to the "Clinical practice guidelines: tonsillectomy in children-executive summary" developed by the American Academy of Otolaryngology-Head and Neck Surgery Foundation (AAO-HNSF). An assessment was carried out by presenting predetermined questions regarding indications and complications post tonsillectomy to ChatGPT, followed by a comparison of its responses with the established guideline by 2 otolaryngology experts. The responses of both parties were reviewed by the senior author. Results: A total of 16 responses generated by ChatGPT were assessed. After a comprehensive review, it was concluded that 15 of 16 (93.8%) responses demonstrated a high degree of reliability and accuracy, closely adhering to the standard established by the AAO-HNSF guideline. Conclusion: The results validate the potential of using ChatGPT to enhance healthcare delivery by making guidelines more accessible to patients, while also emphasizing the importance of ensuring the provision of accurate and reliable medical advice to patients.
Affiliation(s)
- Alexander Moise
- Faculty of Medicine and Health Sciences, McGill University, Montreal, QC, Canada
- Adam Centomo-Bozzo
- Faculty of Dental Medicine and Oral Health Sciences, McGill University, Montreal, QC, Canada
- Ostap Orishchak
- Department of Pediatric Otolaryngology, Montreal Children's Hospital, Montreal, QC, Canada
- Mohammed K Alnoury
- Department of Otolaryngology-Head and Neck Surgery, King Abdulaziz University, Jeddah, Saudi Arabia
- Sam J Daniel
- Department of Pediatric Otolaryngology, Montreal Children's Hospital, Montreal, QC, Canada
25
Saibene AM, Allevi F, Calvo-Henriquez C, Maniaci A, Mayo-Yáñez M, Paderno A, Vaira LA, Felisati G, Craig JR. Reliability of large language models in managing odontogenic sinusitis clinical scenarios: a preliminary multidisciplinary evaluation. Eur Arch Otorhinolaryngol 2024; 281:1835-1841. [PMID: 38189967 PMCID: PMC10943141 DOI: 10.1007/s00405-023-08372-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2023] [Accepted: 11/22/2023] [Indexed: 01/09/2024]
Abstract
PURPOSE This study aimed to evaluate the utility of large language model (LLM) artificial intelligence tools, Chat Generative Pre-Trained Transformer (ChatGPT) versions 3.5 and 4, in managing complex otolaryngological clinical scenarios, specifically the multidisciplinary management of odontogenic sinusitis (ODS). METHODS A prospective, structured multidisciplinary specialist evaluation was conducted using five ad hoc designed ODS-related clinical scenarios. LLM responses to these scenarios were critically reviewed by a multidisciplinary panel of eight specialist evaluators (2 ODS experts, 2 rhinologists, 2 general otolaryngologists, and 2 maxillofacial surgeons). Based on the level of disagreement from panel members, a Total Disagreement Score (TDS) was calculated for each LLM response, and TDS comparisons were made between ChatGPT3.5 and ChatGPT4, as well as between different evaluators. RESULTS While some degree of disagreement was demonstrated in 73/80 evaluator reviews of LLMs' responses, TDSs were significantly lower for ChatGPT4 compared to ChatGPT3.5. The highest TDSs were found in the case of complicated ODS with orbital abscess, presumably due to increased case complexity, with dental, rhinologic, and orbital factors affecting diagnostic and therapeutic options. There were no statistically significant differences in TDSs between evaluators' specialties, though ODS experts and maxillofacial surgeons tended to assign higher TDSs. CONCLUSIONS LLMs like ChatGPT, especially newer versions, showed potential for complementing evidence-based clinical decision-making, but substantial disagreement was still demonstrated between LLMs and clinical specialists across most case examples, suggesting they are not yet optimal in aiding clinical management decisions. Future studies will be important to analyze LLMs' performance as they evolve over time.
Affiliation(s)
- Alberto Maria Saibene
- Otolaryngology Unit, Santi Paolo E Carlo Hospital, Department of Health Sciences, Università Degli Studi Di Milano, Milan, Italy.
- Fabiana Allevi
- Maxillofacial Surgery Unit, Santi Paolo E Carlo Hospital, Department of Health Sciences, Università Degli Studi Di Milano, Milan, Italy
- Christian Calvo-Henriquez
- Service of Otolaryngology, Rhinology Unit, Hospital Complex at the University of Santiago de Compostela, Santiago de Compostela, A Coruña, Spain
- Antonino Maniaci
- Department of Medical, Surgical Sciences and Advanced Technologies G.F. Ingrassia, University of Catania, Catania, Italy
- Miguel Mayo-Yáñez
- Otorhinolaryngology, Head and Neck Surgery Department, Complexo Hospitalario Universitario A Coruña (CHUAC), A Coruña, Galicia, Spain
- Alberto Paderno
- Department of Otorhinolaryngology, Head and Neck Surgery, University of Brescia, Brescia, Italy
- Luigi Angelo Vaira
- Maxillofacial Surgery Operative Unit, Department of Medicine, Surgery and Pharmacy, University of Sassari, Sassari, Italy
- Biomedical Science PhD School, Biomedical Science Department, University of Sassari, Sassari, Italy
- Giovanni Felisati
- Otolaryngology Unit, Santi Paolo E Carlo Hospital, Department of Health Sciences, Università Degli Studi Di Milano, Milan, Italy
- John R Craig
- Department of Otolaryngology-Head and Neck Surgery, Henry Ford Health, Detroit, MI, USA
26
Mira FA, Favier V, Dos Santos Sobreira Nunes H, de Castro JV, Carsuzaa F, Meccariello G, Vicini C, De Vito A, Lechien JR, Chiesa-Estomba C, Maniaci A, Iannella G, Rojas EP, Cornejo JB, Cammaroto G. Chat GPT for the management of obstructive sleep apnea: do we have a polar star? Eur Arch Otorhinolaryngol 2024; 281:2087-2093. [PMID: 37980605 DOI: 10.1007/s00405-023-08270-9] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 09/13/2023] [Accepted: 09/29/2023] [Indexed: 11/21/2023]
Abstract
PURPOSE This study explores the potential of the Chat-Generative Pre-Trained Transformer (Chat-GPT), a Large Language Model (LLM), in assisting healthcare professionals in the diagnosis of obstructive sleep apnea (OSA). It aims to assess the agreement between Chat-GPT's responses and those of expert otolaryngologists, shedding light on the role of AI-generated content in medical decision-making. METHODS A prospective, cross-sectional study was conducted, involving 350 otolaryngologists from 25 countries who responded to a specialized OSA survey. Chat-GPT was tasked with providing answers to the same survey questions. Responses were assessed by both super-experts and statistically analyzed for agreement. RESULTS The study revealed that Chat-GPT and expert responses shared a common answer in over 75% of cases for individual questions. However, the overall consensus was achieved in only four questions. Super-expert assessments showed a moderate agreement level, with Chat-GPT scoring slightly lower than experts. Statistically, Chat-GPT's responses differed significantly from experts' opinions (p = 0.0009). Sub-analysis revealed areas of improvement for Chat-GPT, particularly in questions where super-experts rated its responses lower than expert consensus. CONCLUSIONS Chat-GPT demonstrates potential as a valuable resource for OSA diagnosis, especially where access to specialists is limited. The study emphasizes the importance of AI-human collaboration, with Chat-GPT serving as a complementary tool rather than a replacement for medical professionals. This research contributes to the discourse in otolaryngology and encourages further exploration of AI-driven healthcare applications. While Chat-GPT exhibits a commendable level of consensus with expert responses, ongoing refinements in AI-based healthcare tools hold significant promise for the future of medicine, addressing the underdiagnosis and undertreatment of OSA and improving patient outcomes.
Affiliation(s)
- Felipe Ahumada Mira
- ENT Department, Hospital of Linares, Linares, Chile
- Young Otolaryngologists-International Federations of Oto-Rhinolaryngological Societies (YO-IFOS), Paris, France
- Valentin Favier
- ENT Department, University Hospital of Montpellier, Montpellier, France
- Young Otolaryngologists-International Federations of Oto-Rhinolaryngological Societies (YO-IFOS), Paris, France
- Heloisa Dos Santos Sobreira Nunes
- ENT and Sleep Medicine Department, Nucleus of Otolaryngology, Head and Neck Surgery and Sleep Medicine of São Paulo, São Paulo, Brazil
- Young Otolaryngologists-International Federations of Oto-Rhinolaryngological Societies (YO-IFOS), Paris, France
- Joana Vaz de Castro
- ENT Department, Armed Forces Hospital, Lisbon, Portugal
- Young Otolaryngologists-International Federations of Oto-Rhinolaryngological Societies (YO-IFOS), Paris, France
- Florent Carsuzaa
- ENT Department, University Hospital of Poitiers, Poitiers, France
- Young Otolaryngologists-International Federations of Oto-Rhinolaryngological Societies (YO-IFOS), Paris, France
- Giuseppe Meccariello
- Head and Neck Department, ENT & Oral Surgery Unit, G.B. Morgagni, L. Pierantoni Hospital, Via Forlanini, 47121, Forlì, Italy
- Claudio Vicini
- Head and Neck Department, ENT & Oral Surgery Unit, G.B. Morgagni, L. Pierantoni Hospital, Via Forlanini, 47121, Forlì, Italy
- Andrea De Vito
- Head and Neck Department, ENT & Oral Surgery Unit, G.B. Morgagni, L. Pierantoni Hospital, Via Forlanini, 47121, Forlì, Italy
- Jerome R Lechien
- Division of Laryngology and Broncho-Esophagology, Department of Otolaryngology and Head and Neck Surgery, EpiCURA Hospital, UMONS Research Institute for Health Sciences and Technology, University of Mons, Mons, Belgium
- Young Otolaryngologists-International Federations of Oto-Rhinolaryngological Societies (YO-IFOS), Paris, France
- Carlos Chiesa-Estomba
- Department of Otorhinolaryngology, Biodonostia Research Institute, Donostia University Hospital, Osakidetza, 20014, San Sebastian, Spain
- Young Otolaryngologists-International Federations of Oto-Rhinolaryngological Societies (YO-IFOS), Paris, France
- Antonino Maniaci
- Department of Medical and Surgical Sciences and Advanced Technologies "GF Ingrassia", ENT Section, University of Catania, Piazza Università 2, 95100, Catania, Italy
- Young Otolaryngologists-International Federations of Oto-Rhinolaryngological Societies (YO-IFOS), Paris, France
- Giannicola Iannella
- Department of 'Organi di Senso', University "Sapienza", Viale Dell'Università 33, 00185, Rome, Italy
- Young Otolaryngologists-International Federations of Oto-Rhinolaryngological Societies (YO-IFOS), Paris, France
- Giovanni Cammaroto
- Head and Neck Department, ENT & Oral Surgery Unit, G.B. Morgagni, L. Pierantoni Hospital, Via Forlanini, 47121, Forlì, Italy.
- Young Otolaryngologists-International Federations of Oto-Rhinolaryngological Societies (YO-IFOS), Paris, France.
27
Lechien JR, Maniaci A, Gengler I, Hans S, Chiesa-Estomba CM, Vaira LA. Validity and reliability of an instrument evaluating the performance of intelligent chatbot: the Artificial Intelligence Performance Instrument (AIPI). Eur Arch Otorhinolaryngol 2024; 281:2063-2079. [PMID: 37698703 DOI: 10.1007/s00405-023-08219-y] [Citation(s) in RCA: 22] [Impact Index Per Article: 22.0] [Received: 07/26/2023] [Accepted: 08/30/2023] [Indexed: 09/13/2023]
Abstract
OBJECTIVES To evaluate the reliability and validity of the Artificial Intelligence Performance Instrument (AIPI). METHODS Medical records of patients consulting in otolaryngology were evaluated by physicians and ChatGPT for differential diagnosis, management, and treatment. The ChatGPT performance was rated twice using AIPI within a 7-day period to assess test-retest reliability. Internal consistency was evaluated using Cronbach's α. Internal validity was evaluated by comparing the AIPI scores of the clinical cases rated by ChatGPT and 2 blinded practitioners. Convergent validity was measured by comparing the AIPI score with a modified version of the Ottawa Clinical Assessment Tool (OCAT). Interrater reliability was assessed using Kendall's tau. RESULTS Forty-five patients completed the evaluations (28 females). The AIPI Cronbach's alpha analysis suggested an adequate internal consistency (α = 0.754). The test-retest reliability was moderate-to-strong for items and the total score of AIPI (rs = 0.486, p = 0.001). The mean AIPI score of the senior otolaryngologist was significantly higher compared to the score of ChatGPT, supporting adequate internal validity (p = 0.001). Convergent validity reported a moderate and significant correlation between AIPI and modified OCAT (rs = 0.319; p = 0.044). The interrater reliability reported significant positive concordance between both otolaryngologists for the patient feature, diagnostic, additional examination, and treatment subscores as well as for the AIPI total score. CONCLUSIONS AIPI is a valid and reliable instrument in assessing the performance of ChatGPT in ear, nose and throat conditions. Future studies are needed to investigate the usefulness of AIPI in medicine and surgery, and to evaluate the psychometric properties in these fields.
Affiliation(s)
- Jerome R Lechien
- Research Committee of Young Otolaryngologists of the International Federation of Otorhinolaryngological Societies (IFOS), Paris, France.
- Young Confederation of the European Oto-Rhino-Laryngological Head and Neck Surgery Societies (Y-CEORLHNS), Dublin, Ireland.
- Division of Laryngology and Broncho-Esophagology, Department of Otolaryngology-Head Neck Surgery, EpiCURA Hospital, UMONS Research Institute for Health Sciences and Technology, University of Mons (UMons), Mons, Belgium.
- Phonetics and Phonology Laboratory (UMR 7018 CNRS, Université Sorbonne Nouvelle/Paris 3), Department of Otorhinolaryngology and Head and Neck Surgery, Foch Hospital, School of Medicine, UFR Simone Veil, Université Versailles Saint-Quentin-en-Yvelines (Paris Saclay University), Paris, France.
- Department of Otorhinolaryngology and Head and Neck Surgery, CHU Saint-Pierre, Brussels, Belgium.
- Faculty of Medicine, Department of Human Anatomy and Experimental Oncology, UMONS Research Institute for Health Sciences and Technology, Avenue du Champ de Mars, 6, B7000, Mons, Belgium.
- Antonino Maniaci
- Research Committee of Young Otolaryngologists of the International Federation of Otorhinolaryngological Societies (IFOS), Paris, France
- Department of Medical, Surgical Sciences and Advanced Technologies G.F. Ingrassia, ENT Section, University of Catania, 95123, Catania, Italy
- Isabelle Gengler
- Research Committee of Young Otolaryngologists of the International Federation of Otorhinolaryngological Societies (IFOS), Paris, France
- Department of Otolaryngology-Head and Neck Surgery, University of Cincinnati Medical Center, Cincinnati, OH, USA
- Stephane Hans
- Research Committee of Young Otolaryngologists of the International Federation of Otorhinolaryngological Societies (IFOS), Paris, France
- Phonetics and Phonology Laboratory (UMR 7018 CNRS, Université Sorbonne Nouvelle/Paris 3), Department of Otorhinolaryngology and Head and Neck Surgery, Foch Hospital, School of Medicine, UFR Simone Veil, Université Versailles Saint-Quentin-en-Yvelines (Paris Saclay University), Paris, France
- Carlos M Chiesa-Estomba
- Research Committee of Young Otolaryngologists of the International Federation of Otorhinolaryngological Societies (IFOS), Paris, France
- Young Confederation of the European Oto-Rhino-Laryngological Head and Neck Surgery Societies (Y-CEORLHNS), Dublin, Ireland
- Department of Otorhinolaryngology - Head and Neck Surgery, Donostia University Hospital - Biodonostia Research Institute, St. Sebastian, Spain
- Luigi A Vaira
- Research Committee of Young Otolaryngologists of the International Federation of Otorhinolaryngological Societies (IFOS), Paris, France
- Maxillofacial Surgery Operative Unit, Department of Medicine, Surgery and Pharmacy, University of Sassari, Sassari, Italy
- Biomedical Science Department, Biomedical Science PhD School, University of Sassari, Sassari, Italy
28
Karimov Z, Allahverdiyev I, Agayarov OY, Demir D, Almuradova E. ChatGPT vs UpToDate: comparative study of usefulness and reliability of Chatbot in common clinical presentations of otorhinolaryngology-head and neck surgery. Eur Arch Otorhinolaryngol 2024; 281:2145-2151. [PMID: 38217726 PMCID: PMC10942922 DOI: 10.1007/s00405-023-08423-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/02/2023] [Accepted: 12/18/2023] [Indexed: 01/15/2024]
Abstract
PURPOSE The use of chatbots, a form of artificial intelligence, in medicine has increased in recent years. UpToDate® is a well-known search tool built on evidence-based knowledge and used daily by doctors worldwide. In this study, we aimed to investigate the usefulness and reliability of ChatGPT compared to UpToDate in otorhinolaryngology and head and neck surgery (ORL-HNS). MATERIALS AND METHODS ChatGPT-3.5 and UpToDate were interrogated for the management of 25 common clinical case scenarios (13 males/12 females) recruited from the literature, reflecting daily observation at the Department of Otorhinolaryngology of Ege University Faculty of Medicine. Scientific references for the management were requested for each clinical case. The accuracy of the references in the ChatGPT answers was assessed on a 0-2 scale, and the usefulness of the ChatGPT and UpToDate answers was rated by reviewers on a 1-3 scale. UpToDate and ChatGPT-3.5 responses were compared. RESULTS ChatGPT did not give references for some questions, in contrast to UpToDate. Information in ChatGPT was limited to 2021. UpToDate supported its answers with subheadings, tables, figures, and algorithms. The mean accuracy score of references in ChatGPT answers was 0.25 (weak/unrelated). The median (Q1-Q3) usefulness score was 1.00 (1.25-2.00) for ChatGPT and 2.63 (2.75-3.00) for UpToDate; the difference was statistically significant (p < 0.001). UpToDate was found to be more useful and reliable than ChatGPT. CONCLUSIONS ChatGPT has the potential to support physicians in finding information, but our results suggest that ChatGPT needs to be improved to increase the usefulness and reliability of its medical evidence-based knowledge.
Affiliation(s)
- Ziya Karimov: Medicine Program, Ege University Faculty of Medicine, 35100, Izmir, Türkiye
- Irshad Allahverdiyev: Medicine Program, Istanbul University, Istanbul Faculty of Medicine, Istanbul, Türkiye
- Ozlem Yagiz Agayarov: Department of Otolaryngology-Head and Neck Surgery, Izmir Tepecik Education and Research Hospital, Health Sciences University, Izmir, Türkiye
- Dogukan Demir: Department of Otolaryngology-Head and Neck Surgery, Izmir Tepecik Education and Research Hospital, Health Sciences University, Izmir, Türkiye
- Elvina Almuradova: Department of Medical Oncology, Ege University Faculty of Medicine, Izmir, Türkiye; Department of Oncology, Medicana International Hospital, Izmir, Türkiye
29
Teixeira-Marques F, Medeiros N, Nazaré F, Alves S, Lima N, Ribeiro L, Gama R, Oliveira P. Exploring the role of ChatGPT in clinical decision-making in otorhinolaryngology: a ChatGPT designed study. Eur Arch Otorhinolaryngol 2024; 281:2023-2030. PMID: 38345613. DOI: 10.1007/s00405-024-08498-z.
Abstract
PURPOSE Since the beginning of 2023, ChatGPT has emerged as a hot topic in healthcare research. Its potential as a valuable tool in clinical practice is compelling, particularly for improving clinical decision support by helping physicians make decisions based on the best available medical knowledge. We aimed to investigate ChatGPT's ability to identify, diagnose, and manage patients with otorhinolaryngology-related symptoms. METHODS A prospective, cross-sectional study was designed, based on an idea suggested by ChatGPT itself, to assess the level of agreement between ChatGPT and five otorhinolaryngologists (ENTs) in 20 reality-inspired clinical cases. The clinical cases were presented to the chatbot on two different occasions (ChatGPT-1 and ChatGPT-2) to assess its temporal stability. RESULTS The mean score of ChatGPT-1 was 4.4 (SD 1.2; min 1, max 5) and of ChatGPT-2 was 4.15 (SD 1.3; min 1, max 5), while the ENTs' mean score was 4.91 (SD 0.3; min 3, max 5). The Mann-Whitney U test revealed a statistically significant difference (p < 0.001) between each ChatGPT score and the ENTs' score. ChatGPT-1 and ChatGPT-2 gave different answers on five occasions. CONCLUSIONS Artificial intelligence will be an important instrument in clinical decision-making in the near future, and ChatGPT is the most promising chatbot so far. Although further development is needed before it can be used safely, it has the potential to aid otorhinolaryngology residents and specialists in making the best decision for the patient.
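The rank-based comparison reported in this abstract can be sketched in plain Python. This is an illustration only: the scores below are hypothetical, not the study's data, and a real analysis would also compute a p-value (e.g. via a normal approximation or a library routine).

```python
def mann_whitney_u(a, b):
    """Return the Mann-Whitney U statistic for sample `a` against sample `b`."""
    combined = sorted((value, idx) for idx, value in enumerate(a + b))
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        # Find the run of tied values and assign each its average rank.
        j = i
        while j + 1 < len(combined) and combined[j + 1][0] == combined[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[combined[k][1]] = avg_rank
        i = j + 1
    rank_sum_a = sum(ranks[: len(a)])  # positions 0..len(a)-1 belong to `a`
    return rank_sum_a - len(a) * (len(a) + 1) / 2

# Hypothetical 1-5 ratings for the same cases from two sources.
chatgpt_scores = [5, 4, 5, 3, 1, 5, 4, 5]
expert_scores = [5, 5, 5, 5, 4, 5, 5, 5]
u = mann_whitney_u(chatgpt_scores, expert_scores)
```

In practice a statistics library (e.g. SciPy's `mannwhitneyu`) would be used; the hand-rolled version above just makes the tied-rank bookkeeping explicit.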
Affiliation(s)
- Francisco Teixeira-Marques: Department of Otorhinolaryngology, Centro Hospitalar de Vila Nova de Gaia/Espinho, Gaia (Porto), Portugal
- Nuno Medeiros: Department of Otorhinolaryngology, Centro Hospitalar de Vila Nova de Gaia/Espinho, Gaia (Porto), Portugal
- Francisco Nazaré: Department of Otorhinolaryngology, Centro Hospitalar de Vila Nova de Gaia/Espinho, Gaia (Porto), Portugal
- Sandra Alves: Department of Otorhinolaryngology, Centro Hospitalar de Vila Nova de Gaia/Espinho, Gaia (Porto), Portugal
- Nuno Lima: Department of Otorhinolaryngology, Centro Hospitalar de Vila Nova de Gaia/Espinho, Gaia (Porto), Portugal
- Leandro Ribeiro: Department of Otorhinolaryngology, Centro Hospitalar de Vila Nova de Gaia/Espinho, Gaia (Porto), Portugal
- Rita Gama: Department of Otorhinolaryngology, Centro Hospitalar de Vila Nova de Gaia/Espinho, Gaia (Porto), Portugal
- Pedro Oliveira: Department of Otorhinolaryngology, Centro Hospitalar de Vila Nova de Gaia/Espinho, Gaia (Porto), Portugal
30
Lechien JR, Chiesa-Estomba CM, Baudouin R, Hans S. Accuracy of ChatGPT in head and neck oncological board decisions: preliminary findings. Eur Arch Otorhinolaryngol 2024; 281:2105-2114. PMID: 37991498. DOI: 10.1007/s00405-023-08326-w.
Abstract
OBJECTIVES To evaluate the performance of ChatGPT-4 in oncological board decisions. METHODS Twenty medical records of patients with head and neck cancer were submitted to ChatGPT-4, which was asked to propose additional examinations, management, and therapeutic approaches. The ChatGPT-4 propositions were assessed with the Artificial Intelligence Performance Instrument. The stability of ChatGPT-4 was evaluated through answers regenerated at a 1-day interval. RESULTS ChatGPT-4 provided adequate explanations for cTNM staging in 19 cases (95%). ChatGPT-4 proposed a significantly higher number of additional examinations than practitioners (72 versus 103; p = 0.001). Its indications for endoscopy-biopsy, HPV testing, ultrasonography, and PET-CT were consistent with the oncological board decisions. The therapeutic propositions of ChatGPT-4 were accurate in 13 cases (65%). Most propositions for additional examinations and primary treatment remained consistent across the regenerated responses. CONCLUSIONS ChatGPT-4 may be an adjunctive theoretical tool for simple oncological board decisions.
Affiliation(s)
- Jerome R Lechien: Research Committee of Young-Otolaryngologists of the International Federation of Oto-Rhino-Laryngological Societies (YO-IFOS), Paris, France; Department of Otolaryngology-Head Neck Surgery, Foch Hospital, UFR Simone Veil, University Paris Saclay, Paris, France; Phonetics and Phonology Laboratory (UMR 7018 CNRS, Université Sorbonne Nouvelle/Paris 3), Paris, France; Department of Otorhinolaryngology and Head and Neck Surgery, CHU Saint-Pierre, Brussels, Belgium; Division of Laryngology and Broncho-Esophagology, Department of Otolaryngology-Head and Neck Surgery, EpiCURA Hospital, Baudour, Belgium
- Carlos-Miguel Chiesa-Estomba: Research Committee of Young-Otolaryngologists of the International Federation of Oto-Rhino-Laryngological Societies (YO-IFOS), Paris, France; Department of Otorhinolaryngology-Head and Neck Surgery, Hospital Universitario Donostia, San Sebastian, Spain
- Robin Baudouin: Research Committee of Young-Otolaryngologists of the International Federation of Oto-Rhino-Laryngological Societies (YO-IFOS), Paris, France; Department of Otolaryngology-Head Neck Surgery, Foch Hospital, UFR Simone Veil, University Paris Saclay, Paris, France; Phonetics and Phonology Laboratory (UMR 7018 CNRS, Université Sorbonne Nouvelle/Paris 3), Paris, France
- Stéphane Hans: Research Committee of Young-Otolaryngologists of the International Federation of Oto-Rhino-Laryngological Societies (YO-IFOS), Paris, France; Department of Otolaryngology-Head Neck Surgery, Foch Hospital, UFR Simone Veil, University Paris Saclay, Paris, France; Phonetics and Phonology Laboratory (UMR 7018 CNRS, Université Sorbonne Nouvelle/Paris 3), Paris, France
31
Abou-Abdallah M, Dar T, Mahmudzade Y, Michaels J, Talwar R, Tornari C. The quality and readability of patient information provided by ChatGPT: can AI reliably explain common ENT operations? Eur Arch Otorhinolaryngol 2024 (online ahead of print). PMID: 38530460. DOI: 10.1007/s00405-024-08598-w.
Abstract
PURPOSE Access to high-quality, comprehensible patient information is crucial, yet the information provided by increasingly prevalent artificial intelligence tools has not been thoroughly investigated. This study assesses the quality and readability of information from ChatGPT regarding three index ENT operations: tonsillectomy, adenoidectomy, and grommets. METHODS We asked ChatGPT standard and simplified questions. Readability was calculated using the Flesch-Kincaid Reading Ease Score (FRES), Flesch-Kincaid Grade Level (FKGL), Gunning Fog Index (GFI), and Simple Measure of Gobbledygook (SMOG) scores. We assessed quality using the DISCERN instrument and compared the results with ENT UK patient leaflets. RESULTS ChatGPT readability was poor, with mean FRES of 38.9 and 55.1 pre- and post-simplification, respectively. Simplified information from ChatGPT was 43.6% more readable (FRES) but scored 11.6% lower for quality. ENT UK patient information was consistently more readable and of higher quality. CONCLUSIONS ChatGPT can simplify information at the expense of quality, producing shorter answers with important omissions. Limitations in knowledge and insight curb its reliability for healthcare information. Patients should use reputable sources from professional organisations, alongside clear communication with their clinicians, to give well-informed consent and make decisions.
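The FRES metric used in this study is a closed-form formula over word, sentence, and syllable counts. A minimal sketch follows; the syllable counter is a rough vowel-group heuristic (published tools use dictionary-based counters), so scores will only approximate those reported by dedicated readability software.

```python
import re

def count_syllables(word):
    """Crude syllable estimate: count vowel groups, trimming a silent final 'e'."""
    word = word.lower()
    n = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith(("le", "ee")) and n > 1:
        n -= 1  # e.g. "stroke" -> 1 syllable, but keep "table" -> 2
    return max(n, 1)

def flesch_reading_ease(text):
    """FRES = 206.835 - 1.015*(words/sentence) - 84.6*(syllables/word)."""
    sentences = max(len(re.findall(r"[.!?]+", text)), 1)
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return 206.835 - 1.015 * (len(words) / sentences) - 84.6 * (syllables / len(words))
```

Higher FRES means easier text; scores near 38.9 (the pre-simplification mean above) correspond to college-level difficulty on the usual interpretation bands.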
Affiliation(s)
- Michel Abou-Abdallah: Ear, Nose and Throat Department, Luton and Dunstable University Hospital, Lewsey Rd, Luton, LU4 0DZ, UK
- Talib Dar: Ear, Nose and Throat Department, Luton and Dunstable University Hospital, Lewsey Rd, Luton, LU4 0DZ, UK
- Yasamin Mahmudzade: Foundation Programme, East and North Hertfordshire NHS Trust, Stevenage, UK
- Joshua Michaels: Ear, Nose and Throat Department, Luton and Dunstable University Hospital, Lewsey Rd, Luton, LU4 0DZ, UK
- Rishi Talwar: Ear, Nose and Throat Department, Luton and Dunstable University Hospital, Lewsey Rd, Luton, LU4 0DZ, UK
- Chrysostomos Tornari: Ear, Nose and Throat Department, Luton and Dunstable University Hospital, Lewsey Rd, Luton, LU4 0DZ, UK
32
Briganti G. How ChatGPT works: a mini review. Eur Arch Otorhinolaryngol 2024; 281:1565-1569. PMID: 37991499. DOI: 10.1007/s00405-023-08337-7.
Abstract
OBJECTIVE This paper offers a mini-review of OpenAI's language model, ChatGPT, detailing its mechanisms, applications in healthcare, and comparisons with other large language models (LLMs). METHODS The underlying technology of ChatGPT is outlined, focusing on its neural network architecture, training process, and the role of key elements such as input embedding, the encoder, the decoder, the attention mechanism, and output projection. The advancements in GPT-4, including its capacity for internet connection and the integration of plugins for enhanced functionality, are discussed. RESULTS ChatGPT can generate creative, coherent, and contextually relevant sentences, making it a valuable tool in healthcare for patient engagement, medical education, and clinical decision support. Yet, like other LLMs, it has limitations, including a lack of common-sense knowledge, a propensity to hallucinate facts, a restricted context window, and potential privacy concerns. CONCLUSION Despite these limitations, LLMs like ChatGPT offer transformative possibilities for healthcare. With ongoing research into model interpretability, common-sense reasoning, and the handling of longer context windows, their potential is vast. It is crucial for healthcare professionals to remain informed about these technologies and consider their ethical integration into practice.
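The attention mechanism this mini-review names is, at its core, a softmax-weighted average of value vectors. A dependency-free sketch of scaled dot-product attention on nested lists (illustrative only, not OpenAI's implementation, which operates on batched tensors with learned projections):

```python
import math

def scaled_dot_product_attention(queries, keys, values):
    """Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V, on plain nested lists."""
    d_k = len(keys[0])
    out = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        # Numerically stable softmax over the scores.
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Weighted average of the value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out
```

A query that matches no key more than another (all scores equal) simply averages the values, which is why attention degrades gracefully rather than failing on out-of-distribution inputs.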
Affiliation(s)
- Giovanni Briganti: Chair of AI and Digital Medicine, Department of Neuroscience, Faculty of Medicine, University of Mons, Avenue du Champs de Mars 6, B7000, Mons, Belgium; Department of Clinical Science, Faculty of Medicine, University of Liège, Quartier Hôpital, Avenue Hippocrate 13, B4000, Liege, Belgium; Faculty of Medicine, Université Libre de Bruxelles, Route de Lennik 808, B1070, Brussels, Belgium
33
Lombardo R, Cicione A, Santoro G, De Nunzio C. ChatGPT in prostate cancer: myth or reality? Prostate Cancer Prostatic Dis 2024; 27:9-10. PMID: 37950022. DOI: 10.1038/s41391-023-00750-7.
Affiliation(s)
- Antonio Cicione: Ospedale Sant'Andrea, Sapienza University of Rome, Rome, Italy
34
Sarma G, Kashyap H, Medhi PP. ChatGPT in Head and Neck Oncology-Opportunities and Challenges. Indian J Otolaryngol Head Neck Surg 2024; 76:1425-1429. PMID: 38440617. PMCID: PMC10908741. DOI: 10.1007/s12070-023-04201-6. Open access.
Abstract
Head and neck oncology is a complex and challenging field, encompassing the diagnosis, treatment, and management of various malignancies affecting the intricate anatomical structures of the head and neck region. With advancements in artificial intelligence (AI), chatbot applications have emerged as a promising tool to revolutionize the field. ChatGPT, a cutting-edge language model developed by OpenAI, can help the oncologist in the clinic with scheduling appointments, establishing a clinical diagnosis, making a treatment plan, and follow-up. ChatGPT can also play a role in telemedicine consultations, medical documentation, scientific writing, and research. However, ChatGPT carries inherent drawbacks: it raises significant ethical concerns related to authorship, accountability, transparency, bias, and the potential for misinformation, and its training data are limited to September 2021, so regular updates are required to keep pace with rapidly evolving medical research. A judicious approach to using ChatGPT is therefore of utmost importance. Head and neck oncologists can reap the maximum benefit of this technology for patient care, education, and research to improve clinical outcomes.
Affiliation(s)
- Gautam Sarma: Department of Radiation Oncology, All India Institute of Medical Sciences Guwahati, Changsari, Assam, 781101, India
- Hrishikesh Kashyap: Department of Radiation Oncology, All India Institute of Medical Sciences Guwahati, Changsari, Assam, 781101, India
- Partha Pratim Medhi: Department of Radiation Oncology, All India Institute of Medical Sciences Guwahati, Changsari, Assam, 781101, India
35
Dallari V, Sacchetto A, Saetti R, Calabrese L, Vittadello F, Gazzini L. Is artificial intelligence ready to replace specialist doctors entirely? ENT specialists vs ChatGPT: 1-0, ball at the center. Eur Arch Otorhinolaryngol 2024; 281:995-1023. PMID: 37962570. DOI: 10.1007/s00405-023-08321-1.
Abstract
PURPOSE To evaluate ChatGPT's responses to Ear, Nose and Throat (ENT) clinical cases and compare them with the responses of ENT specialists. METHODS We hypothesized 10 scenarios, based on daily ENT experience, each with the same primary symptom, and constructed 20 clinical cases, 2 for each scenario. We presented them to 3 ENT specialists and to ChatGPT. The difficulty of the clinical cases was assessed by the 5 ENT authors of this article, who also evaluated the responses of ChatGPT for correctness and consistency with the responses of the 3 ENT experts. To verify the stability of ChatGPT's responses, we repeated the searches, always from the same account, for 5 consecutive days. RESULTS Among the 20 cases, 8 were rated as low complexity, 6 as moderate complexity, and 6 as high complexity. The overall mean correctness and consistency scores of the ChatGPT responses were 3.80 (SD 1.02) and 2.89 (SD 1.24), respectively. We did not find a statistically significant difference in the mean ChatGPT correctness and consistency scores according to case complexity. The total intraclass correlation coefficients (ICC) for the stability of the correctness and consistency of ChatGPT were 0.763 (95% confidence interval [CI] 0.553-0.895) and 0.837 (95% CI 0.689-0.927), respectively. CONCLUSIONS Our results reveal the potential usefulness of ChatGPT in ENT diagnosis. Instability in its responses and an inability to recognise certain clinical elements are its main limitations.
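The stability analysis in this abstract rests on the intraclass correlation coefficient. The paper does not specify which ICC form was computed, so as an illustration only, here is a one-way random-effects ICC(1,1) over hypothetical per-case scores from repeated days (rows are cases, columns are days):

```python
from statistics import mean

def icc_one_way(ratings):
    """One-way random-effects ICC(1,1) for n subjects x k ratings each.

    ICC = (MSB - MSW) / (MSB + (k - 1) * MSW), where MSB/MSW are the
    between- and within-subject mean squares.
    """
    n, k = len(ratings), len(ratings[0])
    grand_mean = mean(v for row in ratings for v in row)
    ms_between = k * sum((mean(row) - grand_mean) ** 2 for row in ratings) / (n - 1)
    ms_within = sum((v - mean(row)) ** 2 for row in ratings for v in row) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

# Hypothetical correctness scores for 4 cases rated on 3 consecutive days.
daily_scores = [[5, 5, 4], [3, 3, 3], [4, 5, 4], [2, 2, 3]]
stability = icc_one_way(daily_scores)
```

Values near 1 indicate that day-to-day variation is small relative to between-case variation, which is how an ICC of 0.763 is read as moderate-to-good stability.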
Affiliation(s)
- Virginia Dallari: Young Confederation of European ORL-HNS, Y-CEORL-HNS, Dublin, Ireland; Unit of Otorhinolaryngology, Head & Neck Department, University of Verona, Piazzale L.A. Scuro 10, 37134, Verona, Italy
- Andrea Sacchetto: Young Confederation of European ORL-HNS, Y-CEORL-HNS, Dublin, Ireland; Department of Otolaryngology, Ospedale San Bortolo, AULSS 8 Berica, Vicenza, Italy
- Roberto Saetti: Department of Otolaryngology, Ospedale San Bortolo, AULSS 8 Berica, Vicenza, Italy
- Luca Calabrese: Department of Otorhinolaryngology-Head and Neck Surgery, Hospital of Bolzano (SABES-ASDAA), Teaching Hospital of Paracelsus Medical University (PMU), Bolzano-Bozen, Italy
- Luca Gazzini: Young Confederation of European ORL-HNS, Y-CEORL-HNS, Dublin, Ireland; Department of Otorhinolaryngology-Head and Neck Surgery, Hospital of Bolzano (SABES-ASDAA), Teaching Hospital of Paracelsus Medical University (PMU), Bolzano-Bozen, Italy
36
Lechien JR, Georgescu BM, Hans S, Chiesa-Estomba CM. ChatGPT performance in laryngology and head and neck surgery: a clinical case-series. Eur Arch Otorhinolaryngol 2024; 281:319-333. PMID: 37874336. DOI: 10.1007/s00405-023-08282-5.
Abstract
OBJECTIVES To study the performance of ChatGPT in the management of laryngology and head and neck (LHN) cases. METHODS The history and clinical examination findings of patients consulting at an Otolaryngology-Head and Neck Surgery department were presented to ChatGPT, which was interrogated for differential diagnosis, management, and treatment. The ChatGPT performance was assessed by two blinded, board-certified otolaryngologists using the items of a composite score and the Ottawa Clinic Assessment Tool: differential diagnosis, additional examinations, and treatment options. The complexity of the clinical cases was evaluated with the Amsterdam Clinical Challenge Scale test. RESULTS Forty clinical cases were submitted to ChatGPT: 14 (35%) easy, 12 (30%) moderate, and 14 (35%) difficult. ChatGPT indicated a significantly higher number of additional examinations than practitioners (p = 0.001). There was significant agreement between practitioners and ChatGPT on the indication of some common examinations (audiometry, ultrasonography, biopsy, gastrointestinal endoscopy, and videofluoroscopy), but ChatGPT never indicated some important additional examinations (PET-CT, voice quality assessment, or impedance-pH monitoring). ChatGPT performed best in proposing the primary diagnosis (90%), the most plausible differential diagnoses (65%), and the therapeutic options (60-68%); its performance in indicating additional examinations was lowest. CONCLUSIONS ChatGPT is a promising adjunctive tool in LHN practice, providing extensive documentation about disease-related additional examinations, differential diagnoses, and treatments. It is more efficient in diagnosis and treatment than in selecting the most adequate additional examinations.
Affiliation(s)
- Jerome R Lechien: Research Committee of Young Otolaryngologists of the International Federation of Otorhinolaryngological Societies (IFOS), Paris, France; Division of Laryngology and Broncho-Esophagology, Department of Otolaryngology-Head Neck Surgery, UMONS Research Institute for Health Sciences and Technology, EpiCURA Hospital, University of Mons (UMons), Mons, Belgium; Department of Otorhinolaryngology and Head and Neck Surgery, School of Medicine, UFR Simone Veil, Foch Hospital, Université Versailles Saint-Quentin-en-Yvelines (Paris Saclay University), Paris, France; Department of Otorhinolaryngology and Head and Neck Surgery, CHU Saint-Pierre, Brussels, Belgium; Polyclinique Elsan de Poitiers, Poitiers, France; Department of Human Anatomy and Experimental Oncology, Faculty of Medicine, UMONS Research Institute for Health Sciences and Technology, Avenue du Champ de Mars 6, 7000, Mons, Belgium
- Bianca M Georgescu: Division of Laryngology and Broncho-Esophagology, Department of Otolaryngology-Head Neck Surgery, UMONS Research Institute for Health Sciences and Technology, EpiCURA Hospital, University of Mons (UMons), Mons, Belgium
- Stephane Hans: Research Committee of Young Otolaryngologists of the International Federation of Otorhinolaryngological Societies (IFOS), Paris, France; Department of Otorhinolaryngology and Head and Neck Surgery, School of Medicine, UFR Simone Veil, Foch Hospital, Université Versailles Saint-Quentin-en-Yvelines (Paris Saclay University), Paris, France
- Carlos M Chiesa-Estomba: Research Committee of Young Otolaryngologists of the International Federation of Otorhinolaryngological Societies (IFOS), Paris, France; Department of Otorhinolaryngology-Head & Neck Surgery, Donostia University Hospital-Biodonostia Research Institute, San Sebastian, Spain
37
Frosolini A, Franz L, Benedetti S, Vaira LA, de Filippis C, Gennaro P, Marioni G, Gabriele G. Assessing the accuracy of ChatGPT references in head and neck and ENT disciplines. Eur Arch Otorhinolaryngol 2023; 280:5129-5133. PMID: 37679532. DOI: 10.1007/s00405-023-08205-4.
Abstract
PURPOSE ChatGPT has gained popularity as a web application since its release in 2022. While the potential of artificial intelligence (AI) systems in scientific writing is widely discussed, their reliability in reviewing the literature and providing accurate references remains unexplored. This study examines the reliability of references generated by ChatGPT language models in the Head and Neck field. METHODS Twenty clinical questions were generated across different Head and Neck disciplines to prompt ChatGPT versions 3.5 and 4.0 to produce texts on the assigned topics. The generated references were categorized as "true," "erroneous," or "inexistent" based on congruence with existing records in scientific databases. RESULTS ChatGPT 4.0 outperformed version 3.5 in terms of reference reliability. However, both versions displayed a tendency to provide erroneous or non-existent references. CONCLUSIONS It is crucial to address this challenge to maintain the reliability of scientific literature. Journals and institutions should establish strategies and good-practice principles for the evolving landscape of AI-assisted scientific writing.
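Categorizing references as this study describes requires database lookups, but a first-pass syntactic screen of generated DOIs can be sketched in a few lines. This is an illustration, not the authors' method: the pattern checks DOI format only (prefix "10." plus a registrant code, then a suffix), and confirming that a reference actually exists would require querying a service such as Crossref or PubMed.

```python
import re

# Minimal plausibility pattern for DOI strings: "10.", 4-9 registrant digits,
# a slash, and a non-empty suffix. Syntax only -- a well-formed DOI can still
# be fabricated, which is exactly the failure mode the study reports.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

def looks_like_doi(candidate):
    """Return True if `candidate` is at least shaped like a DOI."""
    return bool(DOI_PATTERN.match(candidate.strip()))
```

A screen like this can cheaply discard malformed citations before the expensive step of checking each surviving DOI against a bibliographic database.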
Affiliation(s)
- Andrea Frosolini: Department of Maxillo-Facial Surgery, Policlinico Le Scotte, University of Siena, Siena, Italy
- Leonardo Franz: Phoniatrics and Audiology Unit, Department of Neuroscience DNS, University of Padova, Treviso, Italy; Artificial Intelligence in Medicine and Innovation in Clinical Research and Methodology (PhD Program), Department of Clinical and Experimental Sciences, University of Brescia, Brescia, Italy
- Simone Benedetti: Department of Maxillo-Facial Surgery, Policlinico Le Scotte, University of Siena, Siena, Italy
- Luigi Angelo Vaira: Maxillofacial Surgery Operative Unit, Department of Medicine, Surgery and Pharmacy, University of Sassari, Sassari, Italy; PhD School of Biomedical Sciences, Department of Biomedical Sciences, University of Sassari, Sassari, Italy
- Cosimo de Filippis: Phoniatrics and Audiology Unit, Department of Neuroscience DNS, University of Padova, Treviso, Italy
- Paolo Gennaro: Department of Maxillo-Facial Surgery, Policlinico Le Scotte, University of Siena, Siena, Italy
- Gino Marioni: Phoniatrics and Audiology Unit, Department of Neuroscience DNS, University of Padova, Treviso, Italy
- Guido Gabriele: Department of Maxillo-Facial Surgery, Policlinico Le Scotte, University of Siena, Siena, Italy
38
Pugliese G, Maccari A, Felisati E, Felisati G, Giudici L, Rapolla C, Pisani A, Saibene AM. Are artificial intelligence large language models a reliable tool for difficult differential diagnosis? An a posteriori analysis of a peculiar case of necrotizing otitis externa. Clin Case Rep 2023; 11:e7933. PMID: 37736475. PMCID: PMC10509342. DOI: 10.1002/ccr3.7933. Open access.
Abstract
Key Clinical Message: Large language models have made artificial intelligence readily available to the general public and potentially have a role in healthcare; however, their use in difficult differential diagnosis is still limited, as demonstrated by a case of necrotizing otitis externa. Abstract: This case report presents a peculiar case of necrotizing otitis externa (NOE) with skull base involvement that proved diagnostically challenging. The initial presentation and imaging of the 78-year-old patient suggested a neoplastic rhinopharyngeal lesion, and only after several unsuccessful biopsies was the patient transferred to our unit. Upon re-evaluation of the clinical picture, a clinical hypothesis of NOE with skull base erosion was made and confirmed by identifying Pseudomonas aeruginosa in biopsy specimens of skull base bone and external auditory canal skin. Upon confirmation of the diagnosis, the patient was treated with culture-directed long-term antibiotics, with complete resolution of the disease. Given the complex clinical presentation, we chose to submit this NOE case a posteriori to two large language models (LLMs) to test their ability to handle difficult differential diagnoses. LLMs are easily approachable artificial intelligence tools that enable human-like interaction with the user, relying on large information databases to analyze queries. The LLMs of choice were ChatGPT-3 and ChatGPT-4, and they were asked to analyze the case, being provided with only objective clinical and imaging data.
Affiliation(s)
- Giorgia Pugliese: Otolaryngology Unit, Santi Paolo e Carlo Hospital, Milan, Italy; Department of Health Sciences, Università degli Studi di Milano, Milan, Italy
- Alberto Maccari: Otolaryngology Unit, Santi Paolo e Carlo Hospital, Milan, Italy; Department of Health Sciences, Università degli Studi di Milano, Milan, Italy
- Elena Felisati: Otolaryngology Unit, Santi Paolo e Carlo Hospital, Milan, Italy; Department of Health Sciences, Università degli Studi di Milano, Milan, Italy
- Giovanni Felisati: Otolaryngology Unit, Santi Paolo e Carlo Hospital, Milan, Italy; Department of Health Sciences, Università degli Studi di Milano, Milan, Italy
- Leonardo Giudici: Otolaryngology Unit, Santi Paolo e Carlo Hospital, Milan, Italy; Department of Health Sciences, Università degli Studi di Milano, Milan, Italy
- Chiara Rapolla: Otolaryngology Unit, Santi Paolo e Carlo Hospital, Milan, Italy; Department of Health Sciences, Università degli Studi di Milano, Milan, Italy
- Antonia Pisani: Otolaryngology Unit, Santi Paolo e Carlo Hospital, Milan, Italy; Department of Health Sciences, Università degli Studi di Milano, Milan, Italy
- Alberto Maria Saibene: Otolaryngology Unit, Santi Paolo e Carlo Hospital, Milan, Italy; Department of Health Sciences, Università degli Studi di Milano, Milan, Italy