1.
Demirci A. A Comparison of ChatGPT and Human Questionnaire Evaluations of the Urological Cancer Videos Most Watched on YouTube. Clin Genitourin Cancer 2024;22:102145. PMID: 39033711. DOI: 10.1016/j.clgc.2024.102145.
Abstract
AIM To examine the reliability of ChatGPT in evaluating the quality of the medical content of the most-watched videos related to urological cancers on YouTube. MATERIAL AND METHODS In March 2024, a playlist was created of the 20 most-watched YouTube videos for each type of urological cancer. The video texts were evaluated by ChatGPT and by a urology specialist using the DISCERN-5 and Global Quality Scale (GQS) questionnaires. The results were compared using the Kruskal-Wallis test. RESULTS For the prostate, bladder, renal, and testicular cancer videos, the median (IQR) DISCERN-5 scores were 4 (1), 3 (0), 3 (2), and 3 (1) for the human evaluator (P = .11) and 3 (1.75), 3 (1), 3 (2), and 3 (0) for ChatGPT (P = .4); the corresponding GQS scores were 4 (1.75), 3 (0.75), 3.5 (2), and 3.5 (1) for the human evaluator (P = .12) and 4 (1), 3 (0.75), 3 (1), and 3.5 (1) for ChatGPT (P = .1), with no significant difference between the scores. The repeatability of the ChatGPT responses was similar across cancer types: 25% for prostate cancer, 30% for bladder cancer, 30% for renal cancer, and 35% for testicular cancer (P = .92). No statistically significant difference was found between the median (IQR) DISCERN-5 and GQS scores given by the human evaluator and ChatGPT for the content of videos about prostate, bladder, renal, and testicular cancer (P > .05). CONCLUSION Although ChatGPT is successful in evaluating the medical quality of video texts, the results should be interpreted with caution as their repeatability is low.
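The "median (IQR)" figures above are standard descriptive statistics for Likert-type questionnaire scores. A minimal stdlib-only Python sketch of how such values are computed, using hypothetical 1-5 DISCERN-5 ratings (the study's raw per-video scores are not reported in the abstract):

```python
import statistics

# Hypothetical DISCERN-5 ratings (1-5 scale), for illustration only;
# the abstract does not report the raw per-video scores.
scores = [2, 3, 3, 3, 4, 4, 4, 4, 5, 5]

median = statistics.median(scores)
# quantiles(n=4) returns the three quartiles (exclusive method by default)
q1, _, q3 = statistics.quantiles(scores, n=4)
iqr = q3 - q1  # interquartile range
print(f"median (IQR): {median} ({iqr})")
```

With these example ratings the summary would be reported as "4 (1.25)", mirroring the notation used in the abstract.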
Affiliation(s)
- Aykut Demirci
- Department of Urology, Dr. Abdurrahman Yurtaslan Ankara Oncology Training and Research Hospital, University of Health Sciences, Ankara, Turkey.
2.
Puerto Nino AK, Garcia Perez V, Secco S, De Nunzio C, Lombardo R, Tikkinen KAO, Elterman DS. Can ChatGPT provide high-quality patient information on male lower urinary tract symptoms suggestive of benign prostate enlargement? Prostate Cancer Prostatic Dis 2024. PMID: 38871841. DOI: 10.1038/s41391-024-00847-7.
Abstract
BACKGROUND ChatGPT has recently emerged as a novel resource for patients' disease-specific inquiries. There is, however, limited evidence assessing the quality of the information. We evaluated the accuracy and quality of ChatGPT's responses on male lower urinary tract symptoms (LUTS) suggestive of benign prostate enlargement (BPE) compared with two reference resources. METHODS Using patient information websites from the European Association of Urology and the American Urological Association as reference material, we formulated 88 BPE-centric questions for ChatGPT 4.0+. Independently and in duplicate, we compared ChatGPT's responses with the reference material, calculating accuracy through F1 score, precision, and recall metrics. We used a 5-point Likert scale for quality rating. We evaluated examiner agreement using the intraclass correlation coefficient (ICC) and assessed the difference in the quality scores with the Wilcoxon signed-rank test. RESULTS ChatGPT addressed all (88/88) LUTS/BPE-related questions. Across the 88 questions, the F1 score was 0.79 (range: 0-1), precision 0.66 (range: 0-1), and recall 0.97 (range: 0-1), and the quality score had a median of 4 (range: 1-5). Examiners had a good level of agreement (ICC = 0.86). We found no statistically significant difference between the examiners' scores for the overall quality of the responses (p = 0.72). DISCUSSION ChatGPT demonstrated potential utility in educating patients about BPE/LUTS, prognosis, and treatment, supporting the decision-making process. Prudence is warranted, however, before recommending it as the sole information outlet. Additional studies are needed to fully understand the extent of AI's efficacy in delivering patient education in urology.
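The reported F1 score is internally consistent with the reported precision and recall, since F1 is their harmonic mean. A quick stdlib-only Python check using the values from the abstract:

```python
# Precision and recall as reported in the abstract.
precision = 0.66
recall = 0.97

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 2))  # 0.79, matching the reported F1 score
```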
Affiliation(s)
- Angie K Puerto Nino
- Faculty of Medicine, University of Helsinki, Helsinki, Finland.
- Division of Urology, Department of Surgery, University of Toronto, Toronto, ON, Canada.
- Silvia Secco
- Department of Urology, Niguarda Hospital, Milan, Italy.
- Cosimo De Nunzio
- Urology Unit, Ospedale Sant'Andrea, La Sapienza University of Rome, Rome, Italy.
- Riccardo Lombardo
- Urology Unit, Ospedale Sant'Andrea, La Sapienza University of Rome, Rome, Italy.
- Kari A O Tikkinen
- Faculty of Medicine, University of Helsinki, Helsinki, Finland.
- Department of Urology, University of Helsinki and Helsinki University Hospital, Helsinki, Finland.
- Department of Surgery, South Karelian Central Hospital, Lappeenranta, Finland.
- Department of Health Research Methods, Evidence and Impact, McMaster University, Hamilton, ON, Canada.
- Dean S Elterman
- Division of Urology, Department of Surgery, University of Toronto, Toronto, ON, Canada.
3.
Geretto P, Lombardo R, Albisinni S, Turchi B, Campi R, De Cillis S, Vacca L, Pelizzari L, Gallo ML, Sampogna G, Giammo A, Li Marzi V, De Nunzio C. Quality of information and appropriateness of ChatGPT outputs for neuro-urology. Minerva Urol Nephrol 2024;76:138-140. PMID: 38742548. DOI: 10.23736/S2724-6051.24.05807-5.
Affiliation(s)
- Paolo Geretto
- Unit of Neuro-Urology, Città della Salute e della Scienza University Hospital, University of Turin, Turin, Italy.
- Riccardo Lombardo
- Unit of Urology, Sant'Andrea Hospital, Sapienza University, Rome, Italy.
- Simone Albisinni
- Unit of Urology, Department of Surgical Sciences, Tor Vergata University Hospital, Tor Vergata University of Rome, Rome, Italy.
- Beatrice Turchi
- Unit of Urology, Sant'Andrea Hospital, Sapienza University, Rome, Italy.
- Riccardo Campi
- Department of Minimally Invasive and Robotic Urologic Surgery, Careggi University Hospital, University of Florence, Florence, Italy.
- Sabrina De Cillis
- Division of Urology, Department of Oncology, San Luigi Gonzaga Hospital, University of Turin, Orbassano, Turin, Italy.
- Lorenzo Vacca
- Unit of Precision Gynecological Surgery, Dipartimento Centro di Eccellenza Donna e Bambino Nascente, Fatebenefratelli Gemelli Isola Tiberina, Rome, Italy.
- Laura Pelizzari
- Department of Rehabilitative Medicine, AUSL Piacenza, Fiorenzuola d'Arda, Piacenza, Italy.
- Maria L Gallo
- Department of Minimally Invasive and Robotic Urologic Surgery, Careggi University Hospital, University of Florence, Florence, Italy.
- Gianluca Sampogna
- Unit of Urology, Niguarda Hospital, University of Milan, Milan, Italy.
- Alessandro Giammo
- Unit of Neuro-Urology, Città della Salute e della Scienza University Hospital, University of Turin, Turin, Italy.
- Vincenzo Li Marzi
- Department of Minimally Invasive and Robotic Urologic Surgery, Careggi University Hospital, University of Florence, Florence, Italy.
- Cosimo De Nunzio
- Unit of Urology, Sant'Andrea Hospital, Sapienza University, Rome, Italy.
4.
Lombardo R, Gallo G, Stira J, Turchi B, Santoro G, Riolo S, Romagnoli M, Cicione A, Tema G, Pastore A, Al Salhi Y, Fuschi A, Franco G, Nacchia A, Tubaro A, De Nunzio C. Quality of information and appropriateness of Open AI outputs for prostate cancer. Prostate Cancer Prostatic Dis 2024. PMID: 38228809. DOI: 10.1038/s41391-024-00789-0.
Abstract
Chat-GPT, a natural language processing (NLP) tool created by Open-AI, can potentially be used as a quick source of information related to prostate cancer. This study aims to analyze the quality and appropriateness of Chat-GPT's responses to inquiries related to prostate cancer compared with the European Association of Urology (EAU) 2023 prostate cancer guidelines. Overall, 195 questions were prepared according to the recommendations gathered in the prostate cancer section of the EAU 2023 guideline. All questions were systematically presented to the August 3 version of Chat-GPT, and two expert urologists independently assessed each response and assigned a score ranging from 1 to 4 (1: completely correct; 2: correct but inadequate; 3: a mix of correct and misleading information; 4: completely incorrect). Sub-analyses per chapter and per grade of recommendation were performed. Overall, 195 recommendations were evaluated: 50/195 (26%) were completely correct, 51/195 (26%) correct but inadequate, 47/195 (24%) a mix of correct and misleading information, and 47/195 (24%) completely incorrect. Across chapters, Chat-GPT was particularly accurate in answering questions on follow-up and QoL. The worst performance was recorded for the diagnosis and treatment chapters, with 19% and 30% of the answers completely incorrect, respectively. When looking at the strength of recommendation, no difference in accuracy was recorded between weak and strong recommendations (p > 0.05). Chat-GPT has poor accuracy when answering questions on the PCa EAU guideline recommendations. Future studies should assess its performance after adequate training.
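The per-category counts and percentages above can be sanity-checked with a few lines of stdlib Python (counts taken directly from the abstract):

```python
# Score distribution for the 195 evaluated recommendations, from the abstract.
counts = {
    "completely correct": 50,
    "correct but inadequate": 51,
    "mix of correct and misleading": 47,
    "completely incorrect": 47,
}

total = sum(counts.values())
percentages = {k: round(100 * v / total) for k, v in counts.items()}
print(total)        # 195, matching the number of questions
print(percentages)  # 26 / 26 / 24 / 24, matching the reported percentages
```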
Affiliation(s)
- Giacomo Gallo
- Department of Urology, 'Sapienza' University of Rome, Rome, Italy.
- Jordi Stira
- Department of Urology, 'Sapienza' University of Rome, Rome, Italy.
- Beatrice Turchi
- Department of Urology, 'Sapienza' University of Rome, Rome, Italy.
- Giuseppe Santoro
- Department of Urology, 'Sapienza' University of Rome, Rome, Italy.
- Sara Riolo
- Department of Urology, 'Sapienza' University of Rome, Rome, Italy.
- Matteo Romagnoli
- Department of Urology, 'Sapienza' University of Rome, Rome, Italy.
- Antonio Cicione
- Department of Urology, 'Sapienza' University of Rome, Rome, Italy.
- Giorgia Tema
- Department of Urology, 'Sapienza' University of Rome, Rome, Italy.
- Antonio Pastore
- Department of Urology, 'Sapienza' University of Rome, Rome, Italy.
- Yazan Al Salhi
- Department of Urology, 'Sapienza' University of Rome, Rome, Italy.
- Andrea Fuschi
- Department of Urology, 'Sapienza' University of Rome, Rome, Italy.
- Giorgio Franco
- Department of Urology, 'Sapienza' University of Rome, Rome, Italy.
- Antonio Nacchia
- Department of Urology, 'Sapienza' University of Rome, Rome, Italy.
- Andrea Tubaro
- Department of Urology, 'Sapienza' University of Rome, Rome, Italy.
- Cosimo De Nunzio
- Department of Urology, 'Sapienza' University of Rome, Rome, Italy.