1. Park SH, Pinto-Powell R, Thesen T, Lindqwister A, Levy J, Chacko R, Gonzalez D, Bridges C, Schwendt A, Byrum T, Fong J, Shasavari S, Hassanpour S. Preparing healthcare leaders of the digital age with an integrative artificial intelligence curriculum: a pilot study. Med Educ Online 2024;29:2315684. PMID: 38351737; PMCID: PMC10868429; DOI: 10.1080/10872981.2024.2315684.
Abstract
Artificial intelligence (AI) is rapidly being introduced into the clinical workflow of many specialties. Despite the need to train physicians who understand the utility and implications of AI and to mitigate a growing skills gap, no established consensus exists on how best to introduce AI concepts to medical students during preclinical training. This study examined the effectiveness of a pilot Digital Health Scholars (DHS) non-credit enrichment elective that paralleled the Dartmouth Geisel School of Medicine's first-year preclinical curriculum, introducing AI algorithms and their applications in the concurrently occurring systems blocks. From September 2022 to March 2023, ten self-selected first-year students enrolled in the elective, which ran in parallel with four existing curricular blocks (Immunology, Hematology, Cardiology, and Pulmonology). Each DHS block consisted of a journal club, a live-coding demonstration, and an integration session led by a researcher in that field. Students' confidence in explaining the content objectives (high-level knowledge, implications, and limitations of AI) was measured before and after each block and compared using Mann-Whitney U tests. Students reported significant increases in confidence after all four blocks (Immunology: U = 4.5, p = 0.030; Hematology: U = 1.0, p = 0.009; Cardiology: U = 4.0, p = 0.019; Pulmonology: U = 4.0, p = 0.030), as well as an average overall satisfaction of 4.29/5 for the curriculum content. Our study demonstrates that a digital health enrichment elective that runs in parallel to an institution's preclinical curriculum and embeds AI concepts into relevant clinical topics can enhance students' confidence in describing content objectives pertaining to high-level algorithmic understanding, implications, and limitations of the studied models. Building on this elective curricular design, further studies with larger enrollments can help determine the most effective approach to preparing future physicians for an AI-enhanced clinical workflow.
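The pre/post confidence comparison described in this abstract rests on the Mann-Whitney U statistic. The following is a minimal stdlib-only sketch of how that statistic is computed for Likert-style ratings; the rating values are hypothetical placeholders, not the study's data.

```python
def mann_whitney_u(x, y):
    """Mann-Whitney U statistic for sample x versus sample y.

    Counts pairs where x beats y, with ties contributing half a count.
    (Real analyses, as in the study, would also derive a p-value.)
    """
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u


# Hypothetical 1-5 Likert confidence ratings for one block, pre vs. post.
pre = [2, 2, 3, 1, 2, 3, 2, 1, 2, 3]
post = [4, 5, 4, 4, 3, 5, 4, 4, 5, 4]

print(mann_whitney_u(pre, post))  # small U: pre ratings rarely exceed post
```

A useful sanity check on any implementation: the two one-sided U statistics must sum to the number of pairs, n1 * n2.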
Affiliation(s)
- Soo Hwan Park, Thomas Thesen, Joshua Levy, Rachael Chacko, Connor Bridges, Adam Schwendt, Travis Byrum, Justin Fong: Geisel School of Medicine at Dartmouth, Hanover, NH, USA
2. Zhui L, Fenghe L, Xuehu W, Qining F, Wei R. Ethical Considerations and Fundamental Principles of Large Language Models in Medical Education: A Viewpoint. J Med Internet Res 2024. PMID: 38971715; DOI: 10.2196/60083.
Abstract
This viewpoint article first explores the ethical challenges associated with the future application of large language models (LLMs) in medical education. These include concerns related to the development of LLMs, such as artificial intelligence (AI) hallucinations, information bias, privacy and data risks, and deficiencies in transparency and interpretability, as well as issues concerning their application, including deficiencies in emotional intelligence, educational inequities, problems with academic integrity, and questions of responsibility and copyright ownership. The paper then analyzes existing AI-related legal and ethical frameworks and highlights their limitations with regard to LLMs in medical education. To ensure that LLMs are integrated responsibly and safely, the authors recommend developing a unified ethical framework tailored specifically to LLMs in this field, based on eight fundamental principles: quality control and supervision mechanisms; privacy and data protection; transparency and interpretability; fairness and equal treatment; academic integrity and moral norms; accountability and traceability; protection of and respect for intellectual property; and the promotion of educational research and innovation. The authors further discuss specific measures for implementing these principles, laying a foundation for a comprehensive and actionable ethical framework. Such a framework can provide clear guidance and support for applying LLMs in medical education, helping to balance technological advancement with ethical safeguards so that medical education can progress without compromising fairness, justice, or patient safety, and establishing a more equitable, safer, and more efficient educational environment.
Affiliation(s)
- Li Zhui, Li Fenghe, Wang Xuehu, Fu Qining, Ren Wei: Department of Vascular Surgery, The First Affiliated Hospital of Chongqing Medical University, No. 1 of Youyi Road, Yuzhong District, Chongqing, China
3. Yasaka K, Kanzawa J, Kanemaru N, Koshino S, Abe O. Fine-Tuned Large Language Model for Extracting Patients on Pretreatment for Lung Cancer from a Picture Archiving and Communication System Based on Radiological Reports. J Imaging Inform Med 2024. PMID: 38955964; DOI: 10.1007/s10278-024-01186-8.
Abstract
This study investigated the performance of a fine-tuned large language model (LLM) in extracting patients on pretreatment for lung cancer from a picture archiving and communication system (PACS) and compared it with that of radiologists. Patients whose radiological reports contained the term lung cancer (3111 for training, 124 for validation, and 288 for testing) were included in this retrospective study. Based on the clinical indication and diagnosis sections of the radiological report (used as input data), they were classified into four groups (used as reference data): group 0 (no lung cancer), group 1 (pretreatment lung cancer present), group 2 (after treatment for lung cancer), and group 3 (planning radiation therapy). Using the training and validation datasets, fine-tuning of the pretrained LLM was conducted ten times. Due to group imbalance, group 2 data were undersampled during training. The model that performed best on the validation dataset was then assessed on the independent test dataset. For testing purposes, two radiologists (readers 1 and 2) also classified the radiological reports. The overall accuracy of the fine-tuned LLM, reader 1, and reader 2 was 0.983, 0.969, and 0.969, respectively. The sensitivity for differentiating groups 0/1/2/3 by the LLM, reader 1, and reader 2 was 1.000/0.948/0.991/1.000, 0.750/0.879/0.996/1.000, and 1.000/0.931/0.978/1.000, respectively. The time required for classification by the LLM, reader 1, and reader 2 was 46 s, 2539 s, and 1538 s, respectively. The fine-tuned LLM effectively extracted patients on pretreatment for lung cancer from the PACS, with performance comparable to radiologists in a substantially shorter time.
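The overall accuracy and per-group sensitivity figures reported here follow directly from a 4x4 confusion matrix over the reference groups. A minimal sketch of those two computations; the counts below are invented for illustration, not the study's results.

```python
# Hypothetical confusion matrix: rows = reference group, cols = predicted group.
# Groups: 0 = no lung cancer, 1 = pretreatment, 2 = after treatment, 3 = radiation planning.
confusion = [
    [20, 1, 0, 0],
    [0, 110, 5, 1],
    [2, 3, 130, 0],
    [0, 0, 1, 15],
]


def overall_accuracy(cm):
    """Fraction of all cases on the diagonal (predicted group == reference group)."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


def sensitivity(cm, group):
    """Fraction of reference cases in `group` that were predicted as `group`."""
    return cm[group][group] / sum(cm[group])


print(overall_accuracy(confusion))
print([sensitivity(confusion, g) for g in range(4)])
```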
Affiliation(s)
- Koichiro Yasaka, Jun Kanzawa, Noriko Kanemaru, Saori Koshino, Osamu Abe: Department of Radiology, Graduate School of Medicine, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8655, Japan
4. Bortoli M, Fiore M, Tedeschi S, Oliveira V, Sousa R, Bruschi A, Campanacci DA, Viale P, De Paolis M, Sambri A. GPT-based chatbot tools are still unreliable in the management of prosthetic joint infections. Musculoskelet Surg 2024. PMID: 38954323; DOI: 10.1007/s12306-024-00846-w.
Abstract
BACKGROUND Artificial intelligence chatbot tools might discern patterns and correlations that elude human observation, potentially leading to more accurate and timely interventions. However, their reliability in answering healthcare-related questions is still debated. This study aimed to assess the performance of three versions of GPT-based chatbots on prosthetic joint infections (PJI). METHODS Thirty questions concerning the diagnosis and treatment of hip and knee PJIs, stratified by a priori established difficulty, were generated by a team of experts and administered to ChatGPT 3.5, BingChat, and ChatGPT 4.0. Responses were rated by three orthopedic surgeons and two infectious diseases physicians using a five-point Likert-like scale with numerical values to quantify the quality of responses. Inter-rater reliability was assessed with intraclass correlation statistics. RESULTS Responses averaged "good-to-very good" for all chatbots examined, in both diagnosis and treatment, with no significant differences according to question difficulty. However, BingChat's ratings were significantly lower in the treatment setting (p = 0.025), particularly in terms of accuracy (p = 0.02) and completeness (p = 0.004). Agreement in ratings among examiners was very poor. CONCLUSIONS On average, experts rated the quality of responses positively, but ratings frequently varied widely. This suggests that AI chatbot tools are still unreliable in the management of PJI.
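Inter-rater reliability of Likert-style ratings like these is commonly summarized with an intraclass correlation coefficient (ICC). One common formulation is the one-way random-effects ICC(1,1); the sketch below implements it from its ANOVA definition, with made-up ratings rather than the study's data.

```python
def icc_oneway(ratings):
    """ICC(1,1): one-way random-effects intraclass correlation.

    `ratings` is a list of rated items, each a list of k raters' scores.
    ICC = (MSB - MSW) / (MSB + (k - 1) * MSW), where MSB is the
    between-item mean square and MSW the within-item mean square.
    """
    n = len(ratings)
    k = len(ratings[0])
    grand = sum(sum(r) for r in ratings) / (n * k)
    item_means = [sum(r) / k for r in ratings]
    msb = k * sum((m - grand) ** 2 for m in item_means) / (n - 1)
    msw = sum(
        (x - m) ** 2 for r, m in zip(ratings, item_means) for x in r
    ) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


# Hypothetical ratings: 4 chatbot answers, each scored by 3 examiners.
ratings = [[4, 5, 3], [2, 2, 1], [5, 4, 5], [3, 3, 2]]
print(icc_oneway(ratings))
```

Widely scattered scores for the same answer push MSW up and the ICC down, which is how "very poor" agreement among examiners shows up numerically.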
Affiliation(s)
- M Bortoli, A Bruschi, M De Paolis, A Sambri: Orthopedic and Traumatology Unit, IRCCS Azienda Ospedaliero-Universitaria Di Bologna, 40138, Bologna, Italy
- M Fiore: Orthopedic and Traumatology Unit, IRCCS Azienda Ospedaliero-Universitaria Di Bologna, 40138, Bologna, Italy; Department of Medical and Surgical Sciences, Alma Mater Studiorum University of Bologna, 40138, Bologna, Italy
- S Tedeschi, P Viale: Department of Medical and Surgical Sciences, Alma Mater Studiorum University of Bologna, 40138, Bologna, Italy; Infectious Disease Unit, Department for Integrated Infectious Risk Management, IRCCS Azienda Ospedaliero-Universitaria Di Bologna, 40138, Bologna, Italy
- V Oliveira, R Sousa: Department of Orthopedics, Centro Hospitalar Universitário de Santo António, 4099-001, Porto, Portugal
- D A Campanacci: Orthopedic Oncology Unit, Azienda Ospedaliera Universitaria Careggi, 50134, Florence, Italy
5. Keshavarz P, Bagherieh S, Nabipoorashrafi SA, Chalian H, Rahsepar AA, Kim GHJ, Hassani C, Raman SS, Bedayat A. ChatGPT in radiology: A systematic review of performance, pitfalls, and future perspectives. Diagn Interv Imaging 2024;105:251-265. PMID: 38679540; DOI: 10.1016/j.diii.2024.04.003.
Abstract
PURPOSE The purpose of this study was to systematically review the reported performances of ChatGPT, identify potential limitations, and explore future directions for its integration, optimization, and ethical considerations in radiology applications. MATERIALS AND METHODS After a comprehensive search of the PubMed, Web of Science, Embase, and Google Scholar databases, published studies using ChatGPT for clinical radiology applications were identified up to January 1, 2024. RESULTS Of the 861 studies retrieved, 44 evaluated the performance of ChatGPT; among these, 37 (84.1%) demonstrated high performance, and seven (15.9%) indicated lower performance in providing information on diagnosis and clinical decision support (6/44; 13.6%) and patient communication and educational content (1/44; 2.3%). Twenty-four (54.5%) studies quantified ChatGPT's performance: 19 (79.2%) recorded a median accuracy of 70.5%, and five (20.8%) reported a median agreement of 83.6% between ChatGPT outcomes and reference standards (radiologists' decisions or guidelines), generally confirming ChatGPT's high accuracy in these studies. Eleven studies compared two recent ChatGPT versions, and in ten (90.9%), ChatGPT v4 outperformed v3.5, showing notable enhancements in addressing higher-order thinking questions, better comprehension of radiology terms, and improved accuracy in describing images. Risks and concerns included biased responses, limited originality, and the potential for inaccurate information leading to misinformation, hallucinations, improper citations and fake references, cybersecurity vulnerabilities, and patient privacy risks. CONCLUSION Although ChatGPT's effectiveness has been shown in 84.1% of radiology studies, multiple pitfalls and limitations remain to be addressed. It is too soon to confirm its complete proficiency and accuracy; more extensive multicenter studies using diverse datasets and pre-training techniques are required to verify ChatGPT's role in radiology.
Affiliation(s)
- Pedram Keshavarz: Department of Radiological Sciences, David Geffen School of Medicine, University of California, Los Angeles (UCLA), Los Angeles, CA 90095, USA; School of Science and Technology, The University of Georgia, Tbilisi 0171, Georgia
- Sara Bagherieh: Independent Clinical Radiology Researcher, Los Angeles, CA 90024, USA
- Hamid Chalian: Department of Radiology, Cardiothoracic Imaging, University of Washington, Seattle, WA 98195, USA
- Amir Ali Rahsepar, Cameron Hassani, Steven S Raman, Arash Bedayat: Department of Radiological Sciences, David Geffen School of Medicine, University of California, Los Angeles (UCLA), Los Angeles, CA 90095, USA
- Grace Hyun J Kim: Department of Radiological Sciences, David Geffen School of Medicine, University of California, Los Angeles (UCLA), Los Angeles, CA 90095, USA; Department of Radiological Sciences, Center for Computer Vision and Imaging Biomarkers, University of California, Los Angeles (UCLA), Los Angeles, CA 90095, USA
6. Aburumman R, Al Annan K, Mrad R, Brunaldi VO, Gala K, Abu Dayyeh BK. Assessing ChatGPT vs. Standard Medical Resources for Endoscopic Sleeve Gastroplasty Education: A Medical Professional Evaluation Study. Obes Surg 2024;34:2718-2724. PMID: 38758515; DOI: 10.1007/s11695-024-07283-5.
Abstract
BACKGROUND AND AIMS The Chat Generative Pre-Trained Transformer (ChatGPT) represents a significant advancement in artificial intelligence (AI) chatbot technology. While ChatGPT offers promising capabilities, concerns remain about its reliability and accuracy. This study aimed to evaluate ChatGPT's responses to patients' frequently asked questions about endoscopic sleeve gastroplasty (ESG). METHODS Expert gastroenterologists and bariatric surgeons with experience in ESG were invited to evaluate ChatGPT-generated answers to eight ESG-related questions, alongside answers sourced from hospital websites. The evaluation criteria included ease of understanding, scientific accuracy, and overall answer satisfaction. Evaluators were also tasked with discerning whether each response was AI-generated. RESULTS Twelve medical professionals with expertise in ESG participated, 83.3% of whom had experience performing the procedure independently. The entire cohort possessed substantial knowledge about ESG. ChatGPT's utility among participants, rated on a scale of one to five, averaged 2.75. The raters demonstrated a 54% accuracy rate in distinguishing AI-generated responses, with a sensitivity of 39% and a specificity of 60%, averaging 17.6 correct identifications out of a possible 31. Overall, there were no significant differences between AI-generated and non-AI responses in scientific accuracy, understandability, and satisfaction, with one notable exception: for the question defining ESG, the AI-generated definition scored higher in scientific accuracy (4.33 vs. 3.61, p = 0.007) and satisfaction (4.33 vs. 3.58, p = 0.009) than the non-AI versions. CONCLUSIONS This study underscores ChatGPT's efficacy in providing medical information on ESG, demonstrating its comparability to traditional sources in scientific accuracy.
Affiliation(s)
- Razan Aburumman, Karim Al Annan, Rudy Mrad, Vitor O Brunaldi, Khushboo Gala, Barham K Abu Dayyeh: Division of Gastroenterology and Hepatology, Mayo Clinic, 200 First Street SW, Rochester, MN, 55905, USA
7. Lecler A, Soyer P, Gong B. The potential and pitfalls of ChatGPT in radiology. Diagn Interv Imaging 2024;105:249-250. PMID: 38811261; DOI: 10.1016/j.diii.2024.05.003.
Affiliation(s)
- Augustin Lecler: Department of Neuroradiology, Foundation Adolphe de Rothschild Hospital, 75019, Paris, France; Université Paris Cité, Faculté de Médecine, 75006, Paris, France
- Philippe Soyer: Université Paris Cité, Faculté de Médecine, 75006, Paris, France; Department of Radiology, Hôpital Cochin, AP-HP, 75014, Paris, France
- Bo Gong: Department of Radiology, University of British Columbia, Vancouver, BC, V6T 1M9, Canada
8. Kumar RP, Sivan V, Bachir H, Sarwar SA, Ruzicka F, O'Malley GR, Lobo P, Morales IC, Cassimatis ND, Hundal JS, Patel NV. Can Artificial Intelligence Mitigate Missed Diagnoses by Generating Differential Diagnoses for Neurosurgeons? World Neurosurg 2024;187:e1083-e1088. PMID: 38759788; DOI: 10.1016/j.wneu.2024.05.052.
Abstract
BACKGROUND/OBJECTIVE Neurosurgery emphasizes the criticality of accurate differential diagnoses, with diagnostic delays posing significant health and economic challenges. As large language models (LLMs) emerge as transformative tools in healthcare, this study seeks to elucidate their role in assisting neurosurgeons with the differential diagnosis process, especially during preliminary consultations. METHODS This study employed three chat-based LLMs, ChatGPT (versions 3.5 and 4.0), Perplexity AI, and Bard AI, to evaluate their diagnostic accuracy. Each LLM was prompted using clinical vignettes, and their responses were recorded to generate differential diagnoses for 20 common and uncommon neurosurgical disorders. Disease-specific prompts were crafted using DynaMed, a clinical reference tool. The accuracy of the LLMs was determined by their ability to correctly identify the target disease within their top differential diagnoses. RESULTS For the initial differential, ChatGPT 3.5 achieved an accuracy of 52.63%, while ChatGPT 4.0 performed slightly better at 53.68%. Perplexity AI and Bard AI demonstrated 40.00% and 29.47% accuracy, respectively. As the number of considered differentials increased from 2 to 5, ChatGPT 3.5 reached its peak accuracy of 77.89% for the top 5 differentials. Bard AI and Perplexity AI had varied performances, with Bard AI improving to 62.11% for the top 5 differentials. On a disease-specific note, the LLMs excelled in diagnosing conditions like epilepsy and cervical spine stenosis but struggled with more complex diseases such as moyamoya disease and amyotrophic lateral sclerosis. CONCLUSIONS LLMs show potential to enhance diagnostic accuracy and decrease the incidence of missed diagnoses in neurosurgery.
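The accuracy metric described here, whether the target disease appears within the model's top-k differentials, reduces to a membership check over ranked lists. A minimal sketch; the cases and disease names are placeholders, not the study's vignettes.

```python
def top_k_accuracy(cases, k):
    """Fraction of cases whose target diagnosis appears in the top-k differential.

    `cases` is a list of (target, ranked_differential) pairs, where the
    differential is ordered from most to least likely.
    """
    hits = sum(1 for target, differential in cases if target in differential[:k])
    return hits / len(cases)


# Hypothetical (target, ranked differential) pairs for illustration.
cases = [
    ("epilepsy", ["epilepsy", "syncope", "migraine"]),
    ("moyamoya disease", ["stroke", "vasculitis", "moyamoya disease"]),
    ("cervical spine stenosis", ["cervical spine stenosis", "disc herniation", "ALS"]),
]

print(top_k_accuracy(cases, 1))
print(top_k_accuracy(cases, 3))
```

Widening k can only raise the score, which matches the abstract's pattern of accuracy climbing as more differentials are considered.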
Affiliation(s)
- Rohit Prem Kumar, Vijay Sivan, Hanin Bachir, Syed A Sarwar, Francis Ruzicka, Geoffrey R O'Malley, Paulo Lobo, Ilona Cazorla Morales, Nicholas D Cassimatis: Department of Neurosurgery, Hackensack Meridian School of Medicine, Nutley, New Jersey, USA
- Jasdeep S Hundal: Department of Neurology, HMH-Jersey Shore University Medical Center, Neptune, New Jersey, USA
- Nitesh V Patel: Department of Neurosurgery, Hackensack Meridian School of Medicine, Nutley, New Jersey, USA; Department of Neurosurgery, HMH-Jersey Shore University Medical Center, Neptune, New Jersey, USA
9. Suleiman A, von Wedel D, Munoz-Acuna R, Redaelli S, Santarisi A, Seibold EL, Ratajczak N, Kato S, Said N, Sundar E, Goodspeed V, Schaefer MS. Assessing ChatGPT's ability to emulate human reviewers in scientific research: A descriptive and qualitative approach. Comput Methods Programs Biomed 2024;254:108313. PMID: 38954915; DOI: 10.1016/j.cmpb.2024.108313.
Abstract
BACKGROUND ChatGPT is an AI platform whose relevance in the peer review of scientific articles is steadily growing. Nonetheless, it has sparked debates over its potential biases and inaccuracies. This study aims to assess ChatGPT's ability to qualitatively emulate human reviewers in scientific research. METHODS We included the first submitted versions of the latest twenty original research articles published by July 3, 2023, in a high-profile medical journal. Each article underwent evaluation by a minimum of three human reviewers during the initial review stage. Subsequently, three researchers with medical backgrounds and expertise in manuscript revision independently and qualitatively assessed the agreement between the peer reviews generated by ChatGPT version GPT-4 and the comments provided by human reviewers for these articles. The level of agreement was categorized as complete, partial, none, or contradictory. RESULTS A total of 720 human reviewer comments were assessed. Agreement among the three assessors was good (overall kappa > 0.6). ChatGPT's comments demonstrated complete agreement in quality and substance with 48 (6.7%) of the human reviewers' comments; partial agreement with 92 (12.8%), identifying issues that necessitated further elaboration or recommending supplementary steps to address concerns; no agreement with 565 (78.5%); and contradiction of 15 (2.1%). ChatGPT's comments on methods had the lowest proportion of complete agreement (13 comments, 3.6%), while general comments on the manuscript displayed the highest (17 comments, 22.1%). CONCLUSION ChatGPT version GPT-4 has a limited ability to emulate human reviewers within the peer review process of scientific research.
Affiliation(s)
- Aiman Suleiman: Department of Anesthesia, Critical Care and Pain Medicine, Harvard Medical School, Beth Israel Deaconess Medical Center, Boston, MA, USA; Center for Anesthesia Research Excellence (CARE), Harvard Medical School, Beth Israel Deaconess Medical Center, Boston, MA, USA; Department of Anesthesia, Critical Care and Pain Medicine, Albert Einstein College of Medicine, Montefiore Medical Center, Bronx, NY, USA
- Dario von Wedel, Ricardo Munoz-Acuna, Simone Redaelli, Eva-Lotte Seibold, Nikolai Ratajczak, Shinichiro Kato, Valerie Goodspeed: Department of Anesthesia, Critical Care and Pain Medicine, and Center for Anesthesia Research Excellence (CARE), Harvard Medical School, Beth Israel Deaconess Medical Center, Boston, MA, USA
- Abeer Santarisi: Center for Anesthesia Research Excellence (CARE), Harvard Medical School, Beth Israel Deaconess Medical Center, Boston, MA, USA; Department of Emergency Medicine, Disaster Medicine Fellowship, Harvard Medical School, Beth Israel Deaconess Medical Center, Boston, MA, USA
- Nader Said: Department of Industrial Engineering, Faculty of Engineering Technologies and Sciences, Higher Colleges of Technology, DWC, Dubai, United Arab Emirates
- Eswar Sundar: Department of Anesthesia, Critical Care and Pain Medicine, Harvard Medical School, Beth Israel Deaconess Medical Center, Boston, MA, USA
- Maximilian S Schaefer: Department of Anesthesia, Critical Care and Pain Medicine, and Center for Anesthesia Research Excellence (CARE), Harvard Medical School, Beth Israel Deaconess Medical Center, Boston, MA, USA; Klinik für Anästhesiologie, Universitätsklinikum Düsseldorf, Düsseldorf, Germany
10. Dehdab R, Brendlin A, Werner S, Almansour H, Gassenmaier S, Brendel JM, Nikolaou K, Afat S. Evaluating ChatGPT-4V in chest CT diagnostics: a critical image interpretation assessment. Jpn J Radiol 2024. PMID: 38867035; DOI: 10.1007/s11604-024-01606-3.
Abstract
PURPOSE To assess the diagnostic accuracy of ChatGPT-4V in interpreting a set of four chest CT slices for each case of COVID-19, non-small cell lung cancer (NSCLC), and control cases, thereby evaluating its potential as an AI tool in radiological diagnostics. MATERIALS AND METHODS In this retrospective study, 60 CT scans from The Cancer Imaging Archive, covering COVID-19, NSCLC, and control cases, were analyzed using ChatGPT-4V. A radiologist selected four CT slices from each scan for evaluation. ChatGPT-4V's interpretations were compared against the gold standard diagnoses and assessed by two radiologists. Statistical analyses focused on accuracy, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV), along with an examination of the impact of pathology location and lobe involvement. RESULTS ChatGPT-4V showed an overall diagnostic accuracy of 56.76%. For NSCLC, sensitivity was 27.27% and specificity was 60.47%. In COVID-19 detection, sensitivity was 13.64% and specificity was 64.29%. For control cases, sensitivity was 31.82%, with a specificity of 95.24%. The highest sensitivity (83.33%) was observed in cases involving all lung lobes. The chi-squared statistical analysis indicated significant differences in sensitivity across categories and in relation to the location and lobar involvement of pathologies. CONCLUSION ChatGPT-4V demonstrated variable diagnostic performance in chest CT interpretation, with notable proficiency in specific scenarios. This underscores the challenges of cross-modal AI models like ChatGPT-4V in radiology, pointing toward significant areas for improvement to ensure dependability. The study emphasizes the importance of enhancing these models for broader, more reliable medical use.
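The accuracy, sensitivity, specificity, PPV, and NPV figures reported in this abstract all derive from 2x2 confusion-matrix counts; a minimal sketch of the standard formulas (the counts below are hypothetical, not the study's data):

```python
def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard diagnostic-accuracy metrics from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),  # true-positive rate
        "specificity": tn / (tn + fp),  # true-negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }

# Hypothetical counts for illustration only
metrics = diagnostic_metrics(tp=8, fp=2, tn=9, fn=1)
print(metrics)
```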
Affiliation(s)
- Reza Dehdab
- Department of Diagnostic and Interventional Radiology, Tuebingen University Hospital, Hoppe-Seyler-Straße 3, 72076, Tuebingen, Germany.
| | - Andreas Brendlin
- Department of Diagnostic and Interventional Radiology, Tuebingen University Hospital, Hoppe-Seyler-Straße 3, 72076, Tuebingen, Germany
| | - Sebastian Werner
- Department of Diagnostic and Interventional Radiology, Tuebingen University Hospital, Hoppe-Seyler-Straße 3, 72076, Tuebingen, Germany
| | - Haidara Almansour
- Department of Diagnostic and Interventional Radiology, Tuebingen University Hospital, Hoppe-Seyler-Straße 3, 72076, Tuebingen, Germany
| | - Sebastian Gassenmaier
- Department of Diagnostic and Interventional Radiology, Tuebingen University Hospital, Hoppe-Seyler-Straße 3, 72076, Tuebingen, Germany
| | - Jan Michael Brendel
- Department of Diagnostic and Interventional Radiology, Tuebingen University Hospital, Hoppe-Seyler-Straße 3, 72076, Tuebingen, Germany
| | - Konstantin Nikolaou
- Department of Diagnostic and Interventional Radiology, Tuebingen University Hospital, Hoppe-Seyler-Straße 3, 72076, Tuebingen, Germany
| | - Saif Afat
- Department of Diagnostic and Interventional Radiology, Tuebingen University Hospital, Hoppe-Seyler-Straße 3, 72076, Tuebingen, Germany
11
Park J, Oh K, Han K, Lee YH. Patient-centered radiology reports with generative artificial intelligence: adding value to radiology reporting. Sci Rep 2024; 14:13218. [PMID: 38851825 PMCID: PMC11162416 DOI: 10.1038/s41598-024-63824-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/30/2023] [Accepted: 06/03/2024] [Indexed: 06/10/2024] Open
Abstract
The purposes of this study were to assess the efficacy of AI-generated radiology reports in terms of report summary, patient-friendliness, and recommendations, and to evaluate the consistency of report quality and accuracy, contributing to the advancement of radiology workflow. A total of 685 spine MRI reports were retrieved from our hospital database. AI-generated radiology reports were generated in three formats: (1) summary reports, (2) patient-friendly reports, and (3) recommendations. The occurrence of artificial hallucinations was evaluated in the AI-generated reports. Two radiologists conducted qualitative and quantitative assessments considering the original report as a standard reference. Two non-physician raters assessed their understanding of the content of original and patient-friendly reports using a 5-point Likert scale. The AI-generated radiology reports received high average scores overall across all three formats. The average comprehension score for the original reports was 2.71 ± 0.73, while the score for the patient-friendly reports significantly increased to 4.69 ± 0.48 (p < 0.001). There were 1.12% artificial hallucinations and 7.40% potentially harmful translations. In conclusion, the potential benefits of using generative AI assistants to generate these reports include improved report quality; greater efficiency in the radiology workflow for producing summaries, patient-centered reports, and recommendations; and a move toward patient-centered radiology.
Affiliation(s)
- Jiwoo Park
- Department of Radiology, Research Institute of Radiological Science, and Center for Clinical Imaging Data Science (CCIDS), Yonsei University College of Medicine, 50-1 Yonsei-Ro, Seodaemun-Gu, Seoul, 03722, South Korea
| | - Kangrok Oh
- Department of Radiology, Research Institute of Radiological Science, and Center for Clinical Imaging Data Science (CCIDS), Yonsei University College of Medicine, 50-1 Yonsei-Ro, Seodaemun-Gu, Seoul, 03722, South Korea
| | - Kyunghwa Han
- Department of Radiology, Research Institute of Radiological Science, and Center for Clinical Imaging Data Science (CCIDS), Yonsei University College of Medicine, 50-1 Yonsei-Ro, Seodaemun-Gu, Seoul, 03722, South Korea.
| | - Young Han Lee
- Department of Radiology, Research Institute of Radiological Science, and Center for Clinical Imaging Data Science (CCIDS), Yonsei University College of Medicine, 50-1 Yonsei-Ro, Seodaemun-Gu, Seoul, 03722, South Korea.
- Institute for Innovation in Digital Healthcare, Yonsei University, Seoul, South Korea.
12
Rao SJ, Isath A, Krishnan P, Tangsrivimol JA, Virk HUH, Wang Z, Glicksberg BS, Krittanawong C. ChatGPT: A Conceptual Review of Applications and Utility in the Field of Medicine. J Med Syst 2024; 48:59. [PMID: 38836893 DOI: 10.1007/s10916-024-02075-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/09/2024] [Accepted: 05/07/2024] [Indexed: 06/06/2024]
Abstract
Artificial intelligence, specifically advanced language models such as ChatGPT, has the potential to revolutionize various aspects of healthcare, medical education, and research. In this narrative review, we evaluate the myriad applications of ChatGPT in diverse healthcare domains. We discuss its potential role in clinical decision-making, exploring how it can assist physicians by providing rapid, data-driven insights for diagnosis and treatment. We review the benefits of ChatGPT in personalized patient care, particularly in geriatric care, medication management, weight loss and nutrition, and physical activity guidance. We further delve into its potential to enhance medical research through the analysis of large datasets and the development of novel methodologies. In the realm of medical education, we investigate the utility of ChatGPT as an information retrieval tool and personalized learning resource for medical students and professionals. There are numerous promising applications of ChatGPT that will likely induce paradigm shifts in healthcare practice, education, and research. The use of ChatGPT may come with several benefits in areas such as clinical decision making, geriatric care, medication management, weight loss and nutrition, physical fitness, scientific research, and medical education. Nevertheless, it is important to note that issues surrounding ethics, data privacy, transparency, inaccuracy, and inadequacy persist. Prior to widespread use in medicine, it is imperative to objectively evaluate the impact of ChatGPT in a real-world setting using a risk-based approach.
Affiliation(s)
- Shiavax J Rao
- Department of Medicine, MedStar Union Memorial Hospital, Baltimore, MD, USA
| | - Ameesh Isath
- Department of Cardiology, Westchester Medical Center and New York Medical College, Valhalla, NY, USA
| | - Parvathy Krishnan
- Department of Pediatrics, Westchester Medical Center and New York Medical College, Valhalla, NY, USA
| | - Jonathan A Tangsrivimol
- Division of Neurosurgery, Department of Surgery, Chulabhorn Hospital, Chulabhorn Royal Academy, Bangkok, 10210, Thailand
- Department of Neurological Surgery, Weill Cornell Medicine Brain and Spine Center, New York, NY, 10022, USA
| | - Hafeez Ul Hassan Virk
- Harrington Heart & Vascular Institute, Case Western Reserve University, University Hospitals Cleveland Medical Center, Cleveland, OH, USA
| | - Zhen Wang
- Robert D. and Patricia E. Kern Center for the Science of Health Care Delivery, Mayo Clinic, Rochester, MN, USA
- Division of Health Care Policy and Research, Department of Health Sciences Research, Mayo Clinic, Rochester, MN, USA
| | - Benjamin S Glicksberg
- Hasso Plattner Institute for Digital Health, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Chayakrit Krittanawong
- Cardiology Division, NYU Langone Health and NYU School of Medicine, 550 First Avenue, New York, NY, 10016, USA.
13
Farquhar S, Kossen J, Kuhn L, Gal Y. Detecting hallucinations in large language models using semantic entropy. Nature 2024; 630:625-630. [PMID: 38898292 PMCID: PMC11186750 DOI: 10.1038/s41586-024-07421-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/17/2023] [Accepted: 04/12/2024] [Indexed: 06/21/2024]
Abstract
Large language model (LLM) systems, such as ChatGPT or Gemini, can show impressive reasoning and question-answering capabilities but often 'hallucinate' false outputs and unsubstantiated answers. Answering unreliably or without the necessary information prevents adoption in diverse fields, with problems including fabrication of legal precedents or untrue facts in news articles, and even posing a risk to human life in medical domains such as radiology. Encouraging truthfulness through supervision or reinforcement has been only partially successful. Researchers need a general method for detecting hallucinations in LLMs that works even with new and unseen questions to which humans might not know the answer. Here we develop new methods grounded in statistics, proposing entropy-based uncertainty estimators for LLMs to detect a subset of hallucinations, confabulations, which are arbitrary and incorrect generations. Our method addresses the fact that one idea can be expressed in many ways by computing uncertainty at the level of meaning rather than specific sequences of words. Our method works across datasets and tasks without a priori knowledge of the task, requires no task-specific data and robustly generalizes to new tasks not seen before. By detecting when a prompt is likely to produce a confabulation, our method helps users understand when they must take extra care with LLMs and opens up new possibilities for using LLMs that are otherwise prevented by their unreliability.
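The core idea above, computing entropy over meanings rather than over token sequences, can be sketched as follows: sample several answers to the same prompt, group them into meaning clusters with a pairwise equivalence check, and take the entropy of the cluster probabilities. This is only an illustrative sketch; the paper uses bidirectional entailment between answers to decide equivalence, for which the exact-match check below is a crude hypothetical stand-in.

```python
import math

def semantic_entropy(samples, equivalent):
    """Entropy over meaning clusters of sampled answers.

    samples: list of answer strings drawn from the model for one prompt.
    equivalent: pairwise predicate deciding whether two answers share a meaning.
    """
    clusters = []
    for s in samples:
        for c in clusters:
            if equivalent(s, c[0]):  # join the first cluster with a matching meaning
                c.append(s)
                break
        else:
            clusters.append([s])     # otherwise start a new meaning cluster
    n = len(samples)
    probs = [len(c) / n for c in clusters]
    return -sum(p * math.log(p) for p in probs)

# Hypothetical stand-in for semantic equivalence (real method: bidirectional entailment)
eq = lambda a, b: a.strip().lower() == b.strip().lower()

# Two samples mean "Paris", one means "Lyon": moderate semantic uncertainty
print(semantic_entropy(["Paris", "paris", "Lyon"], eq))
```

High semantic entropy flags prompts likely to produce confabulations, even when the surface wording of the sampled answers varies.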
Affiliation(s)
- Sebastian Farquhar
- OATML, Department of Computer Science, University of Oxford, Oxford, UK.
| | - Jannik Kossen
- OATML, Department of Computer Science, University of Oxford, Oxford, UK
| | - Lorenz Kuhn
- OATML, Department of Computer Science, University of Oxford, Oxford, UK
| | - Yarin Gal
- OATML, Department of Computer Science, University of Oxford, Oxford, UK
14
Kuai H, Chen J, Tao X, Cai L, Imamura K, Matsumoto H, Liang P, Zhong N. Never-Ending Learning for Explainable Brain Computing. ADVANCED SCIENCE (WEINHEIM, BADEN-WURTTEMBERG, GERMANY) 2024; 11:e2307647. [PMID: 38602432 PMCID: PMC11200082 DOI: 10.1002/advs.202307647] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/12/2023] [Revised: 03/24/2024] [Indexed: 04/12/2024]
Abstract
Exploring the nature of human intelligence and behavior is a longstanding pursuit in cognitive neuroscience, driven by the accumulation of knowledge, information, and data across various studies. However, achieving a unified and transparent interpretation of findings presents formidable challenges. In response, an explainable brain computing framework is proposed that employs the never-ending learning paradigm, integrating evidence combination and fusion computing within a Knowledge-Information-Data (KID) architecture. The framework supports continuous brain cognition investigation, utilizing joint knowledge-driven forward inference and data-driven reverse inference, bolstered by pre-trained language modeling techniques and human-in-the-loop mechanisms. In particular, it incorporates internal evidence learning through multi-task functional neuroimaging analyses and external evidence learning via topic modeling of published neuroimaging studies, all of which involve human interactions at different stages. Based on two case studies, the intricate uncertainty surrounding brain localization in human reasoning is revealed. The present study also highlights the potential of systematization to advance explainable brain computing, offering a finer-grained understanding of brain activity patterns related to human intelligence.
Affiliation(s)
- Hongzhi Kuai
- Faculty of Engineering, Maebashi Institute of Technology, Gunma 371-0816, Japan
- School of Psychology and Beijing Key Laboratory of Learning and Cognition, Capital Normal University, Beijing 100048, China
| | - Jianhui Chen
- Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
- Beijing International Collaboration Base on Brain Informatics and Wisdom Services, Beijing 100124, China
| | - Xiaohui Tao
- School of Mathematics, Physics and Computing, University of Southern Queensland, Toowoomba 4350, Australia
| | - Lingyun Cai
- School of Psychology and Beijing Key Laboratory of Learning and Cognition, Capital Normal University, Beijing 100048, China
| | - Kazuyuki Imamura
- Faculty of Engineering, Maebashi Institute of Technology, Gunma 371-0816, Japan
| | - Hiroki Matsumoto
- Faculty of Engineering, Maebashi Institute of Technology, Gunma 371-0816, Japan
| | - Peipeng Liang
- School of Psychology and Beijing Key Laboratory of Learning and Cognition, Capital Normal University, Beijing 100048, China
| | - Ning Zhong
- Faculty of Engineering, Maebashi Institute of Technology, Gunma 371-0816, Japan
- School of Psychology and Beijing Key Laboratory of Learning and Cognition, Capital Normal University, Beijing 100048, China
- Beijing International Collaboration Base on Brain Informatics and Wisdom Services, Beijing 100124, China
15
Saba L, Fu CL, Khouri J, Faiman B, Anwer F, Chaulagain CP. Evaluating ChatGPT as an educational resource for patients with multiple myeloma: A preliminary investigation. Am J Hematol 2024; 99:1205-1207. [PMID: 38602288 DOI: 10.1002/ajh.27318] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/13/2024] [Revised: 03/07/2024] [Accepted: 03/11/2024] [Indexed: 04/12/2024]
Abstract
The findings of this study highlight a 95% accuracy rate in ChatGPT responses, as assessed by five myeloma specialists, underscoring its potential as a reliable educational tool.
Affiliation(s)
- Ludovic Saba
- Department of Hematology and Medical Oncology, Cleveland Clinic Florida, Weston, Florida, USA
| | - Chieh-Lin Fu
- Department of Hematology and Medical Oncology, Cleveland Clinic Florida, Weston, Florida, USA
| | - Jack Khouri
- Department of Hematology and Medical Oncology, Cleveland Clinic Main Campus, Cleveland, Ohio, USA
| | - Beth Faiman
- Department of Hematologic Oncology and Blood Disorders, Cleveland Clinic Main Campus, Cleveland, Ohio, USA
| | - Faiz Anwer
- Department of Hematology and Medical Oncology, Cleveland Clinic Main Campus, Cleveland, Ohio, USA
| | - Chakra P Chaulagain
- Department of Hematology and Medical Oncology, Cleveland Clinic Florida, Weston, Florida, USA
16
Brunner J, Rinne S. Large Language Models as a Tool for Health Services Researchers: An Exploration of High-Value Applications. Ann Am Thorac Soc 2024; 21:845-848. [PMID: 38445982 DOI: 10.1513/annalsats.202311-980ps] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/17/2023] [Accepted: 03/05/2024] [Indexed: 03/07/2024] Open
Affiliation(s)
- Julian Brunner
- Center for the Study of Healthcare Innovation, Implementation, and Policy, VA Greater Los Angeles Health Care, Los Angeles, California
| | - Seppo Rinne
- Center for Healthcare Organization and Implementation Research, Bedford VA Medical Center, Bedford, Massachusetts; and
- Department of Medicine, Dartmouth Geisel School of Medicine, Hanover, New Hampshire
17
Burnette H, Pabani A, von Itzstein MS, Switzer B, Fan R, Ye F, Puzanov I, Naidoo J, Ascierto PA, Gerber DE, Ernstoff MS, Johnson DB. Use of artificial intelligence chatbots in clinical management of immune-related adverse events. J Immunother Cancer 2024; 12:e008599. [PMID: 38816231 PMCID: PMC11141185 DOI: 10.1136/jitc-2023-008599] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 05/14/2024] [Indexed: 06/01/2024] Open
Abstract
BACKGROUND Artificial intelligence (AI) chatbots have become a major source of general and medical information, though their accuracy and completeness are still being assessed. Their utility for answering questions about immune-related adverse events (irAEs), common and potentially dangerous toxicities of cancer immunotherapy, is not well defined. METHODS We developed 50 distinct questions with answers in available guidelines surrounding 10 irAE categories and queried two AI chatbots (ChatGPT and Bard), along with an additional 20 patient-specific scenarios. Experts in irAE management scored answers for accuracy and completeness using a Likert scale ranging from 1 (least accurate/complete) to 4 (most accurate/complete). Answers across categories and across engines were compared. RESULTS Overall, both engines scored highly for accuracy (mean scores for ChatGPT and Bard were 3.87 vs 3.5, p<0.01) and completeness (3.83 vs 3.46, p<0.01). Scores of 1-2 (completely or mostly inaccurate or incomplete) were particularly rare for ChatGPT (6/800 answer-ratings, 0.75%). Of the 50 questions, all eight physician raters gave ChatGPT a rating of 4 (fully accurate or complete) for 22 questions (for accuracy) and 16 questions (for completeness). In the 20 patient scenarios, the average accuracy score was 3.725 (median 4) and the average completeness score was 3.61 (median 4). CONCLUSIONS AI chatbots provided largely accurate and complete information regarding irAEs, and wildly inaccurate information ("hallucinations") was uncommon. However, until accuracy and completeness increase further, appropriate guidelines remain the gold standard to follow.
Affiliation(s)
- Hannah Burnette
- Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, USA
| | - Aliyah Pabani
- Department of Oncology, Johns Hopkins University, Baltimore, Maryland, USA
| | - Mitchell S von Itzstein
- Harold C Simmons Comprehensive Cancer Center, The University of Texas Southwestern Medical Center, Dallas, Texas, USA
| | - Benjamin Switzer
- Department of Medicine, Roswell Park Comprehensive Cancer Center, Buffalo, New York, USA
| | - Run Fan
- Department of Biostatistics, Vanderbilt University Medical Center, Nashville, Tennessee, USA
| | - Fei Ye
- Department of Biostatistics, Vanderbilt University Medical Center, Nashville, Tennessee, USA
| | - Igor Puzanov
- Department of Medicine, Roswell Park Comprehensive Cancer Center, Buffalo, New York, USA
| | | | - Paolo A Ascierto
- Department of Melanoma, Cancer Immunotherapy and Development Therapeutics, Istituto Nazionale Tumori IRCCS Fondazione Pascale, Napoli, Campania, Italy
| | - David E Gerber
- Harold C Simmons Comprehensive Cancer Center, The University of Texas Southwestern Medical Center, Dallas, Texas, USA
| | - Marc S Ernstoff
- ImmunoOncology Branch (IOB), Developmental Therapeutics Program, Cancer Therapy and Diagnosis Division, National Cancer Institute (NCI), National Institutes of Health, Bethesda, Maryland, USA
| | - Douglas B Johnson
- Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, USA
18
Choudhury A, Shamszare H. The Impact of Performance Expectancy, Workload, Risk, and Satisfaction on Trust in ChatGPT: Cross-Sectional Survey Analysis. JMIR Hum Factors 2024; 11:e55399. [PMID: 38801658 PMCID: PMC11165287 DOI: 10.2196/55399] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/11/2023] [Revised: 03/25/2024] [Accepted: 04/07/2024] [Indexed: 05/29/2024] Open
Abstract
BACKGROUND ChatGPT (OpenAI) is a powerful tool for a wide range of tasks, from entertainment and creativity to health care queries. There are potential risks and benefits associated with this technology. In the discourse concerning the deployment of ChatGPT and similar large language models, it is sensible to recommend their use primarily for tasks a human user can execute accurately. As we transition into the subsequent phase of ChatGPT deployment, establishing realistic performance expectations and understanding users' perceptions of risk associated with its use are crucial in determining the successful integration of this artificial intelligence (AI) technology. OBJECTIVE The aim of the study is to explore how perceived workload, satisfaction, performance expectancy, and risk-benefit perception influence users' trust in ChatGPT. METHODS A semistructured, web-based survey was conducted with 607 adults in the United States who actively use ChatGPT. The survey questions were adapted from constructs used in various models and theories such as the technology acceptance model, the theory of planned behavior, the unified theory of acceptance and use of technology, and research on trust and security in digital environments. To test our hypotheses and structural model, we used the partial least squares structural equation modeling method, a widely used approach for multivariate analysis. RESULTS A total of 607 people responded to our survey. A significant portion of the participants held at least a high school diploma (n=204, 33.6%), and the majority had a bachelor's degree (n=262, 43.1%). The primary motivations for participants to use ChatGPT were for acquiring information (n=219, 36.1%), amusement (n=203, 33.4%), and addressing problems (n=135, 22.2%). Some participants used it for health-related inquiries (n=44, 7.2%), while a few others (n=6, 1%) used it for miscellaneous activities such as brainstorming, grammar verification, and blog content creation. 
Our model explained 64.6% of the variance in trust. Our analysis indicated a significant relationship between (1) workload and satisfaction, (2) trust and satisfaction, (3) performance expectations and trust, and (4) risk-benefit perception and trust. CONCLUSIONS The findings underscore the importance of ensuring user-friendly design and functionality in AI-based applications to reduce workload and enhance user satisfaction, thereby increasing user trust. Future research should further explore the relationship between risk-benefit perception and trust in the context of AI chatbots.
Affiliation(s)
- Avishek Choudhury
- Industrial and Management Systems Engineering, Benjamin M. Statler College of Engineering and Mineral Resources, West Virginia University, Morgantown, WV, United States
| | - Hamid Shamszare
- Industrial and Management Systems Engineering, Benjamin M. Statler College of Engineering and Mineral Resources, West Virginia University, Morgantown, WV, United States
19
Ye F, Zhang H, Luo X, Wu T, Yang Q, Shi Z. Evaluating ChatGPT's Performance in Answering Questions About Allergic Rhinitis and Chronic Rhinosinusitis. Otolaryngol Head Neck Surg 2024. [PMID: 38796735 DOI: 10.1002/ohn.832] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/07/2024] [Revised: 04/30/2024] [Accepted: 05/04/2024] [Indexed: 05/28/2024]
Abstract
OBJECTIVE This study aims to evaluate the accuracy of ChatGPT in answering allergic rhinitis (AR) and chronic rhinosinusitis (CRS) related questions. STUDY DESIGN This is a cross-sectional study. SETTING Each question was input as a separate, independent prompt. METHODS Responses to AR (n = 189) and CRS (n = 242) related questions, generated by GPT-3.5 and GPT-4, were independently graded for accuracy by 2 senior rhinology professors, with disagreements adjudicated by a third reviewer. RESULTS Overall, ChatGPT demonstrated satisfactory performance, accurately answering over 80% of questions across all categories. Specifically, GPT-4.0's accuracy in responding to AR-related questions significantly exceeded that of GPT-3.5, but this distinction was not evident for CRS-related questions. Patient-originated questions were answered with significantly higher accuracy than doctor-originated questions when GPT-4.0 responded to AR-related questions. This discrepancy was not observed with GPT-3.5 or in the context of CRS-related questions. Across different types of content, ChatGPT excelled in covering basic knowledge, prevention, and emotion for AR and CRS. However, it experienced challenges when addressing questions about recent advancements, a trend consistent across both the GPT-3.5 and GPT-4.0 iterations. Importantly, the accuracy of responses remained unaffected when questions were posed in Chinese. CONCLUSION Our findings suggest ChatGPT's capability to convey accurate information to AR and CRS patients, and offer insights into its performance across various domains, guiding its utilization and improvement.
Affiliation(s)
- Fan Ye
- Department of Otolaryngology-Head and Neck Surgery, The Third Affiliated Hospital of Sun Yat-Sen University, Guangzhou, China
- Department of Allergy, The Third Affiliated Hospital of Sun Yat-Sen University, Guangzhou, China
| | - He Zhang
- Department of Otolaryngology-Head and Neck Surgery, The Third Affiliated Hospital of Sun Yat-Sen University, Guangzhou, China
- Department of Allergy, The Third Affiliated Hospital of Sun Yat-Sen University, Guangzhou, China
| | - Xin Luo
- Department of Otolaryngology-Head and Neck Surgery, The Third Affiliated Hospital of Sun Yat-Sen University, Guangzhou, China
- Department of Allergy, The Third Affiliated Hospital of Sun Yat-Sen University, Guangzhou, China
| | - Tong Wu
- Department of Otolaryngology-Head and Neck Surgery, The Third Affiliated Hospital of Sun Yat-Sen University, Guangzhou, China
- Department of Allergy, The Third Affiliated Hospital of Sun Yat-Sen University, Guangzhou, China
| | - Qintai Yang
- Department of Otolaryngology-Head and Neck Surgery, The Third Affiliated Hospital of Sun Yat-Sen University, Guangzhou, China
- Department of Allergy, The Third Affiliated Hospital of Sun Yat-Sen University, Guangzhou, China
- Naso-Orbital-Maxilla and Skull Base Center, The Third Affiliated Hospital of Sun Yat-Sen University, Guangzhou, China
- Key Laboratory of Airway Inflammatory Disease Research and Innovative Technology Translation, Guangzhou, China
| | - Zhaohui Shi
- Department of Otolaryngology-Head and Neck Surgery, The Third Affiliated Hospital of Sun Yat-Sen University, Guangzhou, China
- Department of Allergy, The Third Affiliated Hospital of Sun Yat-Sen University, Guangzhou, China
- Naso-Orbital-Maxilla and Skull Base Center, The Third Affiliated Hospital of Sun Yat-Sen University, Guangzhou, China
- Key Laboratory of Airway Inflammatory Disease Research and Innovative Technology Translation, Guangzhou, China
20
Buldur M, Sezer B. Evaluating the accuracy of Chat Generative Pre-trained Transformer version 4 (ChatGPT-4) responses to United States Food and Drug Administration (FDA) frequently asked questions about dental amalgam. BMC Oral Health 2024; 24:605. [PMID: 38789962 PMCID: PMC11127407 DOI: 10.1186/s12903-024-04358-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/19/2023] [Accepted: 05/09/2024] [Indexed: 05/26/2024] Open
Abstract
BACKGROUND The use of artificial intelligence in the field of health sciences is becoming widespread. It is known that patients benefit from artificial intelligence applications on various health issues, especially after the pandemic period. One of the most important issues in this regard is the accuracy of the information provided by artificial intelligence applications. OBJECTIVE The purpose of this study was to pose the frequently asked questions about dental amalgam, as determined by the United States Food and Drug Administration (FDA), to Chat Generative Pre-trained Transformer version 4 (ChatGPT-4), one of these information resources, and to compare the content of the answers given by the application with the answers of the FDA. METHODS The questions were directed to ChatGPT-4 on May 8th and May 16th, 2023, and the responses were recorded and compared at the word and meaning levels using ChatGPT. The answers from the FDA webpage were also recorded. The responses were compared for content similarity in "Main Idea", "Quality Analysis", "Common Ideas", and "Inconsistent Ideas" between ChatGPT-4's responses and the FDA's responses. RESULTS ChatGPT-4 provided similar responses at one-week intervals. In comparison with FDA guidance, it provided answers with similar information content to frequently asked questions. However, although there were some similarities in the general aspects of the recommendation regarding amalgam removal, the two texts are not the same, and they offered different perspectives on the replacement of fillings. CONCLUSIONS The findings of this study indicate that ChatGPT-4, an artificial intelligence-based application, encompasses current and accurate information regarding dental amalgam and its removal, providing it to individuals seeking access to such information. Nevertheless, we believe that numerous studies are required to assess the validity and reliability of ChatGPT-4 across diverse subjects.
Affiliation(s)
- Mehmet Buldur
- Department of Restorative Dentistry, School of Dentistry, Çanakkale Onsekiz Mart University, Çanakkale, Türkiye
| | - Berkant Sezer
- Department of Pediatric Dentistry, School of Dentistry, Çanakkale Onsekiz Mart University, Çanakkale, Türkiye.
21
Sireci F, Lorusso F, Immordino A, Centineo M, Gerardi I, Patti G, Rusignuolo S, Manzella R, Gallina S, Dispenza F. ChatGPT as a New Tool to Select a Biological for Chronic Rhino Sinusitis with Polyps, "Caution Advised" or "Distant Reality"? J Pers Med 2024; 14:563. [PMID: 38929784 PMCID: PMC11204527 DOI: 10.3390/jpm14060563] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/20/2024] [Revised: 05/07/2024] [Accepted: 05/23/2024] [Indexed: 06/28/2024] Open
Abstract
ChatGPT is an advanced language model developed by OpenAI, designed for natural language understanding and generation. It employs deep learning technology to comprehend and generate human-like text, making it versatile for various applications. The aim of this study was to assess the alignment between the Rhinology Board's indications and ChatGPT's recommendations for treating patients with chronic rhinosinusitis with nasal polyps (CRSwNP) using biologic therapy. An observational cohort study involving 72 patients was conducted to evaluate various parameters of type 2 inflammation and assess the concordance in therapy choices between ChatGPT and the Rhinology Board. The observed results highlight the potential of ChatGPT in guiding optimal biological therapy selection, with a concordance of 68% and a Kappa coefficient of 0.69 (95% CI [0.50; 0.75]). Concordance was, respectively, 79.6% for dupilumab, 20% for mepolizumab, and 0% for omalizumab. This research represents a significant advancement in managing CRSwNP, a condition lacking robust biomarkers, and provides valuable insight into the potential of AI, specifically ChatGPT, to assist otolaryngologists in determining the optimal biological therapy for personalized patient care. Our results support implementing this tool to aid clinicians effectively.
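The agreement statistic reported above (Kappa = 0.69) is Cohen's kappa, which discounts the agreement two raters would reach by chance. A minimal pure-Python sketch, with invented therapy labels standing in for the study's 72 cases:

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labeling the same cases."""
    n = len(rater_a)
    # Observed proportion of agreement.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[label] * freq_b.get(label, 0) for label in freq_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical therapy choices for 10 patients (illustration only, not the study's data).
board = ["dupilumab"] * 6 + ["mepolizumab"] * 3 + ["omalizumab"]
model = ["dupilumab", "dupilumab", "dupilumab", "dupilumab", "dupilumab",
         "mepolizumab", "mepolizumab", "mepolizumab", "dupilumab", "dupilumab"]
```

By the common Landis-Koch convention, values of 0.61-0.80 are read as substantial agreement, which matches how a kappa of 0.69 would ordinarily be interpreted.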
Affiliation(s)
- Federico Sireci
- Otorhinolaryngology Section, Department of Precision Medicine in Medical, Surgical and Critical Care (Me.Pre.C.C), University of Palermo, Via del Vespro 129, 133, 90127 Palermo, Italy
- Francesco Lorusso
- Otorhinolaryngology Section, Biomedicine, Neuroscience and Advanced Diagnostics Department (BiND), University of Palermo, Via del Vespro 129, 133, 90127 Palermo, Italy
- Angelo Immordino
- Otorhinolaryngology Section, Biomedicine, Neuroscience and Advanced Diagnostics Department (BiND), University of Palermo, Via del Vespro 129, 133, 90127 Palermo, Italy
- Ignazio Gerardi
- Otorhinolaryngology Section, Biomedicine, Neuroscience and Advanced Diagnostics Department (BiND), University of Palermo, Via del Vespro 129, 133, 90127 Palermo, Italy
- Gaetano Patti
- Otorhinolaryngology Section, Biomedicine, Neuroscience and Advanced Diagnostics Department (BiND), University of Palermo, Via del Vespro 129, 133, 90127 Palermo, Italy
- Simona Rusignuolo
- Otorhinolaryngology Section, Biomedicine, Neuroscience and Advanced Diagnostics Department (BiND), University of Palermo, Via del Vespro 129, 133, 90127 Palermo, Italy
- Riccardo Manzella
- Otorhinolaryngology Section, Biomedicine, Neuroscience and Advanced Diagnostics Department (BiND), University of Palermo, Via del Vespro 129, 133, 90127 Palermo, Italy
- Salvatore Gallina
- Otorhinolaryngology Section, Biomedicine, Neuroscience and Advanced Diagnostics Department (BiND), University of Palermo, Via del Vespro 129, 133, 90127 Palermo, Italy
- Francesco Dispenza
- Otorhinolaryngology Section, Biomedicine, Neuroscience and Advanced Diagnostics Department (BiND), University of Palermo, Via del Vespro 129, 133, 90127 Palermo, Italy
22
Saygin M, Bekmezci M, Dinçer E. Artificial Intelligence Model ChatGPT-4: Entrepreneur Candidate and Entrepreneurship Example. F1000Res 2024; 13:308. [PMID: 38845823 PMCID: PMC11153998 DOI: 10.12688/f1000research.144671.2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Accepted: 05/20/2024] [Indexed: 06/09/2024] Open
Abstract
Background Although artificial intelligence technologies are still in their infancy, they inspire both hope and anxiety about the future. This research examines ChatGPT-4, one of the best-known artificial intelligence applications and one claimed to be capable of self-learning, in the context of business establishment processes. Methods The assessment questions in the Entrepreneurship Handbook, published as open access by the Small and Medium Enterprises Development Organization of Turkey to guide entrepreneurial processes and shape the perception of entrepreneurship in Turkey, were posed to ChatGPT-4 and analysed in three stages. The model's approach to solving the questions and the answers it produced could then be compared with the entrepreneurship literature. Results ChatGPT-4, itself a striking example of entrepreneurship, answered the questions posed across the 16 modules of the entrepreneurship handbook in an original way and with deep analysis. Conclusion The model also proved quite creative in developing new alternatives to the correct answers given in the handbook. The original aspect of this research is that it is among the first studies on artificial intelligence and entrepreneurship in the literature.
23
Tripathi S, Sukumaran R, Cook TS. Efficient healthcare with large language models: optimizing clinical workflow and enhancing patient care. J Am Med Inform Assoc 2024; 31:1436-1440. [PMID: 38273739 PMCID: PMC11105142 DOI: 10.1093/jamia/ocad258] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/14/2023] [Revised: 12/01/2023] [Accepted: 12/29/2023] [Indexed: 01/27/2024] Open
Abstract
PURPOSE This article explores the potential of large language models (LLMs) to automate administrative tasks in healthcare, alleviating the burden on clinicians caused by electronic medical records. POTENTIAL LLMs offer opportunities in clinical documentation, prior authorization, patient education, and access to care. They can personalize patient scheduling, improve documentation accuracy, streamline insurance prior authorization, increase patient engagement, and address barriers to healthcare access. CAUTION However, integrating LLMs requires careful attention to security and privacy concerns, protecting patient data, and complying with regulations like the Health Insurance Portability and Accountability Act (HIPAA). It is crucial to acknowledge that LLMs should supplement, not replace, the human connection and care provided by healthcare professionals. CONCLUSION By prudently utilizing LLMs alongside human expertise, healthcare organizations can improve patient care and outcomes. Implementation should be approached with caution and consideration to ensure the safe and effective use of LLMs in the clinical setting.
Affiliation(s)
- Satvik Tripathi
- Department of Radiology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Rithvik Sukumaran
- Department of Radiology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Tessa S Cook
- Department of Radiology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
24
Dimitriadis F, Tsigkriki L, Charisopoulou D, Tsaousidis A, Siarkos M, Koulaouzidis G. Letter Re: Response to Luan et al. Angiology 2024:33197241256685. [PMID: 38769649 DOI: 10.1177/00033197241256685] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 05/22/2024]
Affiliation(s)
- Fotis Dimitriadis
- Cardiology Department, General Hospital G. Papanikolaou, Thessaloniki, Greece
- Lamprini Tsigkriki
- Cardiology Department, General Hospital G. Papanikolaou, Thessaloniki, Greece
- Adam Tsaousidis
- Cardiology Department, General Hospital G. Papanikolaou, Thessaloniki, Greece
- Michail Siarkos
- Cardiology Department, General Hospital G. Papanikolaou, Thessaloniki, Greece
- George Koulaouzidis
- Department of Biochemical Sciences, Pomeranian Medical University, Szczecin, Poland
25
Simsek O, Manteghinejad A, Vossough A. A Comparative Review of Imaging Journal Policies for Use of AI in Manuscript Generation. Acad Radiol 2024:S1076-6332(24)00290-3. [PMID: 38772797 DOI: 10.1016/j.acra.2024.05.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/22/2024] [Revised: 05/06/2024] [Accepted: 05/06/2024] [Indexed: 05/23/2024]
Abstract
RATIONALE AND OBJECTIVES Artificial intelligence (AI) technologies are evolving rapidly, offering new advances almost daily, including various tools for manuscript generation and modification. These potentially time- and effort-saving solutions, however, carry risks of bias, factual error, and plagiarism. Some journals have started to update their author guidelines to address AI-generated or AI-assisted manuscripts. The purpose of this paper is to evaluate radiology journals' author guidelines for AI use policies and to compare scientometric data between journals with and without explicit AI use policies. MATERIALS AND METHODS This cross-sectional study included 112 MEDLINE-indexed imaging journals and evaluated their author guidelines between 13 October 2023 and 16 October 2023. Journals were identified based on subject matter and association with a radiological society. The author guidelines and editorial policies were evaluated for policies on the use of AI in manuscript preparation and for specific policies on AI-generated images. We assessed the existence of an AI usage policy among subspecialty imaging journals. The scientometric scores of journals with and without AI use policies were compared using the Wilcoxon signed-rank test. RESULTS Among the 112 MEDLINE-indexed radiology journals, 80 were affiliated with an imaging society and 32 were not. Of the 112 imaging journals, 69 (61.6%) had an AI usage policy, and 40 of those 69 (57.9%) mentioned a specific policy on AI-generated figures. CiteScore (4.9 vs 4, p = 0.023), Scientific Journal Ranking (0.75 vs 0.54, p = 0.010), and Journal Citation Indicator (0.77 vs 0.62, p = 0.038) were significantly higher in journals with an AI policy, while Source Normalized Impact per Paper trended higher without reaching significance (1.12 vs 0.83, p = 0.06).
CONCLUSION Most imaging journals provide guidelines for AI-generated content, but a substantial number still lack AI usage policies or do not require disclosure of non-human-generated manuscripts. Journals with an established AI policy had higher citation and impact scores.
Affiliation(s)
- Onur Simsek
- Division of Neuroradiology, Department of Radiology, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
- Amirreza Manteghinejad
- Division of Neuroradiology, Department of Radiology, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
- Arastoo Vossough
- Division of Neuroradiology, Department of Radiology, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA; Department of Radiology, Children's Hospital of Philadelphia, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
26
Rau S, Rau A, Nattenmüller J, Fink A, Bamberg F, Reisert M, Russe MF. A retrieval-augmented chatbot based on GPT-4 provides appropriate differential diagnosis in gastrointestinal radiology: a proof of concept study. Eur Radiol Exp 2024; 8:60. [PMID: 38755410 PMCID: PMC11098977 DOI: 10.1186/s41747-024-00457-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/13/2024] [Accepted: 03/12/2024] [Indexed: 05/18/2024] Open
Abstract
BACKGROUND We investigated the potential of an imaging-aware GPT-4-based chatbot in providing diagnoses based on imaging descriptions of abdominal pathologies. METHODS Utilizing zero-shot learning via the LlamaIndex framework, GPT-4 was enhanced using the 96 documents from the Radiographics Top 10 Reading List on gastrointestinal imaging, creating a gastrointestinal imaging-aware chatbot (GIA-CB). To assess its diagnostic capability, 50 cases on a variety of abdominal pathologies were created, comprising radiological findings in fluoroscopy, MRI, and CT. We compared the GIA-CB to the generic GPT-4 chatbot (g-CB) in providing the primary and 2 additional differential diagnoses, using interpretations from senior-level radiologists as ground truth. The trustworthiness of the GIA-CB was evaluated by investigating the source documents as provided by the knowledge-retrieval mechanism. Mann-Whitney U test was employed. RESULTS The GIA-CB demonstrated a high capability to identify the most appropriate differential diagnosis in 39/50 cases (78%), significantly surpassing the g-CB in 27/50 cases (54%) (p = 0.006). Notably, the GIA-CB offered the primary differential in the top 3 differential diagnoses in 45/50 cases (90%) versus g-CB with 37/50 cases (74%) (p = 0.022) and always with appropriate explanations. The median response time was 29.8 s for GIA-CB and 15.7 s for g-CB, and the mean cost per case was $0.15 and $0.02, respectively. CONCLUSIONS The GIA-CB not only provided an accurate diagnosis for gastrointestinal pathologies, but also direct access to source documents, providing insight into the decision-making process, a step towards trustworthy and explainable AI. Integrating context-specific data into AI models can support evidence-based clinical decision-making. RELEVANCE STATEMENT A context-aware GPT-4 chatbot demonstrates high accuracy in providing differential diagnoses based on imaging descriptions, surpassing the generic GPT-4. 
It provided formulated rationale and source excerpts supporting the diagnoses, thus enhancing trustworthy decision-support. KEY POINTS • Knowledge retrieval enhances differential diagnoses in a gastrointestinal imaging-aware chatbot (GIA-CB). • GIA-CB outperformed the generic counterpart, providing formulated rationale and source excerpts. • GIA-CB has the potential to pave the way for AI-assisted decision support systems.
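The knowledge-retrieval mechanism described above can be sketched in miniature: score each reference document against the query, keep the top match, and prepend it to the prompt. The bag-of-words cosine below is a deliberately crude stand-in for the embedding-based retrieval that frameworks such as LlamaIndex perform, and the two-document corpus and query are invented for illustration:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Bag-of-words vector; a stand-in for a learned text embedding.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    # Rank documents by similarity to the query and keep the top k.
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    # Prepend retrieved context so the model answers from source documents
    # that can later be shown to the user as supporting evidence.
    context = "\n".join(retrieve(query, docs))
    return f"Context:\n{context}\n\nFindings: {query}\nMost likely diagnosis?"

# Invented two-document mini-corpus standing in for the 96 reading-list documents.
corpus = [
    "pneumatosis intestinalis shows gas within the bowel wall on ct",
    "hepatic steatosis shows diffuse low attenuation of the liver",
]
```

Returning the retrieved passages alongside the answer is what gives this design its explainability: the user can check the source excerpt rather than trust the model's output alone.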
Affiliation(s)
- Stephan Rau
- Department of Diagnostic and Interventional Radiology, Faculty of Medicine, Medical Center - University of Freiburg, University of Freiburg, 79106, Freiburg im Breisgau, Germany
- Alexander Rau
- Department of Diagnostic and Interventional Radiology, Faculty of Medicine, Medical Center - University of Freiburg, University of Freiburg, 79106, Freiburg im Breisgau, Germany
- Department of Neuroradiology, Faculty of Medicine, Medical Center - University of Freiburg, University of Freiburg, Hugstetter Str. 55, 79106, Freiburg im Breisgau, Germany
- Johanna Nattenmüller
- Department of Diagnostic and Interventional Radiology, Faculty of Medicine, Medical Center - University of Freiburg, University of Freiburg, 79106, Freiburg im Breisgau, Germany
- Anna Fink
- Department of Diagnostic and Interventional Radiology, Faculty of Medicine, Medical Center - University of Freiburg, University of Freiburg, 79106, Freiburg im Breisgau, Germany
- Fabian Bamberg
- Department of Diagnostic and Interventional Radiology, Faculty of Medicine, Medical Center - University of Freiburg, University of Freiburg, 79106, Freiburg im Breisgau, Germany
- Marco Reisert
- Department of Diagnostic and Interventional Radiology, Faculty of Medicine, Medical Center - University of Freiburg, University of Freiburg, 79106, Freiburg im Breisgau, Germany
- Maximilian F Russe
- Department of Diagnostic and Interventional Radiology, Faculty of Medicine, Medical Center - University of Freiburg, University of Freiburg, 79106, Freiburg im Breisgau, Germany
27
Bhatia A, Khalvati F, Ertl-Wagner BB. Artificial Intelligence in the Future Landscape of Pediatric Neuroradiology: Opportunities and Challenges. AJNR Am J Neuroradiol 2024; 45:549-553. [PMID: 38176730 DOI: 10.3174/ajnr.a8086] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/22/2023] [Accepted: 10/17/2023] [Indexed: 01/06/2024]
Abstract
This paper will review how artificial intelligence (AI) will play an increasingly important role in pediatric neuroradiology in the future. A safe, transparent, and human-centric AI is needed to tackle the quadruple aim of improved health outcomes, enhanced patient and family experience, reduced costs, and improved well-being of the healthcare team in pediatric neuroradiology. Equity, diversity and inclusion, data safety, and access to care will need to always be considered. In the next decade, AI algorithms are expected to play an increasingly important role in access to care, workflow management, abnormality detection, classification, response prediction, prognostication, report generation, as well as in the patient and family experience in pediatric neuroradiology. Also, AI algorithms will likely play a role in recognizing and flagging rare diseases and in pattern recognition to identify previously unknown disorders. While AI algorithms will play an important role, humans will not only need to be in the loop, but in the center of pediatric neuroimaging. AI development and deployment will need to be closely watched and monitored by experts in the field. Patient and data safety need to be at the forefront, and the risks of a dependency on technology will need to be contained. The applications and implications of AI in pediatric neuroradiology will differ from adult neuroradiology.
Affiliation(s)
- Aashim Bhatia
- Children's Hospital of Philadelphia, Philadelphia, Pennsylvania
- Farzad Khalvati
- Hospital for Sick Children, Toronto, Ontario, Canada
28
D'Anna G, Van Cauter S, Thurnher M, Van Goethem J, Haller S. Can large language models pass official high-grade exams of the European Society of Neuroradiology courses? A direct comparison between OpenAI chatGPT 3.5, OpenAI GPT4 and Google Bard. Neuroradiology 2024:10.1007/s00234-024-03371-6. [PMID: 38705899 DOI: 10.1007/s00234-024-03371-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/04/2023] [Accepted: 04/30/2024] [Indexed: 05/07/2024]
Abstract
We compared different LLMs, notably ChatGPT 3.5, GPT-4, and Google Bard, and tested whether their performance differs across subspecialty domains by executing examinations from four courses of the European Society of Neuroradiology (ESNR): anatomy/embryology, neuro-oncology, head and neck, and pediatrics. Written ESNR exams were used as input data: anatomy/embryology (30 questions), neuro-oncology (50 questions), head and neck (50 questions), and pediatrics (50 questions). All exams together, and each exam separately, were introduced to the three LLMs: ChatGPT 3.5, GPT-4, and Google Bard. Statistical analyses included a group-wise Friedman test followed by pair-wise Wilcoxon tests with multiple-comparison correction. Overall, there was a significant difference between the three LLMs (p < 0.0001), with GPT-4 having the highest accuracy (70%), followed by ChatGPT 3.5 (54%) and Google Bard (36%). The pair-wise comparisons showed significant differences between ChatGPT 3.5 vs GPT-4 (p < 0.0001), ChatGPT 3.5 vs Bard (p < 0.0023), and GPT-4 vs Bard (p < 0.0001). Analyses per subspecialty showed the largest gap between the best LLM (GPT-4, 70%) and the worst (Google Bard, 24%) in the head and neck exam, while the difference was least pronounced in neuro-oncology (GPT-4, 62% vs Google Bard, 48%). We observed significant differences in the performance of the three LLMs on official exams organized by the ESNR. Overall, GPT-4 performed best and Google Bard worst, with the difference varying by subspecialty and most pronounced in head and neck.
Affiliation(s)
- Gennaro D'Anna
- Neuroimaging Unit, ASST Ovest Milanese, Legnano, Milan, Italy
- Sofie Van Cauter
- Department of Medical Imaging, Ziekenhuis Oost-Limburg, Genk, Belgium
- Department of Medicine and Life Sciences, Hasselt University, Hasselt, Belgium
- Majda Thurnher
- Department for Biomedical Imaging and Image-Guided Therapy, Medical University of Vienna, Vienna, Austria
- Johan Van Goethem
- Department of Medical and Molecular Imaging, VITAZ, Sint-Niklaas, Belgium
- Department of Radiology, University Hospital Antwerp, Antwerp, Belgium
- Sven Haller
- CIMC-Centre d'Imagerie Médicale de Cornavin, Geneva, Switzerland
- Department of Surgical Sciences, Radiology, Uppsala University, Uppsala, Sweden
- Faculty of Medicine, University of Geneva, Geneva, Switzerland
- Department of Radiology, Beijing Tiantan Hospital, Capital Medical University, Beijing, People's Republic of China
29
Kedia N, Sanjeev S, Ong J, Chhablani J. ChatGPT and Beyond: An overview of the growing field of large language models and their use in ophthalmology. Eye (Lond) 2024; 38:1252-1261. [PMID: 38172581 PMCID: PMC11076576 DOI: 10.1038/s41433-023-02915-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/26/2023] [Revised: 11/23/2023] [Accepted: 12/20/2023] [Indexed: 01/05/2024] Open
Abstract
ChatGPT, an artificial intelligence (AI) chatbot built on large language models (LLMs), has rapidly gained popularity. The benefits and limitations of this transformative technology have been discussed across various fields, including medicine. The widespread availability of ChatGPT has enabled clinicians to study how these tools could be used for a variety of tasks such as generating differential diagnosis lists, organizing patient notes, and synthesizing literature for scientific research. LLMs have shown promising capabilities in ophthalmology by performing well on the Ophthalmic Knowledge Assessment Program, providing fairly accurate responses to questions about retinal diseases, and generating differential diagnosis lists. There are current limitations to this technology, including the propensity of LLMs to "hallucinate", or confidently generate false information; their potential role in perpetuating biases in medicine; and the challenges in incorporating LLMs into research without allowing "AI-plagiarism" or publication of false information. In this paper, we provide a balanced overview of what LLMs are and introduce some of the LLMs that have been developed in the past few years. We discuss recent literature evaluating the role of these language models in medicine with a focus on ChatGPT. The field of AI is fast-paced, and new applications based on LLMs are being generated rapidly; therefore, it is important for ophthalmologists to be aware of how this technology works and how it may impact patient care. Here, we discuss the benefits, limitations, and future advancements of LLMs in patient care and research.
Affiliation(s)
- Nikita Kedia
- Department of Ophthalmology, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
- Joshua Ong
- Department of Ophthalmology and Visual Sciences, University of Michigan Kellogg Eye Center, Ann Arbor, MI, USA
- Jay Chhablani
- Department of Ophthalmology, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
30
Tepe M, Emekli E. Assessing the Responses of Large Language Models (ChatGPT-4, Gemini, and Microsoft Copilot) to Frequently Asked Questions in Breast Imaging: A Study on Readability and Accuracy. Cureus 2024; 16:e59960. [PMID: 38726360 PMCID: PMC11080394 DOI: 10.7759/cureus.59960] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 05/09/2024] [Indexed: 05/12/2024] Open
Abstract
Background Large language models (LLMs), such as ChatGPT-4, Gemini, and Microsoft Copilot, have been instrumental in various domains, including healthcare, where they enhance health literacy and aid in patient decision-making. Given the complexities involved in breast imaging procedures, accurate and comprehensible information is vital for patient engagement and compliance. This study aims to evaluate the readability and accuracy of the information provided by three prominent LLMs, ChatGPT-4, Gemini, and Microsoft Copilot, in response to frequently asked questions in breast imaging, assessing their potential to improve patient understanding and facilitate healthcare communication. Methodology We collected the most common questions on breast imaging from clinical practice and posed them to the LLMs. Responses were analyzed for readability using the Flesch Reading Ease and Flesch-Kincaid Grade Level tests, and for accuracy using a radiologist-developed Likert-type scale. Results The study found significant variations among the LLMs. Gemini and Microsoft Copilot scored higher on readability scales (p < 0.001), indicating their responses were easier to understand. In contrast, ChatGPT-4 demonstrated greater accuracy in its responses (p < 0.001). Conclusions While LLMs such as ChatGPT-4 show promise in providing accurate responses, readability issues may limit their utility in patient education. Conversely, Gemini and Microsoft Copilot, despite being less accurate, are more accessible to a broader patient audience. Ongoing adjustments and evaluations of these models are essential to ensure they meet the diverse needs of patients, emphasizing the need for continuous improvement and oversight in the deployment of artificial intelligence technologies in healthcare.
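The two readability measures used in this study are closed-form formulas over sentence length and syllable density. A rough sketch with a naive vowel-group syllable counter (established tools use dictionaries and hyphenation rules, so absolute scores will differ, but relative rankings usually hold):

```python
import re

def syllables(word: str) -> int:
    # Crude heuristic: count vowel groups, with a minimum of one per word.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def readability(text: str) -> tuple[float, float]:
    """Return (Flesch Reading Ease, Flesch-Kincaid Grade Level)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    syl = sum(syllables(w) for w in words)
    wps = len(words) / len(sentences)   # average words per sentence
    spw = syl / len(words)              # average syllables per word
    flesch_reading_ease = 206.835 - 1.015 * wps - 84.6 * spw
    flesch_kincaid_grade = 0.39 * wps + 11.8 * spw - 15.59
    return flesch_reading_ease, flesch_kincaid_grade

# Invented sample answers: a plain-language one and a jargon-heavy one.
plain = "The scan is safe. It does not hurt."
dense = ("Mammographic parenchymal asymmetries occasionally necessitate "
         "supplementary ultrasonographic evaluation.")
```

Higher Reading Ease and lower Grade Level both indicate easier text, which is why a model can score well on one study's readability axis while losing on the accuracy axis.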
Affiliation(s)
- Murat Tepe
- Radiology, Mediclinic City Hospital, Dubai, ARE
- Emre Emekli
- Radiology, Eskişehir Osmangazi University Health Practice and Research Hospital, Eskişehir, TUR
31
Wessel D, Pogrebnyakov N. Using Social Media as a Source of Real-World Data for Pharmaceutical Drug Development and Regulatory Decision Making. Drug Saf 2024; 47:495-511. [PMID: 38446405 PMCID: PMC11018692 DOI: 10.1007/s40264-024-01409-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 02/07/2024] [Indexed: 03/07/2024]
Abstract
INTRODUCTION While pharmaceutical companies aim to leverage real-world data (RWD) to bridge the gap between clinical drug development and real-world patient outcomes, extant research has mainly focused on the use of social media in a post-approval safety-surveillance setting. Recent regulatory and technological developments indicate that social media may serve as a rich source to expand the evidence base to pre-approval and drug development activities. However, use cases related to drug development have been largely omitted, thereby missing some of the benefits of RWD. In addition, an applied end-to-end understanding of RWD rooted in both industry and regulations is lacking. OBJECTIVE We investigated how social media can be used as a source of RWD to support regulatory decision making and drug development in the pharmaceutical industry, specifically exploring the data pipeline and examining how social-media-derived RWD can align with regulatory guidance from the US Food and Drug Administration and with industry needs. METHODS A machine learning pipeline was developed to extract patient insights related to anticoagulants from X (Twitter) data. These findings were then analysed from an industry perspective and complemented by interviews with professionals from a pharmaceutical company. RESULTS The analysis reveals several use cases where RWD derived from social media can be beneficial, particularly in generating hypotheses around patient and therapeutic area needs. We also note certain limitations of social media data, particularly around inferring causality. CONCLUSIONS Social media displays considerable potential as a source of RWD for guiding pharmaceutical drug development and pre-approval efforts. Although further regulatory guidance on the use of social media for RWD is needed to encourage its use, regulatory and technological developments suggest it warrants at least exploratory use for drug development.
Affiliation(s)
- Didrik Wessel
- Copenhagen Business School, Frederiksberg, Denmark
- Nørrebrogade 18A 3TH, 2200, Copenhagen N, Denmark
32
Dağci M, Çam F, Dost A. Reliability and Quality of the Nursing Care Planning Texts Generated by ChatGPT. Nurse Educ 2024; 49:E109-E114. [PMID: 37994523 DOI: 10.1097/nne.0000000000001566] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2023]
Abstract
BACKGROUND Research on ChatGPT-generated nursing care planning texts is critical for enhancing nursing education through innovative and accessible learning methods and for improving reliability and quality. PURPOSE The aim of the study was to examine the quality, authenticity, and reliability of nursing care planning texts produced using ChatGPT. METHODS The study sample comprised 40 texts generated by ChatGPT for selected nursing diagnoses included in NANDA 2021-2023. The texts were evaluated using a descriptive criteria form and the DISCERN tool for evaluating health information. RESULTS The mean DISCERN total score of the texts was 45.93 ± 4.72. All texts had a moderate level of reliability, and 97.5% of them had a moderate information-quality subscale score. A statistically significant relationship was found between the number of accessible references and both the reliability (r = 0.408) and quality subscale (r = 0.379) scores of the texts (P < .05). CONCLUSION ChatGPT-generated texts exhibited moderate reliability, quality of nursing care information, and overall quality, despite low similarity rates.
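The reported r values are plain correlation coefficients between reference counts and DISCERN scores. A minimal Pearson implementation, with invented scores for illustration (the study's per-text data are not reproduced here, and the study does not state which correlation coefficient it used):

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

# Invented example: accessible-reference counts vs DISCERN reliability scores.
references = [0, 1, 1, 2, 3, 4, 5, 6]
reliability = [38, 40, 42, 41, 45, 47, 46, 52]
```

A positive r here would mean texts citing more accessible references tend to score as more reliable, which is the direction of the association the study reports.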
Affiliation(s)
- Mahmut Dağci
- Department of Nursing, Faculty of Health Sciences, Bezmialem Vakif University, Istanbul, Turkey
33
Yilmaz Muluk S. Enhancing Musculoskeletal Injection Safety: Evaluating Checklists Generated by Artificial Intelligence and Revising the Preformed Checklist. Cureus 2024; 16:e59708. [PMID: 38841023 PMCID: PMC11150897 DOI: 10.7759/cureus.59708] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 05/02/2024] [Indexed: 06/07/2024] Open
Abstract
Background Musculoskeletal disorders are a significant global health issue, necessitating advanced management strategies such as intra-articular and extra-articular injections to alleviate pain, inflammation, and mobility challenges. As the adoption of these interventions by physicians grows, the importance of robust safety protocols becomes paramount. This study evaluates the effectiveness of conversational artificial intelligence (AI), particularly versions 3.5 and 4 of the Chat Generative Pre-trained Transformer (ChatGPT), in creating patient safety checklists for musculoskeletal injections to enhance the preparation of safety documentation. Methodology A quantitative analysis was conducted to evaluate AI-generated safety checklists against a preformed checklist adapted from reputable medical sources. Adherence of the generated checklists to the preformed checklist was calculated and classified. The Wilcoxon signed-rank test was used to assess the performance differences between ChatGPT versions 3.5 and 4. Results ChatGPT-4 showed superior adherence to the preformed checklist compared to ChatGPT-3.5, with both versions classified as very good in safety protocol creation. Although no significant differences were present in the sign-in and sign-out parts of the checklists of both versions, ChatGPT-4 had significantly higher scores in the procedure planning part (p = 0.007), and its overall performance was also higher (p < 0.001). Subsequently, the preformed checklist was revised to incorporate new contributions from ChatGPT. Conclusions ChatGPT, especially version 4, proved effective in generating patient safety checklists for musculoskeletal injections, highlighting the potential of AI to streamline clinical practices. Further enhancements are necessary to fully meet medical standards.
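The paired comparison above uses the Wilcoxon signed-rank test. A minimal sketch of its core statistic, computed on invented per-item checklist scores (the study's data and scoring scale are not reproduced here), looks like this; a p-value would then come from the W distribution via a table or statistical software:

```python
# Sketch: Wilcoxon signed-rank W statistics for paired scores, e.g. per-item
# checklist adherence from ChatGPT-3.5 vs ChatGPT-4. Scores are made up.

def wilcoxon_w(x, y):
    """Return (W+, W-): rank sums of positive and negative differences d = y - x.
    Zero differences are dropped; ties in |d| receive average ranks."""
    diffs = [b - a for a, b in zip(x, y) if b - a != 0]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg = (i + j) / 2 + 1  # average rank across the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return w_plus, w_minus

# Hypothetical adherence scores for the same checklist items.
gpt35 = [3, 4, 2, 5, 3, 4]
gpt4 = [4, 4, 4, 5, 4, 5]
print(wilcoxon_w(gpt35, gpt4))
```

The smaller of W+ and W- is the test statistic; strongly one-sided rank sums, as in this toy example, correspond to the kind of significant difference the study reports for procedure planning.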
34
Tippareddy C, Faraji N, Awan OA. The Application of ChatGPT to Enhance Medical Education. Acad Radiol 2024; 31:2185-2187. [PMID: 38724132 DOI: 10.1016/j.acra.2023.04.015]
Affiliation(s)
- Charit Tippareddy
- University Hospitals Cleveland Medical Center, Cleveland, Ohio (C.T., N.F.)
- Navid Faraji
- University Hospitals Cleveland Medical Center, Cleveland, Ohio (C.T., N.F.)
- Omer A Awan
- University of Maryland School of Medicine, 655 W Baltimore St, Baltimore, MD 21201 (O.A.A.).

35
Pinto DS, Noronha SM, Saigal G, Quencer RM. Comparison of an AI-Generated Case Report With a Human-Written Case Report: Practical Considerations for AI-Assisted Medical Writing. Cureus 2024; 16:e60461. [PMID: 38883028 PMCID: PMC11179998 DOI: 10.7759/cureus.60461]
Abstract
INTRODUCTION The rise of ChatGPT has recently caused consternation in the medical world. While it has been utilized to write manuscripts, only a few studies have evaluated the quality of manuscripts generated by artificial intelligence (AI). OBJECTIVE We evaluate the ability of ChatGPT to write a case report when provided with a framework, and we offer practical considerations for manuscript writing using AI. METHODS We compared a manuscript written by a blinded human author (10 years of medical experience) with a manuscript written by ChatGPT on a rare presentation of a common disease, using multiple iterations of the manuscript generation request to derive the best ChatGPT output. PARTICIPANTS, OUTCOMES, AND MEASURES Twenty-two human reviewers compared the manuscripts using parameters that characterize human writing and relevant standard manuscript assessment criteria, viz., the scholarly impact quotient (SIQ). We also compared the manuscripts using the "average perplexity score" (APS), "burstiness score" (BS), and "highest perplexity of a sentence" (GPTZero parameters for detecting AI-generated content). RESULTS The human manuscript had a significantly higher quality of presentation and more nuanced writing (p<0.05). Both manuscripts had a logical flow. Twelve of 22 reviewers correctly identified the AI-generated manuscript (p<0.05), but 4 of 22 wrongly identified the human-written manuscript as AI-generated. GPTZero software erroneously flagged four sentences of the human-written manuscript as AI-generated. CONCLUSION Though AI showed an ability to highlight the novelty of the case report and project a logical flow comparable to the human manuscript, it could not outperform the human writer on all parameters: the human manuscript showed better quality of presentation and more nuanced writing. The practical considerations we provide for AI-assisted medical writing will help to better utilize AI in manuscript writing.
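GPTZero's actual scoring is proprietary, but the two quantities named above have simple textbook definitions: perplexity is the exponentiated average negative log-probability per token, and burstiness captures how much sentence-level perplexity varies. The sketch below illustrates those definitions on invented token probabilities; a real detector would obtain the probabilities from a language model:

```python
# Sketch of the ideas behind perplexity-based AI-text detection metrics.
# Probabilities are invented; this is not GPTZero's implementation.
import math

def perplexity(token_probs):
    """exp of the mean negative log-probability per token."""
    n = len(token_probs)
    return math.exp(-sum(math.log(p) for p in token_probs) / n)

def burstiness(sentence_ppls):
    """Sample standard deviation of per-sentence perplexities:
    human writing tends to vary more than model output."""
    n = len(sentence_ppls)
    mean = sum(sentence_ppls) / n
    return (sum((x - mean) ** 2 for x in sentence_ppls) / (n - 1)) ** 0.5

# Hypothetical per-token probabilities for three sentences.
sentences = [[0.9, 0.8, 0.95], [0.2, 0.4, 0.3], [0.6, 0.7, 0.5]]
ppls = [perplexity(s) for s in sentences]
print([round(p, 2) for p in ppls], round(burstiness(ppls), 2))
```

Under this framing, "highest perplexity of a sentence" is simply `max(ppls)`; uniformly low perplexity and low burstiness are what flag a passage as machine-generated.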
Affiliation(s)
- Gaurav Saigal
- Radiology, University of Miami Miller School of Medicine, Miami, USA
- Robert M Quencer
- Radiology, University of Miami Miller School of Medicine, Miami, USA

36
Brady AP, Allen B, Chong J, Kotter E, Kottler N, Mongan J, Oakden-Rayner L, Dos Santos DP, Tang A, Wald C, Slavotinek J. Developing, Purchasing, Implementing and Monitoring AI Tools in Radiology: Practical Considerations. A Multi-Society Statement From the ACR, CAR, ESR, RANZCR & RSNA. Can Assoc Radiol J 2024; 75:226-244. [PMID: 38251882 DOI: 10.1177/08465371231222229]
Abstract
Artificial Intelligence (AI) carries the potential for unprecedented disruption in radiology, with possible positive and negative consequences. The integration of AI in radiology holds the potential to revolutionize healthcare practices by advancing diagnosis, quantification, and management of multiple medical conditions. Nevertheless, the ever-growing availability of AI tools in radiology highlights an increasing need to critically evaluate claims for its utility and to differentiate safe product offerings from potentially harmful, or fundamentally unhelpful ones. This multi-society paper, presenting the views of Radiology Societies in the USA, Canada, Europe, Australia, and New Zealand, defines the potential practical problems and ethical issues surrounding the incorporation of AI into radiological practice. In addition to delineating the main points of concern that developers, regulators, and purchasers of AI tools should consider prior to their introduction into clinical practice, this statement also suggests methods to monitor their stability and safety in clinical use, and their suitability for possible autonomous function. This statement is intended to serve as a useful summary of the practical issues which should be considered by all parties involved in the development of radiology AI resources, and their implementation as clinical tools.
Affiliation(s)
- Bibb Allen
- Department of Radiology, Grandview Medical Center, Birmingham, AL, USA
- Data Science Institute, American College of Radiology, Reston, VA, USA
- Jaron Chong
- Department of Medical Imaging, Schulich School of Medicine and Dentistry, Western University, London, ON, Canada
- Elmar Kotter
- Department of Diagnostic and Interventional Radiology, Medical Center, Faculty of Medicine, University of Freiburg, Freiburg, Germany
- Nina Kottler
- Radiology Partners, El Segundo, CA, USA
- Stanford Center for Artificial Intelligence in Medicine & Imaging, Palo Alto, CA, USA
- John Mongan
- Department of Radiology and Biomedical Imaging, University of California, San Francisco, CA, USA
- Lauren Oakden-Rayner
- Australian Institute for Machine Learning, University of Adelaide, Adelaide, SA, Australia
- Daniel Pinto Dos Santos
- Department of Radiology, University Hospital of Cologne, Cologne, Germany
- Department of Radiology, University Hospital of Frankfurt, Frankfurt, Germany
- An Tang
- Department of Radiology, Radiation Oncology, and Nuclear Medicine, Université de Montréal, Montréal, QC, Canada
- Christoph Wald
- Department of Radiology, Lahey Hospital & Medical Center, Burlington, MA, USA
- Tufts University Medical School, Boston, MA, USA
- American College of Radiology, Reston, VA, USA
- John Slavotinek
- South Australia Medical Imaging, Flinders Medical Centre, Adelaide, SA, Australia
- College of Medicine and Public Health, Flinders University, Adelaide, SA, Australia

37
Simms RC. Work With ChatGPT, Not Against: 3 Teaching Strategies That Harness the Power of Artificial Intelligence. Nurse Educ 2024; 49:158-161. [PMID: 38502607 DOI: 10.1097/nne.0000000000001634]
Abstract
BACKGROUND Technological advances have expanded nursing education to include generative artificial intelligence (AI) tools such as ChatGPT. PROBLEM Generative AI tools challenge academic integrity, complicate the validation of information accuracy, and require strategies to ensure the credibility of AI-generated information. APPROACH This article presents a dual-purpose approach that integrates AI tools into prelicensure nursing education to enhance learning while promoting critical evaluation skills. Constructivist theories and Vygotsky's Zone of Proximal Development framework support this integration, with AI serving as a scaffold for developing critical thinking. OUTCOMES The approach involves practical activities in which students engage critically with AI-generated content, thereby reinforcing clinical judgment and preparing them for AI-prevalent health care environments. CONCLUSIONS Incorporating AI tools such as ChatGPT into nursing curricula represents a strategic educational advancement, equipping students with essential skills to navigate modern health care.
Affiliation(s)
- Rachel Cox Simms
- Author Affiliation: Assistant Professor, School of Nursing, MGH Institute of Health Professions, Boston, Massachusetts
38
Schlussel L, Samaan JS, Chan Y, Chang B, Yeo YH, Ng WH, Rezaie A. Evaluating the accuracy and reproducibility of ChatGPT-4 in answering patient questions related to small intestinal bacterial overgrowth. Artif Intell Gastroenterol 2024; 5:90503. [DOI: 10.35712/aig.v5.i1.90503]
Abstract
BACKGROUND Small intestinal bacterial overgrowth (SIBO) poses diagnostic and treatment challenges due to its complex management and evolving guidelines. Patients often seek online information related to their health, prompting interest in large language models, like GPT-4, as potential sources of patient education.
AIM To investigate ChatGPT-4's accuracy and reproducibility in responding to patient questions related to SIBO.
METHODS A total of 27 patient questions related to SIBO were curated from professional societies, Facebook groups, and Reddit threads. Each question was entered into GPT-4 twice, on separate days, to examine the reproducibility of accuracy across occasions. GPT-4-generated responses were independently evaluated for accuracy and reproducibility by two motility fellowship-trained gastroenterologists; a third senior fellowship-trained gastroenterologist resolved disagreements. The accuracy of responses was graded using the following scale: (1) Comprehensive; (2) Correct but inadequate; (3) Some correct and some incorrect; or (4) Completely incorrect.
RESULTS In evaluating GPT-4's effectiveness at answering SIBO-related questions, it provided responses with correct information to 18/27 (66.7%) of questions, with 16/27 (59.3%) of responses graded as comprehensive and 2/27 (7.4%) responses graded as correct but inadequate. The model provided responses with incorrect information to 9/27 (33.3%) of questions, with 4/27 (14.8%) of responses graded as completely incorrect and 5/27 (18.5%) of responses graded as mixed correct and incorrect data. Accuracy varied by question category, with questions related to “basic knowledge” achieving the highest proportion of comprehensive responses (90%) and no incorrect responses. On the other hand, the “treatment” related questions yielded the lowest proportion of comprehensive responses (33.3%) and highest percent of completely incorrect responses (33.3%). A total of 77.8% of questions yielded reproducible responses.
CONCLUSION Though GPT-4 shows promise as a supplementary tool for SIBO-related patient education, the model requires further refinement and validation in subsequent iterations prior to its integration into patient care.
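The tallies above come from grading two responses per question on a four-point scale and checking whether the two runs agree. A minimal sketch of that bookkeeping, with an invented grade list and one plausible operationalization of "reproducible" (both runs fall on the same side of the correct/incorrect divide), is:

```python
# Sketch of the grading tally described above. Grades are invented, and the
# reproducibility rule is an assumed operationalization, not the study's code.
from collections import Counter

CORRECT = {1, 2}  # 1 = comprehensive, 2 = correct but inadequate

def summarize(grades):
    """grades: list of (run1_grade, run2_grade) per question.
    Returns the first-run grade distribution and the reproducibility rate."""
    dist = Counter(g1 for g1, _ in grades)
    reproducible = sum(
        1 for g1, g2 in grades if (g1 in CORRECT) == (g2 in CORRECT)
    )
    return dist, reproducible / len(grades)

# Hypothetical grades for six questions, two GPT-4 runs each.
grades = [(1, 1), (1, 2), (2, 1), (3, 3), (4, 1), (1, 1)]
dist, repro_rate = summarize(grades)
print(dist, repro_rate)
```

Dividing `dist` counts by the number of questions reproduces proportions of the kind reported in the abstract (e.g. 16/27 comprehensive, 77.8% reproducible).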
Affiliation(s)
- Lauren Schlussel
- Division of Gastroenterology and Hepatology, Cedars-Sinai Medical Center, Los Angeles, CA 90048, United States
- Jamil S Samaan
- Division of Gastroenterology and Hepatology, Cedars-Sinai Medical Center, Los Angeles, CA 90048, United States
- Yin Chan
- Division of Gastroenterology and Hepatology, Cedars-Sinai Medical Center, Los Angeles, CA 90048, United States
- Bianca Chang
- Division of Gastroenterology and Hepatology, Cedars-Sinai Medical Center, Los Angeles, CA 90048, United States
- Yee Hui Yeo
- Division of Gastroenterology and Hepatology, Cedars-Sinai Medical Center, Los Angeles, CA 90048, United States
- Wee Han Ng
- Bristol Medical School, University of Bristol, BS8 1TH, Bristol, United Kingdom
- Ali Rezaie
- Division of Gastroenterology and Hepatology, Cedars-Sinai Medical Center, Los Angeles, CA 90048, United States
- Medically Associated Science and Technology Program, Cedars-Sinai Medical Center, Los Angeles, CA 90048, United States

39
Shiraishi M, Tomioka Y, Miyakuni A, Ishii S, Hori A, Park H, Ohba J, Okazaki M. Performance of ChatGPT in Answering Clinical Questions on the Practical Guideline of Blepharoptosis. Aesthetic Plast Surg 2024. [PMID: 38684536 DOI: 10.1007/s00266-024-04005-1]
Abstract
BACKGROUND ChatGPT is a free artificial intelligence (AI) language model developed and released by OpenAI in late 2022. This study aimed to evaluate how accurately ChatGPT answers clinical questions (CQs) on the Guideline for the Management of Blepharoptosis published by the American Society of Plastic Surgeons (ASPS) in 2022. METHODS CQs in the guideline were used as question sources in both English and Japanese. For each question, ChatGPT's output was assessed for the answer to the CQ, evidence quality, recommendation strength, reference match, and word count. We compared the performance of ChatGPT on each component between English and Japanese queries. RESULTS A total of 11 questions were included in the final analysis, and ChatGPT answered 61.3% of them correctly. ChatGPT demonstrated a higher accuracy rate for English CQs than for Japanese CQs (76.4% versus 46.4%; p = 0.004) and produced longer answers in English (123 words versus 35.9 words; p = 0.004). No statistical differences were noted for evidence quality, recommendation strength, or reference match. A total of 697 references were proposed, but only 216 of them (31.0%) existed. CONCLUSIONS ChatGPT demonstrates potential as an adjunctive tool in the management of blepharoptosis. However, it is crucial to recognize that the existing AI model has distinct limitations, and its primary role should be to complement the expertise of medical professionals. LEVEL OF EVIDENCE V Observational study under respected authorities. This journal requires that authors assign a level of evidence to each article. For a full description of these Evidence-Based Medicine ratings, please refer to the Table of Contents or the online Instructions to Authors at www.springer.com/00266.
Affiliation(s)
- Makoto Shiraishi
- Department of Plastic and Reconstructive Surgery, The University of Tokyo Hospital, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8655, Japan.
- Yoko Tomioka
- Department of Plastic and Reconstructive Surgery, The University of Tokyo Hospital, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8655, Japan
- Ami Miyakuni
- Department of Plastic and Reconstructive Surgery, The University of Tokyo Hospital, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8655, Japan
- Saaya Ishii
- Department of Plastic and Reconstructive Surgery, The University of Tokyo Hospital, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8655, Japan
- Asei Hori
- Department of Plastic and Reconstructive Surgery, The University of Tokyo Hospital, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8655, Japan
- Hwayoung Park
- Department of Plastic and Reconstructive Surgery, The University of Tokyo Hospital, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8655, Japan
- Jun Ohba
- Department of Plastic and Reconstructive Surgery, The University of Tokyo Hospital, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8655, Japan
- Mutsumi Okazaki
- Department of Plastic and Reconstructive Surgery, The University of Tokyo Hospital, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8655, Japan

40
Raman R, Lathabai HH, Mandal S, Das P, Kaur T, Nedungadi P. ChatGPT: Literate or intelligent about UN sustainable development goals? PLoS One 2024; 19:e0297521. [PMID: 38656952 PMCID: PMC11042716 DOI: 10.1371/journal.pone.0297521]
Abstract
Generative AI tools, such as ChatGPT, are progressively transforming numerous sectors, demonstrating a capacity to impact human life dramatically. This research seeks to evaluate the UN Sustainable Development Goals (SDG) literacy of ChatGPT, which is crucial for the diverse stakeholders involved in SDG-related policies. Experimental outcomes from two widely used sustainability assessment tests, the UN SDG Fitness Test and the Sustainability Literacy Test (SULITEST), suggest that ChatGPT exhibits high SDG literacy, yet its comprehensive SDG intelligence needs further exploration. The Fitness Test gauges eight vital competencies across introductory, intermediate, and advanced levels, and accurate mapping of these competencies to the test questions is essential for even a partial evaluation of SDG intelligence. To assess SDG intelligence, the questions from both tests were mapped to the 17 SDGs and to eight cross-cutting SDG core competencies, but both questionnaires were found to be insufficient: SULITEST could satisfactorily map only 5 of the 8 competencies, whereas the Fitness Test managed 6 of 8. Both tests also fell short in their coverage of the 17 SDGs; most SDGs were underrepresented in both instruments, and certain SDGs were not represented at all. Consequently, both tools proved ineffective for assessing SDG intelligence through SDG coverage. The study recommends that future versions of ChatGPT enhance competencies such as collaboration, critical thinking, and systems thinking to help achieve the SDGs. It concludes that while AI models like ChatGPT hold considerable potential in sustainable development, their usage must be approached carefully, considering current limitations and ethical implications.
Affiliation(s)
- Raghu Raman
- Amrita School of Business, Amrita Vishwa Vidyapeetham, Amritapuri, Kerala, India
- Santanu Mandal
- Amrita School of Business, Amaravati, Andhra Pradesh, India
- Payel Das
- Amrita School of Business, Amaravati, Andhra Pradesh, India
- Tavleen Kaur
- Fortune Institute of International Business, New Delhi, India

41
Wu C, Chen L, Han M, Li Z, Yang N, Yu C. Application of ChatGPT-based blended medical teaching in clinical education of hepatobiliary surgery. MEDICAL TEACHER 2024:1-5. [PMID: 38614458 DOI: 10.1080/0142159x.2024.2339412]
Abstract
OBJECTIVE This study evaluates the effectiveness of incorporating the Chat Generative Pre-trained Transformer (ChatGPT) into the clinical teaching of hepatobiliary surgery for undergraduate medical students. MATERIALS AND METHODS A group of 61 medical undergraduates from the Affiliated Hospital of Guizhou Medical University, undergoing hepatobiliary surgery training, were randomly assigned to either an experimental group (31 students) using ChatGPT-based blended teaching or a control group (30 students) with traditional teaching methods. The evaluation metrics included final exam scores, teaching satisfaction, and teaching effectiveness ratings, analyzed using SPSS 26.0 (SPSS Inc., Chicago, IL) with t-tests and χ2 tests. RESULTS The experimental group significantly outperformed the control group in final exam theoretical scores (86.44 ± 5.59 vs. 77.86 ± 4.16, p < .001) and clinical skills scores (83.84 ± 6.13 vs. 79.12 ± 4.27, p = .001). Additionally, the experimental group reported higher teaching satisfaction (17.23 ± 1.33) and self-evaluation of teaching effectiveness (9.14 ± 0.54) compared to the control group (15.38 ± 1.5 and 8.46 ± 0.70, respectively, p < .001). CONCLUSIONS The integration of ChatGPT into hepatobiliary surgery education significantly enhances theoretical knowledge, clinical skills, and overall satisfaction among medical undergraduates, suggesting a beneficial impact on their educational development.
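The score comparisons above are independent two-sample t-tests (run in SPSS in the study). A minimal sketch of the pooled-variance Student's t statistic, on invented exam scores rather than the study's data, is:

```python
# Sketch: pooled-variance two-sample t statistic for comparing exam scores
# between an experimental and a control group. Scores below are invented.
from statistics import mean, variance

def pooled_t(a, b):
    """Student's t for independent samples, assuming equal variances."""
    na, nb = len(a), len(b)
    # Pooled sample variance with na + nb - 2 degrees of freedom.
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / (sp2 * (1 / na + 1 / nb)) ** 0.5

# Hypothetical theory-exam scores for two small groups.
experimental = [86, 88, 84, 90, 85]
control = [78, 80, 76, 79, 77]
print(round(pooled_t(experimental, control), 2))
```

The p-value then comes from the t distribution with na + nb - 2 degrees of freedom, typically via a table or a statistics package.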
Affiliation(s)
- Changhao Wu
- Department of Hepatobiliary Surgery, The Affiliated Hospital of Guizhou Medical University, Guizhou Medical University, Guiyang, China
- Department of Surgery, Guizhou Medical University, Guiyang, China
- College of Clinical Medicine, Guizhou Medical University, Guiyang, China
- Guizhou Provincial Institute of Hepatobiliary, Pancreatic and Splenic Diseases, Guiyang, China
- Liwen Chen
- Department of Hepatobiliary Surgery, The Affiliated Hospital of Guizhou Medical University, Guizhou Medical University, Guiyang, China
- Department of Surgery, Guizhou Medical University, Guiyang, China
- College of Clinical Medicine, Guizhou Medical University, Guiyang, China
- Guizhou Provincial Institute of Hepatobiliary, Pancreatic and Splenic Diseases, Guiyang, China
- Min Han
- Department of Hepatobiliary Surgery, The Affiliated Hospital of Guizhou Medical University, Guizhou Medical University, Guiyang, China
- Department of Surgery, Guizhou Medical University, Guiyang, China
- College of Clinical Medicine, Guizhou Medical University, Guiyang, China
- Guizhou Provincial Institute of Hepatobiliary, Pancreatic and Splenic Diseases, Guiyang, China
- Zhu Li
- Department of Hepatobiliary Surgery, The Affiliated Hospital of Guizhou Medical University, Guizhou Medical University, Guiyang, China
- Department of Surgery, Guizhou Medical University, Guiyang, China
- College of Clinical Medicine, Guizhou Medical University, Guiyang, China
- Guizhou Provincial Institute of Hepatobiliary, Pancreatic and Splenic Diseases, Guiyang, China
- Nenghong Yang
- Department of Hepatobiliary Surgery, The Affiliated Hospital of Guizhou Medical University, Guizhou Medical University, Guiyang, China
- Department of Surgery, Guizhou Medical University, Guiyang, China
- College of Clinical Medicine, Guizhou Medical University, Guiyang, China
- Guizhou Provincial Institute of Hepatobiliary, Pancreatic and Splenic Diseases, Guiyang, China
- Chao Yu
- Department of Hepatobiliary Surgery, The Affiliated Hospital of Guizhou Medical University, Guizhou Medical University, Guiyang, China
- Department of Surgery, Guizhou Medical University, Guiyang, China
- College of Clinical Medicine, Guizhou Medical University, Guiyang, China
- Guizhou Provincial Institute of Hepatobiliary, Pancreatic and Splenic Diseases, Guiyang, China

42
Wu Y, Zheng Y, Feng B, Yang Y, Kang K, Zhao A. Embracing ChatGPT for Medical Education: Exploring Its Impact on Doctors and Medical Students. JMIR MEDICAL EDUCATION 2024; 10:e52483. [PMID: 38598263 PMCID: PMC11043925 DOI: 10.2196/52483]
Abstract
ChatGPT (OpenAI), a cutting-edge natural language processing model, holds immense promise for revolutionizing medical education. With its remarkable performance in language-related tasks, ChatGPT offers personalized and efficient learning experiences for medical students and doctors. Through training, it enhances clinical reasoning and decision-making skills, leading to improved case analysis and diagnosis. The model facilitates simulated dialogues, intelligent tutoring, and automated question-answering, enabling the practical application of medical knowledge. However, integrating ChatGPT into medical education raises ethical and legal concerns. Safeguarding patient data and adhering to data protection regulations are critical. Transparent communication with students, physicians, and patients is essential to ensure their understanding of the technology's purpose and implications, as well as the potential risks and benefits. Maintaining a balance between personalized learning and face-to-face interactions is crucial to avoid hindering critical thinking and communication skills. Despite challenges, ChatGPT offers transformative opportunities. Integrating it with problem-based learning, team-based learning, and case-based learning methodologies can further enhance medical education. With proper regulation and supervision, ChatGPT can contribute to a well-rounded learning environment, nurturing skilled and knowledgeable medical professionals ready to tackle health care challenges. By emphasizing ethical considerations and human-centric approaches, ChatGPT's potential can be fully harnessed in medical education, benefiting both students and patients alike.
Affiliation(s)
- Yijun Wu
- Cancer Center, West China Hospital, Sichuan University, Chengdu, China
- Laboratory of Clinical Cell Therapy, West China Hospital, Sichuan University, Chengdu, China
- Yue Zheng
- Cancer Center, West China Hospital, Sichuan University, Chengdu, China
- Laboratory of Clinical Cell Therapy, West China Hospital, Sichuan University, Chengdu, China
- Baijie Feng
- West China School of Medicine, Sichuan University, Chengdu, China
- Yuqi Yang
- West China School of Medicine, Sichuan University, Chengdu, China
- Kai Kang
- Cancer Center, West China Hospital, Sichuan University, Chengdu, China
- Laboratory of Clinical Cell Therapy, West China Hospital, Sichuan University, Chengdu, China
- Ailin Zhao
- Department of Hematology, West China Hospital, Sichuan University, Chengdu, China

43
Bhayana R, Biswas S, Cook TS, Kim W, Kitamura FC, Gichoya J, Yi PH. From Bench to Bedside With Large Language Models: AJR Expert Panel Narrative Review. AJR Am J Roentgenol 2024. [PMID: 38598354 DOI: 10.2214/ajr.24.30928]
Abstract
Large language models (LLMs) hold immense potential to revolutionize radiology. However, their integration into practice requires careful consideration. Artificial intelligence (AI) chatbots and general-purpose LLMs have potential pitfalls related to privacy, transparency, and accuracy, limiting their current clinical readiness. Thus, LLM-based tools must be optimized for radiology practice to overcome these limitations. While research and validation for radiology applications remain in their infancy, commercial products incorporating LLMs are becoming available alongside promises of transforming practice. To help radiologists navigate this landscape, this AJR Expert Panel Narrative Review provides a multidimensional perspective on LLMs, encompassing considerations from bench (development and optimization) to bedside (use in practice). At present, LLMs are not autonomous entities that can replace expert decision-making, and radiologists remain responsible for the content of their reports. Patient-facing tools, particularly medical AI chatbots, require additional guardrails to ensure safety and prevent misuse. Still, if responsibly implemented, LLMs are well-positioned to transform efficiency and quality in radiology. Radiologists must be well-informed and proactively involved in guiding the implementation of LLMs in practice to mitigate risks and maximize benefits to patient care.
Affiliation(s)
- Rajesh Bhayana
- University Medical Imaging Toronto, Joint Department of Medical Imaging, University Health Network, University of Toronto, Toronto, ON, Canada
- Som Biswas
- Department of Radiology, Le Bonheur Children's Hospital, University of Tennessee Health Science Center, Memphis, TN, USA
- Tessa S Cook
- Department of Radiology, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA, USA
- Woojin Kim
- Department of Radiology, Palo Alto VA Medical Center, Palo Alto, CA, USA
- Felipe C Kitamura
- Department of Diagnostic Imaging, Universidade Federal de São Paulo, São Paulo, Brazil
- Dasa, São Paulo, Brazil
- Judy Gichoya
- Department of Radiology, Emory University School of Medicine, Atlanta, GA, USA
- Paul H Yi
- Department of Diagnostic Radiology and Nuclear Medicine, University of Maryland School of Medicine, Baltimore, MD, USA

44
Gande S, Gould M, Ganti L. Bibliometric analysis of ChatGPT in medicine. Int J Emerg Med 2024; 17:50. [PMID: 38575866 PMCID: PMC10993428 DOI: 10.1186/s12245-024-00624-2]
Abstract
INTRODUCTION The emergence of artificial intelligence (AI) chat programs has opened two distinct paths, one enhancing interaction and another potentially replacing personal understanding. Ethical and legal concerns arise from the rapid development of these programs. This paper investigates academic discussions of AI in medicine, analyzing the context, frequency, and reasons behind these conversations. METHODS The study collected data from the Web of Science database on articles containing the keyword "ChatGPT" published from January to September 2023, resulting in 786 medically related journal articles. The inclusion criteria were peer-reviewed articles in English related to medicine. RESULTS The United States led in publications (38.1%), followed by India (15.5%) and China (7.0%). Keywords such as "patient" (16.7%), "research" (12%), and "performance" (10.6%) were prevalent. The Cureus Journal of Medical Science (11.8%) had the most publications, followed by the Annals of Biomedical Engineering (8.3%). August 2023 had the highest number of publications (29.3%), with significant growth from February to March and from April to May. Medicine, General & Internal (21.0%) was the most common category, followed by Surgery (15.4%) and Radiology (7.9%). DISCUSSION The prominence of India in ChatGPT research, despite lower research funding, indicates the platform's popularity and highlights the importance of monitoring its use for potential medical misinformation. China's interest in ChatGPT research suggests a focus on natural language processing (NLP) AI applications, despite public bans on the platform. Cureus' success in publishing ChatGPT articles can be attributed to its open-access, rapid-publication model. The study identifies research trends in plastic surgery, radiology, and obstetrics and gynecology, emphasizing the need for ethical considerations and reliability assessments in the application of ChatGPT to medical practice.
CONCLUSION ChatGPT's presence in medical literature is growing rapidly across various specialties, but concerns related to safety, privacy, and accuracy persist. More research is needed to assess its suitability for patient care and implications for non-medical use. Skepticism and thorough review of research are essential, as current studies may face retraction as more information emerges.
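The keyword percentages reported above come from simple frequency counts over the retrieved records. A minimal sketch of that kind of tally, using made-up records in place of the Web of Science export (the study's actual data and tooling are not described in the abstract):

```python
from collections import Counter

# Hypothetical keyword lists standing in for Web of Science records;
# the real study analyzed 786 ChatGPT-related medical articles.
article_keywords = [
    ["patient", "performance", "chatgpt"],
    ["research", "patient", "education"],
    ["performance", "radiology"],
]

# Tally every keyword occurrence across all records.
counts = Counter(kw for kws in article_keywords for kw in kws)
total = sum(counts.values())

# Report each keyword's share of all occurrences, most frequent first.
for kw, n in counts.most_common(3):
    print(f"{kw}: {n / total:.1%}")
```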
Affiliation(s)
- Latha Ganti
- University of Central Florida, Orlando, FL, USA.
- Warren Alpert Medical School of Brown University, Providence, RI, USA.
45
Alanezi F. Examining the role of ChatGPT in promoting health behaviors and lifestyle changes among cancer patients. Nutr Health 2024:2601060241244563. [PMID: 38567408] [DOI: 10.1177/02601060241244563]
Abstract
Purpose: This study investigates the role of ChatGPT in promoting health behavior changes among cancer patients. Methods: A quasi-experimental design with a qualitative approach was adopted, as ChatGPT is a novel technology that many people are unaware of. Participants were outpatients at a public hospital. In the experiment, participants used ChatGPT to seek cancer-related information for two weeks, followed by focus group (FG) discussions. A total of 72 outpatients participated in ten focus groups. Results: Three main themes with 14 sub-themes were identified, reflecting the role of ChatGPT in promoting health behavior changes. Its most prominent role was in developing health literacy and promoting self-management of conditions through emotional, informational, and motivational support. Three challenges were identified: privacy, lack of personalization, and reliability. Conclusion: Although ChatGPT has considerable potential to promote health behavior changes among cancer patients, this potential is limited by several factors, including regulatory, reliability, and privacy issues. Further evidence is needed to generalize the results across regions.
Affiliation(s)
- Fahad Alanezi
- College of Business Administration, Department of Management Information Systems, Imam Abdulrahman Bin Faisal University, Dammam, Saudi Arabia
46
Gertz RJ, Dratsch T, Bunck AC, Lennartz S, Iuga AI, Hellmich MG, Persigehl T, Pennig L, Gietzen CH, Fervers P, Maintz D, Hahnfeldt R, Kottlors J. Potential of GPT-4 for Detecting Errors in Radiology Reports: Implications for Reporting Accuracy. Radiology 2024; 311:e232714. [PMID: 38625012] [DOI: 10.1148/radiol.232714]
Abstract
Background Errors in radiology reports may occur because of resident-to-attending discrepancies, speech recognition inaccuracies, and high workload. Large language models, such as GPT-4 (ChatGPT; OpenAI), may assist in generating reports. Purpose To assess the effectiveness of GPT-4 in identifying common errors in radiology reports, focusing on performance, time, and cost-efficiency. Materials and Methods In this retrospective study, 200 radiology reports (radiography and cross-sectional imaging [CT and MRI]) were compiled between June 2023 and December 2023 at one institution. There were 150 errors from five common error categories (omission, insertion, spelling, side confusion, and other) intentionally inserted into 100 of the reports and used as the reference standard. Six radiologists (two senior radiologists, two attending physicians, and two residents) and GPT-4 were tasked with detecting these errors. Overall error detection performance, error detection in the five error categories, and reading time were assessed using Wald χ2 tests and paired-sample t tests. Results GPT-4 (detection rate, 82.7%; 124 of 150; 95% CI: 75.8, 87.9) matched the average detection performance of radiologists independent of their experience (senior radiologists, 89.3% [134 of 150; 95% CI: 83.4, 93.3]; attending physicians, 80.0% [120 of 150; 95% CI: 72.9, 85.6]; residents, 80.0% [120 of 150; 95% CI: 72.9, 85.6]; P value range, .522-.99). One senior radiologist outperformed GPT-4 (detection rate, 94.7%; 142 of 150; 95% CI: 89.8, 97.3; P = .006). GPT-4 required less processing time per radiology report than the fastest human reader in the study (mean reading time, 3.5 seconds ± 0.5 [SD] vs 25.1 seconds ± 20.1, respectively; P < .001; Cohen d = -1.08). The use of GPT-4 resulted in lower mean correction cost per report than the most cost-efficient radiologist ($0.03 ± 0.01 vs $0.42 ± 0.41; P < .001; Cohen d = -1.12).
Conclusion The radiology report error detection rate of GPT-4 was comparable with that of radiologists, potentially reducing work hours and cost. © RSNA, 2024 See also the editorial by Forman in this issue.
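The detection rates above carry 95% confidence intervals. The abstract does not state which CI method the authors used, but the bounds reported for GPT-4 (124 of 150; 75.8, 87.9) are reproduced by a Wilson score interval, sketched here purely as an illustration:

```python
import math

def wilson_ci(successes, n, z=1.96):
    """Wilson score confidence interval for a binomial proportion (z=1.96 -> ~95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return center - half, center + half

# GPT-4 detected 124 of 150 inserted errors.
lo, hi = wilson_ci(124, 150)
print(f"{124/150:.1%} (95% CI: {lo:.1%}, {hi:.1%})")  # → 82.7% (95% CI: 75.8%, 87.9%)
```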
Affiliation(s)
- From the Institute of Diagnostic and Interventional Radiology (R.J.G., T.D., A.C.B., S.L., A.I.I., T.P., L.P., C.H.G., P.F., D.M., R.H., J.K.) and Institute of Medical Statistics and Bioinformatics (M.G.H.), Faculty of Medicine, University Hospital Cologne, University of Cologne, Kerpener Strasse 62, 50937 Cologne, Germany
47
van Diessen E, van Amerongen RA, Zijlmans M, Otte WM. Potential merits and flaws of large language models in epilepsy care: A critical review. Epilepsia 2024; 65:873-886. [PMID: 38305763] [DOI: 10.1111/epi.17907]
Abstract
The current pace of development and application of large language models (LLMs) is unprecedented and will significantly affect future medical care. In this critical review, we provide the background needed to understand these novel artificial intelligence (AI) models and how LLMs could be used in the daily care of people with epilepsy. Given the importance of clinical history taking in diagnosing and monitoring epilepsy, combined with the established use of electronic health records, great potential exists to integrate LLMs into epilepsy care. We present the currently available LLM studies in epilepsy. Furthermore, we highlight and compare the most commonly used LLMs and elaborate on how these models can be applied in epilepsy. We further discuss important drawbacks and risks of LLMs, and we provide recommendations for overcoming these limitations.
Affiliation(s)
- Eric van Diessen
- Department of Child Neurology, UMC Utrecht Brain Center, University Medical Center Utrecht and Utrecht University, Utrecht, The Netherlands
- Department of Pediatrics, Franciscus Gasthuis & Vlietland, Rotterdam, The Netherlands
- Ramon A van Amerongen
- Faculty of Science, Bioinformatics and Biocomplexity, Utrecht University, Utrecht, The Netherlands
- Maeike Zijlmans
- Department of Neurology and Neurosurgery, UMC Utrecht Brain Center, University Medical Center Utrecht and Utrecht University, Utrecht, The Netherlands
- Stichting Epilepsie Instellingen Nederland, Heemstede, The Netherlands
- Willem M Otte
- Department of Child Neurology, UMC Utrecht Brain Center, University Medical Center Utrecht and Utrecht University, Utrecht, The Netherlands
48
Mira FA, Favier V, Dos Santos Sobreira Nunes H, de Castro JV, Carsuzaa F, Meccariello G, Vicini C, De Vito A, Lechien JR, Chiesa-Estomba C, Maniaci A, Iannella G, Rojas EP, Cornejo JB, Cammaroto G. Chat GPT for the management of obstructive sleep apnea: do we have a polar star? Eur Arch Otorhinolaryngol 2024; 281:2087-2093. [PMID: 37980605] [DOI: 10.1007/s00405-023-08270-9]
Abstract
PURPOSE This study explores the potential of the Chat-Generative Pre-Trained Transformer (Chat-GPT), a large language model (LLM), to assist healthcare professionals in the diagnosis of obstructive sleep apnea (OSA). It assesses the agreement between Chat-GPT's responses and those of expert otolaryngologists, shedding light on the role of AI-generated content in medical decision-making. METHODS A prospective, cross-sectional study was conducted involving 350 otolaryngologists from 25 countries who responded to a specialized OSA survey. Chat-GPT was tasked with answering the same survey questions. Responses were rated by super-experts and statistically analyzed for agreement. RESULTS Chat-GPT and expert responses shared a common answer in over 75% of cases for individual questions; however, overall consensus was achieved for only four questions. Super-expert assessments showed a moderate level of agreement, with Chat-GPT scoring slightly lower than the experts. Statistically, Chat-GPT's responses differed significantly from the experts' opinions (p = 0.0009). Sub-analysis revealed areas for improvement, particularly in questions where super-experts rated Chat-GPT's responses lower than the expert consensus. CONCLUSIONS Chat-GPT shows potential as a valuable resource for OSA diagnosis, especially where access to specialists is limited. The study emphasizes the importance of AI-human collaboration, with Chat-GPT serving as a complementary tool rather than a replacement for medical professionals. This research contributes to the discourse in otolaryngology and encourages further exploration of AI-driven healthcare applications. While Chat-GPT exhibits a commendable level of consensus with expert responses, ongoing refinement of AI-based healthcare tools holds significant promise for addressing the underdiagnosis and undertreatment of OSA and improving patient outcomes.
Affiliation(s)
- Felipe Ahumada Mira
- ENT Department, Hospital of Linares, Linares, Chile
- Young Otolaryngologists-International Federations of Oto-Rhinolaryngological Societies (YO-IFOS), Paris, France
- Valentin Favier
- ENT Department, University Hospital of Montpellier, Montpellier, France
- Young Otolaryngologists-International Federations of Oto-Rhinolaryngological Societies (YO-IFOS), Paris, France
- Heloisa Dos Santos Sobreira Nunes
- ENT and Sleep Medicine Department, Nucleus of Otolaryngology, Head and Neck Surgery and Sleep Medicine of São Paulo, São Paulo, Brazil
- Young Otolaryngologists-International Federations of Oto-Rhinolaryngological Societies (YO-IFOS), Paris, France
- Joana Vaz de Castro
- ENT Department, Armed Forces Hospital, Lisbon, Portugal
- Young Otolaryngologists-International Federations of Oto-Rhinolaryngological Societies (YO-IFOS), Paris, France
- Florent Carsuzaa
- ENT Department, University Hospital of Poitiers, Poitiers, France
- Young Otolaryngologists-International Federations of Oto-Rhinolaryngological Societies (YO-IFOS), Paris, France
- Giuseppe Meccariello
- Head and Neck Department, ENT & Oral Surgery Unit, G.B. Morgagni, L. Pierantoni Hospital, Via Forlanini, 47121, Forlì, Italy
- Claudio Vicini
- Head and Neck Department, ENT & Oral Surgery Unit, G.B. Morgagni, L. Pierantoni Hospital, Via Forlanini, 47121, Forlì, Italy
- Andrea De Vito
- Head and Neck Department, ENT & Oral Surgery Unit, G.B. Morgagni, L. Pierantoni Hospital, Via Forlanini, 47121, Forlì, Italy
- Jerome R Lechien
- Division of Laryngology and Broncho-Esophagology, Department of Otolaryngology and Head and Neck Surgery, EpiCURA Hospital, UMONS Research Institute for Health Sciences and Technology, University of Mons, Mons, Belgium
- Young Otolaryngologists-International Federations of Oto-Rhinolaryngological Societies (YO-IFOS), Paris, France
- Carlos Chiesa-Estomba
- Department of Otorhinolaryngology, Biodonostia Research Institute, Donostia University Hospital, Osakidetza, 20014, San Sebastian, Spain
- Young Otolaryngologists-International Federations of Oto-Rhinolaryngological Societies (YO-IFOS), Paris, France
- Antonino Maniaci
- Department of Medical and Surgical Sciences and Advanced Technologies "GF Ingrassia", ENT Section, University of Catania, Piazza Università 2, 95100, Catania, Italy
- Young Otolaryngologists-International Federations of Oto-Rhinolaryngological Societies (YO-IFOS), Paris, France
- Giannicola Iannella
- Department of 'Organi di Senso', University "Sapienza", Viale Dell'Università 33, 00185, Rome, Italy
- Young Otolaryngologists-International Federations of Oto-Rhinolaryngological Societies (YO-IFOS), Paris, France
- Giovanni Cammaroto
- Head and Neck Department, ENT & Oral Surgery Unit, G.B. Morgagni, L. Pierantoni Hospital, Via Forlanini, 47121, Forlì, Italy.
- Young Otolaryngologists-International Federations of Oto-Rhinolaryngological Societies (YO-IFOS), Paris, France.
49
Ni Z, Peng R, Zheng X, Xie P. Embracing the future: Integrating ChatGPT into China's nursing education system. Int J Nurs Sci 2024; 11:295-299. [PMID: 38707690] [PMCID: PMC11064564] [DOI: 10.1016/j.ijnss.2024.03.006]
Abstract
This article examines the role of ChatGPT within the rapidly evolving field of artificial intelligence, highlighting its significant potential in nursing education. It first presents the notable advances ChatGPT has achieved in facilitating interactive learning and providing real-time feedback, along with the academic community's growing interest in the technology. It then summarizes research on ChatGPT's applications in nursing education across various clinical disciplines and scenarios, showcasing its potential for multidisciplinary education and for addressing clinical issues. Comparing the performance of several large language models (LLMs) on China's National Nursing Licensure Examination, we observed that ChatGPT achieved a higher accuracy rate than its counterparts, providing a solid theoretical foundation for its application in Chinese nursing education and clinical settings. Educational institutions should establish a targeted and effective regulatory framework to leverage ChatGPT in localized nursing education while assuming the corresponding responsibilities. Through standardized user training and adjustments to existing educational assessment methods aimed at preventing potential misuse and abuse, the full potential of ChatGPT as an innovative auxiliary tool in China's nursing education system can be realized, in line with the needs of modern teaching methodologies.
Affiliation(s)
- Zhengxin Ni
- School of Nursing, Yangzhou University, Yangzhou, China
- Rui Peng
- Department of Bone and Joint Surgery and Sports Medicine Center, The First Affiliated Hospital of Jinan University, Guangzhou, China
- Xiaofei Zheng
- Department of Bone and Joint Surgery and Sports Medicine Center, The First Affiliated Hospital of Jinan University, Guangzhou, China
- Ping Xie
- Department of External Cooperation, Northern Jiangsu People’s Hospital, Nanjing, China
50
Bajaj S, Gandhi D, Nayar D. Potential Applications and Impact of ChatGPT in Radiology. Acad Radiol 2024; 31:1256-1261. [PMID: 37802673] [DOI: 10.1016/j.acra.2023.08.039]
Abstract
Radiology has always gone hand-in-hand with technology, and artificial intelligence (AI) is not new to the field. Various AI devices and algorithms have already been integrated into daily clinical radiology practice, with applications ranging from scheduling patient appointments to detecting and diagnosing certain clinical conditions on imaging, and the use of natural language processing and large language model-based software has long been under discussion. Algorithms like ChatGPT can help improve patient outcomes, increase the efficiency of radiology interpretation, and aid the overall workflow of radiologists; here we discuss some of its potential applications.
Affiliation(s)
- Suryansh Bajaj
- Department of Radiology, University of Arkansas for Medical Sciences, Little Rock, Arkansas 72205 (S.B.)
- Darshan Gandhi
- Department of Diagnostic Radiology, University of Tennessee Health Science Center, Memphis, Tennessee 38103 (D.G.).
- Divya Nayar
- Department of Neurology, University of Arkansas for Medical Sciences, Little Rock, Arkansas 72205 (D.N.)