1. Madaudo C, Parlati ALM, Di Lisi D, Carluccio R, Sucato V, Vadalà G, Nardi E, Macaione F, Cannata A, Manzullo N, Santoro C, Iervolino A, D'Angelo F, Marzano F, Basile C, Gargiulo P, Corrado E, Paolillo S, Novo G, Galassi AR, Filardi PP. Artificial intelligence in cardiology: a peek at the future and the role of ChatGPT in cardiology practice. J Cardiovasc Med (Hagerstown) 2024; 25:766-771. PMID: 39347723. DOI: 10.2459/jcm.0000000000001664.
Abstract
Artificial intelligence has increasingly become an integral part of our daily activities. ChatGPT, a natural language processing technology developed by OpenAI, is widely used in various industries, including healthcare. The application of ChatGPT in healthcare is still evolving, with studies exploring its potential in clinical decision-making, patient education, workflow optimization, and scientific literature. ChatGPT could be exploited in the medical field to improve patient education and information, thus increasing compliance. ChatGPT could facilitate information exchange on major cardiovascular diseases, provide clinical decision support, and improve patient communication and education. It could assist the clinician in differential diagnosis, suggest appropriate imaging modalities, and optimize treatment plans based on evidence-based guidelines. However, it is unclear whether it will be possible to use ChatGPT for the management of patients who require rapid decisions. Indeed, many drawbacks are associated with the daily use of these technologies in the medical field, such as insufficient expertise in specialized fields and a lack of comprehension of the context in which they operate. The pros and cons of its use are explored in this review, which was not written with the help of ChatGPT.
Affiliation(s)
- Cristina Madaudo
  - Department of Health Promotion, Mother and Child Care, Internal Medicine and Medical Specialties, Cardiology Unit, University of Palermo, University Hospital P. Giaccone, Palermo
  - Department of Cardiovascular Sciences, British Heart Foundation Centre of Research Excellence, School of Cardiovascular Medicine, Faculty of Life Sciences and Medicine, King's College London, The James Black Centre, 125 Coldharbour Lane, London, UK
- Antonio Luca Maria Parlati
  - Department of Advanced Biomedical Sciences, University of Naples Federico II, Naples, Italy
  - Department of Cardiovascular Sciences, British Heart Foundation Centre of Research Excellence, School of Cardiovascular Medicine, Faculty of Life Sciences and Medicine, King's College London, The James Black Centre, 125 Coldharbour Lane, London, UK
- Daniela Di Lisi
  - Department of Health Promotion, Mother and Child Care, Internal Medicine and Medical Specialties, Cardiology Unit, University of Palermo, University Hospital P. Giaccone, Palermo
- Raffaele Carluccio
  - Department of Advanced Biomedical Sciences, University of Naples Federico II, Naples, Italy
- Vincenzo Sucato
  - Department of Health Promotion, Mother and Child Care, Internal Medicine and Medical Specialties, Cardiology Unit, University of Palermo, University Hospital P. Giaccone, Palermo
- Giuseppe Vadalà
  - Department of Health Promotion, Mother and Child Care, Internal Medicine and Medical Specialties, Cardiology Unit, University of Palermo, University Hospital P. Giaccone, Palermo
- Ermanno Nardi
  - Department of Advanced Biomedical Sciences, University of Naples Federico II, Naples, Italy
- Francesca Macaione
  - Department of Health Promotion, Mother and Child Care, Internal Medicine and Medical Specialties, Cardiology Unit, University of Palermo, University Hospital P. Giaccone, Palermo
- Antonio Cannata
  - Department of Cardiovascular Sciences, British Heart Foundation Centre of Research Excellence, School of Cardiovascular Medicine, Faculty of Life Sciences and Medicine, King's College London, The James Black Centre, 125 Coldharbour Lane, London, UK
- Nilla Manzullo
  - Department of Health Promotion, Mother and Child Care, Internal Medicine and Medical Specialties, Cardiology Unit, University of Palermo, University Hospital P. Giaccone, Palermo
- Ciro Santoro
  - Department of Advanced Biomedical Sciences, University of Naples Federico II, Naples, Italy
- Adelaide Iervolino
  - Department of Clinical Medicine and Surgery, University of Naples Federico II, Naples, Italy
- Federica D'Angelo
  - Department of Health Promotion, Mother and Child Care, Internal Medicine and Medical Specialties, Cardiology Unit, University of Palermo, University Hospital P. Giaccone, Palermo
- Federica Marzano
  - Department of Advanced Biomedical Sciences, University of Naples Federico II, Naples, Italy
- Christian Basile
  - Department of Advanced Biomedical Sciences, University of Naples Federico II, Naples, Italy
- Paola Gargiulo
  - Department of Advanced Biomedical Sciences, University of Naples Federico II, Naples, Italy
- Egle Corrado
  - Department of Health Promotion, Mother and Child Care, Internal Medicine and Medical Specialties, Cardiology Unit, University of Palermo, University Hospital P. Giaccone, Palermo
- Stefania Paolillo
  - Department of Advanced Biomedical Sciences, University of Naples Federico II, Naples, Italy
- Giuseppina Novo
  - Department of Health Promotion, Mother and Child Care, Internal Medicine and Medical Specialties, Cardiology Unit, University of Palermo, University Hospital P. Giaccone, Palermo
- Alfredo Ruggero Galassi
  - Department of Health Promotion, Mother and Child Care, Internal Medicine and Medical Specialties, Cardiology Unit, University of Palermo, University Hospital P. Giaccone, Palermo
2. Kim HJ, Yang JH, Chang DG, Lenke LG, Pizones J, Castelein R, Watanabe K, Trobisch PD, Mundis GM, Suh SW, Suk SI. Assessing the Reproducibility of the Structured Abstracts Generated by ChatGPT and Bard Compared to Human-Written Abstracts in the Field of Spine Surgery: Comparative Analysis. J Med Internet Res 2024; 26:e52001. PMID: 38924787. PMCID: PMC11237793. DOI: 10.2196/52001.
Abstract
BACKGROUND: Due to recent advances in artificial intelligence (AI), language model applications can generate logical text output that is difficult to distinguish from human writing. ChatGPT (OpenAI) and Bard (subsequently rebranded as "Gemini"; Google AI) were developed using distinct approaches, but little has been studied about differences in their ability to generate abstracts. The use of AI to write scientific abstracts in the field of spine surgery is the center of much debate and controversy.
OBJECTIVE: The objective of this study is to assess the reproducibility of the structured abstracts generated by ChatGPT and Bard compared to human-written abstracts in the field of spine surgery.
METHODS: In total, 60 abstracts dealing with spine sections were randomly selected from 7 reputable journals, and the corresponding paper titles were supplied to ChatGPT and Bard as input statements to generate abstracts. A total of 174 abstracts, divided into human-written, ChatGPT-generated, and Bard-generated abstracts, were evaluated for compliance with the structured format of journal guidelines and for consistency of content. The likelihood of plagiarism and of AI output was assessed using the iThenticate and ZeroGPT programs, respectively. A total of 8 reviewers in the spinal field evaluated 30 randomly extracted abstracts to determine whether they were produced by AI or by human authors.
RESULTS: The proportion of abstracts that met journal formatting guidelines was greater among ChatGPT abstracts (34/60, 56.6%) compared with those generated by Bard (6/54, 11.1%; P<.001). However, a higher proportion of Bard abstracts (49/54, 90.7%) had word counts that met journal guidelines compared with ChatGPT abstracts (30/60, 50%; P<.001). The similarity index was significantly lower among ChatGPT-generated abstracts (20.7%) compared with Bard-generated abstracts (32.1%; P<.001). The AI-detection program predicted that 21.7% (13/60) of the human group, 63.3% (38/60) of the ChatGPT group, and 87% (47/54) of the Bard group were possibly generated by AI, with an area under the curve value of 0.863 (P<.001). The mean detection rate by human reviewers was 53.8% (SD 11.2%), corresponding to a sensitivity of 56.3% and a specificity of 48.4%. A total of 56.3% (63/112) of the actual human-written abstracts and 55.9% (62/128) of AI-generated abstracts were recognized as human-written and AI-generated by human reviewers, respectively.
CONCLUSIONS: Both ChatGPT and Bard can be used to help write abstracts, but most AI-generated abstracts are currently considered unethical due to high plagiarism and AI-detection rates. ChatGPT-generated abstracts appear to be superior to Bard-generated abstracts in meeting journal formatting guidelines. Because humans are unable to accurately distinguish abstracts written by humans from those produced by AI programs, it is crucial to exercise special caution and examine the ethical boundaries of using AI programs, including ChatGPT and Bard.
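The reviewer-performance figures reported above (sensitivity 56.3%, specificity 48.4%) follow directly from the 63/112 and 62/128 counts. The short Python sketch below shows that arithmetic; it assumes human-written abstracts are treated as the positive class and is purely illustrative, not part of the study's own analysis code.

```python
# Reviewer-performance arithmetic from the abstract, with "human-written" as the
# positive class. Only the 63/112 and 62/128 counts come from the study; the
# helper itself is an illustrative sketch.

def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> tuple[float, float]:
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # human-written abstracts correctly labelled "human"
    specificity = tn / (tn + fp)  # AI-generated abstracts correctly labelled "AI"
    return sensitivity, specificity

# 112 human-written abstracts, 63 labelled "human"; 128 AI-generated abstracts, 62 labelled "AI".
sens, spec = sensitivity_specificity(tp=63, fn=112 - 63, tn=62, fp=128 - 62)
print(f"sensitivity = {sens:.4f}, specificity = {spec:.4f}")  # 0.5625 and 0.4844, i.e. 56.3% and 48.4%
```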
Affiliation(s)
- Hong Jin Kim
  - Department of Orthopedic Surgery, Inje University Sanggye Paik Hospital, College of Medicine, Inje University, Seoul, Republic of Korea
- Jae Hyuk Yang
  - Department of Orthopedic Surgery, Korea University Anam Hospital, College of Medicine, Korea University, Seoul, Republic of Korea
- Dong-Gune Chang
  - Department of Orthopedic Surgery, Inje University Sanggye Paik Hospital, College of Medicine, Inje University, Seoul, Republic of Korea
- Lawrence G Lenke
  - Department of Orthopedic Surgery, The Daniel and Jane Och Spine Hospital, Columbia University, New York, NY, United States
- Javier Pizones
  - Department of Orthopedic Surgery, Hospital Universitario La Paz, Madrid, Spain
- René Castelein
  - Department of Orthopedic Surgery, University Medical Centre Utrecht, Utrecht, Netherlands
- Kota Watanabe
  - Department of Orthopedic Surgery, Keio University School of Medicine, Tokyo, Japan
- Per D Trobisch
  - Department of Spine Surgery, Eifelklinik St. Brigida, Simmerath, Germany
- Gregory M Mundis
  - Department of Orthopaedic Surgery, Scripps Clinic, La Jolla, CA, United States
- Seung Woo Suh
  - Department of Orthopedic Surgery, Korea University Guro Hospital, College of Medicine, Korea University, Seoul, Republic of Korea
- Se-Il Suk
  - Department of Orthopedic Surgery, Inje University Sanggye Paik Hospital, College of Medicine, Inje University, Seoul, Republic of Korea
3. Yao JJ, Aggarwal M, Lopez RD, Namdari S. Current Concepts Review: Large Language Models in Orthopaedics: Definitions, Uses, and Limitations. J Bone Joint Surg Am 2024:00004623-990000000-01136. PMID: 38896652. DOI: 10.2106/jbjs.23.01417.
Abstract
➤ Large language models are a subset of artificial intelligence. Large language models are powerful tools that excel in natural language text processing and generation.
➤ There are many potential clinical, research, and educational applications of large language models in orthopaedics, but the development of these applications needs to be focused on patient safety and the maintenance of high standards.
➤ There are numerous methodological, ethical, and regulatory concerns with regard to the use of large language models. Orthopaedic surgeons need to be aware of the controversies and advocate for an alignment of these models with patient and caregiver priorities.
Affiliation(s)
- Jie J Yao
  - Rothman Orthopaedic Institute, Thomas Jefferson University, Philadelphia, Pennsylvania
- Ryan D Lopez
  - Rothman Orthopaedic Institute, Thomas Jefferson University, Philadelphia, Pennsylvania
- Surena Namdari
  - Rothman Orthopaedic Institute, Thomas Jefferson University, Philadelphia, Pennsylvania
4. Warren E, Hurley ET, Park CN, Crook BS, Lorentz S, Levin JM, Anakwenze O, MacDonald PB, Klifto CS. Evaluation of information from artificial intelligence on rotator cuff repair surgery. JSES Int 2024; 8:53-57. PMID: 38312282. PMCID: PMC10837709. DOI: 10.1016/j.jseint.2023.09.009.
Abstract
Purpose: The purpose of this study was to analyze the quality and readability of information regarding rotator cuff repair surgery available from online AI software.
Methods: An open AI model (ChatGPT) was used to answer 24 commonly asked patient questions about rotator cuff repair. Questions were stratified into one of three categories based on the Rothwell classification system: fact, policy, or value. The answers for each category were evaluated for reliability, quality, and readability using the Journal of the American Medical Association (JAMA) Benchmark criteria, the DISCERN score, and the Flesch-Kincaid Reading Ease Score and Grade Level.
Results: The JAMA Benchmark criteria score for all three categories was 0, the lowest possible score, indicating that no reliable resources were cited. The DISCERN score was 51 for fact, 53 for policy, and 55 for value questions, all of which are considered good scores. Across question categories, the reliability portion of the DISCERN score was low because of the lack of cited resources. The Flesch-Kincaid Reading Ease Score (and Flesch-Kincaid Grade Level) was 48.3 (10.3) for the fact class, 42.0 (10.9) for the policy class, and 38.4 (11.6) for the value class.
Conclusion: The quality of information provided by the open AI chat system was generally high across all question types but had significant shortcomings in reliability due to the absence of source citations. The DISCERN scores of the AI-generated responses matched or exceeded previously published results from studies evaluating the quality of online information about rotator cuff repairs. The responses were written at a U.S. 10th-grade or higher reading level, which is above the AMA and NIH recommendation of a 6th-grade reading level for patient materials. The AI software commonly referred the user to seek advice from orthopedic surgeons to improve their chances of a successful outcome.
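For context, the two Flesch-Kincaid metrics quoted above are simple functions of average sentence length and syllable count. The sketch below is a rough, self-contained Python approximation (the syllable counter is a naive heuristic, so dedicated readability tools will report somewhat different numbers); it is illustrative only and not the tooling used in the study.

```python
# Illustrative Flesch-Kincaid Reading Ease and Grade Level calculation.
# The formulas are the standard published ones; the syllable counter is a
# crude heuristic kept only to make the example self-contained.
import re

def count_syllables(word: str) -> int:
    # Rough heuristic: count vowel groups, discount a trailing silent "e".
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and count > 1:
        count -= 1
    return max(count, 1)

def flesch_kincaid(text: str) -> tuple[float, float]:
    sentences = max(len(re.findall(r"[.!?]+", text)), 1)
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    wps = len(words) / sentences   # words per sentence
    spw = syllables / len(words)   # syllables per word
    reading_ease = 206.835 - 1.015 * wps - 84.6 * spw
    grade_level = 0.39 * wps + 11.8 * spw - 15.59
    return reading_ease, grade_level

ease, grade = flesch_kincaid("Rotator cuff repair reattaches the torn tendon to the bone.")
print(f"Reading ease {ease:.1f}, grade level {grade:.1f}")
```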
Affiliation(s)
- Eric Warren
  - Duke University School of Medicine, Duke University, Durham, NC, USA
- Eoghan T. Hurley
  - Department of Orthopaedic Surgery, Duke University, Durham, NC, USA
- Caroline N. Park
  - Department of Orthopaedic Surgery, Duke University, Durham, NC, USA
- Bryan S. Crook
  - Department of Orthopaedic Surgery, Duke University, Durham, NC, USA
- Samuel Lorentz
  - Department of Orthopaedic Surgery, Duke University, Durham, NC, USA
- Jay M. Levin
  - Department of Orthopaedic Surgery, Duke University, Durham, NC, USA
- Oke Anakwenze
  - Department of Orthopaedic Surgery, Duke University, Durham, NC, USA
- Peter B. MacDonald
  - Section of Orthopaedic Surgery & The Pan Am Clinic, University of Manitoba, Winnipeg, MB, Canada
5. Giorgino R, Alessandri-Bonetti M, Luca A, Migliorini F, Rossi N, Peretti GM, Mangiavini L. ChatGPT in orthopedics: a narrative review exploring the potential of artificial intelligence in orthopedic practice. Front Surg 2023; 10:1284015. PMID: 38026475. PMCID: PMC10654618. DOI: 10.3389/fsurg.2023.1284015.
Abstract
The field of orthopedics faces complex challenges requiring quick and intricate decisions, with patient education and compliance playing crucial roles in treatment outcomes. Technological advancements in artificial intelligence (AI) can potentially enhance orthopedic care. ChatGPT, a natural language processing technology developed by OpenAI, has shown promise in various sectors, including healthcare. ChatGPT can facilitate patient information exchange in orthopedics, provide clinical decision support, and improve patient communication and education. It can assist in differential diagnosis, suggest appropriate imaging modalities, and optimize treatment plans based on evidence-based guidelines. However, ChatGPT has limitations, such as insufficient expertise in specialized domains and a lack of contextual understanding. The application of ChatGPT in orthopedics is still evolving, with studies exploring its potential in clinical decision-making, patient education, workflow optimization, and scientific literature. The results indicate both the benefits and limitations of ChatGPT, emphasizing the need for caution, ethical considerations, and human oversight. Addressing training data quality, biases, data privacy, and accountability challenges is crucial for responsible implementation. While ChatGPT has the potential to transform orthopedic healthcare, further research and development are necessary to ensure its reliability, accuracy, and ethical use in patient care.
Affiliation(s)
- Riccardo Giorgino
  - IRCCS Istituto Ortopedico Galeazzi, Milan, Italy
  - Residency Program in Orthopedics and Traumatology, University of Milan, Milan, Italy
- Mario Alessandri-Bonetti
  - Department of Plastic Surgery, University of Pittsburgh Medical Center, Pittsburgh, PA, United States
- Andrea Luca
  - IRCCS Istituto Ortopedico Galeazzi, Milan, Italy
- Filippo Migliorini
  - Department of Orthopaedic, Trauma, and Reconstructive Surgery, RWTH University Medical Centre, Aachen, Germany
  - Department of Orthopedics and Trauma Surgery, Academic Hospital of Bolzano (SABES-ASDAA), Teaching Hospital of the Paracelsus Medical University, Bolzano, Italy
- Nicolò Rossi
  - IRCCS Istituto Ortopedico Galeazzi, Milan, Italy
  - Residency Program in Orthopedics and Traumatology, University of Milan, Milan, Italy
- Giuseppe M. Peretti
  - IRCCS Istituto Ortopedico Galeazzi, Milan, Italy
  - Dipartimento di Scienze Biomediche per la Salute, Università degli Studi di Milano, Milan, Italy
- Laura Mangiavini
  - IRCCS Istituto Ortopedico Galeazzi, Milan, Italy
  - Dipartimento di Scienze Biomediche per la Salute, Università degli Studi di Milano, Milan, Italy
6. Ghanem D, Covarrubias O, Raad M, LaPorte D, Shafiq B. ChatGPT Performs at the Level of a Third-Year Orthopaedic Surgery Resident on the Orthopaedic In-Training Examination. JB JS Open Access 2023; 8:e23.00103. PMID: 38638869. PMCID: PMC11025881. DOI: 10.2106/jbjs.oa.23.00103.
Abstract
Introduction: Publicly available AI language models such as ChatGPT have demonstrated utility in text generation and even problem-solving when provided with clear instructions. Amid this transformative shift, the aim of this study was to assess ChatGPT's performance on the Orthopaedic In-Training Examination (OITE).
Methods: All 213 web-based questions from the 2021 OITE were retrieved from the AAOS-ResStudy website (https://www.aaos.org/education/examinations/ResStudy). Two independent reviewers copied and pasted the questions and response options into ChatGPT Plus (version 4.0) and recorded the generated answers. All media-containing questions were flagged and carefully examined. Twelve media-containing OITE questions that relied purely on images (clinical pictures, radiographs, MRIs, CT scans) and could not be reasoned out from the clinical presentation were excluded. Cohen's kappa coefficient was used to examine the agreement of ChatGPT-generated responses between reviewers. Descriptive statistics were used to summarize the performance (% correct) of ChatGPT Plus. The 2021 norm table was used to compare ChatGPT Plus's performance on the OITE with that of national orthopaedic surgery residents in the same year.
Results: A total of 201 questions were evaluated by ChatGPT Plus. Excellent agreement was observed between raters for the 201 ChatGPT-generated responses, with a Cohen's kappa coefficient of 0.947. Of these questions, 45.8% (92/201) contained media. ChatGPT Plus had an average overall score of 61.2% (123/201); its score on non-media questions was 64.2% (70/109). When compared with the performance of all national orthopaedic surgery residents in 2021, ChatGPT Plus performed at the level of an average PGY-3.
Discussion: ChatGPT Plus is able to pass the OITE with an overall score of 61.2%, ranking at the level of a third-year orthopaedic surgery resident. It provided logical reasoning and justifications that may help residents improve their understanding of OITE cases and general orthopaedic principles. Further studies are still needed to examine its efficacy and its impact on long-term learning and OITE/ABOS performance.
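For readers unfamiliar with the agreement statistic used here, Cohen's kappa compares the observed agreement between the two reviewers with the agreement expected by chance. The Python sketch below illustrates the calculation on a hypothetical 2x2 agreement table; the counts are placeholders, not the study's data.

```python
# Cohen's kappa for inter-rater agreement: (observed - expected) / (1 - expected).
# The agreement table below is a made-up placeholder; only the formula is standard.

def cohens_kappa(table: list[list[int]]) -> float:
    """table[i][j] = items rater A placed in category i and rater B in category j."""
    n = len(table)
    total = sum(sum(row) for row in table)
    observed = sum(table[i][i] for i in range(n)) / total
    row_marg = [sum(row) / total for row in table]
    col_marg = [sum(table[i][j] for i in range(n)) / total for j in range(n)]
    expected = sum(r * c for r, c in zip(row_marg, col_marg))
    return (observed - expected) / (1 - expected)

# Hypothetical table for two reviewers recording whether ChatGPT's answer matched on each question.
print(f"kappa = {cohens_kappa([[95, 3], [2, 101]]):.3f}")
```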
Affiliation(s)
- Diane Ghanem
  - Department of Orthopaedic Surgery, The Johns Hopkins Hospital, Baltimore, Maryland
- Oscar Covarrubias
  - School of Medicine, The Johns Hopkins University, Baltimore, Maryland
- Micheal Raad
  - Department of Orthopaedic Surgery, The Johns Hopkins Hospital, Baltimore, Maryland
- Dawn LaPorte
  - Department of Orthopaedic Surgery, The Johns Hopkins Hospital, Baltimore, Maryland
- Babar Shafiq
  - Department of Orthopaedic Surgery, The Johns Hopkins Hospital, Baltimore, Maryland