1. Madaudo C, Parlati ALM, Di Lisi D, Carluccio R, Sucato V, Vadalà G, Nardi E, Macaione F, Cannata A, Manzullo N, Santoro C, Iervolino A, D'Angelo F, Marzano F, Basile C, Gargiulo P, Corrado E, Paolillo S, Novo G, Galassi AR, Filardi PP. Artificial intelligence in cardiology: a peek at the future and the role of ChatGPT in cardiology practice. J Cardiovasc Med (Hagerstown) 2024; 25:766-771. PMID: 39347723. DOI: 10.2459/jcm.0000000000001664.
Abstract
Artificial intelligence has increasingly become an integral part of our daily activities. ChatGPT, a natural language processing technology developed by OpenAI, is widely used in various industries, including healthcare. The application of ChatGPT in healthcare is still evolving, with studies exploring its potential in clinical decision-making, patient education, workflow optimization, and scientific literature. ChatGPT could be exploited in the medical field to improve patient education and information, thus increasing compliance. ChatGPT could facilitate information exchange on major cardiovascular diseases, provide clinical decision support, and improve patient communication and education. It could assist the clinician in differential diagnosis, suggest appropriate imaging modalities, and optimize treatment plans based on evidence-based guidelines. However, it is unclear whether it will be possible to use ChatGPT for the management of patients who require rapid decisions. Indeed, many drawbacks are associated with the daily use of these technologies in the medical field, such as insufficient expertise in specialized fields and a lack of comprehension of the context in which they operate. The pros and cons of its use are explored in this review, which was not written with the help of ChatGPT.
Affiliation(s)
- Cristina Madaudo
  - Department of Health Promotion, Mother and Child Care, Internal Medicine and Medical Specialties, Cardiology Unit, University of Palermo, University Hospital P. Giaccone, Palermo
  - Department of Cardiovascular Sciences, British Heart Foundation Centre of Research Excellence, School of Cardiovascular Medicine, Faculty of Life Sciences and Medicine, King's College London, The James Black Centre, 125 Coldharbour Lane, London, UK
- Antonio Luca Maria Parlati
  - Department of Advanced Biomedical Sciences, University of Naples Federico II, Naples, Italy
  - Department of Cardiovascular Sciences, British Heart Foundation Centre of Research Excellence, School of Cardiovascular Medicine, Faculty of Life Sciences and Medicine, King's College London, The James Black Centre, 125 Coldharbour Lane, London, UK
- Daniela Di Lisi
  - Department of Health Promotion, Mother and Child Care, Internal Medicine and Medical Specialties, Cardiology Unit, University of Palermo, University Hospital P. Giaccone, Palermo
- Raffaele Carluccio
  - Department of Advanced Biomedical Sciences, University of Naples Federico II, Naples, Italy
- Vincenzo Sucato
  - Department of Health Promotion, Mother and Child Care, Internal Medicine and Medical Specialties, Cardiology Unit, University of Palermo, University Hospital P. Giaccone, Palermo
- Giuseppe Vadalà
  - Department of Health Promotion, Mother and Child Care, Internal Medicine and Medical Specialties, Cardiology Unit, University of Palermo, University Hospital P. Giaccone, Palermo
- Ermanno Nardi
  - Department of Advanced Biomedical Sciences, University of Naples Federico II, Naples, Italy
- Francesca Macaione
  - Department of Health Promotion, Mother and Child Care, Internal Medicine and Medical Specialties, Cardiology Unit, University of Palermo, University Hospital P. Giaccone, Palermo
- Antonio Cannata
  - Department of Cardiovascular Sciences, British Heart Foundation Centre of Research Excellence, School of Cardiovascular Medicine, Faculty of Life Sciences and Medicine, King's College London, The James Black Centre, 125 Coldharbour Lane, London, UK
- Nilla Manzullo
  - Department of Health Promotion, Mother and Child Care, Internal Medicine and Medical Specialties, Cardiology Unit, University of Palermo, University Hospital P. Giaccone, Palermo
- Ciro Santoro
  - Department of Advanced Biomedical Sciences, University of Naples Federico II, Naples, Italy
- Adelaide Iervolino
  - Department of Clinical Medicine and Surgery, University of Naples Federico II, Naples, Italy
- Federica D'Angelo
  - Department of Health Promotion, Mother and Child Care, Internal Medicine and Medical Specialties, Cardiology Unit, University of Palermo, University Hospital P. Giaccone, Palermo
- Federica Marzano
  - Department of Advanced Biomedical Sciences, University of Naples Federico II, Naples, Italy
- Christian Basile
  - Department of Advanced Biomedical Sciences, University of Naples Federico II, Naples, Italy
- Paola Gargiulo
  - Department of Advanced Biomedical Sciences, University of Naples Federico II, Naples, Italy
- Egle Corrado
  - Department of Health Promotion, Mother and Child Care, Internal Medicine and Medical Specialties, Cardiology Unit, University of Palermo, University Hospital P. Giaccone, Palermo
- Stefania Paolillo
  - Department of Advanced Biomedical Sciences, University of Naples Federico II, Naples, Italy
- Giuseppina Novo
  - Department of Health Promotion, Mother and Child Care, Internal Medicine and Medical Specialties, Cardiology Unit, University of Palermo, University Hospital P. Giaccone, Palermo
- Alfredo Ruggero Galassi
  - Department of Health Promotion, Mother and Child Care, Internal Medicine and Medical Specialties, Cardiology Unit, University of Palermo, University Hospital P. Giaccone, Palermo
2. Kim HJ, Yang JH, Chang DG, Lenke LG, Pizones J, Castelein R, Watanabe K, Trobisch PD, Mundis GM, Suh SW, Suk SI. Assessing the Reproducibility of the Structured Abstracts Generated by ChatGPT and Bard Compared to Human-Written Abstracts in the Field of Spine Surgery: Comparative Analysis. J Med Internet Res 2024; 26:e52001. PMID: 38924787. PMCID: PMC11237793. DOI: 10.2196/52001.
Abstract
BACKGROUND: Due to recent advances in artificial intelligence (AI), language model applications can generate logical text output that is difficult to distinguish from human writing. ChatGPT (OpenAI) and Bard (subsequently rebranded as "Gemini"; Google AI) were developed using distinct approaches, but little has been studied about differences in their ability to generate abstracts. The use of AI to write scientific abstracts in the field of spine surgery is the center of much debate and controversy.
OBJECTIVE: The objective of this study is to assess the reproducibility of the structured abstracts generated by ChatGPT and Bard compared to human-written abstracts in the field of spine surgery.
METHODS: In total, 60 abstracts dealing with spine sections were randomly selected from 7 reputable journals, and the corresponding paper titles were supplied to ChatGPT and Bard as input statements to generate abstracts. A total of 174 abstracts, divided into human-written, ChatGPT-generated, and Bard-generated abstracts, were evaluated for compliance with the structured format of journal guidelines and for consistency of content. The likelihood of plagiarism and of AI output was assessed using the iThenticate and ZeroGPT programs, respectively. A total of 8 reviewers in the spinal field evaluated 30 randomly extracted abstracts to determine whether they were produced by AI or by human authors.
RESULTS: The proportion of abstracts that met journal formatting guidelines was greater among ChatGPT abstracts (34/60, 56.6%) compared with those generated by Bard (6/54, 11.1%; P<.001). However, a higher proportion of Bard abstracts (49/54, 90.7%) had word counts that met journal guidelines compared with ChatGPT abstracts (30/60, 50%; P<.001). The similarity index was significantly lower among ChatGPT-generated abstracts (20.7%) compared with Bard-generated abstracts (32.1%; P<.001). The AI-detection program predicted that 21.7% (13/60) of the human group, 63.3% (38/60) of the ChatGPT group, and 87% (47/54) of the Bard group were possibly generated by AI, with an area under the curve value of 0.863 (P<.001). The mean detection rate by human reviewers was 53.8% (SD 11.2%), corresponding to a sensitivity of 56.3% and a specificity of 48.4%. A total of 56.3% (63/112) of the actual human-written abstracts and 55.9% (62/128) of AI-generated abstracts were recognized as human-written and AI-generated by human reviewers, respectively.
CONCLUSIONS: Both ChatGPT and Bard can be used to help write abstracts, but most AI-generated abstracts are currently considered unethical due to high plagiarism and AI-detection rates. ChatGPT-generated abstracts appear to be superior to Bard-generated abstracts in meeting journal formatting guidelines. Because humans are unable to accurately distinguish abstracts written by humans from those produced by AI programs, it is crucial to exercise special caution and examine the ethical boundaries of using AI programs, including ChatGPT and Bard.
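The reviewer-performance figures reported above (sensitivity 56.3%, specificity 48.4%) follow directly from the 63/112 and 62/128 counts. The short Python sketch below shows that arithmetic; it assumes human-written abstracts are treated as the positive class and is purely illustrative, not part of the study's own analysis code.

```python
# Reviewer-performance arithmetic from the abstract, with "human-written" as the
# positive class. Only the 63/112 and 62/128 counts come from the study; the
# helper itself is an illustrative sketch.

def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> tuple[float, float]:
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # human-written abstracts correctly labelled "human"
    specificity = tn / (tn + fp)  # AI-generated abstracts correctly labelled "AI"
    return sensitivity, specificity

# 112 human-written abstracts, 63 labelled "human"; 128 AI-generated abstracts, 62 labelled "AI".
sens, spec = sensitivity_specificity(tp=63, fn=112 - 63, tn=62, fp=128 - 62)
print(f"sensitivity = {sens:.4f}, specificity = {spec:.4f}")  # 0.5625 and 0.4844, i.e. 56.3% and 48.4%
```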
Affiliation(s)
- Hong Jin Kim
  - Department of Orthopedic Surgery, Inje University Sanggye Paik Hospital, College of Medicine, Inje University, Seoul, Republic of Korea
- Jae Hyuk Yang
  - Department of Orthopedic Surgery, Korea University Anam Hospital, College of Medicine, Korea University, Seoul, Republic of Korea
- Dong-Gune Chang
  - Department of Orthopedic Surgery, Inje University Sanggye Paik Hospital, College of Medicine, Inje University, Seoul, Republic of Korea
- Lawrence G Lenke
  - Department of Orthopedic Surgery, The Daniel and Jane Och Spine Hospital, Columbia University, New York, NY, United States
- Javier Pizones
  - Department of Orthopedic Surgery, Hospital Universitario La Paz, Madrid, Spain
- René Castelein
  - Department of Orthopedic Surgery, University Medical Centre Utrecht, Utrecht, Netherlands
- Kota Watanabe
  - Department of Orthopedic Surgery, Keio University School of Medicine, Tokyo, Japan
- Per D Trobisch
  - Department of Spine Surgery, Eifelklinik St. Brigida, Simmerath, Germany
- Gregory M Mundis
  - Department of Orthopaedic Surgery, Scripps Clinic, La Jolla, CA, United States
- Seung Woo Suh
  - Department of Orthopedic Surgery, Korea University Guro Hospital, College of Medicine, Korea University, Seoul, Republic of Korea
- Se-Il Suk
  - Department of Orthopedic Surgery, Inje University Sanggye Paik Hospital, College of Medicine, Inje University, Seoul, Republic of Korea
3. Yao JJ, Aggarwal M, Lopez RD, Namdari S. Current Concepts Review: Large Language Models in Orthopaedics: Definitions, Uses, and Limitations. J Bone Joint Surg Am 2024:00004623-990000000-01136. PMID: 38896652. DOI: 10.2106/jbjs.23.01417.
Abstract
➤ Large language models are a subset of artificial intelligence. Large language models are powerful tools that excel in natural language text processing and generation.
➤ There are many potential clinical, research, and educational applications of large language models in orthopaedics, but the development of these applications needs to be focused on patient safety and the maintenance of high standards.
➤ There are numerous methodological, ethical, and regulatory concerns with regard to the use of large language models. Orthopaedic surgeons need to be aware of the controversies and advocate for an alignment of these models with patient and caregiver priorities.
Affiliation(s)
- Jie J Yao
  - Rothman Orthopaedic Institute, Thomas Jefferson University, Philadelphia, Pennsylvania
- Ryan D Lopez
  - Rothman Orthopaedic Institute, Thomas Jefferson University, Philadelphia, Pennsylvania
- Surena Namdari
  - Rothman Orthopaedic Institute, Thomas Jefferson University, Philadelphia, Pennsylvania
4. Warren E, Hurley ET, Park CN, Crook BS, Lorentz S, Levin JM, Anakwenze O, MacDonald PB, Klifto CS. Evaluation of information from artificial intelligence on rotator cuff repair surgery. JSES Int 2024; 8:53-57. PMID: 38312282. PMCID: PMC10837709. DOI: 10.1016/j.jseint.2023.09.009.
Abstract
Purpose: The purpose of this study was to analyze the quality and readability of information regarding rotator cuff repair surgery available from online AI software.
Methods: An open AI model (ChatGPT) was used to answer 24 commonly asked patient questions about rotator cuff repair. Questions were stratified into one of three categories based on the Rothwell classification system: fact, policy, or value. The answers for each category were evaluated for reliability, quality, and readability using the Journal of the American Medical Association (JAMA) Benchmark criteria, the DISCERN score, and the Flesch-Kincaid Reading Ease Score and Grade Level.
Results: The JAMA Benchmark criteria score for all three categories was 0, the lowest possible score, indicating that no reliable resources were cited. The DISCERN score was 51 for fact, 53 for policy, and 55 for value questions, all of which are considered good scores. Across question categories, the reliability portion of the DISCERN score was low because of the lack of cited resources. The Flesch-Kincaid Reading Ease Score (and Flesch-Kincaid Grade Level) was 48.3 (10.3) for the fact class, 42.0 (10.9) for the policy class, and 38.4 (11.6) for the value class.
Conclusion: The quality of information provided by the open AI chat system was generally high across all question types but had significant shortcomings in reliability due to the absence of source citations. The DISCERN scores of the AI-generated responses matched or exceeded previously published results from studies evaluating the quality of online information about rotator cuff repairs. The responses were written at a U.S. 10th-grade or higher reading level, which is above the AMA and NIH recommendation of a 6th-grade reading level for patient materials. The AI software commonly referred the user to seek advice from orthopedic surgeons to improve their chances of a successful outcome.
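For context, the two Flesch-Kincaid metrics quoted above are simple functions of average sentence length and syllable count. The sketch below is a rough, self-contained Python approximation (the syllable counter is a naive heuristic, so dedicated readability tools will report somewhat different numbers); it is illustrative only and not the tooling used in the study.

```python
# Illustrative Flesch-Kincaid Reading Ease and Grade Level calculation.
# The formulas are the standard published ones; the syllable counter is a
# crude heuristic kept only to make the example self-contained.
import re

def count_syllables(word: str) -> int:
    # Rough heuristic: count vowel groups, discount a trailing silent "e".
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and count > 1:
        count -= 1
    return max(count, 1)

def flesch_kincaid(text: str) -> tuple[float, float]:
    sentences = max(len(re.findall(r"[.!?]+", text)), 1)
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    wps = len(words) / sentences   # words per sentence
    spw = syllables / len(words)   # syllables per word
    reading_ease = 206.835 - 1.015 * wps - 84.6 * spw
    grade_level = 0.39 * wps + 11.8 * spw - 15.59
    return reading_ease, grade_level

ease, grade = flesch_kincaid("Rotator cuff repair reattaches the torn tendon to the bone.")
print(f"Reading ease {ease:.1f}, grade level {grade:.1f}")
```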
Affiliation(s)
- Eric Warren
  - Duke University School of Medicine, Duke University, Durham, NC, USA
- Eoghan T. Hurley
  - Department of Orthopaedic Surgery, Duke University, Durham, NC, USA
- Caroline N. Park
  - Department of Orthopaedic Surgery, Duke University, Durham, NC, USA
- Bryan S. Crook
  - Department of Orthopaedic Surgery, Duke University, Durham, NC, USA
- Samuel Lorentz
  - Department of Orthopaedic Surgery, Duke University, Durham, NC, USA
- Jay M. Levin
  - Department of Orthopaedic Surgery, Duke University, Durham, NC, USA
- Oke Anakwenze
  - Department of Orthopaedic Surgery, Duke University, Durham, NC, USA
- Peter B. MacDonald
  - Section of Orthopaedic Surgery & The Pan Am Clinic, University of Manitoba, Winnipeg, MB, Canada
5. Giorgino R, Alessandri-Bonetti M, Luca A, Migliorini F, Rossi N, Peretti GM, Mangiavini L. ChatGPT in orthopedics: a narrative review exploring the potential of artificial intelligence in orthopedic practice. Front Surg 2023; 10:1284015. PMID: 38026475. PMCID: PMC10654618. DOI: 10.3389/fsurg.2023.1284015.
Abstract
The field of orthopedics faces complex challenges requiring quick and intricate decisions, with patient education and compliance playing crucial roles in treatment outcomes. Technological advancements in artificial intelligence (AI) can potentially enhance orthopedic care. ChatGPT, a natural language processing technology developed by OpenAI, has shown promise in various sectors, including healthcare. ChatGPT can facilitate patient information exchange in orthopedics, provide clinical decision support, and improve patient communication and education. It can assist in differential diagnosis, suggest appropriate imaging modalities, and optimize treatment plans based on evidence-based guidelines. However, ChatGPT has limitations, such as insufficient expertise in specialized domains and a lack of contextual understanding. The application of ChatGPT in orthopedics is still evolving, with studies exploring its potential in clinical decision-making, patient education, workflow optimization, and scientific literature. The results indicate both the benefits and limitations of ChatGPT, emphasizing the need for caution, ethical considerations, and human oversight. Addressing training data quality, biases, data privacy, and accountability challenges is crucial for responsible implementation. While ChatGPT has the potential to transform orthopedic healthcare, further research and development are necessary to ensure its reliability, accuracy, and ethical use in patient care.
Affiliation(s)
- Riccardo Giorgino
  - IRCCS Istituto Ortopedico Galeazzi, Milan, Italy
  - Residency Program in Orthopedics and Traumatology, University of Milan, Milan, Italy
- Mario Alessandri-Bonetti
  - Department of Plastic Surgery, University of Pittsburgh Medical Center, Pittsburgh, PA, United States
- Andrea Luca
  - IRCCS Istituto Ortopedico Galeazzi, Milan, Italy
- Filippo Migliorini
  - Department of Orthopaedic, Trauma, and Reconstructive Surgery, RWTH University Medical Centre, Aachen, Germany
  - Department of Orthopedics and Trauma Surgery, Academic Hospital of Bolzano (SABES-ASDAA), Teaching Hospital of the Paracelsus Medical University, Bolzano, Italy
- Nicolò Rossi
  - IRCCS Istituto Ortopedico Galeazzi, Milan, Italy
  - Residency Program in Orthopedics and Traumatology, University of Milan, Milan, Italy
- Giuseppe M. Peretti
  - IRCCS Istituto Ortopedico Galeazzi, Milan, Italy
  - Dipartimento di Scienze Biomediche per la Salute, Università degli Studi di Milano, Milan, Italy
- Laura Mangiavini
  - IRCCS Istituto Ortopedico Galeazzi, Milan, Italy
  - Dipartimento di Scienze Biomediche per la Salute, Università degli Studi di Milano, Milan, Italy
6. Ghanem D, Covarrubias O, Raad M, LaPorte D, Shafiq B. ChatGPT Performs at the Level of a Third-Year Orthopaedic Surgery Resident on the Orthopaedic In-Training Examination. JB JS Open Access 2023; 8:e23.00103. PMID: 38638869. PMCID: PMC11025881. DOI: 10.2106/jbjs.oa.23.00103.
Abstract
Introduction: Publicly available AI language models such as ChatGPT have demonstrated utility in text generation and even problem-solving when provided with clear instructions. Amid this transformative shift, the aim of this study was to assess ChatGPT's performance on the Orthopaedic In-Training Examination (OITE).
Methods: All 213 web-based questions from the 2021 OITE were retrieved from the AAOS-ResStudy website (https://www.aaos.org/education/examinations/ResStudy). Two independent reviewers copied and pasted the questions and response options into ChatGPT Plus (version 4.0) and recorded the generated answers. All media-containing questions were flagged and carefully examined. Twelve media-containing OITE questions that relied purely on images (clinical pictures, radiographs, MRIs, CT scans) and could not be reasoned out from the clinical presentation were excluded. Cohen's kappa coefficient was used to examine the agreement of ChatGPT-generated responses between reviewers. Descriptive statistics were used to summarize the performance (% correct) of ChatGPT Plus. The 2021 norm table was used to compare ChatGPT Plus's performance on the OITE with that of national orthopaedic surgery residents in the same year.
Results: A total of 201 questions were evaluated by ChatGPT Plus. Excellent agreement was observed between raters for the 201 ChatGPT-generated responses, with a Cohen's kappa coefficient of 0.947. Of these questions, 45.8% (92/201) contained media. ChatGPT Plus had an average overall score of 61.2% (123/201); its score on non-media questions was 64.2% (70/109). When compared with the performance of all national orthopaedic surgery residents in 2021, ChatGPT Plus performed at the level of an average PGY-3.
Discussion: ChatGPT Plus is able to pass the OITE with an overall score of 61.2%, ranking at the level of a third-year orthopaedic surgery resident. It provided logical reasoning and justifications that may help residents improve their understanding of OITE cases and general orthopaedic principles. Further studies are still needed to examine its efficacy and its impact on long-term learning and OITE/ABOS performance.
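For readers unfamiliar with the agreement statistic used here, Cohen's kappa compares the observed agreement between the two reviewers with the agreement expected by chance. The Python sketch below illustrates the calculation on a hypothetical 2x2 agreement table; the counts are placeholders, not the study's data.

```python
# Cohen's kappa for inter-rater agreement: (observed - expected) / (1 - expected).
# The agreement table below is a made-up placeholder; only the formula is standard.

def cohens_kappa(table: list[list[int]]) -> float:
    """table[i][j] = items rater A placed in category i and rater B in category j."""
    n = len(table)
    total = sum(sum(row) for row in table)
    observed = sum(table[i][i] for i in range(n)) / total
    row_marg = [sum(row) / total for row in table]
    col_marg = [sum(table[i][j] for i in range(n)) / total for j in range(n)]
    expected = sum(r * c for r, c in zip(row_marg, col_marg))
    return (observed - expected) / (1 - expected)

# Hypothetical table for two reviewers recording whether ChatGPT's answer matched on each question.
print(f"kappa = {cohens_kappa([[95, 3], [2, 101]]):.3f}")
```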
Affiliation(s)
- Diane Ghanem
  - Department of Orthopaedic Surgery, The Johns Hopkins Hospital, Baltimore, Maryland
- Oscar Covarrubias
  - School of Medicine, The Johns Hopkins University, Baltimore, Maryland
- Micheal Raad
  - Department of Orthopaedic Surgery, The Johns Hopkins Hospital, Baltimore, Maryland
- Dawn LaPorte
  - Department of Orthopaedic Surgery, The Johns Hopkins Hospital, Baltimore, Maryland
- Babar Shafiq
  - Department of Orthopaedic Surgery, The Johns Hopkins Hospital, Baltimore, Maryland