1
AlShehri Y, McConkey M, Lodhia P. ChatGPT has Educational Potential: Assessing ChatGPT Responses to Common Patient Hip Arthroscopy Questions. Arthroscopy 2024:S0749-8063(24)00452-3. PMID: 38914299. DOI: 10.1016/j.arthro.2024.06.017.
Abstract
PURPOSE: To assess the ability of ChatGPT to answer common patient questions regarding hip arthroscopy and to analyze the accuracy and appropriateness of its responses.
METHODS: Ten questions were selected from well-known patient education websites, and ChatGPT (version 3.5) responses to these questions were graded by two fellowship-trained hip preservation surgeons. Responses were analyzed, compared with the current literature, and graded from A to D (A being the highest and D the lowest) on a scale based on the accuracy and completeness of the response. If the grades differed between the two surgeons, a consensus was reached. Inter-rater agreement was calculated. The readability of responses was also assessed using the Flesch Reading Ease Score (FRES) and Flesch-Kincaid Grade Level (FKGL).
RESULTS: Responses received the following consensus grades: A (50%, n=5), B (30%, n=3), C (10%, n=1), and D (10%, n=1) (Table 2). Inter-rater agreement based on initial individual grading was 30%. The mean FRES was 28.2 (SD 9.2), ranging from 11.7 to 42.5, corresponding to a college-graduate reading level. The mean FKGL was 14.4 (SD 1.8), ranging from 12.1 to 18, indicating a college-student reading level.
CONCLUSION: ChatGPT can answer common patient questions regarding hip arthroscopy with satisfactory accuracy as graded by two high-volume hip arthroscopists; however, incorrect information was identified in more than one instance. Caution must be observed when using ChatGPT for patient education related to hip arthroscopy.
CLINICAL RELEVANCE: Given the increasing number of hip arthroscopies performed annually, ChatGPT has the potential to aid physicians in educating patients about this procedure and in addressing any questions they may have.
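The two readability metrics reported in this abstract are standard formulas over word, sentence, and syllable counts. A minimal sketch of both (counts are supplied by the caller, since syllable counting is itself heuristic and the study's tokenization is not described):

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Flesch Reading Ease Score: higher = easier; roughly 0-30 reads at a
    college-graduate level, matching the mean of 28.2 reported above."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)

def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid Grade Level: approximate U.S. school grade required."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
```

For example, a passage with 100 words, 5 sentences, and 150 syllables scores FRES ≈ 59.6 and FKGL ≈ 9.9; the study's mean FRES of 28.2 falls well below that, in the "very difficult" band.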
Affiliation(s)
- Yasir AlShehri
- Department of Orthopaedics, Faculty of Medicine, The University of British Columbia, Vancouver, BC, Canada; Department of Orthopedics, College of Medicine, Imam Abdulrahman Bin Faisal University, Dammam, Saudi Arabia
- Mark McConkey
- Department of Orthopaedics, Faculty of Medicine, The University of British Columbia, Vancouver, BC, Canada
- Parth Lodhia
- Department of Orthopaedics, Faculty of Medicine, The University of British Columbia, Vancouver, BC, Canada
2
Batool I, Naved N, Kazmi SMR, Umer F. Leveraging Large Language Models in the delivery of post-operative dental care: a comparison between an embedded GPT model and ChatGPT. BDJ Open 2024;10:48. PMID: 38866751. PMCID: PMC11169374. DOI: 10.1038/s41405-024-00226-3.
Abstract
OBJECTIVE: This study underscores the transformative role of Artificial Intelligence (AI) in healthcare, particularly the promising applications of Large Language Models (LLMs) in the delivery of post-operative dental care. The aim was to evaluate the performance of an embedded GPT model and compare it with ChatGPT-3.5-turbo. The assessment focused on response accuracy, clarity, relevance, and up-to-date knowledge in addressing patient concerns and facilitating informed decision-making.
MATERIALS AND METHODS: An embedded GPT model employing GPT-3.5-16k was built via GPT-trainer to answer post-operative questions in four dental specialties: Operative Dentistry & Endodontics, Periodontics, Oral & Maxillofacial Surgery, and Prosthodontics. The generated responses were validated by thirty-six dental experts, nine from each specialty, using a Likert scale, providing comprehensive insight into the embedded GPT model's performance relative to GPT-3.5-turbo. For content validation, a quantitative Content Validity Index (CVI) was used, calculated both at the item level (I-CVI) and the scale level (S-CVI/Ave). To adjust the I-CVI for chance agreement, a modified kappa statistic (K*) was computed.
RESULTS: The overall content validity of responses generated by the embedded GPT model and ChatGPT was 65.62% and 61.87%, respectively. The embedded GPT model outperformed ChatGPT, with an accuracy of 62.5% and clarity of 72.5%, whereas ChatGPT achieved slightly lower scores: an accuracy of 52.5% and clarity of 67.5%. Both models performed equally well in terms of relevance and up-to-date knowledge.
CONCLUSION: The embedded GPT model produced better results than ChatGPT in providing post-operative dental care, emphasizing the benefits of embedding and prompt engineering and paving the way for future advancements in healthcare applications.
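The content-validity statistics named in this abstract follow the standard Polit-Beck formulation: I-CVI is the proportion of experts rating an item relevant, S-CVI/Ave is the mean of the I-CVIs, and the modified kappa adjusts each I-CVI for chance agreement. A minimal sketch under those assumptions (the study's own Likert cutoffs for "relevant" are not reproduced here):

```python
from math import comb

def i_cvi(n_relevant: int, n_experts: int) -> float:
    """Item-level CVI: fraction of experts rating the item relevant."""
    return n_relevant / n_experts

def modified_kappa(n_relevant: int, n_experts: int) -> float:
    """I-CVI adjusted for chance agreement (Polit & Beck's K*)."""
    # Probability that exactly n_relevant of n_experts agree by chance (p = 0.5 each).
    pc = comb(n_experts, n_relevant) * 0.5 ** n_experts
    p = i_cvi(n_relevant, n_experts)
    return (p - pc) / (1 - pc)

def s_cvi_ave(item_cvis: list[float]) -> float:
    """Scale-level CVI: mean of the item-level CVIs."""
    return sum(item_cvis) / len(item_cvis)
```

With nine experts per specialty, as in the study, an item rated relevant by 8 of 9 gives I-CVI ≈ 0.889 and K* ≈ 0.887.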
Affiliation(s)
- Itrat Batool
- Section of Dentistry, Department of Surgery, Aga Khan University Hospital, Karachi, Pakistan
- Nighat Naved
- Section of Dentistry, Department of Surgery, Aga Khan University Hospital, Karachi, Pakistan
- Syed Murtaza Raza Kazmi
- Section of Dentistry, Department of Surgery, Aga Khan University Hospital, Karachi, Pakistan
- Fahad Umer
- Section of Dentistry, Department of Surgery, Aga Khan University Hospital, Karachi, Pakistan
3
AlShehri Y, Sidhu A, Lakshmanan LVS, Lefaivre KA. Applications of Natural Language Processing for Automated Clinical Data Analysis in Orthopaedics. J Am Acad Orthop Surg 2024;32:439-446. PMID: 38626429. DOI: 10.5435/jaaos-d-23-00839.
Abstract
Natural language processing (NLP) is an exciting and emerging field in health care that can transform orthopaedics. It can aid automated clinical data analysis, changing the way we extract data for purposes including research and registry formation, diagnosis, and medical billing. This scoping review examines the various applications of NLP in orthopaedics. Specific examples include identification of essential data elements from surgical and imaging reports, patient feedback analysis, and use of AI conversational agents for patient engagement. We demonstrate how NLP has proven itself to be a powerful and valuable tool. Despite these potential advantages, there are drawbacks to consider: concerns with data quality, bias, privacy, and accessibility may stand as barriers to widespread implementation of NLP technology. As NLP technology continues to develop, it has the potential to revolutionize orthopaedic research and clinical practice and enhance patient outcomes.
Affiliation(s)
- Yasir AlShehri
- From the Department of Orthopedics, College of Medicine, Imam Abdulrahman Bin Faisal University, Dammam, Saudi Arabia (AlShehri), the Department of Orthopaedics, Faculty of Medicine, The University of British Columbia, Vancouver, BC, Canada (Sidhu and Lefaivre), and the Department of Computer Science, The University of British Columbia, Vancouver, BC, Canada (Lakshmanan)
4
Kasapovic A, Ali T, Babasiz M, Bojko J, Gathen M, Kaczmarczyk R, Roos J. Does the Information Quality of ChatGPT Meet the Requirements of Orthopedics and Trauma Surgery? Cureus 2024;16:e60318. PMID: 38882956. PMCID: PMC11177007. DOI: 10.7759/cureus.60318.
Abstract
BACKGROUND: The integration of artificial intelligence (AI) in medicine, particularly through AI-based language models like ChatGPT, offers a promising avenue for enhancing patient education and healthcare delivery. This study aims to evaluate the quality of medical information provided by Chat Generative Pre-trained Transformer (ChatGPT) regarding common orthopedic and trauma surgical procedures, assess its limitations, and explore its potential as a supplementary source for patient education.
METHODS: Using the GPT-3.5-Turbo version of ChatGPT, simulated patient information was generated for 20 orthopedic and trauma surgical procedures. The study used standardized information forms as a reference for evaluating ChatGPT's responses. The accuracy and quality of the provided information were assessed with a modified DISCERN (mDISCERN) instrument, and a global medical assessment was conducted to categorize the information's usefulness and reliability.
RESULTS: ChatGPT mentioned an average of 47% of the relevant keywords across procedures, with mention rates ranging from 30.5% to 68.6%. The average mDISCERN score was 2.4 out of 5, indicating moderate-to-low information quality. None of the ChatGPT-generated fact sheets were rated "very useful"; 45% were deemed "somewhat useful," 35% "not useful," and 20% "dangerous." A positive correlation was found between higher mDISCERN scores and better physician ratings, suggesting that information quality directly affects perceived utility.
CONCLUSION: While AI-based language models like ChatGPT hold significant promise for medical education and patient care, the current quality of information provided in orthopedics and trauma surgery is suboptimal. Further development and refinement of AI sources and algorithms are necessary to improve the accuracy and reliability of medical information. This study underscores the need for ongoing research and development in AI applications in healthcare, emphasizing the critical role of accurate, high-quality information in patient education and informed consent processes.
Affiliation(s)
- Adnan Kasapovic
- Department of Orthopedics and Trauma Surgery, University Hospital of Bonn, Bonn, DEU
- Thaer Ali
- Department of Orthopedics and Trauma Surgery, University Hospital of Bonn, Bonn, DEU
- Mari Babasiz
- Department of Orthopedics and Trauma Surgery, University Hospital of Bonn, Bonn, DEU
- Jessica Bojko
- Department of Orthopedics and Trauma Surgery, University Hospital of Bonn, Bonn, DEU
- Martin Gathen
- Department of Orthopedics and Trauma Surgery, University Hospital of Bonn, Bonn, DEU
- Robert Kaczmarczyk
- Department of Dermatology and Allergy, School of Medicine, Technical University of Munich, Munich, DEU
- Jonas Roos
- Department of Orthopedics and Trauma Surgery, University Hospital of Bonn, Bonn, DEU
5
Cote MP, Lubowitz JH. Recommended Requirements and Essential Elements for Proper Reporting of the Use of Artificial Intelligence Machine Learning Tools in Biomedical Research and Scientific Publications. Arthroscopy 2024;40:1033-1038. PMID: 38300189. DOI: 10.1016/j.arthro.2023.12.027.
Abstract
Essential elements required for proper use of artificial intelligence machine learning tools in biomedical research and scientific publications include:
(1) an explanation justifying why a machine learning approach contributes to the purpose of the study;
(2) a description of the adequacy of the data (input) to produce the desired results (output);
(3) details of the algorithmic (i.e., computational) approach, including methods for organizing the data (preprocessing), the machine learning algorithm(s) assessed, the data on which the models were trained, the presence of bias and efforts to mitigate its effects, and the methods for quantifying the variables (features) most influential in determining the results (e.g., Shapley values);
(4) a description of methods, and reporting of results, quantifying performance in terms of both model accuracy and model calibration (level of confidence in the model's predictions);
(5) availability of the programming code, including a link to the code when available (ideally, the code should be available);
(6) discussion of model internal validation (are the results applicable and sensitive to the population investigated and the data on which the model was trained?) and external validation (were the results investigated as to whether they generalize to different populations? If not, consideration of this limitation and discussion of plans for external validation, i.e., next steps).
As biomedical research submissions using artificial intelligence technology increase, these requirements could facilitate purposeful use and comprehensive methodological reporting.
6
Desai V. The Future of Artificial Intelligence in Sports Medicine and Return to Play. Semin Musculoskelet Radiol 2024;28:203-212. PMID: 38484772. DOI: 10.1055/s-0043-1778019.
Abstract
Artificial intelligence (AI) has shown tremendous growth over the last decade, with more recent development of clinical applications in health care. AI's ability to synthesize large amounts of complex data automatically allows health care providers to access previously unavailable metrics and thus enhance and personalize patient care. These innovations include AI-assisted diagnostic tools, prediction models for treatment pathways, and various tools for workflow optimization. The extension of AI into sports medicine is still in its early stages, but numerous AI-driven algorithms, devices, and research initiatives have delved into predicting and preventing athlete injury, aiding injury assessment, optimizing recovery plans, monitoring rehabilitation progress, and predicting return to play.
Affiliation(s)
- Vishal Desai
- Department of Radiology, Thomas Jefferson University, Philadelphia, Pennsylvania
7
Sharma SC, Ramchandani JP, Thakker A, Lahiri A. ChatGPT in Plastic and Reconstructive Surgery. Indian J Plast Surg 2023;56:320-325. PMID: 37705820. PMCID: PMC10497341. DOI: 10.1055/s-0043-1771514.
Abstract
BACKGROUND: Chat Generative Pre-Trained Transformer (ChatGPT) is a versatile large language model-based generative artificial intelligence. It is proficient in a variety of tasks, from drafting emails to coding to composing music to passing medical licensing exams. While the potential role of ChatGPT in plastic surgery is promising, evidence-based research is needed to guide its implementation in practice.
METHODS: This review summarizes the literature surrounding ChatGPT's use in plastic surgery.
RESULTS: A literature search revealed several applications for ChatGPT in plastic surgery, including the ability to create academic literature and to aid the production of research; however, the ethical implications of using such chatbots in scientific writing require careful consideration. ChatGPT can also generate high-quality patient discharge summaries and operation notes within seconds, freeing up busy junior doctors for other tasks, although clinical information must currently be inputted manually and clinicians must consider data privacy implications. Its use in aiding patient communication, education, and training is also widely documented in the literature, but questions have been raised over the accuracy of generated answers, given that current versions of ChatGPT cannot access the most up-to-date sources.
CONCLUSIONS: While one must be aware of its shortcomings, ChatGPT is a useful tool for plastic surgeons to improve productivity across a range of tasks, from manuscript preparation to healthcare communication to drafting teaching sessions to studying and learning. As access improves and the technology becomes more refined, more uses for ChatGPT in plastic surgery will surely become apparent.
Affiliation(s)
- Sanjeev Chaand Sharma
- Department of Plastic Surgery, Leicester Royal Infirmary, Infirmary Square, Leicester, United Kingdom
- Jai Parkash Ramchandani
- Faculty of Life Sciences & Medicine, King's College London, Guy's Campus, Great Maze Pond, London, United Kingdom
- Arjuna Thakker
- Academic Team of Musculoskeletal Surgery, Leicester General Hospital, University Hospitals of Leicester NHS Trust, United Kingdom
- Anindya Lahiri
- Department of Plastic Surgery, Sandwell General Hospital, West Bromwich, United Kingdom
8
McBee JC, Han DY, Liu L, Ma L, Adjeroh DA, Xu D, Hu G. Interdisciplinary Inquiry via PanelGPT: Application to Explore Chatbot Application in Sports Rehabilitation. medRxiv [Preprint] 2023:2023.07.23.23292452. PMID: 37546795. PMCID: PMC10402232. DOI: 10.1101/2023.07.23.23292452.
Abstract
BACKGROUND: ChatGPT showcases exceptional conversational capabilities and extensive cross-disciplinary knowledge. In addition, it can perform multiple roles within a single chat session. This multi-role-playing feature positions ChatGPT as a promising tool for exploring interdisciplinary subjects.
OBJECTIVE: The study aimed to guide ChatGPT through interdisciplinary exploration via simulated panel discussions. As a proof of concept, we employed this method to evaluate the advantages and challenges of using chatbots in sports rehabilitation.
METHODS: We propose a model, termed PanelGPT, to probe ChatGPT's knowledge of interdisciplinary topics through simulated panel discussions. Applied to "chatbots in sports rehabilitation," ChatGPT role-played both the moderator and the panelists, the latter including a physiotherapist, a psychologist, a nutritionist, an AI expert, and an athlete. Acting as the audience, we posed questions to the panel, with ChatGPT answering as the panelists and hosting the discussion as the moderator. We ran the simulation using the ChatGPT-4 model and evaluated the responses against the existing literature and human expertise.
RESULTS: Each simulation mimicked a real-life panel discussion: the moderator introduced the panel and posed opening and closing questions, to which all panelists responded, and the experts engaged with one another to address inquiries from the audience, primarily within their respective fields of expertise. By tackling questions related to education, physiotherapy, physiology, nutrition, and ethical considerations, the discussion highlighted benefits such as 24/7 support, personalized advice, automated tracking, and reminders. It also emphasized the importance of user education and identified challenges such as limited interaction modes, inaccuracies in emotion-related advice, assurance of data privacy and security, transparency in data handling, and fairness in model training. The panelists reached a consensus that chatbots are designed to assist, not replace, human healthcare professionals in the rehabilitation process.
CONCLUSIONS: Compared with a typical conversation with ChatGPT, the multi-perspective approach of PanelGPT facilitates a comprehensive understanding of an interdisciplinary topic by integrating insights from experts with complementary knowledge. Beyond the exemplified topic of chatbots in sports rehabilitation, the model can be adapted to a wide array of interdisciplinary topics in educational, research, and healthcare settings.
Affiliation(s)
- Joseph C. McBee
- Department of Microbiology, Immunology & Cell Biology, West Virginia University, Morgantown, WV 26506, USA
- Department of Chemical and Biomedical Engineering, West Virginia University, Morgantown, WV 26506, USA
- Daniel Y. Han
- Department of Microbiology, Immunology & Cell Biology, West Virginia University, Morgantown, WV 26506, USA
- Li Liu
- College of Health Solutions, Arizona State University, Phoenix, AZ 85004, USA
- Biodesign Institute, Arizona State University, Tempe, AZ 85281, USA
- Leah Ma
- College of Health, Education, and Human Services, Wright State University, Dayton, OH 45345, USA
- Donald A. Adjeroh
- Lane Department of Computer Science & Electrical Engineering, West Virginia University, Morgantown, WV 26506, USA
- Dong Xu
- Department of Electrical Engineering and Computer Science, Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, MO 65211, USA
- Gangqing Hu
- Department of Microbiology, Immunology & Cell Biology, West Virginia University, Morgantown, WV 26506, USA