1
Yao JJ, Aggarwal M, Lopez RD, Namdari S. Current Concepts Review: Large Language Models in Orthopaedics: Definitions, Uses, and Limitations. J Bone Joint Surg Am 2024:00004623-990000000-01136. PMID: 38896652. DOI: 10.2106/jbjs.23.01417.
Abstract
➤ Large language models are a subset of artificial intelligence. They are powerful tools that excel in natural language text processing and generation.
➤ There are many potential clinical, research, and educational applications of large language models in orthopaedics, but the development of these applications must remain focused on patient safety and the maintenance of high standards.
➤ There are numerous methodological, ethical, and regulatory concerns regarding the use of large language models. Orthopaedic surgeons need to be aware of the controversies and advocate for the alignment of these models with patient and caregiver priorities.
Affiliation(s)
- Jie J Yao
- Rothman Orthopaedic Institute, Thomas Jefferson University, Philadelphia, Pennsylvania
- Ryan D Lopez
- Rothman Orthopaedic Institute, Thomas Jefferson University, Philadelphia, Pennsylvania
| | - Surena Namdari
- Rothman Orthopaedic Institute, Thomas Jefferson University, Philadelphia, Pennsylvania
2
Kıyak YS, Emekli E. ChatGPT prompts for generating multiple-choice questions in medical education and evidence on their validity: a literature review. Postgrad Med J 2024:qgae065. PMID: 38840505. DOI: 10.1093/postmj/qgae065.
Abstract
ChatGPT's role in creating multiple-choice questions (MCQs) is growing, but the validity of these artificial-intelligence-generated questions is unclear. This literature review was conducted to address the urgent need to understand the application of ChatGPT in generating MCQs for medical education. Following a database search and the screening of 1920 studies, we found 23 relevant studies. We extracted the prompts used for MCQ generation and assessed the validity evidence of the MCQs. The findings showed that prompts varied, including referencing specific exam styles and adopting specific personas, which aligns with recommended prompt-engineering tactics. The validity evidence covered various domains and showed mixed accuracy rates: some studies indicated quality comparable to human-written questions, while others highlighted differences in difficulty and discrimination levels, alongside a significant reduction in question-creation time. Despite this efficiency, we highlight the necessity of careful review and suggest a need for further research to optimize the use of ChatGPT in question generation.
Main messages:
- Ensure high-quality outputs by using well-designed prompts; medical educators should prioritize detailed, clear ChatGPT prompts when generating MCQs.
- Avoid using ChatGPT-generated MCQs directly in examinations without thorough review, to prevent inaccuracies and ensure relevance.
- Leverage ChatGPT's potential to streamline the test development process, enhancing efficiency without compromising quality.
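As an illustration of the prompt tactics identified in the review (persona adoption, referencing a specific exam style, and an explicit output format), a generation prompt might look like the following sketch; the wording is hypothetical and is not one of the reviewed prompts:

```python
# Hypothetical MCQ-generation prompt combining the tactics named above:
# a persona, an exam-style reference, and an explicit output format.
PROMPT = (
    "You are an experienced medical educator writing USMLE Step 1-style items. "
    "Generate one case-based multiple-choice question on hypertension with "
    "five options (A-E), exactly one correct answer, and a brief rationale "
    "for each option."
)
```

In practice such a string would be sent to the model's chat API; the review's message is that the resulting items still require expert review before exam use.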
Affiliation(s)
- Yavuz Selim Kıyak
- Department of Medical Education and Informatics, Faculty of Medicine, Gazi University, Ankara 06500, Turkey
- Emre Emekli
- Department of Radiology, Faculty of Medicine, Eskişehir Osmangazi University, Eskişehir 26040, Turkey
3
Le KDR, Tay SBP, Choy KT, Verjans J, Sasanelli N, Kong JCH. Applications of natural language processing tools in the surgical journey. Front Surg 2024;11:1403540. PMID: 38826809. PMCID: PMC11140056. DOI: 10.3389/fsurg.2024.1403540.
Abstract
Background: Natural language processing tools are being increasingly adopted across industries worldwide. They have shown promising results; however, their use in the field of surgery is under-recognised. Many trials have assessed their benefits in small settings with promising results, but large-scale adoption in surgery has yet to be considered. This study aims to review the current research and insights into the potential implementation of natural language processing tools in surgery.
Methods: A narrative review was conducted following a computer-assisted literature search of the Medline, EMBASE, and Google Scholar databases. Papers related to natural language processing tools and considerations for their use in surgery were included.
Results: Current applications of natural language processing tools within surgery are limited. The literature provides evidence of potential improvements in surgical capability and service delivery, such as using these technologies to streamline processes including surgical triaging, data collection and auditing, and surgical communication and documentation. There is also potential to extend these capabilities to surgical academia, improving processes in surgical research and enabling innovation in the development of educational resources. Despite these outcomes, the evidence supporting these findings is challenged by small sample sizes with limited applicability to broader settings.
Conclusion: With the increasing adoption of natural language processing technology, such as in popular forms like ChatGPT, there has been increasing research into the use of these tools to improve surgical workflow and efficiency. This review highlights multifaceted applications of natural language processing within surgery, albeit with clear limitations due to the infancy of the infrastructure available to leverage these technologies. There remains room for more rigorous research into the broader capability of natural language processing technology in surgery, and a need for cross-sectoral collaboration to understand how these algorithms can best be integrated.
Affiliation(s)
- Khang Duy Ricky Le
- Department of General Surgical Specialties, The Royal Melbourne Hospital, Melbourne, VIC, Australia
- Department of Surgical Oncology, Peter MacCallum Cancer Centre, Melbourne, VIC, Australia
- Geelong Clinical School, Deakin University, Geelong, VIC, Australia
- Department of Medical Education, The University of Melbourne, Melbourne, VIC, Australia
- Samuel Boon Ping Tay
- Department of Anaesthesia and Pain Medicine, Eastern Health, Box Hill, VIC, Australia
- Kay Tai Choy
- Department of Surgery, Austin Health, Melbourne, VIC, Australia
- Johan Verjans
- Australian Institute for Machine Learning (AIML), University of Adelaide, Adelaide, SA, Australia
- Lifelong Health Theme (Platform AI), South Australian Health and Medical Research Institute, Adelaide, SA, Australia
- Nicola Sasanelli
- Division of Information Technology, Engineering and the Environment, University of South Australia, Adelaide, SA, Australia
- Department of Operations (Strategic and International Partnerships), SmartSAT Cooperative Research Centre, Adelaide, SA, Australia
- Agora High Tech, Adelaide, SA, Australia
- Joseph C. H. Kong
- Department of Surgical Oncology, Peter MacCallum Cancer Centre, Melbourne, VIC, Australia
- Monash University Department of Surgery, Alfred Hospital, Melbourne, VIC, Australia
- Department of Colorectal Surgery, Alfred Hospital, Melbourne, VIC, Australia
- Sir Peter MacCallum Department of Oncology, The University of Melbourne, Melbourne, VIC, Australia
4
Kıyak YS, Coşkun Ö, Budakoğlu Iİ, Uluoğlu C. ChatGPT for generating multiple-choice questions: Evidence on the use of artificial intelligence in automatic item generation for a rational pharmacotherapy exam. Eur J Clin Pharmacol 2024;80:729-735. PMID: 38353690. DOI: 10.1007/s00228-024-03649-x.
Abstract
PURPOSE Artificial intelligence, specifically large language models such as ChatGPT, offers potentially valuable benefits in question (item) writing. This study aimed to determine the feasibility of generating case-based multiple-choice questions using ChatGPT, in terms of item difficulty and discrimination levels.
METHODS This study involved 99 fourth-year medical students who participated in a rational pharmacotherapy clerkship based on the WHO 6-Step Model. In response to a prompt that we provided, ChatGPT generated ten case-based multiple-choice questions on hypertension. Following an expert panel review, two of these questions were incorporated into a medical school exam without any changes. Based on the administration of the test, we evaluated their psychometric properties, including item difficulty, item discrimination (point-biserial correlation), and the functionality of the options.
RESULTS Both questions exhibited acceptable point-biserial correlations (0.41 and 0.39), above the 0.30 threshold. However, one question had three non-functional options (options chosen by fewer than 5% of exam participants), while the other had none.
CONCLUSIONS The findings showed that the questions can effectively differentiate between high- and low-performing students, which points to the potential of ChatGPT as an artificial intelligence tool in test development. Future studies may use the prompt to generate items, enhancing the external validity of the results by gathering data from diverse institutions and settings.
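The item statistics this study reports (point-biserial discrimination against a 0.30 threshold, and non-functional options chosen by fewer than 5% of examinees) are standard classical test theory measures; a minimal sketch of how they are computed, with illustrative data rather than the study's:

```python
def point_biserial(item_scores, total_scores):
    """Correlate 0/1 item scores with examinees' total test scores.
    Returns (r_pb, difficulty), where difficulty is the proportion correct."""
    n = len(item_scores)
    mean_t = sum(total_scores) / n
    sd_t = (sum((t - mean_t) ** 2 for t in total_scores) / n) ** 0.5
    correct = [t for s, t in zip(item_scores, total_scores) if s == 1]
    p = len(correct) / n                      # item difficulty
    mean_correct = sum(correct) / len(correct)
    r_pb = (mean_correct - mean_t) / sd_t * (p / (1 - p)) ** 0.5
    return r_pb, p

def non_functional_options(option_counts, n_examinees, threshold=0.05):
    """Distractor analysis: options chosen by fewer than 5% of examinees."""
    return [opt for opt, c in option_counts.items() if c / n_examinees < threshold]
```

An item would be flagged as discriminating well when `point_biserial` exceeds 0.30, as both ChatGPT-generated questions did here.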
Affiliation(s)
- Yavuz Selim Kıyak
- Department of Medical Education and Informatics, Faculty of Medicine, Gazi University, Ankara, Turkey.
- Gazi Üniversitesi Hastanesi E Blok 9, Kat 06500 Beşevler, Ankara, Turkey.
- Özlem Coşkun
- Department of Medical Education and Informatics, Faculty of Medicine, Gazi University, Ankara, Turkey
- Işıl İrem Budakoğlu
- Department of Medical Education and Informatics, Faculty of Medicine, Gazi University, Ankara, Turkey
- Canan Uluoğlu
- Department of Medical Pharmacology, Faculty of Medicine, Gazi University, Ankara, Turkey
5
Wu C, Chen L, Han M, Li Z, Yang N, Yu C. Application of ChatGPT-based blended medical teaching in clinical education of hepatobiliary surgery. Med Teach 2024:1-5. PMID: 38614458. DOI: 10.1080/0142159x.2024.2339412.
Abstract
OBJECTIVE This study evaluates the effectiveness of incorporating the Chat Generative Pre-trained Transformer (ChatGPT) into the clinical teaching of hepatobiliary surgery for undergraduate medical students.
MATERIALS AND METHODS A group of 61 medical undergraduates from the Affiliated Hospital of Guizhou Medical University, undergoing hepatobiliary surgery training, were randomly assigned to either an experimental group (31 students) using ChatGPT-based blended teaching or a control group (30 students) taught with traditional methods. The evaluation metrics included final exam scores, teaching satisfaction, and teaching effectiveness ratings, analyzed using SPSS 26.0 (SPSS Inc., Chicago, IL) with t-tests and χ² tests.
RESULTS The experimental group significantly outperformed the control group in final exam theoretical scores (86.44 ± 5.59 vs. 77.86 ± 4.16, p < .001) and clinical skills scores (83.84 ± 6.13 vs. 79.12 ± 4.27, p = .001). The experimental group also reported higher teaching satisfaction (17.23 ± 1.33) and self-evaluated teaching effectiveness (9.14 ± 0.54) than the control group (15.38 ± 1.5 and 8.46 ± 0.70, respectively; p < .001).
CONCLUSIONS The integration of ChatGPT into hepatobiliary surgery education significantly enhances theoretical knowledge, clinical skills, and overall satisfaction among medical undergraduates, suggesting a beneficial impact on their educational development.
Affiliation(s)
- Changhao Wu
- Department of Hepatobiliary Surgery, The Affiliated Hospital of Guizhou Medical University, Guizhou Medical University, Guiyang, China
- Department of Surgery, Guizhou Medical University, Guiyang, China
- College of Clinical Medicine, Guizhou Medical University, Guiyang, China
- Guizhou Provincial Institute of Hepatobiliary, Pancreatic and Splenic Diseases, Guiyang, China
- Liwen Chen
- Department of Hepatobiliary Surgery, The Affiliated Hospital of Guizhou Medical University, Guizhou Medical University, Guiyang, China
- Department of Surgery, Guizhou Medical University, Guiyang, China
- College of Clinical Medicine, Guizhou Medical University, Guiyang, China
- Guizhou Provincial Institute of Hepatobiliary, Pancreatic and Splenic Diseases, Guiyang, China
- Min Han
- Department of Hepatobiliary Surgery, The Affiliated Hospital of Guizhou Medical University, Guizhou Medical University, Guiyang, China
- Department of Surgery, Guizhou Medical University, Guiyang, China
- College of Clinical Medicine, Guizhou Medical University, Guiyang, China
- Guizhou Provincial Institute of Hepatobiliary, Pancreatic and Splenic Diseases, Guiyang, China
- Zhu Li
- Department of Hepatobiliary Surgery, The Affiliated Hospital of Guizhou Medical University, Guizhou Medical University, Guiyang, China
- Department of Surgery, Guizhou Medical University, Guiyang, China
- College of Clinical Medicine, Guizhou Medical University, Guiyang, China
- Guizhou Provincial Institute of Hepatobiliary, Pancreatic and Splenic Diseases, Guiyang, China
- Nenghong Yang
- Department of Hepatobiliary Surgery, The Affiliated Hospital of Guizhou Medical University, Guizhou Medical University, Guiyang, China
- Department of Surgery, Guizhou Medical University, Guiyang, China
- College of Clinical Medicine, Guizhou Medical University, Guiyang, China
- Guizhou Provincial Institute of Hepatobiliary, Pancreatic and Splenic Diseases, Guiyang, China
- Chao Yu
- Department of Hepatobiliary Surgery, The Affiliated Hospital of Guizhou Medical University, Guizhou Medical University, Guiyang, China
- Department of Surgery, Guizhou Medical University, Guiyang, China
- College of Clinical Medicine, Guizhou Medical University, Guiyang, China
- Guizhou Provincial Institute of Hepatobiliary, Pancreatic and Splenic Diseases, Guiyang, China
6
Busch F, Adams LC, Bressem KK. Spotlight on the biomedical ethical integration of AI in medical education - Response to: 'An explorative assessment of ChatGPT as an aid in medical education: Use it with caution'. Med Teach 2024;46:594-595. PMID: 38104590. DOI: 10.1080/0142159x.2023.2293655.
Affiliation(s)
- Felix Busch
- Department of Radiology, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt Universität zu Berlin, Berlin, Germany
- Lisa C Adams
- Department of Radiology, Klinikum rechts der Isar, Technische Universität München (TUM), Munich, Germany
- Keno K Bressem
- Department of Radiology, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt Universität zu Berlin, Berlin, Germany
7
Mu Y, He D. The Potential Applications and Challenges of ChatGPT in the Medical Field. Int J Gen Med 2024;17:817-826. PMID: 38476626. PMCID: PMC10929156. DOI: 10.2147/ijgm.s456659.
Abstract
ChatGPT, an AI-driven conversational large language model (LLM), has garnered significant scholarly attention since its inception, owing to its manifold applications in the realm of medical science. This study primarily examines the merits, limitations, anticipated developments, and practical applications of ChatGPT in clinical practice, healthcare, medical education, and medical research. It underscores the necessity for further research and development to enhance its performance and deployment. Moreover, future research avenues encompass ongoing enhancements and standardization of ChatGPT, mitigating its limitations, and exploring its integration and applicability in translational and personalized medicine. Reflecting the narrative nature of this review, a focused literature search was performed to identify relevant publications on ChatGPT's use in medicine. This process was aimed at gathering a broad spectrum of insights to provide a comprehensive overview of the current state and future prospects of ChatGPT in the medical domain. The objective is to aid healthcare professionals in understanding the groundbreaking advancements associated with the latest artificial intelligence tools, while also acknowledging the opportunities and challenges presented by ChatGPT.
Affiliation(s)
- Yonglin Mu
- Department of Urology, Children’s Hospital of Chongqing Medical University, Chongqing, People’s Republic of China
- Dawei He
- Department of Urology, Children’s Hospital of Chongqing Medical University, Chongqing, People’s Republic of China
8
Hess BJ, Cupido N, Ross S, Kvern B. Becoming adaptive experts in an era of rapid advances in generative artificial intelligence. Med Teach 2024;46:300-303. PMID: 38092006. DOI: 10.1080/0142159x.2023.2289844.
Affiliation(s)
- Brian J Hess
- College of Family Physicians of Canada, Department of Certification and Assessment, Mississauga, Ontario, Canada
- Nathan Cupido
- The Wilson Centre, University Health Network and Temerty Faculty of Medicine, and the Institute of Health Policy, Management, and Evaluation, University of Toronto, Toronto, Ontario, Canada
- Shelley Ross
- Department of Family Medicine, Faculty of Medicine and Dentistry, College of Health Sciences, University of Alberta, Edmonton, Canada
- Brent Kvern
- College of Family Physicians of Canada, Department of Certification and Assessment, Mississauga, Ontario, Canada