1
Ch'en PY, Day W, Pekson RC, Barrientos J, Burton WB, Ludwig AB, Jariwala SP, Cassese T. GPT-4 generated answer rationales to multiple choice assessment questions in undergraduate medical education. BMC Medical Education 2025; 25:333. [PMID: 40038669] [PMCID: PMC11877964] [DOI: 10.1186/s12909-025-06862-z]
Abstract
BACKGROUND Pre-clerkship medical students benefit from practice questions that provide rationales for answer choices. Creating these rationales is time-intensive, so not all practice multiple choice questions (MCQs) have corresponding explanations to aid learning. The authors examined the potential of artificial intelligence (AI) to create high-quality answer rationales for clinical vignette-style MCQs.
METHODS The authors conducted a single-center pre-post intervention survey study in August 2023 assessing the attitudes of 8 pre-clerkship course directors (CDs) towards GPT-4 generated answer rationales for clinical vignette-style MCQs. Ten MCQs from each course's question bank were selected and input into GPT-4 with instructions to select the best answer and generate rationales for each answer choice. CDs were provided their unmodified GPT-4 interactions to assess the accuracy, clarity, appropriateness, and likelihood of implementation of the rationales. CDs were also asked about time spent reviewing and making necessary modifications, satisfaction, and receptiveness to using GPT-4 for this purpose.
RESULTS GPT-4 correctly answered 75/80 (93.8%) questions on the first attempt. CDs were receptive to using GPT-4 for rationale generation, and all were satisfied with the generated rationales. CDs rated the majority of rationales as very accurate (77.5%), very clear (83.8%), and very appropriate (93.8%). Most rationales could be implemented with little or no modification (88.3%). All CDs would implement AI-generated answer rationales with CD editorial input. Most CDs (75%) took ≤ 4 min to review a set of generated rationales for a question.
CONCLUSION GPT-4 is an acceptable and feasible tool for generating accurate, clear, and appropriate answer rationales for MCQs in medical education. Future studies should examine students' feedback on generated rationales and further explore generating rationales for questions with media. The authors plan to explore implementing this technological application at their medical school, including the logistics and training needed to create a streamlined process that benefits both learners and educators.
CLINICAL TRIAL Not applicable; not a clinical trial.
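The methods describe inputting each MCQ into GPT-4 with an instruction to select the best answer and write a rationale for every answer choice. The study's actual prompt and interface are not reported; the following is a minimal Python sketch of that kind of request using the OpenAI chat completions API, with a hypothetical prompt template and example question.

```python
# Minimal sketch of prompting an LLM to pick the best answer and write per-option
# rationales for a clinical vignette MCQ. The prompt wording, model name, and the
# example question are illustrative assumptions, not the study's materials.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

PROMPT_TEMPLATE = (
    "You are writing answer explanations for a pre-clerkship medical school practice exam.\n"
    "Select the single best answer to the question below, then give a brief rationale for\n"
    "why each answer choice is correct or incorrect.\n\n"
    "Question:\n{stem}\n\nAnswer choices:\n{choices}"
)

def generate_rationales(stem: str, choices: dict[str, str], model: str = "gpt-4") -> str:
    """Return the model's chosen answer plus a rationale for each option."""
    choice_text = "\n".join(f"{label}. {text}" for label, text in choices.items())
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user",
                   "content": PROMPT_TEMPLATE.format(stem=stem, choices=choice_text)}],
    )
    return response.choices[0].message.content

# Hypothetical usage; course directors would still review and edit the output.
print(generate_rationales(
    stem="A 62-year-old man presents with crushing substernal chest pain ...",
    choices={"A": "Aortic dissection", "B": "Acute myocardial infarction",
             "C": "Pulmonary embolism", "D": "Pericarditis"},
))
```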
Affiliation(s)
- Wesley Day
- Albert Einstein College of Medicine, Bronx, NY, USA
- Todd Cassese
- Albert Einstein College of Medicine, Bronx, NY, USA.
- Department of Medicine, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, NY, USA.
2
Huang G, Zhang H, Zeng J, Chen W. A study of the effect of question feedback types on learning engagement in panoramic videos. Front Psychol 2025; 16:1321712. [PMID: 40083766] [PMCID: PMC11903445] [DOI: 10.3389/fpsyg.2025.1321712]
Abstract
Introduction The immersive and interactive nature of panoramic video gives learners experiences that closely approximate the real environment and encourages the use of imagination in knowledge acquisition. Studies have shown that embedding question feedback in traditional educational videos can effectively improve learning. However, little research has examined embedding question feedback in panoramic videos to determine which types of feedback improve the dimensions of learners' learning engagement and yield better learning experiences and outcomes.
Methods This study embedded questions with feedback within panoramic videos, categorizing feedback into two types: simple feedback and elaborated feedback. Using eye tracking, brainwave monitoring, and subjective questionnaires as measurement tools, the study investigated which type of question feedback embedded in panoramic videos improved the various dimensions of learner engagement and academic performance. Participants (n = 91) were randomly assigned to an experimental group (simple feedback or elaborated feedback) or a control group (no feedback).
Results (1) The experimental groups showed significantly higher cognitive, behavioral, and emotional engagement than the control group. More precise feedback was associated with greater behavioral engagement; however, feedback precision did not significantly affect cognitive or emotional engagement. (2) More detailed feedback was associated with better academic performance.
Discussion The findings of this study can support strategic recommendations for the design and application of panoramic videos.
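The abstract does not state which statistical test was used to compare the three conditions (no feedback, simple feedback, elaborated feedback). Purely as an illustration of how such a three-group comparison could be run on an engagement measure, here is a short Python sketch using a one-way ANOVA on hypothetical scores; the data, group sizes, and choice of test are assumptions.

```python
# Illustrative one-way ANOVA across the three feedback conditions (control, simple,
# elaborated). The scores and the choice of test are assumptions for demonstration only.
import numpy as np
from scipy.stats import f_oneway

rng = np.random.default_rng(1)
control = rng.normal(3.2, 0.6, 30)      # hypothetical engagement scores, no feedback
simple = rng.normal(3.6, 0.6, 30)       # simple feedback
elaborated = rng.normal(3.9, 0.6, 31)   # elaborated feedback (n = 91 in total)

f_stat, p_value = f_oneway(control, simple, elaborated)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
```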
Affiliation(s)
- Guan Huang
- College of Education, China West Normal University, Nanchong, China
- Haohua Zhang
- College of Education, China West Normal University, Nanchong, China
- Jingsheng Zeng
- College of Education, China West Normal University, Nanchong, China
- Wen Chen
- Cyberspace Security Academy, Sichuan University, Chengdu, China
3
Gauthier TP, Cady E, Liu T, Landayan AM. Resources and strategies for learning infectious diseases pharmacotherapy during advanced pharmacy practice experiences and pharmacy residency. Am J Health Syst Pharm 2025; 82:228-234. [PMID: 39185681] [DOI: 10.1093/ajhp/zxae250]
Affiliation(s)
- Timothy P Gauthier
- Baptist Health Clinical Pharmacy Enterprise, Baptist Health South Florida, Miami, FL, USA
- Elizabeth Cady
- Southern Illinois University at Edwardsville School of Pharmacy, Springfield, IL, USA
- Thomas Liu
- Baptist Health Clinical Pharmacy Enterprise, Baptist Health South Florida, Miami, FL, USA
- Alice M Landayan
- Pharmacy Department, South Miami Hospital, Baptist Health South Florida, Miami, FL, USA
4
Wu Z, Gan W, Xue Z, Ni Z, Zheng X, Zhang Y. Performance of ChatGPT on Nursing Licensure Examinations in the United States and China: Cross-Sectional Study. JMIR Medical Education 2024; 10:e52746. [PMID: 39363539] [PMCID: PMC11466054] [DOI: 10.2196/52746]
Abstract
Background The creation of large language models (LLMs) such as ChatGPT is an important step in the development of artificial intelligence and shows great potential in medical education due to powerful language understanding and generative capabilities. The purpose of this study was to quantitatively evaluate and comprehensively analyze ChatGPT's performance on questions from the nursing licensure examinations of the United States and China: the National Council Licensure Examination for Registered Nurses (NCLEX-RN) and the National Nursing Licensure Examination (NNLE), respectively.
Objective This study aims to examine how well LLMs answer NCLEX-RN and NNLE multiple-choice questions (MCQs) given inputs in different languages, to evaluate whether LLMs can be used as multilingual learning assistants for nursing, and to assess whether they possess a repository of professional knowledge applicable to clinical nursing practice.
Methods First, we compiled 150 NCLEX-RN Practical MCQs, 240 NNLE Theoretical MCQs, and 240 NNLE Practical MCQs. Then, the translation function of ChatGPT 3.5 was used to translate NCLEX-RN questions from English to Chinese and NNLE questions from Chinese to English. Finally, the original and translated versions of the MCQs were input into ChatGPT 4.0, ChatGPT 3.5, and Google Bard. The LLMs were compared by accuracy rate, and differences between language inputs were examined.
Results The accuracy rates of ChatGPT 4.0 for the NCLEX-RN Practical MCQs and their Chinese translations were 88.7% (133/150) and 79.3% (119/150), respectively. Despite the statistical significance of the difference (P=.03), the correct rate was generally satisfactory. ChatGPT 4.0 correctly answered 71.9% (169/235) of the NNLE Theoretical MCQs and 69.1% (161/233) of the NNLE Practical MCQs. Its accuracy on the NNLE Theoretical and Practical MCQs translated into English was 71.5% (168/235; P=.92) and 67.8% (158/233; P=.77), respectively, with no statistically significant difference between input languages. With English input, ChatGPT 3.5 (NCLEX-RN P=.003, NNLE Theoretical P<.001, NNLE Practical P=.12) and Google Bard (NCLEX-RN P<.001, NNLE Theoretical P<.001, NNLE Practical P<.001) had lower accuracy rates on nursing-related MCQs than ChatGPT 4.0. For ChatGPT 3.5, accuracy with English input was higher than with Chinese input, and the difference was statistically significant (NCLEX-RN P=.02, NNLE Practical P=.02). Whether the MCQs were submitted in Chinese or English, ChatGPT 4.0 had the highest number of unique correct responses and the lowest number of unique incorrect responses among the 3 LLMs.
Conclusions This study of 618 nursing MCQs from the NCLEX-RN and NNLE found that ChatGPT 4.0 outperformed ChatGPT 3.5 and Google Bard in accuracy. It handled both English and Chinese inputs well, underscoring its potential as a valuable tool in nursing education and clinical decision-making.
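The headline NCLEX-RN contrast (88.7% with English input vs 79.3% with the Chinese translation, P=.03) is a comparison of two proportions, 133/150 vs 119/150. The abstract does not restate the exact test used; the sketch below shows one standard way to reproduce that kind of comparison in Python with a chi-square test on the counts quoted above.

```python
# Sketch of a two-proportion comparison like the one reported for ChatGPT 4.0 on the
# NCLEX-RN items (133/150 correct with English input vs 119/150 with Chinese input).
# The choice of a chi-square test is an assumption; the abstract does not name the test.
from scipy.stats import chi2_contingency

correct_en, total_en = 133, 150
correct_zh, total_zh = 119, 150

table = [
    [correct_en, total_en - correct_en],  # English input: correct vs incorrect
    [correct_zh, total_zh - correct_zh],  # Chinese input: correct vs incorrect
]

chi2, p_value, dof, expected = chi2_contingency(table)
print(f"Accuracy (English): {correct_en / total_en:.1%}")  # 88.7%
print(f"Accuracy (Chinese): {correct_zh / total_zh:.1%}")  # 79.3%
print(f"chi2 = {chi2:.2f}, p = {p_value:.3f}")  # compare with the reported P=.03
```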
Affiliation(s)
- Zelin Wu
- Department of Bone and Joint Surgery and Sports Medicine Center, The First Affiliated Hospital, Guangzhou, China
- Wenyi Gan
- Department of Joint Surgery and Sports Medicine, Zhuhai People’s Hospital, Zhuhai City, China
- Zhaowen Xue
- Department of Bone and Joint Surgery and Sports Medicine Center, The First Affiliated Hospital, Guangzhou, China
- Zhengxin Ni
- School of Nursing, Yangzhou University, Yangzhou, China
- Xiaofei Zheng
- Department of Bone and Joint Surgery and Sports Medicine Center, The First Affiliated Hospital, Guangzhou, China
- Yiyi Zhang
- Department of Bone and Joint Surgery and Sports Medicine Center, The First Affiliated Hospital, Guangzhou, China
5
de Lange T, Møystad A, Torgersen G, Ahlqvist J, Jäghagen EL. Students' perceptions of post-exam feedback in oral radiology: a comparative study from two dental hygienist educational settings. European Journal of Dental Education 2024; 28:377-387. [PMID: 37885281] [DOI: 10.1111/eje.12959]
Abstract
INTRODUCTION The aim of this study was to investigate how students perceive the benefit of participating in a teacher-organised session providing feedback on exams, termed post-exam feedback, in two dental hygienist programmes.
METHODS The study was based on interviews with 22 participants, comprising 18 students and 4 faculty teachers. The data were analysed thematically, generating insights into how the participants reflected on their participation in the post-exam feedback sessions and how they perceived this arrangement as learners.
RESULTS The findings suggest that motivated students consider post-exam feedback beneficial for clearing up uncertainties, deepening their understanding of issues not fully understood during the exam, and supporting their further learning. Less motivated students mainly consider post-exam feedback relevant for students who do not pass the exams.
CONCLUSIONS The results suggest that, when organised in a student-centred way and with attentiveness to student learning preferences, post-exam feedback can be valuable for enhancing assessment and supporting student learning related to exams.
Affiliation(s)
- Thomas de Lange
- Department of Education, University of South-Eastern Norway, Oslo, Norway
- Anne Møystad
- Institute of Clinical Dentistry, Faculty of Dentistry, University of Oslo, Oslo, Norway
- Gerald Torgersen
- Institute of Clinical Dentistry, Faculty of Dentistry, University of Oslo, Oslo, Norway
- Jan Ahlqvist
- Oral and Maxillofacial Radiology, Department of Odontology, Umeå University, Umeå, Sweden
- Eva Levring Jäghagen
- Oral and Maxillofacial Radiology, Department of Odontology, Umeå University, Umeå, Sweden
6
Manteghinejad A. Web-Based Medical Examinations During the COVID-19 Era: Reconsidering Learning as the Main Goal of Examination. JMIR Medical Education 2021; 7:e25355. [PMID: 34329178] [PMCID: PMC8360339] [DOI: 10.2196/25355]
Abstract
Like other aspects of the health care system, medical education has been greatly affected by the COVID-19 pandemic. To meet the requirements of lockdown and virtual education, student performance has been evaluated via web-based examinations. Although this shift to web-based examinations was inevitable, mental, educational, and technical aspects must also be considered to ensure the efficiency and accuracy of this type of evaluation. The easiest way to address the new challenges is to administer traditional questions via a web-based platform, but more factors should be accounted for when designing web-based examinations during the COVID-19 era. This article presents an approach that uses the opportunity created by the pandemic to reconsider learning as the main goal of web-based examinations. The approach suggests using open-book examinations, writing questions that target higher cognitive domains, using real clinical scenarios, developing more comprehensive examination blueprints, using advanced platforms for web-based questions, and providing feedback in web-based examinations to ensure that examinees have acquired the minimum competency levels defined in the course objectives.
Affiliation(s)
- Amirreza Manteghinejad
- Cancer Prevention Research Center, Omid Hospital, Isfahan University of Medical Sciences, Isfahan, Iran
- Student Research Committee, School of Medicine, Isfahan University of Medical Sciences, Isfahan, Iran
7
Prochazka J, Ovcari M, Durinik M. Sandwich feedback: The empirical evidence of its effectiveness. Learning and Motivation 2020. [DOI: 10.1016/j.lmot.2020.101649]
8
Burgess A, Bateman K, Schucker J. Postexamination review using a standardized examination review form. Teaching and Learning in Nursing 2020. [DOI: 10.1016/j.teln.2019.07.003]
9
Yune SJ, Lee SY, Im S. How Do Medical Students Prepare for Examinations: Pre-assessment Cognitive and Meta-cognitive Activities. Korean Medical Education Review 2019; 21:51-58. [DOI: 10.17496/kmer.2019.21.1.51]
Abstract
Although ‘assessment for learning’ rather than ‘assessment of learning’ has been emphasized recently, how students learn before examinations remains unclear. The purpose of this study was to investigate pre-assessment learning activities (PALA) and to identify mechanism factors (MF) that influence those activities. In addition, the PALA and MF for written exams were compared with those for the clinical performance examination/objective structured clinical examination (CPX/OSCE) in third-year (N=121) and fourth-year (N=108) medical students. Through literature review and discussion, questionnaires with a 5-point Likert scale were developed to measure PALA and MF. PALA comprised cognitive and meta-cognitive activities, and MF comprised personal, interpersonal, and environmental factors. Cronbach’s α was used to assess survey reliability, while Pearson correlation coefficients and multiple regression analysis were used to investigate the influence of MF on PALA. A paired t-test was applied to compare the PALA and MF of written exams with those of the CPX/OSCE in third- and fourth-year students. The Pearson correlation coefficients between PALA and MF were 0.479 for written exams and 0.508 for the CPX/OSCE. MF explained 24.1% of the variance in PALA for written exams and 25.9% for the CPX/OSCE. Both PALA and MF differed significantly between written exams and the CPX/OSCE in third-year students, whereas no differences were found in fourth-year students. Educators need to consider the MFs that influence PALA to encourage ‘assessment for learning’.
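The reported analyses combine scale reliability (Cronbach's α), Pearson correlations between MF and PALA, and multiple regression in which MF explains 24.1% and 25.9% of the variance in PALA. The Python sketch below reproduces those three computations on placeholder Likert data; the column names, sample values, and scoring are hypothetical and do not reflect the study's instrument.

```python
# Sketch of the reported analyses on placeholder data: Cronbach's alpha for a Likert
# scale, Pearson correlation between composite scores, and R^2 from a multiple
# regression of PALA on the MF sub-components. Column names and values are hypothetical,
# so the printed numbers are not meaningful beyond illustrating the computations.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 121  # e.g., the size of one cohort
df = pd.DataFrame({
    "mf_personal": rng.integers(1, 6, n).astype(float),
    "mf_interpersonal": rng.integers(1, 6, n).astype(float),
    "mf_environmental": rng.integers(1, 6, n).astype(float),
    "pala": rng.integers(1, 6, n).astype(float),
})

def cronbach_alpha(items: pd.DataFrame) -> float:
    """alpha = k/(k-1) * (1 - sum of item variances / variance of the total score)."""
    k = items.shape[1]
    item_var = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_var / total_var)

mf_items = df[["mf_personal", "mf_interpersonal", "mf_environmental"]]
alpha = cronbach_alpha(mf_items)

mf_score = mf_items.mean(axis=1)
r = np.corrcoef(mf_score, df["pala"])[0, 1]  # Pearson r (the study reports 0.479 / 0.508)

# Multiple regression of PALA on the three MF sub-components; R^2 corresponds to the
# "MF explained 24.1% / 25.9% of PALA variance" figures in the abstract.
X = np.column_stack([np.ones(n), mf_items.to_numpy()])
beta, *_ = np.linalg.lstsq(X, df["pala"].to_numpy(), rcond=None)
pred = X @ beta
ss_res = np.sum((df["pala"].to_numpy() - pred) ** 2)
ss_tot = np.sum((df["pala"].to_numpy() - df["pala"].mean()) ** 2)
r_squared = 1 - ss_res / ss_tot

print(f"Cronbach's alpha: {alpha:.3f}, Pearson r: {r:.3f}, R^2: {r_squared:.3f}")
```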