1
Kaga T, Inaba S, Shikano Y, Watanabe Y, Fujisawa T, Akazawa Y, Ohshita M, Kawakami H, Higashi H, Aono J, Nagai T, Islam MZ, Wannous M, Sakata M, Yamamoto K, Furukawa TA, Yamaguchi O. Utility of RAND/UCLA appropriateness method in validating multiple-choice questions on ECG. BMC Medical Education 2024; 24:448. PMID: 38658906; PMCID: PMC11044544; DOI: 10.1186/s12909-024-05446-7. Received 02/14/2024; accepted 04/18/2024.
Abstract
OBJECTIVES This study aimed to investigate the utility of the RAND/UCLA appropriateness method (RAM) in validating expert consensus-based multiple-choice questions (MCQs) on the electrocardiogram (ECG). METHODS Following the RAM user's manual, nine panelists comprising various experts who routinely handle ECGs were asked to reach a consensus in three phases: a preparatory phase (round 0), an online test phase (round 1), and a face-to-face expert panel meeting (round 2). In round 0, the objectives and timeline of the study were explained to the nine expert panelists, together with a summary of the relevant literature. In round 1, the panelists answered 100 ECG questions prepared by two skilled cardiologists, and the success rate of each question was calculated by dividing the number of correct answers by nine. The questions were then stratified as "Appropriate," "Discussion," or "Inappropriate" according to the median and interquartile range (IQR) of the appropriateness ratings given by the nine panelists. In round 2, the validity of the 100 ECG questions was discussed in an expert panel meeting in light of the round 1 results, and each question was finally reclassified as "Appropriate," "Candidate," "Revision," or "Defer." RESULTS In round 1, the average success rate of the nine experts was 0.89. Using the median and IQR, 54 questions were classified as "Discussion." At the expert panel meeting in round 2, 23% of the original 100 questions were ultimately deemed inappropriate, even though they had been prepared by two skilled cardiologists. Most of the 46 questions categorized as "Appropriate" by median and IQR in round 1 remained "Appropriate" after round 2 (44/46, 95.7%). CONCLUSIONS Using the median and IQR allowed a more objective determination of question validity. The RAM may help select appropriate questions, contributing to the preparation of higher-quality tests.
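The median/IQR stratification described in this abstract can be sketched in a few lines of Python. The cutoffs below (median ≥ 7 with narrow IQR → appropriate, median ≤ 3 with narrow IQR → inappropriate, everything else → discussion) follow common RAND/UCLA conventions and are assumptions for illustration; the study's exact thresholds are not stated in the abstract.

```python
import statistics

def classify_question(ratings):
    """Classify one MCQ from nine panelists' 1-9 appropriateness ratings.

    Cutoffs are illustrative RAM-style defaults, not the study's own:
    high median with narrow spread -> Appropriate, low median with
    narrow spread -> Inappropriate, anything else -> Discussion.
    """
    median = statistics.median(ratings)
    q1, _, q3 = statistics.quantiles(ratings, n=4)  # quartile cut points
    iqr = q3 - q1
    if median >= 7 and iqr <= 2:   # rated appropriate, panel agrees
        return "Appropriate"
    if median <= 3 and iqr <= 2:   # rated inappropriate, panel agrees
        return "Inappropriate"
    return "Discussion"            # middling rating or wide disagreement
```

A question rated [8, 8, 9, 7, 8, 9, 8, 7, 8] would come out "Appropriate", while a split panel such as [2, 5, 8, 3, 7, 9, 4, 6, 5] would be flagged for round-2 discussion.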
Affiliation(s)
- Shinji Inaba
  - Department of Cardiology, Pulmonology, Hypertension and Nephrology, Ehime University Graduate School of Medicine, Toon, Ehime, 791-0295, Japan
- Yukari Shikano
  - Ehime University Graduate School of Medicine, Toon, Japan
- Tomoki Fujisawa
  - Department of Cardiology, Pulmonology, Hypertension and Nephrology, Ehime University Graduate School of Medicine, Toon, Ehime, 791-0295, Japan
- Yusuke Akazawa
  - Department of Cardiology, Pulmonology, Hypertension and Nephrology, Ehime University Graduate School of Medicine, Toon, Ehime, 791-0295, Japan
- Muneaki Ohshita
  - Department of Emergency and Critical Care Medicine, Graduate School of Medicine, Ehime University, Toon, Japan
- Hiroshi Kawakami
  - Department of Cardiology, Pulmonology, Hypertension and Nephrology, Ehime University Graduate School of Medicine, Toon, Ehime, 791-0295, Japan
- Haruhiko Higashi
  - Department of Cardiology, Pulmonology, Hypertension and Nephrology, Ehime University Graduate School of Medicine, Toon, Ehime, 791-0295, Japan
- Jun Aono
  - Department of Cardiology, Pulmonology, Hypertension and Nephrology, Ehime University Graduate School of Medicine, Toon, Ehime, 791-0295, Japan
- Takayuki Nagai
  - Department of Cardiology, Pulmonology, Hypertension and Nephrology, Ehime University Graduate School of Medicine, Toon, Ehime, 791-0295, Japan
- Mohammad Zahidul Islam
  - Department of Information Communication Technology, ICT Division, Government of Bangladesh, Dhaka, Bangladesh
- Muhammad Wannous
  - Department of Computer Information Science, Higher Colleges of Technology, Abu Dhabi, UAE
- Masatsugu Sakata
  - Department of Health Promotion and Human Behavior, Kyoto University Graduate School of Medicine/School of Public Health, Kyoto, Japan
- Kazumichi Yamamoto
  - Department of Health Promotion and Human Behavior, Kyoto University Graduate School of Medicine/School of Public Health, Kyoto, Japan
  - Institute for Airway Disease, Hyogo, Japan
- Toshi A Furukawa
  - Department of Health Promotion and Human Behavior, Kyoto University Graduate School of Medicine/School of Public Health, Kyoto, Japan
- Osamu Yamaguchi
  - Department of Cardiology, Pulmonology, Hypertension and Nephrology, Ehime University Graduate School of Medicine, Toon, Ehime, 791-0295, Japan
2
Menon B, Miller J, DeShetler LM. Questioning the questions: Methods used by medical schools to review internal assessment items. MedEdPublish 2021; 10:37. PMID: 38486513; PMCID: PMC10939609; DOI: 10.15694/mep.2021.000037.1.
Abstract
Objective: Reviewing assessment questions to ensure their quality is critical to properly assessing student performance. The purpose of this study was to identify the processes medical schools use to review questions for internal assessments. Methods: The authors recruited professionals involved in writing and/or reviewing questions for their medical school's internal assessments. The survey was administered electronically via an anonymous link, and participation was solicited through DR-ED, an electronic discussion group for medical educators. Responses were collected over a two-week period, and one reminder was sent to increase the response rate. The instrument comprised one demographic question, two closed-ended questions, and two open-ended questions. Results: Thirty-nine respondents completed the survey, of whom 22 provided the name of their institution/medical school. No two of those who self-identified appeared to be from the same institution; participants represented institutions from across the United States, with two from other countries. The majority of respondents (n=32, 82%) indicated they had a process for reviewing student assessment questions. Most reported that faculty and course/block directors were responsible for reviewing assessment questions, while some indicated that a committee or group of faculty was responsible. Most focused equally on content/accuracy, formatting, and grammar. Over 81% (n=22) of respondents indicated they used NBME resources to guide review, and less than 19% (n=5) used internally developed writing guides. Conclusions: This study found that medical schools use a wide range of item review strategies and a variety of tools to guide their review. These results offer insight to medical schools that do not yet have processes for reviewing assessment questions or that are looking to expand their current procedures.
3
Peyrony O, Hutin A, Truchot J, Borie R, Calvet D, Albaladejo A, Baadj Y, Cailleaux PE, Flamant M, Martin C, Messika J, Meunier A, Mirabel M, Tea V, Treton X, Chevret S, Lebeaux D, Roux D. Impact of panelists' experience on script concordance test scores of medical students. BMC Medical Education 2020; 20:313. PMID: 32943030; PMCID: PMC7499961; DOI: 10.1186/s12909-020-02243-w. Received 03/29/2020; accepted 09/10/2020.
Abstract
BACKGROUND The evaluation process of French medical students will evolve over the next few years in order to improve assessment validity. Script concordance testing (SCT) makes it possible to assess medical knowledge alongside clinical reasoning under conditions of uncertainty. In this study, we compared the SCT scores of a large cohort of undergraduate medical students according to the experience level of the reference panel. METHODS In 2019, the authors developed a 30-item SCT and sent it to experts with varying levels of experience. Data analysis included score comparisons with paired Wilcoxon rank-sum tests and concordance analysis with Bland & Altman plots. RESULTS A panel of 75 experts was divided into three groups: 31 residents, 21 non-experienced physicians (NEP), and 23 experienced physicians (EP). From each group, random samples of N = 20, 15, and 10 were selected. A total of 985 students from nine medical schools took the SCT examination. Whatever the size of the panel (N = 20, 15, or 10), students' SCT scores were lower with the NEP group than with the resident panel (median score 67.1 vs 69.1, p < 0.0001 for N = 20; 67.2 vs 70.1, p < 0.0001 for N = 15; and 67.7 vs 68.4, p < 0.0001 for N = 10), and lower with the EP group than with the NEP group (65.4 vs 67.1, p < 0.0001 for N = 20; 66.0 vs 67.2, p < 0.0001 for N = 15; and 62.5 vs 67.7, p < 0.0001 for N = 10). Bland & Altman plots showed good concordance between students' SCT scores whatever the experience level of the expert panel. CONCLUSIONS Although student SCT scores differed statistically according to the expert panel, the differences were rather small. These results open the possibility of including less-experienced experts in panels for the evaluation of medical students.
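The concordance analysis mentioned in the methods can be reproduced with a Bland & Altman-style computation on paired scores; this is a generic sketch of the standard technique, not the authors' code.

```python
import statistics

def bland_altman(scores_a, scores_b):
    """Bland & Altman agreement between two scorings of the same students,
    e.g. SCT scores keyed to two different expert panels.

    Returns the mean difference (bias) and the 95% limits of agreement,
    bias +/- 1.96 * SD of the paired differences.
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    return bias, (bias - 1.96 * sd, bias + 1.96 * sd)
```

On a Bland & Altman plot, each student is a point at (mean of the two scores, difference of the two scores); good concordance shows as a small bias with narrow limits of agreement and no trend across the score range.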
Affiliation(s)
- Olivier Peyrony
  - Department of Emergency Medicine, Saint-Louis University Hospital, Assistance Publique-Hôpitaux de Paris, 1 avenue Claude Vellefaux, 75010 Paris, France
- Alice Hutin
  - SAMU de Paris, SMUR Necker, Necker Enfants Malades University Hospital, Assistance Publique-Hôpitaux de Paris, Paris, France
- Jennifer Truchot
  - Department of Emergency Medicine, SMUR, Lariboisière University Hospital, Assistance Publique-Hôpitaux de Paris, Paris, France
  - Paris Diderot University, Paris, France
- Raphaël Borie
  - Department of Pneumology, Reference Center for Rare Pulmonary Diseases, Bichat University Hospital, Assistance Publique-Hôpitaux de Paris, Paris, France
  - INSERM, UMR 1152, Paris Diderot University, Paris, France
- David Calvet
  - Department of Neurology and Stroke Unit, Sainte-Anne University Hospital, Paris, France
  - INSERM, UMR 1266, Psychiatry and Neurosciences Institute of Paris, Paris-Descartes University, Paris, France
- Pierre-Emmanuel Cailleaux
  - Department of Geriatric Medicine, Louis-Mourier University Hospital, Assistance Publique-Hôpitaux de Paris, F-92700 Colombes, France
  - INSERM, UMR 1132, BiOsCar, University of Paris, Paris, France
- Martin Flamant
  - Department of Kidney Physiology, Bichat University Hospital, Assistance Publique-Hôpitaux de Paris, Paris, France
  - INSERM, UMR 1149, Inflammatory Research Center, Paris, France
  - University of Paris, Paris, France
- Clémence Martin
  - Department of Respiratory Medicine, Cochin University Hospital, Assistance Publique-Hôpitaux de Paris, Paris, France
  - Cochin Institute, UMR 1016, Paris-Descartes University, Paris, France
- Jonathan Messika
  - University of Paris, Paris, France
  - Pulmonology and Lung Transplant Unit, Bichat University Hospital, Assistance Publique-Hôpitaux de Paris, Paris, France
  - Physiopathology and Epidemiology of Respiratory Diseases (PHERE), INSERM, UMR 1152, and Paris Transplant Group, Paris, France
- Mariana Mirabel
  - University of Paris, Paris, France
  - Department of Cardio-oncology, Georges Pompidou European University Hospital, Assistance Publique-Hôpitaux de Paris, Paris, France
  - INSERM, UMR 970, Paris Cardiovascular Research Center PARCC, Paris, France
- Victoria Tea
  - Department of Cardiology, Georges Pompidou European University Hospital, Assistance Publique-Hôpitaux de Paris, Paris, France
- Xavier Treton
  - University of Paris, Paris, France
  - Department of Gastroenterology, Inflammatory Bowel Disease, and Nutritive Assistance, Beaujon University Hospital, Assistance Publique-Hôpitaux de Paris, Clichy, France
- Sylvie Chevret
  - Department of Biostatistics and Medical Information, Saint-Louis University Hospital, Assistance Publique-Hôpitaux de Paris, Paris, France
  - Centre of Research in Epidemiology and StatisticS (CRESS), INSERM, UMR 1153, Epidemiology and Clinical Statistics for Tumor, Respiratory, and Resuscitation Assessments (ECSTRRA) Team, University of Paris, Paris, France
- David Lebeaux
  - University of Paris, Paris, France
  - Department of Microbiology, Mobile Infectiology Unit, Georges Pompidou European University Hospital, Assistance Publique-Hôpitaux de Paris, Paris, France
- Damien Roux
  - University of Paris, Paris, France
  - Department of Intensive Care, Louis Mourier University Hospital, Assistance Publique-Hôpitaux de Paris, F-92700 Colombes, France
4
Dangprapai Y, Ngamskulrungroj P, Senawong S, Ungprasert P, Harun A. Development of a New Scoring System To Accurately Estimate Learning Outcome Achievements via Single, Best-Answer, Multiple-Choice Questions for Preclinical Students in a Medical Microbiology Course. Journal of Microbiology & Biology Education 2020; 21:21.1.4. PMID: 32148605; PMCID: PMC7048397; DOI: 10.1128/jmbe.v21i1.1773. Received 02/18/2019; accepted 11/20/2019.
Abstract
During the preclinical years, single-best-answer multiple-choice questions (SBA-MCQs) are often used to test the higher-order cognitive processes of medical students (such as application and analysis) while simultaneously assessing lower-order processes (such as knowledge and comprehension). Consequently, it can be difficult to pinpoint which learning outcome has been achieved or needs improvement. We developed a new scoring system for SBA-MCQs using a step-by-step methodology to evaluate each learning outcome independently. This study enrolled third-year medical students (n = 316) who had registered for the basic microbiology course at the Faculty of Medicine, Siriraj Hospital, Mahidol University during the 2017 academic year. A step-by-step SBA-MCQ with a new scoring system was created and used to evaluate the validity of traditional SBA-MCQs that assess two separate outcomes simultaneously. The scores for the two methods, in percentages, were compared using two different questions (SBA-MCQ1 and SBA-MCQ2). SBA-MCQ1 tested the students' knowledge of the causative agent of a specific infectious disease and the basic characteristics of the microorganism, while SBA-MCQ2 tested their knowledge of the causative agent of a specific infectious disease and the pathogenic mechanism of the microorganism. The mean score obtained with the traditional SBA-MCQs was significantly lower than that obtained with the step-by-step SBA-MCQs (85.9% versus 90.9% for SBA-MCQ1, p < 0.001; and 81.5% versus 87.4% for SBA-MCQ2, p < 0.001). Moreover, 65.8% and 87.8% of the students scored lower with the traditional SBA-MCQ1 and SBA-MCQ2, respectively, than with the corresponding step-by-step questions. These results suggest that traditional SBA-MCQ scores should be interpreted with caution because they can underestimate students' learning achievement. The step-by-step SBA-MCQ is therefore preferable to the traditional SBA-MCQ and is recommended for use in examinations during the preclinical years.
Affiliation(s)
- Yodying Dangprapai
  - Department of Physiology, Faculty of Medicine, Siriraj Hospital, Mahidol University, Bangkok 10700, Thailand
- Popchai Ngamskulrungroj
  - Department of Microbiology, Faculty of Medicine, Siriraj Hospital, Mahidol University, Bangkok 10700, Thailand
- Sansnee Senawong
  - Department of Immunology, Faculty of Medicine, Siriraj Hospital, Mahidol University, Bangkok 10700, Thailand
- Patompong Ungprasert
  - Clinical Epidemiology Unit, Department of Research and Development, Faculty of Medicine, Siriraj Hospital, Mahidol University, Bangkok 10700, Thailand
- Azian Harun
  - Department of Medical Microbiology and Parasitology, School of Medical Sciences, Universiti Sains Malaysia, 16150 Kubang Kerian, Kelantan, Malaysia
5
Baladrón J, Sánchez Lasheras F, Romeo Ladrero JM, Villacampa T, Curbelo J, Jiménez Fonseca P, García Guerrero A. The MIR 2018 Exam: Psychometric Study and Comparison with the Previous Nine Years. Medicina 2019; 55:751. PMID: 31756983; PMCID: PMC6956110; DOI: 10.3390/medicina55120751. Received 08/21/2019; revised 11/04/2019; accepted 11/18/2019.
Abstract
Background and Objectives: The aim of this study was to examine the questions used in the 2018 MIR exam (the test that grants access to specialized medical training in Spain), describe their psychometric properties, and evaluate their quality. Materials and Methods: The analysis was performed using classical test theory (CTT) and item response theory (IRT), based on the answers given to the test questions by a total of 3868 physicians. Results: According to CTT, the average difficulty index across all test questions was 0.629, which falls into the acceptable category; with correction for guessing it was 0.515, within the optimal range. The mean discrimination index was 0.277, in the good category, while the mean point-biserial correlation coefficient, at 0.275, falls into the fair category. The difficulty and discrimination values calculated with the two-parameter IRT model appear adequate, with average values of -0.389 and 0.677, respectively. The Cronbach's alpha for the overall test was 0.944, which is considered very good. Conclusions: A decrease was observed in the average discrimination values over the last three exam sittings, which may be related to the greater proportion of Spanish graduates who take the exam in the same year they complete their medical degree.
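The CTT indices reported here (difficulty index, discrimination index, Cronbach's alpha) can be computed from a 0/1 response matrix along the following lines. This is a minimal generic sketch: the 27% upper/lower split for the discrimination index is a common convention assumed here, not a detail taken from the paper.

```python
import statistics

def ctt_statistics(responses):
    """Classical test theory indices for a 0/1 response matrix
    (rows = examinees, columns = items)."""
    n_examinees, n_items = len(responses), len(responses[0])
    totals = [sum(row) for row in responses]
    # Difficulty index: proportion of examinees answering each item correctly.
    difficulty = [sum(row[i] for row in responses) / n_examinees
                  for i in range(n_items)]
    # Discrimination index: correct proportion among the top 27% of
    # examinees (by total score) minus that among the bottom 27%.
    ranked = sorted(range(n_examinees), key=lambda r: totals[r], reverse=True)
    k = max(1, round(0.27 * n_examinees))
    top, bottom = ranked[:k], ranked[-k:]
    discrimination = [sum(responses[r][i] for r in top) / k
                      - sum(responses[r][i] for r in bottom) / k
                      for i in range(n_items)]
    # Cronbach's alpha: internal-consistency reliability of the whole test.
    item_vars = [statistics.pvariance([row[i] for row in responses])
                 for i in range(n_items)]
    alpha = (n_items / (n_items - 1)
             * (1 - sum(item_vars) / statistics.pvariance(totals)))
    return difficulty, discrimination, alpha
```

An item with difficulty near 0.5 and a high discrimination index separates strong from weak examinees well; alpha summarizes how consistently all items measure the same construct.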
Affiliation(s)
- Jaime Baladrón
  - Curso Intensivo MIR Asturias, c/ Quintana 11A, 33005 Oviedo, Spain
- Fernando Sánchez Lasheras
  - Mathematics Department, Faculty of Sciences, University of Oviedo, c/ Federico García Lorca 18, 33007 Oviedo, Spain
  - Correspondence: ; Tel.: +34-985-103-376
- Tomás Villacampa
  - Villacampa Ophthalmic Clinic, c/ La Cámara 15, 33401 Avilés, Spain
- José Curbelo
  - Internal Medicine Department, Hospital Universitario La Princesa, 28006 Madrid, Spain
- Paula Jiménez Fonseca
  - Medical Oncology Department, Hospital Universitario Central de Asturias, 33011 Oviedo, Spain