1
Schauber SK, Olsen AO, Werner EL, Magelssen M. Inconsistencies in rater-based assessments mainly affect borderline candidates: but using simple heuristics might improve pass-fail decisions. Advances in Health Sciences Education: Theory and Practice 2024. [PMID: 38649529; DOI: 10.1007/s10459-024-10328-0] [Received: 09/20/2023; Accepted: 03/24/2024; Indexed: 04/25/2024]
Abstract
INTRODUCTION Research in various areas indicates that expert judgment can be highly inconsistent. However, expert judgment is indispensable in many contexts. In medical education, experts often function as examiners in rater-based assessments, where disagreement between examiners can have far-reaching consequences. The literature suggests that inconsistencies in ratings depend on the level of performance the candidate being evaluated shows, but this possibility has not been addressed deliberately and with appropriate statistical methods. By adopting the theoretical lens of ecological rationality, we evaluate whether easily implementable strategies can enhance decision making in real-world assessment contexts. METHODS We address two objectives. First, we investigate the dependence of rater consistency on performance levels. We recorded videos of mock exams, had examiners (N = 10) evaluate four students' performances, and compared inconsistencies in performance ratings between examiner pairs using a bootstrapping procedure. Second, we provide an approach that aids decision making by implementing simple heuristics. RESULTS Discrepancies were largely a function of the level of performance the candidates showed: lower performances were rated more inconsistently than excellent performances. Furthermore, our analyses indicated that the use of simple heuristics might improve decisions in examiner pairs. DISCUSSION Inconsistencies in performance judgments continue to be a matter of concern, and we provide empirical evidence that they are related to candidate performance. We discuss implications for research and the advantages of adopting the perspective of ecological rationality, and we point to directions both for further research and for the development of assessment practices.
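The bootstrapping procedure the abstract describes can be illustrated with a minimal sketch: resample examiners with replacement and recompute the mean pairwise score discrepancy per candidate. The ratings matrix, scale, and number of resamples below are invented for illustration, not the study's data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ratings: rows = examiners, columns = candidates
# (illustrative values only, not the study's data).
ratings = np.array([
    [3, 5, 2, 6],
    [4, 5, 3, 6],
    [2, 6, 1, 5],
    [3, 6, 2, 6],
])

def mean_pair_discrepancy(scores):
    """Mean absolute score difference across all examiner pairs."""
    n = len(scores)
    diffs = [abs(scores[i] - scores[j]) for i in range(n) for j in range(i + 1, n)]
    return np.mean(diffs)

# Bootstrap: resample examiners with replacement, then recompute
# the per-candidate pairwise discrepancy on each resample.
boot = []
for _ in range(2000):
    sample = ratings[rng.integers(0, len(ratings), len(ratings))]
    boot.append([mean_pair_discrepancy(sample[:, c]) for c in range(ratings.shape[1])])
boot = np.array(boot)  # shape: (2000, n_candidates)

# 95% bootstrap interval of rater discrepancy for each candidate
lo, hi = np.percentile(boot, [2.5, 97.5], axis=0)
```

Comparing the intervals of low- versus high-scoring candidates is one simple way to ask whether weaker performances attract more inconsistent ratings.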
Affiliation(s)
- Stefan K Schauber
- Centre for Health Sciences Education, Faculty of Medicine, University of Oslo, Oslo, Norway.
- Centre for Educational Measurement (CEMO), Faculty of Educational Sciences, University of Oslo, Oslo, Norway.
- Anne O Olsen
- Department of Community Medicine and Global Health, Institute of Health and Society, University of Oslo, Oslo, Norway
- Erik L Werner
- Department of General Practice, Institute of Health and Society, University of Oslo, Oslo, Norway
- Morten Magelssen
- Centre for Medical Ethics, Institute of Health and Society, University of Oslo, Oslo, Norway
2
Lee MC, Melcer EF, Merrell SB, Wong LY, Shields S, Eddington H, Trickey AW, Tsai J, Korndorffer JR, Lin DT, Liebert CA. Usability of ENTRUST as an Assessment Tool for Entrustable Professional Activities (EPAs): A Mixed Methods Analysis. Journal of Surgical Education 2023; 80:1693-1702. [PMID: 37821350; DOI: 10.1016/j.jsurg.2023.09.001] [Received: 02/25/2023; Revised: 06/19/2023; Accepted: 09/08/2023; Indexed: 10/13/2023]
Abstract
OBJECTIVE As the American Board of Surgery transitions to a competency-based model of surgical education centered upon entrustable professional activities (EPAs), there is a growing need for objective tools to determine readiness for entrustment. This study evaluates the usability of ENTRUST, an innovative virtual patient simulation platform for assessing surgical trainees' decision-making skills in preoperative, intraoperative, and postoperative settings. DESIGN This is a mixed-methods analysis of the usability of the ENTRUST platform. Quantitative data were collected using the System Usability Scale (SUS) and Likert responses and analyzed with descriptive statistics, bivariate analysis, and multivariable linear regression. Qualitative analysis of open-ended responses was performed using the Nielsen-Shneiderman heuristics framework. SETTING This study was conducted at an academic institution in a proctored exam setting. PARTICIPANTS The analysis includes n = 47 surgical residents (PGY 1-5) who completed an online usability survey following the ENTRUST Inguinal Hernia EPA Assessment. RESULTS The ENTRUST platform had a median SUS score of 82.5. On bivariate and multivariate analyses, there were no significant differences in usability based on demographic characteristics (all p > 0.05), and SUS score was independent of ENTRUST performance (r = 0.198, p = 0.18). Most participants agreed that the clinical workup of the patient was engaging (91.5%) and felt realistic (85.1%). The most frequent heuristics represented in the qualitative analysis included feedback, visibility, match, and control. Additional themes of educational value, enjoyment, and ease of use highlighted participants' perspectives on the usability of ENTRUST. CONCLUSIONS ENTRUST demonstrated high usability in this population. Usability was independent of ENTRUST score performance, and no differences in usability were identified based on demographic subgroups. Qualitative analysis highlighted the acceptability of ENTRUST and will inform ongoing development of the platform. The ENTRUST platform holds potential as a tool for the assessment of EPAs in surgical residency programs.
Affiliation(s)
- Melissa C Lee
- Stanford University School of Medicine, Stanford, California
- Edward F Melcer
- Department of Computational Media, University of California-Santa Cruz, Baskin School of Engineering, Santa Cruz, California
- Lye-Yeng Wong
- Department of Cardiothoracic Surgery, Stanford University School of Medicine, Stanford, California
- Samuel Shields
- Department of Computational Media, University of California-Santa Cruz, Baskin School of Engineering, Santa Cruz, California
- Hyrum Eddington
- Stanford-Surgery Policy Improvement Research and Education Center (S-SPIRE), Palo Alto, California
- Amber W Trickey
- Stanford-Surgery Policy Improvement Research and Education Center (S-SPIRE), Palo Alto, California
- Jason Tsai
- Department of Computational Media, University of California-Santa Cruz, Baskin School of Engineering, Santa Cruz, California; Department of Surgery, Stanford University School of Medicine, Stanford, California
- James R Korndorffer
- Department of Surgery, Stanford University School of Medicine, Stanford, California; VA Palo Alto Health Care System, Surgical Services, Palo Alto, California
- Dana T Lin
- Department of Surgery, Stanford University School of Medicine, Stanford, California
- Cara A Liebert
- Department of Surgery, Stanford University School of Medicine, Stanford, California; VA Palo Alto Health Care System, Surgical Services, Palo Alto, California.
3
Forrester CA, Lee DS, Hon E, Lim KY, Brock TP, Malone DT, Furletti SG, Lyons KM. Preceptor Perceptions of Pharmacy Student Performance Before and After a Curriculum Transformation. American Journal of Pharmaceutical Education 2023; 87:ajpe8575. [PMID: 34385168; PMCID: PMC10159500; DOI: 10.5688/ajpe8575] [Received: 02/03/2021; Accepted: 07/21/2021; Indexed: 05/06/2023]
Abstract
Objective. To explore preceptors' perceptions about the performance of undergraduate pharmacy students during experiential placements in Australia, before and after curricular transformation. Methods. Using a semi-structured approach, we interviewed 26 preceptors who had recently supervised students who took part in the transformed curriculum and students from the previous curriculum. A directed content analysis approach was used to analyze the transcripts. Results. Preceptors described students from the transformed curriculum as having improved professional skills, behaviors, and attitudes and as having an increased ability to perform clinical activities compared to students of the previous curriculum. Preceptors also perceived that students in the transformed curriculum had improved clinical knowledge and knowledge application. They less frequently expressed that students in the transformed curriculum had lower-than-expected knowledge levels. Conclusion. The results of this study suggest that curricular transformation with a focus on skill-based and active learning can improve the performance of pharmacy students in terms of their professional behaviors and attitudes, skills, knowledge, and clinical abilities, as perceived by preceptors.
Affiliation(s)
- Catherine A Forrester
- Monash University, Faculty of Pharmacy and Pharmaceutical Sciences, Parkville, VIC, Australia
- Da Sol Lee
- Monash University, Faculty of Pharmacy and Pharmaceutical Sciences, Parkville, VIC, Australia
- Ethel Hon
- Monash University, Faculty of Pharmacy and Pharmaceutical Sciences, Parkville, VIC, Australia
- Kai Ying Lim
- Monash University, Faculty of Pharmacy and Pharmaceutical Sciences, Parkville, VIC, Australia
- Tina P Brock
- Monash University, Faculty of Pharmacy and Pharmaceutical Sciences, Parkville, VIC, Australia
- Daniel T Malone
- Monash University, Faculty of Pharmacy and Pharmaceutical Sciences, Parkville, VIC, Australia
- Simon G Furletti
- Monash University, Faculty of Pharmacy and Pharmaceutical Sciences, Parkville, VIC, Australia
- Kayley M Lyons
- Monash University, Faculty of Pharmacy and Pharmaceutical Sciences, Parkville, VIC, Australia
4
Van Meenen F, Coertjens L, Van Nes MC, Verschuren F. Peer overmarking and insufficient diagnosticity: the impact of the rating method for peer assessment. Advances in Health Sciences Education: Theory and Practice 2022; 27:1049-1066. [PMID: 35871407; DOI: 10.1007/s10459-022-10130-w] [Received: 03/02/2021; Accepted: 05/29/2022; Indexed: 06/15/2023]
Abstract
The present study explores two rating methods for peer assessment (analytical rating using criteria and comparative judgement) in light of concurrent validity, reliability and insufficient diagnosticity (i.e. the degree to which substandard work is recognised by the peer raters). During a second-year undergraduate course, students wrote a one-page essay on an air pollutant. A first cohort (N = 260) relied on analytical rating using criteria to assess their peers' essays. A total of 1297 evaluations were made, and each essay received at least four peer ratings. Results indicate a small correlation between peer and teacher marks, and three essays of substandard quality were not recognised by the group of peer raters. A second cohort (N = 230) used comparative judgement. They completed 1289 comparisons, from which a rank order was calculated. Results suggest a large correlation between the university teacher marks and the peer scores and acceptable reliability of the rank order. In addition, the three essays of substandard quality were discerned as such by the group of peer raters. Although replication research is warranted, the results provide the first evidence that, when peer raters overmark and fail to identify substandard work using analytical rating with criteria, university teachers may consider changing the rating method of the peer assessment to comparative judgement.
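Comparative judgement typically converts pairwise "which essay is better" decisions into a rank order via a Bradley-Terry-style model. A minimal sketch of that idea follows, with invented comparison outcomes and a simple iterative (minorization-maximization) estimator; it is not the study's actual fitting procedure.

```python
import numpy as np

# Hypothetical pairwise outcomes: (winner, loser) essay indices
# from comparative-judgement rounds (illustrative only).
comparisons = [(0, 1), (0, 2), (1, 2), (0, 1), (1, 2), (0, 2), (2, 1)]
n_items = 3

# Tally wins and the number of times each pair met.
wins = np.zeros(n_items)
games = np.zeros((n_items, n_items))
for w, l in comparisons:
    wins[w] += 1
    games[w, l] += 1
    games[l, w] += 1

# Iterative Bradley-Terry strength estimates (normalized each pass).
p = np.ones(n_items)
for _ in range(200):
    new_p = np.empty(n_items)
    for i in range(n_items):
        denom = sum(games[i, j] / (p[i] + p[j]) for j in range(n_items) if j != i)
        new_p[i] = wins[i] / denom if denom > 0 else p[i]
    p = new_p / new_p.sum()

rank_order = np.argsort(-p)  # strongest essay first
```

The resulting rank order, rather than an absolute criterion score, is what gets compared against teacher marks, which is why comparative judgement can sidestep peer overmarking.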
Affiliation(s)
- Florence Van Meenen
- Psychological Sciences Research Institute, Université catholique de Louvain, 10, Place Cardinal Mercier, 1348, Louvain-la-Neuve, Belgium.
- Liesje Coertjens
- Psychological Sciences Research Institute, Université catholique de Louvain, 10, Place Cardinal Mercier, 1348, Louvain-la-Neuve, Belgium
- Marie-Claire Van Nes
- Emergency Department, Cliniques Universitaires Saint-Luc, Institute of Experimental and Clinical Research (IREC), Université catholique de Louvain, Brussels, Belgium
- Franck Verschuren
- Institute of Experimental and Clinical Research, Acute Medicine Department, Université catholique de Louvain, Brussels, Belgium
5
Yeates P, Moult A, Cope N, McCray G, Fuller R, McKinley R. Determining influence, interaction and causality of contrast and sequence effects in objective structured clinical exams. Medical Education 2022; 56:292-302. [PMID: 34893998; PMCID: PMC9304241; DOI: 10.1111/medu.14713] [Received: 07/23/2021; Revised: 11/03/2021; Accepted: 12/08/2021; Indexed: 06/14/2023]
Abstract
INTRODUCTION Differential rater function over time (DRIFT) and contrast effects (examiners' scores biased away from the standard of preceding performances) both challenge the fairness of scoring in objective structured clinical exams (OSCEs). This is important because, under some circumstances, these effects could alter whether some candidates pass or fail assessments. Benefitting from experimental control, this study investigated the causality, operation and interaction of both effects simultaneously for the first time in an OSCE setting. METHODS We used secondary analysis of data from an OSCE in which examiners scored embedded videos of student performances interspersed between live students. Embedded video position varied between examiners (early vs. late) whilst the standard of preceding performances naturally varied (previous high or low). We examined linear relationships suggestive of DRIFT and contrast effects in all within-OSCE data before comparing the influence and interaction of 'early' versus 'late' and 'previous high' versus 'previous low' conditions on embedded video scores. RESULTS Linear relationships in the data did not support the presence of DRIFT or contrast effects. Embedded videos were scored higher early (19.9 [19.4-20.5]) versus late (18.6 [18.1-19.1], p < 0.001), but scores did not differ between previous high and previous low conditions. The interaction term was non-significant. CONCLUSIONS In this instance, the small DRIFT effect we observed on embedded videos can be causally attributed to examiner behaviour. Contrast effects appear less ubiquitous than some prior research suggests. Possible mediators of these findings include the OSCE context, the detail of task specification, examiners' cognitive load and the distribution of learners' ability. As the operation of these effects appears to vary across contexts, further research is needed to determine the prevalence and mechanisms of contrast and DRIFT effects, so that assessments may be designed in ways that are likely to avoid their occurrence. Quality assurance should monitor for these contextually variable effects in order to ensure OSCE equivalence.
Affiliation(s)
- Peter Yeates
- School of Medicine, Keele University, Keele, UK
- Fairfield General Hospital, Pennine Acute Hospitals NHS Trust, Bury, UK
6
Humphrey-Murto S, Shaw T, Touchie C, Pugh D, Cowley L, Wood TJ. Are raters influenced by prior information about a learner? A review of assimilation and contrast effects in assessment. Advances in Health Sciences Education: Theory and Practice 2021; 26:1133-1156. [PMID: 33566199; DOI: 10.1007/s10459-021-10032-3] [Received: 03/17/2020; Accepted: 01/25/2021; Indexed: 06/12/2023]
Abstract
Understanding which factors can impact rater judgments in assessments is important to ensure quality ratings. One such factor is whether prior performance information (PPI) about learners influences subsequent decision making. The information can be acquired directly, when the rater sees the same learner, or different learners over multiple performances, or indirectly, when the rater is provided with external information about the same learner prior to rating a performance (i.e., learner handover). The purpose of this narrative review was to summarize and highlight key concepts from multiple disciplines regarding the influence of PPI on subsequent ratings, discuss implications for assessment and provide a common conceptualization to inform research. Key findings include (a) assimilation (rater judgments are biased towards the PPI) occurs with indirect PPI and contrast (rater judgments are biased away from the PPI) with direct PPI; (b) negative PPI appears to have a greater effect than positive PPI; (c) when viewing multiple performances, context effects of indirect PPI appear to diminish over time; and (d) context effects may occur with any level of target performance. Furthermore, some raters are not susceptible to context effects, but it is unclear what factors are predictive. Rater expertise and training do not consistently reduce effects. Making raters more accountable, providing specific standards and reducing rater cognitive load may reduce context effects. Theoretical explanations for these findings will be discussed.
Affiliation(s)
- Susan Humphrey-Murto
- Department of Medicine, Faculty of Medicine, The Ottawa Hospital-Riverside Campus, University of Ottawa, 1967 Riverside Drive, Box 67, Ottawa, ON, Canada.
- Department of Innovation in Medical Education, Faculty of Medicine, University of Ottawa, Ottawa, ON, Canada.
- Tammy Shaw
- Department of Medicine, Faculty of Medicine, The Ottawa Hospital-General Campus, Ottawa, ON, Canada
- Claire Touchie
- Department of Medicine, Faculty of Medicine, The Ottawa Hospital-Riverside Campus, University of Ottawa, 1967 Riverside Drive, Box 67, Ottawa, ON, Canada
- Medical Council of Canada, Ottawa, ON, Canada
- Debra Pugh
- Department of Medicine, Faculty of Medicine, The Ottawa Hospital-Riverside Campus, University of Ottawa, 1967 Riverside Drive, Box 67, Ottawa, ON, Canada
- Medical Council of Canada, Ottawa, ON, Canada
- Lindsay Cowley
- Department of Innovation in Medical Education, Faculty of Medicine, University of Ottawa, Ottawa, ON, Canada
- Timothy J Wood
- Department of Innovation in Medical Education, Faculty of Medicine, University of Ottawa, Ottawa, ON, Canada
7
Daly Guris RJ, Miller CR, Schiavi A, Toy S. Examining novice anaesthesia trainee simulation performance: a tale of two clusters. BMJ Simulation & Technology Enhanced Learning 2021; 7:548-554. [DOI: 10.1136/bmjstel-2020-000812] [Accepted: 05/28/2021; Indexed: 11/04/2022]
Abstract
INTRODUCTION Understanding performance differences between learners may provide useful context for optimising medical education. This pilot study aimed to explore a technique to contextualise performance differences through retrospective secondary analyses of two randomised controlled simulation studies. One study focused on speaking up (a non-technical skill); the other focused on oxygen desaturation management (a technical skill). METHODS We retrospectively analysed data from two independent simulation studies conducted in 2017 and 2018. We used multivariate hierarchical cluster analysis to explore whether participants in each study formed homogeneous performance clusters. We then used mixed-design analyses of variance and χ2 analyses to examine whether reported task load differences or demographic variables were associated with cluster membership. RESULTS In both instances, a two-cluster solution emerged; one cluster represented trainees exhibiting higher performance relative to peers in the second cluster. Cluster membership was independent of experimental allocation in each of the original studies. There were no discernible demographic differences between cluster members. Performance differences between clusters persisted for at least 8 months for the non-technical skill but quickly disappeared following simulation training for the technical skill. High performers in speaking up initially reported lower task load than standard performers, a difference that disappeared over time. There was no association between performance and task load during desaturation management. CONCLUSION This pilot study suggests that cluster analysis can be used to objectively identify high-performing trainees for both a technical and a non-technical skill as observed in a simulated clinical setting. Non-technical skills may be more difficult to teach and retain than purely technical ones, and there may be an association between task load and initial non-technical performance. Further study is needed to understand what factors may confer inherent performance advantages, whether these advantages translate to clinical performance and how curricula can best be designed to drive targeted improvement for individual trainees.
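The clustering step the abstract describes can be sketched with hierarchical cluster analysis on multivariate performance scores, cut into a two-cluster solution. The simulated data, group means, and Ward linkage choice below are assumptions for illustration, not the study's data or exact settings.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(1)

# Hypothetical standardized performance metrics for 20 trainees on
# 3 measures, simulated with two latent skill groups (illustrative only).
high = rng.normal(1.0, 0.3, size=(10, 3))
standard = rng.normal(-1.0, 0.3, size=(10, 3))
X = np.vstack([high, standard])

# Ward-linkage hierarchical clustering on the multivariate profiles,
# then cut the dendrogram into a two-cluster solution.
Z = linkage(X, method="ward")
labels = fcluster(Z, t=2, criterion="maxclust")
```

Cluster labels obtained this way can then be cross-tabulated against task load or demographic variables, as the study did with mixed-design ANOVA and χ2 analyses.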
8
Burm S, Chahine S, Goldszmidt M. "Doing it Right" Overnight: a Multi-perspective Qualitative Study Exploring Senior Medical Resident Overnight Call. J Gen Intern Med 2021; 36:881-887. [PMID: 33078297; PMCID: PMC8041983; DOI: 10.1007/s11606-020-06284-1] [Received: 02/06/2020; Accepted: 10/05/2020; Indexed: 11/27/2022]
Abstract
BACKGROUND Competency-based medical education (CBME) requires the development of workplace-based assessment tools that are grounded in authentic clinical work. Developing such tools, however, requires a deep understanding of the underlying facets of the competencies being assessed. Gaining this understanding remains challenging in contexts where performance is not readily visible to supervisors, such as the senior medical resident (SMR) on-call role in internal medicine. OBJECTIVE This study draws on the perspectives of healthcare professionals with whom the SMR interacts overnight to generate insights into the different components of on-call SMR practice and the range of ways SMRs effectively and less effectively enact these. APPROACH We used a constructivist grounded theory (CGT) approach to examine variation in how on-call SMRs carry out their role overnight. PARTICIPANTS Six medical students, five junior residents, five internal medicine attending physicians, five emergency physicians, and three emergency nurses conducted observations of their on-call interactions with SMRs. Participants were then interviewed and asked to elaborate on their observations as well as provide comparative reflections on the practices of past SMRs they had worked with. KEY RESULTS Strong collaboration and organizational skills were identified as critical components of effectively being the on-call SMR. Perceived weaker SMRs, while potentially also having issues with clinical skills, stood out more when they could not effectively manage the realities of collaboration in a busy workplace. CONCLUSION What consistently differentiated a perceived effective SMR from a less effective one was being equipped to manage the realities of interprofessional collaboration in a busy workplace. Our study invites medical educators to consider what residents, particularly those in more complex roles, need to receive feedback on to support their development as physicians. It is our intention that the findings be used to inform the ways programs approach teaching, assessment, and the provision of feedback.
Affiliation(s)
- Sarah Burm
- Continuing Professional Development/Division of Medical Education, Faculty of Medicine, Dalhousie University, Room 2L-23, Sir Charles Tupper Medical Building, 5850 College Street, Halifax, Nova Scotia, Canada.
- Saad Chahine
- Faculty of Education, Queen's University, Kingston, Ontario, Canada
- Mark Goldszmidt
- Division of General Internal Medicine, Department of Medicine, Centre for Education Research and Innovation, Western University, London, Ontario, Canada
9
Shaw T, Wood TJ, Touchie C, Pugh D, Humphrey-Murto SM. How biased are you? The effect of prior performance information on attending physician ratings and implications for learner handover. Advances in Health Sciences Education: Theory and Practice 2021; 26:199-214. [PMID: 32577927; DOI: 10.1007/s10459-020-09979-6] [Received: 02/14/2020; Accepted: 06/15/2020; Indexed: 06/11/2023]
Abstract
Learner handover (LH), the process of sharing information about learners between faculty supervisors, allows for the longitudinal assessment that is fundamental to the competency-based education model. However, the potential to bias future assessments has been raised as a concern. The purpose of this study was to determine whether prior performance information such as LH influences the assessment of learners in the clinical context. Between December 2017 and June 2018, forty-two faculty members and final-year residents from the Department of Medicine at the University of Ottawa were assigned to one of three study groups through quasi-randomisation, taking into account gender, speciality and rater experience. In a counter-balanced design, each group received either positive, negative or no LH prior to watching six simulated learner-patient encounter videos. Participants rated each video using the mini-CEX and completed a questionnaire on their general impressions of LH. A significant difference in mean mini-CEX competency scale scores between the negative (M = 5.29) and positive (M = 5.97) LH groups (P < .001, d = 0.81) was noted. Similar findings were found for the single overall clinical competence ratings. In the post-study questionnaire, 22/28 (78%) of participants had correctly deduced the purpose of the study and 14/28 (50%) felt LH did not influence their assessment. LH influenced mini-CEX scores despite raters' awareness of the potential for bias. These results suggest that LH could influence a rater's performance assessment, and careful consideration of the potential implications of LH is required.
Affiliation(s)
- Tammy Shaw
- Department of Medicine, University of Ottawa, Ottawa, ON, Canada.
- The Ottawa Hospital - General Campus, 501 Smyth Road, Box 209, Ottawa, ON, K1H 8L6, Canada.
- Timothy J Wood
- Department of Innovation in Medical Education, Faculty of Medicine, University of Ottawa, Ottawa, ON, Canada
- Claire Touchie
- Department of Medicine, University of Ottawa, Ottawa, ON, Canada
- Department of Innovation in Medical Education, Faculty of Medicine, University of Ottawa, Ottawa, ON, Canada
- Medical Council of Canada, Ottawa, Canada
- Debra Pugh
- Department of Medicine, University of Ottawa, Ottawa, ON, Canada
- Medical Council of Canada, Ottawa, Canada
- Susan M Humphrey-Murto
- Department of Medicine, University of Ottawa, Ottawa, ON, Canada
- Department of Innovation in Medical Education, Faculty of Medicine, University of Ottawa, Ottawa, ON, Canada
10
Davids J, Makariou SG, Ashrafian H, Darzi A, Marcus HJ, Giannarou S. Automated Vision-Based Microsurgical Skill Analysis in Neurosurgery Using Deep Learning: Development and Preclinical Validation. World Neurosurg 2021; 149:e669-e686. [PMID: 33588081; DOI: 10.1016/j.wneu.2021.01.117] [Received: 10/23/2020; Revised: 01/22/2021; Accepted: 01/23/2021; Indexed: 12/22/2022]
Abstract
BACKGROUND/OBJECTIVE Technical skill acquisition is an essential component of neurosurgical training. Educational theory suggests that optimal learning and improvement in performance depends on the provision of objective feedback. Therefore, the aim of this study was to develop a vision-based framework based on a novel representation of surgical tool motion and interactions capable of automated and objective assessment of microsurgical skill. METHODS Videos were obtained from 1 expert, 6 intermediate, and 12 novice surgeons performing arachnoid dissection in a validated clinical model using a standard operating microscope. A mask region convolutional neural network framework was used to segment the tools present within the operative field in a recorded video frame. Tool motion analysis was achieved using novel triangulation metrics. Performance of the framework in classifying skill levels was evaluated using the area under the curve and accuracy. Objective measures of classifying the surgeons' skill level were also compared using the Mann-Whitney U test, and a value of P < 0.05 was considered statistically significant. RESULTS The area under the curve was 0.977 and the accuracy was 84.21%. A number of differences were found, which included experts having a lower median dissector velocity (P = 0.0004; 190.38 ms-1 vs. 116.38 ms-1), and a smaller inter-tool tip distance (median 46.78 vs. 75.92; P = 0.0002) compared with novices. CONCLUSIONS Automated and objective analysis of microsurgery is feasible using a mask region convolutional neural network, and a novel tool motion and interaction representation. This may support technical skills training and assessment in neurosurgery.
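The abstract's group comparison of objective motion metrics can be sketched with a standard two-sided Mann-Whitney U test. The velocity values below are invented for the example, not the study's measurements.

```python
import numpy as np
from scipy.stats import mannwhitneyu

# Hypothetical per-trial dissector velocities for two skill groups
# (illustrative values only).
expert = np.array([110.2, 118.5, 114.9, 120.1, 112.3])
novice = np.array([185.4, 192.7, 188.1, 197.3, 190.0])

# Two-sided Mann-Whitney U test on an objective motion metric,
# mirroring the abstract's nonparametric group comparison.
stat, p_value = mannwhitneyu(expert, novice, alternative="two-sided")
significant = p_value < 0.05
```

With small samples, SciPy computes an exact p-value, which suits the group sizes reported in studies of this kind.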
Affiliation(s)
- Joseph Davids
- Department of Surgery and Cancer, Hamlyn Centre for Robotic Surgery, Imperial College London, London, United Kingdom; Imperial College Healthcare NHS Trust, St. Mary's, Praed St., Paddington, London, United Kingdom; Department of Neurosurgery, National Hospital for Neurology and Neurosurgery, London, United Kingdom
- Savvas-George Makariou
- Department of Surgery and Cancer, Hamlyn Centre for Robotic Surgery, Imperial College London, London, United Kingdom
- Hutan Ashrafian
- Department of Surgery and Cancer, Hamlyn Centre for Robotic Surgery, Imperial College London, London, United Kingdom; Imperial College Healthcare NHS Trust, St. Mary's, Praed St., Paddington, London, United Kingdom
- Ara Darzi
- Department of Surgery and Cancer, Hamlyn Centre for Robotic Surgery, Imperial College London, London, United Kingdom; Imperial College Healthcare NHS Trust, St. Mary's, Praed St., Paddington, London, United Kingdom
- Hani J Marcus
- Department of Surgery and Cancer, Hamlyn Centre for Robotic Surgery, Imperial College London, London, United Kingdom; Imperial College Healthcare NHS Trust, St. Mary's, Praed St., Paddington, London, United Kingdom; Department of Neurosurgery, National Hospital for Neurology and Neurosurgery, London, United Kingdom
- Stamatia Giannarou
- Department of Surgery and Cancer, Hamlyn Centre for Robotic Surgery, Imperial College London, London, United Kingdom.
11
Yeates P, Moult A, Lefroy J, Walsh-House J, Clews L, McKinley R, Fuller R. Understanding and developing procedures for video-based assessment in medical education. Medical Teacher 2020; 42:1250-1260. [PMID: 32749915; DOI: 10.1080/0142159x.2020.1801997] [Indexed: 06/11/2023]
Abstract
INTRODUCTION Novel uses of video aim to enhance assessment in health professions education. Whilst these uses presume equivalence between video and live scoring, some research suggests that poorly understood variations could challenge validity. We aimed to understand examiners' and students' interaction with video whilst developing procedures to promote its optimal use. METHODS Using design-based research we developed theory and procedures for video use in assessment, iteratively adapting conditions across simulated OSCE stations. We explored examiners' and students' perceptions using think-aloud, interviews and a focus group. Data were analysed using constructivist grounded-theory methods. RESULTS Video-based assessment produced detachment and reduced volitional control for examiners. Examiners' ability to make valid video-based judgements was mediated by the interaction of station content and specifically selected filming parameters. Examiners displayed several judgemental tendencies which helped them manage videos' limitations but could also bias judgements in some circumstances. Students rarely found carefully placed cameras intrusive and considered filming acceptable if adequately justified. DISCUSSION Successful use of video-based assessment relies on balancing the need to ensure station-specific information adequacy; avoiding disruptive intrusion; and the degree of justification provided by video's educational purpose. Video has the potential to enhance assessment validity and students' learning when an appropriate balance is achieved.
Affiliation(s)
- Peter Yeates
- School of Medicine, Keele University, Keele, UK
- Department of Acute Medicine, Fairfield General Hospital, Pennine Acute Hospital NHS Trust, Bury, UK
- Alice Moult
- School of Medicine, Keele University, Keele, UK
- Richard Fuller
- School of Medicine, University of Liverpool, Liverpool, UK
12
Zimmermann P, Kadmon M. Standardized examinees: development of a new tool to evaluate factors influencing OSCE scores and to train examiners. GMS JOURNAL FOR MEDICAL EDUCATION 2020; 37:Doc40. [PMID: 32685668 PMCID: PMC7346289 DOI: 10.3205/zma001333] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Subscribe] [Scholar Register] [Received: 10/15/2019] [Revised: 02/23/2020] [Accepted: 04/27/2020] [Indexed: 05/27/2023]
Abstract
Introduction: The Objective Structured Clinical Examination (OSCE) is an established format for practical clinical assessments at most medical schools and discussion is underway in Germany to make it part of future state medical exams. Examiner behavior that influences assessment results is described. Erroneous assessments of student performance can result, for instance, from systematic leniency, inconsistent grading, halo effects, and even a lack of differentiation between the tasks to be performed over the entire grading scale. The aim of this study was to develop a quality assurance tool that can monitor factors influencing grading in a real OSCE and enable targeted training of examiners. Material, Methods and Students: Twelve students at the Medical Faculty of the University of Heidelberg were each trained to perform a defined task for a particular surgical OSCE station. Definitions were set and operationalized for an excellent and a borderline performance. In a simulated OSCE during the first part of the study, the standardized student performances were assessed and graded by different examiners three times in succession; video recordings were made. Quantitative and qualitative analysis of the videos was also undertaken by the study coordinator. In the second part of the study, the videos were used to investigate the examiners' acceptance of standardized examinees and to analyze potential influences on scoring that stemmed from the examiners' experience. Results: In the first part of the study, the OSCE scores and subsequent video analysis showed that standardization for defined performance levels at different OSCE stations is generally possible. Individual deviations from the prescribed examinee responses were observed and occurred primarily with increased complexity of OSCE station content. In the second part of the study, inexperienced examiners assessed a borderline performance significantly lower than their experienced colleagues (13.50 vs. 15.15, p=0.035). No difference was seen in the evaluation of the excellent examinees. Both groups of examiners graded the item "social competence", despite identical standardization, significantly lower for examinees with borderline performances than for excellent examinees (4.13 vs. 4.80, p<0.001). Conclusion: Standardization of examinees for previously defined performance levels is possible, making a new tool available in future not only for OSCE quality assurance, but also for training examiners. Detailed preparation of the OSCE checklists and intensive training of the examinees are essential. This new tool takes on a special importance if standardized OSCEs are integrated into state medical exams and, as such, become high-stakes assessments.
Affiliation(s)
- Petra Zimmermann
- Ludwig-Maximilians-Universität München, Klinikum der Universität, Klinik für Allgemein-, Viszeral- und Transplantationschirurgie, München, Germany
- Martina Kadmon
- Universität Augsburg, Medizinische Fakultät, Gründungsdekanat, Augsburg, Germany
13
Lewis P, Hunt L, Ramjan LM, Daly M, O'Reilly R, Salamonson Y. Factors contributing to undergraduate nursing students' satisfaction with a video assessment of clinical skills. NURSE EDUCATION TODAY 2020; 84:104244. [PMID: 31715471 DOI: 10.1016/j.nedt.2019.104244] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/27/2019] [Revised: 09/06/2019] [Accepted: 10/10/2019] [Indexed: 06/10/2023]
Abstract
BACKGROUND Clinical skill assessment via Objective Structured Clinical Assessment (OSCA) poses many challenges for undergraduate nursing students. These include high levels of anxiety that can compromise performance during the assessment, inconsistent assessor reliability, and poor correspondence with clinical skills performance in the real world. The implementation of a Video Assessment of Clinical Skills (VACS) that integrates formative feedback may be a way to address the challenges posed by OSCA assessment. OBJECTIVES The aim of this study was to examine the acceptability, utility, and nursing student satisfaction with a formative feedback strategy, the Video Assessment of a Clinical Skill (VACS). DESIGN A cross-sectional survey. SETTINGS Undergraduate Bachelor of Nursing degree students from a large Australian university. PARTICIPANTS Third-year (final-year) undergraduate nursing students enrolled in a Bachelor of Nursing program. METHODS Participants were recruited via purposive sampling. A pre-survey (prior to the VACS assessment) and post-survey (after the VACS assessment) were completed. This paper reports on the open-ended responses in the post-survey that explored students' insights and perceptions of formative feedback and its impact on their learning for the VACS assessment. RESULTS A total of 731 open-ended responses were analysed, with findings organised into three major themes: (i) flexibility and reflexivity, (ii) editing and repeated attempts, and (iii) working together. CONCLUSIONS The Video Assessment of a Clinical Skill demonstrated good utility, acceptability, and satisfaction among undergraduate nursing students.
Affiliation(s)
- Peter Lewis
- Western Sydney University, School of Nursing and Midwifery, Locked Bag 1797, Penrith, NSW 2751, Australia.
- Leanne Hunt
- Western Sydney University, School of Nursing and Midwifery, Locked Bag 1797, Penrith, NSW 2751, Australia.
- Lucie M Ramjan
- Western Sydney University, School of Nursing and Midwifery, Locked Bag 1797, Penrith, NSW 2751, Australia.
- Miranda Daly
- Western Sydney University, School of Nursing and Midwifery, Locked Bag 1797, Penrith, NSW 2751, Australia.
- Rebecca O'Reilly
- Western Sydney University, School of Nursing and Midwifery, Locked Bag 1797, Penrith, NSW 2751, Australia.
- Yenna Salamonson
- Western Sydney University, School of Nursing and Midwifery, Locked Bag 1797, Penrith, NSW 2751, Australia.
14
Fieler G. Utilization of Video for Competency Evaluation. Nurs Educ Perspect 2020; 41:255-257. [PMID: 32574477 DOI: 10.1097/01.nep.0000000000000645] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
Learner self-assessment and accountability are important features in nursing education. Innovative competency sessions enhance the delivery of quality care. This project investigated the addition of video into established competency evaluations for the purpose of learner review and self-evaluation. An experimental design was used to determine the effects of video on learner self-evaluation. With 95 percent confidence, the mean total self-evaluation score for those without the video was higher than for those with the video, supporting the hypothesis that those without video recall would inflate their self-evaluation scores.
Affiliation(s)
- Gina Fieler
- Gina Fieler, DNP, RN, CHSE, Director of Clinical Simulation, Northern Kentucky University, Highland Heights, Kentucky
15
MacKenzie DE, Merritt BK, Holstead R, Sarty GE. Professional practice behaviour: Identification and validation of key indicators. Br J Occup Ther 2019. [DOI: 10.1177/0308022619879361] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
INTRODUCTION Professional behaviour is regarded as an important competency for occupational therapy practice, yet little guidance exists for indicators underpinning development or remediation in the educational or practice settings. This study sought to confirm the content validity of observable professional behaviour indicators from an existing evaluation framework for representativeness and relevance for occupational therapy practice. METHODS A modified Delphi approach was conducted with expert panellists (n = 30) consisting of regulators, administrators, faculty members, practitioners, and students for professional behaviour indicator consensus, together with a cross-sectional survey of practitioners (n = 119). Fleiss' κ and χ² contingency tables were completed for agreement across panellists, and between panellist and survey groups. Cross-case qualitative analyses identified facilitators and barriers for professional behaviour practice. RESULTS Content validity of 17 professional behaviour indicators was achieved, with >85% agreement from the expert panellists and the cross-sectional survey group. Main professional behaviour reporting issues in practice included fear of reprisal, lack of formal policies, and an unsupportive culture. Support from others, documented workplace policies, and self-regulation/duty to monitor were the critical facilitators for supporting professional behaviour in practice. CONCLUSION The professional behaviour indicators in this study offer observable behaviours from which assessment rubrics or tools may be developed. Further study is warranted.
16
Holmstrom AL, Meyerson SL. Obtaining Meaningful Assessment in Thoracic Surgery Education. Thorac Surg Clin 2019; 29:239-247. [PMID: 31235292 DOI: 10.1016/j.thorsurg.2019.03.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/26/2022]
Abstract
Training in thoracic surgery has evolved immensely over the past decade due to the advent of integrated programs, technological innovations, and regulations on resident duty hours, decreasing the time trainees have to learn. These changes have made assessment of thoracic surgical trainees even more important. Shifts in medical education have increasingly emphasized competency, which has led to novel competency-based assessment tools for clinical and operative assessment. These novel tools take advantage of simulation and modern technology to provide more frequent and comprehensive assessment of the surgical trainee to ensure competence.
Affiliation(s)
- Amy L Holmstrom
- Department of Surgery, Northwestern University Feinberg School of Medicine, 676 North Saint Clair Street, Suite 2320, Chicago, IL 60611, USA
- Shari L Meyerson
- Department of Surgery, University of Kentucky, 740 South Limestone, Suite A301, Lexington, KY 40536, USA.
18
Lee V, Brain K, Martin J. From opening the 'black box' to looking behind the curtain: cognition and context in assessor-based judgements. ADVANCES IN HEALTH SCIENCES EDUCATION : THEORY AND PRACTICE 2019; 24:85-102. [PMID: 30302670 DOI: 10.1007/s10459-018-9851-0] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/25/2018] [Accepted: 09/06/2018] [Indexed: 06/08/2023]
Abstract
The increasing use of direct observation tools to assess routine performance has resulted in the growing reliance on assessor-based judgements in the workplace. However, we have a limited understanding of how assessors make judgements and formulate ratings in real world contexts. The current research on assessor cognition has largely focused on the cognitive domain but the contextual factors are equally important, and both are closely interconnected. This study aimed to explore the perceived cognitive and contextual factors influencing Mini-CEX assessor judgements in the Emergency Department setting. We used a conceptual framework of assessor-based judgement to develop a sequential mixed methods study. We analysed and integrated survey and focus group results to illustrate self-reported cognitive and contextual factors influencing assessor judgements. We used situated cognition theory as a sensitizing lens to explore the interactions between people and their environment. The major factors highlighted through our mixed methods study were: clarity of the assessment, reliance on and variable approach to overall impression (gestalt), role tension especially when giving constructive feedback, prior knowledge of the trainee and case complexity. We identified prevailing tensions between participants (assessors and trainees), interactions (assessment and feedback) and setting. The two practical implications of our research are the need to broaden assessor training to incorporate both cognitive and contextual domains, and the need to develop a more holistic understanding of assessor-based judgements in real world contexts to better inform future research and development in workplace-based assessments.
Affiliation(s)
- Victor Lee
- Department of Emergency Medicine, Austin Health, P.O. Box 5555, Heidelberg, VIC, 3084, Australia.
- Jenepher Martin
- Eastern Health Clinical School, Monash University and Deakin University, Box Hill, VIC, Australia
19
Dory V, Gomez-Garibello C, Cruess R, Cruess S, Cummings BA, Young M. The challenges of detecting progress in generic competencies in the clinical setting. MEDICAL EDUCATION 2018; 52:1259-1270. [PMID: 30430619 DOI: 10.1111/medu.13749] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/09/2018] [Revised: 03/26/2018] [Accepted: 08/28/2018] [Indexed: 05/13/2023]
Abstract
CONTEXT Competency-based medical education has spurred the implementation of longitudinal workplace-based assessment (WBA) programmes to track learners' development of competencies. These hinge on the appropriate use of assessment instruments by assessors. This study aimed to validate our assessment programme and specifically to explore whether assessors' beliefs and behaviours rendered the detection of progress possible. METHODS We implemented a longitudinal WBA programme in the third year of a primarily rotation-based clerkship. The programme used the professionalism mini-evaluation exercise (P-MEX) to detect progress in generic competencies. We used mixed methods: a retrospective psychometric examination of student assessment data in one academic year, and a prospective focus group and interview study of assessors' beliefs and reported behaviours related to the assessment. RESULTS We analysed 1662 assessment forms for 186 students. We conducted interviews and focus groups with 21 assessors from different professions and disciplines. Scores were excellent from the outset (3.5-3.7/4), with no meaningful increase across blocks (average overall scores: 3.6 in block 1 versus 3.7 in blocks 2 and 3; F = 8.310, d.f. 2, p < 0.001). The main source of variance was the forms (47%) and only 1% of variance was attributable to students, which led to low generalisability across forms (Eρ2 = 0.18). Assessors reported using multiple observations to produce their assessments and were reluctant to harm students by consigning anything negative to writing. They justified the use of a consistent benchmark across time by citing the basic nature of the form or a belief that the 'competencies' assessed were in fact fixed attributes that were unlikely to change. CONCLUSIONS Assessors may purposefully deviate from instructions in order to meet their ethical standards of good assessment. Furthermore, generic competencies may be viewed as intrinsic and fixed rather than as learnable. 
Implementing a longitudinal WBA programme is complex and requires careful consideration of assessors' beliefs and values.
Affiliation(s)
- Valérie Dory
- Centre for Medical Education, Faculty of Medicine, McGill University, Montreal, Quebec, Canada
- Department of Medicine, Faculty of Medicine, McGill University, Montreal, Quebec, Canada
- Carlos Gomez-Garibello
- Centre for Medical Education, Faculty of Medicine, McGill University, Montreal, Quebec, Canada
- Department of Medicine, Faculty of Medicine, McGill University, Montreal, Quebec, Canada
- Richard Cruess
- Centre for Medical Education, Faculty of Medicine, McGill University, Montreal, Quebec, Canada
- Department of Surgery, Faculty of Medicine, McGill University, Montreal, Quebec, Canada
- Sylvia Cruess
- Centre for Medical Education, Faculty of Medicine, McGill University, Montreal, Quebec, Canada
- Department of Medicine, Faculty of Medicine, McGill University, Montreal, Quebec, Canada
- Beth-Ann Cummings
- Centre for Medical Education, Faculty of Medicine, McGill University, Montreal, Quebec, Canada
- Department of Medicine, Faculty of Medicine, McGill University, Montreal, Quebec, Canada
- Undergraduate Medical Education, Faculty of Medicine, McGill University, Montreal, Quebec, Canada
- Meredith Young
- Centre for Medical Education, Faculty of Medicine, McGill University, Montreal, Quebec, Canada
- Department of Medicine, Faculty of Medicine, McGill University, Montreal, Quebec, Canada
20
Gingerich A, Schokking E, Yeates P. Comparatively salient: examining the influence of preceding performances on assessors' focus and interpretations in written assessment comments. ADVANCES IN HEALTH SCIENCES EDUCATION : THEORY AND PRACTICE 2018; 23:937-959. [PMID: 29980956 DOI: 10.1007/s10459-018-9841-2] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/26/2018] [Accepted: 07/03/2018] [Indexed: 06/08/2023]
Abstract
Recent literature places more emphasis on assessment comments rather than relying solely on scores. Both are variable, however, emanating from assessment judgements. One established source of variability is "contrast effects": scores are shifted away from the depicted level of competence in a preceding encounter. The shift could arise from an effect on the range-frequency of assessors' internal scales or the salience of performance aspects within assessment judgments. As these suggest different potential interventions, we investigated assessors' cognition by using the insight provided by "clusters of consensus" to determine whether any change in the salience of performance aspects was induced by contrast effects. A dataset from a previous experiment contained scores and comments for 3 encounters: 2 with significant contrast effects and 1 without. Clusters of consensus were identified using F-sort and latent partition analysis both when contrast effects were significant and non-significant. The proportion of assessors making similar comments only significantly differed when contrast effects were significant with assessors more frequently commenting on aspects that were dissimilar with the standard of competence demonstrated in the preceding performance. Rather than simply influencing range-frequency of assessors' scales, preceding performances may affect salience of performance aspects through comparative distinctiveness: when juxtaposed with the context some aspects are more distinct and selectively draw attention. Research is needed to determine whether changes in salience indicate biased or improved assessment information. The potential should be explored to augment existing benchmarking procedures in assessor training by cueing assessors' attention through observation of reference performances immediately prior to assessment.
Affiliation(s)
- Andrea Gingerich
- Northern Medical Program, University of Northern British Columbia, 3333 University Way, Prince George, BC, V2N 4Z9, Canada.
- Edward Schokking
- Northern Medical Program, University of Northern British Columbia, 3333 University Way, Prince George, BC, V2N 4Z9, Canada
- Peter Yeates
- Keele University School of Medicine, Keele, Staffordshire, UK
- Pennine Acute Hospitals NHS Trust, Bury, Lancashire, UK
21
Wilbur K, Wilby KJ, Pawluk S. Pharmacy Preceptor Judgments of Student Performance and Behavior During Experiential Training. AMERICAN JOURNAL OF PHARMACEUTICAL EDUCATION 2018; 82:6451. [PMID: 30643308 PMCID: PMC6325462 DOI: 10.5688/ajpe6451] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/07/2017] [Accepted: 08/10/2017] [Indexed: 05/29/2023]
Abstract
Objective. To report the findings of how Canadian preceptors perceive and subsequently evaluate diverse levels of trainees during pharmacy clerkships. Methods. Using a modified Delphi technique, 17 Doctor of Pharmacy (PharmD) preceptors from across Canada categorized 16 student narrative descriptions according to their perception of the described student performance: exceeds, meets, or falls below their expectations. Results. Twelve (75%) student narrative profiles were categorized unanimously in the final round, six of which were below expectations. Out of 117 below-expectations ratings by responding preceptors, the majority (115, 98%) indicated that the post-baccalaureate PharmD student described would fail. Conversely, if the same narrative instead profiled a resident or an entry-to-practice PharmD student, rotation failure decreased to 95 (81%) and 89 (76%), respectively. Conclusion. Pharmacy preceptors do not uniformly judge the same described student performance and inconsistently apply failing rotation grades when they do agree that performance falls below expectations.
Affiliation(s)
- Kerry Wilbur
- Faculty of Pharmaceutical Sciences, University of British Columbia, Vancouver, BC, Canada
- Kyle J. Wilby
- School of Pharmacy, University of Otago, Dunedin, New Zealand
- Shane Pawluk
- College of Pharmacy, Qatar University, Doha, Qatar
22
Frallicciardi A, Lotterman S, Ledford M, Prenovitz I, Meter RV, Kuo C, Nowicki T, Fuller R. Training for Failure: A Simulation Program for Emergency Medicine Residents to Improve Communication Skills in Service Recovery. AEM EDUCATION AND TRAINING 2018; 2:277-287. [PMID: 30386837 PMCID: PMC6194038 DOI: 10.1002/aet2.10116] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/17/2018] [Revised: 06/08/2018] [Accepted: 06/10/2018] [Indexed: 06/08/2023]
Abstract
OBJECTIVES Service failures such as long waits, testing delays, and medical errors are daily occurrences in every emergency department (ED). Service recovery refers to the immediate response of an organization or individual to resolve these failures. Effective service recovery can improve the experience of both the patient and the physician. This study investigated a simulation-based program to improve service recovery skills in postgraduate year 1 emergency medicine (PGY-1 EM) residents. METHODS Eighteen PGY-1 EM residents participated in six cases that simulated common ED service failures. The patient instructors (PIs) participating in each case and two independent emergency medicine (EM) faculty observers used the modified Master Interview Rating Scale to assess the communication skills of each resident in three simulation cases before and three simulation cases after a service recovery debriefing. For each resident, the mean scores of the first three cases and those of the last three cases were termed pre- and postintervention scores, respectively. The means and standard deviations of the pre- and postintervention scores were calculated by the type of rater and compared using paired t-tests. Additionally, the mean scores of each case were summarized. In the framework of the linear mixed-effects model, the variance in scores from the PIs and faculty observers was decomposed into variance contributed by PIs/cases, the program effect on individual residents, and the unexplained variance. In reliability analyses, the intraclass correlation coefficient between rater types and the 95% confidence interval were reported before and after the intervention. RESULTS When rated by the PIs, the pre- and postintervention scores showed no difference (p = 0.852). In contrast, when scored by the faculty observers, the postintervention score was significantly improved compared to the preintervention score (p < 0.001). 
In addition, for the faculty observers, the program effect was a significant contributor to the variation in scores. Low intraclass correlation was observed between rater groups. CONCLUSIONS This innovative simulation-based program was effective at teaching service recovery communication skills to residents as evaluated by EM faculty, but not PIs. This study supports further exploration into programs to teach and evaluate service recovery communication skills in EM residents.
Affiliation(s)
- Alise Frallicciardi
- Department of Emergency Medicine, University of Connecticut School of Medicine, Farmington, CT
- Seth Lotterman
- Department of Emergency Medicine, University of Connecticut School of Medicine, Farmington, CT
- Matthew Ledford
- Department of Emergency Medicine, University of Connecticut School of Medicine, Farmington, CT
- Ilana Prenovitz
- Department of Emergency Medicine, University of Connecticut School of Medicine, Farmington, CT
- Rochelle Van Meter
- Department of Emergency Medicine, University of Connecticut School of Medicine, Farmington, CT
- Chia-Ling Kuo
- Department of Emergency Medicine, University of Connecticut School of Medicine, Farmington, CT
- Thomas Nowicki
- Department of Emergency Medicine, University of Connecticut School of Medicine, Farmington, CT
- Emergency Department, Hartford Hospital, Hartford, CT
- Robert Fuller
- Department of Emergency Medicine, University of Connecticut School of Medicine, Farmington, CT
23
de Jonge LPJWM, Timmerman AA, Govaerts MJB, Muris JWM, Muijtjens AMM, Kramer AWM, van der Vleuten CPM. Stakeholder perspectives on workplace-based performance assessment: towards a better understanding of assessor behaviour. ADVANCES IN HEALTH SCIENCES EDUCATION : THEORY AND PRACTICE 2017; 22:1213-1243. [PMID: 28155004 PMCID: PMC5663793 DOI: 10.1007/s10459-017-9760-7] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/07/2016] [Accepted: 01/24/2017] [Indexed: 05/13/2023]
Abstract
Workplace-Based Assessment (WBA) plays a pivotal role in present-day competency-based medical curricula. Validity in WBA mainly depends on how stakeholders (e.g. clinical supervisors and learners) use the assessments, rather than on the intrinsic qualities of instruments and methods. Current research on assessment in clinical contexts seems to imply that variable behaviours during performance assessment of both assessors and learners may well reflect their respective beliefs and perspectives towards WBA. We therefore performed a Q methodological study to explore perspectives underlying stakeholders' behaviours in WBA in a postgraduate medical training program. Five different perspectives on performance assessment were extracted: Agency, Mutuality, Objectivity, Adaptivity and Accountability. These perspectives reflect both differences and similarities in stakeholder perceptions and preferences regarding the utility of WBA. In comparing and contrasting the various perspectives, we identified two key areas of disagreement, specifically 'the locus of regulation of learning' (i.e., self-regulated versus externally regulated learning) and 'the extent to which assessment should be standardised' (i.e., tailored versus standardised assessment). Differing perspectives may variously affect stakeholders' acceptance and use of assessment programmes, and consequently their effectiveness. Continuous interaction between all stakeholders is essential to monitor, adapt and improve assessment practices and to stimulate the development of a shared mental model. Better understanding of underlying stakeholder perspectives could be an important step in bridging the gap between psychometric and socio-constructivist approaches in WBA.
Affiliation(s)
- Laury P J W M de Jonge
- Department of Family Medicine, FHML, Maastricht University, P.O. Box 616, 6200 MD, Maastricht, The Netherlands.
- Angelique A Timmerman
- Department of Family Medicine, FHML, Maastricht University, P.O. Box 616, 6200 MD, Maastricht, The Netherlands
- Marjan J B Govaerts
- Department of Educational Research and Development, FHML, Maastricht University, Maastricht, The Netherlands
- Jean W M Muris
- Department of Family Medicine, FHML, Maastricht University, P.O. Box 616, 6200 MD, Maastricht, The Netherlands
- Arno M M Muijtjens
- Department of Educational Research and Development, FHML, Maastricht University, Maastricht, The Netherlands
- Anneke W M Kramer
- Department of Family Medicine, Leiden University, Leiden, The Netherlands
- Cees P M van der Vleuten
- Department of Educational Research and Development, FHML, Maastricht University, Maastricht, The Netherlands
24
Kogan JR, Hatala R, Hauer KE, Holmboe E. Guidelines: The do's, don'ts and don't knows of direct observation of clinical skills in medical education. PERSPECTIVES ON MEDICAL EDUCATION 2017; 6:286-305. [PMID: 28956293 PMCID: PMC5630537 DOI: 10.1007/s40037-017-0376-7] [Citation(s) in RCA: 86] [Impact Index Per Article: 12.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/12/2023]
Abstract
INTRODUCTION Direct observation of clinical skills is a key assessment strategy in competency-based medical education. The guidelines presented in this paper synthesize the literature on direct observation of clinical skills. The goal is to provide a practical list of Do's, Don'ts and Don't Knows about direct observation for supervisors who teach learners in the clinical setting and for educational leaders who are responsible for clinical training programs. METHODS We built consensus through an iterative approach in which each author, based on their medical education and research knowledge and expertise, independently developed a list of Do's, Don'ts, and Don't Knows about direct observation of clinical skills. Lists were compiled, discussed and revised. We then sought and compiled evidence to support each guideline and determine the strength of each guideline. RESULTS A final set of 33 Do's, Don'ts and Don't Knows is presented along with a summary of evidence for each guideline. Guidelines focus on two groups: individual supervisors and the educational leaders responsible for clinical training programs. Guidelines address recommendations for how to focus direct observation, select an assessment tool, promote high quality assessments, conduct rater training, and create a learning culture conducive to direct observation. CONCLUSIONS High frequency, high quality direct observation of clinical skills can be challenging. These guidelines offer important evidence-based Do's and Don'ts that can help improve the frequency and quality of direct observation. Improving direct observation requires focus not just on individual supervisors and their learners, but also on the organizations and cultures in which they work and train. Additional research to address the Don't Knows can help educators realize the full potential of direct observation in competency-based education.
Affiliation(s)
- Jennifer R Kogan
- Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA, USA.
- Rose Hatala
- University of British Columbia, Vancouver, British Columbia, Canada
- Karen E Hauer
- University of California San Francisco, San Francisco, CA, USA
- Eric Holmboe
- Accreditation Council of Graduate Medical Education, Chicago, IL, USA

25
Wilbur K, Hassaballa N, Mahmood OS, Black EK. Describing student performance: a comparison among clinical preceptors across cultural contexts. MEDICAL EDUCATION 2017; 51:411-422. [PMID: 28220518 DOI: 10.1111/medu.13223] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/16/2016] [Revised: 02/26/2016] [Accepted: 09/09/2016] [Indexed: 06/06/2023]
Abstract
CONTEXT Health professional student evaluation during experiential training is notably subjective and assessor judgements may be affected by socio-cultural influences. OBJECTIVES This study sought to explore how clinical preceptors in pharmacy conceptualise varying levels of student performance and to identify any contextual differences that may exist across different countries. METHODS The qualitative research design employed semi-structured interviews. A sample of 20 clinical preceptors for post-baccalaureate Doctor of Pharmacy programmes in Canada and the Middle East gave personal accounts of how students they had supervised fell below, met or exceeded their expectations. Discussions were analysed following constructivist grounded theory principles. RESULTS Seven major themes encompassing how clinical pharmacy preceptors categorise levels of student performance and behaviour were identified: knowledge; team interaction; motivation; skills; patient care; communication, and professionalism. Expectations were outlined using both positive and negative descriptions. Pharmacists typically described supervisory experiences representing a series of these categories, but arrived at concluding judgements in a holistic fashion: if valued traits of motivation and positive attitude were present, overall favourable impressions of a student could be maintained despite observations of a few deficiencies. Some prioritised dimensions could not be mapped to defined existing educational outcomes. There was no difference in thresholds for how student performance was distinguished by participants in the two regions. CONCLUSIONS The present research findings are congruent with current literature related to the constructs used by clinical supervisors in health professional student workplace-based assessment and provide additional insight into cross-national perspectives in pharmacy. 
As previously determined in social work and medicine, further study of how evaluation instruments and associated processes can integrate these judgements should be pursued in this discipline.
Affiliation(s)
- Kerry Wilbur
- College of Pharmacy, Qatar University, Doha, Qatar
- Emily K Black
- College of Pharmacy, Dalhousie University, Halifax, Nova Scotia, Canada
26
Patel M, Agius S. Cross-cultural comparisons of assessment of clinical performance. MEDICAL EDUCATION 2017; 51:348-350. [PMID: 28299843 DOI: 10.1111/medu.13262] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/06/2023]
27
Beckstead JW, Boutis K, Pecaric M, Pusic MV. Sequential dependencies in categorical judgments of radiographic images. ADVANCES IN HEALTH SCIENCES EDUCATION : THEORY AND PRACTICE 2017; 22:197-207. [PMID: 27272512 DOI: 10.1007/s10459-016-9692-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/13/2015] [Accepted: 06/01/2016] [Indexed: 06/06/2023]
Abstract
Sequential context effects, the psychological interactions occurring between the events of successive trials when a sequence of similar stimuli are judged, have interested psychologists for decades. It has been well established that individuals exhibit sequential context effects in psychophysical experiments involving unidimensional stimuli. Recent evidence shows that these effects generalize to quantitative judgments of more complex multidimensional stimuli such as images of faces, chairs, and shoes. In this article, we test for the presence of sequential context effects by re-examining previously published data on categorical judgments of 234 complex radiographic images made by 20 experienced physicians and 20 medical students engaged in an online training task. We found that medical students, but not experienced physicians, displayed evidence of sequential context effects. We also found evidence suggesting that as the students learned over blocks of trials, they tended to shift from relative comparisons between consecutive images toward more independent comparisons of each image against (strengthening) internalized standards.
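The lag-one dependence this abstract tests for can be illustrated with a small, purely hypothetical simulation (not the authors' data or analysis): a rater who sometimes repeats the previous categorical judgment produces same-as-previous agreement above the rate expected under independence, which is one simple signature of a sequential context effect.

```python
import random

def lag1_dependence(responses):
    """Observed P(current judgment == previous judgment) minus the
    chance agreement expected under independence of successive trials."""
    same = sum(1 for a, b in zip(responses, responses[1:]) if a == b)
    observed = same / (len(responses) - 1)
    # Chance agreement from the marginal category frequencies.
    n = len(responses)
    freqs = {c: responses.count(c) / n for c in set(responses)}
    expected = sum(f * f for f in freqs.values())
    return observed - expected

random.seed(0)
cats = ["normal", "abnormal"]

# Assimilative rater: repeats the previous judgment 30% of the time,
# otherwise judges the image independently (simplified toy model).
seq = [random.choice(cats)]
for _ in range(1999):
    if random.random() < 0.3:
        seq.append(seq[-1])
    else:
        seq.append(random.choice(cats))

print(round(lag1_dependence(seq), 3))  # clearly positive for this rater
```

A value near zero would indicate judgments independent of the preceding trial, consistent with what the study reports for experienced physicians.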
Affiliation(s)
- Jason W Beckstead
- University of South Florida College of Nursing, 12901 Bruce B. Downs Boulevard, MDC22, Tampa, FL, 33612, USA.
28
Gauthier G, St-Onge C, Tavares W. Rater cognition: review and integration of research findings. MEDICAL EDUCATION 2016; 50:511-22. [PMID: 27072440 DOI: 10.1111/medu.12973] [Citation(s) in RCA: 59] [Impact Index Per Article: 7.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/15/2015] [Revised: 07/20/2015] [Accepted: 11/13/2015] [Indexed: 05/21/2023]
Abstract
BACKGROUND Given the complexity of competency frameworks, associated skills and abilities, and contexts in which they are to be assessed in competency-based education (CBE), there is an increased reliance on rater judgements when considering trainee performance. This increased dependence on rater-based assessment has led to the emergence of rater cognition as a field of research in health professions education. The topic, however, is often conceptualised and ultimately investigated using many different perspectives and theoretical frameworks. Critically analysing how researchers think about, study and discuss rater cognition or the judgement processes in assessment frameworks may provide meaningful and efficient directions in how the field continues to explore the topic. METHODS We conducted a critical and integrative review of the literature to explore common conceptualisations and unified terminology associated with rater cognition research. We identified 1045 articles on rater-based assessment in health professions education using Scopus, Medline and ERIC, and 78 articles were included in our review. RESULTS We propose a three-phase framework of observation, processing and integration. We situate nine specific mechanisms and sub-mechanisms described across the literature within these phases: (i) generating automatic impressions about the person; (ii) formulating high-level inferences; (iii) focusing on different dimensions of competencies; (iv) categorising through well-developed schemata based on (a) personal concept of competence, (b) comparison with various exemplars and (c) task and context specificity; (v) weighting and synthesising information differently; (vi) producing narrative judgements; and (vii) translating narrative judgements into scales.
CONCLUSION Our review has allowed us to identify common underlying conceptualisations of observed rater mechanisms and subsequently propose a comprehensive, although complex, framework for the dynamic and contextual nature of the rating process. This framework could help bridge the gap between researchers adopting different perspectives when studying rater cognition and enable the interpretation of contradictory findings of raters' performance by determining which mechanism is enabled or disabled in any given context.
Affiliation(s)
- Christina St-Onge
- Médecine interne, Université de Sherbrooke, Sherbrooke, Quebec, Canada
- Walter Tavares
- Division of Emergency Medicine, McMaster University, Hamilton, Ontario, Canada
- Centennial College, School of Community and Health Studies, Toronto, Ontario, Canada
- ORNGE Transport Medicine, Faculty of Medicine, Mississauga, Ontario, Canada

29
Nash RE, Chalmers L, Stupans I, Brown N. Knowledge, use and perceived relevance of a profession's Competency Standards; implications for Pharmacy Education. INTERNATIONAL JOURNAL OF PHARMACY PRACTICE 2016; 24:390-402. [PMID: 27103063 DOI: 10.1111/ijpp.12267] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/12/2015] [Accepted: 03/22/2016] [Indexed: 11/27/2022]
Abstract
OBJECTIVES To determine the extent of use and perceived relevance of the National Competency Standards Framework for Pharmacists in Australia (NCS). Based on these findings, to suggest approaches for the enhancement of pharmacy education for the profession locally and globally. METHODS Convenience sampling techniques were employed between November 2013 and June 2014 in conducting an online survey with Australian pharmacy students, interns, pharmacists and educators. KEY FINDINGS Data from 527 participants were included in the final analysis. Fewer students (52%, 96/183) and interns (78%, 69/88) knew of the NCS framing pharmacy practice than did pharmacists (86%, 115/134). Despite knowing that the NCS existed, most participants reported poor familiarity with and use of the NCS. Registered pharmacists reported annual use but not for Continuing Professional Development (CPD) plans or annual re-registration requirements. Respondents reported that practical use of the NCS (e.g. mentoring interns) increased their use for personal needs. Some participants suggested regular instruction on self-assessment skills development would enhance meaningful use of the NCS. CONCLUSION Despite self-assessment against the NCS being mandated annually, Australia's practising pharmacists provided explanations for why this is not common in practice. The barriers provided by respondents are interconnected; their enablers are practical solutions to each barrier. The findings reinforce the notion that student pharmacists must have their competency standards, life-long learning and self-assessment skills embedded into their university curriculum to ensure a strong foundation for practice. The opportunity offered by periodic renewal of standards must prompt regular profession-wide evaluation of its education to practice nexus. Insights and author recommendations are portable to the pharmacy profession globally.
Affiliation(s)
- Rose E Nash
- Pharmacy, School of Medicine, University of Tasmania, Hobart, TAS, Australia
- Leanne Chalmers
- Pharmacy, School of Medicine, University of Tasmania, Hobart, TAS, Australia
- Ieva Stupans
- School of Health and Biomedical Sciences, RMIT University, Melbourne, VIC, Australia
- Natalie Brown
- Tasmanian Institute of Learning and Teaching, University of Tasmania, Hobart, TAS, Australia

30
Vaughan B, Moore K. The mini Clinical Evaluation Exercise (mini-CEX) in a pre-registration osteopathy program: Exploring aspects of its validity. INT J OSTEOPATH MED 2016. [DOI: 10.1016/j.ijosm.2015.07.002] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/23/2022]
31
Yeates P, Cardell J, Byrne G, Eva KW. Relatively speaking: contrast effects influence assessors' scores and narrative feedback. MEDICAL EDUCATION 2015; 49:909-919. [PMID: 26296407 DOI: 10.1111/medu.12777] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/13/2014] [Revised: 12/22/2014] [Accepted: 04/27/2015] [Indexed: 06/04/2023]
Abstract
CONTEXT In prior research, the scores assessors assign can be biased away from the standard of preceding performances (i.e. 'contrast effects' occur). OBJECTIVES This study examines the mechanism and robustness of these findings to advance understanding of assessor cognition. We test the influence of the immediately preceding performance relative to that of a series of prior performances. Further, we examine whether assessors' narrative comments are similarly influenced by contrast effects. METHODS Clinicians (n = 61) were randomised to three groups in a blinded, Internet-based experiment. Participants viewed identical videos of good, borderline and poor performances by first-year doctors in varied orders. They provided scores and written feedback after each video. Narrative comments were blindly content-analysed to generate measures of valence and content. Variability of narrative comments and scores was compared between groups. RESULTS Comparisons indicated contrast effects after a single performance. When a good performance was preceded by a poor performance, ratings were higher (mean 5.01, 95% confidence interval [CI] 4.79-5.24) than when observation of the good performance was unbiased (mean 4.36, 95% CI 4.14-4.60; p < 0.05, d = 1.3). Similarly, borderline performance was rated lower when preceded by good performance (mean 2.96, 95% CI 2.56-3.37) than when viewed without preceding bias (mean 3.55, 95% CI 3.17-3.92; p < 0.05, d = 0.7). The series of ratings participants assigned suggested that the magnitude of contrast effects is determined by an averaging of recent experiences. The valence (but not content) of narrative comments showed contrast effects similar to those found in numerical scores. CONCLUSIONS These findings are consistent with research from behavioural economics and psychology that suggests judgement tends to be relative in nature. 
Observing that the valence of narrative comments is similarly influenced suggests these effects represent more than difficulty in translating impressions into a number. The extent to which such factors impact upon assessment in practice remains to be determined as the influence is likely to depend on context.
Affiliation(s)
- Peter Yeates
- Centre for Respiratory Medicine and Allergy, Institute of Inflammation and Repair, University of Manchester, Manchester, UK
- Jenna Cardell
- Royal Bolton Hospital, Bolton NHS Foundation Trust, Bolton, Lancashire, UK
- Gerard Byrne
- Health Education North West, Health Education England, Manchester, UK
- Kevin W Eva
- Centre for Health Education Scholarship, Division of Medicine, University of British Columbia, Vancouver, BC, Canada

32
Kogan JR, Conforti LN, Bernabeo E, Iobst W, Holmboe E. How faculty members experience workplace-based assessment rater training: a qualitative study. MEDICAL EDUCATION 2015; 49:692-708. [PMID: 26077217 DOI: 10.1111/medu.12733] [Citation(s) in RCA: 54] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/21/2014] [Revised: 11/13/2014] [Accepted: 02/11/2015] [Indexed: 05/09/2023]
Abstract
CONTEXT Direct observation of clinical skills is a common approach in workplace-based assessment (WBA). Despite widespread use of the mini-clinical evaluation exercise (mini-CEX), faculty development efforts are typically required to improve assessment quality. Little consensus exists regarding the most effective training methods, and few studies explore faculty members' reactions to rater training. OBJECTIVES This study was conducted to qualitatively explore the experiences of faculty staff with two rater training approaches - performance dimension training (PDT) and a modified approach to frame of reference training (FoRT) - to elucidate how such faculty development can be optimally designed. METHODS In a qualitative study of a multifaceted intervention using complex intervention principles, 45 out-patient resident faculty preceptors from 26 US internal medicine residency programmes participated in a rater training faculty development programme. All participants were interviewed individually and in focus groups during and after the programme to elicit how the training influenced their approach to assessment. A constructivist grounded theory approach was used to analyse the data. RESULTS Many participants perceived that rater training positively influenced their approach to direct observation and feedback, their ability to use entrustment as the standard for assessment, and their own clinical skills. However, barriers to implementation and change included: (i) a preference for holistic assessment over frameworks; (ii) challenges in defining competence; (iii) difficulty in changing one's approach to assessment, and (iv) concerns about institutional culture and buy-in. 
CONCLUSIONS Rater training using PDT and a modified approach to FoRT can provide faculty staff with assessment skills that are congruent with principles of criterion-referenced assessment and entrustment, and foundational principles of competency-based education, while providing them with opportunities to reflect on their own clinical skills. However, multiple challenges to incorporating new forms of training exist. Ongoing efforts to improve WBA are needed to address institutional and cultural contexts, and systems of care delivery.
Affiliation(s)
- Jennifer R Kogan
- Department of Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Lisa N Conforti
- Milestones Development and Evaluation, Accreditation Council of Graduate Medical Education, Chicago, Illinois, USA
- Elizabeth Bernabeo
- Evaluation Research and Development, American Board of Internal Medicine, Philadelphia, Pennsylvania, USA
- William Iobst
- Academic and Clinical Affairs, Commonwealth Medical College, Scranton, Pennsylvania, USA
- Eric Holmboe
- Milestones Development and Evaluation, Accreditation Council of Graduate Medical Education, Chicago, Illinois, USA

33
Yeates P, Moreau M, Eva K. Are Examiners' Judgments in OSCE-Style Assessments Influenced by Contrast Effects? ACADEMIC MEDICINE : JOURNAL OF THE ASSOCIATION OF AMERICAN MEDICAL COLLEGES 2015; 90:975-80. [PMID: 25629945 DOI: 10.1097/acm.0000000000000650] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/11/2023]
Abstract
PURPOSE Laboratory studies have shown that performance assessment judgments can be biased by "contrast effects." Assessors' scores become more positive, for example, when the assessed performance is preceded by relatively weak candidates. The authors queried whether this effect occurs in real, high-stakes performance assessments despite increased formality and behavioral descriptors. METHOD Data were obtained for the 2011 United Kingdom Foundation Programme clinical assessment and the 2008 University of Alberta Multiple Mini Interview. Candidate scores were compared with scores for immediately preceding candidates and progressively distant candidates. In addition, average scores for the preceding three candidates were calculated. Relationships between these variables were examined using linear regression. RESULTS Negative relationships were observed between index scores and both immediately preceding and recent scores for all exam formats. Relationships were greater between index scores and the average of the three preceding scores. These effects persisted even when examiners had judged several performances, explaining up to 11% of observed variance on some occasions. CONCLUSIONS These findings suggest that contrast effects do influence examiner judgments in high-stakes performance-based assessments. Although the observed effect was smaller than observed in experimentally controlled laboratory studies, this is to be expected given that real-world data lessen the strength of the intervention by virtue of less distinct differences between candidates. Although it is possible that the format of circuital exams reduces examiners' susceptibility to these influences, the finding of a persistent effect after examiners had judged several candidates suggests that the potential influence on candidate scores should not be ignored.
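The core analysis described here (regressing each candidate's score on the average of the three preceding candidates' scores, where a negative slope indicates a contrast effect) can be sketched with a toy simulation. This is an illustration under an assumed data-generating model, not the study's data or code:

```python
import random
import statistics

def preceding_mean_pairs(scores, window=3):
    """Pair each index score with the mean of the preceding `window` scores."""
    return [(statistics.mean(scores[i - window:i]), scores[i])
            for i in range(window, len(scores))]

def slope(pairs):
    """Ordinary least-squares slope of y on x."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx

random.seed(42)
true_ability = [random.gauss(5, 1) for _ in range(500)]  # independent candidates

# Hypothetical contrast effect: each awarded score is pushed away from the
# mean of the three preceding awarded scores (the examiner's recent context).
awarded = []
for i, t in enumerate(true_ability):
    context = statistics.mean(awarded[-3:]) if i >= 3 else 5.0
    awarded.append(t - 0.3 * (context - 5.0) + random.gauss(0, 0.5))

b = slope(preceding_mean_pairs(awarded))
print(round(b, 2))  # negative slope: scores are biased away from recent candidates
```

With no contrast effect (the `-0.3` term removed), the regression slope would hover around zero, since candidates' abilities are independent of their position in the sequence.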
Affiliation(s)
- Peter Yeates
- P. Yeates is clinical lecturer in medical education, Centre for Respiratory Medicine and Allergy, Institute of Inflammation and Repair, University of Manchester, and specialist registrar, Respiratory and General Internal Medicine, Health Education North West, Manchester, United Kingdom. M. Moreau is assistant dean for admissions, Faculty of Medicine and Dentistry, and professor, Division of Orthopaedic Surgery, University of Alberta, Edmonton, Alberta, Canada. K. Eva is senior scientist, Centre for Health Education Scholarship, and professor and director of educational research and scholarship, Department of Medicine, University of British Columbia, Vancouver, British Columbia, Canada
34
Essers G, Dielissen P, van Weel C, van der Vleuten C, van Dulmen S, Kramer A. How do trained raters take context factors into account when assessing GP trainee communication performance? An exploratory, qualitative study. ADVANCES IN HEALTH SCIENCES EDUCATION : THEORY AND PRACTICE 2015; 20:131-147. [PMID: 24858236 DOI: 10.1007/s10459-014-9511-y] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/04/2013] [Accepted: 05/05/2014] [Indexed: 06/03/2023]
Abstract
Communication assessment in real-life consultations is a complex task. Generic assessment instruments help but may also have disadvantages. The generic nature of the skills being assessed does not provide indications for context-specific behaviour required in practice situations; context influences are mostly taken into account implicitly. Our research questions are: 1. What factors do trained raters observe when rating workplace communication? 2. How do they take context factors into account when rating communication performance with a generic rating instrument? Nineteen general practitioners (GPs), trained in communication assessment with a generic rating instrument (the MAAS-Global), participated in a think-aloud protocol reflecting concurrent thought processes while assessing videotaped real-life consultations. They were subsequently interviewed to answer questions explicitly asking them to comment on the influence of predefined contextual factors on the assessment process. Results from both data sources were analysed. We used a grounded theory approach to untangle the influence of context factors on GP communication and on communication assessment. Both from the think-aloud procedure and from the interviews we identified various context factors influencing communication, which were categorised into doctor-related (17), patient-related (13), consultation-related (18), and education-related factors (18). Participants had different views and practices on how to incorporate context factors into the GP(-trainee) communication assessment. Raters acknowledge that context factors may affect communication in GP consultations, but struggle with how to take contextual influences into account when assessing communication performance in an educational context. To assess practice situations, raters need extra guidance on how to handle specific contextual factors.
Affiliation(s)
- Geurt Essers
- Department of Public Health and Primary Care, Leiden University Medical Centre, Hippocratespad 21, 2333 ZP, Leiden, The Netherlands.
35
Abstract
BACKGROUND OSCEs can be both reliable and valid but are subject to sources of error. Examiners become more hawkish as their experience grows, and recent research suggests that in clinical contexts, examiners are influenced by the ability of recently observed candidates. In OSCEs, where examiners test many candidates over a short space of time, this may introduce bias that does not reflect a candidate's true ability. AIMS To test whether examiners marked more or less stringently as time elapsed in a summative OSCE, and to evaluate the practical impact of this bias. METHODS We measured changes in examiner stringency in a 13-station OSCE sat by 278 third-year MBChB students over the course of two days. RESULTS Examiners were most lenient at the start of the OSCE in the clinical section (β = -0.14, p = 0.018) but not in the online section where student answers were machine marked (β = -0.003, p = 0.965). CONCLUSIONS The change in marks was likely caused by increased examiner stringency over time, derived from a combination of growing experience and exposure to an increasing number of successful candidates. The need for better training and for reviewing standards during the OSCE is discussed.
36
Gingerich A, Kogan J, Yeates P, Govaerts M, Holmboe E. Seeing the 'black box' differently: assessor cognition from three research perspectives. MEDICAL EDUCATION 2014; 48:1055-68. [PMID: 25307633 DOI: 10.1111/medu.12546] [Citation(s) in RCA: 165] [Impact Index Per Article: 16.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/10/2014] [Revised: 06/04/2014] [Accepted: 07/11/2014] [Indexed: 05/09/2023]
Abstract
CONTEXT Performance assessments, such as workplace-based assessments (WBAs), represent a crucial component of assessment strategy in medical education. Persistent concerns about rater variability in performance assessments have resulted in a new field of study focusing on the cognitive processes used by raters, or more inclusively, by assessors. METHODS An international group of researchers met regularly to share and critique key findings in assessor cognition research. Through iterative discussions, they identified the prevailing approaches to assessor cognition research and noted that each of them was based on nearly disparate theoretical frameworks and literatures. This paper aims to provide a conceptual review of the different perspectives used by researchers in this field using the specific example of WBA. RESULTS Three distinct, but not mutually exclusive, perspectives on the origins and possible solutions to variability in assessment judgements emerged from the discussions within the group of researchers: (i) the assessor as trainable: assessors vary because they do not apply assessment criteria correctly, use varied frames of reference and make unjustified inferences; (ii) the assessor as fallible: variations arise as a result of fundamental limitations in human cognition that mean assessors are readily and haphazardly influenced by their immediate context, and (iii) the assessor as meaningfully idiosyncratic: experts are capable of making sense of highly complex and nuanced scenarios through inference and contextual sensitivity, which suggests assessor differences may represent legitimate experience-based interpretations. CONCLUSIONS Although each of the perspectives discussed in this paper advances our understanding of assessor cognition and its impact on WBA, every perspective has its limitations.
Following a discussion of areas of concordance and discordance across the perspectives, we propose a coexistent view in which researchers and practitioners utilise aspects of all three perspectives with the goal of advancing assessment quality and ultimately improving patient care.
Affiliation(s)
- Andrea Gingerich
- Northern Medical Program, University of Northern British Columbia, Prince George, British Columbia, Canada