51
Torre DM, Schuwirth LWT, Van der Vleuten CPM. Theoretical considerations on programmatic assessment. Med Teach 2020;42:213-220. [PMID: 31622126 DOI: 10.1080/0142159x.2019.1672863]
Abstract
Introduction: Programmatic assessment (PA) is an approach to assessment aimed at optimizing learning, and it continues to gain educational momentum. However, the theoretical underpinnings of PA have not been clearly described. An explanation of the theoretical underpinnings of PA will allow educators to gain a better understanding of this approach and, perhaps, facilitate its use and effective implementation. The purpose of this article is twofold: first, to describe salient theoretical perspectives on PA; second, to examine how theory may help educators to develop effective PA programs, helping to overcome challenges around PA. Results: We outline a number of learning theories that underpin key educational principles of PA: constructivist and social constructivist theory supporting meaning making and longitudinality; cognitivist and cognitive development orientation scaffolding the practice of a continuous feedback process; theory of instructional design underpinning assessment as learning; and self-determination theory (SDT), self-regulated learning theory (SRL), and principles of deliberate practice providing theoretical tenets for student agency and accountability. Conclusion: The construction of a plausible and coherent link between key educational principles of PA and learning theories should enable educators to pose new and important inquiries, reflect on their assessment practices, and help overcome future challenges in the development and implementation of PA in their programs.
Affiliation(s)
- Dario M Torre: Department of Medicine, Uniformed Services University of Health Sciences, Bethesda, MD, USA
- L W T Schuwirth: Department of Education and Health Profession Education, Flinders Medical School, Adelaide, Australia
- C P M Van der Vleuten: Department of Educational Development and Research, and Faculty of Health Medicine and Life Sciences, School of Health Professions Education, Maastricht University, Maastricht, The Netherlands
52
Kelly MS, Mooney CJ, Rosati JF, Braun MK, Thompson Stone R. Education Research: The Narrative Evaluation Quality Instrument: Development of a tool to assess the assessor. Neurology 2020;94:91-95. [PMID: 31932402 DOI: 10.1212/wnl.0000000000008794]
Abstract
OBJECTIVE Determining the quality of narrative evaluations used to assess medical student neurology clerkship performance remains a challenge. This study sought to develop a tool to comprehensively and systematically assess the quality of student narrative evaluations. METHODS The Narrative Evaluation Quality Instrument (NEQI) was created to assess several components within clerkship narrative evaluations: performance domains, specificity, and usefulness to the learner. In this retrospective study, 5 investigators scored 123 narrative evaluations using the NEQI. Inter-rater reliability was estimated by calculating intraclass correlation coefficients (ICCs) across the 615 resulting NEQI scores. RESULTS The average overall NEQI score was 6.4 (SD 2.9), with mean component arm scores of 2.6 for performance domains (SD 0.9), 1.8 for specificity (SD 1.1), and 2.0 for usefulness (SD 1.4). Each component arm exhibited moderate reliability: performance domains ICC 0.65 (95% confidence interval [CI] 0.58-0.72), specificity ICC 0.69 (95% CI 0.61-0.77), and usefulness ICC 0.73 (95% CI 0.66-0.80). Overall NEQI score exhibited good reliability (0.81; 95% CI 0.77-0.86). CONCLUSION The NEQI is a novel, reliable tool to comprehensively assess the quality of narrative evaluations of neurology clerkship students and will enhance the study of interventions seeking to improve clerkship evaluation.
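For readers who want to reproduce this style of inter-rater reliability analysis, the following is a minimal Python sketch using the pingouin library on synthetic data shaped like the study's (123 narratives scored by 5 raters). It is illustrative only: the column names, score distribution, and choice of ICC form are assumptions, not the authors' code.

```python
# Illustrative ICC computation on synthetic data; not the study's code.
import numpy as np
import pandas as pd
import pingouin as pg

rng = np.random.default_rng(0)
n_narratives, n_raters = 123, 5

# Simulate 615 NEQI scores: a per-narrative "true" quality plus rater noise.
true_quality = rng.normal(6.4, 2.0, n_narratives)
scores = pd.DataFrame(
    [{"narrative": n, "rater": r,
      "neqi": true_quality[n] + rng.normal(0, 1.0)}
     for n in range(n_narratives) for r in range(n_raters)]
)

# pingouin reports all six ICC forms with 95% confidence intervals;
# which form applies depends on how raters were sampled and used.
icc = pg.intraclass_corr(data=scores, targets="narrative",
                         raters="rater", ratings="neqi")
print(icc[["Type", "Description", "ICC", "CI95%"]])
```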
Affiliation(s)
- Michael S Kelly, Christopher J Mooney, Justin F Rosati, Melanie K Braun, Robert Thompson Stone
- All authors: Department of Neurology (R.T.S., J.R., M.B.), University of Rochester School of Medicine and Dentistry (C.M., M.K.), NY
53
Diller D, Cooper S, Jain A, Lam CN, Riddell J. Which Emergency Medicine Milestone Sub-competencies are Identified Through Narrative Assessments? West J Emerg Med 2019;21:173-179. [PMID: 31913841 PMCID: PMC6948702 DOI: 10.5811/westjem.2019.12.44468]
Abstract
Introduction Evaluators use assessment data to make judgments on resident performance within the Accreditation Council for Graduate Medical Education (ACGME) milestones framework. While workplace-based narrative assessments (WBNAs) offer advantages over rating scales, validity evidence for their use in assessing the milestone sub-competencies is lacking. This study aimed to determine the frequency with which sub-competencies are assessed through WBNAs in an emergency medicine (EM) residency program. Methods We performed a retrospective analysis of WBNAs of postgraduate year (PGY) 2–4 residents. A shared mental model was established by reading and discussing the milestones framework, and we created a guide for coding WBNAs to the milestone sub-competencies in an iterative process. Once inter-rater reliability was satisfactory, raters coded each WBNA to the 23 EM milestone sub-competencies. Results We analyzed 2,517 WBNAs. An average of 2.04 sub-competencies was assessed per WBNA. The sub-competencies most frequently identified were multitasking, medical knowledge, practice-based performance improvement, patient-centered communication, and team management. The sub-competencies least frequently identified were pharmacotherapy, airway management, anesthesia and acute pain management, goal-directed focused ultrasound, wound management, and vascular access. Overall, the frequency with which WBNAs assessed individual sub-competencies was low; 14 of the 23 sub-competencies were assessed in less than 5% of WBNAs. Conclusion WBNAs identify few milestone sub-competencies. Faculty assessed similar sub-competencies related to interpersonal and communication skills, practice-based learning and improvement, and medical knowledge, while neglecting sub-competencies related to patient care and procedural skills. These findings can help shape faculty development programs designed to improve assessments of specific workplace behaviors and provide more robust data for the summative assessment of residents.
Affiliation(s)
- David Diller, Aarti Jain, Chun Nok Lam, Jeff Riddell: LAC+USC Medical Center, Keck School of Medicine of the University of Southern California, Department of Emergency Medicine, Los Angeles, California
- Shannon Cooper: Henry Ford Allegiance Health, Department of Emergency Medicine, Jackson, Michigan
54
Tekian A, Park YS, Tilton S, Prunty PF, Abasolo E, Zar F, Cook DA. Competencies and Feedback on Internal Medicine Residents' End-of-Rotation Assessments Over Time: Qualitative and Quantitative Analyses. Acad Med 2019;94:1961-1969. [PMID: 31169541 PMCID: PMC6882536 DOI: 10.1097/acm.0000000000002821]
Abstract
PURPOSE To examine how qualitative narrative comments and quantitative ratings from end-of-rotation assessments change for a cohort of residents from entry to graduation, and explore associations between comments and ratings. METHOD The authors obtained end-of-rotation quantitative ratings and narrative comments for 1 cohort of internal medicine residents at the University of Illinois at Chicago College of Medicine from July 2013 to June 2016. They inductively identified themes in comments, coded orientation (praising/critical) and relevance (specificity and actionability) of feedback, examined associations between codes and ratings, and evaluated changes in themes and ratings across years. RESULTS Data comprised 1,869 assessments (828 comments) on 33 residents. Five themes aligned with ACGME competencies (interpersonal and communication skills, professionalism, medical knowledge, patient care, and systems-based practice), and 3 did not (personal attributes, summative judgment, and comparison to training level). Work ethic was the most frequent subtheme. Comments emphasized medical knowledge more in year 1 and focused more on autonomy, leadership, and teaching in later years. Most comments (714/828 [86%]) contained high praise, and 412/828 (50%) were very relevant. Average ratings correlated positively with orientation (β = 0.46, P < .001) and negatively with relevance (β = -0.09, P = .01). Ratings increased significantly with each training year (year 1, mean [standard deviation]: 5.31 [0.59]; year 2: 5.58 [0.47]; year 3: 5.86 [0.43]; P < .001). CONCLUSIONS Narrative comments address resident attributes beyond the ACGME competencies and change as residents progress. Lower quantitative ratings are associated with more specific and actionable feedback.
Affiliation(s)
- Ara Tekian: professor and associate dean for international affairs, Department of Medical Education, University of Illinois at Chicago College of Medicine, Chicago, Illinois; ORCID: https://orcid.org/0000-0002-9252-1588
- Yoon Soo Park: associate professor, Department of Medical Education, University of Illinois at Chicago College of Medicine, Chicago, Illinois; ORCID: http://orcid.org/0000-0001-8583-4335
- Sarette Tilton: PharmD candidate, University of Illinois at Chicago College of Pharmacy, Chicago, Illinois
- Patrick F. Prunty: PharmD candidate, University of Illinois at Chicago College of Pharmacy, Chicago, Illinois
- Eric Abasolo: PharmD candidate, University of Illinois at Chicago College of Pharmacy, Chicago, Illinois
- Fred Zar: professor and program director, Department of Medicine, University of Illinois at Chicago College of Medicine, Chicago, Illinois
- David A. Cook: professor of medicine and medical education and associate director, Office of Applied Scholarship and Education Science, and consultant, Division of General Internal Medicine, Mayo Clinic College of Medicine, Rochester, Minnesota; ORCID: https://orcid.org/0000-0003-2383-4633
55
Tremblay G, Carmichael PH, Maziade J, Grégoire M. Detection of Residents With Progress Issues Using a Keyword-Specific Algorithm. J Grad Med Educ 2019;11:656-662. [PMID: 31871565 PMCID: PMC6919172 DOI: 10.4300/jgme-d-19-00386.1]
Abstract
BACKGROUND The literature suggests that specific keywords included in summative rotation assessments might be an early indicator of abnormal progress or failure. OBJECTIVE This study aims to determine the possible relationship between specific keywords on in-training evaluation reports (ITERs) and subsequent abnormal progress or failure, with the goal of creating a functional algorithm to identify residents at risk of failure. METHODS A database of all ITERs from all residents training in accredited programs at Université Laval between 2001 and 2013 was created. An instructional designer reviewed all ITERs and proposed terms associated with reinforcing and underperformance feedback. An algorithm based on these keywords was constructed by recursive partitioning using classification and regression tree methods. The developed algorithm was tuned to achieve 100% sensitivity while maximizing specificity. RESULTS There were 41,618 ITERs for 3,292 registered residents. Residents with failure to progress were detected for family medicine (6%, 67 of 1,129) and 36 other specialties (4%, 78 of 2,163), with positive predictive values of 23.3% and 23.4%, respectively. The low positive predictive value may reflect residents improving their performance after receiving feedback, or a reluctance by supervisors to ascribe a "fail" or "in difficulty" score on the ITERs. CONCLUSIONS Classification and regression trees may be helpful to identify pertinent keywords and create an algorithm, which may be implemented in an electronic assessment system to detect residents at risk of poor performance.
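The pipeline described here (keyword features fed into a classification and regression tree, with the operating point tuned for 100% sensitivity) maps onto standard machine-learning tooling. The sketch below is a minimal Python illustration under that reading; the keyword list, example ITER text, and outcome labels are invented placeholders, not the study's materials.

```python
# Illustrative keyword-CART sketch; all keywords, texts, and labels invented.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.tree import DecisionTreeClassifier

# Candidate keywords of the kind an instructional designer might propose.
keywords = ["excellent", "outstanding", "reliable",       # reinforcing
            "struggles", "incomplete", "unsafe", "below"]  # underperformance

iters = ["Excellent clinical judgement, reliable on call.",
         "Struggles with time management; documentation incomplete.",
         "Solid rotation overall, below expectations on handovers."]
failed_to_progress = np.array([0, 1, 1])  # synthetic outcome labels

# Keyword counts per ITER form the tree's input features.
vec = CountVectorizer(vocabulary=keywords)
X = vec.transform(iters).toarray()

tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(X, failed_to_progress)

# Lower the decision threshold until sensitivity reaches 100%, accepting
# whatever specificity (and positive predictive value) results -- mirroring
# the trade-off the study reports (PPV around 23%).
probs = tree.predict_proba(X)[:, 1]
threshold = probs[failed_to_progress == 1].min()
flagged = probs >= threshold
print(f"Flagged {flagged.sum()} of {len(iters)} ITERs for review")
```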
56
van der Vleuten CPM, Schuwirth LWT. Assessment in the context of problem-based learning. Adv Health Sci Educ Theory Pract 2019;24:903-914. [PMID: 31578642 PMCID: PMC6908559 DOI: 10.1007/s10459-019-09909-1]
Abstract
Arguably, constructive alignment has been the major challenge for assessment in the context of problem-based learning (PBL). PBL focuses on promoting abilities such as clinical reasoning, team skills and metacognition. PBL also aims to foster self-directed learning and deep learning as opposed to rote learning. This has incentivized researchers in assessment to find possible solutions. Originally, these solutions were sought in developing the right instruments to measure these PBL-related skills. The search for these instruments has been accelerated by the emergence of competency-based education. With competency-based education, assessment moved away from purely standardized testing and came to rely more heavily on professional judgment of complex skills. Valuable lessons have been learned that are directly relevant for assessment in PBL. Later, solutions were sought in the development of new assessment strategies, initially again with individual instruments such as progress testing, but later through a more holistic approach to the assessment program as a whole. Programmatic assessment is such an integral approach to assessment. It focuses on optimizing learning through assessment, while at the same time gathering rich information that can be used for rigorous decision-making about learner progression. Programmatic assessment comes very close to achieving the desired constructive alignment with PBL, but its wide adoption, just like that of PBL, will take many years.
Affiliation(s)
- Cees P M van der Vleuten: School of Health Professions Education, Faculty of Health, Medicine and Life Sciences, Maastricht University, P.O. Box 616, 6200 MD, Maastricht, The Netherlands
- Lambert W T Schuwirth: Prideaux Centre for Research in Health Professions Education, College of Medicine and Public Health, Flinders University, Sturt Road, Bedford Park, SA, 5042, Australia
57
Ramani S, Könings KD, Ginsburg S, van der Vleuten CPM. Meaningful feedback through a sociocultural lens. Med Teach 2019;41:1342-1352. [PMID: 31550434 DOI: 10.1080/0142159x.2019.1656804]
Abstract
This AMEE guide provides a framework and practical strategies for teachers, learners and institutions to promote meaningful feedback conversations that emphasise performance improvement and professional growth. Recommended strategies are based on recent feedback research and literature, which emphasise the sociocultural nature of these complex interactions. We use key concepts from three theories as the underpinnings of the recommended strategies: sociocultural, politeness and self-determination theories. We view the content and impact of feedback conversations through the perspectives of learners, teachers and institutions, always focussing on learner growth. The guide emphasises the role of teachers in forming educational alliances with their learners, setting a safe learning climate, fostering self-awareness about their performance, engaging with learners in informed self-assessment and reflection, and co-creating the learning environment and learning opportunities with their learners. We highlight the role of institutions in enhancing the feedback culture by encouraging a growth mind-set and a learning goal-orientation. Practical advice is provided on techniques and strategies that learners, teachers and institutions can apply to foster all these elements effectively. Finally, we highlight throughout the guide the critical importance of congruence between the three levels of culture: unwritten values, espoused values and day-to-day behaviours.
Affiliation(s)
- Subha Ramani: Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Research and Scholarship, Harvard Macy Institute, Boston, MA, USA
- Karen D Könings: Department of Educational Development and Research and the School of Health Professions Education, Faculty of Health, Medicine and Life Sciences, Maastricht University, Maastricht, Netherlands
- Shiphra Ginsburg: Department of Medicine (Respirology) and Wilson Centre for Research in Education, University of Toronto, Toronto, Canada
- Cees P M van der Vleuten: Department of Educational Development and Research and the School of Health Professions Education, Faculty of Health, Medicine and Life Sciences, Maastricht University, Maastricht, Netherlands
58
Wilby KJ, Dolmans DHJM, Austin Z, Govaerts MJB. Assessors' interpretations of narrative data on communication skills in a summative OSCE. Med Educ 2019;53:1003-1012. [PMID: 31304615 DOI: 10.1111/medu.13924]
Abstract
OBJECTIVES Increasingly, narrative assessment data are used to substantiate and enhance the robustness of assessor judgements. However, the interpretation of written assessment comments is inherently complex and relies on human (expert) judgements. The purpose of this study was to explore how expert assessors process and construe or bring meaning to narrative data when interpreting narrative assessment comments written by others in the setting of standardised performance assessment. METHODS Narrative assessment comments on student communication skills and communication scores across six objective structured clinical examination stations were obtained for 24 final-year pharmacy students. Aggregated narrative data across all stations were sampled for nine students (three good, three average and three poor performers, based on communication scores). A total of 10 expert assessors reviewed the aggregated set of narrative comments for each student. Cognitive (information) processing was captured through think-aloud procedures and verbal protocol analysis. RESULTS Expert assessors primarily made use of two strategies to interpret the narratives, namely comparing and contrasting, and forming mental images of student performance. Assessors appeared to use three different perspectives when interpreting narrative comments, including those of: (i) the student (placing him- or herself in the shoes of the student); (ii) the examiner (adopting the role of examiner and reinterpreting comments according to his or her own standards or beliefs), and (iii) the professional (acting as the profession's gatekeeper by considering the assessment to be a representation of real-life practice). CONCLUSIONS The present findings add to current understandings of assessors' interpretations of narrative performance data by identifying the strategies and different perspectives used by expert assessors to frame and bring meaning to written comments. Assessors' perspectives affect assessors' interpretations of assessment comments and are likely to be influenced by their beliefs, interpretations of the assessment setting and personal performance theories. These results call for the use of multiple assessors to account for variations in assessor perspectives in the interpretation of narrative assessment data.
Affiliation(s)
- Kyle John Wilby: School of Pharmacy, University of Otago, Dunedin, New Zealand
- Diana H J M Dolmans: School of Health Professions Education (SHE), Department of Educational Development and Research, Faculty of Health, Medicine and Life Sciences, Maastricht University, Maastricht, the Netherlands
- Zubin Austin: Leslie Dan Faculty of Pharmacy, University of Toronto, Toronto, Ontario, Canada
- Marjan J B Govaerts: School of Health Professions Education (SHE), Department of Educational Development and Research, Faculty of Health, Medicine and Life Sciences, Maastricht University, Maastricht, the Netherlands
59
Young JQ. Advancing Our Understanding of Narrative Comments Generated by Direct Observation Tools: Lessons From the Psychopharmacotherapy-Structured Clinical Observation. J Grad Med Educ 2019;11:570-579. [PMID: 31636828 PMCID: PMC6795331 DOI: 10.4300/jgme-d-19-00207.1]
Abstract
BACKGROUND While prior research has focused on the validity of quantitative ratings generated by direct observation tools, much less is known about the written comments. OBJECTIVE This study examines the quality of written comments and their relationship with checklist scores generated by a direct observation tool, the Psychopharmacotherapy-Structured Clinical Observation (P-SCO). METHODS From 2008 to 2012, faculty in a postgraduate year 3 psychiatry outpatient clinic completed 601 P-SCOs. Twenty-five percent were randomly selected from each year; the sample included 8 faculty and 57 residents. To assess quality, comments were coded for valence (reinforcing or corrective), behavioral specificity, and content. To assess the relationship between comments and scores, the authors calculated the correlation between comment and checklist score valence and examined the degree to which comments and checklist scores addressed the same content. RESULTS Ninety-one percent of the comments were behaviorally specific. Sixty percent were reinforcing, and 40% were corrective. Eight themes were identified, including 2 constructs not adequately represented by the checklist. Comment and checklist score valence was moderately correlated (Spearman's rho = 0.57, P < .001). Sixty-seven percent of high and low checklist scores were associated with a comment of the same valence and content. Only 50% of overall comments were associated with a checklist score of the same valence and content. CONCLUSIONS A direct observation tool such as the P-SCO can generate high-quality written comments. Narrative comments both explain checklist scores and convey unique content. Thematic coding of comments can improve the content validity of a checklist.
60
Scarff CE. Towards a greater understanding of narrative data on trainee performance. Med Educ 2019;53:962-964. [PMID: 31402480 DOI: 10.1111/medu.13940]
Affiliation(s)
- Catherine Elizabeth Scarff: Department of Medical Education, Melbourne Medical School, University of Melbourne, Parkville, Victoria, Australia
61
Hamstra SJ, Yamazaki K, Barton MA, Santen SA, Beeson MS, Holmboe ES. A National Study of Longitudinal Consistency in ACGME Milestone Ratings by Clinical Competency Committees: Exploring an Aspect of Validity in the Assessment of Residents' Competence. Acad Med 2019;94:1522-1531. [PMID: 31169540 PMCID: PMC6760653 DOI: 10.1097/acm.0000000000002820]
Abstract
PURPOSE To investigate whether clinical competency committees (CCCs) were consistent in applying milestone ratings for first-year residents over time or whether ratings increased or decreased. METHOD Beginning in December 2013, the Accreditation Council for Graduate Medical Education (ACGME) initiated a phased-in requirement for reporting milestones; emergency medicine (EM), diagnostic radiology (DR), and urology (UR) were among the earliest reporting specialties. The authors analyzed CCC milestone ratings of first-year residents from 2013 to 2016 from all ACGME-accredited EM, DR, and UR programs for which they had data. The number of first-year residents in these programs ranged from 2,838 to 2,928 over this time period. The program-level average milestone rating for each subcompetency was regressed onto the time of observation using a random coefficient multilevel regression model. RESULTS National average program-level milestone ratings of first-year residents decreased significantly over the observed time period for 32 of the 56 subcompetencies examined. None of the other subcompetencies showed a significant change. National average in-training examination scores for each of the specialties remained essentially unchanged over the time period, suggesting that differences between the cohorts were not likely an explanatory factor. CONCLUSIONS The findings indicate that CCCs tend to become more stringent or maintain consistency in their ratings of beginning residents over time. One explanation for these results is that CCCs may become increasingly comfortable in assigning lower ratings when appropriate. This finding is consistent with an increase in confidence with the milestone rating process and the quality of feedback it provides.
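The analysis described here, program-level average ratings regressed onto time with a random coefficient multilevel model, could be sketched as follows with statsmodels. The data below are synthetic and every variable name is an assumption; this is not the authors' analysis code.

```python
# Illustrative random coefficient (random intercept + slope) model
# on synthetic program-level milestone data; not the study's code.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n_programs, n_periods = 150, 6  # e.g., semi-annual reports, 2013-2016

# Simulate program-level average ratings for one subcompetency,
# drifting slightly downward over time as the study found.
rows = []
for p in range(n_programs):
    intercept = rng.normal(3.0, 0.3)
    slope = rng.normal(-0.02, 0.01)
    for t in range(n_periods):
        rows.append({"program": p, "time": t,
                     "avg_rating": intercept + slope * t + rng.normal(0, 0.1)})
df = pd.DataFrame(rows)

# Rating regressed on time, with a random intercept and a random
# slope for time at the program level.
model = smf.mixedlm("avg_rating ~ time", df, groups="program",
                    re_formula="~time")
result = model.fit()
print(result.summary())  # the fixed effect for 'time' tests the trend
```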
Affiliation(s)
- Stanley J. Hamstra: vice president, Milestones Research and Evaluation, Accreditation Council for Graduate Medical Education, Chicago, Illinois; adjunct professor, Faculty of Education, University of Ottawa, Ottawa, Ontario, Canada; adjunct professor, Department of Medical Education, Feinberg School of Medicine, Northwestern University, Chicago, Illinois; ORCID: https://orcid.org/0000-0002-0680-366X
- Kenji Yamazaki: senior analyst, Milestones Research and Evaluation, Accreditation Council for Graduate Medical Education, Chicago, Illinois
- Melissa A. Barton: director of medical affairs, American Board of Emergency Medicine, East Lansing, Michigan
- Sally A. Santen: professor and senior associate dean, Virginia Commonwealth University School of Medicine, Richmond, Virginia
- Michael S. Beeson: director, American Board of Emergency Medicine, East Lansing, Michigan; professor, Department of Emergency Medicine, Northeast Ohio Medical University, Rootstown, Ohio; program director, Department of Emergency Medicine, Summa Health, Akron, Ohio
- Eric S. Holmboe: senior vice president, Milestone Development and Evaluation, Accreditation Council for Graduate Medical Education, Chicago, Illinois
62
Schuwirth LW, van der Vleuten CP. How ‘Testing’ Has Become ‘Programmatic Assessment for Learning’. Health Prof Educ 2019. [DOI: 10.1016/j.hpe.2018.06.005]
63
Milestone Implementation's Impact on Narrative Comments and Perception of Feedback for Internal Medicine Residents: a Mixed Methods Study. J Gen Intern Med 2019;34:929-935. [PMID: 30891692 PMCID: PMC6544770 DOI: 10.1007/s11606-019-04946-3]
Abstract
BACKGROUND Feedback is a critical element of graduate medical education. Narrative comments on evaluation forms are a source of feedback for residents. As a shared mental model for performance, milestone-based evaluations may impact narrative comments and resident perception of feedback. OBJECTIVE To determine if milestone-based evaluations impacted the quality of faculty members' narrative comments on evaluations and, as an extension, residents' perception of feedback. DESIGN Concurrent mixed methods study, including qualitative analysis of narrative comments and survey of resident perception of feedback. PARTICIPANTS Seventy internal medicine residents and their faculty evaluators at the University of Utah. APPROACH Faculty narrative comments from 248 evaluations pre- and post-milestone implementation were analyzed for quality and Accreditation Council for Graduate Medical Education competency by area of strength and area for improvement. Seventy residents were surveyed regarding quality of feedback pre- and post-milestone implementation. KEY RESULTS Qualitative analysis of narrative comments revealed nearly all evaluations pre- and post-milestone implementation included comments about areas of strength but were frequently vague and not related to competencies. Few evaluations included narrative comments on areas for improvement, but these were of higher quality compared to areas of strength (p < 0.001). Overall resident perception of quality of narrative comments was low and did not change following milestone implementation (p = 0.562) for the 86% of residents (N = 60/70) who completed the pre- and post-surveys. CONCLUSIONS The quality of narrative comments was poor, and there was no evidence of improved quality following introduction of milestone-based evaluations. Comments on areas for improvement were of higher quality than areas of strength, suggesting an area for targeted intervention. Residents' perception of feedback quality did not change following implementation of milestone-based evaluations, suggesting that in the post-milestone era, internal medicine educators need to utilize additional interventions to improve quality of feedback.
64
Ramani S, Könings KD, Ginsburg S, van der Vleuten CPM. Twelve tips to promote a feedback culture with a growth mind-set: Swinging the feedback pendulum from recipes to relationships. Med Teach 2019;41:625-631. [PMID: 29411668 DOI: 10.1080/0142159x.2018.1432850]
Abstract
Feedback in medical education has traditionally showcased the techniques and skills of giving feedback, and models used in staff development have focused on feedback providers (teachers), not receivers (learners). More recent definitions have questioned this approach, arguing that the impact of feedback lies in learner acceptance and assimilation of feedback, with improvement in practice and professional growth. Over the last decade, research findings have emphasized that feedback conversations are complex interpersonal interactions influenced by a multitude of sociocultural factors. However, feedback culture is a concept that is challenging to define; thus, strategies to enhance culture are difficult to pin down. In this twelve tips paper, we have attempted to define elements that constitute a feedback culture from four different perspectives and describe distinct strategies that can be used to foster a learning culture with a growth mind-set.
Affiliation(s)
- Subha Ramani: Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Karen D Könings: Department of Educational Development and Research, School of Health Professions Education, Faculty of Health, Medicine and Life Sciences, Maastricht University, Maastricht, the Netherlands
- Shiphra Ginsburg: Department of Medicine, University of Toronto, Toronto, Canada; Wilson Centre for Research in Education, Faculty of Medicine, University of Toronto, Toronto, Canada
- Cees P M van der Vleuten: Department of Educational Development and Research, School of Health Professions Education, Faculty of Health, Medicine and Life Sciences, Maastricht University, Maastricht, the Netherlands
65
Wilby KJ, Govaerts MJB, Dolmans DHJM, Austin Z, van der Vleuten C. Reliability of narrative assessment data on communication skills in a summative OSCE. Patient Educ Couns 2019;102:1164-1169. [PMID: 30711383 DOI: 10.1016/j.pec.2019.01.018]
Abstract
OBJECTIVE To quantitatively estimate the reliability of narrative assessment data regarding student communication skills obtained from a summative OSCE and to compare reliability to that of communication scores obtained from direct observation. METHODS Narrative comments and communication scores (scale 1-5) were obtained for 14 graduating pharmacy students across 6 summative OSCE stations with 2 assessors per station who directly observed student performance. Two assessors who had not observed the OSCE reviewed narratives and independently scored communication skills according to the same 5-point scale. Generalizability theory was used to estimate reliability. Correlation was used to evaluate the relationship between scores from each assessment method. RESULTS A total of 168 narratives and communication scores were obtained. The G-coefficients were 0.571 for scores provided by assessors present during the OSCE and 0.612 for scores from assessors who provided scores based on narratives only. Correlation between the two sets of scores was 0.5. CONCLUSION Reliability of communication scores is not dependent on whether assessors directly observe student performance or assess written narratives, yet both conditions appear to measure communication skills somewhat differently. PRACTICE IMPLICATIONS Narratives may be useful for summative decision-making and help overcome the current limitations of using solely quantitative scores.
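As a rough illustration of how such G-coefficients are estimated, the sketch below uses a simplified one-facet, fully crossed design (students x stations) on synthetic data. The study's actual design also includes an assessor facet, and all numbers and variable names here are invented assumptions.

```python
# Simplified one-facet generalizability sketch on synthetic data;
# not the study's design or code.
import numpy as np

rng = np.random.default_rng(2)
n_students, n_stations = 14, 6

# Synthetic station-level communication scores on a 1-5 scale.
student_effect = rng.normal(0.0, 0.5, (n_students, 1))
scores = np.clip(
    3.5 + student_effect + rng.normal(0.0, 0.6, (n_students, n_stations)),
    1, 5)

# ANOVA mean squares for the fully crossed students x stations design.
grand = scores.mean()
ms_students = n_stations * ((scores.mean(axis=1) - grand) ** 2).sum() / (n_students - 1)
ms_stations = n_students * ((scores.mean(axis=0) - grand) ** 2).sum() / (n_stations - 1)
resid = (scores - scores.mean(axis=1, keepdims=True)
         - scores.mean(axis=0, keepdims=True) + grand)
ms_resid = (resid ** 2).sum() / ((n_students - 1) * (n_stations - 1))

# Variance components (negative estimates truncated to zero).
var_students = max((ms_students - ms_resid) / n_stations, 0.0)
var_stations = max((ms_stations - ms_resid) / n_students, 0.0)
var_resid = ms_resid

# Relative G-coefficient for a decision based on the mean over stations.
g = var_students / (var_students + var_resid / n_stations)
print(f"variance components: students={var_students:.3f}, "
      f"stations={var_stations:.3f}, residual={var_resid:.3f}")
print(f"relative G-coefficient: {g:.3f}")
```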
Affiliation(s)
- Kyle John Wilby: College of Pharmacy, Qatar University, PO Box 2713, Doha, Qatar
- Marjan J B Govaerts: School of Health Professions Education, Faculty of Health, Medicine and Life Sciences, Maastricht University, Universiteitssingel 60, 6229 ER Maastricht, Netherlands
- Diana H J M Dolmans: School of Health Professions Education, Faculty of Health, Medicine and Life Sciences, Maastricht University, Universiteitssingel 60, 6229 ER Maastricht, Netherlands
- Zubin Austin: Leslie Dan Faculty of Pharmacy, University of Toronto, 144 College St., Toronto ON, M5S 3M2, Canada
- Cees van der Vleuten: School of Health Professions Education, Faculty of Health, Medicine and Life Sciences, Maastricht University, Universiteitssingel 60, 6229 ER Maastricht, Netherlands
66
Wilby KJ, Govaerts M, Austin Z, Dolmans D. Discriminating Features of Narrative Evaluations of Communication Skills During an OSCE. Teach Learn Med 2019;31:298-306. [PMID: 30755046 DOI: 10.1080/10401334.2018.1529570]
Abstract
Construct: The authors examined the use of narrative comments for the evaluation of student communication skills in a standardized, summative assessment (Objective Structured Clinical Examination [OSCE]). Background: The use of narrative evaluations in workplace settings is gaining credibility as an assessment tool, but it is unknown how assessors convey judgments using narratives in high-stakes standardized assessments. The aim of this study was to explore constructs (i.e., performance dimensions), as well as linguistic strategies, that assessors use to distinguish between poor and good students when writing narrative assessment comments on communication skills during an OSCE. Approach: Eighteen assessors from Qatar University were recruited to write narrative assessment comments on communication skills for 14 students completing a summative OSCE. Assessors scored overall communication performance on a 5-point scale. Narrative evaluations for the 2 top- and 2 bottom-performing students for each station (based on communication scores) were analyzed for linguistic strategies and constructs that informed assessment decisions. Results: Seventy-two narrative evaluations with 662 comments were analyzed. Most comments (77%) were written without the use of politeness strategies. A further 22% of comments were hedged, and hedging was more common for poor performers than for good performers (30% vs. 15%). Overarching constructs of confidence, adaptability, patient safety, and professionalism were key dimensions that characterized the narrative evaluations of students' performance. Conclusions: The results contribute to our understanding of the utility of narrative comments for the summative assessment of communication skills. Assessors' comments could be characterized by the constructs of confidence, adaptability, patient safety, and professionalism when distinguishing between levels of student performance. The findings support the notion that judgments are arrived at by clustering sets of behaviors into overarching and meaningful constructs rather than by focusing solely on discrete behaviors. These results call for the development of better-anchored evaluation tools for communication assessment during OSCEs, constructively aligned with assessors' map of the reality of professional practice.
Affiliation(s)
- Marjan Govaerts: Department of Educational Development and Research, Faculty of Health, Medicine, and Life Sciences, Maastricht University, Maastricht, Netherlands
- Zubin Austin: Leslie Dan Faculty of Pharmacy, University of Toronto, Toronto, Canada
- Diana Dolmans: School of Health Professions Education, Faculty of Health, Medicine, and Life Sciences, Maastricht University, Maastricht, Netherlands
67
Frank AK, O'Sullivan P, Mills LM, Muller-Juge V, Hauer KE. Clerkship Grading Committees: the Impact of Group Decision-Making for Clerkship Grading. J Gen Intern Med 2019;34:669-676. [PMID: 30993615 PMCID: PMC6502934 DOI: 10.1007/s11606-019-04879-x]
Abstract
BACKGROUND Faculty and students debate the fairness and accuracy of medical student clerkship grades. Group decision-making is a potential strategy to improve grading. OBJECTIVE To explore how one school's grading committee members integrate assessment data to inform grade decisions and to identify the committees' benefits and challenges. DESIGN This qualitative study used semi-structured interviews with grading committee chairs and members conducted between November 2017 and March 2018. PARTICIPANTS Participants included the eight core clerkship directors, who chaired their grading committees. We randomly selected other committee members to invite, for a maximum of three interviews per clerkship. APPROACH Interviews were recorded, transcribed, and analyzed using inductive content analysis. KEY RESULTS We interviewed 17 committee members. Within and across specialties, committee members had distinct approaches to prioritizing and synthesizing assessment data. Participants expressed concerns about the quality of assessments, necessitating careful scrutiny of language, assessor identity, and other contextual factors. Committee members were concerned about how unconscious bias might impact assessors, but they felt minimally impacted at the committee level. When committee members knew students personally, they felt tension about how to use the information appropriately. Participants described high agreement within their committees; debate was more common when site directors reviewed students' files from other sites prior to meeting. Participants reported multiple committee benefits including faculty development and fulfillment, as well as improved grading consistency, fairness, and transparency. Groupthink and a passive approach to bias emerged as the two main threats to optimal group decision-making. CONCLUSIONS Grading committee members view their practices as advantageous over individual grading, but they feel limited in their ability to address grading fairness and accuracy. Recommendations and support may help committees broaden their scope to address these aspirations.
Affiliation(s)
- Annabel K Frank: Department of Medicine, University of California, San Francisco, San Francisco, CA, USA; Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Patricia O'Sullivan, Lynnea M Mills, Virginie Muller-Juge, Karen E Hauer: Department of Medicine, University of California, San Francisco, San Francisco, CA, USA
68
Valentine N, Schuwirth L. Identifying the narrative used by educators in articulating judgement of performance. Perspect Med Educ 2019;8:83-89. [PMID: 30915715 PMCID: PMC6468036 DOI: 10.1007/s40037-019-0500-y]
Abstract
INTRODUCTION Modern assessment in medical education is increasingly reliant on human judgement, as it is clear that quantitative scales have limitations in fully assessing registrars' development of competence and providing them with meaningful feedback to assist learning. For this, possession of an expert vocabulary is essential. AIM This study aims to explore how medical education experts voice their subjective judgements about learners and to what extent they use clear, information-rich terminology (high-level semantic qualifiers), and to gain a better understanding of the language experts use in these subjective judgements. METHODS Six experienced medical educators from urban and rural environments were purposefully selected. Each educator reviewed a registrar clinical case analysis in a think-aloud manner. The transcribed data were analyzed, codes were identified and ordered into themes, and analysis continued until saturation was reached. RESULTS Five themes with subthemes emerged. The main themes were: (1) Demonstration of expertise; (2) Personal credibility; (3) Professional credibility; (4) Using a predefined structure; and (5) Relevance. DISCUSSION Analogous to what experienced clinicians do in clinical reasoning, experienced medical educators verbalize their judgements using high-level semantic qualifiers. In this study, we were able to unpack these. Although there may be individual variability in the exact words used, clear themes emerged. These findings can be used to develop a helpful shared narrative for educators in observation-based assessment. The provision of a rich, detailed narrative will also assist in providing clarity in registrar feedback, with areas of weakness clearly articulated to improve learning and remediation.
69
Hauer KE, Lucey CR. Core Clerkship Grading: The Illusion of Objectivity. Acad Med 2019;94:469-472. [PMID: 30113359 DOI: 10.1097/acm.0000000000002413]
Abstract
Core clerkship grading creates multiple challenges that produce high stress for medical students, interfere with learning, and create inequitable learning environments. Students and faculty alike succumb to the illusion of objectivity: that quantitative ratings converted to grades convey accurate measures of the complexity of clinical performance. Clerkship grading is the first high-stakes assessment within medical school and occurs just as students are newly immersed full-time in an environment in which patient care supersedes their needs as learners. Students earning high marks situate themselves to earn entry into competitive residency programs and selective specialties. However, there is no commonly accepted standard for how to assign clerkship grades, and the process is vulnerable to imprecision and bias. Rewarding learners for the speed with which they adapt inherently favors students who bring advantages acquired before medical school and discounts the goal of all learners achieving competence. The authors propose that, rather than focusing on assigning core clerkship grades, assessment of student performance should incorporate expert judgment of learning progress. Competency-based medical education is predicated on the articulation of stepwise expectations for learners, with the support and time allocated for each learner to meet those expectations. Concurrently, students should ideally review their own performance data with coaches to self-assess areas of relative strength and areas for further growth. Eliminating grades in favor of competency-based assessment for learning holds promise to engage learners in developing essential patient care and teamwork skills and to foster their development of lifelong learning habits.
Affiliation(s)
- Karen E Hauer: associate dean for assessment and professor, Department of Medicine, University of California, San Francisco, San Francisco, California; ORCID: https://orcid.org/0000-0002-8812-4045
- C R Lucey: vice dean for education and professor, Department of Medicine, University of California, San Francisco, San Francisco, California
70
Ali M, Pawluk SA, Rainkie DC, Wilby KJ. Pass-Fail Decisions for Borderline Performers After a Summative Objective Structured Clinical Examination. Am J Pharm Educ 2019;83:6849. [PMID: 30962642 PMCID: PMC6448521 DOI: 10.5688/ajpe6849]
Abstract
Objective. To determine what expert assessors value when making pass-fail decisions regarding pharmacy students based on summative data from objective structured clinical examinations (OSCEs), and to determine the reliability of these judgments across multiple assessors. Methods. All assessment data from 10 exit-from-degree OSCE stations for seven borderline pharmacy students (determined by standard-setting methods) and one control were given to three of eight assessors for review. Assessors made an overall pass-fail decision based on their perception of graduate competency and were interviewed to determine their decision-making rationale. Intraclass correlation coefficients were used to calculate the reliability of assessor judgments. Results. Expert consensus was achieved for three of the eight students; however, the assessors' decisions did not align with standard-setting results. The reliability of assessors' decisions was poor. Assessors focused on the ability to make correct recommendations rather than on gathering information or providing follow-up advice. Global evaluations (including a student's communication skills) rarely influenced the assessors' decision-making. Conclusion. When making pass-fail decisions for borderline students, assessors focus on evaluating the same competencies but differ in their expected performance levels for those competencies. Pass-fail decisions are primarily based on task-focused components rather than global components (eg, communication skills), even though global components are weighted equally for scoring purposes.
Affiliation(s)
- Mayar Ali: College of Pharmacy, Qatar University, Doha, Qatar
- Kyle John Wilby: College of Pharmacy, Qatar University, Doha, Qatar; School of Pharmacy, University of Otago, New Zealand
71
Lefebvre C, Hiestand B, Glass C, Masneri D, Hosmer K, Hunt M, Hartman N. Examining the Effects of Narrative Commentary on Evaluators’ Summative Assessments of Resident Performance. Eval Health Prof 2018;43:159-161. [DOI: 10.1177/0163278718820415]
Abstract
Anchor-based, end-of-shift ratings are commonly used to conduct performance assessments of resident physicians. These performance evaluations often include narrative assessments, such as solicited or “free-text” commentary. Although narrative commentary can help to create a more detailed and specific assessment of performance, there are limited data describing the effects of narrative commentary on the global assessment process. This single-group, observational study examined the effect of narrative comments on global performance assessments. A subgroup of the clinical competency committee, blinded to resident identity, assigned a single, consensus-based performance score (1–6) to each resident based solely on end-of-shift milestone scores. De-identified narrative comments from end-of-shift evaluations were then included and the process was repeated. We compared milestone-only scores to milestone-plus-narrative-commentary scores using a nonparametric sign test. During the study period, 953 end-of-shift evaluations were submitted on 41 residents. Of these, 535 evaluations included free-text narrative comments. In 17 of the 41 observations, performance scores changed after the addition of narrative comments. In two cases, scores decreased with the addition of free-text commentary; in 15 cases, scores increased. The frequency of net positive change was significant (p = .0023). The addition of narrative commentary to anchor-based ratings significantly influenced the global performance assessment of Emergency Medicine residents by a committee of educators. Descriptive commentary collected at the end of shift may inform more meaningful appraisal of a resident’s progress in a milestone-based paradigm. The authors recommend clinical training programs collect unstructured narrative impressions of residents’ performance from supervising faculty.
Affiliation(s)
- Cedric Lefebvre, Brian Hiestand, Casey Glass, David Masneri, Kathleen Hosmer, Meagan Hunt, Nicholas Hartman: Department of Emergency Medicine, Wake Forest School of Medicine, Winston-Salem, NC, USA
72
What do quantitative ratings and qualitative comments tell us about general surgery residents' progress toward independent practice? Evidence from a 5-year longitudinal cohort. Am J Surg 2018;217:288-295. [PMID: 30309619 DOI: 10.1016/j.amjsurg.2018.09.031]
Abstract
BACKGROUND This study examines the alignment of quantitative and qualitative assessment data in end-of-rotation evaluations using longitudinal cohorts of residents progressing through the five-year general surgery residency. METHODS Rotation evaluation data were extracted for 171 residents who trained between July 2011 and July 2016. Data included 6,069 rotation evaluation forms completed by 38 faculty members and 164 peer residents. Qualitative comments mapped to general surgery milestones were coded for positive/negative feedback and relevance. RESULTS Quantitative evaluation scores were significantly correlated with positive/negative feedback (r = 0.52) and relevance (r = -0.20), p < .001. Themes included feedback on leadership, teaching contribution, medical knowledge, work ethic, patient care, and the ability to work in a team-based setting. Faculty comments focused on technical and clinical abilities; comments from peers focused on professionalism and interpersonal relationships. CONCLUSIONS We found differences in the themes emphasized as residents progressed. These findings underscore the need to improve our understanding of how faculty synthesize assessment data.
73
Pusic MV, Santen SA, Dekhtyar M, Poncelet AN, Roberts NK, Wilson-Delfosse AL, Cutrer WB. Learning to balance efficiency and innovation for optimal adaptive expertise. Med Teach 2018;40:820-827. [PMID: 30091659 DOI: 10.1080/0142159x.2018.1485887]
Abstract
It is critical for health professionals to continue to learn, and this must be supported by health professions education (HPE). Adaptive expert clinicians are not only expert in their work but have the additional capacity to learn and improve in their practices. The authors review a selective aspect of learning to become an adaptive expert: the capacity to optimally balance routine approaches that maximize efficiency with innovative ones, where energy and resources are used to customize actions for novel or difficult situations. Optimal transfer of learning, and hence the design of instruction, differs depending on whether the goal is efficient or innovative practice. However, the task is necessarily further complicated when the aspiration is an adaptive expert practitioner who can fluidly balance innovation with efficiency as the situation requires. Using HPE examples at both the individual and organizational level, the authors explore the instructional implications of learning to shift from efficient to innovative expert functioning, and back. They argue that the efficiency-innovation tension is likely to endure deep into the future and therefore warrants important consideration in HPE.
Affiliation(s)
- Sally A Santen
- Department of Medicine, Virginia Commonwealth University, Richmond, VA, USA
- Ann N Poncelet
- Department of Neurology, University of California San Francisco, San Francisco, CA, USA
- Nicole K Roberts
- Department of Medical Education, City University of New York, New York, NY, USA
- Amy L Wilson-Delfosse
- Department of Pharmacology, Case Western Reserve University, Cleveland, OH, USA
- William B Cutrer
- Department of Pediatrics, Vanderbilt University School of Medicine, Nashville, TN, USA
74
Baines R, Regan de Bere S, Stevens S, Read J, Marshall M, Lalani M, Bryce M, Archer J. The impact of patient feedback on the medical performance of qualified doctors: a systematic review. BMC MEDICAL EDUCATION 2018; 18:173. [PMID: 30064413 PMCID: PMC6069829 DOI: 10.1186/s12909-018-1277-0] [Citation(s) in RCA: 38] [Impact Index Per Article: 6.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/27/2017] [Accepted: 07/11/2018] [Indexed: 05/21/2023]
Abstract
BACKGROUND Patient feedback is considered integral to quality improvement and professional development. However, while popular across the educational continuum, the evidence supporting its efficacy in facilitating positive behaviour change in a postgraduate setting remains unclear. This review therefore aims to explore the evidence that supports, or refutes, the impact of patient feedback on the medical performance of qualified doctors. METHODS The electronic databases PubMed, EMBASE, Medline and PsycINFO were systematically searched for studies assessing the impact of patient feedback on medical performance published in English between 2006 and 2016. Impact was defined as a measured change in behaviour using Barr's (2000) adaptation of Kirkpatrick's four-level evaluation model. Papers were quality appraised, thematically analysed and synthesised using a narrative approach. RESULTS From 1,269 initial studies, 20 articles were included (qualitative (n=8); observational (n=6); systematic review (n=3); mixed methodology (n=1); randomised controlled trial (n=1); and longitudinal (n=1) design). One article identified change at an organisational level (Kirkpatrick level 4); six reported a measured change in behaviour (Kirkpatrick level 3b); 12 identified self-reported change or intention to change (Kirkpatrick level 3a); and one identified knowledge or skill acquisition (Kirkpatrick level 2). No study identified a change at the highest level: an improvement in the health and wellbeing of patients. The main factors found to influence the impact of patient feedback were: specificity; perceived credibility; congruence with physician self-perceptions and performance expectations; presence of facilitation and reflection; and inclusion of narrative comments. The quality of feedback facilitation and local professional cultures also appeared integral to positive behaviour change. CONCLUSION Patient feedback can have an impact on medical performance, but actionable change is influenced by several contextual factors and cannot simply be guaranteed. Patient feedback is likely to be more influential if it is specific, collected through credible methods and contains narrative information. Data obtained should be fed back in a way that facilitates reflective discussion and encourages the formulation of actionable behaviour change. A supportive cultural understanding of patient feedback and its intended purpose is also essential for its effective use.
Affiliation(s)
- Rebecca Baines
- Collaboration for the Advancement of Medical Education Research & Assessment (CAMERA), Faculty of Medicine and Dentistry, University of Plymouth, Drake Circus, Plymouth, PL4 8AA, UK
- Sam Regan de Bere
- Collaboration for the Advancement of Medical Education Research & Assessment (CAMERA), Faculty of Medicine and Dentistry, University of Plymouth, Drake Circus, Plymouth, PL4 8AA, UK
- Sebastian Stevens
- Collaboration for the Advancement of Medical Education Research & Assessment (CAMERA), Faculty of Medicine and Dentistry, University of Plymouth, Drake Circus, Plymouth, PL4 8AA, UK
- Jamie Read
- Collaboration for the Advancement of Medical Education Research & Assessment (CAMERA), Faculty of Medicine and Dentistry, University of Plymouth, Drake Circus, Plymouth, PL4 8AA, UK
- Martin Marshall
- Improvement Science London, University College London, London, UK
- Mirza Lalani
- Research Department of Primary Care and Population Health, University College London, London, UK
- Marie Bryce
- Collaboration for the Advancement of Medical Education Research & Assessment (CAMERA), Faculty of Medicine and Dentistry, University of Plymouth, Drake Circus, Plymouth, PL4 8AA, UK
- Julian Archer
- Collaboration for the Advancement of Medical Education Research & Assessment (CAMERA), Faculty of Medicine and Dentistry, University of Plymouth, Drake Circus, Plymouth, PL4 8AA, UK
75
Eva KW. Cognitive Influences on Complex Performance Assessment: Lessons from the Interplay between Medicine and Psychology. JOURNAL OF APPLIED RESEARCH IN MEMORY AND COGNITION 2018. [DOI: 10.1016/j.jarmac.2018.03.008] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/14/2022]
76
Franzen D, Cooney R, Chan T, Brown M, Diercks DB. Scholarship by the Clinician-Educator in Emergency Medicine. AEM EDUCATION AND TRAINING 2018; 2:115-120. [PMID: 30051078 PMCID: PMC6001503 DOI: 10.1002/aet2.10084] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/01/2017] [Revised: 12/28/2017] [Accepted: 01/23/2018] [Indexed: 05/25/2023]
Abstract
Emergency medicine (EM) continues to grow as an academic specialty. As in most specialties, many academic emergency physicians focus on the education of graduate learners. For promotion, clinician-educators (CEs) are required to produce scholarly work and disseminate knowledge. Although promotion requirements may vary by institution, scholarly work is a consistent requirement. Because of the clinical constraints of working in the emergency department, the unique interactions emergency physicians have with their learners, and early adoption of alternative teaching methods, EM CEs' scholarly work may not be adequately described in a traditional curriculum vitae. Using a rubric of established domains around the academic work of CEs, this article describes some of the ways EM educators address these domains. The aim of the article is to provide a guide for academic department leadership, CEs, and promotion committees about the unique ways EM has addressed the work of the CE.
Affiliation(s)
- Douglas Franzen
- Division of Emergency Medicine, University of Washington, Seattle, WA
- Robert Cooney
- Department of Emergency Medicine, Geisinger Medical Center, Danville, PA
- Teresa Chan
- Division of Emergency Medicine, Department of Medicine, McMaster University, Hamilton, Ontario, Canada
- Michael Brown
- Department of Emergency Medicine, Michigan State University College of Human Medicine, Grand Rapids, MI
- Deborah B. Diercks
- Department of Emergency Medicine, University of Texas Southwestern, Dallas, TX
77
Chan T, Sebok‐Syer S, Thoma B, Wise A, Sherbino J, Pusic M. Learning Analytics in Medical Education Assessment: The Past, the Present, and the Future. AEM EDUCATION AND TRAINING 2018; 2:178-187. [PMID: 30051086 PMCID: PMC6001721 DOI: 10.1002/aet2.10087] [Citation(s) in RCA: 54] [Impact Index Per Article: 9.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/29/2018] [Accepted: 01/30/2018] [Indexed: 05/09/2023]
Abstract
With the implementation of competency-based medical education (CBME) in emergency medicine, residency programs will amass substantial amounts of qualitative and quantitative data about trainees' performances. This increased volume of data will challenge traditional processes for assessing trainees and remediating training deficiencies. At the intersection of trainee performance data and statistical modeling lies the field of medical learning analytics. At a local training program level, learning analytics has the potential to assist program directors and competency committees with interpreting assessment data to inform decision making. On a broader level, learning analytics can be used to explore system questions and identify problems that may impact our educational programs. Scholars outside of health professions education have been exploring the use of learning analytics for years and their theories and applications have the potential to inform our implementation of CBME. The purpose of this review is to characterize the methodologies of learning analytics and explore their potential to guide new forms of assessment within medical education.
Affiliation(s)
- Teresa Chan
- McMaster Program for Education Research, Innovation, and Theory (MERIT), Hamilton, Ontario, Canada
- Stefanie Sebok-Syer
- Centre for Education Research & Innovation, Schulich School of Medicine and Dentistry, London, Ontario, Canada
- Brent Thoma
- Department of Emergency Medicine, University of Saskatchewan, Saskatoon, Saskatchewan, Canada
- Alyssa Wise
- Steinhardt School of Culture, Education, and Human Development, New York University, New York, NY
- Jonathan Sherbino
- Faculty of Health Science, Division of Emergency Medicine, Department of Medicine, McMaster University, Hamilton, Ontario, Canada
- McMaster Program for Education Research, Innovation, and Theory (MERIT), Hamilton, Ontario, Canada
- Martin Pusic
- Department of Emergency Medicine, NYU School of Medicine, New York, NY
78
Lockyer JM, Sargeant J, Richards SH, Campbell JL, Rivera LA. Multisource Feedback and Narrative Comments: Polarity, Specificity, Actionability, and CanMEDS Roles. THE JOURNAL OF CONTINUING EDUCATION IN THE HEALTH PROFESSIONS 2018; 38:32-40. [PMID: 29329147 DOI: 10.1097/ceh.0000000000000183] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/07/2023]
Abstract
INTRODUCTION Multisource feedback is a questionnaire-based assessment tool that provides physicians with data about workplace behaviors and may combine numeric scores with narrative (free-text) comments. Little attention has been paid to the wording of requests for comments, potentially limiting the tool's utility to support physician performance. This study tested the phrasing of two different sets of questions. METHODS Two sets of questions were tested with family physicians, medical and surgical specialists, and their medical colleague and coworker respondents. Set 1 asked respondents to identify one thing the participant physician does well and one thing the physician could target for action. Set 2 asked what the physician does well and what the physician might do to enhance practice. The resulting free-text comments were coded for polarity (positive, neutral, or negative), specificity (precision and detail), actionability (ability to use the feedback to direct future activity), and CanMEDS roles (competencies), and analyzed descriptively. RESULTS Data for 222 physicians (111 per set) were analyzed. A total of 1824 comments (8.2/physician) were submitted, with more comments from coworkers than from medical colleagues. Set 1 yielded more comments, and its comments were more likely to be positive, semi-specific, and very actionable than those from set 2. However, set 2 generated more very specific comments. Comments covered all CanMEDS roles, with more comments for the collaborator and leader roles. DISCUSSION The wording of questions inviting free-text responses influences the volume and nature of the comments provided. Individuals designing multisource feedback tools should carefully consider the wording of items soliciting narrative responses.
Affiliation(s)
- Jocelyn M Lockyer
- Dr. Lockyer: Professor, Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada. Dr. Sargeant: Professor, Medical Education, Dalhousie University, Halifax, Nova Scotia, Canada. Dr. Richards: Academic Unit of Primary Care, Leeds Institute of Health Sciences, University of Leeds, Leeds, United Kingdom. Dr. Campbell: Professor of General Practice and Primary Care and Director, University of Exeter Collaboration for Academic Primary Care (APEx), University of Exeter Medical School, Exeter, United Kingdom. Ms. Rivera: Research Associate, Office of Continuing Medical Education and Professional Development, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
79
Tekian A, Watling CJ, Roberts TE, Steinert Y, Norcini J. Qualitative and quantitative feedback in the context of competency-based education. MEDICAL TEACHER 2017; 39:1245-1249. [PMID: 28927332 DOI: 10.1080/0142159x.2017.1372564] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/07/2023]
Abstract
Research indicates the importance and usefulness of feedback, yet with the shift of medical curricula toward competencies, feedback is not well understood in this context. This paper attempts to identify how feedback fits within a competency-based curriculum. After careful consideration of the literature, the following conclusions are drawn: (1) Because feedback is predicated on assessment, the assessment should be designed to optimize and prevent inaccuracies in feedback; (2) Giving qualitative feedback in the form of a conversation would lend credibility to the feedback, address emotional obstacles and create a context in which feedback is comfortable; (3) Quantitative feedback in the form of individualized data could fulfill the demand for more feedback, help students devise strategies on how to improve, allow students to compare themselves to their peers, recognizing that big data have limitations; and (4) Faculty development needs to incorporate and promote cultural and systems changes with regard to feedback. A better understanding of the role of feedback in competency-based education could result in more efficient learning for students.
Affiliation(s)
- Ara Tekian
- Department of Medical Education, College of Medicine, University of Illinois, Chicago, IL, USA
- Christopher J Watling
- Schulich School of Medicine and Dentistry, Western University, London, ON, Canada
- Trudie E Roberts
- Medical Education Unit, Leeds Institute of Medical Education, Leeds, UK
- Yvonne Steinert
- Center for Medical Education, McGill University, Montreal, QC, Canada
- John Norcini
- Foundation for Advancement of International Medical Education and Research, Philadelphia, PA, USA
80
Cheung WJ, Dudek NL, Wood TJ, Frank JR. Supervisor-trainee continuity and the quality of work-based assessments. MEDICAL EDUCATION 2017; 51:1260-1268. [PMID: 28971502 DOI: 10.1111/medu.13415] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/28/2017] [Revised: 05/30/2017] [Accepted: 07/11/2017] [Indexed: 05/12/2023]
Abstract
CONTEXT Work-based assessments (WBAs) represent an increasingly important means of reporting expert judgements of trainee competence in clinical practice. However, the quality of WBAs completed by clinical supervisors is of concern. The episodic and fragmented interaction that often occurs between supervisors and trainees has been proposed as a barrier to the completion of high-quality WBAs. OBJECTIVES The primary purpose of this study was to determine the effect of supervisor-trainee continuity on the quality of assessments documented on daily encounter cards (DECs), a common form of WBA. The relationship between trainee performance and DEC quality was also examined. METHODS Daily encounter cards representing three differing degrees of supervisor-trainee continuity (low, intermediate, high) were scored by two raters using the Completed Clinical Evaluation Report Rating (CCERR), a previously published nine-item quantitative measure of DEC quality. An analysis of variance (ANOVA) was performed to compare mean CCERR scores among the three groups. Linear regression analysis was conducted to examine the relationship between resident performance and DEC quality. RESULTS Differences in mean CCERR scores were observed between the three continuity groups (p = 0.02); however, the magnitude of the absolute differences was small (partial η² = 0.03) and not educationally meaningful. Linear regression analysis demonstrated a significant inverse relationship between resident performance and CCERR score (p < 0.001, r² = 0.18). This inverse relationship was observed in both groups representing on-service residents (p = 0.001, r² = 0.25; p = 0.04, r² = 0.19), but not in the off-service group (p = 0.62, r² = 0.05). CONCLUSIONS Supervisor-trainee continuity did not have an educationally meaningful influence on the quality of assessments documented on DECs. However, resident performance was found to affect assessor behaviours in the on-service group, whereas DEC quality remained poor regardless of performance in the off-service group. The findings suggest that greater attention should be given to determining ways of improving the quality of assessments reported for off-service residents, as well as for those residents demonstrating appropriate clinical competence progression.
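As a rough illustration of the analyses named above, the sketch below runs a one-way ANOVA across three continuity groups and a simple linear regression of DEC quality on resident performance. Group means, sample sizes, and variable names are assumptions for the example.

```python
# Minimal sketch: one-way ANOVA across continuity groups plus a linear
# regression of CCERR score on resident performance (synthetic data).
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
low, intermediate, high = (rng.normal(m, 3.0, 60) for m in (20.0, 20.5, 21.0))

f_stat, p_val = stats.f_oneway(low, intermediate, high)  # compare group means
print(f"ANOVA: F = {f_stat:.2f}, p = {p_val:.3f}")

performance = rng.uniform(1, 5, 180)                     # resident performance ratings
ccerr = 25 - 1.5 * performance + rng.normal(0, 3, 180)   # built-in inverse relationship
fit = stats.linregress(performance, ccerr)
print(f"slope = {fit.slope:.2f}, r^2 = {fit.rvalue**2:.2f}, p = {fit.pvalue:.3g}")
```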
Affiliation(s)
- Warren J Cheung
- Department of Emergency Medicine, University of Ottawa, Ottawa, Ontario, Canada
- Nancy L Dudek
- Division of Physical Medicine and Rehabilitation, Department of Medicine, University of Ottawa, Ottawa, Ontario, Canada
- Timothy J Wood
- Department of Innovation in Medical Education, University of Ottawa, Ottawa, Ontario, Canada
- Jason R Frank
- Department of Emergency Medicine, University of Ottawa, Ottawa, Ontario, Canada
- Royal College of Physicians and Surgeons of Canada, Ottawa, Ontario, Canada
81
Sebok-Syer SS, Klinger DA, Sherbino J, Chan TM. Mixed Messages or Miscommunication? Investigating the Relationship Between Assessors' Workplace-Based Assessment Scores and Written Comments. ACADEMIC MEDICINE : JOURNAL OF THE ASSOCIATION OF AMERICAN MEDICAL COLLEGES 2017; 92:1774-1779. [PMID: 28562452 DOI: 10.1097/acm.0000000000001743] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/25/2023]
Abstract
PURPOSE The shift toward broader, programmatic assessment has revolutionized the approaches that many take in assessing medical competence. To understand the association between quantitative and qualitative evaluations, the authors explored the relationships among assessors' checklist scores, task ratings, global ratings, and written comments. METHOD Using regression analyses, the authors analyzed data from the McMaster Modular Assessment Program, collected from emergency medicine residents in their first or second year of postgraduate training from 2012 through 2014. Additionally, using content analysis, the authors analyzed narrative comments corresponding to the "done" and "done, but needs attention" checklist score options. RESULTS The regression analyses revealed that task ratings, provided by faculty assessors, are associated with the use of the "done, but needs attention" checklist score option. Analyses also identified that the "done, but needs attention" option is associated with a narrative comment that is balanced, providing both strengths and areas for improvement. Analysis of the qualitative comments revealed differences in the type of comments provided to higher- and lower-performing residents. CONCLUSIONS This study highlights some of the relationships that exist among checklist scores, rating scales, and written comments. The findings highlight that task ratings are associated with checklist options while global ratings are not. Furthermore, the analysis of written comments supports the notion of a "hidden code" used to communicate assessors' evaluations of medical competence, especially when communicating areas for improvement or concern. This study has implications for how individuals should interpret information obtained from qualitative assessments.
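A minimal sketch of one regression of the kind described, assuming an ordinary least squares model of task rating on an indicator for the "done, but needs attention" option; the data frame and column names are hypothetical.

```python
# Minimal sketch: does use of the "done, but needs attention" option
# predict lower task ratings? (hypothetical data and column names)
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
df = pd.DataFrame({"needs_attention": rng.integers(0, 2, 300)})  # 1 = option used
df["task_rating"] = 5.5 - 0.8 * df["needs_attention"] + rng.normal(0, 0.7, 300)

model = smf.ols("task_rating ~ needs_attention", data=df).fit()
print(model.summary().tables[1])  # coefficient on the checklist option
```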
Affiliation(s)
- Stefanie S Sebok-Syer
- S.S. Sebok-Syer is instructor of education, Queen's University, Kingston, Ontario, Canada. D.A. Klinger is professor of education, Queen's University, Kingston, Ontario, Canada. J. Sherbino is associate professor of medicine, McMaster University, Hamilton, Ontario, Canada. T.M. Chan is assistant professor of medicine, McMaster University, Hamilton, Ontario, Canada; ORCID: http://orcid.org/0000-0001-6104-462
82
Wilbur K. Does faculty development influence the quality of in-training evaluation reports in pharmacy? BMC MEDICAL EDUCATION 2017; 17:222. [PMID: 29157239 PMCID: PMC5697106 DOI: 10.1186/s12909-017-1054-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/12/2017] [Accepted: 11/02/2017] [Indexed: 06/02/2023]
Abstract
BACKGROUND In-training evaluation reports (ITERs) of student workplace-based learning are completed by clinical supervisors across various health disciplines. However, outside of medicine, the quality of submitted workplace-based assessments is largely uninvestigated. This study assessed the quality of ITERs in pharmacy and whether clinical supervisors could be trained to complete higher quality reports. METHODS A random sample of ITERs submitted in a pharmacy program during 2013-2014 was evaluated. These ITERs served as a historical control (control group 1) for comparison with ITERs submitted in 2015-2016 by clinical supervisors who participated in an interactive faculty development workshop (intervention group) and those who did not (control group 2). Two trained independent raters scored the ITERs using a previously validated nine-item scale assessing report quality, the Completed Clinical Evaluation Report Rating (CCERR). The scoring scale for each item is anchored at 1 ("not at all") and 5 ("exemplary"), with 3 categorized as "acceptable". RESULTS Mean CCERR score for reports completed after the workshop (22.9 ± 3.39) did not significantly improve when compared to prospective control group 2 (22.7 ± 3.63, p = 0.84) and were worse than historical control group 1 (37.9 ± 8.21, p = 0.001). Mean item scores for individual CCERR items were below acceptable thresholds for 5 of the 9 domains in control group 1, including supervisor documented evidence of specific examples to clearly explain weaknesses and concrete recommendations for student improvement. Mean item scores for individual CCERR items were below acceptable thresholds for 6 and 7 of the 9 domains in control group 2 and the intervention group, respectively. CONCLUSIONS This study is the first using CCERR to evaluate ITER quality outside of medicine. Findings demonstrate low baseline CCERR scores in a pharmacy program not demonstrably changed by a faculty development workshop, but strategies are identified to augment future rater training.
Affiliation(s)
- Kerry Wilbur
- College of Pharmacy, Qatar University, PO Box 2713, Doha, Qatar.
83
Bartels J, Mooney CJ, Stone RT. Numerical versus narrative: A comparison between methods to measure medical student performance during clinical clerkships. MEDICAL TEACHER 2017; 39:1154-1158. [PMID: 28845738 DOI: 10.1080/0142159x.2017.1368467] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/12/2023]
Abstract
BACKGROUND Medical school evaluations typically rely on both language-based narrative descriptions and psychometrically converted numeric scores to convey performance to the grading committee. We evaluated the inter-rater reliability and correlation of numeric versus narrative evaluations for students on their neurology clerkship. DESIGN/METHODS Fifty neurology clerkship in-training evaluation reports completed by residents and faculty members at the University of Rochester School of Medicine were dissected into narrative and numeric components. Five clerkship grading committee members retrospectively gave new narrative scores (NNS) while blinded to the original numeric scores (ONS). We calculated intra-class correlation coefficients (ICC) and their associated confidence intervals for the ONS and the NNS, and calculated the correlation between the two. RESULTS The ICC was greater for the NNS (ICC = .88, 95% CI = .70-.94) than for the ONS (ICC = .62, 95% CI = .40-.77). The Pearson correlation coefficient showed that the ONS and NNS were highly correlated (r = .81). CONCLUSIONS Narrative evaluations converted by a small group of experienced graders are at least as reliable as numeric scoring by individual evaluators. This could allow evaluators to focus their efforts on creating richer narratives of greater value to trainees.
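The intra-class correlations reported above can be estimated with standard tools. The sketch below uses the pingouin package on synthetic committee ratings; the numbers of students and raters and the noise level are assumptions for the example.

```python
# Minimal sketch: inter-rater reliability (ICC) for committee-assigned
# scores, estimated with pingouin (synthetic, fully crossed data).
import numpy as np
import pandas as pd
import pingouin as pg

rng = np.random.default_rng(3)
n_students, n_raters = 50, 5
ability = rng.normal(0, 1, n_students)
rows = [{"student": s, "rater": r, "score": ability[s] + rng.normal(0, 0.5)}
        for s in range(n_students) for r in range(n_raters)]
df = pd.DataFrame(rows)

icc = pg.intraclass_corr(data=df, targets="student", raters="rater", ratings="score")
print(icc[["Type", "ICC", "CI95%"]])
```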
Affiliation(s)
- Josef Bartels
- Family Medicine, WWAMI Region Practice & Research Network, Boise, ID, USA
- Christopher John Mooney
- Office of Medical Education, University of Rochester School of Medicine and Dentistry, Rochester, NY, USA
- Robert Thompson Stone
- Neurology, University of Rochester School of Medicine and Dentistry, Rochester, NY, USA
84
Ginsburg S, van der Vleuten CPM, Eva KW. The Hidden Value of Narrative Comments for Assessment: A Quantitative Reliability Analysis of Qualitative Data. ACADEMIC MEDICINE : JOURNAL OF THE ASSOCIATION OF AMERICAN MEDICAL COLLEGES 2017; 92:1617-1621. [PMID: 28403004 DOI: 10.1097/acm.0000000000001669] [Citation(s) in RCA: 72] [Impact Index Per Article: 10.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/12/2023]
Abstract
PURPOSE In-training evaluation reports (ITERs) are ubiquitous in internal medicine (IM) residency. Written comments can provide a rich data source, yet are often overlooked. This study determined the reliability of using variable amounts of commentary to discriminate between residents. METHOD ITER comments from two cohorts of PGY-1s in IM at the University of Toronto (graduating 2010 and 2011; n = 46-48) were put into sets containing 15 to 16 residents. Parallel sets were created: one with comments from the full year and one with comments from only the first three assessments. Each set was rank-ordered by four internists external to the program between April 2014 and May 2015 (n = 24). Generalizability analyses and a decision study were performed. RESULTS For the full year of comments, reliability coefficients averaged across four rankers were G = 0.85 and G = 0.91 for the two cohorts. For a single ranker, G = 0.60 and G = 0.73. Using only the first three assessments, reliabilities remained high at G = 0.66 and G = 0.60 for a single ranker. In a decision study, if two internists ranked the first three assessments, reliability would be G = 0.80 and G = 0.75 for the two cohorts. CONCLUSIONS Using written comments to discriminate between residents can be extremely reliable even after only several reports are collected. This suggests a way to identify residents early on who may require attention. These findings contribute evidence to support the validity argument for using qualitative data for assessment.
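The decision study quoted above projects reliability for different numbers of rankers. When raters are the only facet being varied, the projection takes the Spearman-Brown form; the short sketch below reproduces the abstract's figures (single-ranker G of 0.66 and 0.60 for the first three assessments, rising to about 0.80 and 0.75 with two rankers).

```python
# Minimal sketch: decision-study projection of reliability for k rankers,
# i.e., the Spearman-Brown prophecy formula applied to a single-ranker G.
def projected_g(g_single: float, n_rankers: int) -> float:
    """Reliability of the average over n_rankers independent rankers."""
    return n_rankers * g_single / (1 + (n_rankers - 1) * g_single)

for g1 in (0.66, 0.60):        # single-ranker G coefficients from the abstract
    for k in (1, 2, 4):
        print(f"G1 = {g1:.2f}, {k} ranker(s): G = {projected_g(g1, k):.2f}")
```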
Affiliation(s)
- Shiphra Ginsburg
- S. Ginsburg is professor, Department of Medicine, and scientist, Wilson Centre for Research in Education, Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada. C.P.M. van der Vleuten is professor of education, Department of Educational Development and Research, Faculty of Health, Medicine and Life Sciences, Maastricht University, Maastricht, the Netherlands. K.W. Eva is associate director and senior scientist, Centre for Health Education Scholarship, and professor and director of educational research and scholarship, Faculty of Medicine, University of British Columbia, Vancouver, British Columbia, Canada
85
Chang TP, Schrager SM, Rake AJ, Chan MW, Pham PK, Christman G. The effect of multimedia replacing text in resident clinical decision-making assessment. ADVANCES IN HEALTH SCIENCES EDUCATION : THEORY AND PRACTICE 2017; 22:901-914. [PMID: 27752842 DOI: 10.1007/s10459-016-9719-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/04/2016] [Accepted: 10/06/2016] [Indexed: 06/06/2023]
Abstract
Multimedia in assessing clinical decision-making skills (CDMS) has been poorly studied, particularly in comparison to traditional text-based assessments. The literature suggests multimedia is more difficult for trainees. We hypothesize that pediatric residents score lower in diagnostic skill when clinical vignettes use multimedia rather than text for patient findings. A standardized method was developed to write text-based questions from 60 high-resolution, quality multimedia; a series of expert panels selected 40 questions with both a multimedia and text-based counterpart, and two online tests were developed. Each test featured 40 identical questions with reciprocal and alternating modality (multimedia vs. text). Pediatric residents and rising 4th year medical students (MS-IV) at a single residency were randomized to complete either test stratified by postgraduate training year (PGY). A mixed between-within subjects ANOVA analyzed differences in score due to modality and PGY. Secondary analyses ascertained modality effect in dermatology and respiratory questions using Mann-Whitney U tests, and correlations on test performance to In-service Training Exam (ITE) scores using Spearman rank. Eighty-eight residents and rising interns completed the study. Overall multimedia scores were lower than text-based scores (p = 0.047, partial η² = 0.04), with highest disparity in rising interns (MS-IV); however, PGY had a greater effect on scores (p = 0.001, partial η² = 0.16). Respiratory questions were not significantly lower with multimedia (n = 9, median 0.71 vs 0.86, p = 0.09) nor dermatology questions (n = 13, p = 0.41). ITEs correlated significantly with text-based scores (ρ = 0.23-0.25, p = 0.04-0.06) but not with multimedia scores. In physician trainees with less clinical experience, multimedia-based case vignettes are associated with significantly lower scores. These results help shed light on the role of multimedia versus text-based information in CDMS, particularly in less experienced clinicians.
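A minimal sketch of the nonparametric comparisons named above: a Mann-Whitney U test between modalities and a Spearman rank correlation of test scores with ITE scores. All distributions are synthetic; none of the values come from the study.

```python
# Minimal sketch: Mann-Whitney U for a modality difference and Spearman
# correlation with in-service training exam (ITE) scores (synthetic data).
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
text_scores = rng.normal(0.80, 0.10, 44)
multimedia_scores = rng.normal(0.74, 0.10, 44)

u, p = stats.mannwhitneyu(multimedia_scores, text_scores, alternative="two-sided")
print(f"Mann-Whitney U = {u:.0f}, p = {p:.3f}")

ite = 100 * text_scores + rng.normal(0, 5, 44)   # correlated exam scores
rho, p_rho = stats.spearmanr(text_scores, ite)
print(f"Spearman rho = {rho:.2f}, p = {p_rho:.3g}")
```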
Affiliation(s)
- Todd P Chang
- Division of Emergency Medicine and Transport, Children's Hospital Los Angeles and University of Southern California Keck School of Medicine, Los Angeles, CA, USA
- Sheree M Schrager
- Division of Hospital Medicine, Children's Hospital Los Angeles and University of Southern California Keck School of Medicine, Los Angeles, CA, USA
- Alyssa J Rake
- Department of Critical Care and Anesthesiology, Children's Hospital Los Angeles and University of Southern California Keck School of Medicine, Los Angeles, CA, USA
- Michael W Chan
- Division of Emergency Medicine, Ann and Robert H. Lurie Children's Hospital of Chicago and Northwestern University Feinberg School of Medicine, Chicago, IL, USA
- Phung K Pham
- Division of Emergency Medicine and Transport, Children's Hospital Los Angeles and University of Southern California Keck School of Medicine, Los Angeles, CA, USA
- Division of Behavioral and Organizational Sciences, Claremont Graduate University, Claremont, CA, USA
- Grant Christman
- Division of Hospital Medicine, Children's Hospital Los Angeles and University of Southern California Keck School of Medicine, Los Angeles, CA, USA
86
Hatala R, Sawatsky AP, Dudek N, Ginsburg S, Cook DA. Using In-Training Evaluation Report (ITER) Qualitative Comments to Assess Medical Students and Residents: A Systematic Review. ACADEMIC MEDICINE : JOURNAL OF THE ASSOCIATION OF AMERICAN MEDICAL COLLEGES 2017; 92:868-879. [PMID: 28557953 DOI: 10.1097/acm.0000000000001506] [Citation(s) in RCA: 42] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/25/2023]
Abstract
PURPOSE In-training evaluation reports (ITERs) constitute an integral component of medical student and postgraduate physician trainee (resident) assessment. ITER narrative comments have received less attention than the numeric scores. The authors sought both to determine what validity evidence informs the use of narrative comments from ITERs for assessing medical students and residents and to identify evidence gaps. METHOD Reviewers searched for relevant English-language studies in MEDLINE, EMBASE, Scopus, and ERIC (last search June 5, 2015), and in reference lists and author files. They included all original studies that evaluated ITERs for qualitative assessment of medical students and residents. Working in duplicate, they selected articles for inclusion, evaluated quality, and abstracted information on validity evidence using Kane's framework (inferences of scoring, generalization, extrapolation, and implications). RESULTS Of 777 potential articles, 22 met inclusion criteria. The scoring inference is supported by studies showing that rich narratives are possible, that changing the prompt can stimulate more robust narratives, and that comments vary by context. Generalization is supported by studies showing that narratives reach thematic saturation and that analysts make consistent judgments. Extrapolation is supported by favorable relationships between ITER narratives and numeric scores from ITERs and non-ITER performance measures, and by studies confirming that narratives reflect constructs deemed important in clinical work. Evidence supporting implications is scant. CONCLUSIONS The use of ITER narratives for trainee assessment is generally supported, except that evidence is lacking for implications and decisions. Future research should seek to confirm implicit assumptions and evaluate the impact of decisions.
Affiliation(s)
- Rose Hatala
- R. Hatala is associate professor of medicine, Faculty of Medicine, and director, Clinical Educator Fellowship, Centre for Health Education Scholarship, University of British Columbia, Vancouver, British Columbia, Canada. A.P. Sawatsky is assistant professor of medicine and senior associate consultant, Division of General Internal Medicine, Mayo Clinic College of Medicine, Rochester, Minnesota. N. Dudek is associate professor, Faculty of Medicine, University of Ottawa, Ottawa, Ontario, Canada. S. Ginsburg is professor, Department of Medicine, Faculty of Medicine, University of Toronto, scientist, Wilson Centre for Research in Education, University Health Network/University of Toronto, and staff physician, Mount Sinai Hospital, Toronto, Ontario, Canada. D.A. Cook is professor of medicine and medical education, associate director, Mayo Clinic Online Learning, and consultant, Division of General Internal Medicine, Mayo Clinic College of Medicine, Rochester, Minnesota
87
Ramani S, Post SE, Könings K, Mann K, Katz JT, van der Vleuten C. "It's Just Not the Culture": A Qualitative Study Exploring Residents' Perceptions of the Impact of Institutional Culture on Feedback. TEACHING AND LEARNING IN MEDICINE 2017; 29:153-161. [PMID: 28001442 DOI: 10.1080/10401334.2016.1244014] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/12/2023]
Abstract
Phenomenon: Competency-based medical education requires ongoing performance-based feedback for professional growth. In several studies, medical trainees report that the quality of faculty feedback is inadequate. Sociocultural barriers to feedback exchanges are further amplified in graduate and postgraduate medical education settings, where trainees serve as frontline providers of patient care. Factors that affect institutional feedback culture, enhance feedback seeking, acceptance, and bidirectional feedback warrant further exploration in these settings. APPROACH Using a constructivist grounded theory approach, we sought to examine residents' perspectives on institutional factors that affect the quality of feedback, factors that influence receptivity to feedback, and quality and impact of faculty feedback. Four focus group discussions were conducted, with two investigators present at each. One facilitated the discussion, and the other observed the interactions and took field notes. We audiotaped and transcribed the discussions, and performed a thematic analysis. Measures to ensure rigor included thick descriptions, independent coding by two investigators, and attention to reflexivity. FINDINGS We identified five key themes, dominated by resident perceptions regarding the influence of institutional feedback culture. The theme labels are taken from direct participant quotes: (a) the cultural norm lacks clear expectations and messages around feedback, (b) the prevailing culture of niceness does not facilitate honest feedback, (c) bidirectional feedback is not part of the culture, (d) faculty-resident relationships impact credibility and receptivity to feedback, and (e) there is a need to establish a culture of longitudinal professional growth. INSIGHTS Institutional culture could play a key role in influencing the quality, credibility, and acceptability of feedback. A polite culture promotes a positive learning environment but can be a barrier to honest feedback. Feedback initiatives focusing solely on techniques of feedback giving may not enhance meaningful feedback. Further research on factors that promote feedback seeking, receptivity to constructive feedback, and bidirectional feedback would provide valuable insights.
Affiliation(s)
- Subha Ramani
- Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts, USA
- Sarah E Post
- Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts, USA
- Karen Könings
- Education Development and Research, Maastricht University, Maastricht, The Netherlands
- Karen Mann
- Department of Medical Education, Dalhousie University, Halifax, Nova Scotia, Canada
- Joel T Katz
- Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts, USA
- Cees van der Vleuten
- Education Development and Research, Maastricht University, Maastricht, The Netherlands
88
Ginsburg S, van der Vleuten CP, Eva KW, Lingard L. Cracking the code: residents' interpretations of written assessment comments. MEDICAL EDUCATION 2017; 51:401-410. [PMID: 28093833 DOI: 10.1111/medu.13158] [Citation(s) in RCA: 46] [Impact Index Per Article: 6.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/13/2016] [Revised: 02/26/2016] [Accepted: 07/18/2016] [Indexed: 05/09/2023]
Abstract
CONTEXT Interest is growing in the use of qualitative data for assessment. Written comments on residents' in-training evaluation reports (ITERs) can be reliably rank-ordered by faculty attendings, who are adept at interpreting these narratives. However, if residents do not interpret assessment comments in the same way, a valuable educational opportunity may be lost. OBJECTIVES Our purpose was to explore residents' interpretations of written assessment comments using mixed methods. METHODS Twelve internal medicine (IM) postgraduate year 2 (PGY2) residents were asked to rank-order a set of anonymised PGY1 residents (n = 48) from a previous year in IM based solely on their ITER comments. Each PGY1 was ranked by four PGY2s; generalisability theory was used to assess inter-rater reliability. The PGY2s were then interviewed separately about their rank-ordering process, how they made sense of the comments and how they viewed ITERs in general. Interviews were analysed using constructivist grounded theory. RESULTS Across four PGY2 residents, the G coefficient was 0.84; for a single resident it was 0.56. Resident rankings correlated extremely well with faculty member rankings (r = 0.90). Residents were equally adept at reading between the lines to construct meaning from the comments and used language cues in ways similarly reported in faculty attendings. Participants discussed the difficulties of interpreting vague language and provided perspectives on why they thought it occurs (time, discomfort, memorability and the permanency of written records). They emphasised the importance of face-to-face discussions, the relative value of comments over scores, staff-dependent variability of assessment and the perceived purpose and value of ITERs. They saw particular value in opportunities to review an aggregated set of comments. CONCLUSIONS Residents understood the 'hidden code' in assessment language and their ability to rank-order residents based on comments matched that of faculty. Residents seemed to accept staff-dependent variability as a reality. These findings add to the growing evidence that supports the use of narrative comments and subjectivity in assessment.
Affiliation(s)
- Shiphra Ginsburg
- Department of Medicine, Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada
- Cees PM van der Vleuten
- Department of Educational Development and Research, Faculty of Health, Medicine and Life Sciences, Maastricht University, Maastricht, the Netherlands
- Kevin W Eva
- Department of Medicine, Faculty of Medicine, University of British Columbia, Vancouver, British Columbia, Canada
- Lorelei Lingard
- Department of Medicine, Schulich School of Medicine and Dentistry, Western University, London, Ontario, Canada
89
Wilbur K, Hassaballa N, Mahmood OS, Black EK. Describing student performance: a comparison among clinical preceptors across cultural contexts. MEDICAL EDUCATION 2017; 51:411-422. [PMID: 28220518 DOI: 10.1111/medu.13223] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/16/2016] [Revised: 02/26/2016] [Accepted: 09/09/2016] [Indexed: 06/06/2023]
Abstract
CONTEXT Health professional student evaluation during experiential training is notably subjective and assessor judgements may be affected by socio-cultural influences. OBJECTIVES This study sought to explore how clinical preceptors in pharmacy conceptualise varying levels of student performance and to identify any contextual differences that may exist across different countries. METHODS The qualitative research design employed semi-structured interviews. A sample of 20 clinical preceptors for post-baccalaureate Doctor of Pharmacy programmes in Canada and the Middle East gave personal accounts of how students they had supervised fell below, met or exceeded their expectations. Discussions were analysed following constructivist grounded theory principles. RESULTS Seven major themes encompassing how clinical pharmacy preceptors categorise levels of student performance and behaviour were identified: knowledge; team interaction; motivation; skills; patient care; communication, and professionalism. Expectations were outlined using both positive and negative descriptions. Pharmacists typically described supervisory experiences representing a series of these categories, but arrived at concluding judgements in a holistic fashion: if valued traits of motivation and positive attitude were present, overall favourable impressions of a student could be maintained despite observations of a few deficiencies. Some prioritised dimensions could not be mapped to defined existing educational outcomes. There was no difference in thresholds for how student performance was distinguished by participants in the two regions. CONCLUSIONS The present research findings are congruent with current literature related to the constructs used by clinical supervisors in health professional student workplace-based assessment and provide additional insight into cross-national perspectives in pharmacy. As previously determined in social work and medicine, further study of how evaluation instruments and associated processes can integrate these judgements should be pursued in this discipline.
Affiliation(s)
- Kerry Wilbur
- College of Pharmacy, Qatar University, Doha, Qatar
- Emily K Black
- College of Pharmacy, Dalhousie University, Halifax, Nova Scotia, Canada
90
Wilbur K, Mousa Bacha R, Abdelaziz S. How does culture affect experiential training feedback in exported Canadian health professional curricula? INTERNATIONAL JOURNAL OF MEDICAL EDUCATION 2017; 8:91-98. [PMID: 28315858 PMCID: PMC5376492 DOI: 10.5116/ijme.58ba.7c68] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/06/2016] [Accepted: 03/04/2017] [Indexed: 06/06/2023]
Abstract
OBJECTIVES To explore the feedback processes of Western-based health professional student training curricula delivered in an Arab clinical teaching setting. METHODS This qualitative study employed document analysis of the in-training evaluation reports (ITERs) used by Canadian nursing, pharmacy, respiratory therapy, paramedic, dental hygiene, and pharmacy technician programs established in Qatar. Six experiential training program coordinators were interviewed between February and May 2016 to explore how national cultural differences are perceived to affect feedback processes between students and clinical supervisors. Interviews were recorded, transcribed, and coded according to a priori cultural themes. RESULTS Document analysis found that all programs' ITERs outlined competency items for students to achieve. Clinical supervisors choose a response option corresponding to their judgment of student performance and may provide additional written feedback in the spaces provided. Only one program required a formal face-to-face feedback exchange between students and clinical supervisors. Experiential training program coordinators reported that no ITER was expressly culturally adapted, although in some instances modifications were made for differences in scopes of practice between Canada and Qatar. Power distance was recognized by all coordinators, who also identified both student and supervisor reluctance to document potentially negative feedback in ITERs. Instances of collectivism were described as more lenient student assessment by clinical supervisors of the same cultural background. Uncertainty avoidance did not appear to impact feedback processes. CONCLUSIONS Our findings suggest that differences in specific cultural dimensions between Qatar and Canada have implications for the feedback process in experiential training, which may be addressed through simple measures to accommodate communication preferences.
Affiliation(s)
- Kerry Wilbur
- College of Pharmacy, Qatar University, Doha, Qatar
91
Boscardin CK, Wijnen-Meijer M, Cate OT. Taking Rater Exposure to Trainees Into Account When Explaining Rater Variability. J Grad Med Educ 2016; 8:726-730. [PMID: 28018538 PMCID: PMC5180528 DOI: 10.4300/jgme-d-16-00122.1] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 11/06/2022] Open
Abstract
BACKGROUND Rater-based judgments are widely used in graduate medical education to provide more meaningful assessments, despite concerns about rater reliability. OBJECTIVE We introduce a statistical modeling technique that corresponds to a new rater reliability framework, and present a case example to illustrate the utility of this approach to assessing rater reliability. METHODS We used mixed-effects models to simultaneously incorporate random effects for raters and systematic effects of rater role as fixed effects. The study data are clinical performance ratings collected from medical school graduates who were evaluated for their readiness for supervised clinical practice in authentic simulation settings at 2 medical schools in the Netherlands and Germany. RESULTS The medical schools recruited a maximum of 30 graduates out of 60 (50%) and 180 (17%) eligible candidates, respectively. Clinician raters (n = 25) were selected based on their level of expertise and experience. Graduates were assessed on 7 facets of competence (FOCs) that are considered important in supervisors' entrustment decisions across the 5 cases used. Rater role was significantly associated with 2 FOCs: (1) teamwork and collegiality, and (2) verbal communication with colleagues/supervisors. For another 2 FOCs, rater variability was only partially explained by the role of the rater (a proxy for the amount of direct interaction with the trainee). CONCLUSIONS Treating raters as meaningfully idiosyncratic provides a new framework for exploring their influence on assessment scores, one that goes beyond treating them as random sources of variability.
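A minimal sketch of such a model using statsmodels, with raters as a random grouping factor and rater role as a fixed effect. The cohort size, effect sizes, and column names are all invented for illustration.

```python
# Minimal sketch: mixed-effects model with a random intercept per rater
# and rater role as a fixed effect (synthetic data).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(5)
raters = [f"r{i}" for i in range(25)]
rater_bias = {r: rng.normal(0, 0.4) for r in raters}   # idiosyncratic rater effects
rows = []
for trainee in range(30):
    ability = rng.normal(0, 1)
    for r in rng.choice(raters, size=5, replace=False):
        role = int(rng.integers(0, 2))                 # 0 = observes, 1 = interacts
        rows.append({"rater": r, "role": role,
                     "score": ability + rater_bias[r] + 0.3 * role + rng.normal(0, 0.5)})
df = pd.DataFrame(rows)

model = smf.mixedlm("score ~ role", data=df, groups=df["rater"]).fit()
print(model.summary())
```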
Affiliation(s)
- Christy K. Boscardin
- Corresponding author: Christy K. Boscardin, PhD, UCSF School of Medicine, Department of Medicine, Office of Medical Education, 533 Parnassus Avenue, Suite U-80, San Francisco, CA 94143-3202, 415.519.3570
92
Cook DA, Kuper A, Hatala R, Ginsburg S. When Assessment Data Are Words: Validity Evidence for Qualitative Educational Assessments. ACADEMIC MEDICINE : JOURNAL OF THE ASSOCIATION OF AMERICAN MEDICAL COLLEGES 2016; 91:1359-1369. [PMID: 27049538 DOI: 10.1097/acm.0000000000001175] [Citation(s) in RCA: 85] [Impact Index Per Article: 10.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/25/2023]
Abstract
Quantitative scores fail to capture all important features of learner performance. This awareness has led to increased use of qualitative data when assessing health professionals. Yet the use of qualitative assessments is hampered by incomplete understanding of their role in forming judgments, and lack of consensus in how to appraise the rigor of judgments therein derived. The authors articulate the role of qualitative assessment as part of a comprehensive program of assessment, and translate the concept of validity to apply to judgments arising from qualitative assessments. They first identify standards for rigor in qualitative research, and then use two contemporary assessment validity frameworks to reorganize these standards for application to qualitative assessment.Standards for rigor in qualitative research include responsiveness, reflexivity, purposive sampling, thick description, triangulation, transparency, and transferability. These standards can be reframed using Messick's five sources of validity evidence (content, response process, internal structure, relationships with other variables, and consequences) and Kane's four inferences in validation (scoring, generalization, extrapolation, and implications). Evidence can be collected and evaluated for each evidence source or inference. The authors illustrate this approach using published research on learning portfolios.The authors advocate a "methods-neutral" approach to assessment, in which a clearly stated purpose determines the nature of and approach to data collection and analysis. Increased use of qualitative assessments will necessitate more rigorous judgments of the defensibility (validity) of inferences and decisions. Evidence should be strategically sought to inform a coherent validity argument.
Affiliation(s)
- David A Cook
- D.A. Cook is professor of medicine and medical education, associate director, Mayo Clinic Online Learning, and consultant, Division of General Internal Medicine, Mayo Clinic College of Medicine, Rochester, Minnesota. A. Kuper is assistant professor, Department of Medicine, Faculty of Medicine, University of Toronto, scientist, Wilson Centre for Research in Education, University Health Network/University of Toronto, and staff physician, Division of General Internal Medicine, Sunnybrook Health Sciences Centre, Toronto, Ontario, Canada. R. Hatala is associate professor of medicine and director, Clinical Educator Fellowship, University of British Columbia, Vancouver, British Columbia, Canada. S. Ginsburg is professor, Department of Medicine, Faculty of Medicine, University of Toronto, scientist, Wilson Centre for Research in Education, University Health Network/University of Toronto, and staff physician, Mount Sinai Hospital, Toronto, Ontario, Canada
93
Calhoun AW, Bhanji F, Sherbino J, Hatala R. Simulation for High-Stakes Assessment in Pediatric Emergency Medicine. CLINICAL PEDIATRIC EMERGENCY MEDICINE 2016. [DOI: 10.1016/j.cpem.2016.05.001] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/21/2022]
94
Patel M, Agius S, Wilkinson J, Patel L, Baker P. Value of supervised learning events in predicting doctors in difficulty. MEDICAL EDUCATION 2016; 50:746-756. [PMID: 27295479 DOI: 10.1111/medu.12996] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/25/2015] [Revised: 09/01/2015] [Accepted: 01/03/2016] [Indexed: 06/06/2023]
Abstract
CONTEXT In the UK, supervised learning events (SLEs) replaced traditional workplace-based assessments for foundation-year trainees in 2012. A key element of SLEs was to incorporate trainee reflection and assessor feedback in order to drive learning and identify training issues early. Few studies, however, have investigated the value of SLEs in predicting doctors in difficulty. This study aimed to identify principles that would inform understanding about how and why SLEs work, or do not, in identifying doctors in difficulty (DiD). METHODS A retrospective case-control study of North West Foundation School trainees' electronic portfolios was conducted. Cases comprised all known DiD. Controls were randomly selected from the same cohort. Free-text supervisor comments from each SLE were assessed against the four domains defined in the General Medical Council's Good Medical Practice guidelines, and each was scored blindly for level of concern using a three-point ordinal scale. Cumulative scores for each SLE were then analysed quantitatively for their value in predicting actual DiD status. A qualitative thematic analysis was also conducted. RESULTS The prevalence of DiD in this sample was 6.5%. Receiver operating characteristic curve analysis showed that the Team Assessment of Behaviour (TAB) was the only SLE strongly predictive of actual DiD status. The Educational Supervisor Report (ESR) was also strongly predictive of DiD status. Fisher's test showed significant associations of TAB and ESR with both predicted and actual DiD status, and also with the health and performance subtypes. None of the other SLEs showed significant associations. Qualitative data analysis revealed inadequate completion and a lack of constructive, particularly negative, feedback, indicating that SLEs were not used to their full potential. CONCLUSIONS TAB and the ESR are strongly predictive of DiD. However, SLEs are not being used to their full potential, and the quality of completion of SLE reports and feedback needs to be improved in order to better identify and manage DiD.
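A minimal sketch of the ROC analysis described above, on a synthetic cohort built at the reported 6.5% DiD prevalence; the construction of the cumulative concern score is an assumption for the example.

```python
# Minimal sketch: ROC analysis of cumulative SLE concern scores against
# actual doctors-in-difficulty (DiD) status (synthetic cohort).
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

rng = np.random.default_rng(6)
n = 300
did = rng.random(n) < 0.065                     # ~6.5% prevalence, as reported
concern = rng.normal(1.0, 0.5, n) + 0.8 * did   # higher scores for DiD cases

auc = roc_auc_score(did, concern)
fpr, tpr, thresholds = roc_curve(did, concern)
print(f"AUC = {auc:.2f} ({int(did.sum())} DiD cases of {n})")
```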
Affiliation(s)
- Mumtaz Patel
- Department of Renal Medicine, Manchester Royal Infirmary, Central Manchester University Hospitals NHS Foundation Trust, Manchester, UK
- Steven Agius
- Health Education England (North West Office), Manchester, UK
- Paul Baker
- Health Education England (North West Office), Manchester, UK
95
Gulbas L, Guerin W, Ryder HF. Does what we write matter? Determining the features of high- and low-quality summative written comments of students on the internal medicine clerkship using pile-sort and consensus analysis: a mixed-methods study. BMC MEDICAL EDUCATION 2016; 16:145. [PMID: 27177917 PMCID: PMC4866272 DOI: 10.1186/s12909-016-0660-y] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/15/2016] [Accepted: 05/02/2016] [Indexed: 06/05/2023]
Abstract
BACKGROUND Written comments by medical students' supervisors provide the written foundation for grade narratives and deans' letters and play an important role in students' professional development. Written comments are widely used, but little has been published about their quality. We hypothesized that medical students share an understanding of the qualities inherent to high-quality and low-quality narrative comments, and we aimed to determine the features that define high- and low-quality comments. METHODS Using the well-established anthropological pile-sort method, medical students sorted written comments into 'helpful' and 'unhelpful' piles and were then interviewed to determine how they evaluated comments. We used multidimensional scaling and cluster analysis to analyze the data, revealing how written comments were sorted across student participants. We calculated the degree of shared knowledge to determine the level of internal validity in the data. We transcribed and coded data elicited during the structured interviews to contextualize the students' answers. Comment length was compared using one-way analysis of variance; comment valence and the frequency with which comments were judged helpful were analyzed by chi-square test. RESULTS Analysis of the written comments revealed four distinct clusters. Cluster A comments reinforced good behaviors or gave constructive criticism on how changes could be made. Cluster B comments exhorted students to continue non-specific behaviors already exhibited. Cluster C comments used grading-rubric terms without giving student-specific examples. Cluster D comments used sentence fragments lacking verbs and punctuation. The student data exhibited a strong fit to the consensus model, demonstrating that medical students share a robust model of the attributes of helpful and unhelpful comments. There was no correlation between the valence of a comment and its perceived helpfulness. CONCLUSIONS Students find comments that demonstrate knowledge of the student and provide specific examples of appropriate behavior to be reinforced, or inappropriate behavior to be eliminated, helpful; they find comments that are non-actionable and non-specific least helpful. Our research and analysis allow us to make recommendations for faculty development around written feedback.
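As a concrete illustration of the pile-sort analysis pipeline the abstract names, here is a toy Python sketch: comments that students sort into the same pile accumulate similarity, the resulting dissimilarity matrix is embedded with multidimensional scaling, and hierarchical clustering recovers groups of comments. The sort matrix and cluster count below are fabricated, and the study's cultural-consensus analysis is not reproduced here.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform
from sklearn.manifold import MDS

# Fabricated pile-sort data: rows = students, columns = comments,
# 1 = sorted into the 'helpful' pile, 0 = 'unhelpful'.
sorts = np.array([
    [1, 1, 0, 0, 1, 0],
    [1, 1, 0, 0, 1, 1],
    [1, 0, 0, 0, 1, 0],
    [1, 1, 1, 0, 1, 0],
])

# Pairwise similarity: fraction of students placing two comments in the
# same pile; its complement is the dissimilarity used downstream.
same_pile = (sorts[:, :, None] == sorts[:, None, :]).mean(axis=0)
dissimilarity = 1.0 - same_pile

# Two-dimensional MDS map of the comments.
coords = MDS(n_components=2, dissimilarity="precomputed",
             random_state=0).fit_transform(dissimilarity)
print("MDS coordinates:\n", coords.round(2))

# Average-linkage hierarchical clustering into (here) two clusters.
condensed = squareform(dissimilarity, checks=False)
clusters = fcluster(linkage(condensed, method="average"),
                    t=2, criterion="maxclust")
print("Cluster per comment:", clusters)
```

With real data, the number of clusters would be read off the dendrogram and the MDS map rather than fixed in advance.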
Affiliation(s)
- Lauren Gulbas
- School of Social Work, The University of Texas, Austin, TX, USA
- William Guerin
- Geisel School of Medicine at Dartmouth, Hanover, NH, USA
- Hilary F Ryder
- Geisel School of Medicine at Dartmouth, Hanover, NH, USA.
- Department of Medicine, Dartmouth-Hitchcock Medical Center, One Medical Center Drive, Lebanon, NH, 03784, USA.
96
Gauthier G, St-Onge C, Tavares W. Rater cognition: review and integration of research findings. MEDICAL EDUCATION 2016; 50:511-22. [PMID: 27072440 DOI: 10.1111/medu.12973] [Citation(s) in RCA: 58] [Impact Index Per Article: 7.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/15/2015] [Revised: 07/20/2015] [Accepted: 11/13/2015] [Indexed: 05/21/2023]
Abstract
BACKGROUND Given the complexity of competency frameworks, their associated skills and abilities, and the contexts in which they are to be assessed in competency-based education (CBE), there is increased reliance on rater judgements when considering trainee performance. This increased dependence on rater-based assessment has led to the emergence of rater cognition as a field of research in health professions education. The topic, however, is often conceptualised, and ultimately investigated, using many different perspectives and theoretical frameworks. Critically analysing how researchers think about, study and discuss rater cognition, or the judgement processes in assessment frameworks, may provide meaningful and efficient directions for how the field continues to explore the topic. METHODS We conducted a critical and integrative review of the literature to explore common conceptualisations and unified terminology associated with rater cognition research. We identified 1045 articles on rater-based assessment in health professions education using Scopus, Medline and ERIC; 78 articles were included in our review. RESULTS We propose a three-phase framework of observation, processing and integration. We situate nine specific mechanisms and sub-mechanisms described across the literature within these phases: (i) generating automatic impressions about the person; (ii) formulating high-level inferences; (iii) focusing on different dimensions of competencies; (iv) categorising through well-developed schemata based on (a) personal concept of competence, (b) comparison with various exemplars and (c) task and context specificity; (v) weighting and synthesising information differently; (vi) producing narrative judgements; and (vii) translating narrative judgements into scales. CONCLUSION Our review has allowed us to identify common underlying conceptualisations of observed rater mechanisms and subsequently to propose a comprehensive, although complex, framework for the dynamic and contextual nature of the rating process. This framework could help bridge the gap between researchers adopting different perspectives when studying rater cognition and enable the interpretation of contradictory findings about rater performance by determining which mechanism is enabled or disabled in any given context.
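To make the framework easier to scan, here is one possible encoding of it as a plain Python data structure. The grouping of the numbered mechanisms (and the sub-mechanisms of categorisation) under the three phases is our own reading of the abstract's ordering; the abstract lists the mechanisms but does not spell out the phase assignments, so treat the nesting as an assumption.

```python
# Our reading of the review's three-phase framework; the phase assignments
# are inferred from the abstract's ordering, not stated explicitly by it.
RATER_COGNITION_FRAMEWORK = {
    "observation": [
        "generating automatic impressions about the person",
        "formulating high-level inferences",
        "focusing on different dimensions of competencies",
    ],
    "processing": [
        {"categorising through well-developed schemata": [
            "personal concept of competence",
            "comparison with various exemplars",
            "task and context specificity",
        ]},
        "weighting and synthesising information differently",
    ],
    "integration": [
        "producing narrative judgements",
        "translating narrative judgements into scales",
    ],
}

for phase, mechanisms in RATER_COGNITION_FRAMEWORK.items():
    print(f"{phase}: {len(mechanisms)} mechanism(s)")
```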
Affiliation(s)
- Christina St-Onge
- Médecine interne, Université de Sherbrooke, Sherbrooke, Quebec, Canada
- Walter Tavares
- Division of Emergency Medicine, McMaster University, Hamilton, Ontario, Canada
- Centennial College, School of Community and Health Studies, Toronto, Ontario, Canada
- ORNGE Transport Medicine, Faculty of Medicine, Mississauga, Ontario, Canada
97
Ginsburg S, van der Vleuten C, Eva KW, Lingard L. Hedging to save face: a linguistic analysis of written comments on in-training evaluation reports. ADVANCES IN HEALTH SCIENCES EDUCATION : THEORY AND PRACTICE 2016; 21:175-88. [PMID: 26184115 DOI: 10.1007/s10459-015-9622-0] [Citation(s) in RCA: 63] [Impact Index Per Article: 7.9] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/09/2015] [Accepted: 07/06/2015] [Indexed: 05/07/2023]
Abstract
Written comments on residents' evaluations can be useful, yet the literature suggests that the language used by assessors is often vague and indirect. The branch of linguistics called pragmatics argues that much of our day-to-day language is not meant to be interpreted literally. Within pragmatics, the theory of 'politeness' suggests that non-literal language and other strategies are employed in order to 'save face'. We conducted a rigorous, in-depth analysis of a set of written in-training evaluation report (ITER) comments using Brown and Levinson's influential theory of politeness to shed light on the phenomenon of vague language use in assessment. We coded text from 637 comment boxes for first-year residents in internal medicine at one institution according to politeness theory. Non-literal language use was common, and 'hedging', a key politeness strategy, was pervasive in comments about both highly rated and poorly rated residents, suggesting that faculty may be working to 'save face' for themselves and their residents. Hedging and other politeness strategies are considered essential to smooth social functioning; their prevalence in our ITERs may reflect the difficult social context in which written assessments occur. This research raises questions regarding the 'optimal' construction of written comments by faculty.
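Since the paper's central construct is hedging, a tiny Python sketch may help make the idea of coding comments for hedges concrete. This is purely illustrative: the marker list below is our invention, and the study's coding applied Brown and Levinson's politeness theory through human raters, not a lexical matcher.

```python
import re

# Hypothetical hedge markers; politeness theory covers far richer strategies
# (indirectness, impersonalisation, etc.) than this lexical approximation.
HEDGE_MARKERS = [
    "somewhat", "a bit", "perhaps", "seems", "appears to", "relatively",
    "at times", "generally", "fairly", "might", "could be", "tends to",
]
HEDGE_RE = re.compile(
    r"\b(" + "|".join(map(re.escape, HEDGE_MARKERS)) + r")\b",
    re.IGNORECASE,
)

def hedges_in(comment: str) -> list[str]:
    """Return the hedge markers found in a single ITER comment."""
    return HEDGE_RE.findall(comment)

comments = [
    "Dr. A seems to be reading somewhat less than expected.",
    "Outstanding clinical reasoning; presentations were clear and complete.",
]
for c in comments:
    print(hedges_in(c), "<-", c)
```

A matcher like this can flag candidate hedges for review, but deciding whether a hedge is doing face-saving work still requires a human reading in context.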
Affiliation(s)
- Shiphra Ginsburg
- Department of Medicine and Wilson Centre for Research in Education, University of Toronto, Toronto, ON, Canada.
- Mount Sinai Hospital, 600 University Ave, Ste. 433, Toronto, ON, M5G1X5, Canada.
- Cees van der Vleuten
- School for Health Professions Education, Maastricht University, Maastricht, Netherlands
- Kevin W Eva
- Faculty of Medicine, Centre for Health Education Scholarship, University of British Columbia, Vancouver, BC, Canada
- Lorelei Lingard
- Centre for Education Research and Innovation, Schulich School of Medicine and Dentistry, Western University, London, ON, Canada
98
Tavares W, Boet S. On the Assessment of Paramedic Competence: A Narrative Review with Practice Implications. Prehosp Disaster Med 2015; 31:64-73. [DOI: 10.1017/s1049023x15005166] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]
Abstract
INTRODUCTION Paramedicine is experiencing significant growth in scope of practice, autonomy, and role in the health care system. Despite clinical governance models, the degree to which paramedicine ultimately can be safe and effective will be dependent on the individuals the profession deems suited to practice. This creates an imperative for those responsible for these decisions to ensure that assessments of paramedic competence are indeed accurate, trustworthy, and defensible. PURPOSE The purpose of this study was to explore and synthesize relevant theoretical foundations and literature informing best practices in performance-based assessment (PBA) of competence, as it might be applied to paramedicine, for design or evaluation of assessment programs. METHODS A narrative review methodology was applied to focus intentionally, but broadly, on purpose-relevant, theoretically derived research that could inform assessment protocols in paramedicine. Primary and secondary studies from a number of health professions that contributed to and informed best practices related to the assessment of paramedic clinical competence were included and synthesized. RESULTS Multiple conceptual frameworks, psychometric requirements, and emerging lines of research are forwarded. Seventeen practice implications are derived to promote understanding as well as best practices and evaluation criteria for educators, employers, and/or licensing/certifying bodies when considering the assessment of paramedic competence. CONCLUSIONS The assessment of paramedic competence is a complex process requiring an understanding, appreciation for, and integration of conceptual and psychometric principles. The field of PBA is advancing rapidly with numerous opportunities for research. Tavares W, Boet S. On the assessment of paramedic competence: a narrative review with practice implications. Prehosp Disaster Med. 2016;31(1):64-73.
99
Lim DW, White JS. How Do Surgery Students Use Written Language to Say What They See? A Framework to Understand Medical Students' Written Evaluations of Their Teachers. ACADEMIC MEDICINE : JOURNAL OF THE ASSOCIATION OF AMERICAN MEDICAL COLLEGES 2015; 90:S98-S106. [PMID: 26505109 DOI: 10.1097/acm.0000000000000895] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/05/2023]
Abstract
BACKGROUND There remains debate regarding the value of the written comments that medical students are traditionally asked to provide to evaluate the teaching they receive. The purpose of this study was to examine written teaching evaluations to understand how medical students conceptualize teachers' behaviors and performance. METHOD All written comments collected from medical students about teachers in the two surgery clerkships at the University of Alberta in 2009-2010 and 2010-2011 were collated and anonymized. A grounded theory approach was used for analysis, with iterative reading and open coding to identify recurring themes. A framework capturing variations observed in the data was generated until data saturation was achieved. Domains and subdomains were named using an in situ coding approach. RESULTS The conceptual framework contained three main domains: "Physician as Teacher," "Physician as Person," and "Physician as Physician." Under "Physician as Teacher," students commented on specific acts of teaching and subjective perceptions of an educator's teaching values. Under the "Physician as Physician" domain, students commented on elements of their educator's physicianship, including communication and collaborative skills, medical expertise, professionalism, and role modeling. Under "Physician as Person," students commented on how both positive and negative personality traits impacted their learning. CONCLUSIONS This framework describes how medical students perceive their teachers and how they use written language to attach meaning to the behaviors they observe. Such a framework can be used to help students provide more constructive feedback to teachers and to assist in faculty development efforts aimed at improving teaching performance.