101. Yeates P, Cardell J, Byrne G, Eva KW. Relatively speaking: contrast effects influence assessors' scores and narrative feedback. Medical Education 2015;49:909-919. PMID: 26296407; DOI: 10.1111/medu.12777.
Abstract
CONTEXT In prior research, the scores assessors assign can be biased away from the standard of preceding performances (i.e. 'contrast effects' occur). OBJECTIVES This study examines the mechanism and robustness of these findings to advance understanding of assessor cognition. We test the influence of the immediately preceding performance relative to that of a series of prior performances. Further, we examine whether assessors' narrative comments are similarly influenced by contrast effects. METHODS Clinicians (n = 61) were randomised to three groups in a blinded, Internet-based experiment. Participants viewed identical videos of good, borderline and poor performances by first-year doctors in varied orders. They provided scores and written feedback after each video. Narrative comments were blindly content-analysed to generate measures of valence and content. Variability of narrative comments and scores was compared between groups. RESULTS Comparisons indicated contrast effects after a single performance. When a good performance was preceded by a poor performance, ratings were higher (mean 5.01, 95% confidence interval [CI] 4.79-5.24) than when observation of the good performance was unbiased (mean 4.36, 95% CI 4.14-4.60; p < 0.05, d = 1.3). Similarly, borderline performance was rated lower when preceded by good performance (mean 2.96, 95% CI 2.56-3.37) than when viewed without preceding bias (mean 3.55, 95% CI 3.17-3.92; p < 0.05, d = 0.7). The series of ratings participants assigned suggested that the magnitude of contrast effects is determined by an averaging of recent experiences. The valence (but not content) of narrative comments showed contrast effects similar to those found in numerical scores. CONCLUSIONS These findings are consistent with research from behavioural economics and psychology that suggests judgement tends to be relative in nature. Observing that the valence of narrative comments is similarly influenced suggests these effects represent more than difficulty in translating impressions into a number. The extent to which such factors impact upon assessment in practice remains to be determined as the influence is likely to depend on context.
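The authors' interpretation that contrast effects reflect an averaging of recent experiences can be pictured with a toy model. This is an illustrative sketch only: the weight, scale anchors, and quality values below are invented, not taken from the study.

```python
# Toy model of a contrast effect driven by the average of recently seen
# performances. Only the mechanism -- ratings pushed away from the mean of
# preceding performances -- follows the paper's interpretation; all numbers
# are hypothetical.

def contrast_rating(true_quality, recent_qualities, weight=0.3):
    """Rating = true quality displaced away from the mean of recent cases."""
    if not recent_qualities:
        return true_quality  # unbiased first observation
    reference = sum(recent_qualities) / len(recent_qualities)
    return true_quality + weight * (true_quality - reference)

# A good performance (true quality 4.4 on a 1-6 scale) is rated higher
# after poor performances than when viewed without preceding bias.
print(round(contrast_rating(4.4, [2.0, 2.5]), 2))  # ~5.05, inflated
print(round(contrast_rating(4.4, []), 2))          # 4.40, unbiased
print(round(contrast_rating(3.5, [4.4, 4.6]), 2))  # 3.20, deflated
```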
Affiliation(s)
- Peter Yeates
- Centre for Respiratory Medicine and Allergy, Institute of Inflammation and Repair, University of Manchester, Manchester, UK
- Jenna Cardell
- Royal Bolton Hospital, Bolton NHS Foundation Trust, Bolton, Lancashire, UK
- Gerard Byrne
- Health Education North West, Health Education England, Manchester, UK
- Kevin W Eva
- Centre for Health Education Scholarship, Division of Medicine, University of British Columbia, Vancouver, BC, Canada
102. Tricio J, Woolford M, Thomas M, Lewis-Greene H, Georghiou L, Andiappan M, Escudier M. Dental students' peer assessment: a prospective pilot study. European Journal of Dental Education 2015;19:140-148. PMID: 25168409; DOI: 10.1111/eje.12114.
Abstract
INTRODUCTION Peer assessment is increasingly used in health education. The aims of this study were to evaluate the reliability, accuracy, educational impact and students' perceptions of a structured, prospective peer assessment and peer feedback protocol for undergraduate pre-clinical and clinical dental students. MATERIALS AND METHODS Two Direct Observation of Procedural Skills (DOPS) forms were modified for use in pre-clinical and clinical peer assessment. Ten year 2 dental students working in a phantom-heads skills laboratory and 16 year 5 dental students attending a comprehensive care clinic piloted both peer DOPS forms. After training, pairs of students observed, assessed and provided immediate feedback to each other using their respective peer DOPS forms as frameworks. At the end of the 3-month study period, students anonymously provided their perceptions of the protocol. RESULTS Year 2 and year 5 students completed 57 and 104 peer DOPS forms, respectively. The generalisability coefficient was 0.62 for year 2 (six encounters) and 0.67 for year 5 (seven encounters). Both groups were able to differentiate amongst peer-assessed domains and so detect improvement in peers' performance over time. Peer DOPS scores of both groups showed a positive correlation with their mean end-of-year examination marks (r ≥ 0.505, P ≥ 0.051), although this was not statistically significant. There was no difference (P ≥ 0.094) between the end-of-year examination marks of the participating students and the rest of their respective classes. The vast majority of both groups expressed positive perceptions of the piloted protocol. DISCUSSION There are no data in the literature on the prospective use of peer assessment in the dental undergraduate setting. In the current study, both pre-clinical and clinical students demonstrated the ability to identify those domains where peers performed better, as well as those which needed improvement. Despite no observable educational impact, most students reported positive perceptions of the peer DOPS protocol. CONCLUSIONS The results of this pilot study support the need for and the potential benefit of a larger and longer-term follow-up study utilising the protocol.
Affiliation(s)
- J Tricio
- King's College London Dental Institute, London, UK
- Faculty of Dentistry, University of los Andes, Santiago, Chile
- M Woolford
- King's College London Dental Institute, London, UK
- M Thomas
- King's College London Dental Institute, London, UK
- L Georghiou
- King's College London Dental Institute, London, UK
- M Andiappan
- King's College London Dental Institute, London, UK
- M Escudier
- King's College London Dental Institute, London, UK
103. Moonen-van Loon JMW, Overeem K, Govaerts MJB, Verhoeven BH, van der Vleuten CPM, Driessen EW. The reliability of multisource feedback in competency-based assessment programs: the effects of multiple occasions and assessor groups. Academic Medicine 2015;90:1093-1099. PMID: 25993283; DOI: 10.1097/acm.0000000000000763.
Abstract
PURPOSE Residency programs around the world use multisource feedback (MSF) to evaluate learners' performance. Studies of the reliability of MSF show mixed results. This study aimed to identify the reliability of MSF as practiced across occasions with varying numbers of assessors from different professional groups (physicians and nonphysicians) and the effect on the reliability of the assessment for different competencies when completed by both groups. METHOD The authors collected data from 2008 to 2012 from electronically completed MSF questionnaires. In total, 428 residents completed 586 MSF occasions, and 5,020 assessors provided feedback. The authors used generalizability theory to analyze the reliability of MSF for multiple occasions, different competencies, and varying numbers of assessors and assessor groups across multiple occasions. RESULTS A reliability coefficient of 0.800 can be achieved with two MSF occasions completed by at least 10 assessors per group or with three MSF occasions completed by 5 assessors per group. Nonphysicians' scores for the "Scholar" and "Health advocate" competencies and physicians' scores for the "Health advocate" competency had a negative effect on the composite reliability. CONCLUSIONS A feasible number of assessors per MSF occasion can reliably assess residents' performance. Scores from a single occasion should be interpreted cautiously. However, every occasion can provide valuable feedback for learning. This research confirms that the (unique) characteristics of different assessor groups should be considered when interpreting MSF results. Reliability seems to be influenced by the included assessor groups and competencies. These findings will enhance the utility of MSF during residency training.
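As an illustrative aside, the trade-off the authors report between occasions and assessors per occasion is what a generalisability-theory decision study computes. The sketch below uses invented variance components, chosen only so the pattern echoes the abstract (G = 0.80 for two occasions with ten assessors per group, or three occasions with five); the design is a simplified person x (assessor : occasion) model, not the authors' full analysis.

```python
# Simplified D-study: reliability (G-coefficient) of mean MSF scores as a
# function of the number of occasions and assessors per occasion.
# Variance components are hypothetical, chosen only to echo the abstract.

def g_coefficient(var_person, var_po, var_pa, n_occasions, n_assessors):
    """Person x (assessor : occasion) design; relative error only."""
    error = var_po / n_occasions + var_pa / (n_occasions * n_assessors)
    return var_person / (var_person + error)

VAR_P, VAR_PO, VAR_PA = 0.40, 0.10, 1.00  # illustrative components

for n_occ, n_ass in [(1, 10), (2, 10), (3, 5)]:
    g = g_coefficient(VAR_P, VAR_PO, VAR_PA, n_occ, n_ass)
    print(f"{n_occ} occasion(s) x {n_ass} assessors: G = {g:.2f}")
```

With these components, both two occasions with ten assessors and three occasions with five reach G = 0.80, while a single occasion falls well short, mirroring the reported result.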
Affiliation(s)
- Joyce M W Moonen-van Loon
- J.M.W. Moonen-van Loon is postdoctoral researcher, Department of Educational Development and Research, Maastricht University, Maastricht, The Netherlands. K. Overeem is postdoctoral researcher, Department of Educational Development and Research, Maastricht University, Maastricht, The Netherlands. M.J.B. Govaerts is assistant professor, Department of Educational Development and Research, Maastricht University, Maastricht, The Netherlands. B.H. Verhoeven is pediatric surgeon, Department of Surgery, Radboud University Medical Center, Nijmegen, and assistant professor, Department of Educational Development and Research, Maastricht University, Maastricht, The Netherlands. C.P.M. van der Vleuten is professor of education, Department of Educational Development and Research, Maastricht University, Maastricht, The Netherlands. E.W. Driessen is associate professor of education, Department of Educational Development and Research, Maastricht University, Maastricht, The Netherlands
104. Hauer KE, Chesluk B, Iobst W, Holmboe E, Baron RB, Boscardin CK, Cate OT, O'Sullivan PS. Reviewing residents' competence: a qualitative study of the role of clinical competency committees in performance assessment. Academic Medicine 2015;90:1084-1092. PMID: 25901876; DOI: 10.1097/acm.0000000000000736.
Abstract
PURPOSE Clinical competency committees (CCCs) are now required in graduate medical education. This study examined how residency programs understand and operationalize this mandate for resident performance review. METHOD In 2013, the investigators conducted semistructured interviews with 34 residency program directors at five public institutions in California, asking about each institution's CCCs and resident performance review processes. They used conventional content analysis to identify major themes from the verbatim interview transcripts. RESULTS The purpose of resident performance review at all institutions was oriented toward one of two paradigms: a problem identification model, which predominated; or a developmental model. The problem identification model, which focused on identifying and addressing performance concerns, used performance data such as red-flag alerts and informal information shared with program directors to identify struggling residents. In the developmental model, the timely acquisition and synthesis of data to inform each resident's developmental trajectory was challenging. Participants highly valued CCC members' expertise as educators to corroborate the identification of struggling residents and to enhance credibility of the committee's outcomes. Training in applying the milestones to the CCC's work was minimal. Participants were highly committed to performance review and perceived the current process as adequate for struggling residents but potentially not for others. CONCLUSIONS Institutions orient resident performance review toward problem identification; a developmental approach is uncommon. Clarifying the purpose of resident performance review and employing efficient information systems that synthesize performance data and engage residents and faculty in purposeful feedback discussions could enable the meaningful implementation of milestones-based assessment.
Affiliation(s)
- Karen E Hauer
- K.E. Hauer is professor, Department of Medicine, University of California, San Francisco, School of Medicine, San Francisco, California. B. Chesluk is clinical research associate, Evaluation, Research, and Development, American Board of Internal Medicine, Philadelphia, Pennsylvania. W. Iobst is vice president for academic and clinical affairs and vice dean, Commonwealth Medical College, Scranton, Pennsylvania. E. Holmboe is senior vice president, Accreditation Council for Graduate Medical Education, Chicago, Illinois, and adjunct professor of medicine, Yale School of Medicine, New Haven, Connecticut. R.B. Baron is professor of medicine and associate dean for graduate and continuing medical education, Division of General Internal Medicine, Department of Medicine, University of California, San Francisco, School of Medicine, San Francisco, California. C.K. Boscardin is associate professor, Department of Medicine, University of California, San Francisco, School of Medicine, San Francisco, California. O. ten Cate is professor of medical education and director, Center for Research and Development of Education, University Medical Center Utrecht, Utrecht, The Netherlands. P.S. O'Sullivan is professor of medicine and director of research and development in medical education, Office of Medical Education, University of California, San Francisco, School of Medicine, San Francisco, California
105. Kogan JR, Conforti LN, Bernabeo E, Iobst W, Holmboe E. How faculty members experience workplace-based assessment rater training: a qualitative study. Medical Education 2015;49:692-708. PMID: 26077217; DOI: 10.1111/medu.12733.
Abstract
CONTEXT Direct observation of clinical skills is a common approach in workplace-based assessment (WBA). Despite widespread use of the mini-clinical evaluation exercise (mini-CEX), faculty development efforts are typically required to improve assessment quality. Little consensus exists regarding the most effective training methods, and few studies explore faculty members' reactions to rater training. OBJECTIVES This study was conducted to qualitatively explore the experiences of faculty staff with two rater training approaches - performance dimension training (PDT) and a modified approach to frame of reference training (FoRT) - to elucidate how such faculty development can be optimally designed. METHODS In a qualitative study of a multifaceted intervention using complex intervention principles, 45 out-patient resident faculty preceptors from 26 US internal medicine residency programmes participated in a rater training faculty development programme. All participants were interviewed individually and in focus groups during and after the programme to elicit how the training influenced their approach to assessment. A constructivist grounded theory approach was used to analyse the data. RESULTS Many participants perceived that rater training positively influenced their approach to direct observation and feedback, their ability to use entrustment as the standard for assessment, and their own clinical skills. However, barriers to implementation and change included: (i) a preference for holistic assessment over frameworks; (ii) challenges in defining competence; (iii) difficulty in changing one's approach to assessment, and (iv) concerns about institutional culture and buy-in. CONCLUSIONS Rater training using PDT and a modified approach to FoRT can provide faculty staff with assessment skills that are congruent with principles of criterion-referenced assessment and entrustment, and foundational principles of competency-based education, while providing them with opportunities to reflect on their own clinical skills. However, multiple challenges to incorporating new forms of training exist. Ongoing efforts to improve WBA are needed to address institutional and cultural contexts, and systems of care delivery.
Affiliation(s)
- Jennifer R Kogan
- Department of Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Lisa N Conforti
- Milestones Development and Evaluation, Accreditation Council of Graduate Medical Education, Chicago, Illinois, USA
- Elizabeth Bernabeo
- Evaluation Research and Development, American Board of Internal Medicine, Philadelphia, Pennsylvania, USA
- William Iobst
- Academic and Clinical Affairs, Commonwealth Medical College, Scranton, Pennsylvania, USA
- Eric Holmboe
- Milestones Development and Evaluation, Accreditation Council of Graduate Medical Education, Chicago, Illinois, USA
106. Palermo C, Capra S, Beck EJ, Ash S, Jolly B, Truby H. Are dietetics educators' attitudes to assessment a barrier to expanding placement opportunities? Results of a Delphi study. Nutrition & Dietetics 2015. DOI: 10.1111/1747-0080.12205.
Affiliation(s)
- Claire Palermo
- Department of Nutrition and Dietetics, Monash University, Notting Hill, Victoria, Australia
- Sandra Capra
- School of Human Movement Studies, The University of Queensland, Brisbane, Queensland, Australia
- Eleanor J. Beck
- School of Medicine, University of Wollongong, Wollongong, New South Wales, Australia
- Susan Ash
- School of Exercise and Nutrition Sciences, Queensland University of Technology, Brisbane, Queensland, Australia
- Brian Jolly
- School of Medicine and Public Health, Faculty of Health and Medicine, University of Newcastle, Newcastle, New South Wales, Australia
- Helen Truby
- Department of Nutrition and Dietetics, Monash University, Notting Hill, Victoria, Australia
107. Donato AA, Park YS, George DL, Schwartz A, Yudkowsky R. Validity and feasibility of the Minicard direct observation tool in 1 training program. Journal of Graduate Medical Education 2015. PMID: 26221439; PMCID: PMC4512794; DOI: 10.4300/jgme-d-14-00532.1.
Abstract
BACKGROUND Availability of reliable, valid, and feasible workplace-based assessment (WBA) tools is important to allow faculty to make important and complex judgments about resident competence. The Minicard is a WBA direct observation tool designed to provide formative feedback while supporting critical competency decisions. OBJECTIVE The purpose of this study was to collect validity and feasibility evidence for use of the Minicard for formative assessment of internal medicine residents. METHODS We conducted a retrospective cohort analysis of Minicard observations from 2005-2011 in 1 institution to obtain validity evidence, including content (settings, observation rates, independent raters); response process (rating distributions across the scale and ratings by month in the program); consequences (qualitative assessment of action plans); and feasibility (time to collect observations). RESULTS Eighty faculty observers recorded 3715 observations of 73 residents in the inpatient ward (43%), clinic (39%), intensive care (15%), and emergency department (3%) settings. Internal medicine residents averaged 28 (SD=8.4) observations per year from 9 (SD=4.1) independent observers. Minicards had an average of 5 (SD=5.1) discrete recorded observations per card. Rating distributions covered the entire rating scale, and ratings increased significantly over time in training. Half of the observations included action plans with action-oriented feedback, 11% had observational feedback, 9% had minimal feedback, and 30% had no recorded plan. Observations averaged 15.6 (SD=9.5) minutes. CONCLUSIONS Validity evidence for the Minicard direct observation tool demonstrates its ability to facilitate identification of "struggling" residents and provide feedback, supporting its use for the formative assessment of internal medicine residents.
Affiliation(s)
- Anthony A. Donato
- Department of Medicine, Jefferson Medical College, West Reading, Pennsylvania, USA
108. Mitchell ML, Henderson A, Jeffrey C, Nulty D, Groves M, Kelly M, Knight S, Glover P. Application of best practice guidelines for OSCEs: an Australian evaluation of their feasibility and value. Nurse Education Today 2015;35:700-705. PMID: 25660268; DOI: 10.1016/j.nedt.2015.01.007.
Abstract
BACKGROUND Objective Structured Clinical Examinations (OSCEs) are widely used in health professional education and should be based on sound pedagogical foundations. OBJECTIVES The aim of this study was to evaluate the feasibility and utility of using Best Practice Guidelines (BPGs) within an OSCE format in a broad range of tertiary education settings with undergraduate and postgraduate nursing and midwifery students. We evaluated how feasible it was to apply the BPGs to modify OSCEs in a course, students' perceptions of the OSCE, and whether the BPG-revised OSCEs better prepared students for clinical practice compared with the original OSCEs. DESIGN A mixed-method design with surveys, focus groups and semi-structured interviews was used to evaluate the BPGs within an OSCE. SETTINGS Four maximally different contexts across four sites in Australia were used. PARTICIPANTS Participants included lecturers and undergraduate nursing students in high- and low-fidelity simulation settings, undergraduate midwifery students, and postgraduate rural and remote area nursing students. RESULTS In total, 691 students participated in revised OSCEs. Surveys were completed by 557 students; 91 students gave further feedback through focus groups and 14 lecturers participated in interviews. At all sites the BPGs were successfully used to modify and implement OSCEs. Students valued the realistic nature of the modified OSCEs, which contributed to their confidence and preparation for clinical practice. The lecturers considered that the revised OSCEs enhanced student preparedness for clinical placements. DISCUSSION AND CONCLUSIONS The BPGs have broad applicability to OSCEs in a wide range of educational contexts, with improved student outcomes. Students and lecturers identified that the revised OSCEs enhanced student preparation for clinical practice. Subsequent examination of the BPGs saw further refinement to a set of eight BPGs that provide a sequential guide to their application in a way that is consistent with best practice curriculum design principles.
Affiliation(s)
- Marion L Mitchell
- School of Nursing and Midwifery, NHMRC Centre for Research Excellence in Nursing, Centre for Health Practice Innovation, Griffith Health Institute, Griffith University and Princess Alexandra Hospital, Health Sciences (N48), 170 Kessels Rd, Nathan, Queensland 4111, Australia.
- Amanda Henderson
- Griffith University, School of Nursing and Midwifery and Princess Alexandra Hospital, Queensland Health Research, Ipswich Road, Woolloongabba, Queensland 4102, Australia.
- Carol Jeffrey
- Nurse Practice Development Unit, Princess Alexandra Hospital and School of Nursing and Midwifery, Griffith University, Ipswich Road, Woolloongabba, Queensland 4102, Australia.
- Duncan Nulty
- Griffith Institute for Educational Research, Griffith University, Mt Gravatt Campus, 170 Kessels Road, Nathan, Queensland 4111, Australia.
- Michele Groves
- Medical School, University of Queensland, St Lucia, Queensland 4072, Australia.
- Michelle Kelly
- University of Technology Sydney, PO Box 123, Broadway, NSW 2007, Australia.
- Sabina Knight
- Mount Isa Centre for Rural and Remote Health, James Cook University, PO Box 2572, Mount Isa, Queensland 4825, Australia.
- Pauline Glover
- Flinders University, GPO Box 2100, Adelaide, South Australia 5001, Australia.
109. Read EK, Bell C, Rhind S, Hecker KG. The use of global rating scales for OSCEs in veterinary medicine. PLoS One 2015;10:e0121000. PMID: 25822258; PMCID: PMC4379077; DOI: 10.1371/journal.pone.0121000.
Abstract
OSCEs (Objective Structured Clinical Examinations) are widely used in health professions to assess clinical skills competence. Raters use standardized binary checklists (CL) or multi-dimensional global rating scales (GRS) to score candidates performing specific tasks. This study assessed the reliability of CL and GRS scores in the assessment of veterinary students, and is the first study to demonstrate the reliability of GRS within veterinary medical education. Twelve raters from two different schools (6 from the University of Calgary [UCVM] and 6 from the Royal (Dick) School of Veterinary Studies [R(D)SVS]) were asked to score 12 students (6 from each school). All raters assessed all students (video recordings) during 4 OSCE stations (bovine haltering, gowning and gloving, equine bandaging and skin suturing). Raters scored students using a CL, followed by the GRS. Novice raters (6 R(D)SVS) were assessed independently of expert raters (6 UCVM). Generalizability theory (G theory), analysis of variance (ANOVA) and t-tests were used to determine the reliability of rater scores, assess any between-school differences (by student, by rater), and determine whether there were differences between CL and GRS scores. There was no significant difference in rater performance with use of the CL or the GRS. Scores from the CL were significantly higher than scores from the GRS. The reliability of checklist scores was .42 for novice and .76 for expert raters; the reliability of global rating scale scores was .70 for novice and .86 for expert raters. A decision study (D-study) showed that, once raters were trained using the CL, the GRS could be used to reliably score clinical skills in veterinary medicine with both novice and experienced raters.
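One way to read the novice/expert gap is through a projection: given the reliability observed for the current set of observations, how many times more observations would each rater type need to reach a 0.80 standard? The sketch below uses the Spearman-Brown prophecy formula as a rough stand-in for the paper's full D-study; the multipliers it prints are illustrative, not the authors' figures.

```python
# Spearman-Brown projection from an observed reliability to (a) the
# reliability if the number of observations doubled and (b) the multiple
# of observations needed to reach a 0.80 target. A rough stand-in for a
# full D-study, not the authors' analysis.

def projected_reliability(r_current, n):
    """Reliability after multiplying the number of observations by n."""
    return n * r_current / (1 + (n - 1) * r_current)

def n_for_target(r_current, target=0.80):
    """Multiple of current observations needed for the target reliability."""
    return target * (1 - r_current) / (r_current * (1 - target))

for label, r in [("novice CL", 0.42), ("expert CL", 0.76),
                 ("novice GRS", 0.70), ("expert GRS", 0.86)]:
    print(f"{label}: r = {r:.2f}, doubled -> {projected_reliability(r, 2):.2f}, "
          f"~{n_for_target(r):.1f}x observations for 0.80")
```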
Affiliation(s)
- Emma K. Read
- Department of Veterinary Clinical and Diagnostic Sciences, University of Calgary Faculty of Veterinary Medicine, Calgary, Alberta, Canada
- Catriona Bell
- Royal (Dick) School of Veterinary Studies, University of Edinburgh, Roslin, Midlothian, Scotland
- Susan Rhind
- Royal (Dick) School of Veterinary Studies, University of Edinburgh, Roslin, Midlothian, Scotland
- Kent G. Hecker
- Department of Veterinary Clinical and Diagnostic Sciences, University of Calgary Faculty of Veterinary Medicine, Calgary, Alberta, Canada
110. Gingerich A, Kogan J, Yeates P, Govaerts M, Holmboe E. Seeing the 'black box' differently: assessor cognition from three research perspectives. Medical Education 2014;48:1055-1068. PMID: 25307633; DOI: 10.1111/medu.12546.
Abstract
CONTEXT Performance assessments, such as workplace-based assessments (WBAs), represent a crucial component of assessment strategy in medical education. Persistent concerns about rater variability in performance assessments have resulted in a new field of study focusing on the cognitive processes used by raters, or more inclusively, by assessors. METHODS An international group of researchers met regularly to share and critique key findings in assessor cognition research. Through iterative discussions, they identified the prevailing approaches to assessor cognition research and noted that each of them was based on a nearly disparate theoretical framework and literature. This paper aims to provide a conceptual review of the different perspectives used by researchers in this field using the specific example of WBA. RESULTS Three distinct, but not mutually exclusive, perspectives on the origins and possible solutions to variability in assessment judgements emerged from the discussions within the group of researchers: (i) the assessor as trainable: assessors vary because they do not apply assessment criteria correctly, use varied frames of reference and make unjustified inferences; (ii) the assessor as fallible: variations arise as a result of fundamental limitations in human cognition that mean assessors are readily and haphazardly influenced by their immediate context, and (iii) the assessor as meaningfully idiosyncratic: experts are capable of making sense of highly complex and nuanced scenarios through inference and contextual sensitivity, which suggests assessor differences may represent legitimate experience-based interpretations. CONCLUSIONS Although each of the perspectives discussed in this paper advances our understanding of assessor cognition and its impact on WBA, every perspective has its limitations. Following a discussion of areas of concordance and discordance across the perspectives, we propose a coexistent view in which researchers and practitioners utilise aspects of all three perspectives with the goal of advancing assessment quality and ultimately improving patient care.
Affiliation(s)
- Andrea Gingerich
- Northern Medical Program, University of Northern British Columbia, Prince George, British Columbia, Canada
111. Parry-Smith W, Mahmud A, Landau A, Hayes K. Workplace-based assessment: a new approach to existing tools. The Obstetrician & Gynaecologist 2014. DOI: 10.1111/tog.12133.
Affiliation(s)
- Ayesha Mahmud
- University of Birmingham, Birmingham Women's NHS Foundation Trust, Birmingham B15 2TG, UK
112. Watling CJ. Cognition, culture, and credibility: deconstructing feedback in medical education. Perspectives on Medical Education 2014;3:124-128.
Abstract
Feedback should be a key support for optimizing on-the-job learning in clinical medicine. Often, however, feedback fails to live up to its potential to productively direct and shape learning. In this article, two key influences on how and why feedback becomes meaningful are examined: the individual learner's perception of and response to feedback and the learning culture within which feedback is exchanged. Feedback must compete for learners' attention with a range of other learning cues that are available in clinical settings and must survive a learner's judgment of its credibility in order to become influential. These judgments, in turn, occur within a specific context--a distinct learning culture--that both shapes learners' definitions of credibility and facilitates or constrains the exchange of good feedback. By highlighting these important blind spots in the process by which feedback becomes meaningful, concrete and necessary steps toward a robust feedback culture within medical education are revealed.
113. Palermo C, Chung A, Beck EJ, Ash S, Capra S, Truby H, Jolly B. Evaluation of assessment in the context of work-based learning: qualitative perspectives of new graduates. Nutrition & Dietetics 2014. DOI: 10.1111/1747-0080.12126.
Affiliation(s)
- Claire Palermo
- Department of Nutrition and Dietetics, Monash University, Notting Hill, Victoria, Australia
- Alexandra Chung
- Department of Nutrition and Dietetics, Monash University, Notting Hill, Victoria, Australia
- Eleanor J. Beck
- School of Health Sciences, University of Wollongong, Wollongong, New South Wales, Australia
- Susan Ash
- School of Exercise and Nutrition Science, Queensland University of Technology, Brisbane, Queensland, Australia
- Sandra Capra
- Centre for Dietetics Research, School of Human Movement Studies, University of Queensland, Brisbane, Queensland, Australia
- Helen Truby
- Department of Nutrition and Dietetics, Monash University, Notting Hill, Victoria, Australia
- Brian Jolly
- School of Medicine & Public Health, Faculty of Health and Medicine, University of Newcastle, Callaghan, New South Wales, Australia
114. Mortsiefer A, Immecke J, Rotthoff T, Karger A, Schmelzer R, Raski B, Schmitten JID, Altiner A, Pentzek M. Summative assessment of undergraduates' communication competence in challenging doctor-patient encounters: evaluation of the Düsseldorf CoMeD-OSCE. Patient Education and Counseling 2014;95:348-355. PMID: 24637164; DOI: 10.1016/j.pec.2014.02.009.
Abstract
OBJECTIVE To evaluate the summative assessment (OSCE) of a communication training programme for dealing with challenging doctor-patient encounters in the 4th study year. METHODS Our OSCE consists of 4 stations (breaking bad news, guilt and shame, aggressive patients, shared decision making), using a 4-item global rating (GR) instrument. We calculated reliability coefficients for different levels, discriminability of single items and interrater reliability. Validity was estimated by gender differences and accordance between the GR and a checklist. RESULTS In a pooled sample of 456 students in 3 OSCEs over 3 terms, total reliability was α=0.64, reliability coefficients for single stations were >0.80, and discriminability in 3 of 4 stations was within the range of 0.4-0.7. Except for one station, interrater reliability was moderate to strong. Reliability on item level was poor and pointed to some problems with the use of the GR. CONCLUSION The application of the GR in regular undergraduate medical education shows moderate reliability, which needs improvement, and some evidence of validity. Ongoing development and evaluation are needed, with particular regard to the training of the examiners. PRACTICE IMPLICATIONS Our CoMeD-OSCE proved suitable for the summative assessment of communication skills in challenging doctor-patient encounters.
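For readers unfamiliar with the coefficient behind the reported α=0.64, the sketch below computes Cronbach's alpha from a students-by-stations score matrix, treating stations as items. The 6 x 4 demo matrix is invented; in the study each row would hold one student's four station scores.

```python
# Cronbach's alpha for an OSCE total score: k stations treated as items.
# The demo matrix (6 students x 4 stations) is invented for illustration.
import numpy as np

def cronbach_alpha(scores):
    """scores: (n_students, n_stations) array of station scores."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    sum_item_vars = scores.var(axis=0, ddof=1).sum()  # per-station variances
    total_var = scores.sum(axis=1).var(ddof=1)        # variance of totals
    return k / (k - 1) * (1 - sum_item_vars / total_var)

demo = [[3, 4, 2, 4], [4, 4, 3, 5], [2, 3, 2, 3],
        [5, 4, 4, 5], [3, 2, 3, 4], [4, 5, 3, 4]]
print(f"alpha = {cronbach_alpha(demo):.2f}")
```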
Affiliation(s)
- Achim Mortsiefer
- Institute of General Practice, Medical Faculty of the Heinrich-Heine-University Düsseldorf, Düsseldorf 40225, Germany.
- Janine Immecke
- Institute of General Practice, Medical Faculty of the Heinrich-Heine-University Düsseldorf, Düsseldorf 40225, Germany
- Thomas Rotthoff
- Deanery of Study and Department for Endocrinology and Diabetes, Medical Faculty of the Heinrich-Heine-University Düsseldorf, Düsseldorf 40225, Germany
- André Karger
- Clinical Institute of Psychosomatic Medicine and Psychotherapy, Medical Faculty of the Heinrich-Heine-University Düsseldorf, Düsseldorf 40225, Germany
- Regine Schmelzer
- Clinical Institute of Psychosomatic Medicine and Psychotherapy, Medical Faculty of the Heinrich-Heine-University Düsseldorf, Düsseldorf 40225, Germany
- Bianca Raski
- Clinical Institute of Psychosomatic Medicine and Psychotherapy, Medical Faculty of the Heinrich-Heine-University Düsseldorf, Düsseldorf 40225, Germany
- Jürgen In der Schmitten
- Institute of General Practice, Medical Faculty of the Heinrich-Heine-University Düsseldorf, Düsseldorf 40225, Germany
- Attila Altiner
- Institute of General Practice, Medical Faculty of the University of Rostock, Rostock 18057, Germany
- Michael Pentzek
- Institute of General Practice, Medical Faculty of the Heinrich-Heine-University Düsseldorf, Düsseldorf 40225, Germany
115. Kogan JR, Conforti LN, Iobst WF, Holmboe ES. Reconceptualizing variable rater assessments as both an educational and clinical care problem. Academic Medicine 2014;89:721-727. PMID: 24667513; DOI: 10.1097/acm.0000000000000221.
Abstract
The public is calling for the U.S. health care and medical education system to be accountable for ensuring high-quality, safe, effective, patient-centered care. As medical education shifts to a competency-based training paradigm, clinician educators' assessment of and feedback to trainees about their developing clinical skills becomes paramount. However, there is substantial variability in the accuracy, reliability, and validity of the assessments faculty make when they directly observe trainees with patients. These difficulties have been treated primarily as a rater cognition problem focusing on the inability of the assessor to make reliable and valid assessments of the trainee. The authors' purpose is to reconceptualize the rater cognition problem as both an educational and clinical care problem. The variable quality of faculty assessments is not just a psychometric predicament but also an issue that has implications for decisions regarding trainee supervision and the delivery of quality patient care. The authors suggest that the frame of reference for rating performance during workplace-based assessments be the ability to provide safe, effective, patient-centered care. The authors developed the Accountable Assessment for Quality Care and Supervision equation to remind faculty that supervision is a dynamic, complex process essential for patients to receive high-quality care. This fundamental shift in how assessment is conceptualized requires new models of faculty development and emphasizes the essential and irreplaceable importance of the clinician educator in trainee assessment.
Affiliation(s)
- Jennifer R Kogan
- Dr. Kogan is associate professor, Department of Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania. Ms. Conforti is research associate for academic programs, American Board of Internal Medicine, Philadelphia, Pennsylvania. Dr. Iobst is vice president of academic affairs and clinical affairs and vice dean, Commonwealth Medical College, Scranton, Pennsylvania. During the preparation of this article, he was vice president of academic affairs, American Board of Internal Medicine, Philadelphia, Pennsylvania. Dr. Holmboe is senior vice president of milestones development and evaluation, Accreditation Council for Graduate Medical Education, Chicago, Illinois. During the preparation of this article, he was chief medical officer, American Board of Internal Medicine, Philadelphia, Pennsylvania
116. Donato AA. Direct observation of residents: a model for an assessment system. American Journal of Medicine 2014;127:455-460. PMID: 24491387; DOI: 10.1016/j.amjmed.2014.01.016.
117. Weller JM, Misur M, Nicolson S, Morris J, Ure S, Crossley J, Jolly B. Can I leave the theatre? A key to more reliable workplace-based assessment. British Journal of Anaesthesia 2014;112:1083-1091. PMID: 24638231; DOI: 10.1093/bja/aeu052.
Abstract
BACKGROUND The value of workplace-based assessments such as the mini-clinical evaluation exercise (mini-CEX), and clinicians' confidence and engagement in the process, has been constrained by low reliability and limited capacity to identify underperforming trainees. We proposed that changing the way supervisors make judgements about trainees would improve score reliability and identification of underperformers. Anaesthetists regularly make decisions about the level of trainee independence with a case, based on how closely they need to supervise them. We therefore used this as the basis for a new scoring system. METHODS We analysed 338 mini-CEXs where supervisors scored trainees using the conventional system, and also scored trainee independence, based on the need for direct, or more distant, supervision. As supervisory requirements depend on case difficulty, we then compared the actual trainee independence score and the expected trainee independence score obtained externally. RESULTS Compared with the conventional scoring system used in previous studies, reliability was very substantially improved using a system based on a trainee's level of independence with a case. Reliability improved further when this score was corrected for case difficulty. Furthermore, the new scoring system overcame the previously identified problem of assessor leniency and identified a number of trainees performing below expectations. CONCLUSIONS Supervisors' judgements on trainee independence with a case, based on the need for direct or more distant supervision, can generate reliable scores of trainee ability without the need for an onerous number of assessments, identify trainees performing below expectations, and track trainee progress towards independent specialist practice.
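The core move in the new scoring system can be sketched as a difficulty-adjusted score: the trainee's credit is the gap between the independence they actually showed and the independence expected for a case of that difficulty. Everything below (scale anchors, expected values, observations) is invented to illustrate the idea, not the authors' instrument.

```python
# Sketch of a difficulty-corrected independence score: compare the level
# of independence a trainee actually showed with the level expected for a
# case of that difficulty. All values are hypothetical.

# 1 = needs constant direct supervision ... 4 = distant supervision only
EXPECTED_INDEPENDENCE = {"easy": 3.5, "moderate": 3.0, "hard": 2.2}

def corrected_score(actual, difficulty):
    """Positive = more independent than expected for this case."""
    return actual - EXPECTED_INDEPENDENCE[difficulty]

observations = [(3.0, "hard"), (2.5, "moderate"), (4.0, "easy")]
scores = [corrected_score(a, d) for a, d in observations]
print(f"mean corrected score: {sum(scores) / len(scores):+.2f}")  # +0.27
```

Correcting for expected difficulty in this way is what lets leniency drop out: a supervisor who hovers over every trainee still produces informative gaps between observed and expected independence.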
Affiliation(s)
- J M Weller
- Centre for Medical and Health Sciences Education, Faculty of Medical and Health Sciences, University of Auckland, 2 Park Rd, Grafton, Auckland 1010, New Zealand; Department of Anaesthesia, Auckland City Hospital, 2 Park Rd, Grafton, Auckland 1010, New Zealand
- M Misur
- Department of Anaesthesia, Auckland City Hospital, 2 Park Rd, Grafton, Auckland 1010, New Zealand
- S Nicolson
- Department of Anaesthesia, Auckland City Hospital, 2 Park Rd, Grafton, Auckland 1010, New Zealand
- J Morris
- Department of Anaesthesia, Royal Melbourne Hospital, Grattan Street, Parkville, VIC 3052, Australia
- S Ure
- Department of Anaesthesia, Wellington Hospital, Riddiford Street, Newtown, Wellington 6021, New Zealand
- J Crossley
- Academic Unit of Medical Education, University of Sheffield, 85 Wilkinson Street, Sheffield S10 2GJ, UK
- B Jolly
- University of Newcastle, University Drive, Callaghan, Newcastle, NSW 2308, Australia
118. Olupeliyawa AM, O'Sullivan AJ, Hughes C, Balasooriya CD. The Teamwork Mini-Clinical Evaluation Exercise (T-MEX): a workplace-based assessment focusing on collaborative competencies in health care. Academic Medicine 2014;89:359-365. PMID: 24362380; DOI: 10.1097/acm.0000000000000115.
Abstract
PURPOSE Teamwork is an important and challenging area of learning during the transition from medical graduate to intern. This preliminary investigation examined the psychometric and logistic properties of the Teamwork Mini-Clinical Evaluation Exercise (T-MEX) for the workplace-based assessment of key competencies in working with health care teams. METHOD The authors designed the T-MEX for direct observation and assessment of six collaborative behaviors in seven clinical situations important for teamwork, feedback, and reflection. In 2010, they tested it on University of New South Wales senior medical students during their last six-week clinical term to investigate its overall utility, including validity and reliability. Assessors rated students in different situations on the extent to which they met expectations for interns for each collaborative behavior. Both assessors and students rated the tool's usefulness and feasibility. RESULTS Assessment forms for 88 observed encounters were submitted by 25 students. The T-MEX was suited to a broad range of collaborative clinical practice situations, as evidenced by the encounter types and the behaviors assessed by health care team members. The internal structure of the behavior ratings indicated construct validity. A generalizability study found that eight encounters were adequate for high-stakes measurement purposes. The mean times for observation and feedback and the participants' perceptions suggested usefulness for feedback and feasibility in busy clinical settings. CONCLUSIONS Findings suggest that the T-MEX has good utility for assessing trainee competence in working with health care teams. It fills a gap within the suite of existing tools for workplace-based assessment of professional attributes.
Affiliation(s)
- Asela M Olupeliyawa
- Dr. Olupeliyawa is lecturer, Medical Education Development and Research Centre, Faculty of Medicine, University of Colombo, Colombo, Sri Lanka. At the time of writing, he was a doctoral candidate, School of Public Health and Community Medicine, UNSW Medicine, University of New South Wales, Sydney, Australia. Dr. O'Sullivan is program authority, UNSW Medicine, and associate professor, Department of Medicine, St. George Clinical School, UNSW Medicine, University of New South Wales, Sydney, Australia. Dr. Hughes is associate professor, Rural Clinical School, UNSW Medicine, University of New South Wales, Sydney, Australia. Dr. Balasooriya is director, Medical Education Development, and senior lecturer, School of Public Health and Community Medicine, UNSW Medicine, University of New South Wales, Sydney, Australia
119. Southgate L, van der Vleuten CPM. A conversation about the role of medical regulators. Medical Education 2014;48:215-218. PMID: 24528403; DOI: 10.1111/medu.12309.
120. Govaerts M, van der Vleuten CPM. Validity in work-based assessment: expanding our horizons. Medical Education 2013;47:1164-1174. PMID: 24206150; DOI: 10.1111/medu.12289.
Abstract
CONTEXT Although work-based assessments (WBA) may come closest to assessing habitual performance, their use for summative purposes is not undisputed. Most criticism of WBA stems from approaches to validity consistent with the quantitative psychometric framework. However, there is increasing research evidence that indicates that the assumptions underlying the predictive, deterministic framework of psychometrics may no longer hold. In this discussion paper we argue that meaningfulness and appropriateness of current validity evidence can be called into question and that we need alternative strategies to assessment and validity inquiry that build on current theories of learning and performance in complex and dynamic workplace settings. METHODS Drawing from research in various professional fields we outline key issues within the mechanisms of learning, competence and performance in the context of complex social environments and illustrate their relevance to WBA. In reviewing recent socio-cultural learning theory and research on performance and performance interpretations in work settings, we demonstrate that learning, competence (as inferred from performance) as well as performance interpretations are to be seen as inherently contextualised, and can only be understood 'in situ'. Assessment in the context of work settings may, therefore, be more usefully viewed as a socially situated interpretive act. DISCUSSION We propose constructivist-interpretivist approaches towards WBA in order to capture and understand contextualised learning and performance in work settings. Theoretical assumptions underlying interpretivist assessment approaches call for a validity theory that provides the theoretical framework and conceptual tools to guide the validation process in the qualitative assessment inquiry. Basic principles of rigour specific to qualitative research have been established, and they can and should be used to determine validity in interpretivist assessment approaches. If used properly, these strategies generate trustworthy evidence that is needed to develop the validity argument in WBA, allowing for in-depth and meaningful information about professional competence.
Affiliation(s)
- Marjan Govaerts
- Educational Development and Research, Maastricht University, Maastricht, the Netherlands
121. Moonen-van Loon JMW, Overeem K, Donkers HHLM, van der Vleuten CPM, Driessen EW. Composite reliability of a workplace-based assessment toolbox for postgraduate medical education. Advances in Health Sciences Education 2013;18:1087-1102. PMID: 23494202; DOI: 10.1007/s10459-013-9450-z.
Abstract
In recent years, postgraduate assessment programmes around the world have embraced workplace-based assessment (WBA) and its related tools. Despite their widespread use, results of studies on the validity and reliability of these tools have been variable. Although in many countries decisions about residents' continuation of training and certification as a specialist are based on the composite results of different WBAs collected in a portfolio, to our knowledge, the reliability of such a WBA toolbox has never been investigated. Using generalisability theory, we analysed the separate and composite reliability of three WBA tools [mini-Clinical Evaluation Exercise (mini-CEX), direct observation of procedural skills (DOPS), and multisource feedback (MSF)] included in a resident portfolio. G-studies and D-studies of 12,779 WBAs from a total of 953 residents showed that a reliability coefficient of 0.80 was obtained for eight mini-CEXs, nine DOPS, and nine MSF rounds, whilst the same reliability was found for seven mini-CEXs, eight DOPS, and one MSF when combined in a portfolio. At the end of the first year of residency a portfolio with five mini-CEXs, six DOPS, and one MSF afforded reliable judgement. The results support the conclusion that several WBA tools combined in a portfolio can be a feasible and reliable method for high-stakes judgements.
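The idea of a composite reliability across instruments can be written down compactly under strong simplifying assumptions: each tool taps the same person (true-score) variance, errors are uncorrelated across tools, and the portfolio score is a weighted mean. The error variances below are invented (only the portfolio mix of five mini-CEXs, six DOPS, and one MSF round follows the abstract); the authors' actual estimates come from their G-studies and D-studies.

```python
# Composite reliability of a portfolio under simplifying assumptions:
# common true-score variance, uncorrelated errors, weights summing to 1.
# Error variances are hypothetical; the portfolio mix echoes the abstract.

def composite_reliability(var_person, tools):
    """tools: (weight, error variance per observation, n observations)."""
    error = sum(w**2 * var_e / n for w, var_e, n in tools)
    return var_person / (var_person + error)

portfolio = [
    (1/3, 0.60, 5),  # mini-CEX: 5 forms
    (1/3, 0.70, 6),  # DOPS: 6 forms
    (1/3, 0.30, 1),  # MSF: 1 round (already pools many respondents)
]
print(f"portfolio reliability ~ {composite_reliability(0.25, portfolio):.2f}")
```

On these invented components the mixed portfolio reaches roughly 0.81, illustrating why a combination of tools can be reliable even when no single tool, at a feasible number of forms, would be.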
Affiliation(s)
- J M W Moonen-van Loon
- Department of Educational Research and Development, Faculty of Health, Medicine, and Life Sciences, Maastricht University, P.O. Box 616, 6200 MD, Maastricht, The Netherlands
122. van der Lee N, Fokkema JPI, Westerman M, Driessen EW, van der Vleuten CPM, Scherpbier AJJA, Scheele F. The CanMEDS framework: relevant but not quite the whole story. Medical Teacher 2013;35:949-955. PMID: 24003989; DOI: 10.3109/0142159x.2013.827329.
Abstract
BACKGROUND Despite acknowledgement that the Canadian Medical Educational Directives for Specialists (CanMEDS) framework covers the relevant competencies of physicians, many educators and medical professionals struggle to translate the CanMEDS roles into comprehensive training programmes for specific specialties. AIM To gain insight into the applicability of the CanMEDS framework to guide the design of educational programmes for specific specialties by exploring stakeholders' perceptions of specialty specific competencies and examining differences between those competencies and the CanMEDS framework. METHODS This case study is a sequel to a study among ObsGyn specialists. It explores the perspectives of patients, midwives, nurses, general practitioners, and hospital boards on gynaecological competencies and compares these with the CanMEDS framework. RESULTS Clinical expertise, reflective practice, collaboration, a holistic view, and involvement in practice management were perceived to be important competencies for gynaecological practice. Although all the competencies were covered by the CanMEDS framework, there were some mismatches between stakeholders' perceptions of the importance of some competencies and their position in the framework. CONCLUSION The CanMEDS framework appears to offer relevant building blocks for specialty specific postgraduate training, which should be combined with the results of an exploration of specialty specific competencies to arrive at a postgraduate curriculum that is in alignment with professional practice.
123. Palermo C, Beck EJ, Chung A, Ash S, Capra S, Truby H, Jolly B. Work-based assessment: qualitative perspectives of novice nutrition and dietetics educators. Journal of Human Nutrition and Dietetics 2013;27:513-521. DOI: 10.1111/jhn.12174.
Affiliation(s)
- C. Palermo
- Department of Nutrition and Dietetics, Monash University, Notting Hill, VIC, Australia
- E. J. Beck
- University of Wollongong, Wollongong, NSW, Australia
- A. Chung
- Department of Nutrition and Dietetics, Monash University, Notting Hill, VIC, Australia
- S. Ash
- Queensland University of Technology, Brisbane, QLD, Australia
- S. Capra
- The University of Queensland, Brisbane, QLD, Australia
- H. Truby
- Department of Nutrition and Dietetics, Monash University, Notting Hill, VIC, Australia
- B. Jolly
- The University of Newcastle, Newcastle, NSW, Australia
124. Homer M, Setna Z, Jha V, Higham J, Roberts T, Boursicot K. Estimating and comparing the reliability of a suite of workplace-based assessments: an obstetrics and gynaecology setting. Medical Teacher 2013;35:684-691. PMID: 23782043; DOI: 10.3109/0142159x.2013.801548.
Abstract
This paper reports on a study that compares estimates of the reliability of a suite of workplace based assessment forms as employed to formatively assess the progress of trainee obstetricians and gynaecologists. The use of such forms of assessment is growing nationally and internationally in many specialties, but there is little research evidence on comparisons by procedure/competency and form-type across an entire specialty. Generalisability theory combined with a multilevel modelling approach is used to estimate variance components, G-coefficients and standard errors of measurement across 13 procedures and three form-types (mini-CEX, OSATS and CbD). The main finding is that there are wide variations in the estimates of reliability across the forms, and that therefore the guidance on assessment within the specialty does not always allow for enough forms per trainee to ensure that the levels of reliability of the process is adequate. There is, however, little evidence that reliability varies systematically by form-type. Methodologically, the problems of accurately estimating reliability in these contexts through the calculation of variance components and, crucially, their associated standard errors are considered. The importance of the use of appropriate methods in such calculations is emphasised, and the unavoidable limitations of research in naturalistic settings are discussed.
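To make the paper's quantities concrete: from estimated variance components one derives both a G-coefficient and a standard error of measurement (SEM) for a trainee's mean score over n forms, and the "how many forms are enough" question is read off the resulting curve. The components below are invented for illustration; the paper estimates the real ones per procedure and form-type with multilevel models.

```python
# From variance components to G-coefficient and SEM for a mean over n
# forms. Components are hypothetical; the paper estimates real ones per
# procedure and form-type.
import math

VAR_TRAINEE, VAR_ERROR = 0.20, 0.80  # illustrative components

for n_forms in (1, 5, 10, 20):
    error = VAR_ERROR / n_forms          # error variance of the mean score
    g = VAR_TRAINEE / (VAR_TRAINEE + error)
    sem = math.sqrt(error)
    print(f"{n_forms:2d} forms: G = {g:.2f}, SEM = {sem:.2f}")
```

On these numbers a trainee needs around 20 forms before G reaches 0.80, which illustrates the paper's point that guidance allowing only a handful of forms per procedure can leave reliability inadequate.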
Affiliation(s)
- Matt Homer, School of Medicine, University of Leeds, Leeds, LS2 9JT, UK
|
125
|
Driessen E, Scheele F. What is wrong with assessment in postgraduate training? Lessons from clinical practice and educational research. MEDICAL TEACHER 2013; 35:569-74. [PMID: 23701250 DOI: 10.3109/0142159x.2013.798403] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/15/2023]
Abstract
Workplace-based assessment is more commonly given a lukewarm than a warm welcome by its prospective users. In this article, we summarise the workplace-based assessment literature, as well as our own experiences with workplace-based assessment, to derive lessons that can facilitate its acceptance in postgraduate specialty training. We propose shifting the emphasis in workplace-based assessment from the assessment of trainee performance to trainee learning. Workplace-based assessment should focus on supporting supervisors in taking entrustment decisions by complementing their "gut feeling" with information from assessments, and focus less on assessment and testability. One of the most stubborn problems with workplace-based assessment is the absence of observation of trainees and the lack of feedback based on observation. Non-standardised observations are used to organise feedback. For these assessments to be meaningful for learning, it is essential that they are not perceived as summative by their users, that they provide narrative feedback for the learner, and that some form of facilitation helps trainees integrate the feedback into their self-assessments.
|
126
|
Oldfield Z, Beasley SW, Smith J, Anthony A, Watt A. Correlation of selection scores with subsequent assessment scores during surgical training. ANZ J Surg 2013; 83:412-6. [PMID: 23647783 DOI: 10.1111/ans.12176] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 01/06/2013] [Indexed: 11/28/2022]
Abstract
BACKGROUND Determining admission criteria to select the candidates most likely to succeed in surgical training in Australia and New Zealand has been an imprecise art, with little empirical evidence informing decisions. Selection to the Royal Australasian College of Surgeons' Surgical Education and Training programme is based entirely on applicants' performance in a structured curriculum vitae (CV), referees' reports and interviews. This retrospective review compared General Surgery (GS) trainees' performance in selection with their subsequent performance in assessments during training. METHODS Data from three cohorts of GS trainees were sourced. Scores for four selection items were compared with scores from six training assessments. Interrelationships within each of the sets of selection and assessment variables were determined. RESULTS A single significant relationship was found between scores on the three selection tools. High scores in the CV did not correlate with higher scores in any subsequent assessment. The structured referee report score, multi-station interview score and total selection score all correlated with performance in subsequent work-based assessments and examinations. Direct observation of procedural skills (DOPS) scores appear to reflect increasing acquisition of operative skills. Performance in mini clinical evaluation exercises (mini-CEX) was variable, perhaps reflecting limitations of this assessment. Candidates who performed well in one examination tended to perform well in all three examinations. CONCLUSIONS No selection tool demonstrated strong relationships with scores in all subsequent assessments; however, referee reports, multi-station interviews and total selection scores are indicators of performance in particular assessments. This may engender confidence that candidates admitted into the GS training programme are likely to progress successfully through it.
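For readers wanting to reproduce this kind of analysis, a minimal sketch of a selection-versus-assessment correlation follows, using invented scores rather than the trainee data reported in the study:

    import numpy as np
    from scipy import stats

    # Invented selection and work-based assessment scores (not the study's data)
    selection = np.array([62, 71, 55, 80, 68, 74, 59, 77])
    assessment = np.array([3.1, 3.8, 2.9, 4.2, 3.5, 3.9, 3.0, 4.0])

    # Pearson's r assumes a linear relationship; Spearman's rho, being
    # rank-based, is often safer for bounded rating scales
    r, p_r = stats.pearsonr(selection, assessment)
    rho, p_rho = stats.spearmanr(selection, assessment)
    print(f"Pearson r = {r:.2f} (p = {p_r:.3f}); "
          f"Spearman rho = {rho:.2f} (p = {p_rho:.3f})")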
Affiliation(s)
- Zaita Oldfield, Education Development and Research Department, Royal Australasian College of Surgeons, Melbourne, Victoria, Australia
|
127
|
Tavares W, Eva KW. Exploring the impact of mental workload on rater-based assessments. ADVANCES IN HEALTH SCIENCES EDUCATION : THEORY AND PRACTICE 2013; 18:291-303. [PMID: 22484964 DOI: 10.1007/s10459-012-9370-3] [Citation(s) in RCA: 62] [Impact Index Per Article: 5.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/09/2011] [Accepted: 03/26/2012] [Indexed: 05/14/2023]
Abstract
When appraising the performance of others, assessors must acquire relevant information and process it in a meaningful way in order to translate it effectively into ratings, comments, or judgments about how well the performance meets appropriate standards. Rater-based assessment strategies in health professional education, including the scale and faculty development strategies aimed at improving them, have generally been implemented with limited consideration of human cognitive and perceptual limitations. However, the extent to which the task assigned to raters aligns with their cognitive and perceptual capacities will determine the extent to which reliance on human judgment threatens assessment quality. It is well recognized in medical decision making that, as the amount of information to be processed increases, judges may engage in mental shortcuts: applying schemas or heuristics, or adopting solutions that satisfy rather than optimize the judge's needs. Further, these shortcuts may fundamentally limit or bias the information perceived or processed. Thinking of the challenges inherent in rater-based assessments in an analogous way may yield novel insights regarding the limits of rater-based assessment and may point to a greater understanding of ways in which raters can be supported to facilitate sound judgment. This paper presents an initial exploration of various cognitive and perceptual limitations associated with rater-based assessment tasks. We hope to highlight how the inherent cognitive architecture of raters might beneficially be taken into account when designing rater-based assessment protocols.
Affiliation(s)
- Walter Tavares, School of Community and Health Studies, Centennial College, Station A, P.O. Box 631, Toronto, ON, M1K 5E9, Canada
|
128
|
Kirton JA, Palmer NOA, Grieveson B, Balmer MC. A national evaluation of workplace-based assessment tools (WPBAs) in foundation dental training: a UK study. Effective and useful but do they provide an equitable training experience? Br Dent J 2013; 214:305-9. [DOI: 10.1038/sj.bdj.2013.302] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 11/21/2012] [Indexed: 11/09/2022]
|
129
|
Dawson SD, Miller T, Goddard SF, Miller LM. Impact of outcome-based assessment on student learning and faculty instructional practices. JOURNAL OF VETERINARY MEDICAL EDUCATION 2013; 40:128-138. [PMID: 23709109 DOI: 10.3138/jvme.1112-100r] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/02/2023]
Abstract
Increased accountability has been a catalyst for the reformation of curriculum and assessment practices in postsecondary schools throughout North America, including veterinary schools. There is a call for a shift in assessment practices in clinical rotations, from a focus on content to a focus on assessing student performance. Learning is subsequently articulated in terms of observable outcomes and indicators that describe what the learner can do after engaging in a learning experience. The purpose of this study was to examine the ways in which a competency-based program in an early phase of implementation affected student learning and faculty instructional practices. Findings revealed that negative student perceptions of the assessment instrument's reliability had a detrimental effect on the face validity of the instrument and, subsequently, on students' engagement with competency-based assessment and on the promotion of student-centered learning. While the examination of faculty practices echoed findings from other studies that cited the need for faculty development to improve rater reliability and for a better data management system, our study found that faculty members' instructional practices improved through the alignment of instruction and curriculum. This snapshot of the early stages of implementing a competency-based program has been instrumental in refining and advancing the program.
Affiliation(s)
- Susan D Dawson, Department of Biomedical Sciences, Atlantic Veterinary College, University of Prince Edward Island, Charlottetown, PE, Canada
|
130
|
Kogan JR, Holmboe E. Realizing the promise and importance of performance-based assessment. TEACHING AND LEARNING IN MEDICINE 2013; 25 Suppl 1:S68-74. [PMID: 24246110 DOI: 10.1080/10401334.2013.842912] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/09/2023]
Abstract
Work-based assessment (WBA) is the assessment of the day-to-day competencies and practices of trainees and physicians, across the educational continuum, in authentic clinical environments. What distinguishes WBA from other assessment modalities is that it enables the evaluation of performance in context. In this perspective, we describe the growing importance, relevance, and evolution of WBA as it relates to competency-based medical education, supervision, and entrustment. Although a systematic review is beyond the purview of this perspective, we highlight specific methods and needed shifts in WBA that (a) consider patient outcomes, (b) use nonphysician assessors, and (c) assess the care provided to populations of patients. We briefly describe strategies for the effective implementation of WBA and identify outstanding research questions related to its use.
Affiliation(s)
- Jennifer R Kogan, Division of General Internal Medicine, Raymond and Ruth Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania, USA
|
131
|
Tavares W, Boet S, Theriault R, Mallette T, Eva KW. Global rating scale for the assessment of paramedic clinical competence. PREHOSP EMERG CARE 2012; 17:57-67. [PMID: 22834959 DOI: 10.3109/10903127.2012.702194] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022]
Abstract
OBJECTIVE The aim of this study was to develop and critically appraise a global rating scale (GRS) for the assessment of individual paramedic clinical competence at the entry-to-practice level. METHODS The development phase of this study involved task analysis by experts, contributions from a focus group, and a modified Delphi process using a national expert panel to establish evidence of content validity. The critical appraisal phase had two raters apply the GRS, developed in the first phase, to a series of sample performances from three groups: novice paramedic students (group 1), paramedic students at the entry-to-practice level (group 2), and experienced paramedics (group 3). Using data from this process, we examined the tool's reliability within each group and tested the discriminative validity hypothesis that higher scores would be associated with higher levels of training and experience. RESULTS The development phase resulted in a seven-dimension, seven-point adjectival GRS. The two independent blinded raters scored 81 recorded sample performances (n = 25 in group 1, n = 33 in group 2, n = 23 in group 3) using the GRS. For groups 1, 2, and 3, respectively, interrater reliability reached 0.75, 0.88, and 0.94. Intrarater reliability reached 0.94 and the internal consistency ranged from 0.53 to 0.89. Rater differences contributed 0-5.7% of the total variance. The GRS scores assigned to each group increased with level of experience, both using the overall rating (means = 2.3, 4.1, 5.0; p < 0.001) and considering each dimension separately. Applying a modified borderline group method, 54.9% of group 1, 13.4% of group 2, and 2.9% of group 3 were below the cut score. CONCLUSION The results of this study provide evidence that the scores generated using this scale can be valid for the purpose of making decisions regarding paramedic clinical competence.
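The modified borderline group method mentioned in the results can be sketched in a few lines; the ratings and global judgements below are invented for illustration and are not taken from the study:

    import numpy as np

    # Invented overall GRS ratings and examiner global judgements
    overall = np.array([2.1, 3.0, 3.4, 3.6, 4.2, 4.8, 5.1, 5.6])
    judgement = ["fail", "borderline", "borderline", "borderline",
                 "pass", "pass", "pass", "pass"]

    # Borderline group method: the cut score is the mean overall
    # rating of the candidates judged to be borderline
    mask = np.array([j == "borderline" for j in judgement])
    cut_score = overall[mask].mean()

    pct_below = (overall < cut_score).mean() * 100
    print(f"cut score = {cut_score:.2f}; {pct_below:.1f}% below the cut")

Percentages such as the 54.9%, 13.4% and 2.9% reported above fall out of exactly this kind of comparison of each group's ratings against the derived cut score.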
Affiliation(s)
- Walter Tavares, Centennial College Paramedic Program, Toronto, Ontario, Canada
|
132
|
Walsh K. Money is the root of all progress. MEDICAL EDUCATION 2012; 46:625. [PMID: 22626054 DOI: 10.1111/j.1365-2923.2012.04213.x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/01/2023]
|
133
|
Wilkinson TJ. Making sense of work-based assessments. MEDICAL EDUCATION 2012; 46:436. [PMID: 22429180 DOI: 10.1111/j.1365-2923.2011.04202.x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/31/2023]
|
134
|
Affiliation(s)
- Brian Jolly, Health Workforce Education and Assessment Research, Monash University, Melbourne, Victoria 3800, Australia
|
135
|