1. Ginsburg S, Stroud L, Brydges R, Melvin L, Hatala R. Dual purposes by design: exploring alignment between residents' and academic advisors' documents in a longitudinal program. Adv Health Sci Educ Theory Pract 2024; 29:1631-1647. [PMID: 38438699] [DOI: 10.1007/s10459-024-10318-2]
Abstract
Longitudinal academic advising (AA) and coaching programs are increasingly implemented in competency-based medical education (CBME) to help residents reflect and act on the voluminous assessment data they receive. Documents created by residents for purposes of reflection are often used for a second, summative purpose (to help competence committees make decisions), which may be problematic. Using inductive thematic analysis, we analyzed written comments generated by 21 resident-AA dyads in one large internal medicine program who met over a 2-year period, to determine what residents write when asked to reflect, how this aligns with what the AAs report, and what changes occur over time (109 resident self-reflections and 105 AA reports in total). Residents commented more on their developing autonomy, progress and improvement than AAs, who commented far more on performance measures. Over time, residents' writing shifted away from intrinsic roles, patient care and improvement towards what AAs focused on, including getting EPAs (entrustable professional activities), studying and exams. For EPAs, the emphasis was on obtaining sufficient numbers rather than reflecting on what residents were learning. Our findings challenge the practice of dual-purposing documents by questioning the blurring of formative and summative intent, the structure of forms and their multiple conflicting purposes, and assumptions about the advising relationship over time. Our study suggests a need to re-evaluate how reflective documents are used in CBME programs. Further research should explore whether and how documentation can best be used to support resident growth and development.
Affiliation(s)
- Shiphra Ginsburg: Department of Medicine, Mount Sinai Hospital, Toronto, ON, Canada; Wilson Centre for Research in Education, University Health Network, Toronto, ON, Canada.
- Lynfa Stroud: Department of Medicine, Sunnybrook Health Sciences Centre, Toronto, ON, Canada; Temerty Faculty of Medicine, University of Toronto, Toronto, ON, Canada.
- Ryan Brydges: Wilson Centre for Research in Education, University Health Network, Toronto, ON, Canada; Temerty Faculty of Medicine, University of Toronto, Toronto, ON, Canada; Li Ka Shing Knowledge Institute, St. Michael's Hospital, Unity Health Toronto, Toronto, ON, Canada.
- Lindsay Melvin: Temerty Faculty of Medicine, University of Toronto, Toronto, ON, Canada; Department of Medicine, University Health Network, Toronto, ON, Canada.
- Rose Hatala: Department of Medicine, University of British Columbia, Vancouver, BC, Canada; Centre for Health Education Scholarship, University of British Columbia, Vancouver, Canada.
2. Nair BR, Moonen-van Loon JMW, van Lierop M, Govaerts M. Leveraging Narrative Feedback in Programmatic Assessment: The Potential of Automated Text Analysis to Support Coaching and Decision-Making in Programmatic Assessment. Adv Med Educ Pract 2024; 15:671-683. [PMID: 39050116] [PMCID: PMC11268569] [DOI: 10.2147/amep.s465259]
Abstract
Introduction Current assessment approaches increasingly use narratives to support learning, coaching and high-stakes decision-making. Interpretation of narratives, however, can be challenging and time-consuming, potentially resulting in suboptimal or inadequate use of assessment data. Support for learners, coaches as well as decision-makers in the use and interpretation of these narratives therefore seems essential. Methods We explored the utility of automated text analysis techniques to support interpretation of narrative assessment data, collected across 926 clinical assessments of 80 trainees, in an International Medical Graduates' licensing program in Australia. We employed topic modelling and sentiment analysis techniques to automatically identify predominant feedback themes as well as the sentiment polarity of feedback messages. We furthermore sought to examine the associations between feedback polarity, numerical performance scores, and overall judgments about task performance. Results Topic modelling yielded three distinctive feedback themes: Medical Skills, Knowledge, and Communication & Professionalism. The volume of feedback varied across topics and clinical settings, but assessors used more words when providing feedback to trainees who did not meet competence standards. Although sentiment polarity and performance scores did not seem to correlate at the level of single assessments, findings showed a strong positive correlation between the average performance scores and the average algorithmically assigned sentiment polarity. Discussion This study shows that use of automated text analysis techniques can pave the way for a more efficient, structured, and meaningful learning, coaching, and assessment experience for learners, coaches and decision-makers alike. When used appropriately, these techniques may facilitate more meaningful and in-depth conversations about assessment data, by supporting stakeholders in interpretation of large amounts of feedback. Future research is vital to fully unlock the potential of automated text analysis, to support meaningful integration into educational practices.
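As a rough illustration of the kind of pipeline the abstract describes, the Python sketch below applies topic modelling (latent Dirichlet allocation) and a toy lexicon-based sentiment score to a few invented feedback comments. The example data, the number of topics, and the word lists are hypothetical stand-ins; the authors' actual models, preprocessing, and lexicons are not specified here.

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Invented narrative feedback comments (placeholders for real assessment data).
feedback_texts = [
    "Clear explanation to the patient, professional and empathetic manner",
    "Needs to revise pharmacology knowledge before the next assessment",
    "Good procedural skills but history taking was incomplete",
]

# Topic modelling: discover recurring feedback themes with LDA.
vectorizer = CountVectorizer(stop_words="english")
dtm = vectorizer.fit_transform(feedback_texts)
lda = LatentDirichletAllocation(n_components=3, random_state=0)  # e.g. 3 themes
lda.fit(dtm)
terms = vectorizer.get_feature_names_out()
for i, weights in enumerate(lda.components_):
    top_terms = [terms[j] for j in weights.argsort()[-5:][::-1]]
    print(f"Topic {i}: {top_terms}")

# Sentiment polarity: a toy lexicon-based score in [-1, 1] per comment.
POSITIVE = {"clear", "good", "professional", "empathetic"}
NEGATIVE = {"needs", "revise", "incomplete"}

def polarity(text: str) -> float:
    words = [w.strip(",.").lower() for w in text.split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return 0.0 if pos + neg == 0 else (pos - neg) / (pos + neg)

for text in feedback_texts:
    print(round(polarity(text), 2), "-", text)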
Affiliation(s)
- Balakrishnan R Nair: University of Newcastle, Centre for Medical Professional Development, Newcastle, Australia.
- Joyce M W Moonen-van Loon: School of Health Professions Education, Faculty of Health, Medicine and Life Sciences, Maastricht University, Maastricht, the Netherlands.
- Marion van Lierop: Department of Family Medicine, Faculty of Health, Medicine and Life Sciences, Maastricht University, Maastricht, the Netherlands.
- Marjan Govaerts: School of Health Professions Education, Faculty of Health, Medicine and Life Sciences, Maastricht University, Maastricht, the Netherlands.
3. Woodworth GE, Goldstein ZT, Ambardekar AP, Arthur ME, Bailey CF, Booth GJ, Carney PA, Chen F, Duncan MJ, Fromer IR, Hallman MR, Hoang T, Isaak R, Klesius LL, Ladlie BL, Mitchell SA, Miller Juve AK, Mitchell JD, McGrath BJ, Shepler JA, Sims CR, Spofford CM, Tanaka PP, Maniker RB. Development and Pilot Testing of a Programmatic System for Competency Assessment in US Anesthesiology Residency Training. Anesth Analg 2024; 138:1081-1093. [PMID: 37801598] [DOI: 10.1213/ane.0000000000006667]
Abstract
BACKGROUND In 2018, a set of entrustable professional activities (EPAs) and procedural skills assessments was developed for anesthesiology training, but these did not assess all the Accreditation Council for Graduate Medical Education (ACGME) milestones. The aims of this study were to (1) remap the 2018 EPA and procedural skills assessments to the revised ACGME Anesthesiology Milestones 2.0, (2) develop new assessments that, combined with the original assessments, create a system of assessment addressing all level 1 to 4 milestones, and (3) provide evidence for the validity of the assessments. METHODS Using a modified Delphi process, a panel of anesthesiology education experts remapped the original assessments developed in 2018 to the Anesthesiology Milestones 2.0 and developed new assessments to create a system that assessed all level 1 through 4 milestones. Following a 24-month pilot at 7 institutions, the number of EPA and procedural skills assessments and the mean scores were computed at the end of the academic year. Milestone achievement and subcompetency data for assessments from a single institution were compared to scores assigned by the institution's clinical competency committee (CCC). RESULTS New assessment development, 2 months of testing and feedback, and revisions resulted in 5 new EPAs, 11 nontechnical skills assessments (NTSAs), and 6 objective structured clinical examinations (OSCEs). Combined with the original 20 EPAs and procedural skills assessments, the new system of assessment addresses 99% of level 1 to 4 Anesthesiology Milestones 2.0. During the 24-month pilot, aggregate mean EPA and procedural skill scores increased significantly with year in training. System subcompetency scores correlated significantly with 15 of 23 (65.2%) corresponding CCC scores at a single institution, but 8 correlations (36.4%) were <0.30, indicating poor correlation. CONCLUSIONS A panel of experts developed a set of EPAs, procedural skills assessments, NTSAs, and OSCEs to form a programmatic system of assessment for anesthesiology residency training in the United States. The method used to develop and pilot test the assessments, the progression of assessment scores with time in training, and the correlation of assessment scores with CCC scoring of milestone achievement provide evidence for the validity of the assessments.
Affiliation(s)
- Glenn E Woodworth
- From the Department of Anesthesiology and Perioperative Medicine, Oregon Health & Science University, Portland, Oregon
| | - Zachary T Goldstein
- Department of Anesthesiology, Cedars Sinai Medical Center, Los Angeles, California
| | - Aditee P Ambardekar
- Department of Anesthesiology and Pain Management, University of Texas, Southwestern Medical Center, Dallas, Texas
| | - Mary E Arthur
- Department of Anesthesiology and Perioperative Medicine, Medical College of Georgia at Augusta University, Augusta, Georgia
| | - Caryl F Bailey
- Department of Anesthesiology and Perioperative Medicine, Medical College of Georgia at Augusta University, Augusta, Georgia
| | - Gregory J Booth
- Uniformed Services University of the Health Sciences, Department of Anesthesiology and Pain Medicine, Naval Medical Center Portsmouth, Portsmouth, Virginia
| | - Patricia A Carney
- Division of Hospital Medicine, Department of Family Medicine and Internal Medicine, Oregon Health & Science University, Portland, Oregon
| | - Fei Chen
- Department of Anesthesiology, University of North Carolina at Chapel Hill School of Medicine, Chapel Hill, North Carolina
| | - Michael J Duncan
- Department of Anesthesiology, University of Missouri-Kansas City, Kansas City, Missouri
| | - Ilana R Fromer
- Department of Anesthesiology, University of Minnesota, Minneapolis, Minnesota
| | - Matthew R Hallman
- Department of Anesthesiology and Pain Medicine, University of Washington, Seattle, Washington
| | - Thomas Hoang
- From the Department of Anesthesiology and Perioperative Medicine, Oregon Health & Science University, Portland, Oregon
| | - Robert Isaak
- Department of Anesthesiology, University of North Carolina, Chapel Hill, North Carolina
| | - Lisa L Klesius
- Department of Anesthesiology, University of Wisconsin School of Medicine and Public Health, Madison, Wisconsin
| | - Beth L Ladlie
- Department of Anesthesiology, Mayo Clinic, Rochester, Minnesota
| | | | - Amy K Miller Juve
- From the Department of Anesthesiology and Perioperative Medicine, Oregon Health & Science University, Portland, Oregon
| | - John D Mitchell
- Department of Anesthesiology, Critical Care, and Perioperative Medicine, Henry Ford Health, Detroit, Michigan
| | - Brian J McGrath
- Department of Anesthesiology, University of Florida College of Medicine-Jacksonville, Jacksonville, Florida
| | - John A Shepler
- Department of Anesthesiology, University of Wisconsin School of Medicine and Public Health, Madison, Wisconsin
| | - Charles R Sims
- Department of Anesthesiology & Perioperative Medicine, Mayo Clinic, Rochester, Minnesota
| | - Christina M Spofford
- Department of Anesthesiology, Medical College of Wisconsin, Milwaukee, Wisconsin
| | - Pedro P Tanaka
- Department of Anesthesiology, Stanford University, Stanford, California
| | - Robert B Maniker
- Department of Anesthesiology, Columbia University, New York, New York
| |
4. Luu K, Sidhu R, Chadha NK, Eva KW. An exploration of "real time" assessments as a means to better understand preceptors' judgments of student performance. Adv Health Sci Educ Theory Pract 2022. [PMID: 36441287] [DOI: 10.1007/s10459-022-10189-5]
Abstract
Clinical supervisors are known to assess trainee performance idiosyncratically, causing concern about the validity of their ratings. The literature on this issue relies heavily on retrospective collection of decisions, resulting in the risk of inaccurate information regarding what actually drives raters' perceptions. Capturing in-the-moment information about supervisors' impressions could yield better insight into how to intervene. The purpose of this study, therefore, was to gather "real-time" judgments to explore what drives preceptors' judgments of student performance. We performed a prospective study in which physicians were asked to adjust a rating scale in real time while watching two video-recordings of trainee clinical performances. Scores were captured in 1-s increments, examined for frequency, direction, and magnitude of adjustments, and compared to assessors' final entrustability judgment as measured by the modified Ottawa Clinic Assessment Tool. The standard deviation in raters' judgments was examined as a function of time to determine how long it takes impressions to begin to vary. Twenty participants viewed two clinical vignettes. Considerable variability in ratings was observed, with different behaviours triggering scale adjustments for different raters. That idiosyncrasy occurred very quickly, with the standard deviation in raters' judgments rapidly increasing within 30 s of case onset. Particular moments generally appeared to be influential, but their degree of influence still varied. Correlations between the final assessment and (a) the score assigned upon first adjustment of the scale, (b) the score upon last adjustment, and (c) the mean score were r = 0.13, 0.32, and 0.57 for one video and r = 0.30, 0.50, and 0.52 for the other, indicating the degree to which overall impressions reflected accumulation of raters' idiosyncratic moment-by-moment observations. Our results demonstrated that variability in raters' impressions begins very early in a case presentation and is associated with different behaviours having different influence on different raters. More generally, this study outlines a novel methodology that offers a new path for gaining insight into factors influencing assessor judgments.
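A minimal sketch of the summary-and-correlation step described above, using made-up numbers: each rater's running score is reduced to a first-adjustment value, a last-adjustment value, and a mean, and each summary is correlated with that rater's final rating. The arrays, scale, and sample size below are hypothetical; the study's actual instrument and analyses are not reproduced here.

import numpy as np

# One row per rater: the running score sampled at fixed intervals (toy data).
moment_scores = np.array([
    [3, 3, 4, 4, 5, 5],
    [2, 2, 2, 3, 3, 4],
    [4, 3, 3, 3, 4, 4],
], dtype=float)
final_rating = np.array([5.0, 3.0, 4.0])  # hypothetical final entrustment ratings

summaries = {
    "first adjustment": moment_scores[:, 0],
    "last adjustment": moment_scores[:, -1],
    "mean score": moment_scores.mean(axis=1),
}

# Pearson correlation between each summary of the moment-by-moment scores
# and the final rating, mirroring the comparison reported in the abstract.
for label, values in summaries.items():
    r = np.corrcoef(values, final_rating)[0, 1]
    print(f"r(final, {label}) = {r:.2f}")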
Affiliation(s)
- Kimberly Luu: Department of Otolaryngology, University of California San Francisco, San Francisco, CA, USA.
- Ravi Sidhu: Department of Surgery, University of British Columbia, Vancouver, BC, Canada.
- Neil K Chadha: Division of Otolaryngology, University of British Columbia, Vancouver, BC, Canada.
- Kevin W Eva: Department of Medicine, The University of British Columbia, 429K - 2194 Health Sciences Mall, Vancouver, BC, V6T 1Z3, Canada.
5. Rotthoff T, Kadmon M, Harendza S. It does not have to be either or! Assessing competence in medicine should be a continuum between an analytic and a holistic approach. Adv Health Sci Educ Theory Pract 2021; 26:1659-1673. [PMID: 33779895] [PMCID: PMC8610945] [DOI: 10.1007/s10459-021-10043-0]
Abstract
Assessing competence is a tremendous challenge in medical education. There are two contrasting approaches in competence assessment: an analytic approach that aims to precisely measure observable constituents and facets of competence, and a holistic approach that focuses on a comprehensive assessment of competences in complex real situations reflecting actual performance. We would like to contribute to the existing discourse about medical competence and its assessment by proposing an approach that can provide orientation for the development of competence-based assessment concepts in undergraduate and postgraduate medical education. The approach follows Kane's framework of an "argument-based approach" to validity and is based on insights into task complexity, testing and learning theories, as well as the importance of the learning environment. It describes a continuum from analytic to holistic approaches for assessing the constituents and facets of competence through to performance. We conclude that the complexity of a task should determine the selection of the assessment and suggest using this approach to reorganize and adapt competence assessment.
Affiliation(s)
- Thomas Rotthoff: Medical Didactics and Educational Research, DEMEDA, Medical Faculty, University of Augsburg, Universitätsstrasse 2, 86159 Augsburg, Germany.
- Martina Kadmon: Medical Education Sciences, DEMEDA, Medical Faculty, University of Augsburg, Augsburg, Germany.
- Sigrid Harendza: III. Department of Medicine, University Hospital Hamburg-Eppendorf, Hamburg, Germany.
6. Kelleher M, Kinnear B, Sall DR, Weber DE, DeCoursey B, Nelson J, Klein M, Warm EJ, Schumacher DJ. Warnings in early narrative assessment that might predict performance in residency: signal from an internal medicine residency program. Perspect Med Educ 2021; 10:334-340. [PMID: 34476730] [PMCID: PMC8633188] [DOI: 10.1007/s40037-021-00681-w]
Abstract
INTRODUCTION Narrative assessment data are valuable in understanding struggles in resident performance. However, it remains unknown which themes in narrative data occurring early in training may indicate a higher likelihood of struggles later in training, allowing programs to intervene sooner. METHODS Using learning analytics, we identified 26 internal medicine residents across three cohorts who were below expected entrustment during training. We compiled all narrative data from the first 6 months of training for these residents, as well as for 13 typically performing residents for comparison. Narrative data were blinded for all 39 residents during the initial coding phases of an inductive thematic analysis. RESULTS Many similarities were identified between the two groups. Codes that differed between typical and lower entrusted residents were grouped into six themes of two types: three explicit/manifest and three implicit/latent. The explicit/manifest themes focused on specific aspects of resident performance, with assessors describing 1) Gaps in attention to detail, 2) Communication deficits with patients, and 3) Difficulty recognizing the "big picture" in patient care. Three implicit/latent themes, focused on how narrative data were written, were also identified: 1) Feedback described as a deficiency rather than an opportunity to improve, 2) Normative comparisons to identify a resident as being behind their peers, and 3) Warning of possible risk to patient care. DISCUSSION Clinical competency committees (CCCs) usually rely on accumulated data and trends. Using the themes in this paper while reviewing narrative comments may help CCCs with earlier recognition and better allocation of resources to support residents' development.
Affiliation(s)
- Matthew Kelleher: Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA.
- Benjamin Kinnear: Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA.
- Dana R Sall: HonorHealth Internal Medicine Residency Program, Scottsdale, Arizona, and University of Arizona College of Medicine, Phoenix, AZ, USA.
- Danielle E Weber: Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA.
- Bailey DeCoursey: Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA.
- Jennifer Nelson: Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA.
- Melissa Klein: Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA.
- Eric J Warm: Department of Internal Medicine, University of Cincinnati College of Medicine, Cincinnati, OH, USA.
- Daniel J Schumacher: Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA.
7. Valentine N, Shanahan EM, Durning SJ, Schuwirth L. Making it fair: Learners' and assessors' perspectives of the attributes of fair judgement. Med Educ 2021; 55:1056-1066. [PMID: 34060124] [DOI: 10.1111/medu.14574]
Abstract
INTRODUCTION Optimising the use of subjective human judgement in assessment requires understanding what makes judgement fair. Whilst fairness cannot be simplistically defined, the underpinnings of fair judgement within the literature have previously been combined to create a theoretically constructed conceptual model. However, understanding assessors' and learners' perceptions of what constitutes fair human judgement is also necessary. The aim of this study was to explore assessors' and learners' perceptions of fair human judgement and to compare these to the conceptual model. METHODS A thematic analysis approach was used. A purposive sample of twelve assessors and eight postgraduate trainees undertook semi-structured interviews using vignettes. Themes were identified using the process of constant comparison. Collection, analysis and coding of the data occurred simultaneously in an iterative manner until saturation was reached. RESULTS This study supported the literature-derived conceptual model, suggesting fairness is a multi-dimensional construct with components at individual, system and environmental levels. At an individual level, contextual, longitudinally collected evidence, supported by narrative and falling within ill-defined boundaries, is essential for fair judgement. Assessor agility and expertise are needed to interpret and interrogate evidence, identify boundaries and provide narrative feedback to allow for improvement. At a system level, factors such as multiple opportunities to demonstrate competence and improvement, multiple assessors to allow different perspectives to be triangulated, and documentation are needed for fair judgement. These system features can be optimised through procedural fairness. Finally, appropriate learning and working environments that consider patient needs and learners' personal circumstances are needed for fair judgement. DISCUSSION This study builds on the theory-derived conceptual model by demonstrating that the components of fair judgement can be explicitly articulated whilst embracing the complexity and contextual nature of health-professions assessment. It thus provides a narrative to support dialogue between learners, assessors and institutions about ensuring fair judgements in assessment.
Affiliation(s)
- Nyoli Valentine: Prideaux Discipline of Clinical Education, Flinders University, SA, Australia.
- Steven J Durning: Center for Health Professions Education, Uniformed Services University of the Health Sciences, Bethesda, MD, USA.
- Lambert Schuwirth: Prideaux Discipline of Clinical Education, Flinders University, SA, Australia.
8. Hartman ND, Manthey DE, Strowd LC, Potisek NM, Vallevand A, Tooze J, Goforth J, McDonough K, Askew KL. Effect of Perceived Level of Interaction on Faculty Evaluations of 3rd Year Medical Students. Med Sci Educ 2021; 31:1327-1332. [PMID: 34457975] [PMCID: PMC8368453] [DOI: 10.1007/s40670-021-01307-w]
Abstract
INTRODUCTION Several factors are known to affect the way clinical performance evaluations (CPEs) of medical students are completed by supervising physicians. We sought to explore the effect of faculty-perceived "level of interaction" (LOI) on these evaluations. METHODS Our third-year CPE requires evaluators to identify their perceived LOI with each student as low, moderate, or high. We examined CPEs completed during the academic year 2018-2019 for differences in (1) clinical and professionalism ratings, (2) quality of narrative comments, (3) quantity of narrative comments, and (4) percentage of evaluation questions left unrated. RESULTS A total of 3682 CPEs were included in the analysis. ANOVA revealed statistically significant differences in clinical ratings by LOI (p ≤ .001), with mean ratings from faculty with a high LOI significantly higher than those from faculty with a moderate or low LOI (p ≤ .001). Chi-squared analysis demonstrated differences by faculty LOI in whether questions were left unrated (p ≤ .001), the quantity of narrative comments (p ≤ .001), and the specificity of narrative comments (p ≤ .001). CONCLUSIONS Faculty who perceived a higher LOI were more likely to assign higher ratings, to complete more of the clinical evaluation, and to provide narrative feedback with more specific, higher-quality comments. SUPPLEMENTARY INFORMATION The online version contains supplementary material available at 10.1007/s40670-021-01307-w.
Affiliation(s)
- Nicholas D. Hartman: Wake Forest School of Medicine, Medical Center Boulevard, Winston-Salem, NC 27157, USA.
- David E. Manthey: Wake Forest School of Medicine, Medical Center Boulevard, Winston-Salem, NC 27157, USA.
- Lindsay C. Strowd: Wake Forest School of Medicine, Medical Center Boulevard, Winston-Salem, NC 27157, USA.
- Nicholas M. Potisek: Wake Forest School of Medicine, Medical Center Boulevard, Winston-Salem, NC 27157, USA.
- Andrea Vallevand: Wake Forest School of Medicine, Medical Center Boulevard, Winston-Salem, NC 27157, USA.
- Janet Tooze: Wake Forest School of Medicine, Medical Center Boulevard, Winston-Salem, NC 27157, USA.
- Jon Goforth: Wake Forest School of Medicine, Medical Center Boulevard, Winston-Salem, NC 27157, USA.
- Kimberly McDonough: Wake Forest School of Medicine, Medical Center Boulevard, Winston-Salem, NC 27157, USA.
- Kim L. Askew: Wake Forest School of Medicine, Medical Center Boulevard, Winston-Salem, NC 27157, USA.
9. Holm EA, Al-Bayati SJL, Barfod TS, Lembeck MA, Pedersen H, Ramberg E, Klemmensen ÅK, Sorensen JL. Feasibility, quality and validity of narrative multisource feedback in postgraduate training: a mixed-method study. BMJ Open 2021; 11:e047019. [PMID: 34321296] [PMCID: PMC8319975] [DOI: 10.1136/bmjopen-2020-047019]
Abstract
OBJECTIVES To examine a narrative multisource feedback (MSF) instrument concerning feasibility, quality of narrative comments, perceptions of users (face validity), consequential validity, discriminating capacity and number of assessors needed. DESIGN Qualitative text analysis supplemented by quantitative descriptive analysis. SETTING Internal Medicine Departments in Zealand, Denmark. PARTICIPANTS 48 postgraduate trainees in internal medicine specialties, 1 clinical supervisor for each trainee and 376 feedback givers (respondents). INTERVENTION This study examines the use of an electronic, purely narrative MSF instrument. After the MSF process, the trainee and the supervisor answered a postquestionnaire concerning their perception of the process. The authors coded the comments in the MSF reports for valence (positive or negative), specificity, relation to behaviour and whether the comment suggested a strategy for improvement. Four of the authors independently classified the MSF reports as either 'no reasons for concern' or 'possibly some concern', thereby examining discriminating capacity. Through iterative readings, the authors furthermore tried to identify how many respondents were needed in order to get a reliable impression of a trainee. RESULTS Out of all comments coded for valence (n=1935), 89% were positive and 11% negative. Out of all coded comments (n=4684), 3.8% suggested ways to improve. 92% of trainees and supervisors preferred a narrative MSF to a numerical MSF, and 82% of the trainees discovered performance in need of development, but only 53% had made a specific plan for development. Kappa coefficients for inter-rater agreement among the four authors were 0.7-1.0. There was a significant association (p<0.001) between the number of negative comments and the qualitative judgement by the four authors. It was not possible to define a specific number of respondents needed. CONCLUSIONS A purely narrative MSF contributes educational value, and experienced supervisors can discriminate between trainees' performances based on the MSF reports.
Affiliation(s)
- Ellen Astrid Holm: Department of Internal Medicine, Zealand University Hospital Koge, Koge, Denmark; Institute of Clinical Medicine, University of Copenhagen, Copenhagen, Denmark.
- Toke Seierøe Barfod: Department of Internal Medicine, Zealand University Hospital Roskilde, Roskilde, Denmark.
- Maurice A Lembeck: Department of Internal Medicine, Nykobing F Sygehus, Nykobing Falster, Denmark.
- Hanne Pedersen: Department of Internal Medicine, Glostrup, Rigshospitalet, Copenhagen, Denmark.
- Emilie Ramberg: Department of Internal Medicine, Nykobing F Sygehus, Nykobing Falster, Denmark.
- Jette Led Sorensen: Juliane Marie Centre for Children, Women and Reproduction, Section 4074, Rigshospitalet, Copenhagen, Denmark; Children's Hospital Copenhagen, Rigshospitalet, Copenhagen, Denmark.
10. Roberts S, MacPherson B. Perceptions of the impact of annual review of competence progression (ARCP): a mixed methods case study. Clin Med (Lond) 2021; 21:e257-e262. [PMID: 34001581] [DOI: 10.7861/clinmed.2020-0890]
Abstract
The annual review of competence progression (ARCP) is a high-stakes assessment which all UK postgraduate trainees undertake to ensure competence progression. Previous evaluations of the effectiveness of the ARCP as an assessment have reported deficiencies in both validity and reliability; however, there has been little focus on the educational impact of the ARCP. We conducted a mixed methods case study involving a questionnaire, interviews and a focus group examining the impact of the ARCP on a respiratory higher specialist training programme. Participants included both trainers and trainees. Perceptions of impact were mixed. The ARCP was reported to promote broad curriculum coverage, enable educational planning, provide educational governance and facilitate relationships with supervisors. However, participants reported that activities promoted by the ARCP may detract from learning and that issues of reliability and validity undermined the process. In some cases, this was reported to lead to disillusionment and stress for trainees. Concerns were raised that the process promoted a reductionist approach to education. This research has resulted in several changes to local training; however, it has potential implications for the ARCP as a wider process. Trainers should be cognisant of the shortcomings of assessments and their impact on trainees, training and the future of the profession.
Affiliation(s)
- Sam Roberts: Airedale NHS Foundation Trust, Steeton, UK, and University of Leeds, Leeds, UK.
11. Abdel-Razig S, Ling JOE, Harhara T, Smitasin N, Lum LHW, Ibrahim H. Challenges and Solutions in Running Effective Clinical Competency Committees in the International Context. J Grad Med Educ 2021; 13:70-74. [PMID: 33936536] [PMCID: PMC8078082] [DOI: 10.4300/jgme-d-20-00844.1]
Affiliation(s)
- Sawsan Abdel-Razig, MD, MEHP, is Chair of Medical Education, Office of Academics, Cleveland Clinic Abu Dhabi, Abu Dhabi, UAE, and Clinical Associate Professor of Medicine, Cleveland Clinic Lerner College of Medicine, Case Western Reserve University.
- Jolene Oon Ee Ling, MBBCh BAO, is Consultant, Division of Infectious Disease, Program Director, Infectious Diseases Senior Residency Program, National University Hospital, Singapore, and Assistant Professor, Yong Loo Lin School of Medicine, National University of Singapore.
- Thana Harhara, MBBS, MSc, is Internal Medicine Residency Program Director, Sheikh Khalifa Medical City, Abu Dhabi, UAE.
- Nares Smitasin, MD, is Senior Consultant, Division of Infectious Disease, Core Faculty, Infectious Diseases Senior Residency Program, National University Hospital, Singapore, and Assistant Professor, Yong Loo Lin School of Medicine, National University of Singapore.
- Lionel HW Lum, MBBS, MRCP, is Consultant, Division of Infectious Diseases, Core Faculty, Infectious Diseases Senior Residency Program, National University Hospital, Singapore, and Assistant Professor, Yong Loo Lin School of Medicine, National University of Singapore.
- Halah Ibrahim, MD, MEHP, is Consultant, Department of Medicine, Sheikh Khalifa Medical City, Abu Dhabi, UAE, and Adjunct Assistant Professor, Department of Medicine, Johns Hopkins University School of Medicine.
12. Takamura A, Imafuku R. What is the impact of the Rashomon approach in primary care education? An educational case report of implementing dialogue and improvisation into medical education. BMC Med Educ 2021; 21:143. [PMID: 33663483] [PMCID: PMC7934433] [DOI: 10.1186/s12909-021-02570-6]
Abstract
BACKGROUND The excessively sub-divided or overly concrete pre-determined objectives found in the technological approach to contemporary medical education curricula may hinder students' spontaneous learning about diverse needs and values in care. However, medical professionals must learn about diversity in care and the variety of patients' social factors that influence decision making in daily practice. METHODS We introduced a new method of curriculum development called the Rashomon approach. To test the Rashomon approach, educational activities for teaching diversity in primary care were developed in four modules: 1) explication of the competency without specifying sub-objectives; 2) dialogue among students from multiple professions; 3) visits to and interviews with patients; 4) dialogue with teachers' improvisation. The students' outcomes and responses were quantitatively and qualitatively analyzed. RESULTS A total of 135 medical students joined this study in 2017. The descriptive data suggested that the key concepts of diversity in primary care were fully recognized and that the pre-determined general goals were achieved. Scores on the understanding of social factors in medicine, respect for other professionals, professional identity, and satisfaction with the course were very high. CONCLUSION Instead of the technological approach, the Rashomon approach, in which only a general goal guides educational activities, was used in this research. Improvisation and dialogue fit the approach and were potentially effective activities for learning the multifaceted practice of medicine. In an era of competency-based education, the Rashomon approach could be a very useful framework in primary care education.
Affiliation(s)
- Akiteru Takamura: Department of Medical Education, Kanazawa Medical University, 1-1 Uchinada-machi Daigaku, Kahoku-gun, Kanazawa, Ishikawa, 920-0293, Japan.
- Rintaro Imafuku: Medical Education Development Center, Gifu University, Gifu, Japan.
13. Schauber SK, Hecht M. How sure can we be that a student really failed? On the measurement precision of individual pass-fail decisions from the perspective of Item Response Theory. Med Teach 2020; 42:1374-1384. [PMID: 32857621] [DOI: 10.1080/0142159x.2020.1811844]
Abstract
BACKGROUND In high-stakes assessments in medical education, the decision to let a particular participant pass or fail has far-reaching consequences. Reliability coefficients are usually used to support the trustworthiness of assessments and their accompanying decisions. However, coefficients such as Cronbach's alpha do not indicate the precision with which an individual's performance was measured. OBJECTIVE Since estimates of precision need to be aligned with the level on which inferences are made, we illustrate how to adequately report the precision of pass-fail decisions for single individuals. METHOD We show how to calculate the precision of individual pass-fail decisions using Item Response Theory and illustrate the approach using a real exam. In total, 70 students sat this exam (110 items). Reliability coefficients were above recommendations for high-stakes tests (>0.80). At the same time, pass-fail decisions around the cut score were expected to show low accuracy. CONCLUSIONS Our results illustrate that the most important decisions, i.e. those based on scores near the pass-fail cut score, are often ambiguous, and that reporting a traditional reliability coefficient is not an adequate description of the uncertainty encountered at an individual level.
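As a brief aside on the standard Item Response Theory quantities behind this argument (written here for a two-parameter logistic model; these are textbook relations, not formulas quoted from the paper): the precision of an individual ability estimate, and hence the ambiguity of a pass-fail decision near the cut score, follows from

\[
P_i(\theta) = \frac{1}{1 + e^{-a_i(\theta - b_i)}}, \qquad
I(\theta) = \sum_i a_i^2 \, P_i(\theta)\bigl(1 - P_i(\theta)\bigr), \qquad
\mathrm{SE}(\hat{\theta}) = \frac{1}{\sqrt{I(\hat{\theta})}}.
\]

An individual pass-fail decision can then be reported with its own uncertainty, for example

\[
\Pr(\text{pass}) \approx \Phi\!\left(\frac{\hat{\theta} - \theta_c}{\mathrm{SE}(\hat{\theta})}\right),
\]

which approaches 0.5 (maximal ambiguity) whenever the estimate \(\hat{\theta}\) lies close to the cut score \(\theta_c\), no matter how high a test-level coefficient such as Cronbach's alpha may be.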
Affiliation(s)
- Stefan K Schauber: Centre for Educational Measurement at the University of Oslo (CEMO) and Centre for Health Sciences Education, University of Oslo, Oslo, Norway.
- Martin Hecht: Department of Psychology, Humboldt-Universität zu Berlin, Berlin, Germany.
14. Hodwitz K, Kuper A, Brydges R. Realizing One's Own Subjectivity: Assessors' Perceptions of the Influence of Training on Their Conduct of Workplace-Based Assessments. Acad Med 2019; 94:1970-1979. [PMID: 31397710] [DOI: 10.1097/acm.0000000000002943]
Abstract
PURPOSE Assessor training is essential for defensible assessments of physician performance, yet research on the effectiveness of training programs for promoting assessor consistency has produced mixed results. This study explored assessors' perceptions of the influence of training and assessment tools on their conduct of workplace-based assessments of physicians. METHOD In 2017, the authors used a constructivist grounded theory approach to interview 13 physician assessors about their perceptions of the effects of training and tool development on their conduct of assessments. RESULTS Participants reported that training led them to realize that there is a potential for variability in assessors' judgments, prompting them to change their scoring and feedback behaviors to enhance consistency. However, many participants noted they had not substantially changed their numerical scoring. Nonetheless, most thought training would lead to increased standardization and consistency among assessors, highlighting a "standardization paradox" in which participants perceived a programmatic shift toward standardization but minimal changes in their own ratings. An "engagement effect" was also found in which participants involved in both tool development and training cited more substantial learnings than participants involved only in training. CONCLUSIONS Findings suggest that training may help assessors recognize their own subjectivity when judging performance, which may prompt behaviors that support rigorous and consistent scoring but may not lead to perceptible changes in assessors' numeric ratings. Results also suggest that participating in tool development may help assessors align their judgments with the scoring criteria. Overall, results support the continued study of assessor training programs as a means of enhancing assessor consistency.
Affiliation(s)
- K. Hodwitz is research associate, College of Physicians and Surgeons of Ontario, Toronto, Ontario, Canada.
- A. Kuper is associate professor and faculty co-lead, Person-Centred Care Education, Department of Medicine, scientist and associate director, Wilson Centre for Research in Education, University Health Network, University of Toronto, and staff physician, Division of General Internal Medicine, Sunnybrook Health Sciences Centre, Toronto, Ontario, Canada.
- R. Brydges is research director and scientist and holds the professorship in Technology Enabled Education at the Allan Waters Family Simulation Centre, St. Michael's Hospital, and is associate professor, Department of Medicine and Wilson Centre for Research in Education, University of Toronto, Toronto, Ontario, Canada.
15. Pack R, Lingard L, Watling CJ, Chahine S, Cristancho SM. Some assembly required: tracing the interpretative work of Clinical Competency Committees. Med Educ 2019; 53:723-734. [PMID: 31037748] [DOI: 10.1111/medu.13884]
Abstract
OBJECTIVES This qualitative study describes the social processes of evidence interpretation employed by Clinical Competency Committees (CCCs), explicating how they interpret, grapple with and weigh assessment data. METHODS Over 8 months, two researchers observed 10 CCC meetings across four postgraduate programmes at a Canadian medical school, spanning over 25 hours and 100 individual decisions. After each CCC meeting, a semi-structured interview was conducted with one member. Following constructivist grounded theory methodology, data collection and inductive analysis were conducted iteratively. RESULTS Members of the CCCs held an assumption that they would be presented with high-quality assessment data that would enable them to make systematic and transparent decisions. This assumption was frequently challenged by the discovery of what we have termed 'problematic evidence' (evidence that CCC members struggled to meaningfully interpret) within the catalogue of learner data. When CCCs were confronted with 'problematic evidence', they engaged in lengthy, effortful discussions aided by contextual data in order to make meaning of the evidence in question. This process of effortful discussion enabled CCCs to arrive at progression decisions that took account of, rather than ignored, problematic evidence. CONCLUSIONS Small groups involved in the review of trainee assessment data should be prepared to encounter evidence that is uncertain, absent, incomplete, or otherwise difficult to interpret, and should openly discuss strategies for addressing these challenges. The answer to the problems of effortful data interpretation and problematic evidence is not as simple as generating more data with strong psychometric properties. Rather, it involves grappling with the discrepancies between our interpretive frameworks and the inescapably subjective nature of assessment data and judgement.
Affiliation(s)
- Rachael Pack: Centre for Education Research & Innovation, Schulich School of Medicine and Dentistry, Western University, London, Ontario, Canada.
- Lorelei Lingard: Centre for Education Research & Innovation, Schulich School of Medicine and Dentistry, Western University, London, Ontario, Canada; Faculty of Education, Western University, London, Ontario, Canada; Department of Medicine, Schulich School of Medicine and Dentistry, Western University, London, Ontario, Canada.
- Christopher J Watling: Centre for Education Research & Innovation, Schulich School of Medicine and Dentistry, Western University, London, Ontario, Canada; Department of Clinical Neurological Sciences, Schulich School of Medicine and Dentistry, Western University, London, Ontario, Canada.
- Saad Chahine: Centre for Education Research & Innovation, Schulich School of Medicine and Dentistry, Western University, London, Ontario, Canada; Faculty of Education, Western University, London, Ontario, Canada; Department of Medicine, Schulich School of Medicine and Dentistry, Western University, London, Ontario, Canada.
- Sayra M Cristancho: Centre for Education Research & Innovation, Schulich School of Medicine and Dentistry, Western University, London, Ontario, Canada; Faculty of Education, Western University, London, Ontario, Canada; Department of Surgery, Schulich School of Medicine and Dentistry, Western University, London, Ontario, Canada.
16. Eva KW, Macala C, Fleming B. Twelve tips for constructing a multiple mini-interview. Med Teach 2019; 41:510-516. [PMID: 29373943] [DOI: 10.1080/0142159x.2018.1429586]
Abstract
Health professions the world over value various competencies in their practitioners that are not easily captured by academic measures of performance. As a result, many programs have begun using multiple mini-interviews (MMIs) to facilitate the selection of candidates who are most likely to demonstrate and further develop such qualities. In this twelve-tips article, the authors offer evidence- and experience-based advice regarding how to construct an MMI that is fit for purpose. The tips are provided chronologically, offering guidance regarding how one might conceptualize their goals for creating an MMI, how to establish a database of stations that are context appropriate, and how to prepare both candidates and examiners for their task. While MMIs have been shown to have utility in many instances, the authors urge caution against over-generalization by stressing the importance of post-MMI considerations including data monitoring and integration between one's admissions philosophy and one's curricular efforts.
Affiliation(s)
- Kevin W Eva: Department of Medicine, University of British Columbia, Vancouver, Canada.
- Catherine Macala: Department of Medicine, University of British Columbia, Vancouver, Canada.
- Bruce Fleming: Department of Medicine, University of British Columbia, Vancouver, Canada.
17. Duitsman ME, Fluit CRMG, van Alfen-van der Velden JAEM, de Visser M, Ten Kate-Booij M, Dolmans DHJM, Jaarsma DADC, de Graaf J. Design and evaluation of a clinical competency committee. Perspect Med Educ 2019; 8:1-8. [PMID: 30656533] [PMCID: PMC6382624] [DOI: 10.1007/s40037-018-0490-1]
Abstract
INTRODUCTION In postgraduate medical education, group decision-making has emerged as an essential tool to evaluate the clinical progress of residents. Clinical competency committees (CCCs) have been set up to ensure informed decision-making and provide feedback regarding performance of residents. Despite this important task, it remains unclear how CCCs actually function in practice and how their performance should be evaluated. METHODS In the prototyping phase of a design-based approach, a CCC meeting was developed, using three theoretical design principles: (1) data from multiple assessment tools and multiple perspectives, (2) a shared mental model and (3) structured discussions. The meetings were held in a university children's hospital and evaluated using observations, interviews with CCC members and an open-ended questionnaire among residents. RESULTS The structured discussions during the meetings provided a broad outline of resident performance, including identification of problematic and excellent residents. A shared mental model about the assessment criteria had developed over time. Residents were not always satisfied with the feedback they received after the meeting. Feedback that had been provided to a resident after the first CCC meeting was not addressed in the second meeting. DISCUSSION The principles that were used to design the CCC meeting were feasible in practice. Structured discussions, based on data from multiple assessment tools and multiple perspectives, provided a broad outline of resident performance. Residency programs that wish to implement CCCs can build on our design principles and adjust the prototype to their particular context. When running a CCC, it is important to consider feedback that has been provided to a resident after the previous meeting and to evaluate whether it has improved the resident's performance.
18. Govaerts MJB, van der Vleuten CPM, Holmboe ES. Managing tensions in assessment: moving beyond either-or thinking. Med Educ 2019; 53:64-75. [PMID: 30289171] [PMCID: PMC6586064] [DOI: 10.1111/medu.13656]
Abstract
CONTEXT In health professions education, assessment systems are bound to be rife with tensions as they must fulfil formative and summative assessment purposes, be efficient and effective, and meet the needs of learners and education institutes, as well as those of patients and health care organisations. The way we respond to these tensions determines the fate of assessment practices and reform. In this study, we argue that traditional 'fix-the-problem' approaches (i.e. either-or solutions) are generally inadequate and that we need alternative strategies to help us further understand, accept and actually engage with the multiple recurring tensions in assessment programmes. METHODS Drawing from research in organisation science and health care, we outline how the Polarity Thinking™ model and its 'both-and' approach offer ways to systematically leverage assessment tensions as opportunities to drive improvement, rather than as intractable problems. In reviewing the assessment literature, we highlight and discuss exemplars of specific assessment polarities and tensions in educational settings. Using key concepts and principles of the Polarity Thinking™ model, and two examples of common tensions in assessment design, we describe how the model can be applied in a stepwise approach to the management of key polarities in assessment. DISCUSSION Assessment polarities and tensions are likely to surface with the continued rise of complexity and change in education and health care organisations. With increasing pressures of accountability in times of stretched resources, assessment tensions and dilemmas will become more pronounced. We propose to add to our repertoire of strategies for managing key dilemmas in education and assessment design through the adoption of the polarity framework. Its 'both-and' approach may advance our efforts to transform assessment systems to meet complex 21st century education, health and health care needs.
Affiliation(s)
- Marjan J B Govaerts: Department of Educational Development and Research, Faculty of Health, Medicine and Life Sciences, Maastricht University, Maastricht, the Netherlands.
- Cees P M van der Vleuten: Department of Educational Development and Research, Faculty of Health, Medicine and Life Sciences, Maastricht University, Maastricht, the Netherlands.
- Eric S Holmboe: Accreditation Council for Graduate Medical Education, Chicago, Illinois, USA.
19. Reid L. Scientism in Medical Education and the Improvement of Medical Care: Opioids, Competencies, and Social Accountability. Health Care Anal 2018; 26:155-170. [PMID: 28986710] [DOI: 10.1007/s10728-017-0351-9]
Abstract
Scientism in medical education distracts educators from focusing on the content of learning; it focuses attention instead on individual achievement and validity in its measurement. I analyze the specific form that scientism takes in medicine and in medical education. The competencies movement attempts to challenge old "scientistic" views of the role of physicians, but in the end it has invited medical educators to focus on validity in the measurement of individual performance for attitudes and skills that medicine resists conceptualizing as objective. Academic medicine should focus its efforts instead on quality and relevance of care. The social accountability movement proposes to shift the focus of academic medicine to the goal of high quality and relevant care in the context of community service and partnership with the institutions that together with medicine create and cope with health and with health deficits. I make the case for this agenda through a discussion of the linked histories of the opioid prescribing crisis and the professionalism movement.
Affiliation(s)
- Lynette Reid: Department of Bioethics, Dalhousie University, PO Box 15000, Halifax, NS, B3H 4R2, Canada.
20. O'Connor A, Cantillon P, McGarr O, McCurtin A. Navigating the system: Physiotherapy student perceptions of performance-based assessment. Med Teach 2018; 40:928-933. [PMID: 29256736] [DOI: 10.1080/0142159x.2017.1416071]
Abstract
BACKGROUND Performance-based assessment (PBA) is an integral component of health professional education as it determines students' readiness for independent practice. Stakeholder input can provide valuable insight regarding its challenges, facilitators, and impact on student learning, which may further its evolution. Currently, evidence of stakeholder opinion is limited. Thus, we aimed to explore physiotherapy students' perceptions of performance-based assessment in their capacity as its central stakeholders. METHODS A qualitative interpretive constructivist approach was employed using focus group interviews for data collection. Six focus groups were completed (n = 33). Inductive thematic analysis was used to explore the data. RESULTS Two themes were identified. The first outlined perceived inconsistencies within the process, and how these impacted on student learning. The second described how students used their experiential knowledge to identify strategies to manage these challenges thus identifying key areas for improvement. CONCLUSION Inconsistencies outlined within the current physiotherapy performance-based assessment process encourage an emphasis on grades rather than on learning. It is timely that the physiotherapy academic and clinical communities consider these findings alongside evidence from other health professions to improve assessment procedures and assure public confidence and patient safety.
Collapse
Affiliation(s)
- Anne O'Connor
- Department of Clinical Therapies, University of Limerick, Limerick, Ireland
| | - Peter Cantillon
- Department of General Practice, National University of Ireland Galway, Galway, Ireland
| | - Oliver McGarr
- School of Education, University of Limerick, Limerick, Ireland
| | - Arlene McCurtin
- Department of Clinical Therapies, University of Limerick, Limerick, Ireland
- Health Research Institute, University of Limerick, Limerick, Ireland
| |
Collapse
|
21
|
Eva KW. Cognitive Influences on Complex Performance Assessment: Lessons from the Interplay between Medicine and Psychology. JOURNAL OF APPLIED RESEARCH IN MEMORY AND COGNITION 2018. [DOI: 10.1016/j.jarmac.2018.03.008] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/14/2022]
|
22
|
Kulasegaram K, Rangachari PK. Beyond "formative": assessments to enrich student learning. ADVANCES IN PHYSIOLOGY EDUCATION 2018; 42:5-14. [PMID: 29341810 DOI: 10.1152/advan.00122.2017] [Citation(s) in RCA: 26] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/07/2023]
Abstract
Formative assessments can enhance and enrich student learning. Typically, these have been used to provide feedback against end-of-course standards and prepare students for summative assessments of performance or measurement of competence. Here, we present the case for using assessments for learning to encompass a wider range of important outcomes. We discuss 1) the rationale for using assessment for learning; 2) guiding theories of expertise that inform assessment for learning; 3) theoretical and empirical evidence; 4) approaches to rigor and validation; and 5) approaches to implementation at multiple levels of the curriculum. The literature strongly supports the use of assessments as an opportunity to reinforce and enhance learning. Physiology teachers have a wide range of theories, models, and interventions from which to prepare students for retention, application, transfer, and future learning by using assessments.
Collapse
Affiliation(s)
- Kulamakan Kulasegaram
- The Wilson Centre and Department of Family & Community Medicine, Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada
| | - Patangi K Rangachari
- Bachelor of Health Sciences (Honors) Program, Department of Medicine, Faculty of Health Sciences, McMaster University, Hamilton, Ontario, Canada
| |
Collapse
|
23
|
O’Connor A, McGarr O, Cantillon P, McCurtin A, Clifford A. Clinical performance assessment tools in physiotherapy practice education: a systematic review. Physiotherapy 2018; 104:46-53. [DOI: 10.1016/j.physio.2017.01.005] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/30/2016] [Accepted: 01/27/2017] [Indexed: 10/20/2022]
|
24
|
Schauber SK, Hecht M, Nouns ZM. Why assessment in medical education needs a solid foundation in modern test theory. ADVANCES IN HEALTH SCIENCES EDUCATION : THEORY AND PRACTICE 2018; 23:217-232. [PMID: 28303398 DOI: 10.1007/s10459-017-9771-4] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/03/2015] [Accepted: 03/09/2017] [Indexed: 06/06/2023]
Abstract
Despite the frequent use of state-of-the-art psychometric models in the field of medical education, there is a growing body of literature that questions their usefulness in the assessment of medical competence. Essentially, a number of authors raised doubt about the appropriateness of psychometric models as a guiding framework to secure and refine current approaches to the assessment of medical competence. In addition, an intriguing phenomenon known as case specificity is specific to the controversy on the use of psychometric models for the assessment of medical competence. Broadly speaking, case specificity is the finding of instability of performances across clinical cases, tasks, or problems. As stability of performances is, generally speaking, a central assumption in psychometric models, case specificity may limit their applicability. This has probably fueled critiques of the field of psychometrics with a substantial amount of potential empirical evidence. This article aimed to explain the fundamental ideas employed in psychometric theory, and how they might be problematic in the context of assessing medical competence. We further aimed to show why and how some critiques do not hold for the field of psychometrics as a whole, but rather only for specific psychometric approaches. Hence, we highlight approaches that, from our perspective, seem to offer promising possibilities when applied in the assessment of medical competence. In conclusion, we advocate for a more differentiated view on psychometric models and their usage.
Collapse
Affiliation(s)
- Stefan K Schauber
- Centre for Educational Measurement at the University of Oslo (CEMO) and Centre for Health Sciences Education, University of Oslo, Oslo, Norway.
| | - Martin Hecht
- Department of Psychology, Humboldt-Universität zu Berlin, Berlin, Germany
| | - Zineb M Nouns
- Institute of Medical Education, Faculty of Medicine, University of Bern, Konsumstrasse 13, 3010, Bern, Switzerland
| |
Collapse
|
25
|
Abstract
SUMMARY Psychiatrists have a role in teaching all medical undergraduates and foundation year doctors generic skills to become good doctors, but they also have to appeal to and nurture the interests of future psychiatrists by maintaining core psychiatric skills/knowledge in their teaching. They must tackle poor recruitment to psychiatry and stigma against both the profession and its patients. Medical students and junior doctors tend to be strategic learners, motivated by passing assessments, and psychiatrists are often guilty of gearing their teaching only to this. This article explores the assessment process itself and ways to optimise it, and presents a case for going beyond teaching how to pass exams in order to address wider issues relating to psychiatry.
LEARNING OBJECTIVES
- Identify the extent of current problems of recruitment and stigma in psychiatry and recognise the role of psychiatrists in addressing these through teaching
- Be aware of the impact and limitations of tailoring teaching to assessment only
- Identify ways of improving your own practice, taking account of the literature and strategies suggested
Collapse
|
26
|
de Jonge LPJWM, Timmerman AA, Govaerts MJB, Muris JWM, Muijtjens AMM, Kramer AWM, van der Vleuten CPM. Stakeholder perspectives on workplace-based performance assessment: towards a better understanding of assessor behaviour. ADVANCES IN HEALTH SCIENCES EDUCATION : THEORY AND PRACTICE 2017; 22:1213-1243. [PMID: 28155004 PMCID: PMC5663793 DOI: 10.1007/s10459-017-9760-7] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/07/2016] [Accepted: 01/24/2017] [Indexed: 05/13/2023]
Abstract
Workplace-Based Assessment (WBA) plays a pivotal role in present-day competency-based medical curricula. Validity in WBA mainly depends on how stakeholders (e.g. clinical supervisors and learners) use the assessments, rather than on the intrinsic qualities of instruments and methods. Current research on assessment in clinical contexts seems to imply that variable behaviours during performance assessment of both assessors and learners may well reflect their respective beliefs and perspectives towards WBA. We therefore performed a Q methodological study to explore perspectives underlying stakeholders' behaviours in WBA in a postgraduate medical training program. Five different perspectives on performance assessment were extracted: Agency, Mutuality, Objectivity, Adaptivity and Accountability. These perspectives reflect both differences and similarities in stakeholder perceptions and preferences regarding the utility of WBA. In comparing and contrasting the various perspectives, we identified two key areas of disagreement, specifically 'the locus of regulation of learning' (i.e., self-regulated versus externally regulated learning) and 'the extent to which assessment should be standardised' (i.e., tailored versus standardised assessment). Differing perspectives may variously affect stakeholders' acceptance, use and, consequently, the effectiveness of assessment programmes. Continuous interaction between all stakeholders is essential to monitor, adapt and improve assessment practices and to stimulate the development of a shared mental model. Better understanding of underlying stakeholder perspectives could be an important step in bridging the gap between psychometric and socio-constructivist approaches in WBA.
Collapse
Affiliation(s)
- Laury P J W M de Jonge
- Department of Family Medicine, FHML, Maastricht University, P.O. Box 616, 6200 MD, Maastricht, The Netherlands.
| | - Angelique A Timmerman
- Department of Family Medicine, FHML, Maastricht University, P.O. Box 616, 6200 MD, Maastricht, The Netherlands
| | - Marjan J B Govaerts
- Department of Educational Research and Development, FHML, Maastricht University, Maastricht, The Netherlands
| | - Jean W M Muris
- Department of Family Medicine, FHML, Maastricht University, P.O. Box 616, 6200 MD, Maastricht, The Netherlands
| | - Arno M M Muijtjens
- Department of Educational Research and Development, FHML, Maastricht University, Maastricht, The Netherlands
| | - Anneke W M Kramer
- Department of Family Medicine, Leiden University, Leiden, The Netherlands
| | - Cees P M van der Vleuten
- Department of Educational Research and Development, FHML, Maastricht University, Maastricht, The Netherlands
| |
Collapse
|
27
|
Bartels J, Mooney CJ, Stone RT. Numerical versus narrative: A comparison between methods to measure medical student performance during clinical clerkships. MEDICAL TEACHER 2017; 39:1154-1158. [PMID: 28845738 DOI: 10.1080/0142159x.2017.1368467] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/12/2023]
Abstract
BACKGROUND Medical school evaluations typically rely on both language-based narrative descriptions and psychometrically converted numeric scores to convey performance to the grading committee. We evaluated inter-rater reliability and correlation of numeric versus narrative evaluations for students on their Neurology Clerkship. DESIGN/METHODS 50 Neurology Clerkship in-training evaluation reports completed by their residents and faculty members at the University of Rochester School of Medicine were dissected into narrative and numeric components. 5 Clerkship grading committee members retrospectively gave new narrative scores (NNS) while blinded to original numeric scores (ONS). We calculated intra-class correlation coefficients (ICC) and their associated confidence intervals for the ONS and the NNS. In addition, we calculated the correlation between ONS and NNS. RESULTS The ICC was greater for the NNS (ICC = .88 (95% CI = .70-.94)) than the ONS (ICC = .62 (95% CI = .40-.77)). The Pearson correlation coefficient showed that the ONS and NNS were highly correlated (r = .81). CONCLUSIONS Narrative evaluations converted by a small group of experienced graders are at least as reliable as numeric scoring by individual evaluators. We could allow evaluators to focus their efforts on creating richer narratives of greater value to trainees.
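For readers who want to see the shape of this kind of analysis, the following is a minimal, purely illustrative Python sketch (not the authors' code): it estimates an intraclass correlation from long-format ratings and a Pearson correlation between two score sets. The column names and toy values are invented, and the pingouin and scipy libraries are assumed to be available.

```python
# Illustrative sketch: ICC for rater agreement plus a Pearson correlation
# between two sets of scores. All data and column names are hypothetical.
import pandas as pd
import pingouin as pg
from scipy.stats import pearsonr

# Long-format ratings: one row per (student, rater) pair.
ratings = pd.DataFrame({
    "student": [1, 1, 2, 2, 3, 3, 4, 4],
    "rater":   ["A", "B", "A", "B", "A", "B", "A", "B"],
    "score":   [7, 8, 5, 5, 9, 8, 6, 7],
})

# Two-way random-effects ICC; the 'ICC2' row estimates single-rater agreement.
icc = pg.intraclass_corr(data=ratings, targets="student",
                         raters="rater", ratings="score")
print(icc[["Type", "ICC", "CI95%"]])

# Correlation between per-student mean numeric and narrative-derived scores
# (toy values standing in for ONS and NNS).
ons = [6.5, 5.0, 8.5, 7.0]
nns = [7.5, 5.0, 8.5, 6.5]
r, p = pearsonr(ons, nns)
print(f"Pearson r = {r:.2f} (p = {p:.3f})")
```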
Collapse
Affiliation(s)
- Josef Bartels
- Family Medicine, WWAMI Region Practice & Research Network, Boise, ID, USA
| | - Christopher John Mooney
- Office of Medical Education, University of Rochester School of Medicine and Dentistry, Rochester, NY, USA
| | - Robert Thompson Stone
- Neurology, University of Rochester School of Medicine and Dentistry, Rochester, NY, USA
| |
Collapse
|
28
|
Colbert CY, French JC, Herring ME, Dannefer EF. Fairness: the hidden challenge for competency-based postgraduate medical education programs. PERSPECTIVES ON MEDICAL EDUCATION 2017; 6:347-355. [PMID: 28516341 PMCID: PMC5630529 DOI: 10.1007/s40037-017-0359-8] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/25/2023]
Abstract
Competency-based medical education systems allow institutions to individualize teaching practices to meet the needs of diverse learners. Yet, the focus on continuous improvement and individualization of curricula does not exempt programs from treating learners in a fair manner. When learners fail to meet key competencies and are placed on probation or dismissed from training programs, issues of fairness may form the basis of their legal claims. In a literature search, we found no in-depth examination of fairness. In this paper, we utilize a systems lens to examine fairness within postgraduate medical education contexts, focusing on educational opportunities, assessment practices, decision-making processes, fairness from a legal standpoint, and fairness in the context of the learning environment. While we provide examples of fairness issues within US training programs, concerns regarding fairness are relevant in any medical education system which utilizes a competency-based education framework. Assessment oversight committees and annual programmatic evaluations, while recommended, will not guarantee fairness within postgraduate medical education programs, but they can provide a window into 'hidden' threats to fairness, as everything from training experiences to assessment practices may be examined by these committees. One of the first steps programs can take is to recognize that threats to fairness may exist in any educational program, including their own, and begin conversations about how to address these issues.
Collapse
Affiliation(s)
- Colleen Y Colbert
- Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, OH, USA.
- Department of Medicine, Cleveland Clinic, Cleveland, OH, USA.
| | - Judith C French
- Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, OH, USA
- Department of General Surgery, Cleveland Clinic, Cleveland, OH, USA
| | | | - Elaine F Dannefer
- Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, OH, USA
| |
Collapse
|
29
|
Lockyer J, Carraccio C, Chan MK, Hart D, Smee S, Touchie C, Holmboe ES, Frank JR. Core principles of assessment in competency-based medical education. MEDICAL TEACHER 2017; 39:609-616. [PMID: 28598746 DOI: 10.1080/0142159x.2017.1315082] [Citation(s) in RCA: 265] [Impact Index Per Article: 37.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/11/2023]
Abstract
The meaningful assessment of competence is critical for the implementation of effective competency-based medical education (CBME). Timely ongoing assessments are needed along with comprehensive periodic reviews to ensure that trainees continue to progress. New approaches are needed to optimize the use of multiple assessors and assessments; to synthesize the data collected from multiple assessors and multiple types of assessments; to develop faculty competence in assessment; and to ensure that relationships between the givers and receivers of feedback are appropriate. This paper describes the core principles of assessment for learning and assessment of learning. It addresses several ways to ensure the effectiveness of assessment programs, including using the right combination of assessment methods and conducting careful assessor selection and training. It provides a reconceptualization of the role of psychometrics and articulates the importance of a group process in determining trainees' progress. In addition, it notes that, to reach its potential as a driver in trainee development, quality care, and patient safety, CBME requires effective information management and documentation as well as ongoing consideration of ways to improve the assessment system.
Collapse
Affiliation(s)
- Jocelyn Lockyer
- Cumming School of Medicine, University of Calgary, Calgary, Canada
| | | | - Ming-Ka Chan
- Max Rady College of Medicine, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, Canada
| | - Danielle Hart
- Hennepin County Medical Center, Minneapolis, MN, USA
- University of Minnesota Medical School, Minneapolis, MN, USA
| | - Sydney Smee
- Medical Council of Canada, Ottawa, Canada
| | - Claire Touchie
- Medical Council of Canada, Ottawa, Canada
- Faculty of Medicine, University of Ottawa, Ottawa, Canada
| | - Eric S Holmboe
- Accreditation Council for Graduate Medical Education, Chicago, IL, USA
| | - Jason R Frank
- Royal College of Physicians and Surgeons of Canada, Ottawa, Canada
- Department of Emergency Medicine, University of Ottawa, Ottawa, Canada
| |
Collapse
|
30
|
Fuller R, Homer M, Pell G, Hallam J. Managing extremes of assessor judgment within the OSCE. MEDICAL TEACHER 2017; 39:58-66. [PMID: 27670246 DOI: 10.1080/0142159x.2016.1230189] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/11/2023]
Abstract
CONTEXT There is a growing body of research investigating assessor judgments in complex performance environments such as OSCE examinations. Post hoc analysis can be employed to identify some elements of "unwanted" assessor variance. However, the impact of individual, apparently "extreme" assessors on OSCE quality, assessment outcomes and pass/fail decisions has not been previously explored. This paper uses a range of "case studies" as examples to illustrate the impact that "extreme" examiners can have in OSCEs, and gives pragmatic suggestions to successfully alleviating problems. METHOD AND RESULTS We used real OSCE assessment data from a number of examinations where at station level, a single examiner assesses student performance using a global grade and a key features checklist. Three exemplar case studies where initial post hoc analysis has indicated problematic individual assessor behavior are considered and discussed in detail, highlighting both the impact of individual examiner behavior and station design on subsequent judgments. CONCLUSIONS In complex assessment environments, institutions have a duty to maximize the defensibility, quality and validity of the assessment process. A key element of this involves critical analysis, through a range of approaches, of assessor judgments. However, care must be taken when assuming that apparent aberrant examiner behavior is automatically just that.
Collapse
Affiliation(s)
- Richard Fuller
- School of Medicine, Leeds Institute of Medical Education, University of Leeds, Leeds, UK
| | - Matt Homer
- School of Medicine, Leeds Institute of Medical Education, University of Leeds, Leeds, UK
| | - Godfrey Pell
- School of Medicine, Leeds Institute of Medical Education, University of Leeds, Leeds, UK
| | - Jennifer Hallam
- School of Medicine, Leeds Institute of Medical Education, University of Leeds, Leeds, UK
| |
Collapse
|
31
|
Cook DA, Hatala R. Validation of educational assessments: a primer for simulation and beyond. Adv Simul (Lond) 2016; 1:31. [PMID: 29450000 PMCID: PMC5806296 DOI: 10.1186/s41077-016-0033-y] [Citation(s) in RCA: 172] [Impact Index Per Article: 21.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/20/2016] [Accepted: 11/16/2016] [Indexed: 12/22/2022] Open
Abstract
BACKGROUND Simulation plays a vital role in health professions assessment. This review provides a primer on assessment validation for educators and education researchers. We focus on simulation-based assessment of health professionals, but the principles apply broadly to other assessment approaches and topics. KEY PRINCIPLES Validation refers to the process of collecting validity evidence to evaluate the appropriateness of the interpretations, uses, and decisions based on assessment results. Contemporary frameworks view validity as a hypothesis, and validity evidence is collected to support or refute the validity hypothesis (i.e., that the proposed interpretations and decisions are defensible). In validation, the educator or researcher defines the proposed interpretations and decisions, identifies and prioritizes the most questionable assumptions in making these interpretations and decisions (the "interpretation-use argument"), empirically tests those assumptions using existing or newly-collected evidence, and then summarizes the evidence as a coherent "validity argument." A framework proposed by Messick identifies potential evidence sources: content, response process, internal structure, relationships with other variables, and consequences. Another framework proposed by Kane identifies key inferences in generating useful interpretations: scoring, generalization, extrapolation, and implications/decision. We propose an eight-step approach to validation that applies to either framework: Define the construct and proposed interpretation, make explicit the intended decision(s), define the interpretation-use argument and prioritize needed validity evidence, identify candidate instruments and/or create/adapt a new instrument, appraise existing evidence and collect new evidence as needed, keep track of practical issues, formulate the validity argument, and make a judgment: does the evidence support the intended use? CONCLUSIONS Rigorous validation first prioritizes and then empirically evaluates key assumptions in the interpretation and use of assessment scores. Validation science would be improved by more explicit articulation and prioritization of the interpretation-use argument, greater use of formal validation frameworks, and more evidence informing the consequences and implications of assessment.
Collapse
Affiliation(s)
- David A. Cook
- Mayo Clinic Online Learning, Mayo Clinic College of Medicine, Rochester, MN, USA
- Office of Applied Scholarship and Education Science, Mayo Clinic College of Medicine, Rochester, MN, USA
- Division of General Internal Medicine, Mayo Clinic College of Medicine, Mayo 17-W, 200 First Street SW, Rochester, MN 55905, USA
| | - Rose Hatala
- Department of Medicine, University of British Columbia, Vancouver, British Columbia, Canada
| |
Collapse
|
32
|
Mellinger JD, Williams RG, Sanfey H, Fryer JP, DaRosa D, George BC, Bohnen JD, Schuller MC, Sandhu G, Minter RM, Gardner AK, Scott DJ. Teaching and assessing operative skills: From theory to practice. Curr Probl Surg 2016; 54:44-81. [PMID: 28212782 DOI: 10.1067/j.cpsurg.2016.11.007] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/28/2016] [Accepted: 11/22/2016] [Indexed: 11/22/2022]
Affiliation(s)
- John D Mellinger
- Department of Surgery, Southern Illinois University School of Medicine, Springfield, IL.
| | - Reed G Williams
- Department of Surgery, Southern Illinois University School of Medicine, Springfield, IL; Department of Surgery, Indiana University School of Medicine, Indianapolis, IN
| | - Hilary Sanfey
- Department of Surgery, Southern Illinois University School of Medicine, Springfield, IL; American College of Surgeons, Chicago, IL
| | - Jonathan P Fryer
- Department of Surgery, Feinberg School of Medicine, Northwestern University, Chicago, IL
| | - Debra DaRosa
- Department of Surgery, Feinberg School of Medicine, Northwestern University, Chicago, IL
| | - Brian C George
- Department of Surgery, University of Michigan, Ann Arbor, MI
| | - Jordan D Bohnen
- Department of General Surgery, Massachusetts General Hospital and Harvard University, Boston, MA
| | - Mary C Schuller
- Department of Surgery, Feinberg School of Medicine, Northwestern University, Chicago, IL
| | - Gurjit Sandhu
- Department of Surgery, University of Michigan, Ann Arbor, MI; Department of Learning Health Sciences, University of Michigan, Ann Arbor, MI
| | - Rebecca M Minter
- Department of Surgery, University of Texas Southwestern Medical Center, Dallas, TX
| | - Aimee K Gardner
- Department of Surgery, University of Texas Southwestern Medical Center, Dallas, TX; UT Southwestern Simulation Center, University of Texas Southwestern Medical Center, Dallas, TX
| | - Daniel J Scott
- Department of Surgery, University of Texas Southwestern Medical Center, Dallas, TX; UT Southwestern Simulation Center, University of Texas Southwestern Medical Center, Dallas, TX
| |
Collapse
|
33
|
Duvivier R, Veysey M. Is the long case dead? 'Uh, I don't think so': the Uh/Um Index. MEDICAL EDUCATION 2016; 50:1245-1248. [PMID: 27873409 DOI: 10.1111/medu.13091] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/28/2015] [Revised: 12/09/2015] [Accepted: 01/03/2016] [Indexed: 06/06/2023]
Abstract
Current tools for clinical assessment are tedious and time-consuming, particularly the dreadful long case. There is a need for novel instruments that incorporate other aspects of competence. We propose such a method, namely the Uh/Um Index. Our innovation paper describes the rationale for using speech dysfluency and occurrences of filler words such as 'uh' and 'um' as a proxy for competence. This appears to have won initial support from senior clinicians in our institution. Additional research is needed (non-restricted grants are welcomed) to establish rigorous standard setting and to fund our attendance at overseas conferences to make the Uh/Um Index the new buzzword in medical education.
Collapse
Affiliation(s)
- Robbert Duvivier
- Medical Education Unit, School of Medicine and Public Health, University of Newcastle, Callaghan, NSW, Australia
| | - Martin Veysey
- Medical Education Unit, School of Medicine and Public Health, University of Newcastle, Callaghan, NSW, Australia
- Teaching & Research Unit, Gosford Hospital, Gosford, NSW, Australia
| |
Collapse
|
34
|
Cook DA, Kuper A, Hatala R, Ginsburg S. When Assessment Data Are Words: Validity Evidence for Qualitative Educational Assessments. ACADEMIC MEDICINE : JOURNAL OF THE ASSOCIATION OF AMERICAN MEDICAL COLLEGES 2016; 91:1359-1369. [PMID: 27049538 DOI: 10.1097/acm.0000000000001175] [Citation(s) in RCA: 87] [Impact Index Per Article: 10.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/25/2023]
Abstract
Quantitative scores fail to capture all important features of learner performance. This awareness has led to increased use of qualitative data when assessing health professionals. Yet the use of qualitative assessments is hampered by incomplete understanding of their role in forming judgments, and lack of consensus in how to appraise the rigor of judgments therein derived. The authors articulate the role of qualitative assessment as part of a comprehensive program of assessment, and translate the concept of validity to apply to judgments arising from qualitative assessments. They first identify standards for rigor in qualitative research, and then use two contemporary assessment validity frameworks to reorganize these standards for application to qualitative assessment. Standards for rigor in qualitative research include responsiveness, reflexivity, purposive sampling, thick description, triangulation, transparency, and transferability. These standards can be reframed using Messick's five sources of validity evidence (content, response process, internal structure, relationships with other variables, and consequences) and Kane's four inferences in validation (scoring, generalization, extrapolation, and implications). Evidence can be collected and evaluated for each evidence source or inference. The authors illustrate this approach using published research on learning portfolios. The authors advocate a "methods-neutral" approach to assessment, in which a clearly stated purpose determines the nature of and approach to data collection and analysis. Increased use of qualitative assessments will necessitate more rigorous judgments of the defensibility (validity) of inferences and decisions. Evidence should be strategically sought to inform a coherent validity argument.
Collapse
Affiliation(s)
- David A Cook
- D.A. Cook is professor of medicine and medical education, associate director, Mayo Clinic Online Learning, and consultant, Division of General Internal Medicine, Mayo Clinic College of Medicine, Rochester, Minnesota. A. Kuper is assistant professor, Department of Medicine, Faculty of Medicine, University of Toronto, scientist, Wilson Centre for Research in Education, University Health Network/University of Toronto, and staff physician, Division of General Internal Medicine, Sunnybrook Health Sciences Centre, Toronto, Ontario, Canada. R. Hatala is associate professor of medicine and director, Clinical Educator Fellowship, University of British Columbia, Vancouver, British Columbia, Canada. S. Ginsburg is professor, Department of Medicine, Faculty of Medicine, University of Toronto, scientist, Wilson Centre for Research in Education, University Health Network/University of Toronto, and staff physician, Mount Sinai Hospital, Toronto, Ontario, Canada
| | | | | | | |
Collapse
|
35
|
Eva KW, Bordage G, Campbell C, Galbraith R, Ginsburg S, Holmboe E, Regehr G. Towards a program of assessment for health professionals: from training into practice. ADVANCES IN HEALTH SCIENCES EDUCATION : THEORY AND PRACTICE 2016; 21:897-913. [PMID: 26590984 DOI: 10.1007/s10459-015-9653-6] [Citation(s) in RCA: 100] [Impact Index Per Article: 12.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/18/2015] [Accepted: 11/16/2015] [Indexed: 05/14/2023]
Abstract
Despite multifaceted attempts to "protect the public," including the implementation of various assessment practices designed to identify individuals at all stages of training and practice who underperform, profound deficiencies in quality and safety continue to plague the healthcare system. The purpose of this reflections paper is to cast a critical lens on current assessment practices and to offer insights into ways in which they might be adapted to ensure alignment with modern conceptions of health professional education for the ultimate goal of improved healthcare. Three dominant themes will be addressed: (1) The need to redress unintended consequences of competency-based assessment; (2) The potential to design assessment systems that facilitate performance improvement; and (3) The importance of ensuring authentic linkage between assessment and practice. Several principles cut across each of these themes and represent the foundational goals we would put forward as signposts for decision making about the continued evolution of assessment practices in the health professions: (1) Increasing opportunities to promote learning rather than simply measuring performance; (2) Enabling integration across stages of training and practice; and (3) Reinforcing point-in-time assessments with continuous professional development in a way that enhances shared responsibility and accountability between practitioners, educational programs, and testing organizations. Many of the ideas generated represent suggestions for strategies to pilot test, for infrastructure to build, and for harmonization across groups to be enabled. These include novel strategies for OSCE station development, formative (diagnostic) assessment protocols tailored to shed light on the practices of individual clinicians, the use of continuous workplace-based assessment, and broadening the focus of high-stakes decision making beyond determining who passes and who fails. We conclude with reflections on systemic (i.e., cultural) barriers that may need to be overcome to move towards a more integrated, efficient, and effective system of assessment.
Collapse
Affiliation(s)
- Kevin W Eva
- Centre for Health Education Scholarship, University of British Columbia, JPPN 3324, 910 West 10th Avenue, Vancouver, BC, V5Z 1M9, Canada.
| | | | - Craig Campbell
- Royal College of Physicians and Surgeons of Canada, Ottawa, ON, Canada
| | | | | | - Eric Holmboe
- Accreditation Council for Graduate Medical Education, Chicago, IL, USA
| | - Glenn Regehr
- Centre for Health Education Scholarship, University of British Columbia, JPPN 3324, 910 West 10th Avenue, Vancouver, BC, V5Z 1M9, Canada
| |
Collapse
|
36
|
Tavares W, Eva KW. Impact of rating demands on rater-based assessments of clinical competence. EDUCATION FOR PRIMARY CARE 2016; 25:308-18. [DOI: 10.1080/14739879.2014.11730760] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/21/2022]
|
37
|
Walsh CM, Ling SC, Khanna N, Grover SC, Yu JJ, Cooper MA, Yong E, Nguyen GC, May G, Walters TD, Reznick R, Rabeneck L, Carnahan H. Gastrointestinal Endoscopy Competency Assessment Tool: reliability and validity evidence. Gastrointest Endosc 2016; 81:1417-1424.e2. [PMID: 25753836 DOI: 10.1016/j.gie.2014.11.030] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 08/20/2014] [Accepted: 11/12/2014] [Indexed: 02/08/2023]
Abstract
BACKGROUND Rigorously developed and validated direct observational assessment tools are required to support competency-based colonoscopy training to facilitate skill acquisition, optimize learning, and ensure readiness for unsupervised practice. OBJECTIVE To examine reliability and validity evidence of the Gastrointestinal Endoscopy Competency Assessment Tool (GiECAT) for colonoscopy for use within the clinical setting. DESIGN Prospective, observational, multicenter validation study. Sixty-one endoscopists performing 116 colonoscopies were assessed using the GiECAT, which consists of a 7-item global rating scale (GRS) and 19-item checklist (CL). A second rater assessed procedures to determine interrater reliability by using intraclass correlation coefficients (ICCs). Endoscopists' first and second procedure scores were compared to determine test-retest reliability by using ICCs. Discriminative validity was examined by comparing novice, intermediate, and experienced endoscopists' scores. Concurrent validity was measured by correlating scores with colonoscopy experience, cecal and terminal ileal intubation rates, and physician global assessment. SETTING A total of 116 colonoscopies performed by 33 novice (<50 previous procedures), 18 intermediate (50-500 previous procedures), and 10 experienced (>1000 previous procedures) endoscopists from 6 Canadian hospitals. MAIN OUTCOME MEASUREMENTS Interrater and test-retest reliability, discriminative, and concurrent validity. RESULTS Interrater reliability was high (total: ICC=0.85; GRS: ICC=0.85; CL: ICC=0.81). Test-retest reliability was excellent (total: ICC=0.91; GRS: ICC=0.93; CL: ICC=0.80). Significant differences in GiECAT scores among novice, intermediate, and experienced endoscopists were noted (P<.001). There was a significant positive correlation (P<.001) between scores and number of previous colonoscopies (total: ρ=0.78, GRS: ρ=0.80, CL: Spearman's ρ=0.71); cecal intubation rate (total: ρ=0.81, GRS: Spearman's ρ=0.82, CL: Spearman's ρ=0.75); ileal intubation rate (total: Spearman's ρ=0.82, GRS: Spearman's ρ=0.82, CL: Spearman's ρ=0.77); and physician global assessment (total: Spearman's ρ=0.90, GRS: Spearman's ρ=0.94, CL: Spearman's ρ=0.77). LIMITATIONS Nonblinded assessments. CONCLUSION This study provides evidence supporting the reliability and validity of the GiECAT for use in assessing the performance of live colonoscopies in the clinical setting.
Collapse
Affiliation(s)
- Catharine M Walsh
- Division of Gastroenterology, Hepatology and Nutrition, Hospital for Sick Children, Toronto, Ontario, Canada; Department of Paediatrics, University of Toronto, Toronto, Ontario, Canada; Wilson Centre, University of Toronto, Toronto, Ontario, Canada
| | - Simon C Ling
- Division of Gastroenterology, Hepatology and Nutrition, Hospital for Sick Children, Toronto, Ontario, Canada; Department of Paediatrics, University of Toronto, Toronto, Ontario, Canada
| | - Nitin Khanna
- Division of Gastroenterology, St. Joseph's Health Centre, University of Western Ontario, London, Ontario, Canada
| | - Samir C Grover
- Division of Gastroenterology, St. Michael's Hospital, Toronto, Ontario, Canada; Department of Medicine, University of Toronto, Toronto, Ontario, Canada
| | - Jeffrey J Yu
- Wilson Centre, University of Toronto, Toronto, Ontario, Canada
| | - Mary Anne Cooper
- Division of Gastroenterology, Sunnybrook Health Sciences Centre, Toronto, Ontario, Canada; Department of Medicine, University of Toronto, Toronto, Ontario, Canada
| | - Elaine Yong
- Division of Gastroenterology, Sunnybrook Health Sciences Centre, Toronto, Ontario, Canada; Department of Medicine, University of Toronto, Toronto, Ontario, Canada
| | - Geoffrey C Nguyen
- Division of Gastroenterology, Mount Sinai Hospital, Toronto, Ontario, Canada; Department of Medicine, University of Toronto, Toronto, Ontario, Canada
| | - Gary May
- Division of Gastroenterology, St. Michael's Hospital, Toronto, Ontario, Canada; Department of Medicine, University of Toronto, Toronto, Ontario, Canada
| | - Thomas D Walters
- Division of Gastroenterology, Hepatology and Nutrition, Hospital for Sick Children, Toronto, Ontario, Canada; Department of Paediatrics, University of Toronto, Toronto, Ontario, Canada
| | - Richard Reznick
- Faculty of Health Sciences, Queen's University, Kingston, Ontario, Canada
| | - Linda Rabeneck
- Division of Gastroenterology, Mount Sinai Hospital, Toronto, Ontario, Canada; Cancer Care Ontario, Toronto, Ontario, Canada; Department of Medicine, University of Toronto, Toronto, Ontario, Canada
| | - Heather Carnahan
- School of Human Kinetics and Recreation, Memorial University of Newfoundland, St. John's, Newfoundland, Canada
| |
Collapse
|
38
|
Tavares W, Ginsburg S, Eva KW. Selecting and Simplifying: Rater Performance and Behavior When Considering Multiple Competencies. TEACHING AND LEARNING IN MEDICINE 2016; 28:41-51. [PMID: 26787084 DOI: 10.1080/10401334.2015.1107489] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/14/2023]
Abstract
THEORY Assessment of clinical competence is a complex cognitive task with many mental demands often imposed on raters unintentionally. We were interested in whether this burden might contribute to well-described limitations in assessment judgments. In this study we examine the effect on indicators of rating quality of asking raters to (a) consider multiple competencies and (b) attend to multiple issues. In addition, we explored the cognitive strategies raters engage when asked to consider multiple competencies simultaneously. HYPOTHESES We hypothesized that indications of rating quality (e.g., interrater reliability) would decline as the number of dimensions raters are expected to consider increases. METHOD Experienced faculty examiners rated prerecorded clinical performances within a 2 (number of dimensions) × 2 (presence of distracting task) × 3 (number of videos) factorial design. Half of the participants were asked to rate 7 dimensions of performance (7D), and half were asked to rate only 2 (2D). The second factor involved the requirement (or lack thereof) to rate the performance of actors participating in the simulation. We calculated the interrater reliability of the scores assigned and counted the number of relevant behaviors participants identified as informing their ratings. Second, we analyzed data from semistructured posttask interviews to explore the rater strategies associated with rating under conditions designed to broaden raters' focus. RESULTS Generalizability analyses revealed that the 2D group achieved higher interrater reliability relative to the 7D group (G = .56 and .42, respectively, when the average of 10 raters is calculated). The requirement to complete an additional rating task did not have an effect. Using the 2 dimensions common to both groups, an analysis of variance revealed that participants who were asked to rate only 2 dimensions identified more behaviors of relevance to the focal dimensions than those asked to rate 7 dimensions: procedural skill = 36.2%, 95% confidence interval (CI) [32.5, 40.0] versus 23.5%, 95% CI [20.8, 26.3], respectively; history gathering = 38.6%, 95% CI [33.5, 42.9] versus 24.0%, 95% CI [21.1, 26.9], respectively; ps < .05. During posttask interviews, raters identified many sources of cognitive load and idiosyncratic cognitive strategies used to reduce cognitive load during the rating task. CONCLUSIONS As intrinsic rating demands increase, indicators of rating quality decline. The strategies that raters engage when asked to rate many dimensions simultaneously are varied and appear to yield idiosyncratic efforts to reduce cognitive effort, which may affect the degree to which raters make judgments based on comparable information.
Collapse
Affiliation(s)
- Walter Tavares
- School of Community and Health Studies, Centennial College, Toronto, Ontario, Canada
- Division of Emergency Medicine, McMaster University, Hamilton, Ontario, Canada
| | - Shiphra Ginsburg
- Department of Medicine, McMaster University, Hamilton, Ontario, Canada
| | - Kevin W Eva
- Department of Medicine, University of British Columbia, Vancouver, British Columbia, Canada
| |
Collapse
|
39
|
Rangel JC, Cartmill C, Kuper A, Martimianakis MA, Whitehead CR. Setting the standard: Medical Education's first 50 years. MEDICAL EDUCATION 2016; 50:24-35. [PMID: 26695464 DOI: 10.1111/medu.12765] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/02/2015] [Revised: 03/03/2015] [Accepted: 03/20/2015] [Indexed: 05/15/2023]
Abstract
CONTEXT By understanding its history, the medical education community gains insight into why it thinks and acts as it does. This piece provides a Foucauldian archaeological critical discourse analysis (CDA) of the journal Medical Education on the publication of its 50th Volume. This analysis draws upon critical social science perspectives to allow the examination of unstated assumptions that underpin and shape educational tools and practices. METHODS A Foucauldian form of CDA was utilised to examine the journal over its first half-century. This approach emphasises the importance of language, and the ways in which words used affect and are affected by educational practices and priorities. An iterative methodology was used to organise the very large dataset (12,000 articles). A distilled dataset, within which particular focus was placed on the editorial pieces in the journal, was analysed. RESULTS A major finding was the diversity of the journal as a site that has permitted multiple - and sometimes contradictory - discursive trends to emerge. One particularly dominant discursive tension across the time span of the journal is that between a persistent drive for standardisation and a continued questioning of the desirability of standardisation. This tension was traced across three prominent areas of focus in the journal: objectivity and the nature of medical education knowledge; universality and local contexts, and the place of medical education between academia and the community. CONCLUSIONS The journal has provided the medical education community with a place in which to both discuss practical pedagogical concerns and ponder conceptual and social issues affecting the medical education community. This dual nature of the journal brings together educators and researchers; it also gives particular focus to a major and rarely cited tension in medical education between the quest for objective standards and the limitations of standard measures.
Collapse
Affiliation(s)
- Jaime C Rangel
- Department of Sociology, University of Toronto, Toronto, ON, Canada
- Wilson Centre, University Health Network, Toronto, ON, Canada
| | - Carrie Cartmill
- Wilson Centre, University Health Network, Toronto, ON, Canada
| | - Ayelet Kuper
- Wilson Centre, University Health Network, Toronto, ON, Canada
- Department of Medicine, Sunnybrook Health Sciences, Toronto, ON, Canada
- Department of Family and Community Medicine, Faculty of Medicine, University of Toronto, Toronto, ON, Canada
| | - Maria A Martimianakis
- Wilson Centre, University Health Network, Toronto, ON, Canada
- Department of Paediatrics, Hospital for Sick Children, University of Toronto, Toronto, ON, Canada
| | - Cynthia R Whitehead
- Wilson Centre, University Health Network, Toronto, ON, Canada
- Department of Family and Community Medicine, Faculty of Medicine, University of Toronto, Toronto, ON, Canada
- Centre for Ambulatory Care Education, Women's College Hospital, Toronto, ON, Canada
| |
Collapse
|
40
|
Tavares W, Boet S. On the Assessment of Paramedic Competence: A Narrative Review with Practice Implications. Prehosp Disaster Med 2015; 31:64-73. [DOI: 10.1017/s1049023x15005166] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]
Abstract
INTRODUCTION Paramedicine is experiencing significant growth in scope of practice, autonomy, and role in the health care system. Despite clinical governance models, the degree to which paramedicine ultimately can be safe and effective will be dependent on the individuals the profession deems suited to practice. This creates an imperative for those responsible for these decisions to ensure that assessments of paramedic competence are indeed accurate, trustworthy, and defensible. PURPOSE The purpose of this study was to explore and synthesize relevant theoretical foundations and literature informing best practices in performance-based assessment (PBA) of competence, as it might be applied to paramedicine, for design or evaluation of assessment programs. METHODS A narrative review methodology was applied to focus intentionally, but broadly, on purpose-relevant, theoretically derived research that could inform assessment protocols in paramedicine. Primary and secondary studies from a number of health professions that contributed to and informed best practices related to the assessment of paramedic clinical competence were included and synthesized. RESULTS Multiple conceptual frameworks, psychometric requirements, and emerging lines of research are forwarded. Seventeen practice implications are derived to promote understanding as well as best practices and evaluation criteria for educators, employers, and/or licensing/certifying bodies when considering the assessment of paramedic competence. CONCLUSIONS The assessment of paramedic competence is a complex process requiring an understanding, appreciation for, and integration of conceptual and psychometric principles. The field of PBA is advancing rapidly with numerous opportunities for research. Tavares W, Boet S. On the assessment of paramedic competence: a narrative review with practice implications. Prehosp Disaster Med. 2016;31(1):64-73.
Collapse
|
41
|
Ilgen JS, Ma IWY, Hatala R, Cook DA. A systematic review of validity evidence for checklists versus global rating scales in simulation-based assessment. MEDICAL EDUCATION 2015; 49:161-73. [PMID: 25626747 DOI: 10.1111/medu.12621] [Citation(s) in RCA: 207] [Impact Index Per Article: 23.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/29/2014] [Revised: 08/01/2014] [Accepted: 09/09/2014] [Indexed: 05/14/2023]
Abstract
CONTEXT The relative advantages and disadvantages of checklists and global rating scales (GRSs) have long been debated. To compare the merits of these scale types, we conducted a systematic review of the validity evidence for checklists and GRSs in the context of simulation-based assessment of health professionals. METHODS We conducted a systematic review of multiple databases including MEDLINE, EMBASE and Scopus to February 2013. We selected studies that used both a GRS and checklist in the simulation-based assessment of health professionals. Reviewers working in duplicate evaluated five domains of validity evidence, including correlation between scales and reliability. We collected information about raters, instrument characteristics, assessment context, and task. We pooled reliability and correlation coefficients using random-effects meta-analysis. RESULTS We found 45 studies that used a checklist and GRS in simulation-based assessment. All studies included physicians or physicians in training; one study also included nurse anaesthetists. Topics of assessment included open and laparoscopic surgery (n = 22), endoscopy (n = 8), resuscitation (n = 7) and anaesthesiology (n = 4). The pooled GRS-checklist correlation was 0.76 (95% confidence interval [CI] 0.69-0.81, n = 16 studies). Inter-rater reliability was similar between scales (GRS 0.78, 95% CI 0.71-0.83, n = 23; checklist 0.81, 95% CI 0.75-0.85, n = 21), whereas GRS inter-item reliabilities (0.92, 95% CI 0.84-0.95, n = 6) and inter-station reliabilities (0.80, 95% CI 0.73-0.85, n = 10) were higher than those for checklists (0.66, 95% CI 0-0.84, n = 4 and 0.69, 95% CI 0.56-0.77, n = 10, respectively). Content evidence for GRSs usually referenced previously reported instruments (n = 33), whereas content evidence for checklists usually described expert consensus (n = 26). Checklists and GRSs usually had similar evidence for relations to other variables. CONCLUSIONS Checklist inter-rater reliability and trainee discrimination were more favourable than suggested in earlier work, but each task requires a separate checklist. Compared with the checklist, the GRS has higher average inter-item and inter-station reliability, can be used across multiple tasks, and may better capture nuanced elements of expertise.
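The pooling approach described in this review (random-effects meta-analysis of reliability and correlation coefficients) can be sketched in a few lines. The Python below is an illustrative outline only, using a DerSimonian-Laird model on Fisher-z transformed correlations; the per-study values are placeholders, not data from the review.

```python
# Illustrative sketch: random-effects pooling of per-study GRS-checklist
# correlations via Fisher-z transformation and DerSimonian-Laird tau^2.
import numpy as np

r = np.array([0.70, 0.82, 0.76, 0.65])   # hypothetical per-study correlations
n = np.array([40, 55, 30, 62])           # hypothetical per-study sample sizes

z = np.arctanh(r)        # Fisher z-transform of each correlation
v = 1.0 / (n - 3)        # within-study variance of z
w = 1.0 / v              # fixed-effect weights

# DerSimonian-Laird estimate of between-study variance tau^2.
z_fixed = np.sum(w * z) / np.sum(w)
Q = np.sum(w * (z - z_fixed) ** 2)
c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
tau2 = max(0.0, (Q - (len(r) - 1)) / c)

w_re = 1.0 / (v + tau2)                       # random-effects weights
z_re = np.sum(w_re * z) / np.sum(w_re)        # pooled z
se = np.sqrt(1.0 / np.sum(w_re))              # standard error of pooled z

pooled_r = np.tanh(z_re)
ci = np.tanh([z_re - 1.96 * se, z_re + 1.96 * se])
print(f"pooled r = {pooled_r:.2f}, 95% CI [{ci[0]:.2f}, {ci[1]:.2f}]")
```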
Collapse
Affiliation(s)
- Jonathan S Ilgen
- Division of Emergency Medicine, Department of Medicine, University of Washington School of Medicine, Seattle, Washington, USA
| | | | | | | |
Collapse
|
42
|
Durning SJ, Lubarsky S, Torre D, Dory V, Holmboe E. Considering "Nonlinearity" Across the Continuum in Medical Education Assessment: Supporting Theory, Practice, and Future Research Directions. THE JOURNAL OF CONTINUING EDUCATION IN THE HEALTH PROFESSIONS 2015; 35:232-243. [PMID: 26378429 DOI: 10.1002/chp.21298] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/05/2023]
Abstract
The purpose of this article is to propose new approaches to assessment that are grounded in educational theory and the concept of "nonlinearity." The new approaches take into account related phenomena such as "uncertainty," "ambiguity," and "chaos." To illustrate these approaches, we will use the example of assessment of clinical reasoning, although the principles we outline may apply equally well to assessment of other constructs in medical education. Theoretical perspectives include a discussion of script theory, assimilation theory, self-regulated learning theory, and situated cognition. Assessment examples to include script concordance testing, concept maps, self-regulated learning microanalytic technique, and work-based assessment, which parallel the above-stated theories, respectively, are also highlighted. We conclude with some practical suggestions for approaching nonlinearity.
Collapse
|
43
|
Yeung E, Woods N, Dubrowski A, Hodges B, Carnahan H. Sensibility of a new instrument to assess clinical reasoning in post-graduate orthopaedic manual physical therapy education. MANUAL THERAPY 2014; 20:303-12. [PMID: 25456273 DOI: 10.1016/j.math.2014.10.001] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/26/2014] [Revised: 07/24/2014] [Accepted: 10/03/2014] [Indexed: 02/07/2023]
Abstract
Sound application of clinical reasoning (CR) by the physical therapist is critical to achieving optimal patient outcomes. As such, it is important for institutions granting certification in orthopaedic manual physical therapy (OMPT) to ensure that the assessment of CR is sufficiently robust. At present, the dearth of validated instruments to assess CR in OMPT presents a serious challenge to certifying institutions. Moreover, the lack of documentation of the development process for instruments that measure CR pose additional challenges. The purpose of this study is to evaluate the sensibility of a newly developed instrument for assessing written responses to a test of CR in OMPT; a 'pilot' phase that examines instrument feasibility and acceptability. Using a sequential mixed-methods approach, Canadian OMPT examiners were recruited to first review and use the instrument. Participants completed a sensibility questionnaire followed by semi-structured interviews, the latter of which were used to elaborate on questionnaire responses regarding feasibility and acceptability. Eleven examiners completed the questionnaire and interviews. Questionnaire results met previously established sensibility criteria, while interview data revealed participants' (dis)comfort with exerting their own judgment and with the rating scale. Quantitative and qualitative data provided valuable insight regarding content validity and issues related to efficiency in assessing CR competence; all of which will ultimately inform further psychometric testing. While results suggest that the new instrument for assessing clinical reasoning in the Canadian certification context is sensible, future research should explore how rater judgment can be utilized effectively and the mental workload associated with appraising clinical reasoning.
Collapse
Affiliation(s)
- Euson Yeung
- Graduate Department of Rehabilitation Sciences, University of Toronto, Canada; The Wilson Centre for Research in Education, University Health Network, Toronto, Canada.
| | - Nicole Woods
- The Wilson Centre for Research in Education, University Health Network, Toronto, Canada
| | - Adam Dubrowski
- Division of Emergency Medicine, Memorial University of Newfoundland, St John's, Canada
| | - Brian Hodges
- Faculty of Medicine, University of Toronto, Canada; Vice-President Education, University Health Network, Canada; Wilson Centre for Research in Education, Richard and Elizabeth Currie Chair in Health Professions Education Research, Toronto, Canada
| | - Heather Carnahan
- School of Human Kinetics and Recreation, Memorial University of Newfoundland, St John's, Canada
| |
Collapse
|
44
|
Norman G. When I say … reliability. MEDICAL EDUCATION 2014; 48:946-947. [PMID: 25200014 DOI: 10.1111/medu.12511] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/05/2013] [Revised: 04/22/2014] [Accepted: 04/22/2014] [Indexed: 06/03/2023]
|
45
|
Boniface K, Yarris LM. Emergency ultrasound: Leveling the training and assessment landscape. Acad Emerg Med 2014; 21:803-5. [PMID: 25117083 DOI: 10.1111/acem.12406] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/27/2022]
Affiliation(s)
- Keith Boniface
- Department of Emergency Medicine, George Washington University Medical Center, Washington, DC
| | - Lalena M. Yarris
- Department of Emergency Medicine, Oregon Health & Science University, Portland, OR
| |
Collapse
|
46
|
Eva KW, Macala C. Multiple mini-interview test characteristics: 'tis better to ask candidates to recall than to imagine. MEDICAL EDUCATION 2014; 48:604-613. [PMID: 24807436 DOI: 10.1111/medu.12402] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/06/2013] [Revised: 07/17/2013] [Accepted: 10/18/2013] [Indexed: 06/03/2023]
Abstract
CONTEXT The multiple mini-interview (MMI), used to facilitate the selection of applicants in health professional programmes, has been shown to be capable of generating reliable data predictive of success. It is a process rather than a single instrument and therefore its psychometric properties can be expected to vary according to the stations generated, the alignment between the stations and the qualities an institution prioritises, and the outcomes used. The purpose of this study was to explore the MMI's test characteristics when station type is manipulated. METHODS A 12-station MMI was established in which four stations were presented in three different ways. These included: situational judgement (SJ) stations, in which applicants were asked to imagine what they would do in specific situations; behavioural interview (BI) stations, in which applicants were asked to recall what they did in experienced situations; and free form (FF) stations, which were unstructured in that the examiner was simply given a brief explanation of the intent of the station without further guidance on how to conduct the discussion. Four circuits of the 12 stations were run with one examiner within each station. Candidates and examiners were surveyed regarding their experience. The reliability of the scores derived from the assessment was analysed separately for each station type. RESULTS A total of 41 medical school candidates participated after completing the regular admission process. Although the score assigned did not differ across station type, BI stations more reliably differentiated between candidates (g = 0.77) than did the other station types (SJ, g = 0.69; FF, g = 0.66). The correlation between actual MMI scores and BI stations was also greatest (BI, r = 0.57; SJ, r = 0.45; FF, r = 0.42). Candidates' opinions indicated that FF stations were more anxiety-provoking, less clear, and more difficult than structured stations (SJ and BI stations). Examiner opinions indicated equivalence on these measures. CONCLUSIONS The results suggest that structuring stations has value, although that value was gained only through the use of BI stations, in which candidates were asked to recall and discuss a specific experience of relevance to the purpose of the interview station.
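The station-type comparison above rests on generalizability (g) coefficients estimated separately for each set of four stations. As a rough illustration of how such a coefficient is commonly obtained for a fully crossed candidates-by-stations design, the Python sketch below estimates variance components from a simple two-way decomposition and projects the reliability of a candidate's mean score over the stations. The function, the simulated scores, and all dimensions here are invented for this example; they are not the authors' data or their actual analysis.

```python
import numpy as np

def g_coefficient(scores: np.ndarray) -> float:
    """Relative G coefficient for a fully crossed persons x stations design
    with a single score per cell (stations as the only random facet)."""
    n_p, n_s = scores.shape
    grand = scores.mean()
    person_means = scores.mean(axis=1)
    station_means = scores.mean(axis=0)

    # Mean squares from the two-way decomposition (no replication).
    ms_person = n_s * np.sum((person_means - grand) ** 2) / (n_p - 1)
    residual = scores - person_means[:, None] - station_means[None, :] + grand
    ms_resid = np.sum(residual ** 2) / ((n_p - 1) * (n_s - 1))

    # Estimated variance components.
    var_person = max((ms_person - ms_resid) / n_s, 0.0)
    var_resid = ms_resid  # person-by-station interaction plus error

    # Reliability of a candidate's mean over n_s stations.
    return var_person / (var_person + var_resid / n_s)

# Illustration: simulated ratings for 41 candidates on 4 stations of one type.
rng = np.random.default_rng(0)
true_ability = rng.normal(7.0, 1.0, size=(41, 1))
scores = true_ability + rng.normal(0.0, 1.2, size=(41, 4))
print(f"G coefficient: {g_coefficient(scores):.2f}")
```

Read this way, the higher g for BI stations simply means that more of the variance in four-station mean scores was attributable to stable differences between candidates rather than to candidate-by-station noise.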
Affiliation(s)
- Kevin W Eva
- Centre for Health Education Scholarship, Faculty of Medicine, University of British Columbia, Vancouver, British Columbia, Canada
47
Sherbino J, Kulasegaram K, Worster A, Norman GR. The reliability of encounter cards to assess the CanMEDS roles. ADVANCES IN HEALTH SCIENCES EDUCATION : THEORY AND PRACTICE 2013; 18:987-96. [PMID: 23307097 DOI: 10.1007/s10459-012-9440-6] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/24/2012] [Accepted: 12/18/2012] [Indexed: 05/25/2023]
Abstract
The purpose of this study was to determine the reliability of a computer-based encounter card (EC) to assess medical students during an emergency medicine rotation. From April 2011 to March 2012, multiple physicians assessed an entire medical school class during their emergency medicine rotation using the CanMEDS framework. At the end of an emergency department shift, an EC was scored (1-10) for each student on Medical Expert, two additional Roles, and an overall score. Analysis of 1,819 ECs (155 of 186 students) revealed the following: Collaborator, Manager, Health Advocate and Scholar were assessed on less than 25% of ECs. On average, each student was assessed 11 times with an inter-rater reliability of 0.6. The largest source of variance was rater bias. A D-study showed that a minimum of 17 ECs were required for a reliability of 0.7. There were moderate to strong correlations between all Roles and the overall score, and the factor analysis revealed all items loading on a single factor, accounting for 87% of the variance. The global assessment of the CanMEDS Roles using ECs has significant variance in estimates of performance, derived from differences between raters. Some Roles are seldom selected for assessment, suggesting that raters have difficulty identifying related performance. Finally, correlation and factor analyses demonstrate that raters are unable to discriminate among Roles and are basing judgments on an overall impression.
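The D-study figure quoted above (at least 17 ECs for a reliability of 0.7) can be approximated from the abstract's rounded numbers with a Spearman-Brown projection, assuming that the reported 0.6 refers to the reliability of a student's mean score over roughly 11 encounter cards. The sketch below is only a back-of-envelope reconstruction under that assumption, not the authors' variance-component D-study, and its helper functions are illustrative.

```python
def spearman_brown(r_one: float, k: int) -> float:
    """Projected reliability of the mean of k observations,
    given the reliability r_one of a single observation."""
    return k * r_one / (1 + (k - 1) * r_one)

def single_obs_reliability(r_k: float, k: int) -> float:
    """Invert Spearman-Brown: reliability of one observation,
    given the reliability r_k of the mean of k observations."""
    return r_k / (k - (k - 1) * r_k)

# Rounded figures from the abstract: reliability ~0.6 based on ~11 encounter cards.
r_one = single_obs_reliability(0.6, 11)   # roughly 0.12 per encounter card

# Smallest number of cards whose projected reliability reaches about 0.7.
for k in range(1, 40):
    if spearman_brown(r_one, k) >= 0.695:  # tolerance reflects the rounded inputs
        print(f"~{k} encounter cards needed "
              f"(projected reliability {spearman_brown(r_one, k):.2f})")
        break
```

With these inputs the per-card reliability comes out near 0.12, so roughly 17 cards are needed before the projected reliability reaches about 0.70, consistent with the reported D-study.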
Affiliation(s)
- Jonathan Sherbino
- Division of Emergency Medicine, Department of Medicine, McMaster University, Hamilton, Canada
48
Johnston JL, Lundy G, McCullough M, Gormley GJ. The view from over there: reframing the OSCE through the experience of standardised patient raters. MEDICAL EDUCATION 2013; 47:899-909. [PMID: 23931539 DOI: 10.1111/medu.12243] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/11/2012] [Revised: 12/15/2012] [Accepted: 03/15/2013] [Indexed: 06/02/2023]
Abstract
CONTEXT Ratings awarded by standardised patients (SPs) in UK objective structured clinical examinations (OSCEs) are typically based on humanistic (non-technical) skills and are complementary to clinician-examiner ratings. In psychometric terms, SP ratings appear to differ from examiner ratings and improve reliability. For the first time, we used qualitative methods from a constructivist perspective to explore SP experiences of rating, and consider how these impact our understanding of assessment. METHODS We used constructivist grounded theory to analyse data from focus groups and individual semi-structured interviews with 38 SPs and four examiners. Inductive coding, theoretical sampling and constant comparison continued until theoretical saturation was achieved. RESULTS Standardised patients assessed students on the core process of relationship building. Three theoretical categories informed this process. The SP identity was strongly vocational and was both enacted and reinforced through rating as SPs exerted their agency to protect future patients by promoting student learning. Expectations of performance drew on individual life experiences in formulating expectations of doctors against which students were measured, and the patient experience was a lens through which all interactions were refracted. Standardised patients experienced the examination as real rather than simulated. They rated holistically, prioritised individuality and person-centredness, and included technical skill because the perception of clinical competence was an inextricable part of the patient experience. CONCLUSIONS The results can be used to reframe understanding of the SP role and of the psychometric discourse of assessment. Ratings awarded by SPs are socially constructed and reveal the complexity of the OSCE process and the unfeasibility of absolute objectivity or standardisation. Standardised patients valued individuality, subjective experience and assessment for learning. The potential of SPs is under-used; their greater involvement should be used to promote real partnership as educators move into a post-psychometric era. New-generation assessments should strive to value subjective experience as well as psychometric data in order to utilise the significant potential for learning within assessment.
Affiliation(s)
- Jennifer L Johnston
- Centre for Medical Education, Queen's University Belfast, Belfast BT9 7BL, UK.
49
Hodges B. Assessment in the post-psychometric era: learning to love the subjective and collective. MEDICAL TEACHER 2013; 35:564-8. [PMID: 23631408 DOI: 10.3109/0142159x.2013.789134] [Citation(s) in RCA: 166] [Impact Index Per Article: 15.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/11/2023]
Abstract
Since the 1970s, assessment of competence in the health professions has been dominated by a discourse of psychometrics that emphasizes the conversion of human behaviors to numbers and prioritizes high-stakes, point-in-time sampling, and standardization. There are many advantages to this approach, including increased fairness to test takers; however, some limitations of overemphasis on this paradigm are evident. Further, two shifts are underway that have significant consequences for assessment. First, as clinical practice becomes more interprofessional and team-based, the locus of competence is shifting from individuals to teams. Second, expensive, high-stakes final examinations are not well suited for longitudinal assessment in workplaces. The result is a need to consider assessment methods that are subjective and collective.
Affiliation(s)
- Brian Hodges
- Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada.
50
Cianciolo AT, Eva KW, Colliver JA. Theory development and application in medical education. TEACHING AND LEARNING IN MEDICINE 2013; 25 Suppl 1:S75-80. [PMID: 24246111 DOI: 10.1080/10401334.2013.842907] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/27/2023]
Abstract
The role and status of theory is by no means a new topic in medical education. Yet summarizing where we have been and where we are going with respect to theory development and application is difficult because our community has not yet fully elucidated what constitutes medical education theory. In this article, we explore the idea of conceptualizing theory as an effect on scholarly dialogue among medical educators. We describe theory-enabled conversation as argumentation, which frames inquiry, permits the evaluation of evidence, and enables the acquisition of community understanding that has utility beyond investigators' local circumstances. We present ideas for assessing argumentation quality and suggest approaches to increasing the frequency and quality of argumentation in the exchange among diverse medical education scholars.
Affiliation(s)
- Anna T Cianciolo
- Department of Medical Education, Southern Illinois University School of Medicine, Springfield, Illinois, USA