1
Ding H, Homer M. Tailoring support following summative assessments: a latent profile analysis of student outcomes across five medical specialities. Advances in Health Sciences Education: Theory and Practice 2024. [PMID: 39042360] [DOI: 10.1007/s10459-024-10357-9]
Abstract
Summative assessments are often underused for feedback, despite being rich in data on students' applied knowledge and clinical and professional skills. To better inform teaching and student support, this study aims to gain insights from summative assessments by profiling students' performance patterns and identifying students who lack the basic knowledge and skills in medical specialities that are essential for their future careers. We use latent profile analysis to classify a senior undergraduate year group (n = 295) based on performance in an applied knowledge test (AKT) and an OSCE, in which items and stations are pre-classified across five specialities (e.g. Acute and Critical Care, Paediatrics, …). We identify four distinct groups of students with increasing average performance in the AKT, and three such groups in the OSCE. Overall, the two classifications are positively correlated, but some students do well in one assessment format and not in the other. Importantly, in both the AKT and the OSCE there is a mixed group containing students who have met the required standard to pass and students who have not, which suggests that the conception of a borderline group at the exam level can be overly simplistic. Little literature relates AKT and OSCE performance in this way, and the paper discusses how the analysis gives placement tutors key insights for providing tailored support to the distinct student groups needing remediation. It also gives assessment writers additional information about the performance and difficulty of their items/stations, and gives wider faculty information about students' overall and speciality-level performance.
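The profiling step can be illustrated with a short sketch. Latent profile analysis on continuous indicators is essentially a Gaussian mixture model, so scikit-learn's GaussianMixture serves here as a stand-in for dedicated LPA software; the speciality names, simulated scores and profile-count range are illustrative assumptions, not the authors' data or code.

```python
# Sketch: latent-profile-style classification of speciality-level AKT scores
# using a Gaussian mixture model. All data below are simulated.
import numpy as np
import pandas as pd
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
specialities = ["acute_care", "paediatrics", "psychiatry",
                "obs_gyn", "general_practice"]
# Simulated percentage scores for a year group of n = 295 students.
akt = pd.DataFrame(rng.normal(65, 10, size=(295, 5)), columns=specialities)

# Fit 1-6 profile solutions and keep the one with the lowest BIC,
# mirroring the usual model-selection step in latent profile analysis.
fits = [GaussianMixture(n_components=k, covariance_type="diag",
                        random_state=0).fit(akt) for k in range(1, 7)]
best = min(fits, key=lambda m: m.bic(akt))

akt["profile"] = best.predict(akt[specialities])
print(f"Selected {best.n_components} profiles")
print(akt.groupby("profile")[specialities].mean().round(1))  # profile means
```

With real assessment data, the per-profile means would surface the increasing-performance groups the study reports, including any mixed group straddling the pass mark.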
Affiliation(s)
- Huiming Ding
- School of Medicine, University of Leeds, Leeds, UK.
- Matt Homer
- School of Medicine, University of Leeds, Leeds, UK.
2
Abed RA, Elaraby SE. Measuring the Effect of Using a Borderline Students’ Characteristics Model on Reliability of Objective Structured Clinical Examination. Cureus 2022;14:e25156. [PMID: 35733486] [PMCID: PMC9205447] [DOI: 10.7759/cureus.25156]
Abstract
Defining the borderline group is a crucial step when applying standard-setting methods in an objective structured clinical examination (OSCE); it is also among the most challenging and demanding tasks. This study aimed to measure the effect of using a model describing the characteristics of borderline students on the reliability and metrics of an OSCE. The model was adopted from a qualitative study based on semi-structured interviews with experienced raters, and it identifies several themes categorized under four items: gathering patient information, examining patients, communicating with patients, and general personal characteristics. Two groups of evaluators were investigated: an experimental group that received an orientation to the model and a control group that did not. We applied the model in two mirrored OSCE circuits. Using the model enhanced raters' global ratings; consequently, the cut scores of the two circuits differed, and the examination's reliability and quality metrics improved.
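For context, global ratings like those the characteristics model is designed to sharpen typically feed a borderline group standard set, where the station cut score is the mean checklist score of candidates rated borderline. A minimal sketch with invented data follows; the paper does not publish its scoring code.

```python
# Sketch: borderline group method for one OSCE station. The cut score is
# the mean checklist score of candidates whom raters judged "borderline".
import pandas as pd

station = pd.DataFrame({
    "checklist": [14, 18, 11, 16, 13, 19, 12, 17],  # scores out of 20
    "rating": ["pass", "pass", "fail", "borderline",
               "borderline", "pass", "fail", "borderline"],
})
cut = station.loc[station["rating"] == "borderline", "checklist"].mean()
print(f"Station cut score: {cut:.2f} / 20")  # (16 + 13 + 17) / 3 = 15.33
```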
3
Homer M. Re-conceptualising and accounting for examiner (cut-score) stringency in a 'high frequency, small cohort' performance test. Advances in Health Sciences Education: Theory and Practice 2021;26:369-383. [PMID: 32876815] [PMCID: PMC8041694] [DOI: 10.1007/s10459-020-09990-x]
Abstract
Variation in examiner stringency is an ongoing problem in many performance settings such as OSCEs, and is usually conceptualised and measured through the scores/grades examiners award. Under borderline regression, the standard within a station is set using checklist/domain scores and global grades acting in combination. This complexity requires a more nuanced view of what stringency might mean when considering sources of variation in station cut-scores. This study uses data from 349 administrations of an 18-station, 36-candidate, single-circuit OSCE for international medical graduates wanting to practise in the UK (PLAB2). The station-level data were gathered over a 34-month period up to July 2019. Linear mixed models are used to estimate and then separate out examiner (n = 547), station (n = 330) and examination (n = 349) effects on borderline regression cut-scores. Examiners are the largest source of variation, accounting for 56% of the variance in cut-scores, compared with 6% for stations, less than 1% for the examination, and 37% residual. Aggregating to the exam level tends to ameliorate this effect: for 96% of examinations, a 'fair' cut-score that equalises out the variation in examiner stringency that candidates experience is within one standard error of measurement (SEM) of the actual cut-score. The addition of the SEM to produce the final pass mark generally protects the public from almost all false positives in the examination caused by examiner cut-score stringency acting in candidates' favour.
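A borderline regression cut-score, the quantity whose variation the mixed models decompose, can be computed as below. This is a minimal sketch with invented grades and scores, not the PLAB2 data or the study's modelling code.

```python
# Sketch: borderline regression for one station. Checklist scores are
# regressed on global grades; the cut score is the fitted score at the
# "borderline" grade point. Grades and scores are invented.
import numpy as np

# Global grades coded 1 = fail, 2 = borderline, 3 = pass, 4 = good.
grades = np.array([1, 2, 2, 3, 3, 3, 4, 4, 2, 3])
scores = np.array([8, 11, 12, 15, 14, 16, 18, 19, 10, 15])  # out of 20

slope, intercept = np.polyfit(grades, scores, 1)  # ordinary least squares
cut_score = slope * 2 + intercept                  # prediction at grade 2
print(f"Station cut score: {cut_score:.2f} / 20")
```

In the study, examiner, station and examination random effects are then fitted to cut-scores of this kind to apportion the 56%/6%/<1% variance shares reported above.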
Affiliation(s)
- Matt Homer
- Leeds Institute of Medical Education, School of Medicine, University of Leeds, Leeds, LS2 9JT, UK.
4
Homer M, Russell J. Conjunctive standards in OSCEs: The why and the how of number of stations passed criteria. Medical Teacher 2021;43:448-455. [PMID: 33290124] [DOI: 10.1080/0142159x.2020.1856353]
Abstract
Introduction: Many institutions require candidates to achieve a minimum number of OSCE stations passed (MNSP) in addition to the aggregate pass mark. The stated rationale is usually that this conjunctive standard prevents excessive compensation across an assessment, yet the practice receives little consideration or discussion in the medical education literature. Methods: We consider the motivations for adopting an MNSP from the assessment designer's perspective, outlining potential concerns about the complexity of what the OSCE is trying to achieve, particularly around the blueprinting process and the limitations of scoring instruments. We also introduce four potential methods for setting an examinee-centred MNSP standard, and briefly highlight the theoretical advantages and disadvantages of each. Discussion and conclusion: There are psychometric arguments both for and against limiting compensation in OSCEs, but it is clear that many stakeholders value the application of an MNSP standard. This paper adds to the limited literature on this important topic and notes that current MNSP practices are often problematic in high-stakes settings. More empirical work is needed to understand how the proposed standard-setting methods affect pass/fail decision-making.
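To make the conjunctive logic concrete, here is a minimal sketch of a pass/fail decision that combines an aggregate cut score with an MNSP rule. All numbers are invented, and the paper's four proposed standard-setting methods for choosing the MNSP are not implemented here.

```python
# Sketch: conjunctive OSCE decision = aggregate standard AND an MNSP rule.
import numpy as np

station_scores = np.array([14, 9, 16, 12, 8, 15, 13, 17])    # one candidate
station_cuts   = np.array([11, 10, 13, 12, 10, 12, 11, 14])  # per-station cuts
aggregate_cut, mnsp = 100, 5  # illustrative standards

stations_passed = int((station_scores >= station_cuts).sum())
overall_pass = station_scores.sum() >= aggregate_cut and stations_passed >= mnsp
print(f"Total: {station_scores.sum()}, stations passed: {stations_passed}, "
      f"result: {'PASS' if overall_pass else 'FAIL'}")
```

The MNSP limits compensation: a candidate can exceed the aggregate cut yet still fail by passing too few individual stations.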
Affiliation(s)
- Matt Homer
- Leeds Institute of Medical Education, School of Medicine, University of Leeds, Leeds, UK.
- Jen Russell
- Leeds Institute of Medical Education, School of Medicine, University of Leeds, Leeds, UK.
5
Lane AS, Roberts C, Khanna P. Do We Know Who the Person With the Borderline Score is, in Standard-Setting and Decision-Making. Health Professions Education 2020. [DOI: 10.1016/j.hpe.2020.07.001]
6
Homer M, Fuller R, Hallam J, Pell G. Shining a spotlight on scoring in the OSCE: Checklists and item weighting. Medical Teacher 2020;42:1037-1042. [PMID: 32608303] [DOI: 10.1080/0142159x.2020.1781072]
Abstract
Introduction: There has been a long-running debate about the validity of item-based checklist scoring in performance assessments such as OSCEs. In recent years, the conception of a checklist has developed from its dichotomous inception into a more 'key-features' and/or chunked approach, in which 'items' can be weighted differently, but the literature does not always reflect these broader conceptions. Methods: We consider theoretical, design and (clinically trained) assessor issues related to differential item weighting in checklist scoring of OSCE stations. Using empirical evidence, we also compare candidate decisions and psychometric quality under different item-weighting approaches (a simple 'unweighted' scheme versus a differentially weighted one). Results: The choice of weighting scheme affects approximately 30% of the key borderline group of candidates, and 3% of candidates overall. Measures of overall assessment quality are also a little better under the differentially weighted scoring system. Discussion and conclusion: Differentially weighted modern checklists can contribute to valid assessment outcomes and bring a range of additional benefits to the assessment. Judgments about the weighting of particular items should be a key design consideration during station development and must align with clinical assessors' expectations of the relative importance of sub-tasks.
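The contrast between the two scoring schemes can be sketched as follows; the item marks and weights are illustrative assumptions, not the study's instruments.

```python
# Sketch: one candidate's checklist scored with equal weights versus
# differential weights that up-weight key clinical items.
import numpy as np

item_marks = np.array([1, 1, 0, 1, 0, 1, 1])  # 1 = done, 0 = not done
weights    = np.array([3, 1, 3, 1, 1, 2, 1])  # assessor-assigned importance

unweighted = 100 * item_marks.mean()
weighted = 100 * (item_marks * weights).sum() / weights.sum()
print(f"Unweighted: {unweighted:.1f}%  Weighted: {weighted:.1f}%")
# This candidate missed a heavily weighted item (weight 3), so the weighted
# score (66.7%) falls below the unweighted one (71.4%); near the cut score,
# such shifts can flip pass/fail decisions for borderline candidates.
```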
Affiliation(s)
- Matt Homer
- Leeds Institute of Medical Education, School of Medicine, University of Leeds, Leeds, UK.
- Richard Fuller
- School of Medicine, University of Liverpool, Liverpool, UK.
- Jennifer Hallam
- Leeds Institute of Medical Education, School of Medicine, University of Leeds, Leeds, UK.
- Godfrey Pell
- Leeds Institute of Medical Education, School of Medicine, University of Leeds, Leeds, UK.
7
Zuberi RW, Klamen DL, Hallam J, Yousuf N, Beason AM, Neumeister EL, Lane R, Ward J. The journeys of three ASPIRE winning medical schools toward excellence in student assessment. Medical Teacher 2019;41:457-464. [PMID: 30451051] [DOI: 10.1080/0142159x.2018.1497783]
Abstract
Introduction: ASPIRE Excellence Awards in Student Assessment are offered to medical schools whose innovative and comprehensive assessment programmes are adjudged by international experts against evidence-based criteria. The journeys of three ASPIRE-winning medical schools toward "assessment excellence" are presented: Aga Khan University Medical College (AKU-MC), Pakistan; Southern Illinois University School of Medicine (SIUSOM), USA; and the University of Leeds School of Medicine, UK. Methods: The unfolding journeys, highlighting the achievements, innovations and essential components of each assessment programme, were compared to identify differences and commonalities. Results: Cultural and contextual differences included developed versus developing country, East versus West, the type of regulatory bodies, and institutional versus national certifying/licensing examinations, all of which influence curricula and assessments. In all, 12 essential commonalities were found: alignment with institutional vision; sustained assessment leadership; stakeholder engagement; communication between curriculum and assessment; assessment-for-learning and feedback; longitudinal student profiling of outcome achievement; assessment rigor and robustness; 360° feedback from and to assessment; continuous enrichment through rigorous quality assurance; societal sensitivity; influencing others; and a "wow factor." Conclusions: Although the journeys of the three medical schools took place in different cultural contexts, their similar core components highlight strong foundations in student assessment. The journeys continue as assessment programmes remain dynamic and measurement science expands. This article may be helpful to other institutions pursuing excellence in assessment.
Affiliation(s)
- Rukhsana W Zuberi
- Department for Educational Development, Aga Khan University, Karachi, Pakistan.
- Debra L Klamen
- Department of Medical Education, Southern Illinois University School of Medicine, Springfield, IL, USA.
- Jennifer Hallam
- Leeds Institute of Medical Education, School of Medicine, Worsley Building, University of Leeds, Leeds, UK.
- Naveed Yousuf
- Department for Educational Development, Aga Khan University, Karachi, Pakistan.
- Austin M Beason
- Department of Medical Education, Southern Illinois University School of Medicine, Springfield, IL, USA.
- Evyn L Neumeister
- Department of Medical Education, Southern Illinois University School of Medicine, Springfield, IL, USA.
- Rob Lane
- Leeds Institute of Medical Education, School of Medicine, Worsley Building, University of Leeds, Leeds, UK.
- Jason Ward
- Leeds Institute of Medical Education, School of Medicine, Worsley Building, University of Leeds, Leeds, UK.
8
Daniels VJ, Pugh D. Twelve tips for developing an OSCE that measures what you want. Medical Teacher 2018;40:1208-1213. [PMID: 29069965] [DOI: 10.1080/0142159x.2017.1390214]
Abstract
The Objective Structured Clinical Examination (OSCE) is used globally for both high- and low-stakes assessment. Despite its extensive use, very few published articles provide a set of best practices for developing an OSCE, and of those that do, none apply a modern understanding of validity. This article provides 12 tips for developing an OSCE, guided by Kane's validity framework, to ensure that the OSCE assesses what it purports to measure. The tips are presented in the order in which they would be operationalized during OSCE development.
Affiliation(s)
- Debra Pugh
- Department of Medicine, University of Ottawa, Ottawa, Canada.