1. Donohoe CL, Reilly F, Donnelly S, Cahill RA. Is There Variability in Scoring of Student Surgical OSCE Performance Based on Examiner Experience and Expertise? J Surg Educ 2020;77:1202-1210. [PMID: 32336628] [DOI: 10.1016/j.jsurg.2020.03.009]
Abstract
OBJECTIVE To investigate the influence of clinical experience and content expertise on global assessment scores in a surgical Objective Structured Clinical Examination (OSCE) for senior undergraduate medical students. DESIGN Scripted videos of simulated student performance in an OSCE at two standards (clear pass and borderline) were awarded a global score on each of two rating scales by a range of clinical assessors. Results were analysed by examiner experience and content expertise. SETTING The study was conducted in a large medical school in Ireland. Examiners were consultant and training-grade doctors from three university teaching hospitals. PARTICIPANTS In total, 147 assessors participated. Of these, 75 (51%) were surgeons and 25 (17%) had subspecialty surgical expertise directly relevant to the OSCE station; 41 were consultants. RESULTS Scoring by the responsible academic set the benchmark. By multivariable linear regression analysis, neither clinical experience (consultant status) nor relevant content expertise in surgery was independently predictive of assessor grading for either clear pass or borderline student performance. No educational factor (previous examining experience/training, self-rated confidence in assessment or frame of reference) was significant. Assessor gender (male) was associated with award of a fail grade for borderline performance. Trainees were reliable graders of borderline performance but more lenient than the gold standard for clear pass. We report greater agreement with the gold-standard score using the global descriptive scale, with strong agreement for all assessors in the borderline case. CONCLUSIONS Neither assessor clinical experience nor content expertise is independently predictive of the grade awarded in an OSCE. Where non-experts or trainees assess, we find evidence for use of a descriptive global score to maximise agreement with the expert gold standard, particularly for borderline performance. These results inform the fair and reliable participation of a range of examiners across subspecialty stations in the surgical OSCE format.
Affiliation(s)
- Claire L Donohoe
- Department of Surgery, Mater Misericordiae University Hospital, Dublin, Ireland; Department of Surgery, St James' Hospital, Dublin 8 and Trinity College, Dublin, Ireland
- Frank Reilly
- Department of Surgery, Mater Misericordiae University Hospital, Dublin, Ireland
- Suzanne Donnelly
- Medical Education Unit, School of Medicine, University College Dublin, Dublin, Ireland
- Ronan A Cahill
- Section of Surgery and Surgical Specialities, School of Medicine, University College, Dublin, Ireland; Department of Surgery, Mater Misericordiae University Hospital, Dublin, Ireland
2. Taylor I, Bing-Jonsson PC, Johansen E, Levy-Malmberg R, Fagerström L. The Objective Structured Clinical Examination in evolving nurse practitioner education: a study of students' and examiners' experiences. Nurse Educ Pract 2019;37:115-123. [DOI: 10.1016/j.nepr.2019.04.001]
3. Amini R, Hernandez NC, Keim SM, Gordon PR. Using standardized patients to evaluate medical students' evidence-based medicine skills. J Evid Based Med 2016;9:38-42. [PMID: 26646923] [DOI: 10.1111/jebm.12183]
Abstract
OBJECTIVES To analyze the effectiveness of an Evidence-Based Medicine Objective Structured Clinical Examination (EBM OSCE) with standardized patients for end-of-third-year medical students at our institution. METHODS This was a single-center prospective cross-sectional investigation. As part of the eight-station OSCE, the authors developed and implemented a new 25-minute EBM OSCE station with the goal of evaluating evidence-based medicine skills necessary for daily clinical encounters. The OSCE case involved a highly educated patient with a history of recurrent debilitating migraines who had brought eight specific questions regarding the use of steroids for migraine headaches. Students were provided computer stations equipped to record a log of the searches performed. RESULTS One hundred and four third-year medical students participated in this study. The average number of search tools used by the students was 4 (SD = 2). The 104 students performed a total of 896 searches. The two most commonly used websites were uptodate.com and google.com. Sixty-nine percent (95% CI, 60% to 78%) of students were able to find a meta-analysis regarding the use of dexamethasone for the prevention of rebound migraines. Fifty-two percent of students were able to explain that patients who took dexamethasone had a moderate relative risk (RR, 0.68 to 0.78) of having a recurrent migraine, and 71% of students were able to explain to the standardized patient that the number needed to treat (NNT) for dexamethasone was nine. CONCLUSION The EBM OSCE was successfully integrated into the existing eight-station OSCE and was able to assess students' EBM skills.
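The NNT that students were asked to explain follows from simple arithmetic: NNT is the reciprocal of the absolute risk reduction. The event rates in this sketch are invented for illustration (the abstract reports only the relative-risk range 0.68-0.78 and an NNT of nine); they are chosen so the implied relative risk, 0.39/0.50 = 0.78, sits at the top of the reported range.

```python
# Illustration of the number-needed-to-treat (NNT) arithmetic the station
# asked students to explain. The event rates below are hypothetical; the
# abstract reports only the RR range (0.68 to 0.78) and NNT = 9.

def nnt(control_event_rate: float, treatment_event_rate: float) -> float:
    """NNT is the reciprocal of the absolute risk reduction (ARR)."""
    arr = control_event_rate - treatment_event_rate
    return 1.0 / arr

# Hypothetical: 50% migraine recurrence without dexamethasone vs. 39% with
# it gives an ARR of about 0.11, i.e. roughly 9 patients treated per
# recurrence prevented.
print(round(nnt(0.50, 0.39)))  # -> 9
```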
Affiliation(s)
- Richard Amini
- Department of Emergency Medicine, University of Arizona, Tucson, AZ
- Samuel M Keim
- Department of Emergency Medicine, University of Arizona, Tucson, AZ
- Paul R Gordon
- Department of Family Medicine, University of Arizona, Tucson, AZ
4. van Vught AJAH, Hettinga AM, Denessen EJPG, Gerhardus MJT, Bouwmans GAM, van den Brink GTWJ, Postma CT. Analysis of the level of general clinical skills of physician assistant students using an objective structured clinical examination. J Eval Clin Pract 2015;21:971-975. [PMID: 26376735] [DOI: 10.1111/jep.12418]
Abstract
RATIONALE, AIMS AND OBJECTIVES The physician assistant (PA) is trained to perform clinical tasks traditionally performed by medical doctors (MDs). Previous research showed no difference in the level of clinical skills of PAs compared with MDs within a specific niche, that is, the specialty in which they are employed. However, MDs as well as PAs working within a specialty have to be able to recognize medical problems across the full scope of medicine. The objective was to examine PA students' level of general clinical skills across a breadth of clinical cases. METHOD A cross-sectional study was conducted. PA students and recently graduated MDs in the Netherlands were observed performing clinical skills in an objective structured clinical examination comprising five stations with common medical cases. The levels of mastery of history taking, physical examination, communication and clinical reasoning of PA students and MDs were described as means and standard deviations. Cohen's d was used to present effect sizes. RESULTS PA students and MDs scored about equally on history taking (PA 5.8 ± 0.8 vs. MD 5.7 ± 0.7), physical examination (PA 4.8 ± 1.3 vs. MD 5.4 ± 0.8) and communication (PA 8.2 ± 0.8 vs. MD 8.6 ± 0.5) across the full scope of medicine. On the quality of the report, including the patient management plan, PA students scored a mean of 6.0 ± 0.6 and MDs 6.8 ± 0.6. CONCLUSIONS In this setting in the Netherlands, PA students and MDs score about equally in the appraisal of common cases in medical practice. The slightly lower scores of PA students on clinical reasoning across the full scope of clinical care warrant attention from medical teams working with PAs and from PA training programmes.
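Because the abstract reports group means and standard deviations, the Cohen's d it mentions can be reconstructed for the report-quality outcome. This is a minimal sketch that assumes equal group sizes (the abstract does not state the n per group), so treat the result as illustrative.

```python
# Cohen's d for the reported report-quality scores: PA students 6.0 +/- 0.6
# vs. MDs 6.8 +/- 0.6. Equal group sizes are assumed, which the abstract
# does not state.
from math import sqrt

def cohens_d(mean1: float, sd1: float, mean2: float, sd2: float) -> float:
    pooled_sd = sqrt((sd1 ** 2 + sd2 ** 2) / 2)  # pooled SD for equal n
    return (mean2 - mean1) / pooled_sd

d = cohens_d(6.0, 0.6, 6.8, 0.6)
print(round(d, 2))  # -> 1.33
```

With both SDs equal to 0.6 the pooled SD is just 0.6, so d = 0.8/0.6 ≈ 1.33, conventionally a large effect.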
Affiliation(s)
- Anneke J A H van Vught
- Faculty of Health and Social Studies, HAN University of Applied Sciences, Nijmegen, The Netherlands
- Agatha M Hettinga
- Institute for (Bio)Medical Education, Radboud University Medical Center, Nijmegen, The Netherlands
- Eddie J P G Denessen
- Behavioural Science Institute, Radboud University Nijmegen, Nijmegen, The Netherlands
- Martin J T Gerhardus
- Faculty of Health and Social Studies, HAN University of Applied Sciences, Nijmegen, The Netherlands; Department of Primary and Community Care, Radboud University Medical Center, Nijmegen, The Netherlands
- Geert A M Bouwmans
- Institute for (Bio)Medical Education, Radboud University Medical Center, Nijmegen, The Netherlands
- Cornelis T Postma
- Institute for (Bio)Medical Education, Radboud University Medical Center, Nijmegen, The Netherlands; Department of Internal Medicine, Radboud University Medical Center, Nijmegen, The Netherlands
5. Criscione-Schreiber LG, Sloane RJ, Hawley J, Jonas BL, O'Rourke KS, Bolster MB. Expert panel consensus on assessment checklists for a rheumatology objective structured clinical examination. Arthritis Care Res (Hoboken) 2015;67:898-904. [PMID: 25580581] [DOI: 10.1002/acr.22543]
Abstract
OBJECTIVE While several regional fellowship groups conduct rheumatology objective structured clinical examinations (ROSCEs), none have been validated for use across programs. We aimed to establish agreement among subspecialty experts regarding checklist items for several ROSCE stations. METHODS We administered a 1-round survey to assess the importance of 173 assessment checklist items for 11 possible ROSCE stations. We e-mailed the survey to 127 rheumatology educators from across the US. Participants rated each item's importance on a 5-point Likert scale (1 = not important to 5 = very important). Consensus for high importance was predefined as a lower bound of the 95% confidence interval ≥4.0. RESULTS Twenty-five individuals (20%) completed the expert panel survey. A total of 133 of the 173 items (77%) met the statistical cutoff for consensus to retain. Several items that had population means of ≥4.0 but did not meet the predetermined definition for consensus were rejected. The percentage of retained items for individual stations ranged from 24% to 100%; all items were retained for core elements of patient counseling and radiograph interpretation tasks. Only 24% of items were retained for a rehabilitation medicine station and 60% for a microscope use/synovial fluid analysis station. CONCLUSION This single-round expert panel survey established consensus on 133 items to assess on 11 proposed ROSCE stations. The method used in this study, which engages diverse geographic representation and employs rigorous statistical methods to establish checklist content agreement, can be used in any medical field.
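The consensus rule described above can be sketched in a few lines. The ratings below are invented, and a normal-approximation interval is assumed (the study may have used a t-based interval); the point is to show why an item with a mean ≥4.0 can still be rejected when its ratings are widely spread.

```python
# Sketch of the retention rule: keep an item only when the lower bound of
# the 95% CI for its mean importance rating is >= 4.0. Ratings are invented;
# a normal-approximation CI is assumed.
from statistics import mean, stdev
from math import sqrt

def ci_lower_bound(ratings, z=1.96):
    m, sd, n = mean(ratings), stdev(ratings), len(ratings)
    return m - z * sd / sqrt(n)

def retain_item(ratings, threshold=4.0):
    return ci_lower_bound(ratings) >= threshold

strong_item = [5, 5, 4, 5, 4, 5, 5, 4, 5, 5]  # tightly clustered high ratings
mixed_item  = [5, 3, 4, 5, 2, 4, 5, 3, 4, 5]  # mean of exactly 4.0, wide spread

print(retain_item(strong_item), retain_item(mixed_item))  # -> True False
```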
Affiliation(s)
- Kenneth S O'Rourke
- Wake Forest University School of Medicine, Winston-Salem, North Carolina
6. Bouwmans GAM, Denessen E, Hettinga AM, Michels C, Postma CT. Reliability and validity of an extended clinical examination. Med Teach 2015;37:1072-1077. [PMID: 25683172] [DOI: 10.3109/0142159x.2015.1009423]
Abstract
INTRODUCTION An extended clinical examination (ECE) was administered to 85 final-year medical students at the Radboud University Medical Centre in the Netherlands. The aim of the study was to determine the psychometric quality and the suitability of the ECE as a measurement tool to assess clinical proficiency in eight separate clinical skills. METHODS Generalizability studies were conducted to determine the generalizability coefficient and the sources of variance of the ECE. An additional D-study was performed to estimate the generalizability coefficients for altering numbers of stations. RESULTS The largest sources of variance were skill difficulties (36.18%), the general error term (26.76%) and the rank ordering of skill difficulties across the stations (21.89%). The generalizability coefficient of the entire ECE was above the 0.70 lower bound (G = 0.74). D-studies showed that seven of the eight separate skills could yield sufficient G coefficients if the ECE were lengthened from 8 to 14 stations. DISCUSSION The ECE proved to be a reliable clinical assessment that enables examinees to compose a clinical reasoning path through self-obtained data. The ECE can also be used as an assessment tool for separate clinical skills.
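The D-study logic summarized above can be sketched for a simplified crossed persons-by-stations design: given variance components from a G-study, predict the generalizability coefficient as the number of stations changes. The variance components below are hypothetical, chosen only so that the 8-station value reproduces the reported G = 0.74; the study's actual design and components are richer than this.

```python
# D-study sketch for a crossed p x s (persons-by-stations) design:
# G = var_person / (var_person + var_residual / n_stations).
# Variance components are hypothetical, calibrated to give G = 0.74 at 8
# stations as the article reports for the full ECE.

def g_coefficient(var_person: float, var_residual: float, n_stations: int) -> float:
    # Person variance over person variance plus error averaged over stations.
    return var_person / (var_person + var_residual / n_stations)

var_person, var_residual = 1.0, 2.8  # hypothetical components

for n in (8, 14):
    print(n, round(g_coefficient(var_person, var_residual, n), 2))
    # -> 8 0.74, then 14 0.83
```

Lengthening the examination shrinks the averaged error term, which is why moving from 8 to 14 stations can lift skill-level G coefficients past a 0.70 lower bound.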
Affiliation(s)
- E Denessen
- Radboud University Nijmegen, The Netherlands
- A M Hettinga
- Radboud University Medical Centre, The Netherlands
- C Michels
- Radboud University Nijmegen, The Netherlands
- C T Postma
- Radboud University Medical Centre, The Netherlands
7. Ilgen JS, Ma IWY, Hatala R, Cook DA. A systematic review of validity evidence for checklists versus global rating scales in simulation-based assessment. Med Educ 2015;49:161-173. [PMID: 25626747] [DOI: 10.1111/medu.12621]
Abstract
CONTEXT The relative advantages and disadvantages of checklists and global rating scales (GRSs) have long been debated. To compare the merits of these scale types, we conducted a systematic review of the validity evidence for checklists and GRSs in the context of simulation-based assessment of health professionals. METHODS We conducted a systematic review of multiple databases including MEDLINE, EMBASE and Scopus to February 2013. We selected studies that used both a GRS and checklist in the simulation-based assessment of health professionals. Reviewers working in duplicate evaluated five domains of validity evidence, including correlation between scales and reliability. We collected information about raters, instrument characteristics, assessment context, and task. We pooled reliability and correlation coefficients using random-effects meta-analysis. RESULTS We found 45 studies that used a checklist and GRS in simulation-based assessment. All studies included physicians or physicians in training; one study also included nurse anaesthetists. Topics of assessment included open and laparoscopic surgery (n = 22), endoscopy (n = 8), resuscitation (n = 7) and anaesthesiology (n = 4). The pooled GRS-checklist correlation was 0.76 (95% confidence interval [CI] 0.69-0.81, n = 16 studies). Inter-rater reliability was similar between scales (GRS 0.78, 95% CI 0.71-0.83, n = 23; checklist 0.81, 95% CI 0.75-0.85, n = 21), whereas GRS inter-item reliabilities (0.92, 95% CI 0.84-0.95, n = 6) and inter-station reliabilities (0.80, 95% CI 0.73-0.85, n = 10) were higher than those for checklists (0.66, 95% CI 0-0.84, n = 4 and 0.69, 95% CI 0.56-0.77, n = 10, respectively). Content evidence for GRSs usually referenced previously reported instruments (n = 33), whereas content evidence for checklists usually described expert consensus (n = 26). Checklists and GRSs usually had similar evidence for relations to other variables. 
CONCLUSIONS Checklist inter-rater reliability and trainee discrimination were more favourable than suggested in earlier work, but each task requires a separate checklist. Compared with the checklist, the GRS has higher average inter-item and inter-station reliability, can be used across multiple tasks, and may better capture nuanced elements of expertise.
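The pooled GRS-checklist correlation above (0.76 across 16 studies) comes from a random-effects meta-analysis. One common recipe, sketched here with invented study results, is to Fisher-z transform each correlation, estimate between-study variance with the DerSimonian-Laird method, and back-transform the weighted mean; the review does not state that this exact estimator was used, so this is an illustrative assumption.

```python
# Random-effects pooling of correlations via Fisher z and DerSimonian-Laird.
# The (r, n) pairs are invented, not the studies from this review.
from math import atanh, tanh

def pool_correlations(rs, ns):
    zs = [atanh(r) for r in rs]           # Fisher z transform
    vs = [1 / (n - 3) for n in ns]        # sampling variance of each z
    w = [1 / v for v in vs]               # fixed-effect weights
    z_fixed = sum(wi * zi for wi, zi in zip(w, zs)) / sum(w)
    q = sum(wi * (zi - z_fixed) ** 2 for wi, zi in zip(w, zs))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(rs) - 1)) / c)   # DerSimonian-Laird estimate
    w_re = [1 / (v + tau2) for v in vs]        # random-effects weights
    z_re = sum(wi * zi for wi, zi in zip(w_re, zs)) / sum(w_re)
    return tanh(z_re)                          # back-transform to r

rs = [0.70, 0.82, 0.75, 0.68, 0.80]  # invented per-study correlations
ns = [40, 55, 30, 80, 25]            # invented per-study sample sizes
print(round(pool_correlations(rs, ns), 2))
```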
Affiliation(s)
- Jonathan S Ilgen
- Division of Emergency Medicine, Department of Medicine, University of Washington School of Medicine, Seattle, Washington, USA
8. Daniels VJ, Bordage G, Gierl MJ, Yudkowsky R. Effect of clinically discriminating, evidence-based checklist items on the reliability of scores from an Internal Medicine residency OSCE. Adv Health Sci Educ Theory Pract 2014;19:497-506. [PMID: 24449122] [DOI: 10.1007/s10459-013-9482-4]
Abstract
Objective structured clinical examinations (OSCEs) are used worldwide for summative examinations but often lack acceptable reliability. Research has shown that reliability of scores increases if OSCE checklists for medical students include only clinically relevant items. Also, checklists are often missing evidence-based items that high-achieving learners are more likely to use. The purpose of this study was to determine if limiting checklist items to clinically discriminating items and/or adding missing evidence-based items improved score reliability in an Internal Medicine residency OSCE. Six internists reviewed the traditional checklists of four OSCE stations, classifying items as clinically discriminating or non-discriminating. Two independent reviewers augmented checklists with missing evidence-based items. We used generalizability theory to calculate overall reliability of faculty observer checklist scores from 45 first- and second-year residents and to predict how many 10-item stations would be required to reach a Phi coefficient of 0.8. Removing clinically non-discriminating items from the traditional checklist did not affect the number of stations (15) required to reach a Phi of 0.8 with 10 items. Focusing the checklist on only evidence-based clinically discriminating items increased test score reliability, needing 11 stations instead of 15 to reach 0.8; adding missing evidence-based clinically discriminating items to the traditional checklist modestly improved reliability (needing 14 instead of 15 stations). Checklists composed of evidence-based clinically discriminating items improved the reliability of checklist scores and reduced the number of stations needed for acceptable reliability. Educators should give preference to evidence-based items over non-evidence-based items when developing OSCE checklists.
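The station counts above come from D-study projections: solving for the number of stations needed to reach a target dependability of 0.8. For a simplified persons-by-stations design with coefficient Phi = var_p / (var_p + var_err / n), the required n follows by rearrangement. The variance components below are hypothetical, chosen so the two checklist variants require 15 and 11 stations as reported; the study itself used a fuller generalizability model.

```python
# Solve a simplified D-study for the number of stations needed to reach a
# target dependability coefficient of 0.8. Variance components are
# hypothetical, calibrated to reproduce the 15- and 11-station figures.
from math import ceil

def stations_needed(var_person: float, var_error: float, target_phi: float = 0.8) -> int:
    # From phi = var_p / (var_p + var_err / n):
    # n >= (target / (1 - target)) * var_err / var_p
    return ceil((target_phi / (1 - target_phi)) * var_error / var_person)

# A noisier checklist needs more stations to reach the same dependability.
print(stations_needed(1.0, 3.7))  # traditional checklist -> 15
print(stations_needed(1.0, 2.7))  # evidence-based, discriminating items -> 11
```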
Affiliation(s)
- Vijay J Daniels
- Department of Medicine, University of Alberta, 5-112 Clinical Sciences Building, 11350-83 Avenue NW, Edmonton, AB, T6G 2G3, Canada
9. Objective Structured Clinical Examinations: a guide to development and implementation in orthopaedic residency. J Am Acad Orthop Surg 2013;21:592-600. [PMID: 24084433] [DOI: 10.5435/jaaos-21-10-592]
Abstract
Objective Structured Clinical Examinations (OSCEs) have been used extensively in medical schools and residency programs to evaluate various skills, including the six core competencies outlined by the Accreditation Council for Graduate Medical Education (ACGME). Orthopaedic surgery residency programs will be required by the ACGME to assess residents on core competencies in the Milestone Project. Thus, it is important that evaluations be made in a consistent, objective manner. Orthopaedic residency programs can also use simulation models in the examination to accurately and objectively assess residents' skills as they progress through training. The use of these models will become essential as resident work hours are decreased and opportunities to observe skills become more limited. In addition to providing a method to assess competency, OSCEs are a valuable tool for residents to develop and practice important clinical skills. Here, we describe a method for developing a successful OSCE for use in orthopaedic surgical resident training.
10. Conn JJ, Lake FR, McColl GJ, Bilszta JLC, Woodward-Kron R. Clinical teaching and learning: from theory and research to application. Med J Aust 2012;196:527. [DOI: 10.5694/mja10.11473]
Affiliation(s)
- Jennifer J Conn
- Department of Medicine, Royal Melbourne Hospital, University of Melbourne, Melbourne, VIC
- Fiona R Lake
- School of Medicine and Pharmacology, University of Western Australia, Perth, WA
11. Hofer RE, Nikolaus OB, Pawlina W. Using checklists in a gross anatomy laboratory improves learning outcomes and dissection quality. Anat Sci Educ 2011;4:249-255. [PMID: 21786427] [DOI: 10.1002/ase.243]
Abstract
Checklists have been widely used in the aviation industry ever since aircraft operations became more complex than any single pilot could reasonably remember. More recently, checklists have found their way into medicine, where cognitive function can be compromised by stress and fatigue. The use of checklists in medical education has rarely been reported, especially in the basic sciences. We explored whether the use of a checklist in the gross anatomy laboratory would improve learning outcomes, dissection quality and student satisfaction in the first-year Human Structure didactic block at Mayo Medical School. During the second half of a seven-week anatomy course, dissection teams were given each day a hardcopy checklist of the structures to be identified during that day's dissection. The first half of the course served as the control, as students did not receive any checklists to use during dissection. The measured outcomes were scores on four practice practical examinations and four dissection quality assessments, two each from the first (control) and second halves of the course. A student satisfaction survey was distributed at the end of the course. Examination and dissection scores were analyzed for correlations between practice practical examination score and checklist use. Our data suggest that a daily hardcopy list of anatomical structures for active use in the gross anatomy laboratory increases practice practical examination scores and dissection quality. Students recommend the use of these checklists in future anatomy courses.