1
Lund S, Navarro S, D'Angelo JD, Park YS, Rivera M. Expanded Access to Video-Based Laparoscopic Skills Assessments: Ease, Reliability, and Accuracy. J Surg Educ 2024;81:850-857. [PMID: 38664172] [DOI: 10.1016/j.jsurg.2024.03.010]
Abstract
OBJECTIVE Video-based performance assessments provide essential feedback to surgical residents, but in-person and remote video-based assessment by trained proctors incurs significant cost. We aimed to determine the reliability, accuracy, and difficulty of untrained attending surgeon raters completing video-based assessments of a basic laparoscopic skill. Secondarily, we aimed to compare reliability and accuracy between 2 different types of assessment tools. DESIGN An anonymous survey was distributed electronically to surgical attendings via a national organizational listserv. Survey items included demographics, rating of video-based assessment experience (1 = have never completed video-based assessments, 5 = often complete video-based assessments), and rating of favorability toward video-based and in-person assessments (0 = not favorable, 100 = favorable). Participants watched 2 laparoscopic peg transfer performances, then rated each performance using an Objective Structured Assessment of Technical Skill (OSATS) form and the McGill Inanimate System for Training and Evaluation of Laparoscopic Skills (MISTELS). Participants then rated assessment completion ease (1 = Very Easy, 5 = Very Difficult). SETTING National survey of practicing surgeons. PARTICIPANTS Sixty-one surgery attendings with experience in laparoscopic surgery from 10 institutions participated as untrained raters. Six experienced laparoscopic skills proctors participated as expert raters. RESULTS Inter-rater reliability was substantial for both OSATS (κ = 0.75) and MISTELS (κ = 0.85). MISTELS accuracy was significantly higher than that of OSATS (κ: MISTELS = 0.18, 95% CI [0.06, 0.29]; OSATS = 0.02, 95% CI [-0.01, 0.04]). While participants were inexperienced with completing video-based assessments (median = 1/5), they perceived video-based assessments favorably (mean = 73.4) and felt assessment completion was "Easy" on average.
CONCLUSIONS We demonstrate that faculty raters untrained in simulation-based assessments can successfully complete video-based assessments of basic laparoscopic skills with substantial inter-rater reliability without marked difficulty. These findings suggest an opportunity to increase access to feedback for trainees using video-based assessment of fundamental skills in laparoscopic surgery.
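The chance-corrected agreement statistics quoted above (κ) can be computed directly from paired ratings. A minimal Cohen's kappa sketch, using made-up pass/fail ratings rather than the study's data:

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: chance-corrected agreement between two raters."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Expected chance agreement if both raters assigned categories independently.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / n ** 2
    return (observed - expected) / (1 - expected)

# Hypothetical pass/fail judgements by two untrained raters on ten videos.
rater_1 = ["pass", "pass", "fail", "pass", "fail", "pass", "pass", "fail", "pass", "pass"]
rater_2 = ["pass", "pass", "fail", "fail", "fail", "pass", "pass", "fail", "pass", "pass"]
print(round(cohens_kappa(rater_1, rater_2), 2))  # → 0.78
```

By the conventional Landis and Koch scale, values between 0.61 and 0.80 are read as substantial agreement, which is the interpretation used in the abstract above.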
Affiliation(s)
- Sarah Lund
- Mayo Clinic Department of Surgery, 200 1st Street SW, Rochester, Minnesota 55905
- Sergio Navarro
- Mayo Clinic Department of Surgery, 200 1st Street SW, Rochester, Minnesota 55905
- Jonathan D D'Angelo
- Mayo Clinic Division of Colon and Rectal Surgery, 200 1st Street SW, Rochester, Minnesota 55905
- Yoon Soo Park
- Department of Medical Education, University of Illinois at Chicago College of Medicine, 808 S Wood Street, Chicago, Illinois 60612
- Mariela Rivera
- Mayo Clinic Division of Trauma, Critical Care, and General Surgery, 200 1st Street SW, Rochester, Minnesota 55905
2
Castanelli DJ, Woods JB, Chander AR, Weller JM. Trainee anaesthetist self-assessment using an entrustment scale in workplace-based assessment. Anaesth Intensive Care 2024:310057X241234676. [PMID: 38649296] [DOI: 10.1177/0310057x241234676]
Abstract
The role of self-assessment in workplace-based assessment remains contested. However, anaesthesia trainees need to learn to judge the quality of their own work. Entrustment scales have facilitated a shared understanding of performance standards among supervisors by aligning assessment ratings with everyday clinical supervisory decisions. We hypothesised that if the entrustment scale similarly helped trainees in their self-assessment, there would be substantial agreement between supervisor and trainee ratings. We collected separate mini-clinical evaluation exercises forms from 113 anaesthesia trainee-supervisor pairs from three hospitals in Australia and New Zealand. We calculated the agreement between trainee and supervisor ratings using Pearson and intraclass correlation coefficients. We also tested for associations with demographic variables and examined narrative comments for factors influencing rating. We found ratings agreed in 32% of cases, with 66% of trainee ratings within one point of the supervisor rating on a nine-point scale. The correlation between trainee and supervisor ratings was 0.71, and the degree of agreement measured by the intraclass correlation coefficient was 0.67. With higher supervisor ratings, trainee ratings better correlated with supervisor ratings. We found no strong association with demographic variables. Possible explanations of divergent ratings included one party being unaware of a vital aspect of the performance and different interpretations of the prospective nature of the scale. The substantial concordance between trainee and supervisor ratings supports the contention that the entrustment scale helped produce a shared understanding of the desired performance standard. Discussion between trainees and supervisors on the reasoning underlying their respective judgements would provide further opportunities to enhance this shared understanding.
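Agreement metrics of the kind reported above (Pearson correlation, proportion of ratings within one point) are straightforward to compute from paired scores. A sketch with invented trainee/supervisor ratings on a nine-point scale, not the study's data:

```python
import math

def pearson_r(x, y):
    """Pearson correlation between paired rating lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def within_k_points(x, y, k=1):
    """Fraction of rating pairs that agree within k points on the scale."""
    return sum(abs(a - b) <= k for a, b in zip(x, y)) / len(x)

# Hypothetical entrustment ratings for ten cases (illustrative only).
trainee    = [5, 6, 7, 4, 8, 6, 5, 7, 9, 6]
supervisor = [6, 6, 8, 5, 8, 7, 4, 7, 9, 3]
print(round(pearson_r(trainee, supervisor), 2), within_k_points(trainee, supervisor))
```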
Affiliation(s)
- Damian J Castanelli
- School of Clinical Sciences at Monash Health, Monash University, Clayton, Australia
- Department of Anaesthesia and Perioperative Medicine, Monash Health, Clayton, Australia
- Jennifer B Woods
- Department of Anaesthesia, Canterbury District Health Board, Christchurch, New Zealand
- Anusha R Chander
- School of Clinical Sciences at Monash Health, Monash University, Clayton, Australia
- Department of Anaesthesia and Perioperative Medicine, Monash Health, Clayton, Australia
- Jennifer M Weller
- Centre for Medical and Health Sciences Education, University of Auckland, Auckland, New Zealand
3
Mizunuma K, Kurashima Y, Poudel S, Watanabe Y, Noji T, Nakamura T, Okamura K, Shichinohe T, Hirano S. Surgical skills assessment of pancreaticojejunostomy using a simulator may predict patient outcomes: A multicenter prospective observational study. Surgery 2023;173:1374-1380. [PMID: 37003952] [DOI: 10.1016/j.surg.2023.02.027]
Abstract
BACKGROUND Pancreatoduodenectomy, an advanced surgical procedure with a high complication rate, requires surgical skill in performing pancreaticojejunostomy, which correlates with operative outcomes. We aimed to analyze the correlation between pancreaticojejunostomy assessment conducted in a simulator environment and in the operating room, and patient clinical outcomes. METHODS We recruited 30 surgeons (with different experience levels in pancreatoduodenectomy) from 11 institutes. Three trained, blinded raters assessed videos of the pancreaticojejunostomy procedure performed in the operating room and in a simulator according to an Objective Structured Assessment of Technical Skill and a newly developed pancreaticojejunostomy assessment scale. The correlations between the assessment scores of the pancreaticojejunostomy performed in the operating room and in the simulator, and between each assessment score and patient outcomes, were calculated. The participants were also surveyed regarding various aspects of the simulator as a training tool. RESULTS There was no correlation between the average score of the pancreaticojejunostomy performed in the operating room and that in the simulator environment (r = 0.047). Pancreaticojejunostomy scores using the simulator were significantly lower in patients with postoperative pancreatic fistula than in those without (P = .05). Multivariate analysis showed that pancreaticojejunostomy assessment scores were independent factors for postoperative pancreatic fistula (P = .09). The participants rated the simulator highly and considered that it had potential as a training tool. CONCLUSION There was no correlation between pancreaticojejunostomy surgical performance in the operating room and in the simulation environment. Surgical skills evaluated in the simulation setting could predict patient surgical outcomes.
Affiliation(s)
- Kenichi Mizunuma
- Department of Gastroenterological Surgery II, Hokkaido University Faculty of Medicine, Sapporo, Japan
- Yo Kurashima
- Department of Gastroenterological Surgery II, Hokkaido University Faculty of Medicine, Sapporo, Japan
- Clinical Simulation Center, Hokkaido University Graduate School of Medicine, Sapporo, Japan
- Saseem Poudel
- Department of Gastroenterological Surgery II, Hokkaido University Faculty of Medicine, Sapporo, Japan
- Yusuke Watanabe
- Clinical Research and Medical Innovation Center, Institute of Health Science Innovation for Medical Care, Hokkaido University Hospital, Sapporo, Japan
- Takehiro Noji
- Department of Gastroenterological Surgery II, Hokkaido University Faculty of Medicine, Sapporo, Japan
- Toru Nakamura
- Department of Gastroenterological Surgery II, Hokkaido University Faculty of Medicine, Sapporo, Japan
- Keisuke Okamura
- Department of Gastroenterological Surgery II, Hokkaido University Faculty of Medicine, Sapporo, Japan
- Toshiaki Shichinohe
- Department of Gastroenterological Surgery II, Hokkaido University Faculty of Medicine, Sapporo, Japan
- Satoshi Hirano
- Department of Gastroenterological Surgery II, Hokkaido University Faculty of Medicine, Sapporo, Japan
4
Kogan JR, Conforti LN, Holmboe ES. Faculty Perceptions of Frame of Reference Training to Improve Workplace-Based Assessment. J Grad Med Educ 2023;15:81-91. [PMID: 36817545] [PMCID: PMC9934818] [DOI: 10.4300/jgme-d-22-00287.1]
Abstract
BACKGROUND Workplace-based assessment (WBA) is a key assessment strategy in competency-based medical education. However, its full potential has not been actualized secondary to concerns with reliability, validity, and accuracy. Frame of reference training (FORT), a rater training technique that helps assessors distinguish between learner performance levels, can improve the accuracy and reliability of WBA, but the effect size is variable. Understanding FORT benefits and challenges helps improve this rater training technique. OBJECTIVE To explore faculty perceptions of the benefits and challenges associated with FORT. METHODS Subjects were internal medicine and family medicine physicians (n=41) who participated in a rater training intervention in 2018 consisting of in-person FORT followed by asynchronous online spaced learning. We assessed participants' perceptions of FORT in post-workshop focus groups and an end-of-study survey. Focus group and survey free-text responses were coded using thematic analysis. RESULTS All subjects participated in 1 of 4 focus groups and completed the survey. Four benefits of FORT were identified: (1) opportunity to apply skills frameworks via deliberate practice; (2) demonstration of the importance of certain evidence-based clinical skills; (3) practice that improved the ability to discriminate between resident skill levels; and (4) highlighting the importance of direct observation and the dangers of using proxy information in assessment. Challenges included time constraints and task repetitiveness. CONCLUSIONS Participants believe that FORT serves multiple purposes, including helping them distinguish between learner skill levels while demonstrating the impact of evidence-based clinical skills and the importance of direct observation.
Affiliation(s)
- Jennifer R. Kogan
- Jennifer R. Kogan, MD, is Associate Dean, Student Success and Professional Development, and Professor of Medicine, Perelman School of Medicine, University of Pennsylvania
- Lisa N. Conforti
- Lisa N. Conforti, MPH, is Research Associate for Milestones Evaluation, Accreditation Council for Graduate Medical Education (ACGME)
- Eric S. Holmboe
- Eric S. Holmboe, MD, is Chief Research, Milestone Development, and Evaluation Officer, ACGME
5
Kogan JR, Dine CJ, Conforti LN, Holmboe ES. Can Rater Training Improve the Quality and Accuracy of Workplace-Based Assessment Narrative Comments and Entrustment Ratings? A Randomized Controlled Trial. Acad Med 2023;98:237-247. [PMID: 35857396] [DOI: 10.1097/acm.0000000000004819]
Abstract
PURPOSE Prior research evaluating workplace-based assessment (WBA) rater training effectiveness has not measured improvement in narrative comment quality and accuracy, nor accuracy of prospective entrustment-supervision ratings. The purpose of this study was to determine whether rater training, using performance dimension and frame of reference training, could improve WBA narrative comment quality and accuracy. A secondary aim was to assess impact on entrustment rating accuracy. METHOD This single-blind, multi-institution, randomized controlled trial of a multifaceted, longitudinal rater training intervention consisted of in-person training followed by asynchronous online spaced learning. In 2018, investigators randomized 94 internal medicine and family medicine physicians involved with resident education. Participants assessed 10 scripted standardized resident-patient videos at baseline and follow-up. Differences in holistic assessment of narrative comment accuracy and specificity, accuracy of individual scenario observations, and entrustment rating accuracy were evaluated with t tests. Linear regression assessed impact of participant demographics and baseline performance. RESULTS Seventy-seven participants completed the study. At follow-up, the intervention group (n = 41), compared with the control group (n = 36), had higher scores for narrative holistic specificity (2.76 vs 2.31, P < .001, Cohen V = .25), accuracy (2.37 vs 2.06, P < .001, Cohen V = .20) and mean quantity of accurate (6.14 vs 4.33, P < .001), inaccurate (3.53 vs 2.41, P < .001), and overall observations (2.61 vs 1.92, P = .002, Cohen V = .47). In aggregate, the intervention group had more accurate entrustment ratings (58.1% vs 49.7%, P = .006, Phi = .30). Baseline performance was significantly associated with performance on final assessments. CONCLUSIONS Quality and specificity of narrative comments improved with rater training; the effect was mitigated by inappropriate stringency. Training improved accuracy of prospective entrustment-supervision ratings, but the effect was more limited. Participants with lower baseline rating skill may benefit most from training.
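Group differences of the kind reported above (intervention vs control mean scores, evaluated with t tests) can be sketched with Welch's two-sample t statistic; the scores below are invented for illustration, not the trial's data:

```python
import math

def welch_t(x, y):
    """Welch's two-sample t statistic and approximate degrees of freedom."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    vx = sum((a - mx) ** 2 for a in x) / (nx - 1)  # sample variances
    vy = sum((b - my) ** 2 for b in y) / (ny - 1)
    se2 = vx / nx + vy / ny
    t = (mx - my) / math.sqrt(se2)
    # Welch-Satterthwaite degrees of freedom.
    df = se2 ** 2 / ((vx / nx) ** 2 / (nx - 1) + (vy / ny) ** 2 / (ny - 1))
    return t, df

# Hypothetical narrative-specificity scores for two small rater groups.
intervention = [2.9, 2.7, 3.0, 2.5, 2.8, 2.6]
control = [2.3, 2.4, 2.1, 2.5, 2.2, 2.4]
t, df = welch_t(intervention, control)
```

The p value would then come from the t distribution with `df` degrees of freedom (e.g., `scipy.stats.t.sf`); only the statistic itself is computed here.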
Affiliation(s)
- Jennifer R Kogan
- J.R. Kogan is associate dean, Student Success and Professional Development, and professor of medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania; ORCID: https://orcid.org/0000-0001-8426-9506
- C Jessica Dine
- C.J. Dine is associate dean, Evaluation and Assessment, and associate professor of medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania; ORCID: https://orcid.org/0000-0001-5894-0861
- Lisa N Conforti
- L.N. Conforti is research associate for milestones evaluation, Accreditation Council for Graduate Medical Education, Chicago, Illinois; ORCID: https://orcid.org/0000-0002-7317-6221
- Eric S Holmboe
- E.S. Holmboe is chief, research, milestones development and evaluation, Accreditation Council for Graduate Medical Education, Chicago, Illinois; ORCID: https://orcid.org/0000-0003-0108-6021
6
Gittinger FP, Lemos M, Neumann JL, Förster J, Dohmen D, Berke B, Olmeo A, Lucas G, Jonas SM. Interrater reliability in the assessment of physiotherapy students. BMC Med Educ 2022;22:186. [PMID: 35296313] [PMCID: PMC8928589] [DOI: 10.1186/s12909-022-03231-y]
Abstract
BACKGROUND Reliable and objective assessment of psychomotor skills in physiotherapy students' education is essential for direct feedback and skill improvement. The aim of this study was to determine the interrater reliability of the assessment process for physiotherapy students and to analyse the assessment behaviour of the examiners. METHODS Physiotherapy teachers from two different schools assessed students from two different schools performing proprioceptive neuromuscular facilitation (PNF) patterns. An evaluation sheet with a 6-point rating scale and 20 evaluation criteria, including an overall rating, was used for assessment. Interrater reliability was determined by calculating an intraclass correlation coefficient (ICC) and Krippendorff's alpha. The assessment behaviour of the examiners was further analysed by calculating location parameters and showing the item response distribution over items in the form of a Likert plot. RESULTS The ICC estimates were mostly below 0.4, indicating poor interrater reliability. This was confirmed by Krippendorff's alpha. The examiners showed a certain central tendency and intergroup bias. DISCUSSION AND CONCLUSION The interrater reliability in this assessment format was rather low. No difference in interrater reliability between the two physiotherapy schools could be identified. Despite certain limitations of this study, there is a definite need to improve the assessment process in physiotherapy education, to provide students with reliable and objective feedback and to ensure a certain level of professional competence. TRIAL REGISTRATION The study was approved by the ethics committee of the Medical Faculty RWTH Aachen University (EK 340/16).
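An intraclass correlation of the general kind reported above can be sketched in a few lines; the version below is the one-way random-effects ICC(1) on invented ratings (the study's exact ICC model and data are not reproduced here):

```python
def icc_oneway(ratings):
    """One-way random-effects ICC(1): each row is a subject, each column a rater."""
    n = len(ratings)      # subjects
    k = len(ratings[0])   # raters per subject
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    # Between-subjects and within-subjects mean squares.
    msb = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    msw = sum((x - m) ** 2 for row, m in zip(ratings, row_means) for x in row) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)

# Hypothetical 6-point-scale ratings: 4 students, each rated by 3 examiners.
ratings = [
    [2, 3, 2],
    [5, 4, 5],
    [3, 3, 4],
    [1, 2, 1],
]
print(round(icc_oneway(ratings), 2))
```

Values below 0.4 are conventionally interpreted as poor reliability, which is the threshold the abstract applies.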
Affiliation(s)
- Flora P Gittinger
- Department of Medical Informatics, Faculty of Medicine, RWTH Aachen University, Pauwelsstraße 30, 52074 Aachen, Germany
- Martin Lemos
- Audiovisual Media Center, Faculty of Medicine, RWTH Aachen University, Aachen, Germany
- Jan L Neumann
- Schule für Physiotherapie, Uniklinik RWTH Aachen, Aachen, Germany
- Jürgen Förster
- Schule für Physiotherapie, Uniklinik RWTH Aachen, Aachen, Germany
- Daniel Dohmen
- Schule für Physiotherapie, Uniklinik RWTH Aachen, Aachen, Germany
- Birgit Berke
- Berufsfachschule für Physiotherapie, Grone-Bildungszentrum für Gesundheits- und Sozialberufe GmbH, Hamburg, Germany
- Anke Olmeo
- Berufsfachschule für Physiotherapie, Grone-Bildungszentrum für Gesundheits- und Sozialberufe GmbH, Hamburg, Germany
- Gisela Lucas
- Berufsfachschule für Physiotherapie, Grone-Bildungszentrum für Gesundheits- und Sozialberufe GmbH, Hamburg, Germany
- Stephan M Jonas
- Department of Medical Informatics, Faculty of Medicine, RWTH Aachen University, Pauwelsstraße 30, 52074 Aachen, Germany
- Department of Informatics, Technical University of Munich, Munich, Germany
- Department of Digital Health, University Hospital Bonn, Bonn, Germany
7
How can surgical skills in laparoscopic colon surgery be objectively assessed?-a scoping review. Surg Endosc 2021;36:1761-1774. [PMID: 34873653] [PMCID: PMC8847271] [DOI: 10.1007/s00464-021-08914-z]
Abstract
Background In laparoscopic colorectal surgery, higher technical skills have been associated with improved patient outcome. With the growing interest in laparoscopic techniques, pressure on surgeons and certifying bodies is mounting to ensure that operative procedures are performed safely and efficiently. The aim of the present review was to comprehensively identify tools for skill assessment in laparoscopic colon surgery and to assess their validity as reported in the literature. Methods A systematic search was conducted in EMBASE and PubMed/MEDLINE in May 2021 to identify studies examining technical skills assessment tools in laparoscopic colon surgery. Available information on validity evidence (content, response process, internal structure, relation to other variables, and consequences) was evaluated for all included tools. Results Fourteen assessment tools were identified, of which most were procedure-specific and video-based. Most tools reported moderate validity evidence. Commonly not reported were rater training, assessment correlation with variables other than training level, and validity reproducibility and reliability in external educational settings. Conclusion The results of this review show that several tools are available for evaluation of laparoscopic colon cancer surgery, but few authors present substantial validity for tool development and use. As we move towards the implementation of new techniques in laparoscopic colon surgery, it is imperative to establish validity before surgical skill assessment tools can be applied to new procedures and settings. Therefore, future studies ought to examine different aspects of tool validity, especially correlation with other variables, such as patient morbidity and pathological reports, which impact patient survival. Supplementary Information The online version contains supplementary material available at 10.1007/s00464-021-08914-z.
8
Nayahangan LJ, Svendsen MBS, Bodtger U, Rahman N, Maskell N, Sidhu JS, Lawaetz J, Clementsen PF, Konge L. Assessment of competence in local anaesthetic thoracoscopy: development and validity investigation of a new assessment tool. J Thorac Dis 2021;13:3998-4007. [PMID: 34422330] [PMCID: PMC8339737] [DOI: 10.21037/jtd-20-3560]
Abstract
Background The aims of the study were to develop an assessment tool for local anaesthetic thoracoscopy (LAT), investigate validity evidence, and establish a pass/fail standard. Methods Validity evidence for the assessment tool was gathered using the unified Messick framework. The tool was developed by five experts in respiratory medicine and medical education. Doctors with varying experience performed two consecutive procedures in a standardized, simulation-based setting using a newly developed thorax/lung silicone model. Performances were video-recorded and assessed by four expert raters using the new tool. The contrasting groups method was used to set a pass/fail standard. Results Nine novices and 8 experienced participants were included, generating 34 recorded performances and 136 expert assessments. The tool had high internal consistency (Cronbach's alpha = 0.94) and high inter-rater reliability (Cronbach's alpha = 0.91). The total item score significantly correlated with the global score (rs = 0.86, P<0.001). Participants' first performance correlated with their second performance (test-retest reliability) with a Pearson's r of 0.93, P<0.001. A generalisability (G) study showed a G-coefficient of 0.92, and a decision (D) study estimated that one performance assessed by two raters, or four performances assessed by one rater, is needed to reach an acceptable reliability, i.e., G-coefficient >0.80. The tool was able to discriminate between the two groups in both performances: experienced mean score = 30.8±4.2; novice mean score = 15.8±2.3, P<0.001. The pass/fail standard was set at 22 points. Conclusions The newly developed assessment tool showed solid evidence of validity and can be used to ensure competence in LAT.
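Internal consistency of the kind reported above (Cronbach's alpha) is computable directly from item-level scores. A minimal sketch on invented checklist data, not the study's:

```python
def cronbach_alpha(items):
    """Cronbach's alpha; `items` is a list of per-item score lists, one per checklist item."""
    k = len(items)       # number of items
    n = len(items[0])    # number of performances scored

    def pvar(xs):
        # Population variance, used consistently for items and totals.
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    totals = [sum(item[i] for item in items) for i in range(n)]
    return k / (k - 1) * (1 - sum(pvar(it) for it in items) / pvar(totals))

# Hypothetical item scores for 5 performances on a 4-item assessment tool.
items = [
    [3, 4, 2, 5, 4],
    [3, 5, 2, 4, 4],
    [2, 4, 3, 5, 3],
    [3, 4, 2, 5, 5],
]
alpha = cronbach_alpha(items)
```

Alpha near 0.9, as reported above, indicates that the items vary together across performances rather than independently.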
Affiliation(s)
- Leizl Joy Nayahangan
- Copenhagen Academy for Medical Education and Simulation, Centre for HR and Education, The Capital Region of Denmark, Copenhagen, Denmark
- Morten Bo Søndergaard Svendsen
- Copenhagen Academy for Medical Education and Simulation, Centre for HR and Education, The Capital Region of Denmark, Copenhagen, Denmark
- Uffe Bodtger
- Department of Internal and Respiratory Medicine, Zealand University Hospital, Roskilde, Denmark
- Department of Respiratory Medicine, Næstved Hospital, Næstved, Denmark
- Najib Rahman
- Nuffield Department of Medicine, Oxford Respiratory Trials Unit, University of Oxford, Oxford, UK
- Nick Maskell
- Academic Respiratory Unit, School of Clinical Sciences, University of Bristol, Bristol, UK
- Jonathan Lawaetz
- Copenhagen Academy for Medical Education and Simulation, Centre for HR and Education, The Capital Region of Denmark, Copenhagen, Denmark
- Department of Vascular Surgery, Rigshospitalet, Copenhagen, Denmark
- Paul Frost Clementsen
- Copenhagen Academy for Medical Education and Simulation, Centre for HR and Education, The Capital Region of Denmark, Copenhagen, Denmark
- Department of Internal and Respiratory Medicine, Zealand University Hospital, Roskilde, Denmark
- Department of Clinical Medicine, University of Copenhagen, Copenhagen, Denmark
- Lars Konge
- Copenhagen Academy for Medical Education and Simulation, Centre for HR and Education, The Capital Region of Denmark, Copenhagen, Denmark
- Department of Clinical Medicine, University of Copenhagen, Copenhagen, Denmark
9
Vergis A, Leung C, Robertson R. Rater Training in Medical Education: A Scoping Review. Cureus 2020;12:e11363. [PMID: 33304696] [PMCID: PMC7721070] [DOI: 10.7759/cureus.11363]
Abstract
There is an increasing focus in medical education on trainee evaluation. Often, reliability and other psychometric properties of evaluations fall below expected standards. Rater training, a process whereby raters undergo instruction on how to consistently evaluate trainees and produce reliable and accurate scores, has been suggested to improve rater performance within the behavioural sciences. A scoping literature review was undertaken to examine the effect of rater training in medical education and address the question: "Does rater training improve the performance of attending physician evaluations of medical trainees?" Two independent reviewers searched PubMed, MEDLINE, EMBASE, the Cochrane Library, CINAHL, ERIC, and PsycInfo databases and identified all prospective studies examining the effect of rater training on physician evaluations of medical trainees. Consolidated Standards of Reporting Trials (CONSORT) and Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) checklists were used to assess quality. Fourteen prospective studies met the inclusion criteria. All had heterogeneity in design, type of rater training, and measured outcomes, so pooled analysis was not performed. Four studies examined rater training used to assess technical skills; none identified a positive effect. Ten studies assessed its use to evaluate non-technical skills: six demonstrated no effect, while four showed a positive effect. The overall quality of the studies was poor to moderate. The rater training literature in medical education is heterogeneous and limited, and describes minimal improvement in the psychometric properties of trainee evaluations when implemented. Further research is required to assess rater training's efficacy in medical education.
Affiliation(s)
- Ashley Vergis
- Surgery, St. Boniface Hospital, University of Manitoba, Winnipeg, CAN
- Caleb Leung
- Surgery, St. Boniface Hospital, University of Manitoba, Winnipeg, CAN
- Reagan Robertson
- Surgery, St. Boniface Hospital, University of Manitoba, Winnipeg, CAN
10
Robertson RL, Park J, Gillman L, Vergis A. The impact of rater training on the psychometric properties of standardized surgical skill assessment tools. Am J Surg 2020;220:610-615. [DOI: 10.1016/j.amjsurg.2020.01.019]
11
Vo TX, Juanda N, Ngu J, Gawad N, LaBelle K, Rubens FD. Development of a median sternotomy simulation model for cardiac surgery training. JTCVS Tech 2020;2:109-116. [PMID: 34317771] [PMCID: PMC8298924] [DOI: 10.1016/j.xjtc.2020.03.007]
Abstract
Objective We sought to develop a simulation model to train resident physicians in the performance of a median sternotomy. Methods A modified Delphi consensus process with cardiac surgery staff was used to develop a 20-point checklist for the safe performance of a median sternotomy. Thirteen junior cardiac surgery trainees from across Canada participated in this study to assess the simulation model. Trainees performed the sternotomy before and after reviewing an instructional video. Two senior cardiac surgery residents assessed the participants with the checklist during each session. Entry and exit questionnaires were given to the participants to evaluate the simulation model. Results Participants scored higher after training (14.3 ± 2.0) than before training (8.0 ± 3.1) (P < .001). Mean time to complete the sternotomy was shorter before training than after (188 ± 52 seconds vs 228 ± 58 seconds; P = .003). Checklist interrater reliability was κ = 0.47 (moderate) before training and κ = 0.37 (fair) after training. All study participants rated the simulation sessions as very useful or extremely useful. Conclusions Using the simulation model, training video, and checklist, trainees improved their skill in performing a median sternotomy. This improvement was associated with longer times to complete all procedure steps. Rater training may further improve interrater reliability. Our median sternotomy checklist and simulation model can be adopted for the technical skills training of future cardiac surgery trainees.
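Pre/post comparisons like the checklist scores above are commonly analysed with a paired t test; a minimal sketch on invented scores (not the study's data):

```python
import math

def paired_t(before, after):
    """Paired t statistic and degrees of freedom for pre/post scores."""
    diffs = [b - a for a, b in zip(before, after)]  # per-trainee improvement
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # sample variance of diffs
    return mean / math.sqrt(var / n), n - 1

# Hypothetical 20-point checklist scores for 8 trainees, before and after the video.
before = [7, 9, 6, 10, 8, 5, 9, 8]
after = [13, 15, 12, 16, 14, 11, 15, 13]
t, df = paired_t(before, after)
```

Pairing each trainee with themselves removes between-trainee variability, which is why this design needs so few participants to detect a training effect.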
Affiliation(s)
- Thin Xuan Vo
- Division of Cardiac Surgery, Department of Surgery, University of Ottawa Heart Institute, Ottawa, Ontario, Canada
- Nadzir Juanda
- Division of Cardiac Surgery, Department of Surgery, University of Ottawa Heart Institute, Ottawa, Ontario, Canada
- Janet Ngu
- Division of Cardiac Surgery, Department of Surgery, University of Ottawa Heart Institute, Ottawa, Ontario, Canada
- Nada Gawad
- Division of General Surgery, Department of Surgery, University of Ottawa, Ottawa, Ontario, Canada
- Kathy LaBelle
- University of Ottawa Skills and Simulation Centre, Ottawa, Ontario, Canada
- Fraser D. Rubens
- Division of Cardiac Surgery, Department of Surgery, University of Ottawa Heart Institute, Ottawa, Ontario, Canada
- Address for reprints: Fraser D. Rubens, MD, MSc, FACS, FRCSC, University of Ottawa Heart Institute, 40 Ruskin St, Ottawa, Ontario K1Y 4W7, Canada
12
Jokinen E, Mikkola TS, Härkki P. Simulator training and residents' first laparoscopic hysterectomy: a randomized controlled trial. Surg Endosc 2019;34:4874-4882. [PMID: 31768724] [PMCID: PMC7572324] [DOI: 10.1007/s00464-019-07270-3]
Abstract
BACKGROUND Hysterectomy rates are decreasing in many countries, and virtual reality simulators bring new opportunities to residents' surgical education. The objective of this study was to evaluate the effect of training with a virtual reality simulator's laparoscopic hysterectomy module on surgical outcomes among residents performing their first laparoscopic hysterectomy. METHODS This randomized study was carried out at the Department of Obstetrics and Gynecology in Helsinki University Hospital and Hyvinkää Hospital. We recruited twenty residents and randomly assigned half of them to train ten times with the laparoscopic hysterectomy module on a virtual reality simulator, while the rest formed the control group. Their first laparoscopic hysterectomy was video recorded and assessed later using the Objective Structured Assessment of Technical Skills (OSATS) forms and a Visual Analog Scale (VAS). The scores and surgical outcomes were compared between the groups. RESULTS The mean OSATS score for the Global Rating Scale (GRS) was 17.0 (SD 3.1) in the intervention group and 11.2 (SD 2.4) in the control group (p = 0.002). The mean procedure-specific OSATS score was 20.0 (SD 3.3) versus 16.0 (SD 2.8) (p = 0.012), and the mean VAS score was 55.0 (SD 14.8) versus 29.9 (SD 14.9) (p = 0.001). Operative time was 144 min in the intervention group and 165 min in the control group, but the difference did not reach statistical significance (p = 0.205). There were no differences between the groups in blood loss or direct complications. CONCLUSION Residents training with a virtual reality simulator prior to their first laparoscopic hysterectomy seem to perform better in the actual live operation. Thus, a virtual reality simulator hysterectomy module could be considered as part of a laparoscopic training curriculum.
Affiliation(s)
- Ewa Jokinen
- Obstetrics and Gynecology, University of Helsinki and Helsinki University Hospital, Haartmaninkatu 2, P.O. Box 140, 00029 HUS, Helsinki, Finland
- Tomi S Mikkola
- Obstetrics and Gynecology, University of Helsinki and Helsinki University Hospital, Haartmaninkatu 2, P.O. Box 140, 00029 HUS, Helsinki, Finland
- Päivi Härkki
- Obstetrics and Gynecology, University of Helsinki and Helsinki University Hospital, Haartmaninkatu 2, P.O. Box 140, 00029 HUS, Helsinki, Finland