1
Grüter AAJ, Toorenvliet BR, Tanis PJ, Tuynman JB. Video-based surgical quality assessment of minimally invasive right hemicolectomy by medical students after specific training. Surgery 2025; 178:108951. [PMID: 39617647 DOI: 10.1016/j.surg.2024.11.002] [Received: 05/18/2024] [Revised: 10/21/2024] [Accepted: 11/02/2024] [Indexed: 01/11/2025]
Abstract
BACKGROUND Recently, a competency assessment tool has been developed within the RIGHT project, a national quality improvement program for minimally invasive right hemicolectomy in patients with colon cancer. This study aimed to evaluate whether trained medical students can reliably evaluate minimally invasive right hemicolectomy videos using a competency assessment tool. METHODS Nine expert colorectal surgeons, 13 trained medical students, and 17 untrained medical students assessed the surgical quality of 6 full-length minimally invasive right hemicolectomy videos with the competency assessment tool. The expert surgeons were trained using the competency assessment tool by the RIGHT project leaders, who were also involved in the development and validation of the competency assessment tool. Training for medical students included anatomy, step-by-step procedure explanation, and competency assessment tool review with 2 supervised video assessments. The untrained students were taught only anatomy and minimally invasive right hemicolectomy steps. The intraclass correlation coefficient was calculated to determine inter-rater reliability, and analysis of variance with the Bonferroni correction for multiple testing was used to assess potential differences between the groups per video. RESULTS The trained students demonstrated an overall excellent inter-rater reliability (intraclass correlation coefficient score of 0.885). When their scores were combined with those of the expert surgeons, a high inter-rater reliability was also demonstrated (intraclass correlation coefficient score of 0.945). Trained students consistently aligned with surgeons' mean total scores, also accurately identifying lower quality surgeries. Untrained students assigned statistically significantly higher scores to the 3 lower quality surgeries as compared with expert surgeons and trained students. 
CONCLUSION Among trained students, excellent inter-rater reliability and concordance with expert colorectal surgeons was found. The study highlights the potential to engage trained medical students for objective minimally invasive right hemicolectomy video assessment.
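The inter-rater reliability figures in the abstract above are intraclass correlation coefficients. The following is a minimal sketch only: the abstract does not state which ICC form was used, so the two-way random-effects, absolute-agreement, single-rater form, ICC(2,1), is assumed here. The function name `icc_2_1` and the sample scores are hypothetical and are not taken from the study.

```python
def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    scores: one row per rated video, one column per rater (no missing values).
    Assumed form; the study may have used a different ICC variant.
    """
    n = len(scores)        # number of videos (subjects)
    k = len(scores[0])     # number of raters
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Sums of squares for a two-way ANOVA without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)                  # between-video mean square
    ms_cols = ss_cols / (k - 1)                  # between-rater mean square
    ms_error = ss_error / ((n - 1) * (k - 1))    # residual mean square

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical total scores: 4 videos, each rated by 3 raters.
example_scores = [[20, 22, 21], [35, 33, 36], [28, 27, 30], [15, 18, 16]]
print(round(icc_2_1(example_scores), 3))
```

The sketch needs at least two videos and two raters, and it divides by zero when every score is identical. Averaged-rater forms of the ICC generally give higher values than this single-rater form, so reported coefficients are only comparable when the form is known.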
Affiliation(s)
- Alexander A J Grüter
- Department of Surgery, Amsterdam UMC location Vrije Universiteit Amsterdam, the Netherlands; Treatment and Quality of Life, Cancer Center Amsterdam, the Netherlands.
- Pieter J Tanis
- Department of Surgery, Amsterdam UMC location University of Amsterdam, the Netherlands; Department of Surgery, Erasmus MC, Rotterdam, the Netherlands
- Jurriaan B Tuynman
- Department of Surgery, Amsterdam UMC location Vrije Universiteit Amsterdam, the Netherlands. https://twitter.com/JurriaanTuynman
2
Jørgensen RJ, Olsen RG, Svendsen MBS, Stadeager M, Konge L, Bjerrum F. Comparing Simulator Metrics and Rater Assessment of Laparoscopic Suturing Skills. J Surg Educ 2023; 80:302-310. [PMID: 37683093 DOI: 10.1016/j.jsurg.2022.09.020] [Received: 04/21/2022] [Revised: 08/17/2022] [Accepted: 09/25/2022] [Indexed: 09/10/2023]
Abstract
BACKGROUND Laparoscopic intracorporeal suturing is important to master, and competence should be ensured using an optimal method in a simulated environment before proceeding to real operations. The objectives of this study were to gather validity evidence for two tools for assessing laparoscopic intracorporeal knot tying and to compare the rater-based assessment of laparoscopic intracorporeal suturing with the assessment based on simulator metrics. METHODS Twenty-eight novices and 19 experienced surgeons performed four laparoscopic sutures on a Simball Box simulator twice. Two surgeons used the Intracorporeal Suturing Assessment Tool (ISAT) for blinded video rating. RESULTS Composite Simulator Score (CSS) had higher test-retest reliability than the ISAT. The correlation between the number of performed procedures including suturing and the ISAT score was 0.51 (p<0.001), and 0.59 (p<0.001) for CSS. Inter-rater reliability was 0.72 (p<0.001) for test 1 and 0.53 (p<0.001) for test 2. The pass/fail rates for ISAT and CSS were similar. CONCLUSION CSS and ISAT provide similar results for assessing laparoscopic suturing but assess different aspects of performance. Using simulator metrics and raters' assessments in combination should be considered for a more comprehensive evaluation of laparoscopic knot-tying competency.
Affiliation(s)
- Rikke Jeong Jørgensen
- Copenhagen Academy for Medical Education and Simulation, Centre for HR and Education, Capital Region, Copenhagen, Denmark.
- Rikke Groth Olsen
- Copenhagen Academy for Medical Education and Simulation, Centre for HR and Education, Capital Region, Copenhagen, Denmark
- Morten Bo Søndergaard Svendsen
- Copenhagen Academy for Medical Education and Simulation, Centre for HR and Education, Capital Region, Copenhagen, Denmark
- Morten Stadeager
- Copenhagen Academy for Medical Education and Simulation, Centre for HR and Education, Capital Region, Copenhagen, Denmark; Department of Surgery, Hvidovre Hospital, Copenhagen University Hospital, Copenhagen, Denmark
- Lars Konge
- Copenhagen Academy for Medical Education and Simulation, Centre for HR and Education, Capital Region, Copenhagen, Denmark; University of Copenhagen, Copenhagen, Denmark
- Flemming Bjerrum
- Copenhagen Academy for Medical Education and Simulation, Centre for HR and Education, Capital Region, Copenhagen, Denmark; Department of Surgery, Herlev-Gentofte Hospital, Herlev, Denmark
3
Jogerst KM, Park YS, Anteby R, Sinyard R, Coe TM, Cassidy D, McKinley SK, Petrusa E, Phitayakorn R, Mohapatra A, Gee DW. Impact of Rater Training on Residents Technical Skill Assessments: A Randomized Trial. J Surg Educ 2022; 79:e225-e234. [PMID: 36333174 DOI: 10.1016/j.jsurg.2022.09.013] [Received: 04/13/2022] [Revised: 08/28/2022] [Accepted: 09/23/2022] [Indexed: 06/16/2023]
Abstract
OBJECTIVE The ACS/APDS Resident Skills Curriculum's Objective Structured Assessment of Technical Skills (OSATS) consists of task-specific checklists and a global rating scale (GRS) completed by raters. Prior work demonstrated a need for rater training. This study evaluates the impact of a rater-training curriculum on scoring discrimination, consistency, and validity for handsewn bowel anastomosis (HBA) and vascular anastomosis (VA). DESIGN/METHODS A rater training video model was developed, which included a GRS orientation and anchoring performances representing the range of potential scores. Faculty raters were randomized to rater training or no rater training and were asked to score videos of resident HBA/VA. Consensus scores were assigned to each video using a modified Delphi process (Gold Score). Trained and untrained scores were analyzed for discrimination and score spread and compared to the Gold Score for relative agreement. RESULTS Eight general and eight vascular surgery faculty were randomized to score 24 HBA/VA videos. Rater training increased rater discrimination and decreased rating scale shrinkage for both VA (mean trained score: 2.83, variance 1.88; mean untrained score: 3.1, variance 1.14, p = 0.007) and HBA (mean trained score: 3.52, variance 1.44; mean untrained score: 3.42, variance 0.96, p = 0.033). On validity analyses, a comparison between each rater group vs Gold Score revealed a moderate training impact for VA, trained κ=0.65 vs untrained κ=0.57 and no impact for HBA, R1 κ = 0.71 vs R2 κ = 0.73. CONCLUSION A rater-training curriculum improved raters' ability to differentiate performance levels and use a wider range of the scoring scale. However, despite rater training, there was persistent disagreement between faculty GRS scores with no groups reaching the agreement threshold for formative assessment. If technical skill exams are incorporated into high stakes assessments, consensus ratings via a standard setting process are likely a more valid option than individual faculty ratings.
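Agreement between each rater group and the Gold Score is reported above as κ. As a hedged illustration of how such a statistic can be computed, the sketch below implements unweighted Cohen's kappa for two sets of ratings of the same videos. The abstract does not say whether the study used a weighted or unweighted kappa, so the unweighted form is an assumption, and the function name `cohen_kappa` and the sample ratings are hypothetical.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two equal-length lists of category labels.

    Assumed form; the study may have used a weighted kappa instead.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)

    # Observed proportion of exact agreement.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Agreement expected by chance from each rater's marginal frequencies.
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    return (observed - expected) / (1 - expected)


# Hypothetical GRS scores (1 to 5) for 8 videos: one rater vs a consensus score.
rater = [3, 4, 2, 5, 3, 1, 4, 3]
gold = [3, 4, 3, 5, 3, 2, 4, 2]
print(round(cohen_kappa(rater, gold), 3))
```

Kappa is undefined when both raters assign a single identical category to every video, because chance agreement is then 1. For ordinal scales such as a global rating scale, a weighted kappa that gives partial credit to near misses is a common alternative.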
Affiliation(s)
- Kristen M Jogerst
- Department of General Surgery, Mayo Clinic Arizona, Phoenix, Arizona; Department of Surgery, Massachusetts General Hospital, Boston, Massachusetts
- Yoon Soo Park
- Department of Emergency Medicine, Massachusetts General Hospital, Boston, Massachusetts
- Roi Anteby
- Department of Surgery, Massachusetts General Hospital, Boston, Massachusetts
- Robert Sinyard
- Department of Surgery, Massachusetts General Hospital, Boston, Massachusetts
- Taylor M Coe
- Department of Surgery, Massachusetts General Hospital, Boston, Massachusetts
- Douglas Cassidy
- Department of Surgery, Massachusetts General Hospital, Boston, Massachusetts
- Sophia K McKinley
- Department of Surgery, Massachusetts General Hospital, Boston, Massachusetts
- Emil Petrusa
- Department of Surgery, Massachusetts General Hospital, Boston, Massachusetts
- Roy Phitayakorn
- Department of Surgery, Massachusetts General Hospital, Boston, Massachusetts
- Abhisekh Mohapatra
- Department of Surgery, Massachusetts General Hospital, Boston, Massachusetts
- Denise W Gee
- Department of Surgery, Massachusetts General Hospital, Boston, Massachusetts.
4
Jogerst KM, Eurboonyanun C, Park YS, Cassidy D, McKinley SK, Hamdi I, Phitayakorn R, Petrusa E, Gee DW. Implementation of the ACS/APDS Resident Skills Curriculum reveals a need for rater training: An analysis using generalizability theory. Am J Surg 2021; 222:541-548. [PMID: 33516415 DOI: 10.1016/j.amjsurg.2021.01.018] [Received: 10/05/2020] [Revised: 12/16/2020] [Accepted: 01/11/2021] [Indexed: 11/25/2022]
Abstract
BACKGROUND The American College of Surgeons (ACS)/Association of Program Directors in Surgery (APDS) Resident Skills Curriculum includes validated task-specific checklists and global rating scales (GRS) for Objective Structured Assessment of Technical Skills (OSATS). However, it does not include instructions on use of these assessment tools. Since consistency of ratings is a key feature of assessment, we explored rater reliability for two skills. METHODS Surgical faculty assessed hand-sewn bowel and vascular anastomoses in real time using the OSATS GRS. OSATS were videotaped and independently evaluated by a research resident and a surgical attending. Rating consistency was estimated using intraclass correlation coefficients (ICC) and generalizability analysis. RESULTS Three-rater ICCs across 24 videos ranged from 0.12 to 0.75. Generalizability reliability coefficients ranged from 0.55 to 0.8. Percent variance attributable to raters ranged from 2.7% to 32.1%. Pairwise agreement showed considerable inconsistency for both tasks. CONCLUSIONS Variability of ratings for these two skills indicates the need for rater training to increase scoring agreement and decrease rater variability for technical skill assessments.
Affiliation(s)
- Kristen M Jogerst
- Department of General Surgery, Mayo Clinic Arizona, 5777 E. Mayo Blvd, Phoenix, AZ, 85054, USA; Department of Surgery, Massachusetts General Hospital, 55 Fruit St, Boston, MA, 02114, USA.
- Chalerm Eurboonyanun
- Department of Surgery, Massachusetts General Hospital, 55 Fruit St, Boston, MA, 02114, USA; Department of Surgery, Khon Kaen University, 123 Tambon Sila, Mueang Khon Kaen District, Khon Kaen 40002, Thailand.
- Yoon Soo Park
- Department of Emergency Medicine, Massachusetts General Hospital, 55 Fruit St, Boston, MA, 02114, USA.
- Douglas Cassidy
- Department of Surgery, Massachusetts General Hospital, 55 Fruit St, Boston, MA, 02114, USA.
- Sophia K McKinley
- Department of Surgery, Massachusetts General Hospital, 55 Fruit St, Boston, MA, 02114, USA.
- Isra Hamdi
- Department of Surgery, Massachusetts General Hospital, 55 Fruit St, Boston, MA, 02114, USA.
- Roy Phitayakorn
- Department of Surgery, Massachusetts General Hospital, 55 Fruit St, Boston, MA, 02114, USA.
- Emil Petrusa
- Department of Surgery, Massachusetts General Hospital, 55 Fruit St, Boston, MA, 02114, USA.
- Denise W Gee
- Department of Surgery, Massachusetts General Hospital, 55 Fruit St, Boston, MA, 02114, USA.
5
Pugh CM, Hashimoto DA, Korndorffer JR. The what? How? And Who? Of video based assessment. Am J Surg 2020; 221:13-18. [PMID: 32665080 DOI: 10.1016/j.amjsurg.2020.06.027] [Received: 06/01/2020] [Accepted: 06/19/2020] [Indexed: 01/25/2023]
Abstract
BACKGROUND Currently, there is significant variability in the development, implementation and overarching goals of video review for assessment of surgical performance. METHODS This paper evaluates the current methods in which video review is used for evaluation of surgical performance and identifies which processes are critical for successful, widespread implementation of video-based assessment. RESULTS Despite the advances in video capture technology and growing interest in video-based assessment, there is a notable gap in the implementation and longitudinal use of formative and summative assessment using video. CONCLUSION Validity, scalability and discoverability are current but removable barriers to video-based assessment.
Affiliation(s)
- Carla M Pugh
- Department of Surgery, Stanford University School of Medicine, 300 Pasteur Drive, Stanford, CA, 94305, USA.
- Daniel A Hashimoto
- Department of Surgery, Massachusetts General Hospital, 55 Fruit Street, Boston, MA, 02114, USA.
- James R Korndorffer
- Department of Surgery, Stanford University School of Medicine, 300 Pasteur Drive, Stanford, CA, 94305, USA.
6
Patnaik R, Anton NE, Stefanidis D. A video anchored rating scale leads to high inter-rater reliability of inexperienced and expert raters in the absence of rater training. Am J Surg 2020; 219:221-226. [PMID: 31918843 PMCID: PMC10495932 DOI: 10.1016/j.amjsurg.2019.12.026] [Received: 06/21/2019] [Revised: 11/01/2019] [Accepted: 12/21/2019] [Indexed: 11/26/2022]
Abstract
BACKGROUND Our objective was to assess the impact of incorporating videos in a behaviorally anchored performance rating scale on the inter-rater reliability (IRR) of expert, intermediate and novice raters. METHODS The Intra-corporeal Suturing Assessment Tool (ISAT) was modified to include short video clips demonstrating poor, average, and expert performances. Blinded raters used this tool to assess videos of trainees performing suturing on a porcine model. Three attending surgeons, 4 residents, and 4 novice raters participated; no rater training was provided. The IRR was then compared among rater groups. RESULTS The IRR using the modified ISAT was high at 0.80 (p < 0.001). Ratings were significantly correlated with trainee objective suturing scores for all rater groups (experts: R = 0.84, residents: R = 0.81, and novices: R = 0.69; p < 0.001). CONCLUSIONS Incorporating video anchors (to define performance) in the ISAT led to high IRR and enabled novices to achieve similar consistency in their ratings as experts.
Affiliation(s)
- Ronit Patnaik
- Indiana University School of Medicine, Department of Surgery, 545 Barnhill Dr. Emerson Hall, Indianapolis, IN, 46202, USA.
- Nicholas E Anton
- Indiana University School of Medicine, Department of Surgery, 545 Barnhill Dr. Emerson Hall, Indianapolis, IN, 46202, USA.
- Dimitrios Stefanidis
- Indiana University School of Medicine, Department of Surgery, 545 Barnhill Dr. Emerson Hall, Indianapolis, IN, 46202, USA.
7
Bilgic E, Alyafi M, Hada T, Landry T, Fried GM, Vassiliou MC. Simulation platforms to assess laparoscopic suturing skills: a scoping review. Surg Endosc 2019; 33:2742-2762. [PMID: 31089881 DOI: 10.1007/s00464-019-06821-y] [Received: 12/29/2018] [Accepted: 05/03/2019] [Indexed: 11/30/2022]
Abstract
BACKGROUND Laparoscopic suturing (LS) has become a common technique used in a variety of advanced laparoscopic procedures. However, LS is a challenging skill to master, and many trainees may not be competent in performing LS at the end of their training. The purpose of this review is to identify simulation platforms available for assessment of LS skills, and determine the characteristics of the platforms and the LS skills that are targeted. METHODS A scoping review was conducted between January 1997 and October 2018 for full-text articles. The search was done in various databases. Only articles written in English or French were included. Additional studies were identified through reference lists. The search terms included "laparoscopic suturing" and "clinical competence." RESULTS Sixty-two studies were selected. The majority of the simulation platforms were box trainers with inanimate tissue, and targeted basic suturing and intracorporeal knot-tying techniques. Most of the validation came from internal structure (rater reliability) and relationship to other variables (compare training levels/case experience, and various metrics). Consequences were not addressed in any of the studies. CONCLUSION We identified many types of simulation platforms that were used for assessing LS skills, with most being for assessment of basic skills. Platforms assessing the competence of trainees for advanced LS skills were limited. Therefore, future research should focus on development of LS tasks that better reflect the needs of the trainees.
Affiliation(s)
- Elif Bilgic
- Steinberg Centre for Simulation and Interactive Learning, McGill University, Montreal, QC, Canada
- Steinberg-Bernstein Centre for Minimally Invasive Surgery and Innovation, McGill University Health Centre, 1650, Cedar Avenue, L9. 313, Montreal, QC, H3G 1A4, Canada
- Motaz Alyafi
- Steinberg-Bernstein Centre for Minimally Invasive Surgery and Innovation, McGill University Health Centre, 1650, Cedar Avenue, L9. 313, Montreal, QC, H3G 1A4, Canada
- Tomonori Hada
- Steinberg-Bernstein Centre for Minimally Invasive Surgery and Innovation, McGill University Health Centre, 1650, Cedar Avenue, L9. 313, Montreal, QC, H3G 1A4, Canada
- Tara Landry
- Montreal General Hospital Medical Library, McGill University Health Centre, Montreal, QC, Canada
- Gerald M Fried
- Steinberg-Bernstein Centre for Minimally Invasive Surgery and Innovation, McGill University Health Centre, 1650, Cedar Avenue, L9. 313, Montreal, QC, H3G 1A4, Canada
- Melina C Vassiliou
- Steinberg-Bernstein Centre for Minimally Invasive Surgery and Innovation, McGill University Health Centre, 1650, Cedar Avenue, L9. 313, Montreal, QC, H3G 1A4, Canada.
8
Cox ML, Risucci DA, Gilmore BF, Nag UP, Turner MC, Sprinkle SR, Migaly J, Sudan R. Validation of the Omni: A Novel, Multimodality, and Longitudinal Surgical Skills Assessment. J Surg Educ 2018; 75:e218-e228. [PMID: 30522827 PMCID: PMC10765322 DOI: 10.1016/j.jsurg.2018.10.012] [Received: 04/02/2018] [Revised: 10/17/2018] [Accepted: 10/21/2018] [Indexed: 06/09/2023]
Abstract
OBJECTIVE The breadth of technical skills included in general surgery training continues to expand. The current competency-based training model requires assessment tools to measure acquisition, learning, and mastery of technical skill longitudinally in a reliable and valid manner. This study describes a novel skills assessment tool, the Omni, which evaluates performance in a broad range of skills over time. DESIGN The 5 Omni tasks, consisting of open bowel anastomosis, knot tying, laparoscopic clover pattern cut, robotic needle drive, and endoscopic bubble pop, were developed by general surgery faculty. Component performance metrics assessed speed, accuracy, and quality, which were scaled into an overall score ranging from 0 to 10 for each task. For each task, ANOVAs with Scheffé's post hoc comparisons and Pearson's chi-squared tests compared performance between 6 resident cohorts (clinical years (CY1-5) and research fellows (RF)). Paired samples t-tests evaluated changes in performance across academic years. Cronbach's alpha coefficient determined the internal consistency of the Omni as an overall assessment. SETTING The Omni was developed by the Department of Surgery at Duke University. Annual assessment and this research study took place in the Surgical Education and Activities Lab. PARTICIPANTS All active general surgery residents in 2 consecutive academic years spanning 2015 to 2017. RESULTS A total of 62 general surgery residents completed the Omni and 39 (67.2%) of those residents completed the assessment in 2 consecutive years. Based on data from all residents' first assessment, statistically significant differences (p < 0.05) were observed among CY cohorts for bowel anastomosis, robotic, and laparoscopic task metrics. By pair-wise comparisons, mean bowel anastomosis scores distinguished CY1 from CY3-5 and CY2 from CY5. Mean robotic scores distinguished CY1 from RF, and mean laparoscopic scores distinguished CY1 from RF, CY3, and CY5 in addition to CY2 from CY3. Mean scores in performance on the knot tying and endoscopic tasks were not significantly different. Statistically significant improvement in mean scores was observed for all tasks from year 1 to year 2 (all p < 0.02). The internal consistency analysis revealed an alpha coefficient of 0.656. CONCLUSIONS The Omni is a novel composite assessment tool for surgical technical skill that utilizes objective measures and scoring algorithms to evaluate performance. In this pilot study, 3 tasks demonstrated discriminative ability of performance by CY, and all 5 tasks demonstrated construct validity by showing longitudinal improvement in performance. Additionally, the Omni has adequate internal consistency for a formative assessment. These results suggest the Omni holds promise for the evaluation of resident technical skill and early identification of outliers requiring intervention.
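The internal consistency reported above is a Cronbach's alpha coefficient. As a minimal sketch of the standard formula, alpha = k/(k-1) × (1 - sum of item variances / variance of total scores), the code below treats each task as one item and uses sample variances. The function name `cronbach_alpha` and the sample task scores are hypothetical and are not the study's data or code.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a score table.

    rows: one row per resident, one column per task score (no missing values).
    Uses sample variances (n - 1 denominator) throughout.
    """
    k = len(rows[0])  # number of items (tasks)
    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical 0-10 scores for 5 tasks from 4 residents.
task_scores = [
    [6, 7, 5, 4, 6],
    [8, 8, 7, 6, 7],
    [4, 5, 3, 5, 4],
    [9, 7, 8, 7, 8],
]
print(round(cronbach_alpha(task_scores), 3))
```

The sketch requires at least two items and two respondents, and it fails with a division by zero when every respondent has the same total score.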
Key Words
- ABS, American Board of Surgery
- ACS, American College of Surgeons
- APDS, Association of Program Directors in Surgery
- CY, clinical year
- FES, Fundamentals of Endoscopic Surgery
- FLS, Fundamentals of Laparoscopic Surgery
- General surgery
- Medical Knowledge
- OSATS, Objective Structured Assessment of Technical Skills
- Omni
- Patient Care
- Practice-Based Learning and Improvement
- REDCap, Research Electronic Data Capture
- RF, research fellow
- Resident
- SD, standard deviation
- Skills assessment
- df, degrees of freedom
Affiliation(s)
- Morgan L Cox
- Department of Surgery, Duke University, Durham, North Carolina.
- Brian F Gilmore
- Department of Surgery, Duke University, Durham, North Carolina
- Uttara P Nag
- Department of Surgery, Duke University, Durham, North Carolina
- Megan C Turner
- Department of Surgery, Duke University, Durham, North Carolina
- John Migaly
- Department of Surgery, Duke University, Durham, North Carolina
- Ranjan Sudan
- Department of Surgery, Duke University, Durham, North Carolina