1. Gui S, Wang Z, Chen J, Zhou X, Zhang C, Cao Y. MT4MTL-KD: A Multi-Teacher Knowledge Distillation Framework for Triplet Recognition. IEEE Transactions on Medical Imaging 2024; 43:1628-1639. [PMID: 38127608 DOI: 10.1109/tmi.2023.3345736] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/23/2023]
Abstract
The recognition of surgical triplets plays a critical role in the practical application of surgical videos. It involves the sub-tasks of recognizing instruments, verbs, and targets, while establishing precise associations between them. Existing methods face two significant challenges in triplet recognition: 1) the imbalanced class distribution of surgical triplets may lead to spurious task association learning, and 2) the feature extractors cannot reconcile local and global context modeling. To overcome these challenges, this paper presents a novel multi-teacher knowledge distillation framework for multi-task triplet learning, known as MT4MTL-KD. MT4MTL-KD leverages teacher models trained on less imbalanced sub-tasks to assist multi-task student learning for triplet recognition. Moreover, we adopt different categories of backbones for the teacher and student models, facilitating the integration of local and global context modeling. To further align the semantic knowledge between the triplet task and its sub-tasks, we propose a novel feature attention module (FAM). This module utilizes attention mechanisms to assign multi-task features to specific sub-tasks. We evaluate the performance of MT4MTL-KD on both the 5-fold cross-validation and the CholecTriplet challenge splits of the CholecT45 dataset. The experimental results consistently demonstrate the superiority of our framework over state-of-the-art methods, achieving significant improvements of up to 6.4% on the cross-validation split.
2. Zang C, Turkcan MK, Narasimhan S, Cao Y, Yarali K, Xiang Z, Szot S, Ahmad F, Choksi S, Bitner DP, Filicori F, Kostic Z. Surgical Phase Recognition in Inguinal Hernia Repair-AI-Based Confirmatory Baseline and Exploration of Competitive Models. Bioengineering (Basel) 2023; 10:654. [PMID: 37370585 DOI: 10.3390/bioengineering10060654] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 05/05/2023] [Revised: 05/18/2023] [Accepted: 05/23/2023] [Indexed: 06/29/2023]
Abstract
Video-recorded robotic-assisted surgeries allow the use of automated computer vision and artificial intelligence/deep learning methods for quality assessment and workflow analysis in surgical phase recognition. We considered a dataset of 209 videos of robotic-assisted laparoscopic inguinal hernia repair (RALIHR) collected from 8 surgeons, defined rigorous ground-truth annotation rules, then pre-processed and annotated the videos. We deployed seven deep learning models to establish the baseline accuracy for surgical phase recognition and explored four advanced architectures. For rapid execution of the studies, we initially engaged three dozen MS-level engineering students in a competitive classroom setting, followed by focused research. We unified the data processing pipeline in a confirmatory study, and explored a number of scenarios which differ in how the DL networks were trained and evaluated. For the scenario with 21 validation videos of all surgeons, the Video Swin Transformer model achieved ~0.85 validation accuracy, and the Perceiver IO model achieved ~0.84. Our studies affirm the necessity of close collaborative research between medical experts and engineers for developing automated surgical phase recognition models deployable in clinical settings.
Affiliation(s)
- Chengbo Zang
  - Department of Electrical Engineering, Columbia University, New York, NY 10027, USA
- Mehmet Kerem Turkcan
  - Department of Electrical Engineering, Columbia University, New York, NY 10027, USA
- Sanjeev Narasimhan
  - Department of Computer Science, Columbia University, New York, NY 10027, USA
- Yuqing Cao
  - Department of Electrical Engineering, Columbia University, New York, NY 10027, USA
- Kaan Yarali
  - Department of Electrical Engineering, Columbia University, New York, NY 10027, USA
- Zixuan Xiang
  - Department of Electrical Engineering, Columbia University, New York, NY 10027, USA
- Skyler Szot
  - Department of Electrical Engineering, Columbia University, New York, NY 10027, USA
- Feroz Ahmad
  - Department of Computer Science, Columbia University, New York, NY 10027, USA
- Sarah Choksi
  - Intraoperative Performance Analytics Laboratory (IPAL), Lenox Hill Hospital, New York, NY 10021, USA
- Daniel P Bitner
  - Intraoperative Performance Analytics Laboratory (IPAL), Lenox Hill Hospital, New York, NY 10021, USA
- Filippo Filicori
  - Intraoperative Performance Analytics Laboratory (IPAL), Lenox Hill Hospital, New York, NY 10021, USA
  - Zucker School of Medicine at Hofstra/Northwell Health, Hempstead, NY 11549, USA
- Zoran Kostic
  - Department of Electrical Engineering, Columbia University, New York, NY 10027, USA