1
Hein J, Cavalcanti N, Suter D, Zingg L, Carrillo F, Calvet L, Farshad M, Navab N, Pollefeys M, Fürnstahl P. Next-generation surgical navigation: Marker-less multi-view 6DoF pose estimation of surgical instruments. Med Image Anal 2025; 103:103613. PMID: 40381257; DOI: 10.1016/j.media.2025.103613.
Abstract
State-of-the-art research of traditional computer vision is increasingly leveraged in the surgical domain. A particular focus in computer-assisted surgery is to replace marker-based tracking systems for instrument localization with pure image-based 6DoF pose estimation using deep-learning methods. However, state-of-the-art single-view pose estimation methods do not yet meet the accuracy required for surgical navigation. In this context, we investigate the benefits of multi-view setups for highly accurate and occlusion-robust 6DoF pose estimation of surgical instruments and derive recommendations for an ideal camera system that addresses the challenges in the operating room. Our contributions are threefold. First, we present a multi-view RGB-D video dataset of ex-vivo spine surgeries, captured with static and head-mounted cameras and including rich annotations for surgeon, instruments, and patient anatomy. Second, we perform an extensive evaluation of three state-of-the-art single-view and multi-view pose estimation methods, analyzing the impact of camera quantities and positioning, limited real-world data, and static, hybrid, or fully mobile camera setups on the pose accuracy, occlusion robustness, and generalizability. Third, we design a multi-camera system for marker-less surgical instrument tracking, achieving an average position error of 1.01mm and orientation error of 0.89° for a surgical drill, and 2.79mm and 3.33° for a screwdriver under optimal conditions. Our results demonstrate that marker-less tracking of surgical instruments is becoming a feasible alternative to existing marker-based systems.
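As a quick illustration of how the reported position and orientation errors can be computed for a 6DoF pose, a minimal sketch follows (assuming poses are given as 3x3 rotation matrices and translation vectors in millimetres; the function name is illustrative and not taken from the paper):

```python
import numpy as np

def pose_errors(R_pred, t_pred, R_gt, t_gt):
    """Position error (same unit as t, e.g. mm) and orientation error (degrees)
    between a predicted and a ground-truth 6DoF pose."""
    # Euclidean distance between the two translations.
    pos_err = np.linalg.norm(t_pred - t_gt)
    # Relative rotation and its angle via the trace formula.
    R_rel = R_pred @ R_gt.T
    cos_angle = np.clip((np.trace(R_rel) - 1.0) / 2.0, -1.0, 1.0)
    rot_err_deg = np.degrees(np.arccos(cos_angle))
    return pos_err, rot_err_deg

# Example: a pose that is 1 mm off and rotated 1 degree about the z-axis.
theta = np.radians(1.0)
Rz = np.array([[np.cos(theta), -np.sin(theta), 0.0],
               [np.sin(theta),  np.cos(theta), 0.0],
               [0.0, 0.0, 1.0]])
print(pose_errors(Rz, np.array([1.0, 0.0, 0.0]), np.eye(3), np.zeros(3)))  # approx (1.0, 1.0)
```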
Affiliation(s)
- Jonas Hein
- Research in Orthopedic Computer Science, Balgrist University Hospital, University of Zurich, Zurich, Switzerland; Computer Vision and Geometry Group, ETH Zurich, Zurich, Switzerland.
- Nicola Cavalcanti
- Research in Orthopedic Computer Science, Balgrist University Hospital, University of Zurich, Zurich, Switzerland
- Daniel Suter
- Balgrist University Hospital, University of Zurich, Zurich, Switzerland
- Lukas Zingg
- Balgrist University Hospital, University of Zurich, Zurich, Switzerland
- Fabio Carrillo
- Research in Orthopedic Computer Science, Balgrist University Hospital, University of Zurich, Zurich, Switzerland; OR-X Translational Center for Surgery, Balgrist University Hospital, University of Zurich, Zurich, Switzerland
- Lilian Calvet
- OR-X Translational Center for Surgery, Balgrist University Hospital, University of Zurich, Zurich, Switzerland
- Mazda Farshad
- Balgrist University Hospital, University of Zurich, Zurich, Switzerland
- Nassir Navab
- Computer Aided Medical Procedures, Technical University Munich, Munich, Germany
- Marc Pollefeys
- Computer Vision and Geometry Group, ETH Zurich, Zurich, Switzerland
- Philipp Fürnstahl
- Research in Orthopedic Computer Science, Balgrist University Hospital, University of Zurich, Zurich, Switzerland; OR-X Translational Center for Surgery, Balgrist University Hospital, University of Zurich, Zurich, Switzerland
2
Power D, Burke C, Madden MG, Ullah I. Automated assessment of simulated laparoscopic surgical skill performance using deep learning. Sci Rep 2025; 15:13591. PMID: 40253514; PMCID: PMC12009314; DOI: 10.1038/s41598-025-96336-5.
Abstract
Artificial intelligence (AI) has the potential to improve healthcare and patient safety and is currently being adopted across various fields of medicine and healthcare. AI and in particular computer vision (CV) are well suited to the analysis of minimally invasive surgical simulation videos for training and performance improvement. CV techniques have rapidly improved in recent years from accurately recognizing objects, instruments, and gestures to phases of surgery and more recently to remembering past surgical steps. Lack of labeled data is a particular problem in surgery considering its complexity, as human annotation and manual assessment are both expensive in time and cost, and in most cases rely on direct intervention of clinical expertise. In this study, we introduce a newly collected simulated Laparoscopic Surgical Performance Dataset (LSPD) specifically designed to address these challenges. Unlike existing datasets that focus on instrument tracking or anatomical structure recognition, the LSPD is tailored for evaluating simulated laparoscopic surgical skill performance at various expertise levels. We provide detailed statistical analyses to identify and compare poorly performed and well-executed operations across different skill levels (novice, trainee, expert) for three specific skills: stack, bands, and tower. We employ a 3-dimensional convolutional neural network (3DCNN) with a weakly-supervised approach to classify the experience levels of surgeons. Our results show that the 3DCNN effectively distinguishes between novices, trainees, and experts, achieving an F1 score of 0.91 and an AUC of 0.92. This study highlights the value of the LSPD dataset and demonstrates the potential of leveraging 3DCNN-based and weakly-supervised approaches to automate the evaluation of surgical performance, reducing reliance on manual expert annotation and assessments. These advancements contribute to improving surgical training and performance analysis.
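For readers who want a concrete starting point, the following is a minimal sketch of a clip-level 3D CNN classifier for three skill classes (novice, trainee, expert); it is illustrative only and does not reproduce the architecture or training protocol described in the paper:

```python
import torch
import torch.nn as nn

class Tiny3DCNN(nn.Module):
    """Illustrative clip-level classifier: video clips (B, C, T, H, W) -> 3 skill classes."""
    def __init__(self, num_classes: int = 3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d(2),
            nn.Conv3d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),          # global spatio-temporal pooling
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, x):
        f = self.features(x).flatten(1)       # (B, 32)
        return self.classifier(f)             # (B, num_classes) logits

clip = torch.randn(2, 3, 16, 112, 112)        # two clips of 16 RGB frames each
logits = Tiny3DCNN()(clip)
print(logits.shape)                           # torch.Size([2, 3])
```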
Affiliation(s)
- David Power
- ASSERT Centre, College of Medicine and Health, University College Cork, Cork, Ireland.
- Cathy Burke
- Cork University Maternity Hospital, Cork, Ireland
- Michael G Madden
- School of Computer Science, University of Galway, Galway, Ireland
- Insight Research Ireland Centre for Data Analytics and Data Science Institute, University of Galway, Galway, Ireland
- Ihsan Ullah
- School of Computer Science, University of Galway, Galway, Ireland
- Insight Research Ireland Centre for Data Analytics and Data Science Institute, University of Galway, Galway, Ireland
3
Magro M, Covallero N, Gambaro E, Ruffaldi E, De Momi E. A dual-instrument Kalman-based tracker to enhance robustness of microsurgical tools tracking. Int J Comput Assist Radiol Surg 2024; 19:2351-2362. PMID: 39133431; DOI: 10.1007/s11548-024-03246-4.
Abstract
PURPOSE The integration of a surgical robotic instrument tracking module within optical microscopes holds the potential to advance microsurgery practices, as it facilitates automated camera movements, thereby augmenting the surgeon's capability in executing surgical procedures. METHODS In the present work, an innovative detection backbone based on a spatial attention module is implemented to enhance the detection accuracy of small objects within the image. Additionally, we have introduced a robust data association technique, capable of re-tracking surgical instruments, mainly based on the knowledge of the dual-instrument robotic system, the Intersection over Union metric, and a Kalman filter. RESULTS The effectiveness of this pipeline was evaluated through testing on a dataset comprising ten manually annotated videos of anastomosis procedures involving either animal or phantom vessels, exploiting the Symani® Surgical System, a dedicated robotic platform designed for microsurgery. The multiple object tracking precision (MOTP) and the multiple object tracking accuracy (MOTA) are used to evaluate the performance of the proposed approach, and a new metric is computed to demonstrate the efficacy in stabilizing the tracking result along the video frames. An average MOTP of 74±0.06% and a MOTA of 99±0.03% over the test videos were found. CONCLUSION These results confirm the potential of the proposed approach in enhancing precision and reliability in microsurgical instrument tracking. Thus, the integration of attention mechanisms and a tailored data association module could be a solid base for automating the motion of optical microscopes.
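The core of such a tracker is associating each (Kalman-)predicted instrument box with the detection that overlaps it the most. A minimal sketch of greedy IoU-based association for a two-instrument setting follows (illustrative only; the paper's full pipeline additionally uses the Kalman filter for prediction and re-tracking):

```python
import numpy as np

def iou(a, b):
    """IoU of two boxes in (x1, y1, x2, y2) format."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def associate(track_boxes, det_boxes, iou_thresh=0.3):
    """Greedy IoU matching of predicted track boxes to detections.
    Returns {track_index: detection_index}; unmatched tracks keep their prediction."""
    matches, used = {}, set()
    pairs = [(iou(t, d), ti, di) for ti, t in enumerate(track_boxes)
                                 for di, d in enumerate(det_boxes)]
    for score, ti, di in sorted(pairs, reverse=True):
        if score < iou_thresh or ti in matches or di in used:
            continue
        matches[ti] = di
        used.add(di)
    return matches

tracks = [np.array([10, 10, 50, 50]), np.array([100, 100, 150, 150])]
dets   = [np.array([102, 98, 149, 151]), np.array([12, 11, 52, 49])]
print(associate(tracks, dets))   # track 0 -> detection 1, track 1 -> detection 0
```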
Affiliation(s)
- Mattia Magro
- Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milan, Italy.
- Medical Microinstruments, Inc., Wilmington, USA.
- Elena De Momi
- Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milan, Italy
4
Wang H, Yang G, Zhang S, Qin J, Guo Y, Xu B, Jin Y, Zhu L. Video-Instrument Synergistic Network for Referring Video Instrument Segmentation in Robotic Surgery. IEEE Trans Med Imaging 2024; 43:4457-4469. PMID: 38990752; DOI: 10.1109/tmi.2024.3426953.
Abstract
Surgical instrument segmentation is fundamentally important for facilitating cognitive intelligence in robot-assisted surgery. Although existing methods have achieved accurate instrument segmentation results, they simultaneously generate segmentation masks of all instruments, which lack the capability to specify a target object and allow an interactive experience. This paper focuses on a novel and essential task in robotic surgery, i.e., Referring Surgical Video Instrument Segmentation (RSVIS), which aims to automatically identify and segment the target surgical instruments from each video frame, referred by a given language expression. This interactive feature offers enhanced user engagement and customized experiences, greatly benefiting the development of the next generation of surgical education systems. To achieve this, this paper constructs two surgery video datasets to promote the RSVIS research. Then, we devise a novel Video-Instrument Synergistic Network (VIS-Net) to learn both video-level and instrument-level knowledge to boost performance, while previous work only utilized video-level information. Meanwhile, we design a Graph-based Relation-aware Module (GRM) to model the correlation between multi-modal information (i.e., textual description and video frame) to facilitate the extraction of instrument-level information. Extensive experimental results on two RSVIS datasets exhibit that the VIS-Net can significantly outperform existing state-of-the-art referring segmentation methods. We will release our code and dataset for future research (https://github.com/whq-xxh/RSVIS).
5
Pan X, Bi M, Wang H, Ma C, He X. DBH-YOLO: a surgical instrument detection method based on feature separation in laparoscopic surgery. Int J Comput Assist Radiol Surg 2024; 19:2215-2225. PMID: 38613730; DOI: 10.1007/s11548-024-03115-0.
Abstract
PURPOSE Accurately locating and analysing surgical instruments in laparoscopic surgical videos can assist doctors in postoperative quality assessment. This can provide patients with more scientific and rational solutions for healing surgical complications. Therefore, we propose an end-to-end algorithm for the detection of surgical instruments. METHODS Dual-Branched Head (DBH) and Overall Intersection over Union Loss (OIoU Loss) are introduced to solve the problem of inaccurate surgical instrument detection, in terms of both localization and classification. An effective method (DBH-YOLO) for surgical instrument detection in complex laparoscopic scenarios is proposed. This study also manually annotates a new surgical instrument localization dataset, LGIL, from laparoscopic gastric cancer resections, which provides a better validation platform for surgical instrument detection methods. RESULTS The proposed method's performance was tested using the m2cai16-tool-locations, LGIL, and Onyeogulu datasets. The mean Average Precision (mAP) values obtained were 96.8%, 95.6%, and 98.4%, respectively, which were higher than those of the other classical models compared. The improved model is more effective than the benchmark network in distinguishing between surgical instrument classes with high similarity and in avoiding missed detections. CONCLUSIONS In this paper, the problem of inaccurate detection of surgical instruments is addressed from two different perspectives: classification and localization. The experimental results on three representative datasets verify the performance of DBH-YOLO and show that the method has good generalization capability.
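OIoU Loss is the paper's own formulation; as a hedged illustration of the underlying idea of optimizing box overlap directly, a generic IoU loss (1 - IoU) for bounding-box regression can be sketched as:

```python
import torch

def iou_loss(pred, target, eps=1e-7):
    """Generic IoU loss (1 - IoU) for boxes in (x1, y1, x2, y2) format, shape (N, 4)."""
    x1 = torch.max(pred[:, 0], target[:, 0])
    y1 = torch.max(pred[:, 1], target[:, 1])
    x2 = torch.min(pred[:, 2], target[:, 2])
    y2 = torch.min(pred[:, 3], target[:, 3])
    inter = (x2 - x1).clamp(min=0) * (y2 - y1).clamp(min=0)
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    iou = inter / (area_p + area_t - inter + eps)
    return (1.0 - iou).mean()

pred = torch.tensor([[10.0, 10.0, 50.0, 50.0]])
target = torch.tensor([[12.0, 11.0, 52.0, 49.0]])
print(iou_loss(pred, target))   # small value, since the boxes overlap strongly
```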
Affiliation(s)
- Xiaoying Pan
- School of Computer Science and Technology, Xi'an University of Posts and Telecommunications, GuoDu, Xi'an, 710121, Shaanxi, China.
- Manrong Bi
- School of Computer Science and Technology, Xi'an University of Posts and Telecommunications, GuoDu, Xi'an, 710121, Shaanxi, China
- Hao Wang
- School of Software, Northwestern Polytechnical University, Xi'an, 710072, China
- Chenyang Ma
- School of Computer Science and Technology, Xi'an University of Posts and Telecommunications, GuoDu, Xi'an, 710121, Shaanxi, China
- Xianli He
- Department of General Surgery, Tangdu Hospital, Air Force Medical University, Xi'an, 710038, Shaanxi, China
6
Lin W, Hu Y, Fu H, Yang M, Chng CB, Kawasaki R, Chui C, Liu J. Instrument-Tissue Interaction Detection Framework for Surgical Video Understanding. IEEE Trans Med Imaging 2024; 43:2803-2813. PMID: 38530715; DOI: 10.1109/tmi.2024.3381209.
Abstract
The instrument-tissue interaction detection task, which helps in understanding surgical activities, is vital for constructing computer-assisted surgery systems but poses many challenges. Firstly, most models represent instrument-tissue interaction in a coarse-grained way that only focuses on classification and lacks the ability to automatically detect instruments and tissues. Secondly, existing works do not fully consider the intra- and inter-frame relations between instruments and tissues. In this paper, we propose to represent instrument-tissue interaction as a 〈 instrument class, instrument bounding box, tissue class, tissue bounding box, action class 〉 quintuple and present an Instrument-Tissue Interaction Detection Network (ITIDNet) to detect the quintuple for surgery video understanding. Specifically, we propose a Snippet Consecutive Feature (SCF) Layer to enhance features by modeling relationships of proposals in the current frame using global context information in the video snippet. We also propose a Spatial Corresponding Attention (SCA) Layer to incorporate features of proposals between adjacent frames through spatial encoding. To reason about relationships between instruments and tissues, a Temporal Graph (TG) Layer is proposed with intra-frame connections to exploit relationships between instruments and tissues in the same frame and inter-frame connections to model the temporal information for the same instance. For evaluation, we build a cataract surgery video (PhacoQ) dataset and a cholecystectomy surgery video (CholecQ) dataset. Experimental results demonstrate the promising performance of our model, which outperforms other state-of-the-art models on both datasets.
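The quintuple representation described above can be captured in a simple data structure; the following sketch is illustrative, and the class and action names are made-up examples rather than labels from the PhacoQ or CholecQ datasets:

```python
from dataclasses import dataclass
from typing import Tuple

Box = Tuple[float, float, float, float]   # (x1, y1, x2, y2)

@dataclass
class InteractionQuintuple:
    """One instrument-tissue interaction in a frame, following the
    <instrument class, instrument box, tissue class, tissue box, action class> scheme."""
    instrument_class: str
    instrument_box: Box
    tissue_class: str
    tissue_box: Box
    action_class: str

q = InteractionQuintuple("grasper", (120, 80, 220, 160),
                         "gallbladder", (100, 60, 400, 300), "retract")
print(q)
```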
7
Zhang Y, Ye X, Wu W, Luo Y, Chen M, Du Y, Wen Y, Song H, Liu Y, Zhang G, Wang L. Morphological Rule-Constrained Object Detection of Key Structures in Infant Fundus Image. IEEE/ACM Trans Comput Biol Bioinform 2024; 21:1031-1041. PMID: 37018340; DOI: 10.1109/tcbb.2023.3234100.
Abstract
The detection of optic disc and macula is an essential step for ROP (Retinopathy of prematurity) zone segmentation and disease diagnosis. This paper aims to enhance deep learning-based object detection with domain-specific morphological rules. Based on the fundus morphology, we define five morphological rules, i.e., number restriction (maximum number of optic disc and macula is one), size restriction (e.g., optic disc width: 1.05 ± 0.13 mm), distance restriction (distance between the optic disc and macula/fovea: 4.4 ± 0.4 mm), angle/slope restriction (optic disc and macula should roughly be positioned in the same horizontal line), position restriction (in OD, the macula is on the left side of the optic disc; vice versa for OS). A case study on 2953 infant fundus images (with 2935 optic disc instances and 2892 macula instances) proves the effectiveness of the proposed method. Without the morphological rules, naïve object detection accuracies of optic disc and macula are 0.955 and 0.719, respectively. With the proposed method, false-positive ROIs (region of interest) are further ruled out, and the accuracy of the macula is raised to 0.811. The IoU (intersection over union) and RCE (relative center error) metrics are also improved.
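As a sketch of how such morphological rules can prune detector outputs, the following illustrative function keeps at most one optic-disc/macula pair that satisfies the distance and alignment constraints (the thresholds below approximate the 4.4 ± 0.4 mm rule; the pixel spacing value is a made-up example):

```python
import numpy as np

def apply_morphological_rules(disc_rois, macula_rois, mm_per_px,
                              dist_mm=(4.0, 4.8), max_slope_deg=25.0):
    """Keep at most one optic-disc and one macula ROI that jointly satisfy
    distance and alignment constraints. ROIs are (score, (cx, cy)) tuples."""
    best = None
    for ds, dc in disc_rois:
        for ms, mc in macula_rois:
            d_mm = np.hypot(mc[0] - dc[0], mc[1] - dc[1]) * mm_per_px
            slope = abs(np.degrees(np.arctan2(mc[1] - dc[1], mc[0] - dc[0] + 1e-9)))
            slope = min(slope, 180 - slope)           # ignore left/right eye direction
            if dist_mm[0] <= d_mm <= dist_mm[1] and slope <= max_slope_deg:
                if best is None or ds + ms > best[0]:
                    best = (ds + ms, dc, mc)
    return best   # None if no pair satisfies the rules

discs = [(0.98, (500, 300)), (0.40, (120, 900))]      # candidate optic-disc ROIs
maculas = [(0.85, (340, 310))]                        # candidate macula ROIs
print(apply_morphological_rules(discs, maculas, mm_per_px=0.027))  # best-scoring valid pair
```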
8
Benavides D, Cisnal A, Fontúrbel C, de la Fuente E, Fraile JC. Real-Time Tool Localization for Laparoscopic Surgery Using Convolutional Neural Network. Sensors (Basel) 2024; 24:4191. PMID: 39000974; PMCID: PMC11243864; DOI: 10.3390/s24134191.
Abstract
Partially automated robotic systems, such as camera holders, represent a pivotal step towards enhancing efficiency and precision in surgical procedures. Therefore, this paper introduces an approach for real-time tool localization in laparoscopic surgery using convolutional neural networks. The proposed model, based on two Hourglass modules in series, can localize up to two surgical tools simultaneously. This study utilized three datasets: the ITAP dataset, alongside two publicly available datasets, namely Atlas Dione and EndoVis Challenge. Three variations of the Hourglass-based model were proposed, with the best model achieving high accuracy (92.86%) and frame rates (27.64 FPS), suitable for integration into robotic systems. An evaluation on an independent test set yielded slightly lower accuracy, indicating limited generalizability. The model was further analyzed using the Grad-CAM technique to gain insights into its functionality. Overall, this work presents a promising solution for automating aspects of laparoscopic surgery, potentially enhancing surgical efficiency by reducing the need for manual endoscope manipulation.
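Hourglass-style localizers typically output one heatmap per tool; a keypoint is then read off as the heatmap peak. A minimal sketch of this decoding step follows (illustrative only, not the authors' code):

```python
import numpy as np

def heatmap_to_keypoints(heatmaps, conf_thresh=0.5):
    """Extract one (x, y, score) per heatmap channel by taking its peak.
    heatmaps: array of shape (num_tools, H, W) with values in [0, 1]."""
    keypoints = []
    for hm in heatmaps:
        iy, ix = np.unravel_index(np.argmax(hm), hm.shape)
        score = float(hm[iy, ix])
        keypoints.append((int(ix), int(iy), score) if score >= conf_thresh else None)
    return keypoints

hm = np.zeros((2, 64, 64))
hm[0, 20, 30] = 0.9          # tool 1 clearly visible
hm[1, 5, 5] = 0.2            # tool 2 below threshold (e.g. out of view)
print(heatmap_to_keypoints(hm))   # [(30, 20, 0.9), None]
```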
Affiliation(s)
- Ana Cisnal
- Instituto de las Tecnologías Avanzadas de la Producción (ITAP), Escuela de Ingenierías Industriales, Universidad de Valladolid, Paseo Prado de la Magdalena 3-5, 47011 Valladolid, Spain; (D.B.); (C.F.); (E.d.l.F.); (J.C.F.)
9
Loza G, Valdastri P, Ali S. Real-time surgical tool detection with multi-scale positional encoding and contrastive learning. Healthc Technol Lett 2024; 11:48-58. PMID: 38638504; PMCID: PMC11022231; DOI: 10.1049/htl2.12060.
Abstract
Real-time detection of surgical tools in laparoscopic data plays a vital role in understanding surgical procedures, evaluating the performance of trainees, facilitating learning, and ultimately supporting the autonomy of robotic systems. Existing detection methods for surgical data still need to improve both processing speed and prediction accuracy. Most methods rely on anchors or region proposals, limiting their adaptability to variations in tool appearance and leading to sub-optimal detection results. Moreover, using non-anchor-based detectors to alleviate this problem has been partially explored without remarkable results. An anchor-free architecture based on a transformer that allows real-time tool detection is introduced. The proposal is to utilize multi-scale features within the feature extraction layer and at the transformer-based detection architecture through positional encoding that can refine and capture context-aware and structural information of different-sized tools. Furthermore, a supervised contrastive loss is introduced to optimize representations of object embeddings, resulting in improved feed-forward network performance for classifying localized bounding boxes. The strategy demonstrates superiority to state-of-the-art (SOTA) methods. Compared to the most accurate existing SOTA (DSSS) method, the approach has an improvement of nearly 4% on mAP and a reduction in the inference time by 113%. It also showed a 7% higher mAP than the baseline model.
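The paper builds on positional encoding at multiple scales; as a hedged illustration of the basic building block, the standard sinusoidal positional encoding can be generated as follows (the multi-scale variant described in the abstract is not reproduced here):

```python
import numpy as np

def sinusoidal_positional_encoding(length, dim):
    """Standard sinusoidal positional encoding of shape (length, dim); dim assumed even."""
    pos = np.arange(length)[:, None]                  # (length, 1)
    i = np.arange(dim // 2)[None, :]                  # (1, dim/2)
    angles = pos / np.power(10000.0, 2 * i / dim)
    pe = np.zeros((length, dim))
    pe[:, 0::2] = np.sin(angles)                      # even dimensions
    pe[:, 1::2] = np.cos(angles)                      # odd dimensions
    return pe

print(sinusoidal_positional_encoding(8, 16).shape)    # (8, 16)
```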
Affiliation(s)
- Gerardo Loza
- School of Computing, Faculty of Engineering and Physical Sciences, University of Leeds, West Yorkshire, UK
- Pietro Valdastri
- School of Electronic and Electrical Engineering, Faculty of Engineering and Physical Sciences, University of Leeds, West Yorkshire, UK
- Sharib Ali
- School of Computing, Faculty of Engineering and Physical Sciences, University of Leeds, West Yorkshire, UK
10
Liu Y, Hayashi Y, Oda M, Kitasaka T, Mori K. YOLOv7-RepFPN: Improving real-time performance of laparoscopic tool detection on embedded systems. Healthc Technol Lett 2024; 11:157-166. PMID: 38638498; PMCID: PMC11022232; DOI: 10.1049/htl2.12072.
Abstract
This study focuses on enhancing the inference speed of laparoscopic tool detection on embedded devices. Laparoscopy, a minimally invasive surgery technique, markedly reduces patient recovery times and postoperative complications. Real-time laparoscopic tool detection helps assist laparoscopy by providing information for surgical navigation, and its implementation on embedded devices is gaining interest due to the portability, network independence and scalability of the devices. However, embedded devices often face computation resource limitations, potentially hindering inference speed. To mitigate this concern, this work introduces a two-fold modification to the YOLOv7 model: the feature channels are halved and RepBlock is integrated, yielding the YOLOv7-RepFPN model. This configuration leads to a significant reduction in computational complexity. Additionally, the focal EIoU (efficient intersection over union) loss function is employed for bounding box regression. Experimental results on an embedded device demonstrate that for frame-by-frame laparoscopic tool detection, the proposed YOLOv7-RepFPN achieved an mAP of 88.2% (with IoU set to 0.5) on a custom dataset based on EndoVis17, and an inference speed of 62.9 FPS. Contrasting with the original YOLOv7, which garnered an 89.3% mAP and 41.8 FPS under identical conditions, the methodology enhances the speed by 21.1 FPS while maintaining detection accuracy. This emphasizes the effectiveness of the work.
Affiliation(s)
- Yuzhang Liu
- Graduate School of Informatics, Nagoya University, Aichi, Nagoya, Japan
- Yuichiro Hayashi
- Graduate School of Informatics, Nagoya University, Aichi, Nagoya, Japan
- Masahiro Oda
- Graduate School of Informatics, Nagoya University, Aichi, Nagoya, Japan
- Information and Communications, Nagoya University, Aichi, Nagoya, Japan
- Takayuki Kitasaka
- Department of Information Science, Aichi Institute of Technology, Aichi, Nagoya, Japan
- Kensaku Mori
- Graduate School of Informatics, Nagoya University, Aichi, Nagoya, Japan
- Information and Communications, Nagoya University, Aichi, Nagoya, Japan
- Research Center of Medical Bigdata, National Institute of Informatics, Tokyo, Japan
11
Feng X, Zhang X, Shi X, Li L, Wang S. ST-ITEF: Spatio-Temporal Intraoperative Task Estimating Framework to recognize surgical phase and predict instrument path based on multi-object tracking in keratoplasty. Med Image Anal 2024; 91:103026. PMID: 37976868; DOI: 10.1016/j.media.2023.103026.
Abstract
Computer-assisted cognition guidance for surgical robotics by computer vision is a potential future outcome that could facilitate surgery in terms of both operation accuracy and autonomy level. In this paper, multiple-object segmentation and feature extraction from this segmentation are combined to determine and predict surgical manipulation. A novel three-stage Spatio-Temporal Intraoperative Task Estimating Framework is proposed, with a quantitative expression derived from ophthalmologists' visual information process and with multi-object tracking of the surgical instruments and human corneas involved in keratoplasty. In the estimation of intraoperative workflow, quantifying the operation parameters is still an open challenge. This problem is tackled by extracting key geometric properties from multi-object segmentation and calculating the relative position among instruments and corneas. A decision framework is further proposed, based on prior geometric properties, to recognize the current surgical phase and predict the instrument path for each phase. Our framework is tested and evaluated on real human keratoplasty videos. The optimized DeepLabV3 with image filtration achieved competitive class IoU in the segmentation task, and the mean phase Jaccard reached 55.58% for phase recognition. Both the qualitative and quantitative results indicate that our framework can achieve accurate segmentation and surgical phase recognition under complex disturbance. The Intraoperative Task Estimating Framework thus has high potential to guide surgical robots in clinical practice.
Affiliation(s)
- Xiaojing Feng
- School of Mechanical Engineering at Xi'an Jiaotong University, 28 Xianning West Road, Xi'an 710049, China.
- Xiaodong Zhang
- School of Mechanical Engineering at Xi'an Jiaotong University, 28 Xianning West Road, Xi'an 710049, China.
- Xiaojun Shi
- School of Mechanical Engineering at Xi'an Jiaotong University, 28 Xianning West Road, Xi'an 710049, China
- Li Li
- Department of Ophthalmology at the First Affiliated Hospital of Xi'an Jiaotong University, 277 Yanta West Road, Xi'an 710061, China
- Shaopeng Wang
- School of Mechanical Engineering at Xi'an Jiaotong University, 28 Xianning West Road, Xi'an 710049, China
12
Zhao X, Guo J, He Z, Jiang X, Lou H, Li D. CLAD-Net: cross-layer aggregation attention network for real-time endoscopic instrument detection. Health Inf Sci Syst 2023; 11:58. PMID: 38028959; PMCID: PMC10678866; DOI: 10.1007/s13755-023-00260-9.
Abstract
As medical treatments continue to advance rapidly, minimally invasive surgery (MIS) has found extensive applications across various clinical procedures. Accurate identification of medical instruments plays a vital role in comprehending surgical situations and facilitating endoscopic image-guided surgical procedures. However, endoscopic instrument detection poses a great challenge owing to the narrow operating space, with various interfering factors (e.g. smoke, blood, body fluids) and inevitable issues (e.g. mirror reflection, visual obstruction, illumination variation) in the surgery. To promote surgical efficiency and safety in MIS, this paper proposes a cross-layer aggregated attention detection network (CLAD-Net) for accurate and real-time detection of endoscopic instruments in complex surgical scenarios. We propose a cross-layer aggregation attention module to enhance the fusion of features and raise the effectiveness of lateral propagation of feature information. We propose a composite attention mechanism (CAM) to extract contextual information at different scales and model the importance of each channel in the feature map, mitigate the information loss due to feature fusion, and effectively solve the problem of inconsistent target size and low contrast in complex contexts. Moreover, the proposed feature refinement module (RM) enhances the network's ability to extract target edge and detail information by adaptively adjusting the feature weights to fuse different layers of features. The performance of CLAD-Net was evaluated using the public laparoscopic dataset Cholec80 and a neuroendoscopic dataset from Sun Yat-sen University Cancer Center. On the two datasets, CLAD-Net achieves AP0.5 of 98.9% and 98.6%, respectively, which is better than advanced detection networks. A video of the real-time detection is presented at the following link: https://github.com/A0268/video-demo.
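The composite attention mechanism (CAM) is specific to CLAD-Net; as an illustration of the basic channel-weighting idea it builds on, a squeeze-and-excitation-style channel attention block can be sketched as:

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style channel weighting for a (B, C, H, W) feature map."""
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x):
        w = self.fc(x.mean(dim=(2, 3)))           # global average pool -> (B, C)
        return x * w[:, :, None, None]            # reweight each channel

feat = torch.randn(1, 64, 32, 32)
print(ChannelAttention(64)(feat).shape)           # torch.Size([1, 64, 32, 32])
```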
Affiliation(s)
- Xiushun Zhao
- School of Automation, Guangdong University of Technology, Guangzhou, 510006 China
- Jing Guo
- School of Automation, Guangdong University of Technology, Guangzhou, 510006 China
- Zhaoshui He
- School of Automation, Guangdong University of Technology, Guangzhou, 510006 China
- Xiaobing Jiang
- Department of Neurosurgery, Sun Yat-Sen University Cancer Center, Guangzhou, 510006 China
- Haifang Lou
- Department of Gastroenterology, The First Affiliated Hospital of Zhejiang Chinese Medical University (Zhejiang Provincial Hospital of Chinese Medicine), Hangzhou, 310006 China
- Depei Li
- Department of Neurosurgery, Sun Yat-Sen University Cancer Center, Guangzhou, 510006 China
13
Shen W, Wang Y, Liu M, Wang J, Ding R, Zhang Z, Meijering E. Branch Aggregation Attention Network for Robotic Surgical Instrument Segmentation. IEEE Trans Med Imaging 2023; 42:3408-3419. PMID: 37342952; DOI: 10.1109/tmi.2023.3288127.
Abstract
Surgical instrument segmentation is of great significance to robot-assisted surgery, but the noise caused by reflection, water mist, and motion blur during the surgery as well as the different forms of surgical instruments would greatly increase the difficulty of precise segmentation. A novel method called Branch Aggregation Attention network (BAANet) is proposed to address these challenges, which adopts a lightweight encoder and two designed modules, named Branch Balance Aggregation module (BBA) and Block Attention Fusion module (BAF), for efficient feature localization and denoising. By introducing the unique BBA module, features from multiple branches are balanced and optimized through a combination of addition and multiplication to complement strengths and effectively suppress noise. Furthermore, to fully integrate the contextual information and capture the region of interest, the BAF module is proposed in the decoder, which receives adjacent feature maps from the BBA module and localizes the surgical instruments from both global and local perspectives by utilizing a dual branch attention mechanism. According to the experimental results, the proposed method has the advantage of being lightweight while outperforming the second-best method by 4.03%, 1.53%, and 1.34% in mIoU scores on three challenging surgical instrument datasets, respectively, compared to the existing state-of-the-art methods. Code is available at https://github.com/SWT-1014/BAANet.
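The mIoU score used for comparison above can be computed per class and averaged; a minimal sketch for integer-labelled masks follows (illustrative, not the authors' evaluation code):

```python
import numpy as np

def mean_iou(pred, target, num_classes, ignore_empty=True):
    """Mean IoU over classes for integer-labelled masks of the same shape."""
    ious = []
    for c in range(num_classes):
        p, t = pred == c, target == c
        union = np.logical_or(p, t).sum()
        if union == 0 and ignore_empty:
            continue                              # class absent in both masks
        ious.append(np.logical_and(p, t).sum() / (union + 1e-9))
    return float(np.mean(ious)) if ious else 0.0

pred = np.array([[0, 1], [1, 1]])
gt   = np.array([[0, 1], [0, 1]])
print(round(mean_iou(pred, gt, num_classes=2), 3))   # (0.5 + 0.667) / 2 = 0.583
```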
14
Rodriguez Peñaranda N, Eissa A, Ferretti S, Bianchi G, Di Bari S, Farinha R, Piazza P, Checcucci E, Belenchón IR, Veccia A, Gomez Rivas J, Taratkin M, Kowalewski KF, Rodler S, De Backer P, Cacciamani GE, De Groote R, Gallagher AG, Mottrie A, Micali S, Puliatti S. Artificial Intelligence in Surgical Training for Kidney Cancer: A Systematic Review of the Literature. Diagnostics (Basel) 2023; 13:3070. PMID: 37835812; PMCID: PMC10572445; DOI: 10.3390/diagnostics13193070.
Abstract
The prevalence of renal cell carcinoma (RCC) is increasing due to advanced imaging techniques. Surgical resection is the standard treatment, involving complex radical and partial nephrectomy procedures that demand extensive training and planning. Furthermore, artificial intelligence (AI) can potentially aid the training process in the field of kidney cancer. This review explores how artificial intelligence (AI) can create a framework for kidney cancer surgery to address training difficulties. Following PRISMA 2020 criteria, an exhaustive search of PubMed and SCOPUS databases was conducted without any filters or restrictions. Inclusion criteria encompassed original English articles focusing on AI's role in kidney cancer surgical training. On the other hand, all non-original articles and articles published in any language other than English were excluded. Two independent reviewers assessed the articles, with a third party settling any disagreement. Study specifics, AI tools, methodologies, endpoints, and outcomes were extracted by the same authors. The Oxford Center for Evidence-Based Medicine's evidence levels were employed to assess the studies. Out of 468 identified records, 14 eligible studies were selected. Potential AI applications in kidney cancer surgical training include analyzing surgical workflow, annotating instruments, identifying tissues, and 3D reconstruction. AI is capable of appraising surgical skills, including the identification of procedural steps and instrument tracking. While AI and augmented reality (AR) enhance training, challenges persist in real-time tracking and registration. The utilization of AI-driven 3D reconstruction proves beneficial for intraoperative guidance and preoperative preparation. Artificial intelligence (AI) shows potential for advancing surgical training by providing unbiased evaluations, personalized feedback, and enhanced learning processes. Yet challenges such as consistent metric measurement, ethical concerns, and data privacy must be addressed. The integration of AI into kidney cancer surgical training offers solutions to training difficulties and a boost to surgical education. However, to fully harness its potential, additional studies are imperative.
Affiliation(s)
- Natali Rodriguez Peñaranda
- Department of Urology, Azienda Ospedaliero-Universitaria di Modena, Via Pietro Giardini, 1355, 41126 Baggiovara, Italy; (N.R.P.); (A.E.); (S.F.); (G.B.); (S.D.B.); (S.M.)
- Ahmed Eissa
- Department of Urology, Azienda Ospedaliero-Universitaria di Modena, Via Pietro Giardini, 1355, 41126 Baggiovara, Italy; (N.R.P.); (A.E.); (S.F.); (G.B.); (S.D.B.); (S.M.)
- Department of Urology, Faculty of Medicine, Tanta University, Tanta 31527, Egypt
- Stefania Ferretti
- Department of Urology, Azienda Ospedaliero-Universitaria di Modena, Via Pietro Giardini, 1355, 41126 Baggiovara, Italy; (N.R.P.); (A.E.); (S.F.); (G.B.); (S.D.B.); (S.M.)
- Giampaolo Bianchi
- Department of Urology, Azienda Ospedaliero-Universitaria di Modena, Via Pietro Giardini, 1355, 41126 Baggiovara, Italy; (N.R.P.); (A.E.); (S.F.); (G.B.); (S.D.B.); (S.M.)
- Stefano Di Bari
- Department of Urology, Azienda Ospedaliero-Universitaria di Modena, Via Pietro Giardini, 1355, 41126 Baggiovara, Italy; (N.R.P.); (A.E.); (S.F.); (G.B.); (S.D.B.); (S.M.)
- Rui Farinha
- Orsi Academy, 9090 Melle, Belgium; (R.F.); (P.D.B.); (R.D.G.); (A.G.G.); (A.M.)
- Urology Department, Lusíadas Hospital, 1500-458 Lisbon, Portugal
- Pietro Piazza
- Division of Urology, IRCCS Azienda Ospedaliero-Universitaria di Bologna, 40138 Bologna, Italy
- Enrico Checcucci
- Department of Surgery, FPO-IRCCS Candiolo Cancer Institute, 10060 Turin, Italy
- Inés Rivero Belenchón
- Urology and Nephrology Department, Virgen del Rocío University Hospital, 41013 Seville, Spain
- Alessandro Veccia
- Department of Urology, University of Verona, Azienda Ospedaliera Universitaria Integrata, 37126 Verona, Italy
- Juan Gomez Rivas
- Department of Urology, Hospital Clinico San Carlos, 28040 Madrid, Spain
- Mark Taratkin
- Institute for Urology and Reproductive Health, Sechenov University, 119435 Moscow, Russia
- Karl-Friedrich Kowalewski
- Department of Urology and Urosurgery, University Medical Center Mannheim, Medical Faculty Mannheim, Heidelberg University, 68167 Mannheim, Germany
- Severin Rodler
- Department of Urology, University Hospital LMU Munich, 80336 Munich, Germany
- Pieter De Backer
- Orsi Academy, 9090 Melle, Belgium; (R.F.); (P.D.B.); (R.D.G.); (A.G.G.); (A.M.)
- Department of Human Structure and Repair, Faculty of Medicine and Health Sciences, Ghent University, 9000 Ghent, Belgium
- Giovanni Enrico Cacciamani
- USC Institute of Urology, Catherine and Joseph Aresty Department of Urology, Keck School of Medicine, University of Southern California, Los Angeles, CA 90089, USA
- AI Center at USC Urology, USC Institute of Urology, University of Southern California, Los Angeles, CA 90089, USA
- Ruben De Groote
- Orsi Academy, 9090 Melle, Belgium; (R.F.); (P.D.B.); (R.D.G.); (A.G.G.); (A.M.)
- Anthony G. Gallagher
- Orsi Academy, 9090 Melle, Belgium; (R.F.); (P.D.B.); (R.D.G.); (A.G.G.); (A.M.)
- Faculty of Life and Health Sciences, Ulster University, Derry BT48 7JL, UK
- Alexandre Mottrie
- Orsi Academy, 9090 Melle, Belgium; (R.F.); (P.D.B.); (R.D.G.); (A.G.G.); (A.M.)
- Salvatore Micali
- Department of Urology, Azienda Ospedaliero-Universitaria di Modena, Via Pietro Giardini, 1355, 41126 Baggiovara, Italy; (N.R.P.); (A.E.); (S.F.); (G.B.); (S.D.B.); (S.M.)
- Stefano Puliatti
- Department of Urology, Azienda Ospedaliero-Universitaria di Modena, Via Pietro Giardini, 1355, 41126 Baggiovara, Italy; (N.R.P.); (A.E.); (S.F.); (G.B.); (S.D.B.); (S.M.)
15
Ping L, Wang Z, Yao J, Gao J, Yang S, Li J, Shi J, Wu W, Hua S, Wang H. Application and evaluation of surgical tool and tool tip recognition based on Convolutional Neural Network in multiple endoscopic surgical scenarios. Surg Endosc 2023; 37:7376-7384. PMID: 37580576; DOI: 10.1007/s00464-023-10323-3.
Abstract
BACKGROUND In recent years, computer-assisted intervention and robot-assisted surgery have been receiving increasing attention, and there is a constant demand for real-time identification and tracking of surgical tools and tool tips. A series of studies focusing on surgical tool tracking and identification has been performed. However, the dataset sizes, sensitivity/precision, and response times of these studies were limited. In this work, we developed and utilized an automated method based on a Convolutional Neural Network (CNN) and the You Only Look Once (YOLO) v3 algorithm to locate and identify surgical tools and tool tips covering five different surgical scenarios. MATERIALS AND METHODS An object detection algorithm was applied to identify and locate the surgical tools and tool tips. DarkNet-19 was used as the backbone network, and YOLOv3 was modified and applied for the detection. We included a series of 181 endoscopy videos covering 5 different surgical scenarios: pancreatic surgery, thyroid surgery, colon surgery, gastric surgery, and external scenes. A total of 25,333 images containing 94,463 targets were collected. Training and test sets were divided in a proportion of 2.5:1. The data sets were openly stored at the Kaggle database. RESULTS Under an Intersection over Union threshold of 0.5, the overall sensitivity and precision rate of the model were 93.02% and 89.61% for tool recognition and 87.05% and 83.57% for tool tip recognition, respectively. The model demonstrated the highest tool and tool tip recognition sensitivity and precision rate under external scenes. Among the four different internal surgical scenes, the network had better performance in pancreatic and colon surgeries and poorer performance in gastric and thyroid surgeries. CONCLUSION We developed a surgical tool and tool tip recognition model based on CNN and YOLOv3. Validation of our model demonstrated satisfactory precision, accuracy, and robustness across different surgical scenes.
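The sensitivity and precision figures above are computed by matching detections to ground-truth boxes at an IoU threshold of 0.5; a minimal per-image sketch of this matching follows (illustrative only, not the authors' evaluation script):

```python
import numpy as np

def iou(a, b):
    """IoU of two boxes in (x1, y1, x2, y2) format."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    union = (a[2]-a[0])*(a[3]-a[1]) + (b[2]-b[0])*(b[3]-b[1]) - inter
    return inter / (union + 1e-9)

def precision_recall(detections, ground_truths, iou_thresh=0.5):
    """Per-image precision and recall (sensitivity): a detection is a true positive
    if it overlaps an unmatched ground-truth box with IoU >= iou_thresh."""
    matched, tp = set(), 0
    for det in detections:
        best = max(range(len(ground_truths)),
                   key=lambda g: iou(det, ground_truths[g]), default=None)
        if best is not None and best not in matched and iou(det, ground_truths[best]) >= iou_thresh:
            matched.add(best)
            tp += 1
    precision = tp / len(detections) if detections else 0.0
    recall = tp / len(ground_truths) if ground_truths else 0.0
    return precision, recall

dets = [(10, 10, 50, 50), (200, 200, 240, 240)]
gts  = [(12, 9, 52, 48)]
print(precision_recall(dets, gts))   # (0.5, 1.0)
```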
Affiliation(s)
- Lu Ping
- 8-Year MD Program, Chinese Academy of Medical Science and Peking Union Medical College, Beijing, China
- Department of General Surgery, State Key Laboratory of Complex Severe and Rare Diseases, Peking Union Medical College Hospital, Chinese Academy of Medical Science and Peking Union Medical College, Beijing, China
- Zhihong Wang
- 8-Year MD Program, Chinese Academy of Medical Science and Peking Union Medical College, Beijing, China
- Jingjing Yao
- Department of Nursing, Peking Union Medical College Hospital, Chinese Academy of Medical Science and Peking Union Medical College, Beijing, China
- Junyi Gao
- Department of General Surgery, State Key Laboratory of Complex Severe and Rare Diseases, Peking Union Medical College Hospital, Chinese Academy of Medical Science and Peking Union Medical College, Beijing, China
- Sen Yang
- Department of General Surgery, State Key Laboratory of Complex Severe and Rare Diseases, Peking Union Medical College Hospital, Chinese Academy of Medical Science and Peking Union Medical College, Beijing, China
- Jiayi Li
- 8-Year MD Program, Chinese Academy of Medical Science and Peking Union Medical College, Beijing, China
- Jile Shi
- 8-Year MD Program, Chinese Academy of Medical Science and Peking Union Medical College, Beijing, China
- Wenming Wu
- Department of General Surgery, State Key Laboratory of Complex Severe and Rare Diseases, Peking Union Medical College Hospital, Chinese Academy of Medical Science and Peking Union Medical College, Beijing, China
- Surong Hua
- Department of General Surgery, State Key Laboratory of Complex Severe and Rare Diseases, Peking Union Medical College Hospital, Chinese Academy of Medical Science and Peking Union Medical College, Beijing, China.
- Huizhen Wang
- Department of Nursing, Peking Union Medical College Hospital, Chinese Academy of Medical Science and Peking Union Medical College, Beijing, China.
16
Huang Y, Jiao J, Yu J, Zheng Y, Wang Y. Si-MSPDNet: A multiscale Siamese network with parallel partial decoders for the 3-D measurement of spines in 3D ultrasonic images. Comput Med Imaging Graph 2023; 108:102262. PMID: 37385048; DOI: 10.1016/j.compmedimag.2023.102262.
Abstract
Early screening and frequent monitoring effectively decrease the risk of severe scoliosis, but radiation exposure is a consequence of traditional radiograph examinations. Additionally, traditional X-ray images on the coronal or sagittal plane have difficulty providing three-dimensional (3-D) information on spinal deformities. The Scolioscan system provides an innovative 3-D spine imaging approach via ultrasonic scanning, and its feasibility has been demonstrated in numerous studies. In this paper, to further examine the potential of spinal ultrasonic data for describing 3-D spinal deformities, we propose a novel deep-learning tracker named Si-MSPDNet for extracting widely employed landmarks (spinous process (SP)) in ultrasonic images of spines and establish a 3-D spinal profile to measure 3-D spinal deformities. Si-MSPDNet has a Siamese architecture. First, we employ two efficient two-stage encoders to extract features from the uncropped ultrasonic image and the patch centered on the SP cut from the image. Then, a fusion block is designed to strengthen the communication between encoded features and further refine them from channel and spatial perspectives. The SP is a very small target in ultrasonic images, so its representation is weak in the highest-level feature maps. To overcome this, we ignore the highest-level feature maps and introduce parallel partial decoders to localize the SP. The correlation evaluation in the traditional Siamese network is also expanded to multiple scales to enhance cooperation. Furthermore, we propose a binary guided mask based on vertebral anatomical prior knowledge, which can further improve the performance of our tracker by highlighting the potential region with SP. The binary-guided mask is also utilized for fully automatic initialization in tracking. We collected spinal ultrasonic data and corresponding radiographs on the coronal and sagittal planes from 150 patients to evaluate the tracking precision of Si-MSPDNet and the performance of the generated 3-D spinal profile. Experimental results revealed that our tracker achieved a tracking success rate of 100% and a mean IoU of 0.882, outperforming some commonly used tracking and real-time detection models. Furthermore, a high correlation existed on both the coronal and sagittal planes between our projected spinal curve and that extracted from the spinal annotation in X-ray images. The correlation between the tracking results of the SP and their ground truths on other projected planes was also satisfactory. More importantly, the difference in mean curvatures was slight on all projected planes between tracking results and ground truths. Thus, this study effectively demonstrates the promising potential of our 3-D spinal profile extraction method for the 3-D measurement of spinal deformities using 3-D ultrasound data.
Affiliation(s)
- Yi Huang
- Biomedical Engineering Center, Fudan University, Shanghai 200433, China
- Jing Jiao
- Biomedical Engineering Center, Fudan University, Shanghai 200433, China
- Jinhua Yu
- Biomedical Engineering Center, Fudan University, Shanghai 200433, China; Key Laboratory of Medical Imaging Computing and Computer Assisted Intervention of Shanghai, Fudan University, 200433, China
- Yongping Zheng
- Department of Biomedical Engineering, The Hong Kong Polytechnic University, Hong Kong Special Administrative Region of China; Research Institute for Smart Ageing, The Hong Kong Polytechnic University, Hong Kong Special Administrative Region of China.
- Yuanyuan Wang
- Biomedical Engineering Center, Fudan University, Shanghai 200433, China; Key Laboratory of Medical Imaging Computing and Computer Assisted Intervention of Shanghai, Fudan University, 200433, China.
17
Ray A, Habibagahi I, Babakhani A. A Fully Wireless and Batteryless Localization System With 50 Micrometre Motion Detection Capability and Adaptive Transmitter Power Control for Point-of-Care Biomedical Applications. IEEE Trans Biomed Circuits Syst 2023; 17:674-687. PMID: 37363841; DOI: 10.1109/tbcas.2023.3289149.
Abstract
Localization has varied applications in biomedicine, such as wireless capsule endoscopy (WCE), detection of cancerous tissue, drug delivery, robotic surgeries, and brain mapping. Currently, most localization systems are battery-powered and suffer from issues regarding battery leakage and limited battery life, resulting in potential health hazards and inconveniences when using them for continuous health monitoring applications. This article proposes an entirely wireless and battery-less 2D localization system consisting of an integrated circuit (IC) that is wirelessly powered at a distance of 4 cm by a 40.68 MHz radio frequency (RF) power of only 2 W. The proposed localization system wirelessly transmits a locked sub-harmonic 13.56 MHz signal generated from the wirelessly received 40.68 MHz RF power signal, eliminating the need for a power-hungry oscillator. Additionally, the system, having a measurement latency of 11.3 ms, has also been verified to sense motion as small as 50 μm as well as measure the rate of motion up to 10 beats per minute, therefore extending its application to the detection of physiological motions such as diaphragm motion during breathing. The localizer has a small form factor of 17 mm × 12 mm × 0.2 mm and consumes an average power of 6 μW. Ex vivo measurements using the localizer inside the porcine intestine demonstrate a localization accuracy of less than 5 mm.
18
Du W, Yi G, Omisore OM, Duan W, Akinyemi TO, Chen X, Wang L, Lee BG, Liu J. Guidewire Endpoint Detection Based on Pixel Adjacent Relation in Robot-assisted Cardiovascular Interventions. Annu Int Conf IEEE Eng Med Biol Soc 2023; 2023:1-5. PMID: 38082615; DOI: 10.1109/embc40787.2023.10340841.
Abstract
Visualization of endovascular tools like the guidewire and catheter is essential for procedural success of endovascular interventions. This requires tracking the tool pixels and motion during catheterization; however, detecting the endpoints of the endovascular tools is challenging due to their small size, thin appearance, and flexibility. As this still limits the performance of existing methods used for endovascular tool segmentation, predicting correct object locations could provide a way forward. In this paper, we propose a neighborhood-based method for detecting guidewire endpoints in X-ray angiograms. It consists of pixel-level segmentation and a post-segmentation step that is based on adjacency relationships of pixels in a given neighborhood; the latter includes skeletonization to predict the endpoint pixels of the guidewire. The method is evaluated on a proprietary guidewire dataset obtained during an in-vivo study in six rabbits, and it shows high segmentation performance, characterized by a precision of 87.87% and a recall of 90.53%, and low detection error, with a mean pixel error of 2.26±0.14 pixels. We compared our method with four state-of-the-art detection methods and found it to exhibit the best detection performance. This neighborhood-based detection method can be generalized to other surgical tool detection and related computer vision tasks. Clinical Relevance: The proposed method can provide better tool tracking and visualization during robot-assisted intravascular interventional surgery.
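After skeletonization, guidewire endpoints can be identified as skeleton pixels with exactly one foreground neighbour; a minimal sketch of this neighbourhood test follows (illustrative only, not the authors' implementation):

```python
import numpy as np

def skeleton_endpoints(skel):
    """Endpoints of a binary skeleton: foreground pixels with exactly one
    foreground neighbour in their 8-neighbourhood."""
    skel = skel.astype(np.uint8)
    padded = np.pad(skel, 1)
    endpoints = []
    for y, x in zip(*np.nonzero(skel)):
        neigh = padded[y:y + 3, x:x + 3].sum() - skel[y, x]   # 8-neighbour count
        if neigh == 1:
            endpoints.append((int(x), int(y)))
    return endpoints

# A short diagonal "guidewire" skeleton: its two ends should be detected.
skel = np.zeros((6, 6), dtype=np.uint8)
for i in range(5):
    skel[i, i] = 1
print(skeleton_endpoints(skel))   # [(0, 0), (4, 4)]
```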
19
Eckhoff JA, Ban Y, Rosman G, Müller DT, Hashimoto DA, Witkowski E, Babic B, Rus D, Bruns C, Fuchs HF, Meireles O. TEsoNet: knowledge transfer in surgical phase recognition from laparoscopic sleeve gastrectomy to the laparoscopic part of Ivor-Lewis esophagectomy. Surg Endosc 2023; 37:4040-4053. PMID: 36932188; PMCID: PMC10156818; DOI: 10.1007/s00464-023-09971-2.
Abstract
BACKGROUND Surgical phase recognition using computer vision presents an essential requirement for artificial intelligence-assisted analysis of surgical workflow. Its performance is heavily dependent on large amounts of annotated video data, which remain a limited resource, especially concerning highly specialized procedures. Knowledge transfer from common to more complex procedures can promote data efficiency. Phase recognition models trained on large, readily available datasets may be extrapolated and transferred to smaller datasets of different procedures to improve generalizability. The conditions under which transfer learning is appropriate and feasible remain to be established. METHODS We defined ten operative phases for the laparoscopic part of Ivor-Lewis Esophagectomy through expert consensus. A dataset of 40 videos was annotated accordingly. The knowledge transfer capability of an established model architecture for phase recognition (CNN + LSTM) was adapted to generate a "Transferal Esophagectomy Network" (TEsoNet) for co-training and transfer learning from laparoscopic Sleeve Gastrectomy to the laparoscopic part of Ivor-Lewis Esophagectomy, exploring different training set compositions and training weights. RESULTS The explored model architecture is capable of accurate phase detection in complex procedures, such as Esophagectomy, even with low quantities of training data. Knowledge transfer between two upper gastrointestinal procedures is feasible and achieves reasonable accuracy with respect to operative phases with high procedural overlap. CONCLUSION Robust phase recognition models can achieve reasonable yet phase-specific accuracy through transfer learning and co-training between two related procedures, even when exposed to small amounts of training data of the target procedure. Further exploration is required to determine appropriate data amounts, key characteristics of the training procedure and temporal annotation methods required for successful transferal phase recognition. Transfer learning across different procedures addressing small datasets may increase data efficiency. Finally, to enable the surgical application of AI for intraoperative risk mitigation, coverage of rare, specialized procedures needs to be explored.
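As a hedged illustration of the transfer-learning setup described above (not TEsoNet itself), the following sketch defines a generic CNN + LSTM phase recognizer, freezes the visual backbone trained on the source procedure, and retrains the head for the ten target phases; all layer sizes are made-up examples:

```python
import torch
import torch.nn as nn

class PhaseRecognizer(nn.Module):
    """Illustrative CNN + LSTM phase recognizer: per-frame CNN features are
    fed to an LSTM that outputs one phase label per frame."""
    def __init__(self, num_phases: int, feat_dim: int = 64, hidden: int = 128):
        super().__init__()
        self.cnn = nn.Sequential(                     # stand-in frame encoder
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, feat_dim),
        )
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_phases)

    def forward(self, frames):                        # frames: (B, T, 3, H, W)
        b, t = frames.shape[:2]
        feats = self.cnn(frames.flatten(0, 1)).view(b, t, -1)
        out, _ = self.lstm(feats)
        return self.head(out)                         # (B, T, num_phases)

# Transfer setup: reuse a model trained on the source procedure (e.g. 7 phases),
# freeze the visual backbone, and retrain the temporal model/head for 10 target phases.
model = PhaseRecognizer(num_phases=7)
for p in model.cnn.parameters():
    p.requires_grad = False
model.head = nn.Linear(128, 10)
optimizer = torch.optim.Adam(filter(lambda p: p.requires_grad, model.parameters()), lr=1e-4)
print(model(torch.randn(1, 8, 3, 64, 64)).shape)      # torch.Size([1, 8, 10])
```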
Affiliation(s)
- J A Eckhoff
- Surgical Artificial Intelligence and Innovation Laboratory, Department of Surgery, Massachusetts General Hospital, 15 Parkman Street, WAC339, Boston, MA, 02114, USA.
- Department of General, Visceral, Tumor and Transplant Surgery, University Hospital Cologne, Kerpenerstrasse 62, 50937, Cologne, Germany.
- Y Ban
- Surgical Artificial Intelligence and Innovation Laboratory, Department of Surgery, Massachusetts General Hospital, 15 Parkman Street, WAC339, Boston, MA, 02114, USA
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, 32 Vassar St, Cambridge, MA, 02139, USA
- G Rosman
- Surgical Artificial Intelligence and Innovation Laboratory, Department of Surgery, Massachusetts General Hospital, 15 Parkman Street, WAC339, Boston, MA, 02114, USA
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, 32 Vassar St, Cambridge, MA, 02139, USA
- D T Müller
- Department of General, Visceral, Tumor and Transplant Surgery, University Hospital Cologne, Kerpenerstrasse 62, 50937, Cologne, Germany
- D A Hashimoto
- Department of Surgery, University Hospitals Cleveland Medical Center, Cleveland, OH, 44106, USA
- Department of Surgery, Case Western Reserve School of Medicine, Cleveland, OH, 44106, USA
- E Witkowski
- Surgical Artificial Intelligence and Innovation Laboratory, Department of Surgery, Massachusetts General Hospital, 15 Parkman Street, WAC339, Boston, MA, 02114, USA
- B Babic
- Department of General, Visceral, Tumor and Transplant Surgery, University Hospital Cologne, Kerpenerstrasse 62, 50937, Cologne, Germany
- D Rus
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, 32 Vassar St, Cambridge, MA, 02139, USA
- C Bruns
- Department of General, Visceral, Tumor and Transplant Surgery, University Hospital Cologne, Kerpenerstrasse 62, 50937, Cologne, Germany
- H F Fuchs
- Department of General, Visceral, Tumor and Transplant Surgery, University Hospital Cologne, Kerpenerstrasse 62, 50937, Cologne, Germany
- O Meireles
- Surgical Artificial Intelligence and Innovation Laboratory, Department of Surgery, Massachusetts General Hospital, 15 Parkman Street, WAC339, Boston, MA, 02114, USA
| |
20
Bykanov A, Danilov G, Kostumov V, Pilipenko O, Nutfullin B, Rastvorova O, Pitskhelauri D. Artificial Intelligence Technologies in the Microsurgical Operating Room (Review). Sovrem Tekhnologii Med 2023; 15:86-94. [PMID: 37389018 PMCID: PMC10306972 DOI: 10.17691/stm2023.15.2.08] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/21/2023] [Indexed: 07/01/2023] Open
Abstract
Surgery performed by a novice neurosurgeon under the constant supervision of a senior surgeon who has the experience of thousands of operations, can handle any intraoperative complication and predict it in advance, and never gets tired is currently an elusive dream, but it could become a reality with the development of artificial intelligence methods. This paper presents a review of the literature on the use of artificial intelligence technologies in the microsurgical operating room. The search for sources was carried out in the PubMed text database of medical and biological publications. The key words used were "surgical procedures", "dexterity", "microsurgery" AND "artificial intelligence" OR "machine learning" OR "neural networks". Articles in English and Russian were considered, with no limitation on publication date. The main directions of research on the use of artificial intelligence technologies in the microsurgical operating room are highlighted. Although machine learning has been increasingly introduced into the medical field in recent years, only a small number of studies related to this problem have been published, and their results have not yet proved to be of practical use. However, the social significance of this direction is an important argument for its development.
Affiliation(s)
- A.E. Bykanov
- Neurosurgeon, 7 Department of Neurosurgery, Researcher; National Medical Research Center for Neurosurgery named after Academician N.N. Burdenko, Ministry of Healthcare of the Russian Federation, 16, 4 Tverskaya-Yamskaya St., Moscow, 125047, Russia
| | - G.V. Danilov
- Academic Secretary; National Medical Research Center for Neurosurgery named after Academician N.N. Burdenko, Ministry of Healthcare of the Russian Federation, 16, 4 Tverskaya-Yamskaya St., Moscow, 125047, Russia
| | - V.V. Kostumov
- PhD Student, Programmer, the CMC Faculty; Lomonosov Moscow State University, 1 Leninskiye Gory, Moscow, 119991, Russia
| | - O.G. Pilipenko
- PhD Student, Programmer, the CMC Faculty; Lomonosov Moscow State University, 1 Leninskiye Gory, Moscow, 119991, Russia
| | - B.M. Nutfullin
- PhD Student, Programmer, the CMC Faculty; Lomonosov Moscow State University, 1 Leninskiye Gory, Moscow, 119991, Russia
| | - O.A. Rastvorova
- Resident, 7 Department of Neurosurgery; National Medical Research Center for Neurosurgery named after Academician N.N. Burdenko, Ministry of Healthcare of the Russian Federation, 16, 4 Tverskaya-Yamskaya St., Moscow, 125047, Russia
| | - D.I. Pitskhelauri
- Professor, Head of the 7 Department of Neurosurgery; National Medical Research Center for Neurosurgery named after Academician N.N. Burdenko, Ministry of Healthcare of the Russian Federation, 16, 4 Tverskaya-Yamskaya St., Moscow, 125047, Russia
| |
21
Qin M, Brawer J, Scassellati B. Robot tool use: A survey. Front Robot AI 2023; 9:1009488. [PMID: 36726401 PMCID: PMC9885045 DOI: 10.3389/frobt.2022.1009488] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/05/2022] [Accepted: 12/28/2022] [Indexed: 01/18/2023] Open
Abstract
Using human tools can significantly benefit robots in many application domains. Such an ability would allow robots to solve problems that they would be unable to solve without tools. However, robot tool use is a challenging task. Tool use was initially considered to be the ability that distinguishes human beings from other animals. We identify three skills required for robot tool use: perception, manipulation, and high-level cognition skills. While both general manipulation tasks and tool use tasks require the same level of perception accuracy, there are unique manipulation and cognition challenges in robot tool use. In this survey, we first define robot tool use. The definition highlights the skills required for robot tool use, and these skills coincide with an affordance model which defines a three-way relation between actions, objects, and effects. We also compile a taxonomy of robot tool use with insights from the animal tool use literature. Our definition and taxonomy lay a theoretical foundation for future robot tool use studies and also serve as practical guidelines for robot tool use applications. We first categorize tool use based on the context of the task: the contexts are highly similar for the same task (e.g., cutting) in non-causal tool use, while the contexts for causal tool use are diverse. We further categorize causal tool use, based on the task complexity suggested in animal tool use studies, into single-manipulation tool use and multiple-manipulation tool use. Single-manipulation tool use is sub-categorized based on tool features and prior experiences of tool use; these sub-types may be considered building blocks of causal tool use. Multiple-manipulation tool use combines these building blocks in different ways, and the different combinations define its sub-categories. Moreover, we identify the different skills required for each sub-type in the taxonomy. We then review previous studies on robot tool use based on the taxonomy and describe how the relations are learned in these studies. We conclude with a discussion of the current applications of robot tool use and open questions to be addressed in future robot tool use research.
22
Martin T, El Hage G, Shedid D, Bojanowski MW. Using artificial intelligence to quantify dynamic retraction of brain tissue and the manipulation of instruments in neurosurgery. Int J Comput Assist Radiol Surg 2023:10.1007/s11548-022-02824-8. [PMID: 36598652 DOI: 10.1007/s11548-022-02824-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/14/2022] [Accepted: 12/20/2022] [Indexed: 01/05/2023]
Abstract
PURPOSE There is no objective way to measure the amount of manipulation and retraction of neural tissue by the surgeon. Our goal is to develop metrics quantifying dynamic retraction and manipulation by instruments during neurosurgery. METHODS We trained a convolutional neural network (CNN) to analyze microscopic footage of neurosurgical procedures and thereby generate metrics evaluating the surgeon's dynamic retraction of brain tissue and, using an object tracking process, evaluate the surgeon's manipulation of the instruments themselves. U-Net image segmentation is used to output bounding polygons around the cerebral parenchyma of interest, as well as the vascular structures and cranial nerves. A channel and spatial reliability tracker framework is used in conjunction with our CNN to track the desired surgical instruments. RESULTS Our network achieved a state-of-the-art intersection over union ([Formula: see text]) for biological tissue segmentation. Multivariate statistical analysis was used to evaluate dynamic retraction, tissue handling, and instrument manipulation. CONCLUSION Our model enables the evaluation of dynamic retraction of soft tissue and manipulation of instruments during a surgical procedure, while accounting for movement of the operative microscope. This model can potentially provide the surgeon with objective feedback about the movement of instruments and its effect on brain tissue.
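The channel and spatial reliability tracking step mentioned above has an off-the-shelf counterpart in OpenCV. The snippet below is a minimal sketch, not the authors' pipeline: it follows an instrument from a manually supplied initial bounding box and accumulates a simple path-length statistic of the kind that instrument-manipulation metrics can be built on. The video filename and initial box are hypothetical placeholders, and opencv-contrib-python is assumed to be installed.

```python
# CSRT-based instrument tracking sketch (illustrative only).
import cv2

cap = cv2.VideoCapture("microscope_footage.mp4")   # hypothetical file
ok, frame = cap.read()
assert ok, "could not read the first frame"

tracker = cv2.TrackerCSRT_create()                 # channel and spatial reliability tracker
tracker.init(frame, (100, 120, 80, 60))            # x, y, w, h around the instrument

trajectory = []
while True:
    ok, frame = cap.read()
    if not ok:
        break
    found, (x, y, w, h) = tracker.update(frame)
    if found:
        trajectory.append((x + w / 2, y + h / 2))  # instrument centre per frame

# Path length of the tracked centre as a crude manipulation metric
path_len = sum(abs(complex(*a) - complex(*b)) for a, b in zip(trajectory, trajectory[1:]))
print(f"{len(trajectory)} tracked frames, path length {path_len:.1f} px")
```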
Affiliation(s)
- Tristan Martin
- Department of Surgery, Division of Neurosurgery, University of Montreal, Montreal, QC, Canada
| | - Gilles El Hage
- Department of Surgery, Division of Neurosurgery, University of Montreal, Montreal, QC, Canada
| | - Daniel Shedid
- Department of Surgery, Division of Neurosurgery, University of Montreal, Montreal, QC, Canada
| | - Michel W Bojanowski
- Department of Surgery, Division of Neurosurgery, University of Montreal, Montreal, QC, Canada.
| |
23
Park M, Oh S, Jeong T, Yu S. Multi-Stage Temporal Convolutional Network with Moment Loss and Positional Encoding for Surgical Phase Recognition. Diagnostics (Basel) 2022; 13:107. [PMID: 36611399 PMCID: PMC9818879 DOI: 10.3390/diagnostics13010107] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/29/2022] [Revised: 12/28/2022] [Accepted: 12/28/2022] [Indexed: 12/31/2022] Open
Abstract
In recent times, many studies on surgical video analysis are being conducted due to its growing importance in many medical applications. In particular, it is very important to be able to recognize the current surgical phase, because the phase information can be utilized in various ways both during and after surgery. This paper proposes an efficient phase recognition network, called MomentNet, for cholecystectomy endoscopic videos. Unlike LSTM-based networks, MomentNet is based on a multi-stage temporal convolutional network. In addition, to improve the phase prediction accuracy, the proposed method adopts a new loss function to supplement the general cross-entropy loss function. The new loss function significantly improves the performance of the phase recognition network by constraining undesirable phase transitions and preventing over-segmentation. MomentNet also effectively applies positional encoding techniques, which are commonly applied in transformer architectures, to the multi-stage temporal convolution network. By using positional encoding, MomentNet can provide important temporal context, resulting in higher phase prediction accuracy. Furthermore, MomentNet applies a label smoothing technique to suppress overfitting and replaces the backbone network used for feature extraction to further improve performance. As a result, MomentNet achieves 92.31% accuracy in the phase recognition task on the Cholec80 dataset, which is 4.55% higher than that of the baseline architecture.
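To make the positional-encoding idea above concrete, here is a brief, illustrative sketch (not MomentNet itself): transformer-style sinusoidal position codes are added to per-frame features before a dilated temporal-convolution stage. The feature dimension, number of phases, and stage depth are arbitrary assumptions.

```python
# Sinusoidal positional encoding feeding a dilated temporal-convolution stage.
import math
import torch
import torch.nn as nn

def sinusoidal_encoding(length: int, dim: int) -> torch.Tensor:
    pos = torch.arange(length).unsqueeze(1)
    div = torch.exp(torch.arange(0, dim, 2) * (-math.log(10000.0) / dim))
    pe = torch.zeros(length, dim)
    pe[:, 0::2] = torch.sin(pos * div)
    pe[:, 1::2] = torch.cos(pos * div)
    return pe                                   # (length, dim)

class TemporalStage(nn.Module):
    """One dilated temporal-convolution stage over per-frame features."""
    def __init__(self, dim: int = 64, num_phases: int = 7):
        super().__init__()
        self.convs = nn.Sequential(
            nn.Conv1d(dim, dim, 3, padding=1, dilation=1), nn.ReLU(),
            nn.Conv1d(dim, dim, 3, padding=2, dilation=2), nn.ReLU(),
        )
        self.out = nn.Conv1d(dim, num_phases, 1)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (batch, time, dim); add positional context, then convolve over time
        feats = feats + sinusoidal_encoding(feats.shape[1], feats.shape[2])
        return self.out(self.convs(feats.transpose(1, 2)))   # (batch, num_phases, time)

print(TemporalStage()(torch.randn(1, 100, 64)).shape)  # torch.Size([1, 7, 100])
```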
Affiliation(s)
- Minyoung Park
- School of Electrical and Electronics Engineering, Chung-Ang University, 84 Heukseok-ro, Dongjak-gu, Seoul 06974, Republic of Korea
| | - Seungtaek Oh
- School of Electrical and Electronics Engineering, Chung-Ang University, 84 Heukseok-ro, Dongjak-gu, Seoul 06974, Republic of Korea
| | - Taikyeong Jeong
- School of Artificial Intelligence Convergence, Hallym University, Chuncheon 24252, Republic of Korea
| | - Sungwook Yu
- School of Electrical and Electronics Engineering, Chung-Ang University, 84 Heukseok-ro, Dongjak-gu, Seoul 06974, Republic of Korea
| |
24
Reiter W. Domain generalization improves end-to-end object detection for real-time surgical tool detection. Int J Comput Assist Radiol Surg 2022; 18:939-944. [PMID: 36581742 DOI: 10.1007/s11548-022-02823-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/12/2022] [Accepted: 12/20/2022] [Indexed: 12/31/2022]
Abstract
PURPOSE Computer assistance for endoscopic surgery depends on knowledge about the contents of an endoscopic scene. An important step in analysing the video contents is real-time surgical tool detection. Most methods for tool detection nevertheless depend on multi-step algorithms built upon prior knowledge, such as anchor boxes or non-maximum suppression, which ultimately decreases performance. A real-world difficulty encountered by learning-based methods is limited datasets: training a neural network on data matching a specific distribution (e.g. from a single hospital or showing a specific type of surgery) can result in a lack of generalization. METHODS In this paper, we propose the application of a transformer-based architecture for end-to-end tool detection. This architecture promises state-of-the-art accuracy while decreasing complexity, resulting in improved run-time performance. To address the lack of cross-domain generalization due to limited datasets, we enhance the architecture with a latent feature space via variational encoding to capture common intra-domain information. This feature space models the linear dependencies between domains by constraining their rank. RESULTS The trained neural networks show a distinct improvement on out-of-domain data, indicating better generalization to unseen domains. Inference with the end-to-end architecture can be performed at up to 138 frames per second (FPS), achieving a speedup in comparison to older approaches. CONCLUSIONS Experimental results on three representative datasets demonstrate the performance of the method. We also show that our approach leads to better domain generalization.
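The end-to-end, anchor-free and NMS-free detection pattern this abstract refers to is exemplified by DETR-style detectors. The sketch below is not the paper's model; it simply runs the off-the-shelf Hugging Face DETR implementation (pre-trained on COCO, so it would need fine-tuning on surgical tool data) to show how a transformer detector yields boxes directly from set-prediction queries with only score thresholding. The image path is a placeholder.

```python
# End-to-end transformer detection with a pre-trained DETR (stand-in example).
import torch
from PIL import Image
from transformers import DetrImageProcessor, DetrForObjectDetection

processor = DetrImageProcessor.from_pretrained("facebook/detr-resnet-50")
model = DetrForObjectDetection.from_pretrained("facebook/detr-resnet-50").eval()

image = Image.open("laparoscopic_frame.png").convert("RGB")   # hypothetical frame
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Each query directly predicts a box and a class; thresholding replaces NMS.
results = processor.post_process_object_detection(
    outputs, target_sizes=torch.tensor([image.size[::-1]]), threshold=0.7
)[0]
for score, label, box in zip(results["scores"], results["labels"], results["boxes"]):
    print(model.config.id2label[label.item()], round(score.item(), 2), box.tolist())
```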
25
Liu X, Esser D, Wagstaff B, Zavodni A, Matsuura N, Kelly J, Diller E. Capsule robot pose and mechanism state detection in ultrasound using attention-based hierarchical deep learning. Sci Rep 2022; 12:21130. [PMID: 36476715 PMCID: PMC9729303 DOI: 10.1038/s41598-022-25572-w] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/09/2022] [Accepted: 12/01/2022] [Indexed: 12/12/2022] Open
Abstract
Ingestible robotic capsules with locomotion capabilities and an on-board sampling mechanism have great potential for non-invasive diagnostic and interventional use in the gastrointestinal tract. Real-time tracking of capsule location and operational state is necessary for clinical application, yet remains a significant challenge. To this end, we propose an approach that can simultaneously determine the mechanism state and in-plane 2D pose of millimeter-scale capsule robots in an anatomically representative environment using ultrasound imaging. Our work proposes an attention-based hierarchical deep learning approach and adapts the success of transfer learning towards solving the multi-task tracking problem with a limited dataset. To train the neural networks, we generate a representative dataset of a robotic capsule within ex-vivo porcine stomachs. Experimental results show that the accuracy of capsule state classification is 97%, and the mean estimation errors for orientation and centroid position are 2.0 degrees and 0.24 mm (1.7% of the capsule's body length) on the hold-out test set. Accurate detection of the capsule while manipulated by an external magnet in a porcine stomach and colon is also demonstrated. The results suggest our proposed method has the potential to advance wireless capsule-based technologies by providing accurate detection of capsule robots in clinical scenarios.
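A small sketch of the multi-task aspect described above (the attention and hierarchy of the actual method are omitted): a shared image encoder with one head classifying the capsule's mechanism state and another regressing its in-plane pose. The ResNet-18 encoder, three-channel input, class count, and loss weighting are illustrative assumptions, not the authors' implementation.

```python
# Shared encoder with state-classification and pose-regression heads (sketch).
import torch
import torch.nn as nn
from torchvision import models

class CapsuleTracker(nn.Module):
    def __init__(self, num_states: int = 2):
        super().__init__()
        encoder = models.resnet18(weights=None)
        encoder.fc = nn.Identity()
        self.encoder = encoder
        self.state_head = nn.Linear(512, num_states)   # mechanism state (e.g. open/closed)
        self.pose_head = nn.Linear(512, 3)             # x, y, orientation angle

    def forward(self, frame: torch.Tensor):
        z = self.encoder(frame)
        return self.state_head(z), self.pose_head(z)

state_logits, pose = CapsuleTracker()(torch.randn(4, 3, 224, 224))
loss = nn.functional.cross_entropy(state_logits, torch.zeros(4, dtype=torch.long)) \
       + nn.functional.mse_loss(pose, torch.zeros(4, 3))
print(state_logits.shape, pose.shape, loss.item())
```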
Affiliation(s)
- Xiaoyun Liu
- Department of Mechanical and Industrial Engineering, University of Toronto, Toronto, ON M5S 1A8, Canada
| | - Daniel Esser
- Department of Mechanical Engineering, Vanderbilt University, Nashville, TN 37235, USA
| | - Brandon Wagstaff
- University of Toronto Institute of Aerospace Studies, University of Toronto, Toronto, ON M5S 1A8, Canada
| | - Anna Zavodni
- Division of Cardiology, Department of Medicine, University of Toronto, Toronto, ON M5S 1A8, Canada
| | - Naomi Matsuura
- Department of Materials Science and Engineering and Institute of Biomedical Engineering, University of Toronto, Toronto, ON M5S 1A8, Canada
| | - Jonathan Kelly
- University of Toronto Institute of Aerospace Studies, University of Toronto, Toronto, ON M5S 1A8, Canada
| | - Eric Diller
- Department of Mechanical and Industrial Engineering, University of Toronto, Toronto, ON M5S 1A8, Canada
| |
26
Teevno MA, Ochoa-Ruiz G, Ali S. A semi-supervised Teacher-Student framework for surgical tool detection and localization. COMPUTER METHODS IN BIOMECHANICS AND BIOMEDICAL ENGINEERING: IMAGING & VISUALIZATION 2022. [DOI: 10.1080/21681163.2022.2150688] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/12/2022]
Affiliation(s)
- Mansoor Ali Teevno
- Escuela de Ingeniería y Ciencias, Tecnologico de Monterrey, Guadalajara, México
| | - Gilberto Ochoa-Ruiz
- Escuela de Ingeniería y Ciencias, Tecnologico de Monterrey, Guadalajara, México
| | - Sharib Ali
- School of Computing, University of Leeds, Leeds, UK
| |
27
Nema S, Vachhani L. Surgical instrument detection and tracking technologies: Automating dataset labeling for surgical skill assessment. Front Robot AI 2022; 9:1030846. [PMID: 36405072 PMCID: PMC9671944 DOI: 10.3389/frobt.2022.1030846] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/29/2022] [Accepted: 10/14/2022] [Indexed: 11/06/2022] Open
Abstract
Surgical skills can be improved by continuous surgical training and feedback, thus reducing adverse outcomes while performing an intervention. With the advent of new technologies, researchers now have the tools to analyze surgical instrument motion to differentiate surgeons’ levels of technical skill. Surgical skills assessment is time-consuming and prone to subjective interpretation. The surgical instrument detection and tracking algorithm analyzes the image captured by the surgical robotic endoscope and extracts the movement and orientation information of a surgical instrument to provide surgical navigation. This information can be used to label raw surgical video datasets that are used to form an action space for surgical skill analysis. Instrument detection and tracking is a challenging problem in MIS, including robot-assisted surgeries, but vision-based approaches provide promising solutions with minimal hardware integration requirements. This study offers an overview of the developments of assessment systems for surgical intervention analysis. The purpose of this study is to identify the research gap and make a leap in developing technology to automate the incorporation of new surgical skills. A prime factor in automating the learning is to create datasets with minimal manual intervention from raw surgical videos. This review encapsulates the current trends in artificial intelligence (AI) based visual detection and tracking technologies for surgical instruments and their application for surgical skill assessment.
28
Learning strategy for continuous robot visual control: A multi-objective perspective. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.109448] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022]
29
Markarian N, Kugener G, Pangal DJ, Unadkat V, Sinha A, Zhu Y, Roshannai A, Chan J, Hung AJ, Wrobel BB, Anandkumar A, Zada G, Donoho DA. Validation of Machine Learning-Based Automated Surgical Instrument Annotation Using Publicly Available Intraoperative Video. Oper Neurosurg (Hagerstown) 2022; 23:235-240. [PMID: 35972087 DOI: 10.1227/ons.0000000000000274] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/12/2021] [Accepted: 03/05/2022] [Indexed: 02/04/2023] Open
Abstract
BACKGROUND Intraoperative tool movement data have been demonstrated to be clinically useful in quantifying surgical performance. However, collecting this information from intraoperative video requires laborious hand annotation. The ability to automatically annotate tools in surgical video would advance surgical data science by eliminating a time-intensive step in research. OBJECTIVE To identify whether machine learning (ML) can automatically identify surgical instruments contained within neurosurgical video. METHODS An ML model that automatically identifies surgical instruments in frame was developed and trained on multiple publicly available surgical video data sets with instrument location annotations. A total of 39 693 frames from 4 data sets were used (endoscopic endonasal surgery [EEA] [30 015 frames], cataract surgery [4670], laparoscopic cholecystectomy [2532], and microscope-assisted brain/spine tumor removal [2476]). A second model trained only on EEA video was also developed. Intraoperative EEA videos from YouTube were used as test data (3 videos, 1239 frames). RESULTS The YouTube data set contained 2169 total instruments. Mean average precision (mAP) for instrument detection on the YouTube data set was 0.74. The mAP for each individual video was 0.65, 0.74, and 0.89. The second model trained only on EEA video also had an overall mAP of 0.74 (0.62, 0.84, and 0.88 for individual videos). Development costs were $130 for manual video annotation and under $100 for computation. CONCLUSION Surgical instruments contained within endoscopic endonasal intraoperative video can be detected using a fully automated ML model. The addition of disparate surgical data sets did not improve model performance, although these data sets may improve generalizability of the model in other use cases.
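For readers unfamiliar with the mean average precision (mAP) figures quoted above, the following self-contained sketch shows how detections are matched to ground-truth boxes by intersection over union and how average precision is accumulated for a single class. The boxes and scores are toy values; this simplified routine is not the paper's evaluation code.

```python
# IoU matching and single-class average precision (toy example).
import numpy as np

def iou(a, b):
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def average_precision(preds, gts, thr=0.5):
    # preds: list of (score, box) for one class; gts: list of ground-truth boxes
    preds = sorted(preds, key=lambda p: -p[0])
    matched, tp, fp = set(), [], []
    for score, box in preds:
        best = max(range(len(gts)), key=lambda i: iou(box, gts[i]), default=None)
        if best is not None and iou(box, gts[best]) >= thr and best not in matched:
            matched.add(best); tp.append(1); fp.append(0)
        else:
            tp.append(0); fp.append(1)
    tp, fp = np.cumsum(tp), np.cumsum(fp)
    recall = tp / max(len(gts), 1)
    precision = tp / np.maximum(tp + fp, 1e-9)
    return float(np.trapz(precision, recall))     # area under the precision-recall curve

preds = [(0.9, (10, 10, 50, 50)), (0.6, (100, 100, 140, 150))]
gts = [(12, 8, 48, 52), (200, 200, 240, 240)]
print(average_precision(preds, gts))
```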
Affiliation(s)
- Nicholas Markarian
- Department of Neurosurgery, Keck School of Medicine, University of Southern California, Los Angeles, California, USA
| | - Guillaume Kugener
- Department of Neurosurgery, Keck School of Medicine, University of Southern California, Los Angeles, California, USA
| | - Dhiraj J Pangal
- Department of Neurosurgery, Keck School of Medicine, University of Southern California, Los Angeles, California, USA
| | - Vyom Unadkat
- Department of Computer Science, Viterbi School of Engineering, University of Southern California, Los Angeles, California, USA
| | | | - Yichao Zhu
- Department of Computer Science, Viterbi School of Engineering, University of Southern California, Los Angeles, California, USA
| | - Arman Roshannai
- Department of Computer Science, Viterbi School of Engineering, University of Southern California, Los Angeles, California, USA
| | - Justin Chan
- Department of Neurosurgery, Keck School of Medicine, University of Southern California, Los Angeles, California, USA
| | - Andrew J Hung
- Center for Robotic Simulation and Education, USC Institute of Urology, Keck School of Medicine of the University of Southern California, Los Angeles, California, USA
| | - Bozena B Wrobel
- Department of Neurosurgery, Keck School of Medicine, University of Southern California, Los Angeles, California, USA.,USC Caruso Department of Otolaryngology-Head and Neck Surgery, University of Southern California, Los Angeles, California, USA
| | - Animashree Anandkumar
- Department of Computing + Mathematical Sciences, California Institute of Technology, Pasadena, California, USA
| | - Gabriel Zada
- Department of Neurosurgery, Keck School of Medicine, University of Southern California, Los Angeles, California, USA
| | - Daniel A Donoho
- Division of Neurosurgery, Department of Surgery, Texas Children's Hospital, Baylor College of Medicine, Houston, Texas, USA
| |
30
Surgical Tool Datasets for Machine Learning Research: A Survey. Int J Comput Vis 2022. [DOI: 10.1007/s11263-022-01640-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/05/2022]
Abstract
This paper is a comprehensive survey of datasets for surgical tool detection and related surgical data science and machine learning techniques and algorithms. The survey offers a high-level perspective of current research in this area, analyses the taxonomy of approaches adopted by researchers using surgical tool datasets, and addresses key areas of research, such as the datasets used, evaluation metrics applied and deep learning techniques utilised. Our presentation and taxonomy provide a framework that facilitates greater understanding of current work, and highlight the challenges and opportunities for further innovative and useful research.
31
Sánchez-Brizuela G, Santos-Criado FJ, Sanz-Gobernado D, de la Fuente-López E, Fraile JC, Pérez-Turiel J, Cisnal A. Gauze Detection and Segmentation in Minimally Invasive Surgery Video Using Convolutional Neural Networks. SENSORS (BASEL, SWITZERLAND) 2022; 22:5180. [PMID: 35890857 PMCID: PMC9319965 DOI: 10.3390/s22145180] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 06/13/2022] [Revised: 06/30/2022] [Accepted: 07/07/2022] [Indexed: 06/15/2023]
Abstract
Medical instrument detection in laparoscopic video has been pursued to increase the autonomy of surgical robots, evaluate skills, or index recordings. However, it has not been extended to surgical gauzes. Gauzes can provide valuable information for numerous tasks in the operating room, but the lack of an annotated dataset has hampered research on them. In this article, we present a segmentation dataset with 4003 hand-labelled frames from laparoscopic video. To demonstrate the dataset's potential, we analyzed several baselines: detection using YOLOv3, coarse segmentation, and segmentation with a U-Net. Our results show that YOLOv3 can be executed in real time but provides only modest recall. Coarse segmentation presents satisfactory results but lacks inference speed. Finally, the U-Net baseline achieves a good speed-quality compromise, running above 30 FPS while obtaining an IoU of 0.85. The accuracy reached by the U-Net and its execution speed demonstrate that precise and real-time gauze segmentation can be achieved by training convolutional neural networks on the proposed dataset.
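The IoU and frames-per-second figures quoted above can be reproduced for any segmentation model with a few lines of evaluation code. The sketch below uses toy masks and a dummy segmentation callable rather than the published network, and simply shows one way to compute both quantities.

```python
# Mask IoU and rough FPS measurement for a binary segmentation model (sketch).
import time
import numpy as np

def mask_iou(pred: np.ndarray, gt: np.ndarray) -> float:
    pred, gt = pred.astype(bool), gt.astype(bool)
    union = np.logical_or(pred, gt).sum()
    return np.logical_and(pred, gt).sum() / union if union else 1.0

def measure_fps(segment_fn, frames) -> float:
    start = time.perf_counter()
    for frame in frames:
        segment_fn(frame)
    return len(frames) / (time.perf_counter() - start)

# Toy data standing in for a network prediction and a hand-labelled annotation
pred = np.zeros((480, 640), dtype=np.uint8); pred[100:200, 100:300] = 1
gt = np.zeros_like(pred); gt[110:210, 120:310] = 1
print(f"IoU = {mask_iou(pred, gt):.2f}")
print(f"FPS = {measure_fps(lambda f: f > 0, [pred] * 50):.0f}")   # dummy 'model'
```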
Affiliation(s)
- Guillermo Sánchez-Brizuela
- Instituto de las Tecnologías Avanzadas de la Producción (ITAP), Universidad de Valladolid, Paseo del Cauce 59, 47011 Valladolid, Spain; (D.S.-G.); (E.d.l.F.-L.); (J.-C.F.); (J.P.-T.); (A.C.)
| | - Francisco-Javier Santos-Criado
- Escuela Técnica Superior de Ingenieros Industriales, Universidad Politécnica de Madrid, Calle de José Gutiérrez Abascal, 2, 28006 Madrid, Spain;
| | - Daniel Sanz-Gobernado
- Instituto de las Tecnologías Avanzadas de la Producción (ITAP), Universidad de Valladolid, Paseo del Cauce 59, 47011 Valladolid, Spain; (D.S.-G.); (E.d.l.F.-L.); (J.-C.F.); (J.P.-T.); (A.C.)
| | - Eusebio de la Fuente-López
- Instituto de las Tecnologías Avanzadas de la Producción (ITAP), Universidad de Valladolid, Paseo del Cauce 59, 47011 Valladolid, Spain; (D.S.-G.); (E.d.l.F.-L.); (J.-C.F.); (J.P.-T.); (A.C.)
| | - Juan-Carlos Fraile
- Instituto de las Tecnologías Avanzadas de la Producción (ITAP), Universidad de Valladolid, Paseo del Cauce 59, 47011 Valladolid, Spain; (D.S.-G.); (E.d.l.F.-L.); (J.-C.F.); (J.P.-T.); (A.C.)
| | - Javier Pérez-Turiel
- Instituto de las Tecnologías Avanzadas de la Producción (ITAP), Universidad de Valladolid, Paseo del Cauce 59, 47011 Valladolid, Spain; (D.S.-G.); (E.d.l.F.-L.); (J.-C.F.); (J.P.-T.); (A.C.)
| | - Ana Cisnal
- Instituto de las Tecnologías Avanzadas de la Producción (ITAP), Universidad de Valladolid, Paseo del Cauce 59, 47011 Valladolid, Spain; (D.S.-G.); (E.d.l.F.-L.); (J.-C.F.); (J.P.-T.); (A.C.)
| |
32
Yang S, Wang Y, Zhao H, Cheng H, Ding H. Autonomous Laparoscope Control for Minimally Invasive Surgery With Intuition and RCM Constraints. IEEE Robot Autom Lett 2022. [DOI: 10.1109/lra.2022.3186507] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]
Affiliation(s)
- Sihang Yang
- State Key Laboratory of Digital Manufacturing Equipment and Technology, Huazhong University of Science and Technology, Wuhan, China
| | - Yiwei Wang
- State Key Laboratory of Digital Manufacturing Equipment and Technology, Huazhong University of Science and Technology, Wuhan, China
| | - Huan Zhao
- State Key Laboratory of Digital Manufacturing Equipment and Technology, Huazhong University of Science and Technology, Wuhan, China
| | - Haoyuan Cheng
- State Key Laboratory of Digital Manufacturing Equipment and Technology, Huazhong University of Science and Technology, Wuhan, China
| | - Han Ding
- State Key Laboratory of Digital Manufacturing Equipment and Technology, Huazhong University of Science and Technology, Wuhan, China
| |
33
Deep Learning Approaches for Automatic Localization in Medical Images. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE 2022; 2022:6347307. [PMID: 35814554 PMCID: PMC9259335 DOI: 10.1155/2022/6347307] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 01/18/2022] [Accepted: 05/23/2022] [Indexed: 12/21/2022]
Abstract
Recent revolutionary advances in deep learning (DL) have fueled several breakthrough achievements in various complicated computer vision tasks. The remarkable successes started in 2012, when deep neural networks (DNNs) outperformed shallow machine learning models on a number of significant benchmarks. Significant advances were made in computer vision by conducting very complex image interpretation tasks with outstanding accuracy. These achievements have shown great promise in a wide variety of fields, especially in medical image analysis, by creating opportunities to diagnose and treat diseases earlier. In recent years, the application of DNNs to object localization has gained the attention of researchers due to its success over conventional methods. As this has become a very broad and rapidly growing field, this study presents a short review of DNN implementations for medical images and validates their efficacy on benchmarks. This is the first review that focuses on object localization using DNNs in medical images. The key aim of this study was to summarize recent studies based on DNNs for medical image localization and to highlight the research gaps that can provide worthwhile ideas to shape future research related to object localization tasks. It starts with an overview of the importance of medical image analysis and existing technology in this space. The discussion then proceeds to the dominant DNNs utilized in the current literature. Finally, we conclude by discussing the challenges associated with the application of DNNs to medical image localization, which can drive further studies identifying potential future developments in the relevant field of study.
34
Gumbs AA, Grasso V, Bourdel N, Croner R, Spolverato G, Frigerio I, Illanes A, Abu Hilal M, Park A, Elyan E. The Advances in Computer Vision That Are Enabling More Autonomous Actions in Surgery: A Systematic Review of the Literature. SENSORS (BASEL, SWITZERLAND) 2022; 22:4918. [PMID: 35808408 PMCID: PMC9269548 DOI: 10.3390/s22134918] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 05/20/2022] [Revised: 06/21/2022] [Accepted: 06/21/2022] [Indexed: 12/28/2022]
Abstract
This is a review focused on advances and current limitations of computer vision (CV) and how CV can help us achieve more autonomous actions in surgery. It is a follow-up article to one that we previously published in Sensors entitled, "Artificial Intelligence Surgery: How Do We Get to Autonomous Actions in Surgery?" As opposed to that article, which also discussed issues of machine learning, deep learning and natural language processing, this review delves deeper into the field of CV. Additionally, non-visual forms of data that can aid computerized robots in the performance of more autonomous actions, such as instrument priors and audio haptics, are also highlighted. Furthermore, the current existential crisis for surgeons, endoscopists and interventional radiologists regarding more autonomy during procedures is discussed. In summary, this paper discusses how to harness the power of CV to keep doctors who perform interventions in the loop.
Affiliation(s)
- Andrew A. Gumbs
- Departement de Chirurgie Digestive, Centre Hospitalier Intercommunal de, Poissy/Saint-Germain-en-Laye, 78300 Poissy, France
- Department of Surgery, University of Magdeburg, 39106 Magdeburg, Germany;
| | - Vincent Grasso
- Family Christian Health Center, 31 West 155th St., Harvey, IL 60426, USA;
| | - Nicolas Bourdel
- Gynecological Surgery Department, CHU Clermont Ferrand, 1, Place Lucie-Aubrac Clermont-Ferrand, 63100 Clermont-Ferrand, France;
- EnCoV, Institut Pascal, UMR6602 CNRS, UCA, Clermont-Ferrand University Hospital, 63000 Clermont-Ferrand, France
- SurgAR-Surgical Augmented Reality, 63000 Clermont-Ferrand, France
| | - Roland Croner
- Department of Surgery, University of Magdeburg, 39106 Magdeburg, Germany;
| | - Gaya Spolverato
- Department of Surgical, Oncological and Gastroenterological Sciences, University of Padova, 35122 Padova, Italy;
| | - Isabella Frigerio
- Department of Hepato-Pancreato-Biliary Surgery, Pederzoli Hospital, 37019 Peschiera del Garda, Italy;
| | - Alfredo Illanes
- INKA-Innovation Laboratory for Image Guided Therapy, Otto-von-Guericke University Magdeburg, 39120 Magdeburg, Germany;
| | - Mohammad Abu Hilal
- Unità Chirurgia Epatobiliopancreatica, Robotica e Mininvasiva, Fondazione Poliambulanza Istituto Ospedaliero, Via Bissolati, 57, 25124 Brescia, Italy;
| | - Adrian Park
- Anne Arundel Medical Center, Johns Hopkins University, Annapolis, MD 21401, USA;
| | - Eyad Elyan
- School of Computing, Robert Gordon University, Aberdeen AB10 7JG, UK;
| |
35
Goldbraikh A, Volk T, Pugh CM, Laufer S. Using open surgery simulation kinematic data for tool and gesture recognition. Int J Comput Assist Radiol Surg 2022; 17:965-979. [PMID: 35419721 PMCID: PMC10766114 DOI: 10.1007/s11548-022-02615-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/14/2021] [Accepted: 03/22/2022] [Indexed: 01/18/2023]
Abstract
PURPOSE The use of motion sensors is emerging as a means for measuring surgical performance. Motion sensors are typically used for calculating performance metrics and assessing skill. The aim of this study was to identify surgical gestures and tools used during an open surgery suturing simulation based on motion sensor data. METHODS Twenty-five participants performed a suturing task on a variable tissue simulator. Electromagnetic motion sensors were used to measure their performance. The current study compares GRU and LSTM networks, which are known to perform well on other kinematic datasets, as well as MS-TCN++, which was developed for video data and was adapted in this work to motion sensor data. Finally, we extended all architectures for multi-tasking. RESULTS In the gesture recognition task, MS-TCN++ has the highest performance, with an accuracy of [Formula: see text], F1-Macro of [Formula: see text], edit distance of [Formula: see text], and F1@10 of [Formula: see text]. In the tool usage recognition task for the right hand, MS-TCN++ performs best in most metrics, with an accuracy score of [Formula: see text], F1-Macro of [Formula: see text], F1@10 of [Formula: see text], and F1@25 of [Formula: see text]. The multi-task GRU performs best in all metrics in the left-hand case, with an accuracy of [Formula: see text], edit distance of [Formula: see text], F1-Macro of [Formula: see text], F1@10 of [Formula: see text], and F1@25 of [Formula: see text]. CONCLUSION In this study, using motion sensor data, we automatically identified the surgical gestures and the tools used during an open surgery suturing simulation. Our methods may be used for computing more detailed performance metrics and assisting in automatic workflow analysis. MS-TCN++ performed better in gesture recognition as well as right-hand tool recognition, while the multi-task GRU provided better results in the left-hand case. It should be noted that our multi-task GRU network is significantly smaller and achieved competitive results in the rest of the tasks as well.
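A compact sketch of the multi-task recurrent idea described above (illustrative, not the authors' implementation): a GRU over motion-sensor sequences with one output head per task, here a gesture label and a per-hand tool label at every time step. Input dimensionality, hidden size, and class counts are assumptions.

```python
# Multi-task GRU over kinematic sensor sequences (sketch).
import torch
import torch.nn as nn

class MultiTaskGRU(nn.Module):
    def __init__(self, in_dim=12, hidden=128, n_gestures=6, n_tools=4):
        super().__init__()
        self.gru = nn.GRU(in_dim, hidden, num_layers=2,
                          batch_first=True, bidirectional=True)
        self.gesture_head = nn.Linear(2 * hidden, n_gestures)
        self.tool_right_head = nn.Linear(2 * hidden, n_tools)
        self.tool_left_head = nn.Linear(2 * hidden, n_tools)

    def forward(self, kinematics):               # (batch, time, in_dim)
        h, _ = self.gru(kinematics)
        return (self.gesture_head(h),            # per-frame gesture logits
                self.tool_right_head(h),         # per-frame right-hand tool logits
                self.tool_left_head(h))          # per-frame left-hand tool logits

g, tr, tl = MultiTaskGRU()(torch.randn(2, 500, 12))
print(g.shape, tr.shape, tl.shape)
```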
Affiliation(s)
- Adam Goldbraikh
- Applied Mathematics Department, Technion - Israel Institute of Technology, 3200003, Haifa, Israel.
| | - Tomer Volk
- Faculty of Industrial Engineering and Management, Technion - Israel Institute of Technology, 3200003, Haifa, Israel
| | - Carla M Pugh
- Department of Surgery, Stanford University School of Medicine, Stanford, CA, 610101, USA
| | - Shlomi Laufer
- Faculty of Industrial Engineering and Management, Technion - Israel Institute of Technology, 3200003, Haifa, Israel
| |
36
Huang Y, Li J, Zhang X, Xie K, Li J, Liu Y, Ng SH, Chiu PWY, Li Z. A Surgeon Preference-Guided Autonomous Instrument Tracking Method With a Robotic Flexible Endoscope Based on dVRK Platform. IEEE Robot Autom Lett 2022. [DOI: 10.1109/lra.2022.3143305] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
37
Sun Y, Pan B, Fu Y. Lightweight Deep Neural Network for Articulated Joint Detection of Surgical Instrument in Minimally Invasive Surgical Robot. J Digit Imaging 2022; 35:923-937. [PMID: 35266089 PMCID: PMC9485358 DOI: 10.1007/s10278-022-00616-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/24/2021] [Revised: 01/15/2022] [Accepted: 02/27/2022] [Indexed: 11/29/2022] Open
Abstract
Vision-based detection and tracking of surgical instruments is attractive because it relies purely on the instruments already present in the operating scene. Visual knowledge of the surgical instruments is a crucial topic for surgical task understanding, autonomous robot control, and human-robot collaborative surgery to enhance surgical outcomes. In this work, a novel method is demonstrated by developing a multitask lightweight deep neural network framework for surgical instrument articulated joint detection. The model has an end-to-end architecture with two branches, which share the same high-level visual features provided by a lightweight backbone while holding respective layers targeting specific tasks. We designed a novel subnetwork with a joint detection branch and an instrument classification branch to take full advantage of the relatedness of the surgical instrument presence detection and articulated joint detection tasks. The lightweight joint detection branch efficiently locates the articulated joint positions while keeping the computational cost low. Moreover, the surgical instrument classification branch is introduced to boost the performance of joint detection. The two branches are merged to output the articulated joint location together with the respective instrument type. Extensive validation has been conducted to evaluate the proposed method, and the results demonstrate its promising performance. The work demonstrates the feasibility of performing real-time surgical instrument articulated joint detection by taking advantage of the components of a surgical robot system, providing a reference for further surgical intelligence.
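An illustrative sketch of the two-branch design described above, with sizes and names that are assumptions rather than the paper's configuration: a lightweight shared backbone feeds a heatmap branch that localizes articulated joints and a classification branch that predicts the instrument type.

```python
# Shared backbone with joint-heatmap and instrument-classification branches (sketch).
import torch
import torch.nn as nn

class JointDetectionNet(nn.Module):
    def __init__(self, num_joints=5, num_instruments=7):
        super().__init__()
        self.backbone = nn.Sequential(                 # lightweight stand-in backbone
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.joint_branch = nn.Conv2d(64, num_joints, 1)        # per-joint heatmaps
        self.cls_branch = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                        nn.Linear(64, num_instruments))

    def forward(self, image):
        feats = self.backbone(image)
        return self.joint_branch(feats), self.cls_branch(feats)

heatmaps, instrument_logits = JointDetectionNet()(torch.randn(1, 3, 256, 256))
print(heatmaps.shape, instrument_logits.shape)   # (1, 5, 64, 64), (1, 7)
```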
Affiliation(s)
- Yanwen Sun
- State Key Laboratory of Robotics and Systems, Harbin Institute of Technology, Harbin, China
| | - Bo Pan
- State Key Laboratory of Robotics and Systems, Harbin Institute of Technology, Harbin, China.
| | - Yili Fu
- State Key Laboratory of Robotics and Systems, Harbin Institute of Technology, Harbin, China
| |
38
Kugener G, Pangal DJ, Cardinal T, Collet C, Lechtholz-Zey E, Lasky S, Sundaram S, Markarian N, Zhu Y, Roshannai A, Sinha A, Han XY, Papyan V, Hung A, Anandkumar A, Wrobel B, Zada G, Donoho DA. Utility of the Simulated Outcomes Following Carotid Artery Laceration Video Data Set for Machine Learning Applications. JAMA Netw Open 2022; 5:e223177. [PMID: 35311962 PMCID: PMC8938712 DOI: 10.1001/jamanetworkopen.2022.3177] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 01/28/2023] Open
Abstract
IMPORTANCE Surgical data scientists lack video data sets that depict adverse events, which may affect model generalizability and introduce bias. Hemorrhage may be particularly challenging for computer vision-based models because blood obscures the scene. OBJECTIVE To assess the utility of the Simulated Outcomes Following Carotid Artery Laceration (SOCAL)-a publicly available surgical video data set of hemorrhage complication management with instrument annotations and task outcomes-to provide benchmarks for surgical data science techniques, including computer vision instrument detection, instrument use metrics and outcome associations, and validation of a SOCAL-trained neural network using real operative video. DESIGN, SETTING, AND PARTICIPANTS For this quality improvement study, a total of 75 surgeons with 1 to 30 years' experience (mean, 7 years) were filmed from January 1, 2017, to December 31, 2020, managing catastrophic surgical hemorrhage in a high-fidelity cadaveric training exercise at nationwide training courses. Videos were annotated from January 1 to June 30, 2021. INTERVENTIONS Surgeons received expert coaching between 2 trials. MAIN OUTCOMES AND MEASURES Hemostasis within 5 minutes (task success, dichotomous), time to hemostasis (in seconds), and blood loss (in milliliters) were recorded. Deep neural networks (DNNs) were trained to detect surgical instruments in view. Model performance was measured using mean average precision (mAP), sensitivity, and positive predictive value. RESULTS SOCAL contains 31 443 frames with 65 071 surgical instrument annotations from 147 trials with associated surgeon demographic characteristics, time to hemostasis, and recorded blood loss for each trial. Computer vision-based instrument detection methods using DNNs trained on SOCAL achieved a mAP of 0.67 overall and 0.91 for the most common surgical instrument (suction). Hemorrhage control challenges standard object detectors: detection of some surgical instruments remained poor (mAP, 0.25). On real intraoperative video, the model achieved a sensitivity of 0.77 and a positive predictive value of 0.96. Instrument use metrics derived from the SOCAL video were significantly associated with performance (blood loss). CONCLUSIONS AND RELEVANCE Hemorrhage control is a high-stakes adverse event that poses unique challenges for video analysis, but no data sets of hemorrhage control exist. The use of SOCAL, the first data set to depict hemorrhage control, allows the benchmarking of data science applications, including object detection, performance metric development, and identification of metrics associated with outcomes. In the future, SOCAL may be used to build and validate surgical data science models.
Affiliation(s)
- Guillaume Kugener
- Department of Neurosurgery, Keck School of Medicine of the University of Southern California, Los Angeles
| | - Dhiraj J. Pangal
- Department of Neurosurgery, Keck School of Medicine of the University of Southern California, Los Angeles
| | - Tyler Cardinal
- Department of Neurosurgery, Keck School of Medicine of the University of Southern California, Los Angeles
| | - Casey Collet
- Department of Neurosurgery, Keck School of Medicine of the University of Southern California, Los Angeles
| | - Elizabeth Lechtholz-Zey
- Department of Neurosurgery, Keck School of Medicine of the University of Southern California, Los Angeles
| | - Sasha Lasky
- Department of Neurosurgery, Keck School of Medicine of the University of Southern California, Los Angeles
| | - Shivani Sundaram
- Department of Neurosurgery, Keck School of Medicine of the University of Southern California, Los Angeles
| | - Nicholas Markarian
- Department of Neurosurgery, Keck School of Medicine of the University of Southern California, Los Angeles
| | - Yichao Zhu
- Department of Computer Science, Viterbi School of Engineering, University of Southern California, Los Angeles
| | - Arman Roshannai
- Department of Neurosurgery, Keck School of Medicine of the University of Southern California, Los Angeles
| | - Aditya Sinha
- Department of Neurosurgery, Keck School of Medicine of the University of Southern California, Los Angeles
| | - X. Y. Han
- Department of Operations Research and Information Engineering, Cornell University, Ithaca, New York
| | - Vardan Papyan
- Department of Mathematics, University of Toronto, Toronto, Ontario, Canada
| | - Andrew Hung
- Center for Robotic Simulation and Education, USC Institute of Urology, Keck School of Medicine of the University of Southern California, Los Angeles
| | - Animashree Anandkumar
- Department of Computer Science and Mathematics, California Institute of Technology, Pasadena
| | - Bozena Wrobel
- Department of Otolaryngology, Keck School of Medicine of the University of Southern California, Los Angeles
| | - Gabriel Zada
- Department of Neurosurgery, Keck School of Medicine of the University of Southern California, Los Angeles
| | - Daniel A. Donoho
- Division of Neurosurgery, Center for Neuroscience, Children’s National Hospital, Washington, DC
| |
39
Liu J, Guo X, Yuan Y. Graph-Based Surgical Instrument Adaptive Segmentation via Domain-Common Knowledge. IEEE TRANSACTIONS ON MEDICAL IMAGING 2022; 41:715-726. [PMID: 34673485 DOI: 10.1109/tmi.2021.3121138] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/13/2023]
Abstract
Unsupervised domain adaptation (UDA), aiming to adapt the model to an unseen domain without annotations, has drawn sustained attention in surgical instrument segmentation. Existing UDA methods neglect the domain-common knowledge of the two datasets, thus failing to grasp the inter-category relationship in the target domain and leading to poor performance. To address these issues, we propose a graph-based unsupervised domain adaptation framework, named Interactive Graph Network (IGNet), to effectively adapt a model to an unlabeled new domain in surgical instrument segmentation tasks. In detail, the Domain-common Prototype Constructor (DPC) is first advanced to adaptively aggregate the feature map into domain-common prototypes using a probability mixture model, and to construct a prototypical graph that exchanges information among prototypes from a global perspective. In this way, DPC can grasp the co-occurrent and long-range relationships for both domains. To further narrow the domain gap, we design a Domain-common Knowledge Incorporator (DKI) to guide the evolution of feature maps towards the domain-common direction via a common-knowledge guidance graph and category-attentive graph reasoning. Finally, the Cross-category Mismatch Estimator (CME) is developed to evaluate the category-level alignment from a graph perspective and assign each pixel a different adversarial weight, so as to refine the feature distribution alignment. Extensive experiments on three types of tasks demonstrate the feasibility and superiority of IGNet compared with other state-of-the-art methods. Furthermore, ablation studies verify the effectiveness of each component of IGNet. The source code is available at https://github.com/CityU-AIM-Group/Prototypical-Graph-DA.
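A brief sketch of the prototype idea summarized above, as an illustration rather than IGNet itself: class prototypes are built by probability-weighted pooling of a feature map, and such prototype vectors can then be compared or related across domains. Shapes and the soft-assignment source are assumptions.

```python
# Probability-weighted class prototypes from a feature map (sketch).
import torch

def class_prototypes(features: torch.Tensor, probs: torch.Tensor) -> torch.Tensor:
    """features: (B, C, H, W); probs: (B, K, H, W) soft class assignments.
    Returns (K, C) prototype vectors, one per class."""
    b, c, h, w = features.shape
    k = probs.shape[1]
    f = features.reshape(b, c, h * w)            # (B, C, HW)
    p = probs.reshape(b, k, h * w)               # (B, K, HW)
    weighted = torch.einsum("bkn,bcn->kc", p, f) # sum of features weighted by class probability
    weights = p.sum(dim=(0, 2)).clamp_min(1e-6).unsqueeze(1)
    return weighted / weights                    # normalized per-class prototypes

protos = class_prototypes(torch.randn(2, 64, 32, 32), torch.rand(2, 5, 32, 32))
print(protos.shape)   # torch.Size([5, 64])
```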
40
Xue Y, Liu S, Li Y, Wang P, Qian X. A new weakly supervised strategy for surgical tool detection. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2021.107860] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/16/2022]
41
Ni ZL, Zhou XH, Wang GA, Yue WQ, Li Z, Bian GB, Hou ZG. SurgiNet: Pyramid Attention Aggregation and Class-wise Self-Distillation for Surgical Instrument Segmentation. Med Image Anal 2022; 76:102310. [PMID: 34954623 DOI: 10.1016/j.media.2021.102310] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/22/2021] [Revised: 10/01/2021] [Accepted: 11/22/2021] [Indexed: 11/21/2022]
Abstract
Surgical instrument segmentation plays a promising role in robot-assisted surgery. However, illumination issues often appear in surgical scenes, altering the color and texture of surgical instruments, and these changes in visual features make surgical instrument segmentation difficult. To address illumination issues, SurgiNet is proposed to learn pyramid attention features. A double attention module is designed to capture the semantic dependencies between locations and channels. Based on these semantic dependencies, the semantic features in the disturbed area can be inferred, addressing the illumination issues. Pyramid attention is aggregated to capture multi-scale features and make predictions more accurate. For model compression, class-wise self-distillation is proposed to enhance the representation learning of the network; it performs feature distillation within each class to eliminate interference from other classes. Top-down, multi-stage knowledge distillation is designed to distill class probability maps: through inter-layer supervision, high-level probability maps are applied to calibrate the probability distribution of low-level probability maps. Since class-wise distillation enhances the self-learning of the network, the network can achieve excellent performance with a lightweight backbone. The proposed network achieves state-of-the-art performance of 89.14% mIoU on CataIS with only 1.66 GFlops and 2.05 M parameters. It also takes first place on EndoVis 2017 with 66.30% mIoU.
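A hedged sketch of the inter-layer distillation principle summarized above: a deeper (high-level) probability map acts as a teacher that supervises a shallower one through a KL-divergence term, which would be added to the usual segmentation loss. This illustrates the principle only; the actual SurgiNet losses are defined in the paper.

```python
# Inter-layer probability-map distillation loss (illustrative sketch).
import torch
import torch.nn.functional as F

def distillation_loss(low_logits, high_logits, temperature: float = 2.0):
    # logits: (batch, classes, H, W); the deeper prediction acts as the teacher
    teacher = F.softmax(high_logits.detach() / temperature, dim=1)
    student = F.log_softmax(low_logits / temperature, dim=1)
    return F.kl_div(student, teacher, reduction="batchmean") * temperature ** 2

low = torch.randn(2, 4, 64, 64, requires_grad=True)    # low-level (student) logits
high = torch.randn(2, 4, 64, 64)                        # high-level (teacher) logits
print(distillation_loss(low, high).item())
```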
Affiliation(s)
- Zhen-Liang Ni
- The State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China; The School of Artificial Intelligence, University of Chinese Academy of Sciences, China
| | - Xiao-Hu Zhou
- The School of Artificial Intelligence, University of Chinese Academy of Sciences, China
| | - Guan-An Wang
- The State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China; The School of Artificial Intelligence, University of Chinese Academy of Sciences, China
| | - Wen-Qian Yue
- The School of Artificial Intelligence, University of Chinese Academy of Sciences, China
| | - Zhen Li
- The School of Artificial Intelligence, University of Chinese Academy of Sciences, China
| | - Gui-Bin Bian
- The State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China; The School of Artificial Intelligence, University of Chinese Academy of Sciences, China.
| | - Zeng-Guang Hou
- The State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China; The School of Artificial Intelligence, University of Chinese Academy of Sciences, China; Joint Laboratory of Intelligence Science and Technology, Institute of Systems Engineering, Macau University of Science and Technology, China.
| |
42
Dupont PE, Nelson BJ, Goldfarb M, Hannaford B, Menciassi A, O'Malley MK, Simaan N, Valdastri P, Yang GZ. A decade retrospective of medical robotics research from 2010 to 2020. Sci Robot 2021; 6:eabi8017. [PMID: 34757801 DOI: 10.1126/scirobotics.abi8017] [Citation(s) in RCA: 111] [Impact Index Per Article: 27.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/16/2022]
Abstract
[Figure: see text].
Affiliation(s)
- Pierre E Dupont
- Department of Cardiovascular Surgery, Boston Children's Hospital, Harvard Medical School, Boston, MA 02115, USA
| | - Bradley J Nelson
- Institute of Robotics and Intelligent Systems, Department of Mechanical and Process Engineering, ETH-Zürich, Zürich, Switzerland
| | - Michael Goldfarb
- Department of Mechanical Engineering, Vanderbilt University, Nashville, TN 37235, USA
| | - Blake Hannaford
- Department of Electrical and Computer Engineering, University of Washington, Seattle, WA 98195, USA
| | | | - Marcia K O'Malley
- Department of Mechanical Engineering, Rice University, Houston, TX 77005, USA
| | - Nabil Simaan
- Department of Mechanical Engineering, Vanderbilt University, Nashville, TN 37235, USA
| | - Pietro Valdastri
- Department of Electronic and Electrical Engineering, University of Leeds, Leeds, UK
| | - Guang-Zhong Yang
- Medical Robotics Institute, Shanghai Jiao Tong University, Shanghai, China
| |
43
Moglia A, Georgiou K, Georgiou E, Satava RM, Cuschieri A. A systematic review on artificial intelligence in robot-assisted surgery. Int J Surg 2021; 95:106151. [PMID: 34695601 DOI: 10.1016/j.ijsu.2021.106151] [Citation(s) in RCA: 58] [Impact Index Per Article: 14.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/29/2021] [Revised: 10/04/2021] [Accepted: 10/19/2021] [Indexed: 12/12/2022]
Abstract
BACKGROUND Despite the extensive published literature on the significant potential of artificial intelligence (AI), there are no reports on its efficacy in improving patient safety in robot-assisted surgery (RAS). The purposes of this work are to systematically review the published literature on AI in RAS, and to identify and discuss current limitations and challenges. MATERIALS AND METHODS A literature search was conducted on PubMed, Web of Science, Scopus, and IEEExplore according to the PRISMA 2020 statement. Eligible articles were peer-reviewed studies published in English from January 1, 2016 to December 31, 2020. AMSTAR 2 was used for quality assessment. Risk of bias was evaluated with the Newcastle-Ottawa quality assessment tool. Data from the studies were visually presented in tables using the SPIDER tool. RESULTS Thirty-five publications, representing 3436 patients, met the search criteria and were included in the analysis. The selected reports concern: motion analysis (n = 17), urology (n = 12), gynecology (n = 1), other specialties (n = 1), training (n = 3), and tissue retraction (n = 1). Precision for surgical tool detection varied from 76.0% to 90.6%. Mean absolute error in the prediction of urinary continence after robot-assisted radical prostatectomy (RARP) ranged from 85.9 to 134.7 days. Accuracy of the prediction of length of stay after RARP was 88.5%. Accuracy in recognizing the next surgical task during robot-assisted partial nephrectomy (RAPN) reached 75.7%. CONCLUSION The reviewed studies were of low quality. The findings are limited by the small size of the datasets. Comparison between studies on the same topic was restricted due to algorithm and dataset heterogeneity. There is no proof that AI can currently identify the critical tasks of RAS operations, which determine patient outcome. There is an urgent need for studies on large datasets and external validation of the AI algorithms used. Furthermore, the results should be transparent and meaningful to surgeons, enabling them to inform patients in layman's terms. REGISTRATION Review Registry Unique Identifying Number: reviewregistry1225.
Collapse
Affiliation(s)
- Andrea Moglia
- EndoCAS, Center for Computer Assisted Surgery, University of Pisa, 56124, Pisa, Italy; 1st Propaedeutic Surgical Unit, Hippocrateion Athens General Hospital, Athens Medical School, National and Kapodistrian University of Athens, Greece; MPLSC, Athens Medical School, National and Kapodistrian University of Athens, Greece; Department of Surgery, University of Washington Medical Center, Seattle, WA, United States; Scuola Superiore Sant'Anna of Pisa, 56214, Pisa, Italy; Institute for Medical Science and Technology, University of Dundee, Dundee, DD2 1FD, United Kingdom
| |
Collapse
|
44
|
Xue Y, Li Y, Liu S, Wang P, Qian X. Oriented Localization of Surgical Tools by Location Encoding. IEEE Trans Biomed Eng 2021; 69:1469-1480. [PMID: 34652994 DOI: 10.1109/tbme.2021.3120430] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
Abstract
Surgical tool localization is the foundation of a series of advanced surgical functions, e.g. image-guided surgical navigation. In precision-critical scenarios such as surgical tool localization, sophisticated tools and sensitive tissues can be very close to each other, which requires higher localization accuracy than in general object localization. It is also meaningful to know the orientation of the tools. To achieve this, this paper proposes a Compressive Sensing based Location Encoding (CSLE) scheme, which reformulates the task of surgical tool localization in pixel space as vector regression in an encoding space. With this scheme, the method can also capture the orientation of surgical tools rather than simply outputting horizontal bounding boxes. To prevent vanishing gradients, a novel back-propagation rule for sparse reconstruction is derived; it is applicable to different implementations of sparse reconstruction and renders the entire network end-to-end trainable. Finally, the proposed approach yields more accurate bounding boxes while capturing tool orientation, and achieves state-of-the-art performance compared with nine competitive oriented and non-oriented localization methods (RRD, RefineDet, etc.) on a mainstream surgical image dataset, m2cai16-tool-locations. A range of experiments supports the claim that regression in the CSLE space performs better than traditional bounding-box detection in pixel space.
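The encode/decode idea behind such a scheme can be illustrated with a minimal sketch (this is not the authors' CSLE network; the 32x32 grid and the 128 compressive measurements are assumptions for illustration): a tool-tip position on a coarse grid is written as a 1-sparse indicator vector, compressed by a random measurement matrix, and recovered by matched filtering, which suffices for this 1-sparse toy case.

import numpy as np

rng = np.random.default_rng(0)
H, W = 32, 32          # coarse localization grid (assumed size)
M = 128                # number of compressive measurements, M << H*W

# Random Gaussian measurement matrix with roughly unit-norm columns.
Phi = rng.standard_normal((M, H * W)) / np.sqrt(M)

def encode(row, col):
    """Grid location -> compressed measurement vector (the regression target)."""
    x = np.zeros(H * W)
    x[row * W + col] = 1.0
    return Phi @ x

def decode(y):
    """Compressed vector -> grid location, via correlation with the columns of Phi."""
    idx = int(np.argmax(Phi.T @ y))
    return divmod(idx, W)

y = encode(10, 21)      # the vector a network would be trained to regress
print(decode(y))        # -> (10, 21): the location survives compression

A network trained to regress such encoded vectors can then be decoded back to pixel locations by a sparse-reconstruction routine; the abstract's contribution is a back-propagation rule that makes that reconstruction step end-to-end trainable.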
Collapse
|
45
|
Li RQ, Xie XL, Zhou XH, Liu SQ, Ni ZL, Zhou YJ, Bian GB, Hou ZG. A Unified Framework for Multi-Guidewire Endpoint Localization in Fluoroscopy Images. IEEE Trans Biomed Eng 2021; 69:1406-1416. [PMID: 34613905 DOI: 10.1109/tbme.2021.3118001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]
Abstract
OBJECTIVE In this paper, Keypoint Localization Region-based CNN (KL R-CNN) is proposed, which can simultaneously accomplish guidewire detection and endpoint localization in a unified model. METHODS KL R-CNN modifies Mask R-CNN by replacing the mask branch with a novel keypoint localization branch. In addition, some settings of Mask R-CNN are modified to generate the keypoint localization results at a higher level of detail. At the same time, based on the existing metrics of Average Precision (AP) and Percentage of Correct Keypoints (PCK), a new metric named AP PCK is proposed to evaluate overall performance on the multi-guidewire endpoint localization task. Compared with existing metrics, AP PCK is easy to use and its results are more intuitive. RESULTS Compared with existing methods, KL R-CNN performs better under loose thresholds, reaching a mean AP PCK of 90.65% at a threshold of 9 pixels. CONCLUSION KL R-CNN achieves state-of-the-art performance on the multi-guidewire endpoint localization task and has application potential. SIGNIFICANCE KL R-CNN can localize guidewire endpoints in fluoroscopy images, which is a prerequisite for computer-assisted percutaneous coronary intervention. KL R-CNN can also be extended to other multi-instrument localization tasks.
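As a rough illustration of the keypoint-accuracy side of such a metric, the sketch below computes a plain PCK-style score (a simplified, assumed form covering only the PCK component; the paper's AP PCK also builds on Average Precision): a predicted endpoint counts as correct if it lies within a pixel threshold of its matched ground-truth endpoint.

import numpy as np

def pck(pred, gt, threshold_px=9.0):
    """pred, gt: (N, 2) arrays of matched endpoint coordinates in pixels."""
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    dists = np.linalg.norm(pred - gt, axis=1)   # per-endpoint pixel error
    return float(np.mean(dists <= threshold_px))

# Example: three endpoints, two of which fall within the 9-pixel threshold.
print(pck([[100, 50], [200, 80], [30, 30]],
          [[104, 53], [220, 95], [32, 28]]))    # -> 0.666...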
Collapse
|
46
|
Fu Y, Xue P, Li N, Zhao P, Xu Z, Ji H, Zhang Z, Cui W, Dong E. Fusion of 3D lung CT and serum biomarkers for diagnosis of multiple pathological types on pulmonary nodules. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2021; 210:106381. [PMID: 34496322 DOI: 10.1016/j.cmpb.2021.106381] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/10/2020] [Accepted: 08/24/2021] [Indexed: 06/13/2023]
Abstract
BACKGROUND AND OBJECTIVE Current research on pulmonary nodules has mainly focused on the binary classification of benign and malignant nodules. In clinical applications, however, it is not enough to judge whether a pulmonary nodule is benign or malignant. In this paper, we propose a fusion model based on the Lung Information Dataset Containing 3D CT Images and Serum Biomarkers (LIDCCISB) that we constructed, to accurately diagnose the pathological type of pulmonary nodules among squamous cell carcinoma, adenocarcinoma, inflammation and other benign diseases. METHODS Using the single-modality information of lung 3D CT images and the single-modality information of Lung Tumor Biomarkers (LTBs) in LIDCCISB, a Multi-resolution 3D Multi-classification deep learning model (Mr-Mc) and a Multi-Layer Perceptron machine learning model (MLP) were constructed to diagnose multiple pathological types of pulmonary nodules, respectively. To make full use of the two modalities of CT images and LTBs, we used transfer learning to fuse Mr-Mc and MLP, constructing a multimodal information fusion model that can classify multiple pathological types of benign and malignant pulmonary nodules. RESULTS Experiments showed that the Mr-Mc model achieved an average accuracy of 0.805 and the MLP model achieved an average accuracy of 0.887. The fusion model was verified on a dataset of 64 samples and achieved an average accuracy of 0.906. CONCLUSIONS This is the first study to use CT images and LTBs simultaneously to diagnose multiple pathological types of benign and malignant pulmonary nodules, and the experiments showed that our approach is more advanced and better suited to practical clinical applications.
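A late-fusion architecture of this general shape can be sketched as follows (a minimal, assumed stand-in for the paper's Mr-Mc + MLP fusion, not its actual implementation; the patch size, the six-biomarker panel and the layer widths are illustrative): a small 3D CNN embeds a CT patch, an MLP embeds the serum biomarkers, and a shared head classifies the concatenated embedding into four nodule types.

import torch
import torch.nn as nn

class FusionNet(nn.Module):
    def __init__(self, n_biomarkers=6, n_classes=4):
        super().__init__()
        self.cnn = nn.Sequential(                    # 3D CT-patch branch
            nn.Conv3d(1, 8, 3, padding=1), nn.ReLU(), nn.MaxPool3d(2),
            nn.Conv3d(8, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1), nn.Flatten(),   # -> 16-dim image embedding
        )
        self.mlp = nn.Sequential(                    # serum-biomarker branch
            nn.Linear(n_biomarkers, 16), nn.ReLU(),
        )
        self.head = nn.Linear(16 + 16, n_classes)    # classifier on fused features

    def forward(self, ct, biomarkers):
        fused = torch.cat([self.cnn(ct), self.mlp(biomarkers)], dim=1)
        return self.head(fused)

model = FusionNet()
ct = torch.randn(2, 1, 32, 32, 32)   # batch of single-channel CT patches (assumed size)
bio = torch.randn(2, 6)              # batch of biomarker panels
print(model(ct, bio).shape)          # -> torch.Size([2, 4]) class logits

In the paper the two branches are fused via transfer learning after being built on their own modalities; in a sketch like this, one would pretrain each branch separately and then freeze or fine-tune them while training the fusion head.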
Collapse
Affiliation(s)
- Yu Fu
- School of Mechanical, Electrical & Information Engineering, Shandong University, Weihai 264209, China
| | - Peng Xue
- School of Mechanical, Electrical & Information Engineering, Shandong University, Weihai 264209, China
| | - Ning Li
- Department of Radiology, Shandong Provincial Hospital Affiliated to Shandong First Medical University, Jinan 250021, China
| | - Peng Zhao
- Department of Radiology, Shandong Provincial Hospital Affiliated to Shandong First Medical University, Jinan 250021, China
| | - Zhuodong Xu
- Department of Radiology, Shandong Provincial Hospital Affiliated to Shandong First Medical University, Jinan 250021, China
| | - Huizhong Ji
- School of Mechanical, Electrical & Information Engineering, Shandong University, Weihai 264209, China
| | - Zhili Zhang
- School of Mechanical, Electrical & Information Engineering, Shandong University, Weihai 264209, China
| | - Wentao Cui
- School of Mechanical, Electrical & Information Engineering, Shandong University, Weihai 264209, China.
| | - Enqing Dong
- School of Mechanical, Electrical & Information Engineering, Shandong University, Weihai 264209, China.
| |
Collapse
|
47
|
Gómez Rivas J, Toribio Vázquez C, Ballesteros Ruiz C, Taratkin M, Marenco JL, Cacciamani GE, Checcucci E, Okhunov Z, Enikeev D, Esperto F, Grossmann R, Somani B, Veneziano D. Artificial intelligence and simulation in urology. Actas Urol Esp 2021; 45:524-529. [PMID: 34526254 DOI: 10.1016/j.acuroe.2021.07.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/12/2020] [Accepted: 10/27/2020] [Indexed: 10/20/2022]
Abstract
INTRODUCTION AND OBJECTIVE Artificial intelligence (AI) is in full development, and its implementation in medicine has led to improvements in clinical and surgical practice. One of its many applications is surgical training, with the creation of programs that help avoid complications and risks for the patient. The aim of this article is to analyze the advantages of AI applied to surgical training in urology. MATERIAL AND METHODS A literature search was carried out to identify articles published in English on AI applied to medicine, especially to surgery and the acquisition of surgical skills. RESULTS Surgical training has evolved over time thanks to AI. A model for surgical learning has been created in which skills are acquired progressively while avoiding complications for the patient. The use of simulators allows progressive learning, providing trainees with procedures that increase in number and complexity. In addition, AI is used in imaging tests for surgical or treatment planning. CONCLUSION Currently, the use of AI in daily clinical practice has led to progress in medicine, specifically in surgical training.
Collapse
Affiliation(s)
- J Gómez Rivas
- Departamento de Urología, Hospital Clínico San Carlos, Madrid, Spain; Young Academic Urologist-Urotechnology Working Party (ESUT-YAU), European Association of Urology, Arnhem, The Netherlands.
| | - C Toribio Vázquez
- Departamento de Urología, Hospital Universitario La Paz, Madrid, Spain
| | | | - M Taratkin
- Young Academic Urologist-Urotechnology Working Party (ESUT-YAU), European Association of Urology, Arnhem, The Netherlands; Institute for Urology and Reproductive Health, Sechenov University, Moscow, Russia
| | - J L Marenco
- Young Academic Urologist-Urotechnology Working Party (ESUT-YAU), European Association of Urology, Arnhem, The Netherlands; Departamento de Urología, Instituto Valenciano de Oncología, Valencia, Spain
| | - G E Cacciamani
- Young Academic Urologist-Urotechnology Working Party (ESUT-YAU), European Association of Urology, Arnhem, The Netherlands; Catherine and Joseph Aresty Department of Urology, USC Institute of Urology, Keck School of Medicine, University of Southern California, Los Angeles, CA, United States
| | - E Checcucci
- Young Academic Urologist-Urotechnology Working Party (ESUT-YAU), European Association of Urology, Arnhem, The Netherlands; Division of Urology, Department of Oncology, School of Medicine, San Luigi Hospital, University of Turin, Orbassano, Italy
| | - Z Okhunov
- Young Academic Urologist-Urotechnology Working Party (ESUT-YAU), European Association of Urology, Arnhem, The Netherlands; Department of Urology, University of California, Irvine, CA, United States
| | - D Enikeev
- Institute for Urology and Reproductive Health, Sechenov University, Moscow, Russia
| | - F Esperto
- Department of Urology, Campus Biomedico, University of Rome, Roma, Italy
| | - R Grossmann
- Young Academic Urologist-Urotechnology Working Party (ESUT-YAU), European Association of Urology, Arnhem, The Netherlands; Eastern Maine Medical Center, Bangor, ME, United States
| | - B Somani
- Department of Urology, University Hospital Southampton, Southampton, United Kingdom
| | - D Veneziano
- Young Academic Urologist-Urotechnology Working Party (ESUT-YAU), European Association of Urology, Arnhem, The Netherlands; Department of Urology and Kidney Transplant, Grande Ospedale Metropolitano, Reggio Calabria, Italy
| |
Collapse
|
48
|
Gumbs AA, Frigerio I, Spolverato G, Croner R, Illanes A, Chouillard E, Elyan E. Artificial Intelligence Surgery: How Do We Get to Autonomous Actions in Surgery? SENSORS (BASEL, SWITZERLAND) 2021; 21:5526. [PMID: 34450976 PMCID: PMC8400539 DOI: 10.3390/s21165526] [Citation(s) in RCA: 47] [Impact Index Per Article: 11.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 07/06/2021] [Revised: 08/03/2021] [Accepted: 08/11/2021] [Indexed: 12/30/2022]
Abstract
Most surgeons are skeptical as to the feasibility of autonomous actions in surgery. Interestingly, many examples of autonomous actions already exist and have been around for years. Since the beginning of this millennium, the field of artificial intelligence (AI) has grown exponentially with the development of machine learning (ML), deep learning (DL), computer vision (CV) and natural language processing (NLP). All of these facets of AI will be fundamental to the development of more autonomous actions in surgery; unfortunately, only a limited number of surgeons have or seek expertise in this rapidly evolving field. As opposed to AI in medicine, AI surgery (AIS) involves autonomous movements. Fortuitously, as the field of robotics in surgery has improved, more surgeons are becoming interested in technology and the potential of autonomous actions in procedures such as interventional radiology, endoscopy and surgery. The lack of haptics, or the sensation of touch, has hindered the wider adoption of robotics by many surgeons; however, now that the true potential of robotics can be comprehended, the embracing of AI by the surgical community is more important than ever before. Although current complete surgical systems are mainly examples of tele-manipulation, haptics is perhaps not the most important aspect for getting surgeons to more autonomously functioning robots. If the goal is for robots to ultimately become more and more independent, perhaps research should not focus on the concept of haptics as it is perceived by humans; the focus should instead be on haptics as it is perceived by robots/computers. This article discusses aspects of ML, DL, CV and NLP as they pertain to the modern practice of surgery, with a focus on current AI issues and advances that will enable us to get to more autonomous actions in surgery. Ultimately, a paradigm shift may need to occur in the surgical community, as more surgeons with expertise in AI may be needed to fully unlock the potential of AIS in a safe, efficacious and timely manner.
Collapse
Affiliation(s)
- Andrew A. Gumbs
- Centre Hospitalier Intercommunal de POISSY/SAINT-GERMAIN-EN-LAYE 10, Rue Champ de Gaillard, 78300 Poissy, France;
| | - Isabella Frigerio
- Department of Hepato-Pancreato-Biliary Surgery, Pederzoli Hospital, 37019 Peschiera del Garda, Italy;
| | - Gaya Spolverato
- Department of Surgical, Oncological and Gastroenterological Sciences, University of Padova, 35122 Padova, Italy;
| | - Roland Croner
- Department of General-, Visceral-, Vascular- and Transplantation Surgery, University of Magdeburg, Haus 60a, Leipziger Str. 44, 39120 Magdeburg, Germany;
| | - Alfredo Illanes
- INKA–Innovation Laboratory for Image Guided Therapy, Medical Faculty, Otto-von-Guericke University Magdeburg, 39120 Magdeburg, Germany;
| | - Elie Chouillard
- Centre Hospitalier Intercommunal de POISSY/SAINT-GERMAIN-EN-LAYE 10, Rue Champ de Gaillard, 78300 Poissy, France;
| | - Eyad Elyan
- School of Computing, Robert Gordon University, Aberdeen AB10 7JG, UK;
| |
Collapse
|
49
|
Li RQ, Xie XL, Zhou XH, Liu SQ, Ni ZL, Zhou YJ, Bian GB, Hou ZG. Real-Time Multi-Guidewire Endpoint Localization in Fluoroscopy Images. IEEE TRANSACTIONS ON MEDICAL IMAGING 2021; 40:2002-2014. [PMID: 33788685 DOI: 10.1109/tmi.2021.3069998] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/27/2023]
Abstract
The real-time localization of guidewire endpoints is a stepping stone to computer-assisted percutaneous coronary intervention (PCI). However, methods for multi-guidewire endpoint localization in fluoroscopy images are still scarce. In this paper, we introduce a framework for real-time multi-guidewire endpoint localization in fluoroscopy images. The framework consists of two stages: first detecting all guidewire instances in the fluoroscopy image, and then locating the endpoints of each guidewire instance. In the first stage, a YOLOv3 detector is used for guidewire detection, and a post-processing algorithm is proposed to refine the detection results. In the second stage, a Segmentation Attention-hourglass (SA-hourglass) network is proposed to predict the endpoint locations of each guidewire instance. The SA-hourglass network can be generalized to keypoint localization for other surgical instruments. In our experiments, the SA-hourglass network is applied not only to a guidewire dataset but also to a retinal microsurgery dataset, reaching a mean pixel error (MPE) of 2.20 pixels on the guidewire dataset and 5.30 pixels on the retinal microsurgery dataset, both state-of-the-art localization results. Moreover, the inference rate of our framework is at least 20 FPS, which meets the real-time requirement of fluoroscopy imaging (6-12 FPS).
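The mean pixel error reported above is commonly computed as the average Euclidean distance, in pixels, between matched predicted and ground-truth keypoints; a minimal sketch (assuming the predictions have already been matched to their ground-truth endpoints) is:

import numpy as np

def mean_pixel_error(pred, gt):
    """pred, gt: (N, 2) arrays of matched keypoint coordinates in pixels."""
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    return float(np.linalg.norm(pred - gt, axis=1).mean())

# Example: two guidewire endpoints, off by (2, 3) and (3, 4) pixels.
print(mean_pixel_error([[120, 64], [300, 210]],
                       [[122, 61], [303, 214]]))   # -> ~4.3 pixels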
Collapse
|
50
|
Gómez Rivas J, Toribio Vázquez C, Ballesteros Ruiz C, Taratkin M, Marenco JL, Cacciamani GE, Checcucci E, Okhunov Z, Enikeev D, Esperto F, Grossmann R, Somani B, Veneziano D. Artificial intelligence and simulation in urology. Actas Urol Esp 2021; 45:S0210-4806(21)00088-7. [PMID: 34127285 DOI: 10.1016/j.acuro.2020.10.012] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/12/2020] [Accepted: 10/27/2020] [Indexed: 11/28/2022]
Abstract
INTRODUCTION AND OBJECTIVE Artificial intelligence (AI) is in full development, and its implementation in medicine has led to improvements in clinical and surgical practice. One of its many applications is surgical training, with the creation of programs that help avoid complications and risks for the patient. The aim of this article is to analyze the advantages of AI applied to surgical training in urology. MATERIAL AND METHODS A literature search was carried out to identify articles published in English on AI applied to medicine, especially to surgery and the acquisition of surgical skills. RESULTS Surgical training has evolved over time thanks to AI. A model for surgical learning has been created in which skills are acquired progressively while avoiding complications for the patient. The use of simulators allows progressive learning, providing trainees with procedures that increase in number and complexity. In addition, AI is used in imaging tests for surgical or treatment planning. CONCLUSION Currently, the use of AI in daily clinical practice has led to progress in medicine, specifically in surgical training.
Collapse
Affiliation(s)
- J Gómez Rivas
- Departamento de Urología, Hospital Clínico San Carlos, Madrid, Spain; Young Academic Urologist-Urotechnology Working Party (ESUT-YAU), European Association of Urology, Arnhem, The Netherlands.
| | - C Toribio Vázquez
- Departamento de Urología, Hospital Universitario La Paz, Madrid, Spain
| | | | - M Taratkin
- Young Academic Urologist-Urotechnology Working Party (ESUT-YAU), European Association of Urology, Arnhem, The Netherlands; Institute for Urology and Reproductive Health, Sechenov University, Moscow, Russia
| | - J L Marenco
- Young Academic Urologist-Urotechnology Working Party (ESUT-YAU), European Association of Urology, Arnhem, The Netherlands; Departamento de Urología, Instituto Valenciano de Oncología, Valencia, Spain
| | - G E Cacciamani
- Young Academic Urologist-Urotechnology Working Party (ESUT-YAU), European Association of Urology, Arnhem, The Netherlands; Catherine and Joseph Aresty Department of Urology, USC Institute of Urology, Keck School of Medicine, University of Southern California, Los Angeles, California, United States
| | - E Checcucci
- Young Academic Urologist-Urotechnology Working Party (ESUT-YAU), European Association of Urology, Arnhem, The Netherlands; Division of Urology, Department of Oncology, School of Medicine, San Luigi Hospital, University of Turin, Orbassano, Italy
| | - Z Okhunov
- Young Academic Urologist-Urotechnology Working Party (ESUT-YAU), European Association of Urology, Arnhem, The Netherlands; Department of Urology, University of California, Irvine, California, United States
| | - D Enikeev
- Institute for Urology and Reproductive Health, Sechenov University, Moscow, Russia
| | - F Esperto
- Department of Urology, Campus Biomedico, University of Rome, Roma, Italy
| | - R Grossmann
- Young Academic Urologist-Urotechnology Working Party (ESUT-YAU), European Association of Urology, Arnhem, The Netherlands; Eastern Maine Medical Center, Bangor, Maine, United States
| | - B Somani
- Department of Urology, University Hospital Southampton, Southampton, United Kingdom
| | - D Veneziano
- Young Academic Urologist-Urotechnology Working Party (ESUT-YAU), European Association of Urology, Arnhem, The Netherlands; Department of Urology and Kidney Transplant, Grande Ospedale Metropolitano, Reggio Calabria, Italy
| |
Collapse
|