1
Pan X, Bi M, Wang H, Ma C, He X. DBH-YOLO: a surgical instrument detection method based on feature separation in laparoscopic surgery. Int J Comput Assist Radiol Surg 2024. [PMID: 38613730] [DOI: 10.1007/s11548-024-03115-0]
Abstract
PURPOSE Accurately locating and analysing surgical instruments in laparoscopic surgical videos can assist doctors in postoperative quality assessment and provide patients with more scientific, rational plans for managing surgical complications. We therefore propose an end-to-end algorithm for surgical instrument detection. METHODS A Dual-Branched Head (DBH) and an Overall Intersection over Union Loss (OIoU Loss) are introduced to address inaccurate surgical instrument detection in both localization and classification, yielding an effective method (DBH-YOLO) for detection in complex laparoscopic scenarios. This study also manually annotates LGIL, a new surgical instrument location dataset from laparoscopic gastric cancer resections, which provides a better validation platform for surgical instrument detection methods. RESULTS The proposed method was tested on the m2cai16-tool-locations, LGIL, and Onyeogulu datasets, achieving mean Average Precision (mAP) values of 96.8%, 95.6%, and 98.4%, respectively, higher than those of the classical models compared. The improved model distinguishes highly similar surgical instrument classes more effectively than the benchmark network and avoids excessive missed detections. CONCLUSIONS This paper addresses inaccurate surgical instrument detection from two perspectives, classification and localization, and experimental results on three representative datasets verify the performance of DBH-YOLO and show that the method generalizes well.
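The abstract does not spell out the OIoU formulation. For orientation only, below is a minimal sketch of the plain IoU-based bounding-box regression loss such detectors build on; the function name and box format are our assumptions, not the paper's.

```python
import torch

def iou_loss(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-7) -> torch.Tensor:
    """Generic IoU loss for axis-aligned boxes in (x1, y1, x2, y2) format.

    pred, target: (N, 4) tensors. Returns mean(1 - IoU) over the batch.
    NOTE: a baseline sketch, not the paper's OIoU Loss, whose exact
    formulation is not given in the abstract.
    """
    # Intersection rectangle of each predicted/target box pair
    x1 = torch.max(pred[:, 0], target[:, 0])
    y1 = torch.max(pred[:, 1], target[:, 1])
    x2 = torch.min(pred[:, 2], target[:, 2])
    y2 = torch.min(pred[:, 3], target[:, 3])
    inter = (x2 - x1).clamp(min=0) * (y2 - y1).clamp(min=0)

    # Union = sum of areas minus intersection
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    union = area_p + area_t - inter

    return (1.0 - inter / (union + eps)).mean()
```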
Affiliation(s)
- Xiaoying Pan: School of Computer Science and Technology, Xi'an University of Posts and Telecommunications, GuoDu, Xi'an 710121, Shaanxi, China
- Manrong Bi: School of Computer Science and Technology, Xi'an University of Posts and Telecommunications, GuoDu, Xi'an 710121, Shaanxi, China
- Hao Wang: School of Software, Northwestern Polytechnical University, Xi'an 710072, China
- Chenyang Ma: School of Computer Science and Technology, Xi'an University of Posts and Telecommunications, GuoDu, Xi'an 710121, Shaanxi, China
- Xianli He: Department of General Surgery, Tangdu Hospital, Air Force Medical University, Xi'an 710038, Shaanxi, China
2
Ping L, Wang Z, Yao J, Gao J, Yang S, Li J, Shi J, Wu W, Hua S, Wang H. Application and evaluation of surgical tool and tool tip recognition based on Convolutional Neural Network in multiple endoscopic surgical scenarios. Surg Endosc 2023; 37:7376-7384. [PMID: 37580576] [DOI: 10.1007/s00464-023-10323-3]
Abstract
BACKGROUND In recent years, computer-assisted intervention and robot-assisted surgery have received increasing attention, and there is constant demand for real-time identification and tracking of surgical tools and tool tips. A series of studies on surgical tool tracking and identification has been performed, but their dataset sizes, sensitivity/precision, and response times were limited. In this work, we developed an automated method based on a Convolutional Neural Network (CNN) and the You Only Look Once (YOLO) v3 algorithm to locate and identify surgical tools and tool tips across five different surgical scenarios. MATERIALS AND METHODS An object detection algorithm was applied to identify and locate the surgical tools and tool tips. DarkNet-19 was used as the backbone network, and YOLOv3 was modified and applied for detection. We included 181 endoscopy videos covering five surgical scenarios: pancreatic surgery, thyroid surgery, colon surgery, gastric surgery, and external scenes. A total of 25,333 images containing 94,463 targets were collected. Training and test sets were divided in a proportion of 2.5:1. The datasets were openly stored in the Kaggle database. RESULTS Under an Intersection over Union threshold of 0.5, the overall sensitivity and precision of the model were 93.02% and 89.61% for tool recognition and 87.05% and 83.57% for tool tip recognition, respectively. The model demonstrated its highest tool and tool tip recognition sensitivity and precision in external scenes. Among the four internal surgical scenes, the network performed better in pancreatic and colon surgeries and worse in gastric and thyroid surgeries. CONCLUSION We developed a surgical tool and tool tip recognition model based on a CNN and YOLOv3. Validation of our model demonstrated satisfactory precision, accuracy, and robustness across different surgical scenes.
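As a reading aid for the evaluation above, the sketch below shows how sensitivity (recall) and precision are typically computed from box matches at an IoU threshold of 0.5. The greedy one-to-one matching and helper names are our assumptions, not code from the paper.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-7)

def precision_recall(preds, gts, iou_thr=0.5):
    """Greedy one-to-one matching of predicted to ground-truth boxes.

    preds, gts: lists of (x1, y1, x2, y2). Returns (precision, recall)
    at the given IoU threshold, in the style of the reported evaluation.
    """
    matched, tp = set(), 0
    for p in preds:
        # Best still-unmatched ground-truth box for this prediction
        best_j, best_iou = -1, iou_thr
        for j, g in enumerate(gts):
            if j in matched:
                continue
            iou = box_iou(p, g)
            if iou >= best_iou:
                best_j, best_iou = j, iou
        if best_j >= 0:
            matched.add(best_j)
            tp += 1
    precision = tp / max(len(preds), 1)
    recall = tp / max(len(gts), 1)
    return precision, recall
```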
Affiliation(s)
- Lu Ping: 8-Year MD Program, Chinese Academy of Medical Science and Peking Union Medical College, Beijing, China; Department of General Surgery, State Key Laboratory of Complex Severe and Rare Diseases, Peking Union Medical College Hospital, Chinese Academy of Medical Science and Peking Union Medical College, Beijing, China
- Zhihong Wang: 8-Year MD Program, Chinese Academy of Medical Science and Peking Union Medical College, Beijing, China
- Jingjing Yao: Department of Nursing, Peking Union Medical College Hospital, Chinese Academy of Medical Science and Peking Union Medical College, Beijing, China
- Junyi Gao: Department of General Surgery, State Key Laboratory of Complex Severe and Rare Diseases, Peking Union Medical College Hospital, Chinese Academy of Medical Science and Peking Union Medical College, Beijing, China
- Sen Yang: Department of General Surgery, State Key Laboratory of Complex Severe and Rare Diseases, Peking Union Medical College Hospital, Chinese Academy of Medical Science and Peking Union Medical College, Beijing, China
- Jiayi Li: 8-Year MD Program, Chinese Academy of Medical Science and Peking Union Medical College, Beijing, China
- Jile Shi: 8-Year MD Program, Chinese Academy of Medical Science and Peking Union Medical College, Beijing, China
- Wenming Wu: Department of General Surgery, State Key Laboratory of Complex Severe and Rare Diseases, Peking Union Medical College Hospital, Chinese Academy of Medical Science and Peking Union Medical College, Beijing, China
- Surong Hua: Department of General Surgery, State Key Laboratory of Complex Severe and Rare Diseases, Peking Union Medical College Hospital, Chinese Academy of Medical Science and Peking Union Medical College, Beijing, China
- Huizhen Wang: Department of Nursing, Peking Union Medical College Hospital, Chinese Academy of Medical Science and Peking Union Medical College, Beijing, China
3
Reiter W. Domain generalization improves end-to-end object detection for real-time surgical tool detection. Int J Comput Assist Radiol Surg 2022; 18:939-944. [PMID: 36581742] [DOI: 10.1007/s11548-022-02823-9]
Abstract
PURPOSE Computer assistance for endoscopic surgery depends on knowledge of the contents of the endoscopic scene, and an important step in analysing video content is real-time surgical tool detection. Most tool detection methods, however, depend on multi-step algorithms built on prior knowledge such as anchor boxes or non-maximum suppression, which ultimately decreases performance. A real-world difficulty encountered by learning-based methods is limited datasets: training a neural network on data matching a specific distribution (e.g. from a single hospital or showing a specific type of surgery) can result in a lack of generalization. METHODS In this paper, we propose a transformer-based architecture for end-to-end tool detection. This architecture promises state-of-the-art accuracy while decreasing complexity, resulting in improved run-time performance. To counter the lack of cross-domain generalization due to limited datasets, we enhance the architecture with a latent feature space via variational encoding to capture common intra-domain information. This feature space models the linear dependencies between domains by constraining their rank. RESULTS The trained neural networks show a distinct improvement on out-of-domain data, indicating better generalization to unseen domains. Inference with the end-to-end architecture runs at up to 138 frames per second (FPS), a speedup over older approaches. CONCLUSIONS Experimental results on three representative datasets demonstrate the performance of the method, and we show that our approach leads to better domain generalization.
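The abstract says the latent space "models the linear dependencies between domains by constraining their rank" without giving the mechanism. One plausible soft version of such a constraint, shown purely as an illustrative assumption (not the paper's implementation), penalises the singular values of the per-domain latent means beyond a target rank k:

```python
import torch

def low_rank_domain_penalty(domain_means: torch.Tensor, k: int) -> torch.Tensor:
    """Soft rank constraint on a (num_domains, latent_dim) matrix of
    per-domain latent mean vectors: penalise singular values beyond the
    first k (a truncated nuclear norm). Illustrative only -- the paper's
    exact constraint is not specified in the abstract.
    """
    s = torch.linalg.svdvals(domain_means)  # singular values, descending
    return s[k:].sum()
```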
4
Surgical Tool Datasets for Machine Learning Research: A Survey. Int J Comput Vis 2022. [DOI: 10.1007/s11263-022-01640-6]
Abstract
This paper is a comprehensive survey of datasets for surgical tool detection and of related surgical data science and machine learning techniques and algorithms. The survey offers a high-level perspective on current research in this area, analyses the taxonomy of approaches adopted by researchers using surgical tool datasets, and addresses key areas of research, such as the datasets used, the evaluation metrics applied and the deep learning techniques utilised. Our presentation and taxonomy provide a framework that facilitates greater understanding of current work and highlights the challenges and opportunities for further innovative and useful research.
5
Sánchez-Brizuela G, Santos-Criado FJ, Sanz-Gobernado D, de la Fuente-López E, Fraile JC, Pérez-Turiel J, Cisnal A. Gauze Detection and Segmentation in Minimally Invasive Surgery Video Using Convolutional Neural Networks. Sensors (Basel) 2022; 22:5180. [PMID: 35890857] [PMCID: PMC9319965] [DOI: 10.3390/s22145180]
Abstract
Medical instrument detection in laparoscopic video has been carried out to increase the autonomy of surgical robots, evaluate skills, or index recordings; however, it has not been extended to surgical gauze. Gauze can provide valuable information for numerous tasks in the operating room, but the lack of an annotated dataset has hampered research on it. In this article, we present a segmentation dataset with 4003 hand-labelled frames from laparoscopic video. To demonstrate the dataset's potential, we analyzed several baselines: detection using YOLOv3, coarse segmentation, and segmentation with a U-Net. Our results show that YOLOv3 can be executed in real time but provides only modest recall. Coarse segmentation presents satisfactory results but lacks inference speed. Finally, the U-Net baseline achieves a good speed-quality compromise, running above 30 FPS while obtaining an IoU of 0.85. The accuracy reached by the U-Net and its execution speed demonstrate that precise, real-time gauze segmentation can be achieved by training convolutional neural networks on the proposed dataset.
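For reference, the IoU of 0.85 quoted for the U-Net is a mask-overlap score; a minimal sketch of that metric for binary masks (our helper, not the authors' code) is:

```python
import numpy as np

def mask_iou(pred: np.ndarray, gt: np.ndarray) -> float:
    """IoU between two binary segmentation masks of equal shape.

    pred, gt: boolean arrays (True = gauze pixel). This is the generic
    form of the overlap metric behind the reported IoU of 0.85.
    """
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return float(inter) / union if union else 1.0  # two empty masks agree
```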
Affiliation(s)
- Guillermo Sánchez-Brizuela: Instituto de las Tecnologías Avanzadas de la Producción (ITAP), Universidad de Valladolid, Paseo del Cauce 59, 47011 Valladolid, Spain
- Francisco-Javier Santos-Criado: Escuela Técnica Superior de Ingenieros Industriales, Universidad Politécnica de Madrid, Calle de José Gutiérrez Abascal 2, 28006 Madrid, Spain
- Daniel Sanz-Gobernado: Instituto de las Tecnologías Avanzadas de la Producción (ITAP), Universidad de Valladolid, Paseo del Cauce 59, 47011 Valladolid, Spain
- Eusebio de la Fuente-López: Instituto de las Tecnologías Avanzadas de la Producción (ITAP), Universidad de Valladolid, Paseo del Cauce 59, 47011 Valladolid, Spain
- Juan-Carlos Fraile: Instituto de las Tecnologías Avanzadas de la Producción (ITAP), Universidad de Valladolid, Paseo del Cauce 59, 47011 Valladolid, Spain
- Javier Pérez-Turiel: Instituto de las Tecnologías Avanzadas de la Producción (ITAP), Universidad de Valladolid, Paseo del Cauce 59, 47011 Valladolid, Spain
- Ana Cisnal: Instituto de las Tecnologías Avanzadas de la Producción (ITAP), Universidad de Valladolid, Paseo del Cauce 59, 47011 Valladolid, Spain
6
Video-based fully automatic assessment of open surgery suturing skills. Int J Comput Assist Radiol Surg 2022; 17:437-448. [PMID: 35103921] [PMCID: PMC8805431] [DOI: 10.1007/s11548-022-02559-6]
Abstract
Purpose The goal of this study was to develop a new, reliable open surgery suturing simulation system for training medical students where resources are limited or in a domestic setting. Specifically, we developed an algorithm for localizing tools and hands and identifying the interactions between them from simple webcam video, calculating motion metrics for assessment of surgical skill. Methods Twenty-five participants performed multiple suturing tasks using our simulator. The YOLO network was modified into a multi-task network for tool localization and tool-hand interaction detection. This was accomplished by splitting the YOLO detection heads so that they supported both tasks with minimal addition to computer run-time. Based on the system's output, motion metrics were calculated, including traditional metrics such as time and path length as well as new metrics assessing how participants hold the tools. Results The dual-task network performed similarly to two separate networks, while its computational load was only slightly greater than that of one network. In addition, the motion metrics showed significant differences between experts and novices. Conclusion While video capture is an essential part of minimally invasive surgery, it is not an integral component of open surgery. Thus, new algorithms focusing on the unique challenges that open surgery videos present are required. In this study, a dual-task network was developed to solve both a localization task and a hand-tool interaction task. The dual network may easily be expanded to a multi-task network, which may be useful for images with multiple layers and for evaluating the interactions between those layers.
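Among the motion metrics mentioned, path length is the classic one. Below is a minimal sketch under the assumption that the system outputs per-frame tool coordinates; the helper name and units are ours, not the paper's.

```python
import numpy as np

def path_length(tip_xy: np.ndarray) -> float:
    """Total path length of a tracked tool over a clip.

    tip_xy: (T, 2) array of per-frame tool coordinates in pixels (or
    millimetres after calibration). Sums the Euclidean step between
    consecutive frames -- the traditional motion metric cited above.
    """
    steps = np.diff(tip_xy, axis=0)
    return float(np.linalg.norm(steps, axis=1).sum())
```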
7
Koskinen J, Torkamani-Azar M, Hussein A, Huotarinen A, Bednarik R. Automated tool detection with deep learning for monitoring kinematics and eye-hand coordination in microsurgery. Comput Biol Med 2021; 141:105121. [PMID: 34968859] [DOI: 10.1016/j.compbiomed.2021.105121]
Abstract
In microsurgical procedures, surgeons use micro-instruments under high magnification to handle delicate tissues. These procedures require highly skilled attentional and motor control for planning and implementing eye-hand coordination strategies. Eye-hand coordination in surgery has mostly been studied in open, laparoscopic, and robot-assisted surgeries, as no tools have been available for automatic tool detection in microsurgery. We introduce and investigate a method for simultaneous detection and processing of micro-instruments and gaze during microsurgery. We train and evaluate a convolutional neural network for detecting 17 microsurgical tools on a dataset of 7500 frames from 20 videos of simulated and real surgical procedures. Model evaluation yields a mean average precision at the 0.5 threshold of 89.5-91.4% for validation and 69.7-73.2% for testing on partially unseen surgical settings, with an average inference speed of 39.90 ± 1.2 frames per second. While prior research has mostly evaluated surgical tool detection on homogeneous datasets with a limited number of tools, we demonstrate the feasibility of transfer learning and conclude that detectors that generalize reliably to new settings require data from several different surgical procedures. In a case study, we apply the detector together with a microscope eye tracker to investigate tool use and eye-hand coordination during an intracranial vessel dissection task. The results show that tool kinematics differentiate microsurgical actions. Gaze-to-microscissors distances are also smaller during dissection than during other actions, when the surgeon has more space to maneuver. The presented detection pipeline provides the clinical and research communities with a valuable resource for automatic content extraction and objective skill assessment in various microsurgical environments.
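The gaze-to-instrument comparison implies a per-frame distance series between the tracked gaze point and a detected tool position. A minimal sketch of that computation, assuming both signals are available in the same image coordinates (our helper, not the authors' pipeline), is:

```python
import numpy as np

def gaze_tool_distance(gaze_xy: np.ndarray, tool_xy: np.ndarray) -> np.ndarray:
    """Per-frame Euclidean distance between gaze point and tool position.

    gaze_xy, tool_xy: (T, 2) arrays in the same image coordinate frame.
    A series like this underlies the reported gaze-to-microscissors
    comparison across surgical actions.
    """
    return np.linalg.norm(gaze_xy - tool_xy, axis=1)
```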
Affiliation(s)
- Jani Koskinen: School of Computing, University of Eastern Finland, Länsikatu 15, 80100 Joensuu, Pohjois-Karjala, Finland
- Mastaneh Torkamani-Azar: School of Computing, University of Eastern Finland, Länsikatu 15, 80100 Joensuu, Pohjois-Karjala, Finland
- Ahmed Hussein: Microsurgery Center, Kuopio University Hospital, 70211 Kuopio, Pohjois-Savo, Finland; Department of Neurosurgery, Faculty of Medicine, Assiut University, 71111 Assiut, Egypt
- Antti Huotarinen: Microsurgery Center, Kuopio University Hospital, 70211 Kuopio, Pohjois-Savo, Finland; Department of Neurosurgery, Institute of Clinical Medicine, Kuopio University Hospital, 70211 Kuopio, Pohjois-Savo, Finland
- Roman Bednarik: School of Computing, University of Eastern Finland, Länsikatu 15, 80100 Joensuu, Pohjois-Karjala, Finland
8
Using deep learning to identify the recurrent laryngeal nerve during thyroidectomy. Sci Rep 2021; 11:14306. [PMID: 34253767] [PMCID: PMC8275665] [DOI: 10.1038/s41598-021-93202-y]
Abstract
Surgeons must visually distinguish soft tissues, such as nerves, from the surrounding anatomy to prevent complications and optimize patient outcomes. An accurate nerve segmentation and analysis tool could provide useful insight for surgical decision-making. Here, we present an end-to-end, automatic deep learning computer vision algorithm to segment and measure nerves. Unlike traditional medical imaging, our unconstrained setup with accessible handheld digital cameras, along with the unstructured open surgery scene, makes this task uniquely challenging. We investigate one common procedure, thyroidectomy, during which surgeons must avoid damaging the recurrent laryngeal nerve (RLN), which is responsible for human speech. We evaluate our segmentation algorithm on a diverse dataset across varied and challenging operating room image capture settings, and show strong segmentation performance in the optimal image capture condition. This work lays the foundation for future research in real-time tissue discrimination and the integration of accessible, intelligent tools into open surgery to provide actionable insights.
9
Bamba Y, Ogawa S, Itabashi M, Shindo H, Kameoka S, Okamoto T, Yamamoto M. Object and anatomical feature recognition in surgical video images based on a convolutional neural network. Int J Comput Assist Radiol Surg 2021; 16:2045-2054. [PMID: 34169465] [PMCID: PMC8224261] [DOI: 10.1007/s11548-021-02434-w]
Abstract
Purpose Artificial intelligence-enabled techniques can process large amounts of surgical data and may be utilized for clinical decision support to recognize or forecast adverse events in an actual intraoperative scenario. To develop an image-guided navigation technology that will help in surgical education, we explored the performance of a convolutional neural network (CNN)-based computer vision system in detecting intraoperative objects. Methods The surgical videos used for annotation were recorded during surgeries conducted in the Department of Surgery of Tokyo Women's Medical University from 2019 to 2020. Abdominal endoscopic images were cut from manually captured surgical videos. An open-source programming framework for CNNs was used to design a model that could recognize and segment objects in real time through IBM Visual Insights. The model was used to detect the GI tract, blood, vessels, the uterus, forceps, ports, gauze and clips in the surgical images. Results The accuracy, precision and recall of the model were 83%, 80% and 92%, respectively. The mean average precision (mAP), calculated as the mean of the precision for each object, was 91%. Among surgical tools, the highest recall and precision, 96.3% and 97.9% respectively, were achieved for forceps. Among the anatomical structures, the highest recall and precision, 92.9% and 91.3% respectively, were achieved for the GI tract. Conclusion The proposed model could detect objects in operative images with high accuracy, highlighting the possibility of using AI-based object recognition for intraoperative navigation. Real-time object recognition will play a major role in navigation surgery and surgical education.
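Note the non-standard definition: here mAP is the plain mean of per-class precision rather than the area under per-class precision-recall curves. A sketch of that calculation, with hypothetical true/false-positive count inputs (the paper obtains these from its detection system), is:

```python
def mean_precision(per_class_tp: dict, per_class_fp: dict) -> float:
    """Mean of per-class precision, matching the abstract's definition
    of mAP. Inputs are hypothetical {class_name: count} dictionaries of
    true and false positives per detected object class.
    """
    precisions = []
    for cls, tp in per_class_tp.items():
        fp = per_class_fp.get(cls, 0)
        if tp + fp > 0:
            precisions.append(tp / (tp + fp))
    return sum(precisions) / len(precisions) if precisions else 0.0
```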
Affiliation(s)
- Yoshiko Bamba: Department of Surgery, Institute of Gastroenterology, Tokyo Women's Medical University, 8-1 Kawadacho, Shinjuku-ku, Tokyo 162-8666, Japan
- Shimpei Ogawa: Department of Surgery, Institute of Gastroenterology, Tokyo Women's Medical University, 8-1 Kawadacho, Shinjuku-ku, Tokyo 162-8666, Japan
- Michio Itabashi: Department of Surgery, Institute of Gastroenterology, Tokyo Women's Medical University, 8-1 Kawadacho, Shinjuku-ku, Tokyo 162-8666, Japan
- Takahiro Okamoto: Department of Breast Endocrinology Surgery, Tokyo Women's Medical University, Tokyo, Japan
- Masakazu Yamamoto: Department of Surgery, Institute of Gastroenterology, Tokyo Women's Medical University, 8-1 Kawadacho, Shinjuku-ku, Tokyo 162-8666, Japan
10
Cho SM, Kim YG, Jeong J, Kim I, Lee HJ, Kim N. Automatic tip detection of surgical instruments in biportal endoscopic spine surgery. Comput Biol Med 2021; 133:104384. [PMID: 33864974] [DOI: 10.1016/j.compbiomed.2021.104384]
Abstract
BACKGROUND Recent advances in robotics and deep learning can be applied in endoscopic surgeries and can provide numerous advantages by freeing one of the surgeon's hands. This study aims to automatically detect the tip of the instrument, localize it as a point, and evaluate detection accuracy in biportal endoscopic spine surgery (BESS). Tip detection could serve as a preliminary study for developing vision intelligence in robotic endoscopy. METHODS The dataset contains 2310 frames from 9 BESS videos, with the x and y coordinates of the tip annotated by an expert. We trained two state-of-the-art detectors, RetinaNet and YOLOv2, with bounding boxes centered on the tip annotations at specific margin sizes to determine the optimal margin for detecting the tip of the instrument and localizing the point. We calculated recall, precision, and F1-score using a fixed box size for both the ground-truth tip coordinates and the predicted midpoints to compare models trained with different margin sizes. RESULTS For RetinaNet, a margin size of 150 pixels was optimal, with a recall of 1.000, precision of 0.733, and F1-score of 0.846. For YOLOv2, a margin size of 150 pixels was also optimal, with a recall of 0.864, precision of 0.808, and F1-score of 0.835. The optimal 150-pixel margin for RetinaNet was then used to cross-validate its overall robustness; the resulting mean recall, precision, and F1-score were 1.000 ± 0.000, 0.767 ± 0.033, and 0.868 ± 0.022, respectively. CONCLUSIONS In this study, we evaluated an automatic tip detection method for surgical instruments in endoscopic surgery, compared two state-of-the-art detection algorithms, RetinaNet and YOLOv2, and validated robustness with cross-validation. This method can be applied to tip detection in different types of endoscopy.
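The training setup turns point annotations into boxes and reads predictions back as points. A minimal sketch of both directions is below; it assumes the 150-pixel margin is the half-side of the box, which the abstract does not state, and the helper names are ours.

```python
def tip_to_box(x: float, y: float, margin: int = 150):
    """Box centred on an annotated tip point, as used to build training
    targets. Assumes 'margin' is the half-side of the square box; the
    abstract does not say whether 150 px is the half-side or full side.
    """
    return (x - margin, y - margin, x + margin, y + margin)

def box_midpoint(box):
    """Tip estimate recovered from a detection: the box midpoint, which
    is then scored against the ground-truth tip coordinates.
    """
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)
```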
Affiliation(s)
- Sue Min Cho: Department of Convergence Medicine, Biomedical Engineering Research Center, University of Ulsan College of Medicine, Asan Medical Center, Seoul, South Korea; Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, United States
- Young-Gon Kim: Department of Convergence Medicine, Biomedical Engineering Research Center, University of Ulsan College of Medicine, Asan Medical Center, Seoul, South Korea
- Jinhoon Jeong: Department of Convergence Medicine, Biomedical Engineering Research Center, University of Ulsan College of Medicine, Asan Medical Center, Seoul, South Korea
- Inhwan Kim: Department of Convergence Medicine, Biomedical Engineering Research Center, University of Ulsan College of Medicine, Asan Medical Center, Seoul, South Korea
- Ho-Jin Lee: Department of Orthopaedic Surgery, Chungnam National University School of Medicine, Seoul, South Korea
- Namkug Kim: Department of Convergence Medicine, Biomedical Engineering Research Center, University of Ulsan College of Medicine, Asan Medical Center, Seoul, South Korea
11
Kanakatte A, Ramaswamy A, Gubbi J, Ghose A, Purushothaman B. Surgical tool segmentation and localization using spatio-temporal deep network. Annu Int Conf IEEE Eng Med Biol Soc 2020; 2020:1658-1661. [PMID: 33018314] [DOI: 10.1109/embc44109.2020.9176676]
Abstract
Laparoscopic cholecystectomy is a minimally invasive surgery to remove the gallbladder, in which surgical instruments are inserted through small incisions in the abdomen with the help of a laparoscope. Identifying tool presence and precisely segmenting tools in video are very important for understanding the quality of the surgery and for training budding surgeons, and precise segmentation is required to track tools during real-time surgeries. In this paper, a new pixel-wise instance segmentation algorithm is proposed that segments and localizes surgical tools using a spatio-temporal deep network. The performance of the proposed method has been compared with a state-of-the-art image-based instance segmentation method on the Cholec80 dataset, and with methods in the literature using frame-level presence detection and spatial detection, with good results.
12
Yang C, Zhao Z, Hu S. Image-based laparoscopic tool detection and tracking using convolutional neural networks: a review of the literature. Comput Assist Surg (Abingdon) 2020; 25:15-28. [PMID: 32886540] [DOI: 10.1080/24699322.2020.1801842]
Abstract
Intraoperative detection and tracking of minimally invasive instruments is a prerequisite for computer- and robot-assisted surgery. Since additional hardware, such as tracking systems or robot encoders, is cumbersome and lacks accuracy, surgical vision is evolving as a promising technique for detecting and tracking the instruments using only endoscopic images. The present paper reviews the literature on image-based laparoscopic tool detection and tracking using convolutional neural networks (CNNs) and consists of four primary parts: (1) fundamentals of CNNs; (2) public datasets; (3) CNN-based methods for the detection and tracking of laparoscopic instruments; and (4) discussion and conclusion. To help researchers quickly understand the various existing CNN-based algorithms, basic information and quantitative estimates of several performance measures are analyzed and compared from the perspective of 'partial CNN approaches' and 'full CNN approaches'. Moreover, we highlight the challenges in research on CNN-based detection algorithms and suggest possible future directions.
Affiliation(s)
- Congmin Yang: School of Control Science and Engineering, Shandong University, Jinan, China
- Zijian Zhao: School of Control Science and Engineering, Shandong University, Jinan, China
- Sanyuan Hu: Department of General Surgery, First Affiliated Hospital of Shandong First Medical University, Jinan, China
13
Tanzi L, Piazzolla P, Vezzetti E. Intraoperative surgery room management: A deep learning perspective. Int J Med Robot 2020; 16:1-12. [PMID: 32510857] [DOI: 10.1002/rcs.2136]
Abstract
PURPOSE The current study aimed to systematically review the literature on the use of deep learning (DL) methods in intraoperative surgery applications, focusing on data collection, the objectives of these tools and, more technically, the DL-based paradigms utilized. METHODS A literature search of standard databases was performed: using specific keywords, we identified a total of 996 papers, from which we selected 52 for detailed analysis, focusing on articles published after January 2015. RESULTS The preliminary results of implementing DL in the clinical setting are encouraging. Almost all surgical sub-fields have seen the advent of artificial intelligence (AI) applications, and the results outperformed previous techniques in the majority of cases. From these results, a conceptualization of an intelligent operating room (IOR) is also presented. CONCLUSION This evaluation outlined how AI and, in particular, DL are revolutionizing the surgical field, with numerous applications such as context detection and room management. This process is evolving year by year towards the realization of an IOR equipped with technologies perfectly suited to drastically improving the surgical workflow.
14
Development of a Deep Learning-Based Algorithm to Detect the Distal End of a Surgical Instrument. Appl Sci (Basel) 2020. [DOI: 10.3390/app10124245]
Abstract
This work aims to develop an algorithm that detects the distal end of a surgical instrument using deep learning-based object detection. We employed nine video recordings of carotid endarterectomies for training and testing. We obtained regions of interest (ROIs; 32 × 32 pixels) at the end of the surgical instrument in the video images as supervised data and applied data augmentation to these ROIs. A You Only Look Once version 2 (YOLOv2)-based convolutional neural network was employed as the network model for training, and the detectors were validated to evaluate average detection precision. The proposed algorithm used the central coordinates of the bounding boxes predicted by YOLOv2, and we calculated the detection rate on the test data. The average precision (AP) for the ROIs without data augmentation was 0.4272 ± 0.108; with data augmentation it was 0.7718 ± 0.0824, significantly higher. The detection rates for calculated center-point coordinates falling within 8 × 8 pixel and 16 × 16 pixel centers were 0.6100 ± 0.1014 and 0.9653 ± 0.0177, respectively. We expect that the proposed algorithm will be efficient for the analysis of surgical records.
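The detection rate reported above counts a frame as a hit when the predicted center falls inside a small square (8 × 8 or 16 × 16 pixels) centered on the ground-truth point. A minimal sketch of that check (our helper names, not the paper's code) is:

```python
def is_hit(pred_xy, true_xy, window: int = 16) -> bool:
    """True if the predicted center lies inside a window x window pixel
    square centered on the ground-truth tip coordinate.
    """
    half = window / 2.0
    return (abs(pred_xy[0] - true_xy[0]) <= half
            and abs(pred_xy[1] - true_xy[1]) <= half)

def detection_rate(preds, truths, window: int = 16) -> float:
    """Fraction of frames whose predicted center is a hit."""
    hits = sum(is_hit(p, t, window) for p, t in zip(preds, truths))
    return hits / len(truths)
```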