1
Ghamsarian N, El-Shabrawi Y, Nasirihaghighi S, Putzgruber-Adamitsch D, Zinkernagel M, Wolf S, Schoeffmann K, Sznitman R. Cataract-1K Dataset for Deep-Learning-Assisted Analysis of Cataract Surgery Videos. Sci Data 2024; 11:373. [PMID: 38609405] [PMCID: PMC11014927] [DOI: 10.1038/s41597-024-03193-4]
Abstract
In recent years, the landscape of computer-assisted interventions and post-operative surgical video analysis has been dramatically reshaped by deep-learning techniques, resulting in significant advancements in surgeons' skills, operation room management, and overall surgical outcomes. However, the progression of deep-learning-powered surgical technologies is profoundly reliant on large-scale datasets and annotations. In particular, surgical scene understanding and phase recognition stand as pivotal pillars within the realm of computer-assisted surgery and post-operative assessment of cataract surgery videos. In this context, we present the largest cataract surgery video dataset that addresses diverse requisites for constructing computerized surgical workflow analysis and detecting post-operative irregularities in cataract surgery. We validate the quality of annotations by benchmarking the performance of several state-of-the-art neural network architectures for phase recognition and surgical scene segmentation. Besides, we initiate the research on domain adaptation for instrument segmentation in cataract surgery by evaluating cross-domain instrument segmentation performance in cataract surgery videos. The dataset and annotations are publicly available in Synapse.
Affiliation(s)
- Negin Ghamsarian
- Center for Artificial Intelligence in Medicine (CAIM), Department of Medicine, University of Bern, Bern, Switzerland
- Yosuf El-Shabrawi
- Department of Ophthalmology, Klinikum Klagenfurt, Klagenfurt, Austria
- Sahar Nasirihaghighi
- Department of Information Technology, University of Klagenfurt, Klagenfurt, Austria
- Sebastian Wolf
- Department of Ophthalmology, Inselspital, Bern, Switzerland
- Klaus Schoeffmann
- Department of Information Technology, University of Klagenfurt, Klagenfurt, Austria
- Raphael Sznitman
- Center for Artificial Intelligence in Medicine (CAIM), Department of Medicine, University of Bern, Bern, Switzerland
2
Müller S, Jain M, Sachdeva B, Shah PN, Holz FG, Finger RP, Murali K, Wintergerst MWM, Schultz T. Artificial Intelligence in Cataract Surgery: A Systematic Review. Transl Vis Sci Technol 2024; 13:20. [PMID: 38618893] [PMCID: PMC11033603] [DOI: 10.1167/tvst.13.4.20]
Abstract
Purpose The purpose of this study was to assess the current use and reliability of artificial intelligence (AI)-based algorithms for analyzing cataract surgery videos. Methods A systematic review of the literature on intra-operative analysis of cataract surgery videos with machine learning techniques was performed. Cataract diagnosis and detection algorithms were excluded. Resulting algorithms were compared, descriptively analyzed, and metrics summarized or visually reported. The reproducibility and reliability of the methods and results were assessed using a modified version of the Medical Image Computing and Computer-Assisted Intervention (MICCAI) checklist. Results Thirty-eight of the 550 screened studies were included: 20 addressed the challenge of instrument detection or tracking, 9 focused on phase discrimination, and 8 predicted skill and complications. Instrument detection achieves an area under the receiver operating characteristic curve (ROC AUC) between 0.976 and 0.998, instrument tracking an mAP between 0.685 and 0.929, phase recognition an ROC AUC between 0.773 and 0.990, and complication or surgical skill prediction an ROC AUC between 0.570 and 0.970. Conclusions The studies showed wide variation in quality and pose a challenge to replication because of the small number of public datasets (none for manual small-incision cataract surgery) and seldom published source code. There is no standard for reported outcome metrics, and validation of the models on external datasets is rare, making comparisons difficult. The data suggest that instrument tracking and phase detection work well, but surgical skill and complication recognition remain a challenge for deep learning. Translational Relevance This overview of cataract surgery analysis with AI models provides translational value for improving clinician training by identifying successes and challenges.
Affiliation(s)
- Simon Müller
- University Hospital Bonn, Department of Ophthalmology, Bonn, Germany
- Bhuvan Sachdeva
- Microsoft Research, Bengaluru, India
- Sankara Eye Hospital, Bengaluru, Karnataka, India
- Frank G. Holz
- University Hospital Bonn, Department of Ophthalmology, Bonn, Germany
- Robert P. Finger
- University Hospital Bonn, Department of Ophthalmology, Bonn, Germany
- Department of Ophthalmology, University Medical Center Mannheim, Heidelberg University, Mannheim, Germany
- Thomas Schultz
- B-IT and Department of Computer Science, University of Bonn, Bonn, Germany
- Lamarr Institute for Machine Learning and Artificial Intelligence, Dortmund, Germany
3
Kostiuchik G, Sharan L, Mayer B, Wolf I, Preim B, Engelhardt S. Surgical phase and instrument recognition: how to identify appropriate dataset splits. Int J Comput Assist Radiol Surg 2024. [PMID: 38285380] [DOI: 10.1007/s11548-024-03063-9]
Abstract
PURPOSE Machine learning approaches can only be reliably evaluated if training, validation, and test data splits are representative and not affected by the absence of classes. Surgical workflow and instrument recognition are two tasks that are complicated in this manner, because of heavy data imbalances resulting from different length of phases and their potential erratic occurrences. Furthermore, sub-properties like instrument (co-)occurrence are usually not particularly considered when defining the split. METHODS We present a publicly available data visualization tool that enables interactive exploration of dataset partitions for surgical phase and instrument recognition. The application focuses on the visualization of the occurrence of phases, phase transitions, instruments, and instrument combinations across sets. Particularly, it facilitates assessment of dataset splits, especially regarding identification of sub-optimal dataset splits. RESULTS We performed analysis of the datasets Cholec80, CATARACTS, CaDIS, M2CAI-workflow, and M2CAI-tool using the proposed application. We were able to uncover phase transitions, individual instruments, and combinations of surgical instruments that were not represented in one of the sets. Addressing these issues, we identify possible improvements in the splits using our tool. A user study with ten participants demonstrated that the participants were able to successfully solve a selection of data exploration tasks. CONCLUSION In highly unbalanced class distributions, special care should be taken with respect to the selection of an appropriate dataset split because it can greatly influence the assessments of machine learning approaches. Our interactive tool allows for determination of better splits to improve current practices in the field. The live application is available at https://cardio-ai.github.io/endovis-ml/ .
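The kind of split check the tool supports can also be scripted directly. The following is a minimal sketch (function and variable names are illustrative, not part of the published tool) that flags phase transitions occurring in a test split but never in the training split:

```python
def phase_transitions(videos):
    """Collect (phase, next_phase) transitions from frame-wise label sequences.

    `videos` maps a video id to its ordered list of frame-level phase labels.
    """
    transitions = set()
    for labels in videos.values():
        for current, following in zip(labels, labels[1:]):
            if current != following:
                transitions.add((current, following))
    return transitions


def missing_test_transitions(train_videos, test_videos):
    """Transitions present in the test split but absent from the training split."""
    return phase_transitions(test_videos) - phase_transitions(train_videos)


# Toy example with hypothetical phase labels:
train = {"video_01": ["Incision", "Incision", "Rhexis", "Rhexis"]}
test = {"video_02": ["Incision", "Hydrodissection", "Rhexis"]}
print(missing_test_transitions(train, test))
# transitions unseen during training, e.g. ('Incision', 'Hydrodissection')
```

The same pattern extends to instruments and instrument co-occurrences by swapping the transition set for sets of per-frame instrument combinations.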
Affiliation(s)
- Georgii Kostiuchik
- Department of Cardiac Surgery, Heidelberg University Hospital, Heidelberg, Germany.
- DZHK (German Centre for Cardiovascular Research), Partner Site Heidelberg/Mannheim, Heidelberg, Germany.
- Lalith Sharan
- Department of Cardiac Surgery, Heidelberg University Hospital, Heidelberg, Germany
- DZHK (German Centre for Cardiovascular Research), Partner Site Heidelberg/Mannheim, Heidelberg, Germany
- Benedikt Mayer
- Department of Simulation and Graphics, University of Magdeburg, Magdeburg, Germany
- Ivo Wolf
- Department of Computer Science, Mannheim University of Applied Sciences, Mannheim, Germany
- Bernhard Preim
- Department of Simulation and Graphics, University of Magdeburg, Magdeburg, Germany
- Sandy Engelhardt
- Department of Cardiac Surgery, Heidelberg University Hospital, Heidelberg, Germany
- DZHK (German Centre for Cardiovascular Research), Partner Site Heidelberg/Mannheim, Heidelberg, Germany
4
Mahmoud O, Zhang H, Matton N, Mian SI, Tannen B, Nallasamy N. CatStep: Automated Cataract Surgical Phase Classification and Boundary Segmentation Leveraging Inflated 3D-Convolutional Neural Network Architectures and BigCat. Ophthalmol Sci 2024; 4:100405. [PMID: 38054105] [PMCID: PMC10694765] [DOI: 10.1016/j.xops.2023.100405]
Abstract
Objective Accurate identification of surgical phases during cataract surgery is essential for improving surgical feedback and performance analysis. Time spent in each surgical phase is an indicator of performance, and segmenting out specific phases for further analysis can simplify providing both qualitative and quantitative feedback on surgical maneuvers. Study Design Retrospective surgical video analysis. Subjects One hundred ninety cataract surgical videos from the BigCat dataset (comprising nearly 4 million frames, each labeled with 1 of 11 nonoverlapping surgical phases). Methods Four machine learning architectures were developed for segmentation of surgical phases. Models were trained using cataract surgical videos from the BigCat dataset. Main Outcome Measures Models were evaluated using metrics applied to frame-by-frame output and, uniquely in this work, metrics applied to phase output. Results The final model, CatStep, a combination of a temporally sensitive model (Inflated 3D Densenet) and a spatially sensitive model (Densenet169), achieved an F1-score of 0.91 and area under the receiver operating characteristic curve of 0.95. Phase-level metrics showed considerable boundary segmentation performance with a median absolute error of phase start and end time of just 0.3 seconds and 0.1 seconds, respectively, a segmental F1-score @70 of 0.94, an oversegmentation score of 0.89, and a segmental edit score of 0.92. Conclusion This study demonstrates the feasibility of high-performance automated surgical phase identification for cataract surgery and highlights the potential for improved surgical feedback and performance analysis. Financial Disclosures Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.
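The phase-level boundary metrics quoted above (absolute start/end-time error) can be derived from frame-wise predictions. A minimal sketch, assuming each phase occurs once per video and a known frame rate; the function names are illustrative, not taken from the paper:

```python
import numpy as np


def phase_boundaries(labels):
    """Map each phase label to its (first_frame, last_frame) indices."""
    bounds = {}
    for i, phase in enumerate(labels):
        start, _ = bounds.get(phase, (i, i))
        bounds[phase] = (start, i)
    return bounds


def median_boundary_error_seconds(pred, gt, fps=30.0):
    """Median absolute start/end error (seconds) over phases present in both sequences."""
    pb, gb = phase_boundaries(pred), phase_boundaries(gt)
    errors = [(abs(pb[p][0] - gb[p][0]) / fps, abs(pb[p][1] - gb[p][1]) / fps)
              for p in set(pb) & set(gb)]
    return np.median(errors, axis=0)  # [median start error, median end error]
```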
Affiliation(s)
- Ossama Mahmoud
- Department of Ophthalmology and Visual Sciences, Kellogg Eye Center, University of Michigan, Ann Arbor, Michigan
- School of Medicine, Wayne State University, Detroit, Michigan
- Han Zhang
- Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, Michigan
- Nicholas Matton
- Department of Computer Science, University of Michigan, Ann Arbor, Michigan
- Shahzad I. Mian
- Department of Ophthalmology and Visual Sciences, Kellogg Eye Center, University of Michigan, Ann Arbor, Michigan
- Bradford Tannen
- Department of Ophthalmology and Visual Sciences, Kellogg Eye Center, University of Michigan, Ann Arbor, Michigan
- Nambi Nallasamy
- Department of Ophthalmology and Visual Sciences, Kellogg Eye Center, University of Michigan, Ann Arbor, Michigan
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan
5
Casas-Yrurzum S, Gimeno J, Casanova-Salas P, García-Pereira I, García del Olmo E, Salvador A, Guijarro R, Zaragoza C, Fernández M. A new mixed reality tool for training in minimally invasive robotic-assisted surgery. Health Inf Sci Syst 2023; 11:34. [PMID: 37545486] [PMCID: PMC10397172] [DOI: 10.1007/s13755-023-00238-7]
Abstract
Robotic-assisted surgery (RAS) is playing an increasing role in surgical practice. Therefore, it is of the utmost importance to introduce this paradigm into surgical training programs. However, the steep learning curve of RAS remains a problem that hinders the development and widespread use of this surgical paradigm. For this reason, it is important to be able to train surgeons in the use of RAS procedures. RAS involves distinctive features that make its learning different from other minimally invasive surgical procedures. One of these features is that the surgeons operate using a stereoscopic console. Therefore, it is necessary to perform RAS training stereoscopically. This article presents a mixed-reality (MR) tool for the stereoscopic visualization, annotation, and collaborative display of RAS surgical procedures. The tool is an MR application because it can display real stereoscopic content and augment it with virtual elements (annotations) properly registered in 3D and tracked over time. This new tool allows the registration of surgical procedures, teachers (experts), and students (trainees), so that the teacher can share a set of videos with their students, annotate them with virtual information, and use a shared virtual pointer with the students. The students can visualize the videos within a web environment using their personal mobile phones or a desktop stereo system. The use of the tool was assessed by a group of 15 surgeons during a robotic-surgery master's course. The results show that surgeons consider this tool potentially very useful for RAS training.
Affiliation(s)
- Sergio Casas-Yrurzum
- Institute of Robotics and Information Technology and Communication (IRTIC), University of Valencia, Valencia, Spain
- Jesús Gimeno
- Institute of Robotics and Information Technology and Communication (IRTIC), University of Valencia, Valencia, Spain
- Pablo Casanova-Salas
- Institute of Robotics and Information Technology and Communication (IRTIC), University of Valencia, Valencia, Spain
- Inma García-Pereira
- Institute of Robotics and Information Technology and Communication (IRTIC), University of Valencia, Valencia, Spain
- Eva García del Olmo
- General and Gastrointestinal Surgery, Fundación Investigación Consorcio Hospital General Universitario de Valencia (FIHGUV), Valencia, Spain
- Antonio Salvador
- General and Gastrointestinal Surgery, Fundación Investigación Consorcio Hospital General Universitario de Valencia (FIHGUV), Valencia, Spain
- Ricardo Guijarro
- Thoracic Surgery, Fundación Investigación Consorcio Hospital General Universitario de Valencia (FIHGUV), Valencia, Spain
- Cristóbal Zaragoza
- General and Gastrointestinal Surgery, Fundación Investigación Consorcio Hospital General Universitario de Valencia (FIHGUV), Valencia, Spain
- Marcos Fernández
- Institute of Robotics and Information Technology and Communication (IRTIC), University of Valencia, Valencia, Spain
6
Demir KC, Schieber H, Weise T, Roth D, May M, Maier A, Yang SH. Deep Learning in Surgical Workflow Analysis: A Review of Phase and Step Recognition. IEEE J Biomed Health Inform 2023; 27:5405-5417. [PMID: 37665700] [DOI: 10.1109/jbhi.2023.3311628]
Abstract
OBJECTIVE In the last two decades, there has been a growing interest in exploring surgical procedures with statistical models to analyze operations at different semantic levels. This information is necessary for developing context-aware intelligent systems, which can assist the physicians during operations, evaluate procedures afterward or help the management team to effectively utilize the operating room. The objective is to extract reliable patterns from surgical data for the robust estimation of surgical activities performed during operations. The purpose of this article is to review the state-of-the-art deep learning methods that have been published after 2018 for analyzing surgical workflows, with a focus on phase and step recognition. METHODS Three databases, IEEE Xplore, Scopus, and PubMed were searched, and additional studies are added through a manual search. After the database search, 343 studies were screened and a total of 44 studies are selected for this review. CONCLUSION The use of temporal information is essential for identifying the next surgical action. Contemporary methods used mainly RNNs, hierarchical CNNs, and Transformers to preserve long-distance temporal relations. The lack of large publicly available datasets for various procedures is a great challenge for the development of new and robust models. As supervised learning strategies are used to show proof-of-concept, self-supervised, semi-supervised, or active learning methods are used to mitigate dependency on annotated data. SIGNIFICANCE The present study provides a comprehensive review of recent methods in surgical workflow analysis, summarizes commonly used architectures, datasets, and discusses challenges.
7
Nwoye CI, Yu T, Sharma S, Murali A, Alapatt D, Vardazaryan A, Yuan K, Hajek J, Reiter W, Yamlahi A, Smidt FH, Zou X, Zheng G, Oliveira B, Torres HR, Kondo S, Kasai S, Holm F, Özsoy E, Gui S, Li H, Raviteja S, Sathish R, Poudel P, Bhattarai B, Wang Z, Rui G, Schellenberg M, Vilaça JL, Czempiel T, Wang Z, Sheet D, Thapa SK, Berniker M, Godau P, Morais P, Regmi S, Tran TN, Fonseca J, Nölke JH, Lima E, Vazquez E, Maier-Hein L, Navab N, Mascagni P, Seeliger B, Gonzalez C, Mutter D, Padoy N. CholecTriplet2022: Show me a tool and tell me the triplet - An endoscopic vision challenge for surgical action triplet detection. Med Image Anal 2023; 89:102888. [PMID: 37451133] [DOI: 10.1016/j.media.2023.102888]
Abstract
Formalizing surgical activities as triplets of the used instruments, actions performed, and target anatomies is becoming a gold standard approach for surgical activity modeling. The benefit is that this formalization helps to obtain a more detailed understanding of tool-tissue interaction which can be used to develop better Artificial Intelligence assistance for image-guided surgery. Earlier efforts and the CholecTriplet challenge introduced in 2021 have put together techniques aimed at recognizing these triplets from surgical footage. Estimating also the spatial locations of the triplets would offer a more precise intraoperative context-aware decision support for computer-assisted intervention. This paper presents the CholecTriplet2022 challenge, which extends surgical action triplet modeling from recognition to detection. It includes weakly-supervised bounding box localization of every visible surgical instrument (or tool), as the key actors, and the modeling of each tool-activity in the form of ‹instrument, verb, target› triplet. The paper describes a baseline method and 10 new deep learning algorithms presented at the challenge to solve the task. It also provides thorough methodological comparisons of the methods, an in-depth analysis of the obtained results across multiple metrics, visual and procedural challenges; their significance, and useful insights for future research directions and applications in surgery.
Affiliation(s)
- Tong Yu
- ICube, University of Strasbourg, CNRS, France
- Kun Yuan
- ICube, University of Strasbourg, CNRS, France; Technical University Munich, Germany
- Amine Yamlahi
- Division of Intelligent Medical Systems (IMSY), German Cancer Research Center (DKFZ), Heidelberg, Germany
- Finn-Henri Smidt
- Division of Intelligent Medical Systems (IMSY), German Cancer Research Center (DKFZ), Heidelberg, Germany
- Xiaoyang Zou
- Institute of Medical Robotics, School of Biomedical Engineering, Shanghai Jiao Tong University, China
- Guoyan Zheng
- Institute of Medical Robotics, School of Biomedical Engineering, Shanghai Jiao Tong University, China
- Bruno Oliveira
- 2Ai School of Technology, IPCA, Barcelos, Portugal; Life and Health Science Research Institute (ICVS), School of Medicine, University of Minho, Braga, Portugal; Algoritmi Center, School of Engineering, University of Minho, Guimarães, Portugal
- Helena R Torres
- 2Ai School of Technology, IPCA, Barcelos, Portugal; Life and Health Science Research Institute (ICVS), School of Medicine, University of Minho, Braga, Portugal; Algoritmi Center, School of Engineering, University of Minho, Guimarães, Portugal
- Ege Özsoy
- Technical University Munich, Germany
- Han Li
- Southern University of Science and Technology, China
- Melanie Schellenberg
- Division of Intelligent Medical Systems (IMSY), German Cancer Research Center (DKFZ), Heidelberg, Germany; National Center for Tumor Diseases (NCT), Heidelberg, Germany
- Zhenkun Wang
- Southern University of Science and Technology, China
- Shrawan Kumar Thapa
- Nepal Applied Mathematics and Informatics Institute for research (NAAMII), Nepal
- Patrick Godau
- Division of Intelligent Medical Systems (IMSY), German Cancer Research Center (DKFZ), Heidelberg, Germany; National Center for Tumor Diseases (NCT), Heidelberg, Germany
- Pedro Morais
- 2Ai School of Technology, IPCA, Barcelos, Portugal
- Sudarshan Regmi
- Nepal Applied Mathematics and Informatics Institute for research (NAAMII), Nepal
- Thuy Nuong Tran
- Division of Intelligent Medical Systems (IMSY), German Cancer Research Center (DKFZ), Heidelberg, Germany
- Jaime Fonseca
- Algoritmi Center, School of Engineering, University of Minho, Guimarães, Portugal
- Jan-Hinrich Nölke
- Division of Intelligent Medical Systems (IMSY), German Cancer Research Center (DKFZ), Heidelberg, Germany; National Center for Tumor Diseases (NCT), Heidelberg, Germany
- Estevão Lima
- Life and Health Science Research Institute (ICVS), School of Medicine, University of Minho, Braga, Portugal
- Lena Maier-Hein
- Division of Intelligent Medical Systems (IMSY), German Cancer Research Center (DKFZ), Heidelberg, Germany
- Pietro Mascagni
- Fondazione Policlinico Universitario Agostino Gemelli IRCCS, Rome, Italy
- Barbara Seeliger
- ICube, University of Strasbourg, CNRS, France; University Hospital of Strasbourg, France; IHU Strasbourg, France
- Didier Mutter
- University Hospital of Strasbourg, France; IHU Strasbourg, France
- Nicolas Padoy
- ICube, University of Strasbourg, CNRS, France; IHU Strasbourg, France
8
Ramesh S, Dall'Alba D, Gonzalez C, Yu T, Mascagni P, Mutter D, Marescaux J, Fiorini P, Padoy N. Weakly Supervised Temporal Convolutional Networks for Fine-Grained Surgical Activity Recognition. IEEE Trans Med Imaging 2023; 42:2592-2602. [PMID: 37030859] [DOI: 10.1109/tmi.2023.3262847]
Abstract
Automatic recognition of fine-grained surgical activities, called steps, is a challenging but crucial task for intelligent intra-operative computer assistance. The development of current vision-based activity recognition methods relies heavily on a high volume of manually annotated data. This data is difficult and time-consuming to generate and requires domain-specific knowledge. In this work, we propose to use coarser and easier-to-annotate activity labels, namely phases, as weak supervision to learn step recognition with fewer step annotated videos. We introduce a step-phase dependency loss to exploit the weak supervision signal. We then employ a Single-Stage Temporal Convolutional Network (SS-TCN) with a ResNet-50 backbone, trained in an end-to-end fashion from weakly annotated videos, for temporal activity segmentation and recognition. We extensively evaluate and show the effectiveness of the proposed method on a large video dataset consisting of 40 laparoscopic gastric bypass procedures and the public benchmark CATARACTS containing 50 cataract surgeries.
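The step-phase dependency loss itself is specific to the paper, but the underlying idea of using coarse phase labels to constrain step predictions can be sketched roughly as follows (a simplified interpretation, assuming a binary phase-to-step admissibility matrix is available; this is not the authors' formulation):

```python
import torch
import torch.nn.functional as F


def step_phase_dependency_loss(step_logits, phase_labels, phase_to_steps):
    """Penalize probability mass assigned to steps that cannot occur in the annotated phase.

    step_logits:    (batch, num_steps) raw scores from the step head
    phase_labels:   (batch,) integer weak phase annotations
    phase_to_steps: (num_phases, num_steps) binary mask of admissible steps per phase
    """
    step_probs = F.softmax(step_logits, dim=1)
    allowed = phase_to_steps[phase_labels].float()        # (batch, num_steps)
    forbidden_mass = (step_probs * (1.0 - allowed)).sum(dim=1)
    return forbidden_mass.mean()


# Toy usage: 3 phases, 5 steps, batch of 2 (the mapping is hypothetical).
mask = torch.tensor([[1, 1, 0, 0, 0],
                     [0, 0, 1, 1, 0],
                     [0, 0, 0, 0, 1]])
loss = step_phase_dependency_loss(torch.randn(2, 5), torch.tensor([0, 2]), mask)
```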
9
Ramesh S, Dall'Alba D, Gonzalez C, Yu T, Mascagni P, Mutter D, Marescaux J, Fiorini P, Padoy N. TRandAugment: temporal random augmentation strategy for surgical activity recognition from videos. Int J Comput Assist Radiol Surg 2023; 18:1665-1672. [PMID: 36944845] [PMCID: PMC10491694] [DOI: 10.1007/s11548-023-02864-8]
Abstract
PURPOSE Automatic recognition of surgical activities from intraoperative surgical videos is crucial for developing intelligent support systems for computer-assisted interventions. Current state-of-the-art recognition methods are based on deep learning, where data augmentation has shown the potential to improve the generalization of these methods. This has spurred work on automated and simplified augmentation strategies for image classification and object detection on datasets of still images. Extending such augmentation methods to videos is not straightforward, as the temporal dimension needs to be considered. Furthermore, surgical videos pose additional challenges as they are composed of multiple, interconnected, and long-duration activities. METHODS This work proposes a new simplified augmentation method, called TRandAugment, specifically designed for long surgical videos, that treats each video as an assembly of temporal segments and applies consistent but random transformations to each segment. The proposed augmentation method is used to train an end-to-end spatiotemporal model consisting of a CNN (ResNet50) followed by a TCN. RESULTS The effectiveness of the proposed method is demonstrated on two surgical video datasets, namely Bypass40 and CATARACTS, and two tasks, surgical phase and step recognition. TRandAugment adds a performance boost of 1-6% over previous state-of-the-art methods that use manually designed augmentations. CONCLUSION This work presents a simplified and automated augmentation method for long surgical videos. The proposed method has been validated on different datasets and tasks, indicating the importance of devising temporal augmentation methods for long surgical videos.
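The segment-wise consistent augmentation idea can be illustrated compactly (a rough sketch; the transform pool and segment count are placeholders rather than the paper's configuration):

```python
import random


def trand_augment(frames, transforms, num_segments=4):
    """Split a frame sequence into temporal segments and apply one randomly
    chosen transform consistently to every frame within each segment."""
    seg_len = max(1, len(frames) // num_segments)
    augmented = []
    for start in range(0, len(frames), seg_len):
        t = random.choice(transforms)  # same transform for the whole segment
        augmented.extend(t(f) for f in frames[start:start + seg_len])
    return augmented


# Usage with hypothetical per-frame transforms (e.g., flips or color jitter):
# aug_frames = trand_augment(video_frames, [hflip, color_jitter, lambda f: f])
```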
Affiliation(s)
- Sanat Ramesh
- Altair Robotics Lab, University of Verona, 37134, Verona, Italy.
- ICube, University of Strasbourg, CNRS, 67000, Strasbourg, France.
- Diego Dall'Alba
- Altair Robotics Lab, University of Verona, 37134, Verona, Italy
- Cristians Gonzalez
- University Hospital of Strasbourg, 67000, Strasbourg, France
- Institute of Image-Guided Surgery, IHU Strasbourg, 67000, Strasbourg, France
- Tong Yu
- ICube, University of Strasbourg, CNRS, 67000, Strasbourg, France
- Pietro Mascagni
- Institute of Image-Guided Surgery, IHU Strasbourg, 67000, Strasbourg, France
- Fondazione Policlinico Universitario Agostino Gemelli IRCCS, 00168, Rome, Italy
- Didier Mutter
- University Hospital of Strasbourg, 67000, Strasbourg, France
- IRCAD, 67000, Strasbourg, France
- Institute of Image-Guided Surgery, IHU Strasbourg, 67000, Strasbourg, France
- Paolo Fiorini
- Altair Robotics Lab, University of Verona, 37134, Verona, Italy
- Nicolas Padoy
- ICube, University of Strasbourg, CNRS, 67000, Strasbourg, France
- Institute of Image-Guided Surgery, IHU Strasbourg, 67000, Strasbourg, France
10
Yu T, Mascagni P, Verde J, Marescaux J, Mutter D, Padoy N. Live laparoscopic video retrieval with compressed uncertainty. Med Image Anal 2023; 88:102866. [PMID: 37356320] [DOI: 10.1016/j.media.2023.102866]
Abstract
Searching through large volumes of medical data to retrieve relevant information is a challenging yet crucial task for clinical care. However the primitive and most common approach to retrieval, involving text in the form of keywords, is severely limited when dealing with complex media formats. Content-based retrieval offers a way to overcome this limitation, by using rich media as the query itself. Surgical video-to-video retrieval in particular is a new and largely unexplored research problem with high clinical value, especially in the real-time case: using real-time video hashing, search can be achieved directly inside of the operating room. Indeed, the process of hashing converts large data entries into compact binary arrays or hashes, enabling large-scale search operations at a very fast rate. However, due to fluctuations over the course of a video, not all bits in a given hash are equally reliable. In this work, we propose a method capable of mitigating this uncertainty while maintaining a light computational footprint. We present superior retrieval results (3%-4% top 10 mean average precision) on a multi-task evaluation protocol for surgery, using cholecystectomy phases, bypass phases, and coming from an entirely new dataset introduced here, surgical events across six different surgery types. Success on this multi-task benchmark shows the generalizability of our approach for surgical video retrieval.
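The speed of hash-based retrieval comes from comparing compact binary codes with Hamming distance. A minimal sketch of that search step, assuming video segments have already been encoded as binary vectors (the paper's uncertainty compression is not reproduced here):

```python
import numpy as np


def hamming_search(query_hash, database_hashes, k=10):
    """Return indices of the k database hashes closest to the query in Hamming distance.

    query_hash:      (n_bits,) array of 0/1
    database_hashes: (n_items, n_bits) array of 0/1
    """
    distances = np.count_nonzero(database_hashes != query_hash, axis=1)
    return np.argsort(distances)[:k]


# Example with random 64-bit codes for 10,000 database entries.
rng = np.random.default_rng(0)
database = rng.integers(0, 2, size=(10_000, 64))
query = rng.integers(0, 2, size=64)
print(hamming_search(query, database, k=5))
```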
Affiliation(s)
- Tong Yu
- ICube, University of Strasbourg, CNRS, France; IHU Strasbourg, France.
- Pietro Mascagni
- ICube, University of Strasbourg, CNRS, France; IHU Strasbourg, France; Fondazione Policlinico Universitario Agostino Gemelli IRCCS, Rome, Italy
- Didier Mutter
- IHU Strasbourg, France; University Hospital of Strasbourg, France
- Nicolas Padoy
- ICube, University of Strasbourg, CNRS, France; IHU Strasbourg, France
11
Ramesh S, Srivastav V, Alapatt D, Yu T, Murali A, Sestini L, Nwoye CI, Hamoud I, Sharma S, Fleurentin A, Exarchakis G, Karargyris A, Padoy N. Dissecting self-supervised learning methods for surgical computer vision. Med Image Anal 2023; 88:102844. [PMID: 37270898] [DOI: 10.1016/j.media.2023.102844]
Abstract
The field of surgical computer vision has undergone considerable breakthroughs in recent years with the rising popularity of deep neural network-based methods. However, standard fully-supervised approaches for training such models require vast amounts of annotated data, imposing a prohibitively high cost; especially in the clinical domain. Self-Supervised Learning (SSL) methods, which have begun to gain traction in the general computer vision community, represent a potential solution to these annotation costs, allowing to learn useful representations from only unlabeled data. Still, the effectiveness of SSL methods in more complex and impactful domains, such as medicine and surgery, remains limited and unexplored. In this work, we address this critical need by investigating four state-of-the-art SSL methods (MoCo v2, SimCLR, DINO, SwAV) in the context of surgical computer vision. We present an extensive analysis of the performance of these methods on the Cholec80 dataset for two fundamental and popular tasks in surgical context understanding, phase recognition and tool presence detection. We examine their parameterization, then their behavior with respect to training data quantities in semi-supervised settings. Correct transfer of these methods to surgery, as described and conducted in this work, leads to substantial performance gains over generic uses of SSL - up to 7.4% on phase recognition and 20% on tool presence detection - as well as state-of-the-art semi-supervised phase recognition approaches by up to 14%. Further results obtained on a highly diverse selection of surgical datasets exhibit strong generalization properties. The code is available at https://github.com/CAMMA-public/SelfSupSurg.
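A typical transfer recipe evaluated in such studies, initializing a ResNet-50 backbone from a self-supervised checkpoint and fine-tuning a new classification head on labeled frames, looks roughly as follows (the checkpoint path and class count are placeholders; the authors' actual code is linked in the abstract above):

```python
import torch
import torch.nn as nn
from torchvision.models import resnet50

NUM_PHASES = 7  # e.g., the Cholec80 surgical phases

backbone = resnet50(weights=None)
# Hypothetical checkpoint produced by an SSL method such as MoCo v2 or SimCLR.
state = torch.load("ssl_pretrained_resnet50.pth", map_location="cpu")
backbone.load_state_dict(state, strict=False)  # ignore missing projection-head keys

# Replace the final layer with a phase-recognition head.
backbone.fc = nn.Linear(backbone.fc.in_features, NUM_PHASES)

# Fine-tune the pretrained backbone with a smaller learning rate than the new head.
optimizer = torch.optim.SGD([
    {"params": [p for n, p in backbone.named_parameters() if not n.startswith("fc")],
     "lr": 1e-4},
    {"params": backbone.fc.parameters(), "lr": 1e-2},
], momentum=0.9)
criterion = nn.CrossEntropyLoss()
```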
Affiliation(s)
- Sanat Ramesh
- ICube, University of Strasbourg, CNRS, Strasbourg 67000, France; Altair Robotics Lab, Department of Computer Science, University of Verona, Verona 37134, Italy
- Vinkle Srivastav
- ICube, University of Strasbourg, CNRS, Strasbourg 67000, France
- Deepak Alapatt
- ICube, University of Strasbourg, CNRS, Strasbourg 67000, France
- Tong Yu
- ICube, University of Strasbourg, CNRS, Strasbourg 67000, France
- Aditya Murali
- ICube, University of Strasbourg, CNRS, Strasbourg 67000, France
- Luca Sestini
- ICube, University of Strasbourg, CNRS, Strasbourg 67000, France; Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milano 20133, Italy
- Idris Hamoud
- ICube, University of Strasbourg, CNRS, Strasbourg 67000, France
- Saurav Sharma
- ICube, University of Strasbourg, CNRS, Strasbourg 67000, France
- Georgios Exarchakis
- ICube, University of Strasbourg, CNRS, Strasbourg 67000, France; IHU Strasbourg, Strasbourg 67000, France
- Alexandros Karargyris
- ICube, University of Strasbourg, CNRS, Strasbourg 67000, France; IHU Strasbourg, Strasbourg 67000, France
- Nicolas Padoy
- ICube, University of Strasbourg, CNRS, Strasbourg 67000, France; IHU Strasbourg, Strasbourg 67000, France
12
Frisch Y, Fuchs M, Mukhopadhyay A. Temporally consistent sequence-to-sequence translation of cataract surgeries. Int J Comput Assist Radiol Surg 2023. [PMID: 37219806] [PMCID: PMC10329626] [DOI: 10.1007/s11548-023-02925-y]
Abstract
PURPOSE Image-to-image translation methods can address the lack of diversity in publicly available cataract surgery data. However, applying image-to-image translation to videos-which are frequently used in medical downstream applications-induces artifacts. Additional spatio-temporal constraints are needed to produce realistic translations and improve the temporal consistency of translated image sequences. METHODS We introduce a motion-translation module that translates optical flows between domains to impose such constraints. We combine it with a shared latent space translation model to improve image quality. Evaluations are conducted regarding translated sequences' image quality and temporal consistency, where we propose novel quantitative metrics for the latter. Finally, the downstream task of surgical phase classification is evaluated when retraining it with additional synthetic translated data. RESULTS Our proposed method produces more consistent translations than state-of-the-art baselines. Moreover, it stays competitive in terms of the per-image translation quality. We further show the benefit of consistently translated cataract surgery sequences for improving the downstream task of surgical phase prediction. CONCLUSION The proposed module increases the temporal consistency of translated sequences. Furthermore, imposed temporal constraints increase the usability of translated data in downstream tasks. This allows overcoming some of the hurdles of surgical data acquisition and annotation and enables improving models' performance by translating between existing datasets of sequential frames.
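One simple way a temporal-consistency measure for translated sequences can be instantiated is to align consecutive translated frames using optical flow estimated on the source sequence and measure the residual. The sketch below uses OpenCV's Farnebäck flow and illustrates the general idea only; it is not the metric proposed in the paper:

```python
import cv2
import numpy as np


def warp_with_flow(frame, flow):
    """Backward-warp a frame using a dense optical-flow field of shape (H, W, 2)."""
    h, w = flow.shape[:2]
    grid_x, grid_y = np.meshgrid(np.arange(w), np.arange(h))
    map_x = (grid_x + flow[..., 0]).astype(np.float32)
    map_y = (grid_y + flow[..., 1]).astype(np.float32)
    return cv2.remap(frame, map_x, map_y, interpolation=cv2.INTER_LINEAR)


def temporal_consistency_error(real_prev, real_next, fake_prev, fake_next):
    """Mean absolute residual after aligning the translated frame t+1 to frame t
    with flow estimated on the corresponding real (source-domain) frames."""
    flow = cv2.calcOpticalFlowFarneback(
        cv2.cvtColor(real_prev, cv2.COLOR_BGR2GRAY),
        cv2.cvtColor(real_next, cv2.COLOR_BGR2GRAY),
        None, 0.5, 3, 15, 3, 5, 1.2, 0)
    aligned_fake_next = warp_with_flow(fake_next, flow)
    return float(np.mean(np.abs(aligned_fake_next.astype(np.float32)
                                - fake_prev.astype(np.float32))))
```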
Affiliation(s)
- Yannik Frisch
- Computer Science, Technical University Darmstadt, Fraunhoferstraße 5, 64283, Darmstadt, Hessen, Germany.
- Moritz Fuchs
- Computer Science, Technical University Darmstadt, Fraunhoferstraße 5, 64283, Darmstadt, Hessen, Germany
- Anirban Mukhopadhyay
- Computer Science, Technical University Darmstadt, Fraunhoferstraße 5, 64283, Darmstadt, Hessen, Germany
13
Nwoye CI, Alapatt D, Yu T, Vardazaryan A, Xia F, Zhao Z, Xia T, Jia F, Yang Y, Wang H, Yu D, Zheng G, Duan X, Getty N, Sanchez-Matilla R, Robu M, Zhang L, Chen H, Wang J, Wang L, Zhang B, Gerats B, Raviteja S, Sathish R, Tao R, Kondo S, Pang W, Ren H, Abbing JR, Sarhan MH, Bodenstedt S, Bhasker N, Oliveira B, Torres HR, Ling L, Gaida F, Czempiel T, Vilaça JL, Morais P, Fonseca J, Egging RM, Wijma IN, Qian C, Bian G, Li Z, Balasubramanian V, Sheet D, Luengo I, Zhu Y, Ding S, Aschenbrenner JA, van der Kar NE, Xu M, Islam M, Seenivasan L, Jenke A, Stoyanov D, Mutter D, Mascagni P, Seeliger B, Gonzalez C, Padoy N. CholecTriplet2021: A benchmark challenge for surgical action triplet recognition. Med Image Anal 2023; 86:102803. [PMID: 37004378] [DOI: 10.1016/j.media.2023.102803]
Abstract
Context-aware decision support in the operating room can foster surgical safety and efficiency by leveraging real-time feedback from surgical workflow analysis. Most existing works recognize surgical activities at a coarse-grained level, such as phases, steps or events, leaving out fine-grained interaction details about the surgical activity; yet those are needed for more helpful AI assistance in the operating room. Recognizing surgical actions as triplets of ‹instrument, verb, target› combination delivers more comprehensive details about the activities taking place in surgical videos. This paper presents CholecTriplet2021: an endoscopic vision challenge organized at MICCAI 2021 for the recognition of surgical action triplets in laparoscopic videos. The challenge granted private access to the large-scale CholecT50 dataset, which is annotated with action triplet information. In this paper, we present the challenge setup and the assessment of the state-of-the-art deep learning methods proposed by the participants during the challenge. A total of 4 baseline methods from the challenge organizers and 19 new deep learning algorithms from the competing teams are presented to recognize surgical action triplets directly from surgical videos, achieving mean average precision (mAP) ranging from 4.2% to 38.1%. This study also analyzes the significance of the results obtained by the presented approaches, performs a thorough methodological comparison between them, in-depth result analysis, and proposes a novel ensemble method for enhanced recognition. Our analysis shows that surgical workflow analysis is not yet solved, and also highlights interesting directions for future research on fine-grained surgical activity recognition which is of utmost importance for the development of AI in surgery.
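The headline metric, mean average precision over triplet classes, can be computed generically from frame-level presence labels and predicted scores. A short sketch using scikit-learn is given below; it is not the official challenge evaluation code:

```python
import numpy as np
from sklearn.metrics import average_precision_score


def mean_average_precision(y_true, y_score):
    """Macro mAP over classes that have at least one positive frame.

    y_true:  (n_frames, n_classes) binary presence labels
    y_score: (n_frames, n_classes) predicted confidences
    """
    aps = [average_precision_score(y_true[:, c], y_score[:, c])
           for c in range(y_true.shape[1]) if y_true[:, c].any()]
    return float(np.mean(aps))


# Toy usage with random scores for 100 frames and 20 hypothetical triplet classes.
rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=(100, 20))
scores = rng.random(size=(100, 20))
print(mean_average_precision(labels, scores))
```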
14
Roß T, Bruno P, Reinke A, Wiesenfarth M, Koeppel L, Full PM, Pekdemir B, Godau P, Trofimova D, Isensee F, Adler TJ, Tran TN, Moccia S, Calimeri F, Müller-Stich BP, Kopp-Schneider A, Maier-Hein L. Beyond rankings: Learning (more) from algorithm validation. Med Image Anal 2023; 86:102765. [PMID: 36965252] [DOI: 10.1016/j.media.2023.102765]
Abstract
Challenges have become the state-of-the-art approach to benchmark image analysis algorithms in a comparative manner. While the validation on identical data sets was a great step forward, results analysis is often restricted to pure ranking tables, leaving relevant questions unanswered. Specifically, little effort has been put into the systematic investigation on what characterizes images in which state-of-the-art algorithms fail. To address this gap in the literature, we (1) present a statistical framework for learning from challenges and (2) instantiate it for the specific task of instrument instance segmentation in laparoscopic videos. Our framework relies on the semantic meta data annotation of images, which serves as foundation for a General Linear Mixed Models (GLMM) analysis. Based on 51,542 meta data annotations performed on 2,728 images, we applied our approach to the results of the Robust Medical Instrument Segmentation Challenge (ROBUST-MIS) challenge 2019 and revealed underexposure, motion and occlusion of instruments as well as the presence of smoke or other objects in the background as major sources of algorithm failure. Our subsequent method development, tailored to the specific remaining issues, yielded a deep learning model with state-of-the-art overall performance and specific strengths in the processing of images in which previous methods tended to fail. Due to the objectivity and generic applicability of our approach, it could become a valuable tool for validation in the field of medical image analysis and beyond.
Affiliation(s)
- Tobias Roß
- Intelligent Medical Systems (IMSY), German Cancer Research Center (DKFZ), Heidelberg, Germany; Medical Faculty, Heidelberg University, Heidelberg, Germany; Helmholtz Imaging, German Cancer Research Center (DKFZ), Heidelberg, Germany.
- Pierangela Bruno
- Intelligent Medical Systems (IMSY), German Cancer Research Center (DKFZ), Heidelberg, Germany; Department of Mathematics and Computer Science, University of Calabria, Rende, Italy
- Annika Reinke
- Intelligent Medical Systems (IMSY), German Cancer Research Center (DKFZ), Heidelberg, Germany; Helmholtz Imaging, German Cancer Research Center (DKFZ), Heidelberg, Germany; Faculty of Mathematics and Computer Science, Heidelberg University, Germany
- Manuel Wiesenfarth
- Division of Biostatistics, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Lisa Koeppel
- Section Clinical Tropical Medicine, Heidelberg University, Heidelberg, Germany
- Peter M Full
- Medical Faculty, Heidelberg University, Heidelberg, Germany; Division of Medical Image Computing (MIC), German Cancer Research Center (DKFZ), Heidelberg, Germany
- Bünyamin Pekdemir
- Intelligent Medical Systems (IMSY), German Cancer Research Center (DKFZ), Heidelberg, Germany
- Patrick Godau
- Intelligent Medical Systems (IMSY), German Cancer Research Center (DKFZ), Heidelberg, Germany; Faculty of Mathematics and Computer Science, Heidelberg University, Germany
- Darya Trofimova
- Intelligent Medical Systems (IMSY), German Cancer Research Center (DKFZ), Heidelberg, Germany; HIP Applied Computer Vision Lab, MIC, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Fabian Isensee
- Helmholtz Imaging, German Cancer Research Center (DKFZ), Heidelberg, Germany; Division of Medical Image Computing (MIC), German Cancer Research Center (DKFZ), Heidelberg, Germany; HIP Applied Computer Vision Lab, MIC, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Tim J Adler
- Intelligent Medical Systems (IMSY), German Cancer Research Center (DKFZ), Heidelberg, Germany
- Thuy N Tran
- Intelligent Medical Systems (IMSY), German Cancer Research Center (DKFZ), Heidelberg, Germany
- Sara Moccia
- The BioRobotics Institute and Department of Excellence in Robotics and AI, Scuola Superiore Sant'Anna, Italy
- Francesco Calimeri
- Department of Mathematics and Computer Science, University of Calabria, Rende, Italy
- Beat P Müller-Stich
- Department for General, Visceral and Transplantation Surgery, Heidelberg University Hospital, Heidelberg, Germany
- Lena Maier-Hein
- Intelligent Medical Systems (IMSY), German Cancer Research Center (DKFZ), Heidelberg, Germany; Medical Faculty, Heidelberg University, Heidelberg, Germany; Helmholtz Imaging, German Cancer Research Center (DKFZ), Heidelberg, Germany; Faculty of Mathematics and Computer Science, Heidelberg University, Germany; National Center for Tumor Diseases (NCT), Heidelberg, Germany
15
Use of Machine Learning to Assess Cataract Surgery Skill Level With Tool Detection. Ophthalmol Sci 2023; 3:100235. [PMID: 36444216] [PMCID: PMC9700302] [DOI: 10.1016/j.xops.2022.100235]
Abstract
Purpose To develop a method for objective analysis of the reproducible steps in routine cataract surgery. Design Prospective study; machine learning. Participants Deidentified faculty and trainee surgical videos. Methods Consecutive cataract surgeries performed by a faculty or trainee surgeon in an ophthalmology residency program over 6 months were collected and labeled according to degrees of difficulty. An existing image classification network, ResNet 152, was fine-tuned for tool detection in cataract surgery to allow for automatic identification of each unique surgical instrument. Individual microscope video frame windows were subsequently encoded as a vector. The relation between vector encodings and perceived skill using k-fold user-out cross-validation was examined. Algorithms were evaluated using area under the receiver operating characteristic curve (AUC) and the classification accuracy. Main Outcome Measures Accuracy of tool detection and skill assessment. Results In total, 391 consecutive cataract procedures with 209 routine cases were used. Our model achieved an AUC ranging from 0.933 to 0.998 for tool detection. For skill classification, AUC was 0.550 (95% confidence interval [CI], 0.547–0.553) with an accuracy of 54.3% (95% CI, 53.9%–54.7%) for a single snippet, AUC was 0.570 (0.565–0.575) with an accuracy of 57.8% (56.8%–58.7%) for a single surgery, and AUC was 0.692 (0.659–0.758) with an accuracy of 63.3% (56.8%–69.8%) for a single user given all their trials. Conclusions Our research shows that machine learning can accurately and independently identify distinct cataract surgery tools in videos, which is crucial for comparing the use of the tool in a step. However, it is more challenging for machine learning to accurately differentiate overall and specific step skill to assess the level of training or expertise. Financial Disclosure(s) The author(s) have no proprietary or commercial interest in any materials discussed in this article.
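The "user-out" cross-validation mentioned above means folds are grouped by surgeon so that no surgeon contributes to both training and test data. A small sketch with scikit-learn, using synthetic features and a placeholder classifier rather than the paper's ResNet-based frame encodings:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GroupKFold

# Placeholder data: per-snippet feature vectors, skill labels, and surgeon ids.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 32))
y = rng.integers(0, 2, size=200)
surgeons = rng.integers(0, 10, size=200)

aucs = []
for train_idx, test_idx in GroupKFold(n_splits=5).split(X, y, groups=surgeons):
    clf = RandomForestClassifier(random_state=0).fit(X[train_idx], y[train_idx])
    aucs.append(roc_auc_score(y[test_idx], clf.predict_proba(X[test_idx])[:, 1]))
print(f"user-out ROC AUC: {np.mean(aucs):.3f}")
```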
16
Wagner M, Müller-Stich BP, Kisilenko A, Tran D, Heger P, Mündermann L, Lubotsky DM, Müller B, Davitashvili T, Capek M, Reinke A, Reid C, Yu T, Vardazaryan A, Nwoye CI, Padoy N, Liu X, Lee EJ, Disch C, Meine H, Xia T, Jia F, Kondo S, Reiter W, Jin Y, Long Y, Jiang M, Dou Q, Heng PA, Twick I, Kirtac K, Hosgor E, Bolmgren JL, Stenzel M, von Siemens B, Zhao L, Ge Z, Sun H, Xie D, Guo M, Liu D, Kenngott HG, Nickel F, Frankenberg MV, Mathis-Ullrich F, Kopp-Schneider A, Maier-Hein L, Speidel S, Bodenstedt S. Comparative validation of machine learning algorithms for surgical workflow and skill analysis with the HeiChole benchmark. Med Image Anal 2023; 86:102770. [PMID: 36889206] [DOI: 10.1016/j.media.2023.102770]
Abstract
PURPOSE Surgical workflow and skill analysis are key technologies for the next generation of cognitive surgical assistance systems. These systems could increase the safety of the operation through context-sensitive warnings and semi-autonomous robotic assistance or improve training of surgeons via data-driven feedback. In surgical workflow analysis, up to 91% average precision has been reported for phase recognition on an open data single-center video dataset. In this work we investigated the generalizability of phase recognition algorithms in a multicenter setting, including more difficult recognition tasks such as surgical action and surgical skill. METHODS To achieve this goal, a dataset with 33 laparoscopic cholecystectomy videos from three surgical centers with a total operation time of 22 h was created. Labels included framewise annotation of seven surgical phases with 250 phase transitions, 5514 occurrences of four surgical actions, 6980 occurrences of 21 surgical instruments from seven instrument categories, and 495 skill classifications in five skill dimensions. The dataset was used in the 2019 international Endoscopic Vision challenge, sub-challenge for surgical workflow and skill analysis. Here, 12 research teams trained and submitted their machine learning algorithms for recognition of phase, action, instrument and/or skill assessment. RESULTS F1-scores were achieved for phase recognition between 23.9% and 67.7% (n = 9 teams), for instrument presence detection between 38.5% and 63.8% (n = 8 teams), but for action recognition only between 21.8% and 23.3% (n = 5 teams). The average absolute error for skill assessment was 0.78 (n = 1 team). CONCLUSION Surgical workflow and skill analysis are promising technologies to support the surgical team, but there is still room for improvement, as shown by our comparison of machine learning algorithms. This novel HeiChole benchmark can be used for comparable evaluation and validation of future work. In future studies, it is of utmost importance to create more open, high-quality datasets in order to allow the development of artificial intelligence and cognitive robotics in surgery.
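The reported phase-recognition F1-scores are frame-wise, multi-class scores; one standard way to compute such a score per video is shown below (a generic sketch, not the challenge's official evaluation script):

```python
from sklearn.metrics import f1_score


def phase_f1(gt_phases, pred_phases):
    """Macro-averaged frame-wise F1 over surgical phases for one video."""
    return f1_score(gt_phases, pred_phases, average="macro")


# Hypothetical frame-wise label sequences (phases indexed 0-6):
print(phase_f1([0, 1, 1, 2, 3, 3], [0, 0, 1, 2, 2, 3]))
```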
Affiliation(s)
- Martin Wagner
- Department for General, Visceral and Transplantation Surgery, Heidelberg University Hospital, Im Neuenheimer Feld 420, 69120 Heidelberg, Germany; National Center for Tumor Diseases (NCT) Heidelberg, Im Neuenheimer Feld 460, 69120 Heidelberg, Germany.
| | - Beat-Peter Müller-Stich
- Department for General, Visceral and Transplantation Surgery, Heidelberg University Hospital, Im Neuenheimer Feld 420, 69120 Heidelberg, Germany; National Center for Tumor Diseases (NCT) Heidelberg, Im Neuenheimer Feld 460, 69120 Heidelberg, Germany
| | - Anna Kisilenko
- Department for General, Visceral and Transplantation Surgery, Heidelberg University Hospital, Im Neuenheimer Feld 420, 69120 Heidelberg, Germany; National Center for Tumor Diseases (NCT) Heidelberg, Im Neuenheimer Feld 460, 69120 Heidelberg, Germany
| | - Duc Tran
- Department for General, Visceral and Transplantation Surgery, Heidelberg University Hospital, Im Neuenheimer Feld 420, 69120 Heidelberg, Germany; National Center for Tumor Diseases (NCT) Heidelberg, Im Neuenheimer Feld 460, 69120 Heidelberg, Germany
| | - Patrick Heger
- Department for General, Visceral and Transplantation Surgery, Heidelberg University Hospital, Im Neuenheimer Feld 420, 69120 Heidelberg, Germany
| | - Lars Mündermann
- Data Assisted Solutions, Corporate Research & Technology, KARL STORZ SE & Co. KG, Dr. Karl-Storz-Str. 34, 78332 Tuttlingen
| | - David M Lubotsky
- Department for General, Visceral and Transplantation Surgery, Heidelberg University Hospital, Im Neuenheimer Feld 420, 69120 Heidelberg, Germany; National Center for Tumor Diseases (NCT) Heidelberg, Im Neuenheimer Feld 460, 69120 Heidelberg, Germany
| | - Benjamin Müller
- Department for General, Visceral and Transplantation Surgery, Heidelberg University Hospital, Im Neuenheimer Feld 420, 69120 Heidelberg, Germany; National Center for Tumor Diseases (NCT) Heidelberg, Im Neuenheimer Feld 460, 69120 Heidelberg, Germany
| | - Tornike Davitashvili
- Department for General, Visceral and Transplantation Surgery, Heidelberg University Hospital, Im Neuenheimer Feld 420, 69120 Heidelberg, Germany; National Center for Tumor Diseases (NCT) Heidelberg, Im Neuenheimer Feld 460, 69120 Heidelberg, Germany
| | - Manuela Capek
- Department for General, Visceral and Transplantation Surgery, Heidelberg University Hospital, Im Neuenheimer Feld 420, 69120 Heidelberg, Germany; National Center for Tumor Diseases (NCT) Heidelberg, Im Neuenheimer Feld 460, 69120 Heidelberg, Germany
| | - Annika Reinke
- Div. Computer Assisted Medical Interventions, German Cancer Research Center (DKFZ), Im Neuenheimer Feld 223, 69120 Heidelberg Germany; HIP Helmholtz Imaging Platform, German Cancer Research Center (DKFZ), Im Neuenheimer Feld 223, 69120 Heidelberg Germany; Faculty of Mathematics and Computer Science, Heidelberg University, Im Neuenheimer Feld 205, 69120 Heidelberg
| | - Carissa Reid
- Division of Biostatistics, German Cancer Research Center, Im Neuenheimer Feld 280, Heidelberg, Germany
| | - Tong Yu
- ICube, University of Strasbourg, CNRS, France. 300 bd Sébastien Brant - CS 10413, F-67412 Illkirch Cedex, France; IHU Strasbourg, France. 1 Place de l'hôpital, 67000 Strasbourg, France
| | - Armine Vardazaryan
- ICube, University of Strasbourg, CNRS, France. 300 bd Sébastien Brant - CS 10413, F-67412 Illkirch Cedex, France; IHU Strasbourg, France. 1 Place de l'hôpital, 67000 Strasbourg, France
| | - Chinedu Innocent Nwoye
- ICube, University of Strasbourg, CNRS, France. 300 bd Sébastien Brant - CS 10413, F-67412 Illkirch Cedex, France; IHU Strasbourg, France. 1 Place de l'hôpital, 67000 Strasbourg, France
| | - Nicolas Padoy
- ICube, University of Strasbourg, CNRS, France. 300 bd Sébastien Brant - CS 10413, F-67412 Illkirch Cedex, France; IHU Strasbourg, France. 1 Place de l'hôpital, 67000 Strasbourg, France
| | - Xinyang Liu
- Sheikh Zayed Institute for Pediatric Surgical Innovation, Children's National Hospital, 111 Michigan Ave NW, Washington, DC 20010, USA
| | - Eung-Joo Lee
- University of Maryland, College Park, 2405 A V Williams Building, College Park, MD 20742, USA
| | - Constantin Disch
- Fraunhofer Institute for Digital Medicine MEVIS, Max-von-Laue-Str. 2, 28359 Bremen, Germany
| | - Hans Meine
- Fraunhofer Institute for Digital Medicine MEVIS, Max-von-Laue-Str. 2, 28359 Bremen, Germany; University of Bremen, FB3, Medical Image Computing Group, ℅ Fraunhofer MEVIS, Am Fallturm 1, 28359 Bremen, Germany
| | - Tong Xia
- Lab for Medical Imaging and Digital Surgery, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
| | - Fucang Jia
- Lab for Medical Imaging and Digital Surgery, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
| | - Satoshi Kondo
- Konika Minolta, Inc., 1-2, Sakura-machi, Takatsuki, Oasak 569-8503, Japan
| | - Wolfgang Reiter
- Wintegral GmbH, Ehrenbreitsteiner Str. 36, 80993 München, Germany
| | - Yueming Jin
- Department of Computer Science and Engineering, Ho Sin-Hang Engineering Building, The Chinese University of Hong Kong, Sha Tin, NT, Hong Kong
| | - Yonghao Long
- Department of Computer Science and Engineering, Ho Sin-Hang Engineering Building, The Chinese University of Hong Kong, Sha Tin, NT, Hong Kong
| | - Meirui Jiang
- Department of Computer Science and Engineering, Ho Sin-Hang Engineering Building, The Chinese University of Hong Kong, Sha Tin, NT, Hong Kong
| | - Qi Dou
- Department of Computer Science and Engineering, Ho Sin-Hang Engineering Building, The Chinese University of Hong Kong, Sha Tin, NT, Hong Kong
| | - Pheng Ann Heng
- Department of Computer Science and Engineering, Ho Sin-Hang Engineering Building, The Chinese University of Hong Kong, Sha Tin, NT, Hong Kong
| | - Isabell Twick
- Caresyntax GmbH, Komturstr. 18A, 12099 Berlin, Germany
| | - Kadir Kirtac
- Caresyntax GmbH, Komturstr. 18A, 12099 Berlin, Germany
| | - Enes Hosgor
- Caresyntax GmbH, Komturstr. 18A, 12099 Berlin, Germany
| | | | | | | | - Long Zhao
- Hikvision Research Institute, Hangzhou, China
| | - Zhenxiao Ge
- Hikvision Research Institute, Hangzhou, China
| | - Haiming Sun
- Hikvision Research Institute, Hangzhou, China
| | - Di Xie
- Hikvision Research Institute, Hangzhou, China
| | - Mengqi Guo
- School of Computing, National University of Singapore, Computing 1, No.13 Computing Drive, 117417, Singapore
| | - Daochang Liu
- National Engineering Research Center of Visual Technology, School of Computer Science, Peking University, Beijing, China
| | - Hannes G Kenngott
- Department for General, Visceral and Transplantation Surgery, Heidelberg University Hospital, Im Neuenheimer Feld 420, 69120 Heidelberg, Germany
| | - Felix Nickel
- Department for General, Visceral and Transplantation Surgery, Heidelberg University Hospital, Im Neuenheimer Feld 420, 69120 Heidelberg, Germany
| | - Moritz von Frankenberg
- Department of Surgery, Salem Hospital of the Evangelische Stadtmission Heidelberg, Zeppelinstrasse 11-33, 69121 Heidelberg, Germany
| | - Franziska Mathis-Ullrich
- Health Robotics and Automation Laboratory, Institute for Anthropomatics and Robotics, Karlsruhe Institute of Technology, Geb. 40.28, KIT Campus Süd, Engler-Bunte-Ring 8, 76131 Karlsruhe, Germany
| | - Annette Kopp-Schneider
- Division of Biostatistics, German Cancer Research Center, Im Neuenheimer Feld 280, Heidelberg, Germany
| | - Lena Maier-Hein
- Div. Computer Assisted Medical Interventions, German Cancer Research Center (DKFZ), Im Neuenheimer Feld 223, 69120 Heidelberg, Germany; HIP Helmholtz Imaging Platform, German Cancer Research Center (DKFZ), Im Neuenheimer Feld 223, 69120 Heidelberg, Germany; Faculty of Mathematics and Computer Science, Heidelberg University, Im Neuenheimer Feld 205, 69120 Heidelberg, Germany; Medical Faculty, Heidelberg University, Im Neuenheimer Feld 672, 69120 Heidelberg, Germany
| | - Stefanie Speidel
- Div. Translational Surgical Oncology, National Center for Tumor Diseases Dresden, Fetscherstraße 74, 01307 Dresden, Germany; Cluster of Excellence "Centre for Tactile Internet with Human-in-the-Loop" (CeTI) of Technische Universität Dresden, 01062 Dresden, Germany
| | - Sebastian Bodenstedt
- Div. Translational Surgical Oncology, National Center for Tumor Diseases Dresden, Fetscherstraße 74, 01307 Dresden, Germany; Cluster of Excellence "Centre for Tactile Internet with Human-in-the-Loop" (CeTI) of Technische Universität Dresden, 01062 Dresden, Germany
| |
Collapse
|
17
|
Jin Y, Long Y, Gao X, Stoyanov D, Dou Q, Heng PA. Trans-SVNet: hybrid embedding aggregation Transformer for surgical workflow analysis. Int J Comput Assist Radiol Surg 2022; 17:2193-2202. [PMID: 36129573 DOI: 10.1007/s11548-022-02743-8] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/29/2022] [Accepted: 08/31/2022] [Indexed: 11/05/2022]
Abstract
PURPOSE Real-time surgical workflow analysis has been a key component of computer-assisted intervention systems to improve cognitive assistance. Most existing methods rely solely on conventional temporal models and encode features with a successive spatial-temporal arrangement, so the supportive benefits of intermediate features are partially lost from both the visual and temporal aspects. In this paper, we rethink feature encoding to attend to and preserve the critical information needed for accurate workflow recognition and anticipation. METHODS We introduce Transformers into surgical workflow analysis to reconsider the complementary effects of spatial and temporal representations. We propose a hybrid embedding aggregation Transformer, named Trans-SVNet, that lets the designed spatial and temporal embeddings interact effectively by using the spatial embedding to query the temporal embedding sequence. The model is jointly optimized with loss objectives from both analysis tasks to leverage their high correlation. RESULTS We extensively evaluate our method on three large surgical video datasets. Our method consistently outperforms state-of-the-art approaches on the workflow recognition task across all three datasets. When jointly learned with anticipation, recognition results gain a large improvement, and our approach also achieves promising anticipation performance. Our model reaches a real-time inference speed of 0.0134 seconds per frame. CONCLUSION Experimental results demonstrate the efficacy of our hybrid embedding integration, which rediscovers crucial cues from complementary spatial-temporal embeddings. The better performance obtained with multi-task learning indicates that the anticipation task brings additional knowledge to the recognition task. The effectiveness and efficiency of our method also show its potential for use in the operating room.
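As a concrete illustration of the querying mechanism described in this abstract, the sketch below shows a spatial frame embedding attending over a sequence of temporal embeddings via cross-attention and feeding a phase classifier; the embedding size, head count, and phase count are illustrative assumptions, and this is not the authors' released Trans-SVNet code.

```python
# Illustrative sketch (not the authors' implementation): a spatial frame
# embedding queries a sequence of temporal embeddings via cross-attention,
# and the fused feature is classified into surgical phases.
import torch
import torch.nn as nn

class HybridEmbeddingAggregator(nn.Module):
    def __init__(self, embed_dim=256, num_heads=8, num_phases=7):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        self.classifier = nn.Linear(embed_dim, num_phases)

    def forward(self, spatial_emb, temporal_seq):
        # spatial_emb:  (batch, 1, embed_dim)  embedding of the current frame
        # temporal_seq: (batch, T, embed_dim)  embeddings of the preceding frames
        fused, _ = self.cross_attn(query=spatial_emb, key=temporal_seq, value=temporal_seq)
        return self.classifier(fused.squeeze(1))  # (batch, num_phases) phase logits

# Toy usage with random features standing in for backbone outputs.
model = HybridEmbeddingAggregator()
logits = model(torch.randn(2, 1, 256), torch.randn(2, 30, 256))
print(logits.shape)  # torch.Size([2, 7])
```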
Collapse
Affiliation(s)
- Yueming Jin
- Wellcome/EPSRC Centre for Interventional and Surgical Sciences (WEISS), Department of Computer Science, University College, London, UK
| | - Yonghao Long
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, HK, China
| | - Xiaojie Gao
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, HK, China
| | - Danail Stoyanov
- Wellcome/EPSRC Centre for Interventional and Surgical Sciences (WEISS), Department of Computer Science, University College, London, UK
| | - Qi Dou
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, HK, China; Institute of Medical Intelligence and XR, The Chinese University of Hong Kong, Shatin, HK, China
| | - Pheng-Ann Heng
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, HK, China; Institute of Medical Intelligence and XR, The Chinese University of Hong Kong, Shatin, HK, China
| |
Collapse
|
18
|
Surgical Tool Datasets for Machine Learning Research: A Survey. Int J Comput Vis 2022. [DOI: 10.1007/s11263-022-01640-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/05/2022]
Abstract
This paper is a comprehensive survey of datasets for surgical tool detection and related surgical data science and machine learning techniques and algorithms. The survey offers a high-level perspective of current research in this area, analyses the taxonomy of approaches adopted by researchers using surgical tool datasets, and addresses key areas of research, such as the datasets used, the evaluation metrics applied and the deep learning techniques utilised. Our presentation and taxonomy provide a framework that facilitates greater understanding of current work and highlight the challenges and opportunities for further innovative and useful research.
Collapse
|
19
|
Chen HB, Li Z, Fu P, Ni ZL, Bian GB. Spatio-Temporal Causal Transformer for Multi-Grained Surgical Phase Recognition. Annu Int Conf IEEE Eng Med Biol Soc 2022; 2022:1663-1666. [PMID: 36086459 DOI: 10.1109/embc48229.2022.9871004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/15/2023]
Abstract
Automatic surgical phase recognition plays a key role in surgical workflow analysis and overall optimization in clinical work. In complicated surgical procedures, similar inter-class appearance and drastic variability in phase duration still make this a challenging task. In this paper, a spatio-temporal transformer is proposed for online surgical phase recognition at different granularities. To extract rich spatial information, a spatial transformer is used to model global spatial dependencies at each time index. To overcome the variability in phase duration, a temporal transformer captures the multi-scale temporal context of different time indexes with a dual pyramid pattern. Our method is thoroughly validated on the public Cholec80 dataset with 7 coarse-grained phases and the CATARACTS2020 dataset with 19 fine-grained phases, outperforming state-of-the-art approaches with 91.4% and 84.2% accuracy, respectively, while using only 24.5M parameters.
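The online ("causal") aspect mentioned above can be illustrated with a small sketch: a boolean attention mask restricts each time index to past frames, so phase predictions never peek at future video. This is a generic sketch under that assumption, not the paper's architecture.

```python
# Illustrative sketch (assumed, not the paper's code): a causal attention mask
# restricts each time step to past frames, which is what makes the phase
# prediction usable online during surgery.
import torch

def causal_mask(T: int) -> torch.Tensor:
    # True entries are masked out; position t may attend to positions <= t only.
    return torch.triu(torch.ones(T, T, dtype=torch.bool), diagonal=1)

T, D = 16, 128
frames = torch.randn(1, T, D)                      # per-frame features from a CNN backbone
attn = torch.nn.MultiheadAttention(D, num_heads=4, batch_first=True)
out, _ = attn(frames, frames, frames, attn_mask=causal_mask(T))
print(out.shape)  # torch.Size([1, 16, 128])
```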
Collapse
|
20
|
Wang T, Xia J, Li R, Wang R, Stanojcic N, Li JPO, Long E, Wang J, Zhang X, Li J, Wu X, Liu Z, Chen J, Chen H, Nie D, Ni H, Chen R, Chen W, Yin S, Lin D, Yan P, Xia Z, Lin S, Huang K, Lin H. Intelligent cataract surgery supervision and evaluation via deep learning. Int J Surg 2022; 104:106740. [PMID: 35760343 DOI: 10.1016/j.ijsu.2022.106740] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/01/2022] [Revised: 06/16/2022] [Accepted: 06/16/2022] [Indexed: 11/17/2022]
Abstract
PURPOSE To assess the performance of a deep learning (DL) algorithm for evaluating and supervising cataract extraction using phacoemulsification with intraocular lens (IOL) implantation based on cataract surgery (CS) videos. MATERIALS AND METHODS DeepSurgery was trained using 186 standard CS videos to recognize 12 CS steps and was validated on two datasets that contained 50 and 21 CS videos, respectively. A supervision test including 50 CS videos was used to assess the DeepSurgery guidance and alert functions. In addition, a real-time test containing 54 CSs was used to compare the DeepSurgery grading performance to an expert panel and residents. RESULTS DeepSurgery achieved stable performance for all 12 recognition steps, including the duration between two pairs of adjacent steps, in internal validation with an ACC of 95.06% and external validations with ACCs of 88.77% and 88.34%. DeepSurgery also recognized the chronology of surgical steps and alerted surgeons to incorrectly ordered steps. Six main steps are automatically and simultaneously quantified during the evaluation process (centesimal system). In a real-time comparative test, the DeepSurgery step recognition performance was robust (ACC of 90.30%). In addition, DeepSurgery and an expert panel achieved comparable performance when assessing the surgical steps (kappa ranged from 0.58 to 0.77). CONCLUSIONS DeepSurgery represents a potential approach to providing real-time supervision and an objective surgical evaluation system for routine CS and to improving surgical outcomes.
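The chronology check described above can be illustrated with a short sketch that compares recognized steps against a canonical ordering and flags regressions; the step names and logic here are simplified assumptions, not the DeepSurgery implementation.

```python
# Minimal sketch of the kind of chronology check described above (assumed
# logic, not the DeepSurgery implementation): compare recognized steps
# against a canonical ordering and flag out-of-order transitions.
CANONICAL_STEPS = [
    "incision", "viscoelastic injection", "capsulorhexis", "hydrodissection",
    "phacoemulsification", "cortex removal", "IOL implantation", "wound sealing",
]  # simplified; the paper recognizes 12 steps

def find_order_violations(recognized_steps):
    """Return (position, step, previous_step) tuples where the order regresses."""
    rank = {s: i for i, s in enumerate(CANONICAL_STEPS)}
    violations = []
    for i in range(1, len(recognized_steps)):
        prev, cur = recognized_steps[i - 1], recognized_steps[i]
        if rank[cur] < rank[prev]:
            violations.append((i, cur, prev))
    return violations

sequence = ["incision", "capsulorhexis", "viscoelastic injection", "phacoemulsification"]
print(find_order_violations(sequence))
# [(2, 'viscoelastic injection', 'capsulorhexis')] -> alert the surgeon
```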
Collapse
Affiliation(s)
- Ting Wang
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Vision Science, Guangdong Provincial Clinical Research Center for Ocular Diseases, Guangzhou, Guangdong, China
| | - Jun Xia
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China
| | - Ruiyang Li
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Vision Science, Guangdong Provincial Clinical Research Center for Ocular Diseases, Guangzhou, Guangdong, China
| | - Ruixin Wang
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Vision Science, Guangdong Provincial Clinical Research Center for Ocular Diseases, Guangzhou, Guangdong, China
| | - Nick Stanojcic
- Department of Ophthalmology, St. Thomas' Hospital, London, United Kingdom
| | - Ji-Peng Olivia Li
- Moorfields Eye Hospital NHS Foundation Trust, London, United Kingdom
| | - Erping Long
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Vision Science, Guangdong Provincial Clinical Research Center for Ocular Diseases, Guangzhou, Guangdong, China
| | - Jinghui Wang
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Vision Science, Guangdong Provincial Clinical Research Center for Ocular Diseases, Guangzhou, Guangdong, China
| | - Xiayin Zhang
- Guangdong Eye Institute, Department of Ophthalmology, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
| | - Jianbin Li
- Department of Ophthalmology, Third Affiliated Hospital of Sun Yat-Sen University, Guangzhou, China
| | - Xiaohang Wu
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Vision Science, Guangdong Provincial Clinical Research Center for Ocular Diseases, Guangzhou, Guangdong, China
| | - Zhenzhen Liu
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Vision Science, Guangdong Provincial Clinical Research Center for Ocular Diseases, Guangzhou, Guangdong, China
| | - Jingjing Chen
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Vision Science, Guangdong Provincial Clinical Research Center for Ocular Diseases, Guangzhou, Guangdong, China
| | - Hui Chen
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Vision Science, Guangdong Provincial Clinical Research Center for Ocular Diseases, Guangzhou, Guangdong, China
| | - Danyao Nie
- Shenzhen Eye Hospital, Shenzhen Key Laboratory of Ophthalmology, Shenzhen University School of Medicine, Shenzhen, China
| | - Huanqi Ni
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China
| | - Ruoxi Chen
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China
| | - Wenben Chen
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Vision Science, Guangdong Provincial Clinical Research Center for Ocular Diseases, Guangzhou, Guangdong, China
| | - Shiyi Yin
- Department of Ophthalmology, Third Affiliated Hospital of Sun Yat-Sen University, Guangzhou, China
| | - Duru Lin
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Vision Science, Guangdong Provincial Clinical Research Center for Ocular Diseases, Guangzhou, Guangdong, China
| | - Pisong Yan
- Cloud Intelligent Care Technology (Guangzhou) Co., Ltd., Guangzhou, China
| | - Zeyang Xia
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
| | - Shengzhi Lin
- Guangzhou Oculotronics Medical Instrument Co., Ltd, Guangzhou, China
| | - Kai Huang
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China.
| | - Haotian Lin
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Vision Science, Guangdong Provincial Clinical Research Center for Ocular Diseases, Guangzhou, Guangdong, China; Hainan Eye Hospital and Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Haikou, China; Center for Precision Medicine, Sun Yat-sen University, Guangzhou, China.
| |
Collapse
|
21
|
Seidlitz S, Sellner J, Odenthal J, Özdemir B, Studier-Fischer A, Knödler S, Ayala L, Adler TJ, Kenngott HG, Tizabi M, Wagner M, Nickel F, Müller-Stich BP, Maier-Hein L. Robust deep learning-based semantic organ segmentation in hyperspectral images. Med Image Anal 2022; 80:102488. [DOI: 10.1016/j.media.2022.102488] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/14/2021] [Revised: 03/28/2022] [Accepted: 05/20/2022] [Indexed: 12/15/2022]
|
22
|
Matton N, Qalieh A, Zhang Y, Annadanam A, Thibodeau A, Li T, Shankar A, Armenti S, Mian SI, Tannen B, Nallasamy N. Analysis of Cataract Surgery Instrument Identification Performance of Convolutional and Recurrent Neural Network Ensembles Leveraging BigCat. Transl Vis Sci Technol 2022; 11:1. [PMID: 35363261 PMCID: PMC8976933 DOI: 10.1167/tvst.11.4.1] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022] Open
Abstract
Purpose To develop a method for accurate automated real-time identification of instruments in cataract surgery videos. Methods Cataract surgery videos were collected at University of Michigan's Kellogg Eye Center between 2020 and 2021. Videos were annotated for the presence of instruments to aid in the development, validation, and testing of machine learning (ML) models for multiclass, multilabel instrument identification. Results A new cataract surgery database, BigCat, was assembled, containing 190 videos with over 3.9 million annotated frames, the largest reported cataract surgery annotation database to date. Using a dense convolutional neural network (CNN) and a recursive averaging method, we were able to achieve a test F1 score of 0.9528 and a test area under the receiver operating characteristic curve of 0.9985 for surgical instrument identification. These are state-of-the-art results compared to previous works, achieved with only a fraction of the model parameters of previous architectures. Conclusions Accurate automated surgical instrument identification is possible with lightweight CNNs and large datasets. Increasingly complex model architecture is not necessary to retain a well-performing model. Recurrent neural network architectures add complexity to a model and are unnecessary to attain state-of-the-art performance. Translational Relevance Instrument identification in the operative field can be used for further applications such as evaluating surgical trainee skill level and developing early warning detection systems for use during surgery.
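The recursive averaging idea mentioned above can be sketched as an exponential running average over per-frame multilabel probabilities; the smoothing constant and decision threshold below are assumptions for illustration, not the paper's reported settings.

```python
# Sketch of temporally smoothing per-frame multilabel predictions with a
# recursive (exponential) average, illustrating the idea described above;
# alpha and the 0.5 threshold are assumed values, not the paper's settings.
import numpy as np

def recursive_average(frame_probs: np.ndarray, alpha: float = 0.8) -> np.ndarray:
    """frame_probs: (num_frames, num_instruments) sigmoid outputs from a CNN."""
    smoothed = np.empty_like(frame_probs)
    smoothed[0] = frame_probs[0]
    for t in range(1, len(frame_probs)):
        smoothed[t] = alpha * smoothed[t - 1] + (1 - alpha) * frame_probs[t]
    return smoothed

probs = np.random.rand(100, 5)           # 100 frames, 5 instrument classes
labels = recursive_average(probs) > 0.5  # binary presence per frame and instrument
print(labels.shape)  # (100, 5)
```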
Collapse
Affiliation(s)
- Nicholas Matton
- Department of Computer Science, University of Michigan, Ann Arbor, MI, USA
| | - Adel Qalieh
- Kellogg Eye Center, Department of Ophthalmology and Visual Sciences, University of Michigan, Ann Arbor, MI, USA
| | - Yibing Zhang
- Kellogg Eye Center, Department of Ophthalmology and Visual Sciences, University of Michigan, Ann Arbor, MI, USA
| | - Anvesh Annadanam
- Kellogg Eye Center, Department of Ophthalmology and Visual Sciences, University of Michigan, Ann Arbor, MI, USA
| | - Alexa Thibodeau
- Kellogg Eye Center, Department of Ophthalmology and Visual Sciences, University of Michigan, Ann Arbor, MI, USA
| | - Tingyang Li
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
| | - Anand Shankar
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
| | - Stephen Armenti
- Kellogg Eye Center, Department of Ophthalmology and Visual Sciences, University of Michigan, Ann Arbor, MI, USA
| | - Shahzad I Mian
- Kellogg Eye Center, Department of Ophthalmology and Visual Sciences, University of Michigan, Ann Arbor, MI, USA
| | - Bradford Tannen
- Kellogg Eye Center, Department of Ophthalmology and Visual Sciences, University of Michigan, Ann Arbor, MI, USA
| | - Nambi Nallasamy
- Kellogg Eye Center, Department of Ophthalmology and Visual Sciences, University of Michigan, Ann Arbor, MI, USA.,Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
| |
Collapse
|
23
|
Garcia Nespolo R, Yi D, Cole E, Valikodath N, Luciano C, Leiderman YI. Evaluation of Artificial Intelligence-Based Intraoperative Guidance Tools for Phacoemulsification Cataract Surgery. JAMA Ophthalmol 2022; 140:170-177. [PMID: 35024773 PMCID: PMC8855235 DOI: 10.1001/jamaophthalmol.2021.5742] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/16/2023]
Abstract
IMPORTANCE Complications that arise from phacoemulsification procedures can lead to worse visual outcomes. Real-time image processing with artificial intelligence tools can extract data to deliver surgical guidance, potentially enhancing the surgical environment. OBJECTIVE To evaluate the ability of a deep neural network to track the pupil, identify the surgical phase, and activate specific computer vision tools to aid the surgeon during phacoemulsification cataract surgery by providing visual feedback in real time. DESIGN, SETTING, AND PARTICIPANTS This cross-sectional study evaluated deidentified surgical videos of phacoemulsification cataract operations performed by faculty and trainee surgeons in a university-based ophthalmology department between July 1, 2020, and January 1, 2021, in a population-based cohort of patients. EXPOSURES A region-based convolutional neural network was used to receive frames from the video source and, in real time, locate the pupil and in parallel identify the surgical phase being performed. Computer vision-based algorithms were applied according to the phase identified, providing visual feedback to the surgeon. MAIN OUTCOMES AND MEASURES Outcomes were the area under the receiver operating characteristic curve and the area under the precision-recall curve for surgical phase classification, and the Dice score (harmonic mean of precision and recall [sensitivity]) for detection of the pupil boundary. Network performance was assessed as video output in frames per second. A usability survey was administered to volunteer cataract surgeons previously unfamiliar with the platform. RESULTS The region-based convolutional neural network model achieved area under the receiver operating characteristic curve values of 0.996 for capsulorhexis, 0.972 for phacoemulsification, 0.997 for cortex removal, and 0.880 for idle phase recognition. The final algorithm reached a Dice score of 90.23% for pupil segmentation and a mean (SD) processing speed of 97 (34) frames per second. Among the 11 cataract surgeons surveyed, 8 (72%) were mostly or extremely likely to use the current platform during surgery for complex cataract. CONCLUSIONS AND RELEVANCE A computer vision approach using deep neural networks was able to track the pupil, identify the surgical phase being executed, and activate surgical guidance tools. These results suggest that an artificial intelligence-based surgical guidance platform has the potential to enhance the surgeon experience in phacoemulsification cataract surgery. This proof-of-concept investigation suggests that a pipeline from a surgical microscope could be integrated with neural networks and computer vision tools to provide surgical guidance in real time.
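For reference, the Dice score reported above can be computed on a binary pupil mask as follows; this is a generic reference computation, not the authors' evaluation code, and the mask shapes are arbitrary examples.

```python
# The Dice score is the harmonic mean of precision and recall on the
# segmented pupil pixels; minimal reference computation (illustrative only).
import numpy as np

def dice_score(pred_mask: np.ndarray, gt_mask: np.ndarray, eps: float = 1e-7) -> float:
    pred = pred_mask.astype(bool)
    gt = gt_mask.astype(bool)
    intersection = np.logical_and(pred, gt).sum()
    return (2.0 * intersection + eps) / (pred.sum() + gt.sum() + eps)

pred = np.zeros((128, 128)); pred[30:90, 30:90] = 1  # predicted pupil region
gt = np.zeros((128, 128)); gt[40:100, 40:100] = 1    # annotated pupil region
print(round(dice_score(pred, gt), 3))
```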
Collapse
Affiliation(s)
- Rogerio Garcia Nespolo
- Department of Ophthalmology and Visual Sciences, University of Illinois at Chicago, Chicago; Richard and Loan Hill Department of Biomedical Engineering, University of Illinois at Chicago, Chicago
| | - Darvin Yi
- Department of Ophthalmology and Visual Sciences, University of Illinois at Chicago, Chicago; Richard and Loan Hill Department of Biomedical Engineering, University of Illinois at Chicago, Chicago
| | - Emily Cole
- Department of Ophthalmology and Visual Sciences, University of Illinois at Chicago, Chicago
| | - Nita Valikodath
- Department of Ophthalmology and Visual Sciences, University of Illinois at Chicago, Chicago
| | - Cristian Luciano
- Richard and Loan Hill Department of Biomedical Engineering, University of Illinois at Chicago, Chicago
| | - Yannek I. Leiderman
- Department of Ophthalmology and Visual Sciences, University of Illinois at Chicago, Chicago; Richard and Loan Hill Department of Biomedical Engineering, University of Illinois at Chicago, Chicago
| |
Collapse
|
24
|
Edwards PJE, Psychogyios D, Speidel S, Maier-Hein L, Stoyanov D. SERV-CT: A disparity dataset from cone-beam CT for validation of endoscopic 3D reconstruction. Med Image Anal 2022; 76:102302. [PMID: 34906918 PMCID: PMC8961000 DOI: 10.1016/j.media.2021.102302] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/18/2020] [Revised: 11/01/2021] [Accepted: 11/04/2021] [Indexed: 11/27/2022]
Abstract
In computer vision, reference datasets from simulation and real outdoor scenes have been highly successful in promoting algorithmic development in stereo reconstruction. Endoscopic stereo reconstruction for surgical scenes gives rise to specific problems, including the lack of clear corner features, highly specular surface properties and the presence of blood and smoke. These issues present difficulties for both stereo reconstruction itself and also for standardised dataset production. Previous datasets have been produced using computed tomography (CT) or structured light reconstruction on phantom or ex vivo models. We present a stereo-endoscopic reconstruction validation dataset based on cone-beam CT (SERV-CT). Two ex vivo small porcine full torso cadavers were placed within the view of the endoscope with both the endoscope and target anatomy visible in the CT scan. Subsequent orientation of the endoscope was manually aligned to match the stereoscopic view, and benchmark disparities, depths and occlusions were calculated. The requirement of a CT scan limited the number of stereo pairs to 8 from each ex vivo sample. For the second sample an RGB surface was acquired to aid alignment of smooth, featureless surfaces. Repeated manual alignments showed an RMS disparity accuracy of around 2 pixels and a depth accuracy of about 2 mm. A simplified reference dataset is provided consisting of endoscope image pairs with corresponding calibration, disparities, depths and occlusions covering the majority of the endoscopic image and a range of tissue types, including smooth specular surfaces, as well as significant variation of depth. We assessed the performance of various stereo algorithms from publicly available online repositories. There is significant variation between algorithms, highlighting some of the challenges of surgical endoscopic images. The SERV-CT dataset provides an easy-to-use stereoscopic validation for surgical applications with smooth reference disparities and depths covering the majority of the endoscopic image. This complements existing resources well and we hope will aid the development of surgical endoscopic anatomical reconstruction algorithms.
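A minimal sketch of the disparity evaluation described above, restricted to valid (non-occluded) pixels, is given below; the array shapes and validity convention are assumptions for illustration, not the released benchmark script.

```python
# Illustrative sketch (assumed convention, not the benchmark code): RMS error
# of an estimated disparity map against the reference, over valid pixels only.
import numpy as np

def rms_disparity_error(estimated, reference, valid_mask):
    diff = (estimated - reference)[valid_mask]
    return float(np.sqrt(np.mean(diff ** 2)))

H, W = 288, 360
reference = np.random.uniform(10, 60, (H, W))           # benchmark disparities
estimated = reference + np.random.normal(0, 2, (H, W))  # a stereo algorithm's output
valid = np.ones((H, W), dtype=bool)                     # occluded pixels would be False
print(round(rms_disparity_error(estimated, reference, valid), 2))  # ~2 px
```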
Collapse
Affiliation(s)
- P J Eddie Edwards
- Wellcome/EPSRC Centre for Interventional and Surgical Sciences (WEISS), University College London (UCL), Charles Bell House, 43-45 Foley Street, London W1W 7TS, UK.
| | - Dimitris Psychogyios
- Wellcome/EPSRC Centre for Interventional and Surgical Sciences (WEISS), University College London (UCL), Charles Bell House, 43-45 Foley Street, London W1W 7TS, UK
| | - Stefanie Speidel
- Division of Translational Surgical Oncology, National Center for Tumor Diseases (NCT) Dresden, Dresden, 01307, Germany
| | - Lena Maier-Hein
- Division of Medical and Biological Informatics, German Cancer Research Center (DKFZ), Heidelberg, Germany
| | - Danail Stoyanov
- Wellcome/EPSRC Centre for Interventional and Surgical Sciences (WEISS), University College London (UCL), Charles Bell House, 43-45 Foley Street, London W1W 7TS, UK
| |
Collapse
|
25
|
Tognetto D, Giglio R, Vinciguerra AL, Milan S, Rejdak R, Rejdak M, Zaluska-Ogryzek K, Zweifel S, Toro MD. Artificial intelligence applications and cataract management: A systematic review. Surv Ophthalmol 2021; 67:817-829. [PMID: 34606818 DOI: 10.1016/j.survophthal.2021.09.004] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/09/2021] [Revised: 09/27/2021] [Accepted: 09/27/2021] [Indexed: 11/26/2022]
Abstract
Artificial intelligence (AI)-based applications exhibit the potential to improve the quality and efficiency of patient care in different fields, including cataract management. A systematic review of the different applications of AI-based software on all aspects of a cataract patient's management, from diagnosis to follow-up, was carried out in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines. All selected articles were analyzed to assess the level of evidence according to the Oxford Centre for Evidence-Based Medicine 2011 guidelines, and the quality of evidence according to the Grading of Recommendations Assessment, Development and Evaluation system. Of the articles analyzed, 49 met the inclusion criteria. No data synthesis was possible owing to the heterogeneity of the available data and the design of the available studies. AI-driven diagnosis seemed comparable to, and in selected cases even exceeded, the accuracy of experienced clinicians in classifying disease, supporting operating room scheduling, and managing intraoperative and postoperative complications. Considering the heterogeneity of the data analyzed, however, further randomized controlled trials to assess the efficacy and safety of AI applications in cataract management are highly warranted.
Collapse
Affiliation(s)
- Daniele Tognetto
- Eye Clinic, Department of Medicine, Surgery and Health Sciences, University of Trieste, Trieste, Italy
| | - Rosa Giglio
- Eye Clinic, Department of Medicine, Surgery and Health Sciences, University of Trieste, Trieste, Italy.
| | - Alex Lucia Vinciguerra
- Eye Clinic, Department of Medicine, Surgery and Health Sciences, University of Trieste, Trieste, Italy
| | - Serena Milan
- Eye Clinic, Department of Medicine, Surgery and Health Sciences, University of Trieste, Trieste, Italy
| | - Robert Rejdak
- Chair and Department of General and Pediatric Ophthalmology, Medical University of Lublin, Lublin, Poland
| | | | | | | | - Mario Damiano Toro
- Department of Ophthalmology, University of Zurich, Zurich, Switzerland; Department of Medical Sciences, Collegium Medicum, Cardinal Stefan Wyszyński University, Warsaw, Poland
| |
Collapse
|
26
|
Abstract
PURPOSE OF REVIEW Artificial intelligence and deep learning have become important tools in extracting data from ophthalmic surgery to evaluate, teach, and aid the surgeon in all phases of surgical management. The purpose of this review is to highlight the ever-increasing intersection of computer vision, machine learning, and ophthalmic microsurgery. RECENT FINDINGS Deep learning algorithms are being applied to help evaluate and teach surgical trainees. Artificial intelligence tools are improving real-time surgical instrument tracking, phase segmentation, as well as enhancing the safety of robotic-assisted vitreoretinal surgery. SUMMARY Similar to strides appreciated in ophthalmic medical disease, artificial intelligence will continue to become an important part of surgical management of ocular conditions. Machine learning applications will help push the boundaries of what surgeons can accomplish to improve patient outcomes.
Collapse
Affiliation(s)
- Kapil Mishra
- Department of Ophthalmology, Byers Eye Institute at Stanford, Stanford University School of Medicine, Palo Alto, California, USA
| | | |
Collapse
|
27
|
Rampat R, Deshmukh R, Chen X, Ting DSW, Said DG, Dua HS, Ting DSJ. Artificial Intelligence in Cornea, Refractive Surgery, and Cataract: Basic Principles, Clinical Applications, and Future Directions. Asia Pac J Ophthalmol (Phila) 2021; 10:268-281. [PMID: 34224467 PMCID: PMC7611495 DOI: 10.1097/apo.0000000000000394] [Citation(s) in RCA: 34] [Impact Index Per Article: 11.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/25/2022] Open
Abstract
Corneal diseases, uncorrected refractive errors, and cataract represent the major causes of blindness globally. The number of refractive surgeries, either cornea- or lens-based, is also on the rise as the demand for perfect vision continues to increase. With the recent advancement and potential promises of artificial intelligence (AI) technologies demonstrated in the realm of ophthalmology, particularly retinal diseases and glaucoma, AI researchers and clinicians are now channeling their focus toward the less explored ophthalmic areas related to the anterior segment of the eye. Conditions that rely on anterior segment imaging modalities, including slit-lamp photography, anterior segment optical coherence tomography, corneal tomography, in vivo confocal microscopy and/or optical biometers, are the most commonly explored areas. These include infectious keratitis, keratoconus, corneal grafts, ocular surface pathologies, preoperative screening before refractive surgery, intraocular lens calculation, and automated refraction, among others. In this review, we aimed to provide a comprehensive update on the utilization of AI in anterior segment diseases, with particular emphasis on advances in the past few years. In addition, we demystify some of the basic principles and terminologies related to AI, particularly machine learning and deep learning, to help improve the understanding, research and clinical implementation of these AI technologies among ophthalmologists and vision scientists. As we march toward the era of digital health, guidelines such as CONSORT-AI, SPIRIT-AI, and STARD-AI will play crucial roles in guiding and standardizing the conduct and reporting of AI-related trials, ultimately promoting their potential for clinical translation.
Collapse
Affiliation(s)
| | - Rashmi Deshmukh
- Department of Ophthalmology, Cambridge University Hospitals NHS Foundation Trust, Cambridge, UK
| | - Xin Chen
- School of Computer Science, University of Nottingham, Nottingham, UK
| | - Daniel S. W. Ting
- Duke-NUS Medical School, National University of Singapore, Singapore
- Singapore National Eye Centre / Singapore Eye Research Institute, Singapore
| | - Dalia G. Said
- Academic Ophthalmology, Division of Clinical Neuroscience, School of Medicine, University of Nottingham, Nottingham, UK
- Department of Ophthalmology, Queen’s Medical Centre, Nottingham, UK
| | - Harminder S. Dua
- Academic Ophthalmology, Division of Clinical Neuroscience, School of Medicine, University of Nottingham, Nottingham, UK
- Department of Ophthalmology, Queen’s Medical Centre, Nottingham, UK
| | - Darren S. J. Ting
- Singapore National Eye Centre / Singapore Eye Research Institute, Singapore
- Academic Ophthalmology, Division of Clinical Neuroscience, School of Medicine, University of Nottingham, Nottingham, UK
- Department of Ophthalmology, Queen’s Medical Centre, Nottingham, UK
| |
Collapse
|
28
|
Grammatikopoulou M, Flouty E, Kadkhodamohammadi A, Quellec G, Chow A, Nehme J, Luengo I, Stoyanov D. CaDIS: Cataract dataset for surgical RGB-image segmentation. Med Image Anal 2021; 71:102053. [PMID: 33864969 DOI: 10.1016/j.media.2021.102053] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/28/2020] [Revised: 03/17/2021] [Accepted: 03/23/2021] [Indexed: 01/02/2023]
Abstract
Video feedback provides a wealth of information about surgical procedures and is the main sensory cue for surgeons. Scene understanding is crucial to computer assisted interventions (CAI) and to post-operative analysis of the surgical procedure. A fundamental building block of such capabilities is the identification and localization of surgical instruments and anatomical structures through semantic segmentation. Deep learning has advanced semantic segmentation techniques in recent years but is inherently reliant on the availability of labelled datasets for model training. This paper introduces a dataset for semantic segmentation of cataract surgery videos complementing the publicly available CATARACTS challenge dataset. In addition, we benchmark the performance of several state-of-the-art deep learning models for semantic segmentation on the presented dataset. The dataset is publicly available at https://cataracts-semantic-segmentation2020.grand-challenge.org/.
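Benchmarks of this kind are commonly summarized with mean intersection-over-union (mIoU) across classes; the snippet below is a generic reference computation under that assumption, not the evaluation code accompanying the dataset.

```python
# Generic mIoU computation for multi-class segmentation maps (illustrative
# assumption about the reported metric, not the dataset's evaluation script).
import numpy as np

def mean_iou(pred: np.ndarray, gt: np.ndarray, num_classes: int) -> float:
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        if union > 0:                      # skip classes absent from both maps
            ious.append(inter / union)
    return float(np.mean(ious))

gt = np.random.randint(0, 8, (256, 256))   # ground-truth label map (8 classes)
pred = gt.copy(); pred[:64] = 0            # imperfect prediction
print(round(mean_iou(pred, gt, 8), 3))
```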
Collapse
Affiliation(s)
| | | | | | | | - Andre Chow
- Digital Surgery LTD, 230 City Road, London, EC1V 2QY, UK
| | - Jean Nehme
- Digital Surgery LTD, 230 City Road, London, EC1V 2QY, UK
| | - Imanol Luengo
- Digital Surgery LTD, 230 City Road, London, EC1V 2QY, UK
| | - Danail Stoyanov
- Digital Surgery LTD, 230 City Road, London, EC1V 2QY, UK; Wellcome/EPSRC Centre for Interventional and Surgical Sciences, University College London, Gower Street, London, WC1E 6BT, UK
| |
Collapse
|
29
|
Alnafisee N, Zafar S, Vedula SS, Sikder S. Current methods for assessing technical skill in cataract surgery. J Cataract Refract Surg 2021; 47:256-264. [PMID: 32675650 DOI: 10.1097/j.jcrs.0000000000000322] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/05/2020] [Accepted: 06/19/2020] [Indexed: 12/18/2022]
Abstract
Surgery is a major source of errors in patient care. Preventing complications from surgical errors in the operating room is estimated to lead to a reduction of up to 41,846 readmissions and save $620.3 million per year. It is now established that poor technical skill is associated with an increased risk of severe adverse events postoperatively, and traditional models for training surgeons are being challenged by rapid advances in technology, an intensified patient-safety culture, and a need for value-driven health systems. This review discusses the current methods available for evaluating technical skills in cataract surgery and the recent technological advancements that have enabled the capture and analysis of large amounts of complex surgical data for more automated, objective skills assessment.
Collapse
Affiliation(s)
- Nouf Alnafisee
- From The Wilmer Eye Institute, Johns Hopkins University School of Medicine (Alnafisee, Zafar, Sikder), Baltimore, and the Department of Computer Science, Malone Center for Engineering in Healthcare, The Johns Hopkins University Whiting School of Engineering (Vedula), Baltimore, Maryland, USA
| | | | | | | |
Collapse
|
30
|
Ward TM, Mascagni P, Ban Y, Rosman G, Padoy N, Meireles O, Hashimoto DA. Computer vision in surgery. Surgery 2020; 169:1253-1256. [PMID: 33272610 DOI: 10.1016/j.surg.2020.10.039] [Citation(s) in RCA: 56] [Impact Index Per Article: 14.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/06/2020] [Revised: 10/09/2020] [Accepted: 10/10/2020] [Indexed: 12/17/2022]
Abstract
The fields of computer vision (CV) and artificial intelligence (AI) have undergone rapid advancements in the past decade, many of which have been applied to the analysis of intraoperative video. These advances are driven by the widespread application of deep learning, which leverages multiple layers of neural networks to teach computers complex tasks. Prior to these advances, applications of AI in the operating room were limited by our relative inability to train computers to accurately understand images with traditional machine learning (ML) techniques. The development and refinement of deep neural networks that can now accurately identify objects in images and remember past surgical events has sparked a surge in the application of CV to analyze intraoperative video and has allowed for the accurate identification of surgical phases (steps) and instruments across a variety of procedures. In some cases, CV can even identify operative phases with accuracy similar to that of surgeons. Future research will likely expand on this foundation of surgical knowledge using larger video datasets and improved algorithms with greater accuracy and interpretability to create clinically useful AI models that gain widespread adoption and augment the surgeon's ability to provide safer care for patients everywhere.
Collapse
Affiliation(s)
- Thomas M Ward
- Surgical Artificial Intelligence and Innovation Laboratory, Massachusetts General Hospital, Harvard Medical School, Boston, MA
| | - Pietro Mascagni
- ICube, University of Strasbourg, CNRS, IHU Strasbourg, France; Fondazione Policlinico Universitario A. Gemelli IRCCS, Rome, Italy
| | - Yutong Ban
- Surgical Artificial Intelligence and Innovation Laboratory, Massachusetts General Hospital, Harvard Medical School, Boston, MA; Distributed Robotics Laboratory, Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA
| | - Guy Rosman
- Surgical Artificial Intelligence and Innovation Laboratory, Massachusetts General Hospital, Harvard Medical School, Boston, MA; Distributed Robotics Laboratory, Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA
| | - Nicolas Padoy
- ICube, University of Strasbourg, CNRS, IHU Strasbourg, France
| | - Ozanan Meireles
- Surgical Artificial Intelligence and Innovation Laboratory, Massachusetts General Hospital, Harvard Medical School, Boston, MA
| | - Daniel A Hashimoto
- Surgical Artificial Intelligence and Innovation Laboratory, Massachusetts General Hospital, Harvard Medical School, Boston, MA.
| |
Collapse
|
31
|
Bakshi SK, Lin SR, Ting DSW, Chiang MF, Chodosh J. The era of artificial intelligence and virtual reality: transforming surgical education in ophthalmology. Br J Ophthalmol 2020; 105:1325-1328. [DOI: 10.1136/bjophthalmol-2020-316845] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/06/2020] [Revised: 07/06/2020] [Accepted: 07/08/2020] [Indexed: 12/11/2022]
Abstract
Training the modern ophthalmic surgeon is a challenging process. Microsurgical education can benefit from innovative methods to practice surgery in low-risk simulations, assess and refine skills in the operating room through video content analytics, and learn at a distance from experienced surgeons. Developments in emerging technologies may allow us to pursue novel forms of instruction and build on current educational models. Artificial intelligence, which has already seen numerous applications in ophthalmology, may be used to facilitate surgical tracking and evaluation. Within immersive technology, growth in the space of virtual reality head-mounted displays has created intriguing possibilities for operating room simulation and observation. Here, we explore the applications of these technologies and comment on their future in ophthalmic surgical education.
Collapse
|
32
|
Goh JHL, Lim ZW, Fang X, Anees A, Nusinovici S, Rim TH, Cheng CY, Tham YC. Artificial Intelligence for Cataract Detection and Management. Asia Pac J Ophthalmol (Phila) 2020; 9:88-95. [PMID: 32349116 DOI: 10.1097/01.apo.0000656988.16221.04] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/13/2022] Open
Abstract
The rising popularity of artificial intelligence (AI) in ophthalmology is fuelled by the ever-increasing clinical "big data" that can be used for algorithm development. Cataract is one of the leading causes of visual impairment worldwide. However, compared with other major age-related eye diseases, such as diabetic retinopathy, age-related macular degeneration, and glaucoma, AI development in the domain of cataract is still relatively underexplored. In this regard, several previous studies explored algorithms for automated cataract assessment using either slit-lamp or color fundus photographs, while several other study groups proposed or derived new AI-based calculations for pre-cataract-surgery intraocular lens power. Along with advancements in the digitization of clinical data, data curation for future cataract-related AI developmental work is bound to undergo significant improvements in the foreseeable future. Even though most of these previous studies reported promising early performances, limitations such as a lack of robust, high-quality training data and a lack of external validation remain. In the next phase of work, apart from algorithm performance, it will also be pertinent to evaluate the deployment angles, feasibility, efficiency, and cost-effectiveness of these new cataract-related AI systems.
Collapse
Affiliation(s)
- Jocelyn Hui Lin Goh
- Singapore Eye Research Institute, Singapore National Eye Centre, Singapore
- School of Chemical and Biomedical Engineering, Division of Bioengineering, Nanyang Technological University, Singapore
| | - Zhi Wei Lim
- Singapore Eye Research Institute, Singapore National Eye Centre, Singapore
- Faculty of Medicine, University of New South Wales, Sydney, Australia
| | - Xiaoling Fang
- Singapore Eye Research Institute, Singapore National Eye Centre, Singapore
- Department of Ophthalmology, Shanghai Eye Disease Prevention & Treatment Center, Shanghai Eye Hospital, Shanghai, China
| | - Ayesha Anees
- Institute of High Performance Computing, A∗STAR, Singapore
| | - Simon Nusinovici
- Singapore Eye Research Institute, Singapore National Eye Centre, Singapore
| | - Tyler Hyungtaek Rim
- Singapore Eye Research Institute, Singapore National Eye Centre, Singapore
- Duke-NUS Medical School, Singapore
| | - Ching-Yu Cheng
- Singapore Eye Research Institute, Singapore National Eye Centre, Singapore
- Duke-NUS Medical School, Singapore
- Department of Ophthalmology, Yong Loo Lin School of Medicine, National University of Singapore and National University Health System, Singapore
| | - Yih-Chung Tham
- Singapore Eye Research Institute, Singapore National Eye Centre, Singapore
- Duke-NUS Medical School, Singapore
| |
Collapse
|
33
|
Assisted phase and step annotation for surgical videos. Int J Comput Assist Radiol Surg 2020; 15:673-680. [PMID: 32040704 DOI: 10.1007/s11548-019-02108-8] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/02/2019] [Accepted: 12/17/2019] [Indexed: 12/21/2022]
Abstract
PURPOSE Annotation of surgical videos is a time-consuming task which requires specific knowledge. In this paper, we present and evaluate a deep learning-based method that includes pre-annotation of the phases and steps in surgical videos and user assistance in the annotation process. METHODS We propose a classification function that automatically detects errors and infers temporal coherence in predictions made by a convolutional neural network. First, we trained three different architectures of neural networks to assess the method on two surgical procedures: cholecystectomy and cataract surgery. The proposed method was then implemented in an annotation software to test its ability to assist surgical video annotation. A user study was conducted to validate our approach, in which participants had to annotate the phases and the steps of a cataract surgery video. The annotation and the completion time were recorded. RESULTS The participants who used the assistance system were 7% more accurate on the step annotation and 10 min faster than the participants who used the manual system. The results of the questionnaire showed that the assistance system did not disturb the participants and did not complicate the task. CONCLUSION The annotation process is a difficult and time-consuming task essential to train deep learning algorithms. In this publication, we propose a method to assist the annotation of surgical workflows which was validated through a user study. The proposed assistance system significantly improved annotation duration and accuracy.
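One simple way to realize the error detection and temporal coherence described above is to flag implausibly short phase segments in the pre-annotation for the annotator to review; the heuristic and threshold below are assumptions for illustration, not the published method.

```python
# Minimal sketch (assumed heuristic, not the paper's classification function):
# very short segments in per-frame phase predictions are treated as likely
# errors and surfaced to the annotator for correction.
def flag_short_segments(frame_phases, min_len=30):
    """Return (start, end, phase) tuples for segments shorter than min_len frames."""
    segments, start = [], 0
    for i in range(1, len(frame_phases) + 1):
        if i == len(frame_phases) or frame_phases[i] != frame_phases[start]:
            segments.append((start, i, frame_phases[start]))
            start = i
    return [seg for seg in segments if seg[1] - seg[0] < min_len]

phases = ["idle"] * 100 + ["rhexis"] * 3 + ["idle"] * 50 + ["phaco"] * 200
print(flag_short_segments(phases))  # [(100, 103, 'rhexis')] -> offer a correction
```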
Collapse
|
34
|
Czempiel T, Paschali M, Keicher M, Simson W, Feussner H, Kim ST, Navab N. TeCNO: Surgical Phase Recognition with Multi-stage Temporal Convolutional Networks. Medical Image Computing and Computer Assisted Intervention – MICCAI 2020; 2020. [DOI: 10.1007/978-3-030-59716-0_33] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/23/2023]
|