1. Carciumaru TZ, Tang CM, Farsi M, Bramer WM, Dankelman J, Raman C, Dirven CMF, Gholinejad M, Vasilic D. Systematic review of machine learning applications using nonoptical motion tracking in surgery. NPJ Digit Med 2025; 8:28. PMID: 39809851; PMCID: PMC11733004; DOI: 10.1038/s41746-024-01412-1.
Abstract
This systematic review explores machine learning (ML) applications in surgical motion analysis using non-optical motion tracking systems (NOMTS), alone or combined with optical methods. It investigates objectives, experimental designs, model effectiveness, and future research directions. From 3632 records, 84 studies were included, with artificial neural networks (38%) and support vector machines (11%) being the most common ML models. Skill assessment was the primary objective (38%). The NOMTS used included internal device kinematics (56%), electromagnetic (17%), inertial (15%), mechanical (11%), and electromyography (1%) sensors. Surgical settings were robotic (60%), laparoscopic (18%), open (16%), and others (6%). Procedures focused on bench-top tasks (67%), clinical models (17%), clinical simulations (9%), and non-clinical simulations (7%). Over 90% accuracy was achieved in 36% of studies. The literature shows that NOMTS and ML can enhance surgical precision, assessment, and training. Future research should advance ML in surgical environments, ensure model interpretability and reproducibility, and use larger datasets for accurate evaluation.
Affiliation(s)
- Teona Z Carciumaru
- Department of Plastic and Reconstructive Surgery, Erasmus MC University Medical Center, Rotterdam, the Netherlands
- Department of Neurosurgery, Erasmus MC University Medical Center, Rotterdam, the Netherlands
- Cadey M Tang
- Department of Plastic and Reconstructive Surgery, Erasmus MC University Medical Center, Rotterdam, the Netherlands
- Mohsen Farsi
- Department of Plastic and Reconstructive Surgery, Erasmus MC University Medical Center, Rotterdam, the Netherlands
- Wichor M Bramer
- Medical Library, Erasmus MC University Medical Center, Rotterdam, the Netherlands
- Jenny Dankelman
- Department of Biomechanical Engineering, Delft University of Technology, Delft, the Netherlands
- Chirag Raman
- Department of Pattern Recognition and Bioinformatics, Delft University of Technology, Delft, the Netherlands
- Clemens M F Dirven
- Department of Neurosurgery, Erasmus MC University Medical Center, Rotterdam, the Netherlands
- Maryam Gholinejad
- Department of Plastic and Reconstructive Surgery, Erasmus MC University Medical Center, Rotterdam, the Netherlands
- Department of Biomechanical Engineering, Delft University of Technology, Delft, the Netherlands
- Dalibor Vasilic
- Department of Plastic and Reconstructive Surgery, Erasmus MC University Medical Center, Rotterdam, the Netherlands
2. Chen K, Bandara DSV, Arata J. A real-time approach for surgical activity recognition and prediction based on transformer models in robot-assisted surgery. Int J Comput Assist Radiol Surg 2025. PMID: 39799528; DOI: 10.1007/s11548-024-03306-9.
Abstract
PURPOSE This paper presents a deep learning approach to recognize and predict surgical activity in robot-assisted minimally invasive surgery (RAMIS). Our primary objective is to deploy the developed model in a real-time surgical risk monitoring system for RAMIS. METHODS We propose a modified Transformer model whose architecture comprises no positional encoding, 5 fully connected layers, 1 encoder, and 3 decoders. The model is designed to address 3 primary tasks in surgical robotics: gesture recognition, gesture prediction, and end-effector trajectory prediction. Notably, it operates solely on kinematic data obtained from the joints of the robotic arm. RESULTS The model's performance was evaluated on the JHU-ISI Gesture and Skill Assessment Working Set (JIGSAWS) dataset, achieving a best accuracy of 94.4% for gesture recognition, 84.82% for gesture prediction, and a low distance error of 1.34 mm when predicting 1 s in advance. The computational time per iteration was minimal, at only 4.2 ms. CONCLUSION The results demonstrate that the proposed model compares favorably with previous studies, highlighting its potential for integration into real-time systems. We believe the model could significantly advance surgical activity recognition and prediction within RAMIS and make a substantial contribution to the healthcare sector.
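A minimal sketch of the general idea, transformer-based gesture classification over windows of kinematic samples, written in PyTorch. The layer sizes, the 15-dimensional kinematic input, and the 10 gesture classes are illustrative assumptions, not the authors' exact configuration (1 encoder, 3 decoders, no positional encoding).

```python
import torch
import torch.nn as nn

class KinematicGestureTransformer(nn.Module):
    """Classify surgical gestures from a window of kinematic samples.

    Illustrative only: dimensions and class count are assumptions, not the
    paper's trained architecture.
    """

    def __init__(self, n_features=15, n_classes=10, d_model=64, n_heads=4):
        super().__init__()
        self.embed = nn.Linear(n_features, d_model)   # per-timestep projection
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=1)
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, x):                 # x: (batch, time, n_features)
        h = self.encoder(self.embed(x))   # (batch, time, d_model)
        return self.head(h.mean(dim=1))   # pool over time -> class logits

model = KinematicGestureTransformer()
window = torch.randn(8, 100, 15)          # 8 windows of 100 kinematic samples
logits = model(window)                    # (8, 10) gesture scores
```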
Affiliation(s)
- Ketai Chen
- Advanced Medical Devices Laboratory, Kyushu University, Nishi-ku, Fukuoka, 819-0382, Japan
- D S V Bandara
- Advanced Medical Devices Laboratory, Kyushu University, Nishi-ku, Fukuoka, 819-0382, Japan
- Jumpei Arata
- Advanced Medical Devices Laboratory, Kyushu University, Nishi-ku, Fukuoka, 819-0382, Japan
3. Das A, Sidiqi B, Mennillo L, Mao Z, Brudfors M, Xochicale M, Khan DZ, Newall N, Hanrahan JG, Clarkson MJ, Stoyanov D, Marcus HJ, Bano S. Automated surgical skill assessment in endoscopic pituitary surgery using real-time instrument tracking on a high-fidelity bench-top phantom. Healthc Technol Lett 2024; 11:336-344. PMID: 39720762; PMCID: PMC11665785; DOI: 10.1049/htl2.12101.
Abstract
Improved surgical skill is generally associated with improved patient outcomes, but assessment is subjective, labour intensive, and requires domain-specific expertise. Automated data-driven metrics can alleviate these difficulties, as demonstrated by existing machine learning instrument tracking models. However, these models have been tested on limited datasets of laparoscopic surgery, with a focus on isolated tasks and robotic surgery. Here, a new public dataset is introduced: the nasal phase of simulated endoscopic pituitary surgery. Simulated surgery allows for a realistic yet repeatable environment, meaning the insights gained from automated assessment can be used by novice surgeons to hone their skills on the simulator before moving to real surgery. The Pituitary Real-time INstrument Tracking Network (PRINTNet) has been created as a baseline model for this automated assessment. Consisting of DeepLabV3 for classification and segmentation, StrongSORT for tracking, and NVIDIA Holoscan for real-time performance, PRINTNet achieved 71.9% multiple object tracking precision running at 22 frames per second. Using this tracking output, a multilayer perceptron achieved 87% accuracy in predicting surgical skill level (novice or expert), with the 'ratio of total procedure time to instrument visible time' correlated with higher surgical skill. The new publicly available dataset can be found at https://doi.org/10.5522/04/26511049.
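A minimal sketch of the final stage described here: a small multilayer perceptron that maps summary features derived from instrument tracking (for example, the ratio of total procedure time to instrument-visible time) to a binary novice/expert label. The feature set, layer sizes, and example values are illustrative assumptions, not the PRINTNet configuration.

```python
import torch
import torch.nn as nn

# Hypothetical per-video features derived from the tracking output, e.g.
# [time_ratio, mean_speed, path_length, n_visibility_gaps]; the exact feature
# set used in the paper is not reproduced here.
class SkillMLP(nn.Module):
    def __init__(self, n_features=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, 32), nn.ReLU(),
            nn.Linear(32, 2),                 # logits for novice vs. expert
        )

    def forward(self, x):
        return self.net(x)

features = torch.tensor([[1.8, 12.3, 540.0, 6.0]])  # one annotated video
print(SkillMLP()(features))                          # unnormalised class scores
```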
Affiliation(s)
- Adrito Das
- UCL Hawkes Institute, University College London, London, UK
- Bilal Sidiqi
- UCL Hawkes Institute, University College London, London, UK
- Zhehua Mao
- UCL Hawkes Institute, University College London, London, UK
- Miguel Xochicale
- UCL Hawkes Institute, University College London, London, UK
- School of Biomedical Engineering and Imaging Sciences, King's College London, London, UK
- Danyal Z. Khan
- UCL Hawkes Institute, University College London, London, UK
- Department of Neurosurgery, National Hospital for Neurology and Neurosurgery, London, UK
- Nicola Newall
- UCL Hawkes Institute, University College London, London, UK
- Department of Neurosurgery, National Hospital for Neurology and Neurosurgery, London, UK
- John G. Hanrahan
- UCL Hawkes Institute, University College London, London, UK
- Department of Neurosurgery, National Hospital for Neurology and Neurosurgery, London, UK
- Matthew J. Clarkson
- UCL Hawkes Institute, University College London, London, UK
- Department of Medical Physics and Biomedical Engineering, University College London, London, UK
- Hani J. Marcus
- UCL Hawkes Institute, University College London, London, UK
- Department of Neurosurgery, National Hospital for Neurology and Neurosurgery, London, UK
- Sophia Bano
- UCL Hawkes Institute, University College London, London, UK
4. Gorgy A, Xu HH, Hawary HE, Nepon H, Lee J, Vorstenbosch J. Integrating AI into Breast Reconstruction Surgery: Exploring Opportunities, Applications, and Challenges. Plast Surg (Oakv) 2024:22925503241292349. PMID: 39545210; PMCID: PMC11559540; DOI: 10.1177/22925503241292349.
Abstract
Background: Artificial intelligence (AI) has significantly influenced various sectors, including healthcare, by enhancing machine capabilities in assisting with human tasks. In surgical fields, where precision and timely decision-making are crucial, AI's integration could revolutionize clinical quality and health resource optimization. This study explores the current and future applications of AI technologies in reconstructive breast surgery, aiming for broader implementation. Methods: We conducted systematic reviews through PubMed, Web of Science, and Google Scholar using relevant keywords and MeSH terms. The focus was on the main AI subdisciplines: machine learning, computer vision, natural language processing, and robotics. This review includes studies discussing AI applications across preoperative, intraoperative, postoperative, and academic settings in breast plastic surgery. Results: AI is currently utilized preoperatively to predict surgical risks and outcomes, enhancing patient counseling and informed consent processes. During surgery, AI supports the identification of anatomical landmarks and dissection strategies and provides 3-dimensional visualizations. Robotic applications are promising for procedures like microsurgical anastomoses, flap harvesting, and dermal matrix anchoring. Postoperatively, AI predicts discharge times and customizes follow-up schedules, which improves resource allocation and patient management at home. Academically, AI offers personalized training feedback to surgical trainees and aids research in breast reconstruction. Despite these advancements, concerns regarding privacy, costs, and operational efficacy persist and are critically examined in this review. Conclusions: The application of AI in breast plastic and reconstructive surgery presents substantial benefits and diverse potentials. However, much remains to be explored and developed. This study aims to consolidate knowledge and encourage ongoing research and development within the field, thereby empowering the plastic surgery community to leverage AI technologies effectively and responsibly for advancing breast reconstruction surgery.
Affiliation(s)
- Andrew Gorgy
- Department of Plastic and Reconstructive Surgery, McGill University Health Center, Montreal, Quebec, Canada
- Hong Hao Xu
- Faculty of Medicine, Laval University, Quebec City, Quebec, Canada
- Hassan El Hawary
- Department of Plastic and Reconstructive Surgery, McGill University Health Center, Montreal, Quebec, Canada
- Hillary Nepon
- Department of Plastic and Reconstructive Surgery, McGill University Health Center, Montreal, Quebec, Canada
- James Lee
- Department of Plastic and Reconstructive Surgery, McGill University Health Center, Montreal, Quebec, Canada
- Joshua Vorstenbosch
- Department of Plastic and Reconstructive Surgery, McGill University Health Center, Montreal, Quebec, Canada
5. Silva C, Nascimento D, Dantas GG, Fonseca K, Hespanhol L, Rego A, Araújo-Filho I. Impact of artificial intelligence on the training of general surgeons of the future: a scoping review of the advances and challenges. Acta Cir Bras 2024; 39:e396224. PMID: 39319900; PMCID: PMC11414521; DOI: 10.1590/acb396224.
Abstract
PURPOSE To explore artificial intelligence's impact on surgical education, highlighting its advantages and challenges. METHODS A comprehensive search across databases such as PubMed, Scopus, Scientific Electronic Library Online (SciELO), Embase, Web of Science, and Google Scholar was conducted to compile relevant studies. RESULTS Artificial intelligence offers several advantages in surgical training. It enables highly realistic simulation environments for the safe practice of complex procedures. Artificial intelligence provides personalized real-time feedback, improving trainees' skills. It efficiently processes clinical data, enhancing diagnostics and surgical planning. Artificial intelligence-assisted surgeries promise precision and minimally invasive procedures. Challenges include data security, resistance to artificial intelligence adoption, and ethical considerations. CONCLUSIONS Stricter policies and regulatory compliance are needed for data privacy. Addressing surgeons' and educators' reluctance to embrace artificial intelligence is crucial. Integrating artificial intelligence into curricula and providing ongoing training are vital. Ethical, bioethical, and legal aspects surrounding artificial intelligence demand attention. Establishing clear ethical guidelines, ensuring transparency, and implementing supervision and accountability are essential. As artificial intelligence evolves in surgical training, research and development remain crucial. Future studies should explore artificial intelligence-driven personalized training and monitor ethical and legal regulations. In summary, artificial intelligence is shaping the future of general surgeons, offering advanced simulations, personalized feedback, and improved patient care. However, addressing data security, adoption resistance, and ethical concerns is vital. Adapting curricula and providing continuous training are essential to maximize artificial intelligence's potential, promoting ethical and safe surgery.
Affiliation(s)
- Caroliny Silva
- Universidade Federal do Rio Grande do Norte, General Surgery Department, Natal (RN), Brazil
- Daniel Nascimento
- Universidade Federal do Rio Grande do Norte, General Surgery Department, Natal (RN), Brazil
- Gabriela Gomes Dantas
- Universidade Federal do Rio Grande do Norte, General Surgery Department, Natal (RN), Brazil
- Karoline Fonseca
- Universidade Federal do Rio Grande do Norte, General Surgery Department, Natal (RN), Brazil
- Larissa Hespanhol
- Universidade Federal de Campina Grande, General Surgery Department, Campina Grande (PB), Brazil
- Amália Rego
- Liga Contra o Câncer, Institute of Teaching, Research, and Innovation, Natal (RN), Brazil
- Irami Araújo-Filho
- Universidade Federal do Rio Grande do Norte, General Surgery Department, Natal (RN), Brazil
6. Olsen RG, Svendsen MBS, Tolsgaard MG, Konge L, Røder A, Bjerrum F. Automated performance metrics and surgical gestures: two methods for assessment of technical skills in robotic surgery. J Robot Surg 2024; 18:297. PMID: 39068261; PMCID: PMC11283394; DOI: 10.1007/s11701-024-02051-0.
Abstract
The objective of this study was to compare automated performance metrics (APM) and surgical gestures for technical skills assessment during simulated robot-assisted radical prostatectomy (RARP). Ten novices and six experienced RARP surgeons performed simulated RARPs on the RobotiX Mentor (Surgical Science, Sweden). Simulator APM were automatically recorded, and surgical videos were manually annotated with five types of surgical gestures. The consequences of the pass/fail levels, which were set using the contrasting groups method, were compared for APM and surgical gestures. Intra-class correlation coefficient (ICC) analysis and a Bland-Altman plot were used to explore the correlation between APM and surgical gestures. Pass/fail levels for both APM and surgical gestures could fully distinguish between the skill levels of the surgeons, with a specificity and sensitivity of 100%. The overall ICC (one-way, random) was 0.70 (95% CI: 0.34-0.88), showing moderate agreement between the methods. The Bland-Altman plot showed high agreement between the two methods for experienced surgeons but disagreement on the novice surgeons' skill level. APM and surgical gestures could both fully distinguish between novices and experienced surgeons in a simulated setting. Both methods of analyzing technical skills have their advantages and disadvantages, and both are currently available only to a limited extent in the clinical setting. Developing assessment methods in a simulated setting enables testing before implementation in a clinical setting.
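A minimal sketch of the two agreement analyses named here, a one-way random-effects ICC and Bland-Altman limits of agreement, written with NumPy. The input arrays are placeholder scores, not the study data.

```python
import numpy as np

def icc_oneway(scores):
    """ICC(1,1): one-way random effects, single measurement.
    scores: (n_subjects, k_methods) array."""
    n, k = scores.shape
    row_means = scores.mean(axis=1)
    grand = scores.mean()
    msb = k * np.sum((row_means - grand) ** 2) / (n - 1)              # between subjects
    msw = np.sum((scores - row_means[:, None]) ** 2) / (n * (k - 1))  # within subjects
    return (msb - msw) / (msb + (k - 1) * msw)

def bland_altman(a, b):
    """Mean difference (bias) and 95% limits of agreement between two methods."""
    diff = a - b
    bias = diff.mean()
    half_width = 1.96 * diff.std(ddof=1)
    return bias, bias - half_width, bias + half_width

# Placeholder scores for 16 surgeons assessed by two methods (APM, gestures).
apm = np.array([0.62, 0.55, 0.48, 0.51, 0.44, 0.58, 0.60, 0.47,
                0.81, 0.85, 0.79, 0.88, 0.83, 0.90, 0.76, 0.84])
gestures = apm + np.random.default_rng(0).normal(0, 0.05, apm.size)
print(icc_oneway(np.column_stack([apm, gestures])))
print(bland_altman(apm, gestures))
```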
Affiliation(s)
- Rikke Groth Olsen
- Copenhagen Academy for Medical Education and Simulation (CAMES), Ryesgade 53B, 2100, Copenhagen, Denmark
- Department of Urology, Copenhagen Prostate Cancer Center, Copenhagen University Hospital-Rigshospitalet, Copenhagen, Denmark
- Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Morten Bo Søndergaard Svendsen
- Copenhagen Academy for Medical Education and Simulation (CAMES), Ryesgade 53B, 2100, Copenhagen, Denmark
- Department of Computer Science, University of Copenhagen, Copenhagen, Denmark
- Martin G Tolsgaard
- Copenhagen Academy for Medical Education and Simulation (CAMES), Ryesgade 53B, 2100, Copenhagen, Denmark
- Lars Konge
- Copenhagen Academy for Medical Education and Simulation (CAMES), Ryesgade 53B, 2100, Copenhagen, Denmark
- Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Andreas Røder
- Department of Urology, Copenhagen Prostate Cancer Center, Copenhagen University Hospital-Rigshospitalet, Copenhagen, Denmark
- Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Flemming Bjerrum
- Copenhagen Academy for Medical Education and Simulation (CAMES), Ryesgade 53B, 2100, Copenhagen, Denmark
- Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Gastrounit, Surgical Section, Copenhagen University Hospital-Amager and Hvidovre, Hvidovre, Denmark
7. Cui Z, Ma R, Yang CH, Malpani A, Chu TN, Ghazi A, Davis JW, Miles BJ, Lau C, Liu Y, Hung AJ. Capturing relationships between suturing sub-skills to improve automatic suturing assessment. NPJ Digit Med 2024; 7:152. PMID: 38862627; PMCID: PMC11167055; DOI: 10.1038/s41746-024-01143-3.
Abstract
Suturing skill scores have demonstrated strong predictive capabilities for patient functional recovery. Suturing can be broken down into several substep components, such as needle repositioning and needle entry angle. Artificial intelligence (AI) systems have been explored to automate suturing skill scoring. Traditional approaches to skill assessment typically evaluate the individual sub-skills required for particular substeps in isolation. However, surgical procedures require the integration and coordination of multiple sub-skills to achieve successful outcomes, and existing studies have established significant associations among the technical sub-skills. In this paper, we propose a framework for joint skill assessment that takes into account the interconnected nature of sub-skills required in surgery. Prior known relationships among sub-skills are first identified, and the proposed AI system then uses these relationships to score suturing skill for all sub-skill domains simultaneously. Our approach effectively improves skill assessment performance by exploiting the prior known relationships among sub-skills. Through this approach to joint skill assessment, we aspire to enhance the evaluation of surgical proficiency and ultimately improve patient outcomes in surgery.
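A minimal sketch of joint (multi-task) sub-skill scoring: a shared encoder with a separate scoring head per sub-skill, so that correlated sub-skills share a representation. How the paper encodes the prior known relationships between sub-skills is not reproduced; the feature dimensions, the three sub-skill names, and the three score levels are illustrative assumptions.

```python
import torch
import torch.nn as nn

SUB_SKILLS = ["needle_repositioning", "needle_entry_angle", "needle_driving"]

class JointSkillScorer(nn.Module):
    """Shared clip embedding with one scoring head per sub-skill domain."""

    def __init__(self, in_dim=128, n_levels=3):
        super().__init__()
        self.shared = nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU())
        self.heads = nn.ModuleDict(
            {name: nn.Linear(64, n_levels) for name in SUB_SKILLS})

    def forward(self, x):
        h = self.shared(x)
        return {name: head(h) for name, head in self.heads.items()}

clip_embedding = torch.randn(4, 128)          # e.g. pooled features of 4 clips
scores = JointSkillScorer()(clip_embedding)   # dict of per-sub-skill logits
loss = sum(nn.functional.cross_entropy(logits, torch.randint(0, 3, (4,)))
           for logits in scores.values())     # joint training objective
```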
Affiliation(s)
- Zijun Cui
- University of Southern California, Los Angeles, CA, USA
- Runzhuo Ma
- Department of Urology, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Cherine H Yang
- Department of Urology, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Timothy N Chu
- University of Southern California, Los Angeles, CA, USA
- Ahmed Ghazi
- Johns Hopkins University, Baltimore, MD, USA
- John W Davis
- University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Yan Liu
- University of Southern California, Los Angeles, CA, USA
- Andrew J Hung
- Department of Urology, Cedars-Sinai Medical Center, Los Angeles, CA, USA
8. Yamada Y, Colan J, Davila A, Hasegawa Y. Multimodal semi-supervised learning for online recognition of multi-granularity surgical workflows. Int J Comput Assist Radiol Surg 2024; 19:1075-1083. PMID: 38558289; PMCID: PMC11178653; DOI: 10.1007/s11548-024-03101-6.
Abstract
Purpose Surgical workflow recognition is a challenging task that requires understanding multiple aspects of surgery, such as gestures, phases, and steps. However, most existing methods focus on single-task or single-modal models and rely on costly annotations for training. To address these limitations, we propose a novel semi-supervised learning approach that leverages multimodal data and self-supervision to create meaningful representations for various surgical tasks. Methods Our representation learning approach consists of two stages. In the first stage, time contrastive learning is used to learn spatiotemporal visual features from video data, without any labels. In the second stage, a multimodal variational autoencoder (VAE) fuses the visual features with kinematic data to obtain a shared representation, which is fed into recurrent neural networks for online recognition. Results Our method is evaluated on two datasets: JIGSAWS and MISAW. We confirmed that it achieved comparable or better performance in multi-granularity workflow recognition compared to fully supervised models specialized for each task. On the JIGSAWS Suturing dataset, we achieve a gesture recognition accuracy of 83.3%. In addition, our model is more efficient in annotation usage, as it can maintain high performance with only half of the labels. On the MISAW dataset, we achieve 84.0% AD-Accuracy in phase recognition and 56.8% AD-Accuracy in step recognition. Conclusion Our multimodal representation exhibits versatility across various surgical tasks and enhances annotation efficiency. This work has significant implications for real-time decision-making systems within the operating room.
Affiliation(s)
- Yutaro Yamada
- Department of Micro-Nano Mechanical Science and Engineering, Nagoya University, Furo-cho, Chikusa-ku, Nagoya, Aichi, 464-8603, Japan
- Jacinto Colan
- Department of Micro-Nano Mechanical Science and Engineering, Nagoya University, Furo-cho, Chikusa-ku, Nagoya, Aichi, 464-8603, Japan
- Ana Davila
- Institutes of Innovation for Future Society, Nagoya University, Furo-cho, Chikusa-ku, Nagoya, Aichi, 464-8601, Japan
- Yasuhisa Hasegawa
- Department of Micro-Nano Mechanical Science and Engineering, Nagoya University, Furo-cho, Chikusa-ku, Nagoya, Aichi, 464-8603, Japan
9. Yang Y, Wang H, Wang J, Dong K, Ding S. Semantic-Preserving Surgical Video Retrieval With Phase and Behavior Coordinated Hashing. IEEE Trans Med Imaging 2024; 43:807-819. PMID: 37788194; DOI: 10.1109/tmi.2023.3321382.
Abstract
Medical professionals rely on surgical video retrieval to discover relevant content within large numbers of videos for surgical education and knowledge transfer. However, the existing retrieval techniques often fail to obtain user-expected results since they ignore valuable semantics in surgical videos. The incorporation of rich semantics into video retrieval is challenging in terms of the hierarchical relationship modeling and coordination between coarse- and fine-grained semantics. To address these issues, this paper proposes a novel semantic-preserving surgical video retrieval (SPSVR) framework, which incorporates surgical phase and behavior semantics using a dual-level hashing module to capture their hierarchical relationship. This module preserves the semantics in binary hash codes by transforming the phase and behavior similarities into high- and low-level similarities in a shared Hamming space. The binary codes are optimized by performing a reconstruction task, a high-level similarity preservation task, and a low-level similarity preservation task, using a coordinated optimization strategy for efficient learning. A self-supervised learning scheme is adopted to capture behavior semantics from video clips so that the indexing of behaviors is unencumbered by fine-grained annotation and recognition. Experiments on four surgical video datasets for two different disciplines demonstrate the robust performance of the proposed framework. In addition, the results of the clinical validation experiments indicate the ability of the proposed method to retrieve the results expected by surgeons. The code can be found at https://github.com/trigger26/SPSVR.
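A minimal sketch of the retrieval step that a hashing approach of this kind ends with: binarise clip embeddings into hash codes and rank a database by Hamming distance. The code-generation network itself is not reproduced; the embeddings below are random placeholders.

```python
import numpy as np

def to_hash(embeddings):
    """Sign-binarise real-valued embeddings into {0, 1} hash codes."""
    return (embeddings > 0).astype(np.uint8)

def hamming_rank(query_code, db_codes):
    """Return database indices sorted by Hamming distance to the query."""
    distances = np.count_nonzero(db_codes != query_code, axis=1)
    return np.argsort(distances), distances

rng = np.random.default_rng(0)
db = to_hash(rng.normal(size=(1000, 64)))   # 1000 clips, 64-bit codes
query = to_hash(rng.normal(size=(1, 64)))
order, dist = hamming_rank(query, db)
print(order[:5], dist[order[:5]])           # top-5 nearest clips
```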
10. Goodman ED, Patel KK, Zhang Y, Locke W, Kennedy CJ, Mehrotra R, Ren S, Guan M, Zohar O, Downing M, Chen HW, Clark JZ, Berrigan MT, Brat GA, Yeung-Levy S. Analyzing Surgical Technique in Diverse Open Surgical Videos With Multitask Machine Learning. JAMA Surg 2024; 159:185-192. PMID: 38055227; PMCID: PMC10701669; DOI: 10.1001/jamasurg.2023.6262.
Abstract
Objective To overcome limitations of open surgery artificial intelligence (AI) models by curating the largest collection of annotated videos and to leverage this AI-ready data set to develop a generalizable multitask AI model capable of real-time understanding of clinically significant surgical behaviors in prospectively collected real-world surgical videos. Design, Setting, and Participants The study team programmatically queried open surgery procedures on YouTube and manually annotated selected videos to create the AI-ready data set used to train a multitask AI model for 2 proof-of-concept studies, one generating surgical signatures that define the patterns of a given procedure and the other identifying kinematics of hand motion that correlate with surgeon skill level and experience. The Annotated Videos of Open Surgery (AVOS) data set includes 1997 videos from 23 open-surgical procedure types uploaded to YouTube from 50 countries over the last 15 years. Prospectively recorded surgical videos were collected from a single tertiary care academic medical center. Deidentified videos were recorded of surgeons performing open surgical procedures and analyzed for correlation with surgical training. Exposures The multitask AI model was trained on the AI-ready video data set and then retrospectively applied to the prospectively collected video data set. Main Outcomes and Measures Analysis of open surgical videos in near real-time, performance on AI-ready and prospectively collected videos, and quantification of surgeon skill. Results Using the AI-ready data set, the study team developed a multitask AI model capable of real-time understanding of surgical behaviors-the building blocks of procedural flow and surgeon skill-across space and time. Through principal component analysis, a single compound skill feature was identified, composed of a linear combination of kinematic hand attributes. This feature was a significant discriminator between experienced surgeons and surgical trainees across 101 prospectively collected surgical videos of 14 operators. For each unit increase in the compound feature value, the odds of the operator being an experienced surgeon were 3.6 times higher (95% CI, 1.67-7.62; P = .001). Conclusions and Relevance In this observational study, the AVOS-trained model was applied to analyze prospectively collected open surgical videos and identify kinematic descriptors of surgical skill related to efficiency of hand motion. The ability to provide AI-deduced insights into surgical structure and skill is valuable in optimizing surgical skill acquisition and ultimately improving surgical care.
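A minimal sketch of the statistical step described in the results: project kinematic hand attributes onto their first principal component to form a single compound feature, then relate that feature to surgeon experience with logistic regression, whose exponentiated coefficient is the odds ratio per unit increase. The synthetic data below are placeholders, not the AVOS measurements.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Placeholder kinematic hand attributes (e.g. speed, path efficiency, pauses)
# for 101 videos; 1 = experienced surgeon, 0 = trainee.
X = rng.normal(size=(101, 5))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(0, 1, 101) > 0).astype(int)

compound = PCA(n_components=1).fit_transform(X)      # single compound feature
model = LogisticRegression().fit(compound, y)
odds_ratio = np.exp(model.coef_[0, 0])               # odds ratio per unit increase
print(f"odds ratio per unit of compound feature: {odds_ratio:.2f}")
```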
Affiliation(s)
- Emmett D. Goodman
- Department of Computer Science, Stanford University, Stanford, California
- Department of Biomedical Data Science, Stanford University, Stanford, California
- Krishna K. Patel
- Department of Computer Science, Stanford University, Stanford, California
- Department of Biomedical Data Science, Stanford University, Stanford, California
- Yilun Zhang
- Department of Surgery, Beth Israel Deaconess Medical Center, Boston, Massachusetts
- William Locke
- Department of Computer Science, Stanford University, Stanford, California
- Department of Biomedical Data Science, Stanford University, Stanford, California
- Chris J. Kennedy
- Department of Surgery, Beth Israel Deaconess Medical Center, Boston, Massachusetts
- Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts
- Rohan Mehrotra
- Department of Computer Science, Stanford University, Stanford, California
- Department of Biomedical Data Science, Stanford University, Stanford, California
- Stephen Ren
- Department of Computer Science, Stanford University, Stanford, California
- Department of Biomedical Data Science, Stanford University, Stanford, California
- Melody Guan
- Department of Computer Science, Stanford University, Stanford, California
- Department of Biomedical Data Science, Stanford University, Stanford, California
- Orr Zohar
- Department of Biomedical Data Science, Stanford University, Stanford, California
- Department of Electrical Engineering, Stanford University, Stanford, California
- Maren Downing
- Department of Surgery, Beth Israel Deaconess Medical Center, Boston, Massachusetts
- Hao Wei Chen
- Department of Surgery, Beth Israel Deaconess Medical Center, Boston, Massachusetts
- Jevin Z. Clark
- Department of Surgery, Beth Israel Deaconess Medical Center, Boston, Massachusetts
- Margaret T. Berrigan
- Department of Surgery, Beth Israel Deaconess Medical Center, Boston, Massachusetts
- Gabriel A. Brat
- Department of Surgery, Beth Israel Deaconess Medical Center, Boston, Massachusetts
- Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts
- Serena Yeung-Levy
- Department of Computer Science, Stanford University, Stanford, California
- Department of Biomedical Data Science, Stanford University, Stanford, California
- Department of Electrical Engineering, Stanford University, Stanford, California
- Clinical Excellence Research Center, Stanford University School of Medicine, Stanford, California
11. Olsen RG, Svendsen MBS, Tolsgaard MG, Konge L, Røder A, Bjerrum F. Surgical gestures can be used to assess surgical competence in robot-assisted surgery: a validity investigating study of simulated RARP. J Robot Surg 2024; 18:47. PMID: 38244130; PMCID: PMC10799775; DOI: 10.1007/s11701-023-01807-4.
Abstract
To collect validity evidence for the assessment of surgical competence through the classification of general surgical gestures for a simulated robot-assisted radical prostatectomy (RARP), we used 165 video recordings of novice and experienced RARP surgeons performing three parts of the RARP procedure on the RobotiX Mentor. We annotated the surgical tasks with five types of surgical gestures: dissection, hemostatic control, application of clips, needle handling, and suturing. The gestures were analyzed using idle time (periods with minimal instrument movement) and active time (whenever a surgical gesture was annotated). The distribution of surgical gestures was described using a one-dimensional heat map ('snail tracks'). All surgeons had a similar percentage of idle time, but novices had longer phases of idle time (mean time: 21 vs. 15 s, p < 0.001). Novices used a higher total number of surgical gestures (number of phases: 45 vs. 35, p < 0.001), and each phase was longer compared with those of the experienced surgeons (mean time: 10 vs. 8 s, p < 0.001). Novices and experienced surgeons also showed different gesture patterns, as seen in the differing distribution of the phases. General surgical gestures can be used to assess surgical competence in simulated RARP and can be displayed as a visual tool to show how performance is improving. The established pass/fail level may be used to ensure the competence of residents before proceeding to supervised real-life surgery. The next step is to investigate whether the developed tool can optimize automated feedback during simulator training.
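A minimal sketch of how idle time (periods with minimal instrument movement) can be derived from instrument kinematics: threshold the instantaneous speed and measure the lengths of the below-threshold runs. The sampling rate and speed threshold are illustrative assumptions, not the study's values.

```python
import numpy as np

def idle_phases(positions, fs=30.0, speed_threshold=5.0):
    """Return durations (s) of idle phases from instrument tip positions.

    positions: (T, 3) array in mm sampled at fs Hz; a sample counts as idle
    when instantaneous speed falls below speed_threshold (mm/s).
    """
    speed = np.linalg.norm(np.diff(positions, axis=0), axis=1) * fs
    idle = speed < speed_threshold
    durations, run = [], 0
    for flag in idle:
        if flag:
            run += 1
        elif run:
            durations.append(run / fs)
            run = 0
    if run:
        durations.append(run / fs)
    return np.array(durations)

track = np.cumsum(np.random.default_rng(0).normal(0, 0.2, (900, 3)), axis=0)
phases = idle_phases(track)
print(len(phases), phases.mean() if len(phases) else 0.0)
```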
Affiliation(s)
- Rikke Groth Olsen
- Copenhagen Academy for Medical Education and Simulation (CAMES), Center for HR & Education, The Capital Region of Denmark, Ryesgade 53B, 2100, Copenhagen, Denmark
- Department of Urology, Copenhagen Prostate Cancer Center, Copenhagen University Hospital - Rigshospitalet, Copenhagen, Denmark
- Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Morten Bo Søndergaard Svendsen
- Copenhagen Academy for Medical Education and Simulation (CAMES), Center for HR & Education, The Capital Region of Denmark, Ryesgade 53B, 2100, Copenhagen, Denmark
- Department of Computer Science, University of Copenhagen, Copenhagen, Denmark
- Martin G Tolsgaard
- Copenhagen Academy for Medical Education and Simulation (CAMES), Center for HR & Education, The Capital Region of Denmark, Ryesgade 53B, 2100, Copenhagen, Denmark
- Lars Konge
- Copenhagen Academy for Medical Education and Simulation (CAMES), Center for HR & Education, The Capital Region of Denmark, Ryesgade 53B, 2100, Copenhagen, Denmark
- Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Andreas Røder
- Department of Urology, Copenhagen Prostate Cancer Center, Copenhagen University Hospital - Rigshospitalet, Copenhagen, Denmark
- Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Flemming Bjerrum
- Copenhagen Academy for Medical Education and Simulation (CAMES), Center for HR & Education, The Capital Region of Denmark, Ryesgade 53B, 2100, Copenhagen, Denmark
- Department of Gastrointestinal and Hepatic Diseases, Copenhagen University Hospital - Herlev and Gentofte, Herlev, Denmark
12. Boal MWE, Anastasiou D, Tesfai F, Ghamrawi W, Mazomenos E, Curtis N, Collins JW, Sridhar A, Kelly J, Stoyanov D, Francis NK. Evaluation of objective tools and artificial intelligence in robotic surgery technical skills assessment: a systematic review. Br J Surg 2024; 111:znad331. PMID: 37951600; PMCID: PMC10771126; DOI: 10.1093/bjs/znad331.
Abstract
BACKGROUND There is a need to standardize training in robotic surgery, including objective assessment for accreditation. This systematic review aimed to identify objective tools for technical skills assessment, providing evaluation statuses to guide research and inform implementation into training curricula. METHODS A systematic literature search was conducted in accordance with the PRISMA guidelines. Ovid Embase/Medline, PubMed and Web of Science were searched. Inclusion criterion: robotic surgery technical skills tools. Exclusion criteria: non-technical skills, or laparoscopic or open skills only. Manual tools and automated performance metrics (APMs) were analysed using Messick's concept of validity and the Oxford Centre for Evidence-Based Medicine (OCEBM) Levels of Evidence and Recommendation (LoR). A bespoke tool analysed artificial intelligence (AI) studies. The Modified Downs-Black checklist was used to assess risk of bias. RESULTS Two hundred and forty-seven studies were analysed, identifying: 8 global rating scales, 26 procedure-/task-specific tools, 3 main error-based methods, 10 simulators, 28 studies analysing APMs and 53 AI studies. The Global Evaluative Assessment of Robotic Skills and the da Vinci Skills Simulator were the most evaluated tools at LoR 1 (OCEBM). Three procedure-specific tools, 3 error-based methods and 1 non-simulator APM reached LoR 2. AI models estimated outcomes (skill or clinical), with 60 per cent of methods reporting accuracies over 90 per cent in the laboratory, compared with accuracies ranging from 67 to 100 per cent in real surgery. CONCLUSIONS Manual and automated assessment tools for robotic surgery are not well validated and require further evaluation before use in accreditation processes. PROSPERO registration ID: CRD42022304901.
Affiliation(s)
- Matthew W E Boal
- The Griffin Institute, Northwick Park & St Marks’ Hospital, London, UK
- Wellcome/EPSRC Centre for Interventional Surgical Sciences (WEISS), University College London (UCL), London, UK
- Division of Surgery and Interventional Science, Research Department of Targeted Intervention, UCL, London, UK
- Dimitrios Anastasiou
- Wellcome/EPSRC Centre for Interventional Surgical Sciences (WEISS), University College London (UCL), London, UK
- Medical Physics and Biomedical Engineering, UCL, London, UK
- Freweini Tesfai
- The Griffin Institute, Northwick Park & St Marks’ Hospital, London, UK
- Wellcome/EPSRC Centre for Interventional Surgical Sciences (WEISS), University College London (UCL), London, UK
- Walaa Ghamrawi
- The Griffin Institute, Northwick Park & St Marks’ Hospital, London, UK
- Evangelos Mazomenos
- Wellcome/EPSRC Centre for Interventional Surgical Sciences (WEISS), University College London (UCL), London, UK
- Medical Physics and Biomedical Engineering, UCL, London, UK
- Nathan Curtis
- Department of General Surgery, Dorset County Hospital NHS Foundation Trust, Dorchester, UK
- Justin W Collins
- Division of Surgery and Interventional Science, Research Department of Targeted Intervention, UCL, London, UK
- University College London Hospitals NHS Foundation Trust, London, UK
- Ashwin Sridhar
- Division of Surgery and Interventional Science, Research Department of Targeted Intervention, UCL, London, UK
- University College London Hospitals NHS Foundation Trust, London, UK
- John Kelly
- Division of Surgery and Interventional Science, Research Department of Targeted Intervention, UCL, London, UK
- University College London Hospitals NHS Foundation Trust, London, UK
- Danail Stoyanov
- Wellcome/EPSRC Centre for Interventional Surgical Sciences (WEISS), University College London (UCL), London, UK
- Computer Science, UCL, London, UK
- Nader K Francis
- The Griffin Institute, Northwick Park & St Marks’ Hospital, London, UK
- Division of Surgery and Interventional Science, Research Department of Targeted Intervention, UCL, London, UK
- Yeovil District Hospital, Somerset Foundation NHS Trust, Yeovil, Somerset, UK
13. Kulik D, Bell CR, Holden MS. FAST skill assessment from kinematics data using convolutional neural networks. Int J Comput Assist Radiol Surg 2024; 19:43-49. PMID: 37093528; DOI: 10.1007/s11548-023-02908-z.
Abstract
PURPOSE FAST is a point-of-care ultrasound study that evaluates for the presence of free fluid, typically hemoperitoneum, in trauma patients. FAST is an essential skill for emergency physicians and therefore requires objective evaluation tools that can reduce the need for direct observation in proficiency assessment. In this work, we use deep neural networks to automatically assess operators' FAST skills. METHODS We propose a deep convolutional neural network for FAST proficiency assessment based on motion data. Prior work has shown that operators demonstrate different domain-specific dexterity metrics that can distinguish novices, intermediates, and experts. We therefore augment our dataset with this domain knowledge and employ fine-tuning to improve the model's classification capabilities. Our model, however, does not require specific points of interest (POIs) to be defined for scanning. RESULTS The results show that the proposed deep convolutional neural network can classify FAST proficiency with 87.5% accuracy and sensitivities of 0.884, 0.886, and 0.247 for novices, intermediates, and experts, respectively. This demonstrates the potential of using kinematics data as an input for FAST skill assessment tasks. We also show that the proposed domain-specific features and region fine-tuning increase the model's classification accuracy and sensitivity. CONCLUSIONS Variations in probe motion at different learning stages can be derived from kinematics data. These variations can be used for automatic and objective skill assessment without prior identification of clinical POIs. The proposed approach can improve the quality and objectivity of FAST proficiency evaluation. Furthermore, skill assessment combining ultrasound images and kinematics data can provide a more rigorous and diversified evaluation than using ultrasound images alone.
Affiliation(s)
- Daniil Kulik
- School of Computer Science, Carleton University, 1125 Colonel By Dr, Ottawa, K1S 5B6, ON, Canada
- Colin R Bell
- Department of Emergency Medicine and Cumming School of Medicine, University of Calgary, 3330 Hospital Dr NW, Calgary, AB, T2N 4N1, Canada
- Matthew S Holden
- School of Computer Science, Carleton University, 1125 Colonel By Dr, Ottawa, K1S 5B6, ON, Canada
14. Hegde SR, Namazi B, Iyengar N, Cao S, Desir A, Marques C, Mahnken H, Dumas RP, Sankaranarayanan G. Automated segmentation of phases, steps, and tasks in laparoscopic cholecystectomy using deep learning. Surg Endosc 2024; 38:158-170. PMID: 37945709; DOI: 10.1007/s00464-023-10482-3.
Abstract
BACKGROUND Video-based review is paramount for operative performance assessment but can be laborious when performed manually. Hierarchical Task Analysis (HTA) is a well-known method that divides any procedure into phases, steps, and tasks, and it requires large datasets of videos with consistent definitions at each level. Our aim was to develop an AI model for automated segmentation of phases, steps, and tasks in laparoscopic cholecystectomy videos using a standardized HTA. METHODS A total of 160 laparoscopic cholecystectomy videos were collected from the publicly available Cholec80 dataset and from our own institution. All videos were annotated for the beginning and end of a predefined set of phases, steps, and tasks. Deep learning models were then separately developed and trained for the three levels using a 3D convolutional neural network architecture. RESULTS Four phases, eight steps, and nineteen tasks were defined through expert consensus. The training set for our deep learning models contained 100 videos, with an additional 20 videos for hyperparameter optimization and tuning. The remaining 40 videos were used for testing. The overall accuracies for phases, steps, and tasks were 0.90, 0.81, and 0.65, with average F1 scores of 0.86, 0.76, and 0.48, respectively. The control of bleeding and bile spillage tasks were the most variable in definition, operative management, and clinical relevance. CONCLUSION The use of hierarchical task analysis for surgical video analysis has numerous applications in AI-based automated systems. Our results show that this tiered method of task analysis can successfully be used to train a deep learning model.
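A minimal sketch of a 3D-CNN video classifier of the kind described (clip in, phase label out). The channel sizes, clip length, and the four phase classes are illustrative assumptions, not the authors' trained model.

```python
import torch
import torch.nn as nn

class PhaseClassifier3D(nn.Module):
    """Tiny 3D CNN: video clip (C, T, H, W) -> phase logits. Illustrative only."""

    def __init__(self, n_phases=4):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d(2),
            nn.Conv3d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),            # global spatiotemporal pooling
        )
        self.classifier = nn.Linear(32, n_phases)

    def forward(self, clip):                    # clip: (batch, 3, T, H, W)
        return self.classifier(self.features(clip).flatten(1))

clip = torch.randn(2, 3, 16, 112, 112)          # two 16-frame clips
print(PhaseClassifier3D()(clip).shape)          # torch.Size([2, 4])
```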
Affiliation(s)
- Shruti R Hegde
- Department of Surgery, University of Texas Southwestern Medical Center, 5323 Harry Hines Blvd., Dallas, TX, 75390-9159, USA
- Babak Namazi
- Department of Surgery, University of Texas Southwestern Medical Center, 5323 Harry Hines Blvd., Dallas, TX, 75390-9159, USA
- Niyenth Iyengar
- Department of Surgery, University of Texas Southwestern Medical Center, 5323 Harry Hines Blvd., Dallas, TX, 75390-9159, USA
- Sarah Cao
- Department of Surgery, University of Texas Southwestern Medical Center, 5323 Harry Hines Blvd., Dallas, TX, 75390-9159, USA
- Alexis Desir
- Department of Surgery, University of Texas Southwestern Medical Center, 5323 Harry Hines Blvd., Dallas, TX, 75390-9159, USA
- Carolina Marques
- Department of Surgery, University of Texas Southwestern Medical Center, 5323 Harry Hines Blvd., Dallas, TX, 75390-9159, USA
- Heidi Mahnken
- Department of Surgery, University of Texas Southwestern Medical Center, 5323 Harry Hines Blvd., Dallas, TX, 75390-9159, USA
- Ryan P Dumas
- Department of Surgery, University of Texas Southwestern Medical Center, 5323 Harry Hines Blvd., Dallas, TX, 75390-9159, USA
- Ganesh Sankaranarayanan
- Department of Surgery, University of Texas Southwestern Medical Center, 5323 Harry Hines Blvd., Dallas, TX, 75390-9159, USA
15. Daneshgar Rahbar M, Mousavi Mojab SZ. Enhanced U-Net with GridMask (EUGNet): A Novel Approach for Robotic Surgical Tool Segmentation. J Imaging 2023; 9:282. PMID: 38132700; PMCID: PMC10744415; DOI: 10.3390/jimaging9120282.
Abstract
This study introduces an enhanced U-Net with GridMask (EUGNet), which incorporates GridMask image augmentation, a pixel-manipulation technique, to address U-Net's limitations. EUGNet features a deep contextual encoder, residual connections, class-balancing loss, adaptive feature fusion, a GridMask augmentation module, efficient implementation, and multi-modal fusion. These innovations enhance segmentation accuracy and robustness, making it well suited for medical image analysis. The GridMask algorithm is detailed, demonstrating its distinct approach to pixel elimination, which enhances model adaptability to occlusions and local features. A comprehensive dataset of robotic surgical scenarios and instruments is used for evaluation, showcasing the framework's robustness. Specifically, there are improvements of 1.6 percentage points in balanced accuracy for the foreground, 1.7 points in intersection over union (IoU), and 1.7 points in mean Dice similarity coefficient (DSC). Inference speed, a critical factor in real-time applications, also improved markedly, decreasing from 0.163 ms for the U-Net without GridMask to 0.097 ms for the U-Net with GridMask.
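A minimal sketch of the GridMask idea referred to here: zero out square blocks of pixels arranged on a regular grid before feeding the frame to the network. The grid period, block size, and offset are illustrative parameters, not the paper's settings.

```python
import numpy as np

def gridmask(image, d=32, ratio=0.5, offset=(0, 0)):
    """Zero out square blocks arranged on a regular grid of period d.

    ratio controls the block edge relative to d; offset shifts the grid.
    Parameter values are illustrative defaults.
    """
    h, w = image.shape[:2]
    mask = np.ones((h, w), dtype=image.dtype)
    block = int(d * ratio)
    for y in range(offset[0], h, d):
        for x in range(offset[1], w, d):
            mask[y:y + block, x:x + block] = 0
    return image * mask[..., None] if image.ndim == 3 else image * mask

frame = np.random.default_rng(0).random((256, 320, 3))
augmented = gridmask(frame, d=48, ratio=0.4)
print(augmented.shape, float(augmented.mean()) < float(frame.mean()))
```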
Affiliation(s)
- Mostafa Daneshgar Rahbar
- Department of Electrical and Computer Engineering, Lawrence Technological University, Southfield, MI 48075, USA
16. Chen G, Li L, Hubert J, Luo B, Yang K, Wang X. Effectiveness of a vision-based handle trajectory monitoring system in studying robotic suture operation. J Robot Surg 2023; 17:2791-2798. PMID: 37728690; DOI: 10.1007/s11701-023-01713-9.
Abstract
Data on surgical robots are not openly accessible, limiting further study of the operating trajectory of surgeons' hands. A trajectory monitoring system is therefore needed to examine objective indicators reflecting the characteristic parameters of operations. Twenty robotic experts and 20 first-year residents without robotic experience were included in this study. A dry-lab suture task was used to acquire hand performance data. Novices completed training on the simulator and then performed the task, while the expert group completed the task after a warm-up. Stitching errors were measured using a visual recognition method. Videos of the operations were obtained using the camera array mounted on the robot, and the hand trajectories of the surgeons were reconstructed. Stitching accuracy, robotic control parameters, balance and dexterity parameters, and operation efficiency parameters were compared. Experts had a smaller center distance (p < 0.001) and a larger proximal distance between the hands (p < 0.001) compared with novices. The path and volume ratios between the left and right hands of novices were larger than those of experts (both p < 0.001), and the total volume of the operating range of experts was smaller (p < 0.001). The surgeon trajectory optical monitoring system is an effective and objective method to distinguish skill differences. This demonstrates its potential for pan-platform use to evaluate task completion and help surgeons improve their robotic learning curve.
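A minimal sketch of two of the trajectory metrics compared here, path length per hand (for the left/right path ratio) and the volume of each hand's operating range, computed from reconstructed trajectories. The axis-aligned bounding-box volume is a simple stand-in for whatever volume definition the study used, and the trajectories below are random placeholders.

```python
import numpy as np

def path_length(traj):
    """Total distance travelled by one hand; traj is a (T, 3) array in mm."""
    return float(np.sum(np.linalg.norm(np.diff(traj, axis=0), axis=1)))

def bounding_volume(traj):
    """Axis-aligned bounding-box volume of the hand's operating range (mm^3)."""
    extent = traj.max(axis=0) - traj.min(axis=0)
    return float(np.prod(extent))

rng = np.random.default_rng(0)
left = np.cumsum(rng.normal(0, 1.0, (600, 3)), axis=0)
right = np.cumsum(rng.normal(0, 0.6, (600, 3)), axis=0)
print("path ratio (L/R):", path_length(left) / path_length(right))
print("volume ratio (L/R):", bounding_volume(left) / bounding_volume(right))
```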
Affiliation(s)
- Gaojie Chen
- Department of Urology, ZhongNan Hospital, Wuhan University, No. 169 Donghu Road, Wuhan, 430071, Hubei, China
- Medicine-Remote Mapping Associated Laboratory, ZhongNan Hospital, Wuhan University, No. 169 Donghu Road, Wuhan, 430071, Hubei, China
- Lu Li
- Department of Urology, ZhongNan Hospital, Wuhan University, No. 169 Donghu Road, Wuhan, 430071, Hubei, China
- Medicine-Remote Mapping Associated Laboratory, ZhongNan Hospital, Wuhan University, No. 169 Donghu Road, Wuhan, 430071, Hubei, China
- Jacques Hubert
- Department of Urology, CHRU Nancy Brabois University Hospital, Vandoeuvre-Lès-Nancy, France
- IADI-UL-INSERM (U1254), University Hospital, Vandoeuvre-Lès-Nancy, France
- Bin Luo
- Medicine-Remote Mapping Associated Laboratory, ZhongNan Hospital, Wuhan University, No. 169 Donghu Road, Wuhan, 430071, Hubei, China
- State Key Laboratory of Information Engineering in Surveying, Mapping, and Remote Sensing, Wuhan University, Wuhan, Hubei, China
- Kun Yang
- Department of Urology, ZhongNan Hospital, Wuhan University, No. 169 Donghu Road, Wuhan, 430071, Hubei, China
- Medicine-Remote Mapping Associated Laboratory, ZhongNan Hospital, Wuhan University, No. 169 Donghu Road, Wuhan, 430071, Hubei, China
- Xinghuan Wang
- Department of Urology, ZhongNan Hospital, Wuhan University, No. 169 Donghu Road, Wuhan, 430071, Hubei, China
- Medicine-Remote Mapping Associated Laboratory, ZhongNan Hospital, Wuhan University, No. 169 Donghu Road, Wuhan, 430071, Hubei, China
17. Hutchinson K, Reyes I, Li Z, Alemzadeh H. COMPASS: a formal framework and aggregate dataset for generalized surgical procedure modeling. Int J Comput Assist Radiol Surg 2023; 18:2143-2154. PMID: 37145250; DOI: 10.1007/s11548-023-02922-1.
Abstract
PURPOSE We propose a formal framework for the modeling and segmentation of minimally invasive surgical tasks using a unified set of motion primitives (MPs) to enable more objective labeling and the aggregation of different datasets. METHODS We model dry-lab surgical tasks as finite state machines, representing how the execution of MPs as the basic surgical actions results in the change of surgical context, which characterizes the physical interactions among tools and objects in the surgical environment. We develop methods for labeling surgical context based on video data and for automatic translation of context to MP labels. We then use our framework to create the COntext and Motion Primitive Aggregate Surgical Set (COMPASS), including six dry-lab surgical tasks from three publicly available datasets (JIGSAWS, DESK, and ROSMA), with kinematic and video data and context and MP labels. RESULTS Our context labeling method achieves near-perfect agreement between consensus labels from crowd-sourcing and expert surgeons. Segmentation of tasks to MPs results in the creation of the COMPASS dataset that nearly triples the amount of data for modeling and analysis and enables the generation of separate transcripts for the left and right tools. CONCLUSION The proposed framework results in high quality labeling of surgical data based on context and fine-grained MPs. Modeling surgical tasks with MPs enables the aggregation of different datasets and the separate analysis of left and right hands for bimanual coordination assessment. Our formal framework and aggregate dataset can support the development of explainable and multi-granularity models for improved surgical process analysis, skill assessment, error detection, and autonomy.
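A minimal sketch of the paper's core idea of modelling a task as a finite state machine over surgical context, where executing a motion primitive (MP) maps the current context to a new one. The states and transitions below are invented toy values, not the COMPASS definitions.

```python
# Toy finite state machine: (surgical context, motion primitive) -> new context.
# States and MPs are illustrative, not the COMPASS labels.
TRANSITIONS = {
    ("needle_free", "grasp(needle)"): "needle_held",
    ("needle_held", "touch(needle, fabric)"): "needle_at_entry",
    ("needle_at_entry", "push(needle, fabric)"): "needle_through",
    ("needle_through", "pull(needle)"): "needle_free",
}

def run_task(start, motion_primitives):
    """Replay a sequence of MPs, returning the visited contexts."""
    context, trace = start, [start]
    for mp in motion_primitives:
        context = TRANSITIONS.get((context, mp), context)  # ignore invalid MPs
        trace.append(context)
    return trace

print(run_task("needle_free",
               ["grasp(needle)", "touch(needle, fabric)",
                "push(needle, fabric)", "pull(needle)"]))
```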
Affiliation(s)
- Kay Hutchinson
- Department of Electrical and Computer Engineering, University of Virginia, Charlottesville, VA, 22903, USA
- Ian Reyes
- Department of Computer Science, University of Virginia, Charlottesville, VA, 22903, USA
- IBM, RTP, Durham, NC, 27709, USA
- Zongyu Li
- Department of Electrical and Computer Engineering, University of Virginia, Charlottesville, VA, 22903, USA
- Homa Alemzadeh
- Department of Electrical and Computer Engineering, University of Virginia, Charlottesville, VA, 22903, USA
- Department of Computer Science, University of Virginia, Charlottesville, VA, 22903, USA
18. Xu J, Anastasiou D, Booker J, Burton OE, Layard Horsfall H, Salvadores Fernandez C, Xue Y, Stoyanov D, Tiwari MK, Marcus HJ, Mazomenos EB. A Deep Learning Approach to Classify Surgical Skill in Microsurgery Using Force Data from a Novel Sensorised Surgical Glove. Sensors (Basel) 2023; 23:8947. PMID: 37960645; PMCID: PMC10650455; DOI: 10.3390/s23218947.
Abstract
Microsurgery serves as the foundation for numerous operative procedures. Given its highly technical nature, the assessment of surgical skill becomes an essential component of clinical practice and microsurgery education. The interaction forces between surgical tools and tissues play a pivotal role in surgical success, making them a valuable indicator of surgical skill. In this study, we employ six distinct deep learning architectures (LSTM, GRU, Bi-LSTM, CLDNN, TCN, Transformer) specifically designed for the classification of surgical skill levels. We use force data obtained from a novel sensorized surgical glove utilized during a microsurgical task. To enhance the performance of our models, we propose six data augmentation techniques. The proposed frameworks are accompanied by a comprehensive analysis, both quantitative and qualitative, including experiments conducted with two cross-validation schemes and interpretable visualizations of the network's decision-making process. Our experimental results show that CLDNN and TCN are the top-performing models, achieving impressive accuracy rates of 96.16% and 97.45%, respectively. This not only underscores the effectiveness of our proposed architectures, but also serves as compelling evidence that the force data obtained through the sensorized surgical glove contains valuable information regarding surgical skill.
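The abstract does not specify the six augmentation techniques, but a minimal sketch of two generic time-series augmentations often applied to 1-D force signals (Gaussian jitter and amplitude scaling) might look like the following; the array shapes and parameter values are illustrative assumptions, not the paper's settings.

```python
# Generic 1-D time-series augmentations (illustrative only; not necessarily
# the six techniques proposed in the paper).
import numpy as np

def jitter(x: np.ndarray, sigma: float = 0.01) -> np.ndarray:
    """Add zero-mean Gaussian noise to a (timesteps, channels) force signal."""
    return x + np.random.normal(0.0, sigma, size=x.shape)

def scale(x: np.ndarray, low: float = 0.9, high: float = 1.1) -> np.ndarray:
    """Multiply each channel by a random factor, simulating force-magnitude drift."""
    factors = np.random.uniform(low, high, size=(1, x.shape[1]))
    return x * factors

if __name__ == "__main__":
    signal = np.random.rand(500, 3)        # 500 samples x 3 force axes (dummy data)
    augmented = scale(jitter(signal))
    print(augmented.shape)                 # (500, 3)
```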
Collapse
Affiliation(s)
- Jialang Xu
- Wellcome/EPSRC Centre for Interventional and Surgical Sciences, University College London, London W1W 7TY, UK; (J.X.); (D.A.); (J.B.); (O.E.B.); (H.L.H.); (C.S.F.); (Y.X.); (D.S.); (M.K.T.); (H.J.M.)
- Department of Medical Physics and Biomedical Engineering, University College London, London WC1E 6BT, UK
| | - Dimitrios Anastasiou
- Wellcome/EPSRC Centre for Interventional and Surgical Sciences, University College London, London W1W 7TY, UK; (J.X.); (D.A.); (J.B.); (O.E.B.); (H.L.H.); (C.S.F.); (Y.X.); (D.S.); (M.K.T.); (H.J.M.)
- Department of Medical Physics and Biomedical Engineering, University College London, London WC1E 6BT, UK
| | - James Booker
- Wellcome/EPSRC Centre for Interventional and Surgical Sciences, University College London, London W1W 7TY, UK; (J.X.); (D.A.); (J.B.); (O.E.B.); (H.L.H.); (C.S.F.); (Y.X.); (D.S.); (M.K.T.); (H.J.M.)
- Victor Horsley Department of Neurosurgery, National Hospital for Neurology and Neurosurgery, London WC1N 3BG, UK
| | - Oliver E. Burton
- Wellcome/EPSRC Centre for Interventional and Surgical Sciences, University College London, London W1W 7TY, UK; (J.X.); (D.A.); (J.B.); (O.E.B.); (H.L.H.); (C.S.F.); (Y.X.); (D.S.); (M.K.T.); (H.J.M.)
- Victor Horsley Department of Neurosurgery, National Hospital for Neurology and Neurosurgery, London WC1N 3BG, UK
| | - Hugo Layard Horsfall
- Wellcome/EPSRC Centre for Interventional and Surgical Sciences, University College London, London W1W 7TY, UK; (J.X.); (D.A.); (J.B.); (O.E.B.); (H.L.H.); (C.S.F.); (Y.X.); (D.S.); (M.K.T.); (H.J.M.)
- Victor Horsley Department of Neurosurgery, National Hospital for Neurology and Neurosurgery, London WC1N 3BG, UK
| | - Carmen Salvadores Fernandez
- Wellcome/EPSRC Centre for Interventional and Surgical Sciences, University College London, London W1W 7TY, UK; (J.X.); (D.A.); (J.B.); (O.E.B.); (H.L.H.); (C.S.F.); (Y.X.); (D.S.); (M.K.T.); (H.J.M.)
- Nanoengineered Systems Laboratory, UCL Mechanical Engineering, University College London, London WC1E 7JE, UK
| | - Yang Xue
- Wellcome/EPSRC Centre for Interventional and Surgical Sciences, University College London, London W1W 7TY, UK; (J.X.); (D.A.); (J.B.); (O.E.B.); (H.L.H.); (C.S.F.); (Y.X.); (D.S.); (M.K.T.); (H.J.M.)
- Nanoengineered Systems Laboratory, UCL Mechanical Engineering, University College London, London WC1E 7JE, UK
| | - Danail Stoyanov
- Wellcome/EPSRC Centre for Interventional and Surgical Sciences, University College London, London W1W 7TY, UK; (J.X.); (D.A.); (J.B.); (O.E.B.); (H.L.H.); (C.S.F.); (Y.X.); (D.S.); (M.K.T.); (H.J.M.)
- Department of Computer Science, University College London, London WC1E 6BT, UK
| | - Manish K. Tiwari
- Wellcome/EPSRC Centre for Interventional and Surgical Sciences, University College London, London W1W 7TY, UK; (J.X.); (D.A.); (J.B.); (O.E.B.); (H.L.H.); (C.S.F.); (Y.X.); (D.S.); (M.K.T.); (H.J.M.)
- Nanoengineered Systems Laboratory, UCL Mechanical Engineering, University College London, London WC1E 7JE, UK
| | - Hani J. Marcus
- Wellcome/EPSRC Centre for Interventional and Surgical Sciences, University College London, London W1W 7TY, UK; (J.X.); (D.A.); (J.B.); (O.E.B.); (H.L.H.); (C.S.F.); (Y.X.); (D.S.); (M.K.T.); (H.J.M.)
- Victor Horsley Department of Neurosurgery, National Hospital for Neurology and Neurosurgery, London WC1N 3BG, UK
| | - Evangelos B. Mazomenos
- Wellcome/EPSRC Centre for Interventional and Surgical Sciences, University College London, London W1W 7TY, UK; (J.X.); (D.A.); (J.B.); (O.E.B.); (H.L.H.); (C.S.F.); (Y.X.); (D.S.); (M.K.T.); (H.J.M.)
- Department of Medical Physics and Biomedical Engineering, University College London, London WC1E 6BT, UK
| |
Collapse
|
19
|
Ortenzi M, Rapoport Ferman J, Antolin A, Bar O, Zohar M, Perry O, Asselmann D, Wolf T. A novel high accuracy model for automatic surgical workflow recognition using artificial intelligence in laparoscopic totally extraperitoneal inguinal hernia repair (TEP). Surg Endosc 2023; 37:8818-8828. [PMID: 37626236 PMCID: PMC10615930 DOI: 10.1007/s00464-023-10375-5] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/03/2023] [Accepted: 07/30/2023] [Indexed: 08/27/2023]
Abstract
INTRODUCTION Artificial intelligence and computer vision are revolutionizing the way we perceive video analysis in minimally invasive surgery. This emerging technology has increasingly been leveraged successfully for video segmentation, documentation, education, and formative assessment. New, sophisticated platforms allow pre-determined segments chosen by surgeons to be automatically presented without the need to review entire videos. This study aimed to validate and demonstrate the accuracy of the first reported AI-based computer vision algorithm that automatically recognizes surgical steps in videos of totally extraperitoneal (TEP) inguinal hernia repair. METHODS Videos of TEP procedures were manually labeled by a team of annotators trained to identify and label surgical workflow according to six major steps. For bilateral hernias, an additional change of focus step was also included. The videos were then used to train a computer vision AI algorithm. Performance accuracy was assessed in comparison to the manual annotations. RESULTS A total of 619 full-length TEP videos were analyzed: 371 were used to train the model, 93 for internal validation, and the remaining 155 as a test set to evaluate algorithm accuracy. The overall accuracy for the complete procedure was 88.8%. Per-step accuracy reached the highest value for the hernia sac reduction step (94.3%) and the lowest for the preperitoneal dissection step (72.2%). CONCLUSIONS These results indicate that the novel AI model was able to provide fully automated video analysis with a high accuracy level. High-accuracy models leveraging AI to enable automation of surgical video analysis allow us to identify and monitor surgical performance, providing mathematical metrics that can be stored, evaluated, and compared. As such, the proposed model is capable of enabling data-driven insights to improve surgical quality and demonstrate best practices in TEP procedures.
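A simple way to reproduce the kind of per-step accuracy reported here is to compare frame-level model predictions against manual annotations, one step at a time; the sketch below assumes integer-coded step labels and is not the authors' evaluation code.

```python
# Frame-level per-step accuracy against manual annotations (illustrative).
import numpy as np

def per_step_accuracy(y_true: np.ndarray, y_pred: np.ndarray) -> dict[int, float]:
    """For each annotated step, fraction of its frames predicted correctly."""
    return {
        int(step): float(np.mean(y_pred[y_true == step] == step))
        for step in np.unique(y_true)
    }

if __name__ == "__main__":
    y_true = np.array([0, 0, 1, 1, 1, 2, 2, 2, 2])   # dummy ground-truth steps
    y_pred = np.array([0, 1, 1, 1, 2, 2, 2, 2, 2])   # dummy predictions
    print(per_step_accuracy(y_true, y_pred))
    print("overall:", float(np.mean(y_true == y_pred)))
```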
Collapse
Affiliation(s)
- Monica Ortenzi
- Theator Inc., Palo Alto, CA, USA.
- Department of General and Emergency Surgery, Polytechnic University of Marche, Ancona, Italy.
| | | | | | - Omri Bar
- Theator Inc., Palo Alto, CA, USA
| | | | | | | | | |
Collapse
|
20
|
Baghdadi A, Guo E, Lama S, Singh R, Chow M, Sutherland GR. Force Profile as Surgeon-Specific Signature. ANNALS OF SURGERY OPEN 2023; 4:e326. [PMID: 37746608 PMCID: PMC10513276 DOI: 10.1097/as9.0000000000000326] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/27/2023] [Accepted: 07/22/2023] [Indexed: 09/26/2023] Open
Abstract
Objective To investigate the notion that a surgeon's force profile can be the signature of their identity and performance. Summary background data Surgeon performance in the operating room is an understudied topic. The advent of deep learning methods paired with a sensorized surgical device presents an opportunity to incorporate quantitative insight into surgical performance and processes. Using a device called the SmartForceps System and through automated analytics, we have previously reported surgeon force profile, surgical skill, and task classification. However, an investigation of whether an individual surgeon can be identified by surgical technique has yet to be studied. Methods In this study, we investigate multiple neural network architectures to identify the surgeon associated with their time-series tool-tissue forces using bipolar forceps data. The surgeon associated with each 10-second window of force data was labeled, and the data were randomly split into 80% for model training and validation (10% validation) and 20% for testing. Data imbalance was mitigated through subsampling from more populated classes with a random size adjustment based on 0.1% of sample counts in the respective class. An exploratory analysis of force segments was performed to investigate underlying patterns differentiating individual surgical techniques. Results In a dataset of 2819 ten-second time segments from 89 neurosurgical cases, the best-performing model achieved a micro-average area under the curve of 0.97, a testing F1-score of 0.82, a sensitivity of 82%, and a precision of 82%. This model was a time-series ResNet model to extract features from the time-series data followed by a linearized output into the XGBoost algorithm. Furthermore, we found that convolutional neural networks outperformed long short-term memory networks in performance and speed. Using a weighted average approach, an ensemble model was able to identify an expert surgeon with 83.8% accuracy using a validation dataset. Conclusions Our results demonstrate that each surgeon has a unique force profile amenable to identification using deep learning methods. We anticipate our models will enable a quantitative framework to provide bespoke feedback to surgeons and to track their skill progression longitudinally. Furthermore, the ability to recognize individual surgeons introduces the mechanism of correlating outcome to surgeon performance.
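To mirror the data preparation described above (10-second force windows labeled per surgeon, with larger classes subsampled), a rough sketch might look like this; the sampling rate, window length, and balancing rule are assumptions for illustration, not the authors' exact settings.

```python
# Segment a force recording into fixed-length windows and balance classes by
# subsampling the larger ones (illustrative pipeline, not the paper's code).
import numpy as np

def make_windows(force: np.ndarray, fs: int = 100, seconds: int = 10) -> np.ndarray:
    """Split a (samples, channels) recording into non-overlapping windows."""
    win = fs * seconds
    n = force.shape[0] // win
    return force[: n * win].reshape(n, win, force.shape[1])

def subsample_to_min(X: np.ndarray, y: np.ndarray, seed: int = 0):
    """Keep an equal number of windows per surgeon label."""
    rng = np.random.default_rng(seed)
    groups = {label: np.flatnonzero(y == label) for label in np.unique(y)}
    k = min(len(idx) for idx in groups.values())
    keep = np.concatenate([rng.choice(idx, size=k, replace=False) for idx in groups.values()])
    return X[keep], y[keep]

if __name__ == "__main__":
    X = np.vstack([make_windows(np.random.rand(60_000, 2)) for _ in range(2)])
    y = np.array([0] * 60 + [1] * 60)      # dummy surgeon IDs per window
    Xb, yb = subsample_to_min(X, y)
    print(Xb.shape, np.bincount(yb))
```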
Collapse
Affiliation(s)
- Amir Baghdadi
- From the Project neuroArm, Department of Clinical Neurosciences, and Hotchkiss Brain Institute University of Calgary, Calgary, Alberta, Canada
| | - Eddie Guo
- From the Project neuroArm, Department of Clinical Neurosciences, and Hotchkiss Brain Institute University of Calgary, Calgary, Alberta, Canada
| | - Sanju Lama
- From the Project neuroArm, Department of Clinical Neurosciences, and Hotchkiss Brain Institute University of Calgary, Calgary, Alberta, Canada
| | - Rahul Singh
- From the Project neuroArm, Department of Clinical Neurosciences, and Hotchkiss Brain Institute University of Calgary, Calgary, Alberta, Canada
| | - Michael Chow
- Department of Surgery, Faculty of Medicine and Dentistry, University of Alberta, Edmonton, Canada
| | - Garnette R. Sutherland
- From the Project neuroArm, Department of Clinical Neurosciences, and Hotchkiss Brain Institute University of Calgary, Calgary, Alberta, Canada
| |
Collapse
|
21
|
Ramesh S, Dall'Alba D, Gonzalez C, Yu T, Mascagni P, Mutter D, Marescaux J, Fiorini P, Padoy N. Weakly Supervised Temporal Convolutional Networks for Fine-Grained Surgical Activity Recognition. IEEE TRANSACTIONS ON MEDICAL IMAGING 2023; 42:2592-2602. [PMID: 37030859 DOI: 10.1109/tmi.2023.3262847] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/19/2023]
Abstract
Automatic recognition of fine-grained surgical activities, called steps, is a challenging but crucial task for intelligent intra-operative computer assistance. The development of current vision-based activity recognition methods relies heavily on a high volume of manually annotated data. This data is difficult and time-consuming to generate and requires domain-specific knowledge. In this work, we propose to use coarser and easier-to-annotate activity labels, namely phases, as weak supervision to learn step recognition with fewer step annotated videos. We introduce a step-phase dependency loss to exploit the weak supervision signal. We then employ a Single-Stage Temporal Convolutional Network (SS-TCN) with a ResNet-50 backbone, trained in an end-to-end fashion from weakly annotated videos, for temporal activity segmentation and recognition. We extensively evaluate and show the effectiveness of the proposed method on a large video dataset consisting of 40 laparoscopic gastric bypass procedures and the public benchmark CATARACTS containing 50 cataract surgeries.
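One way to picture how a phase label can weakly supervise step predictions is to aggregate step probabilities into phase probabilities through a fixed step-to-phase assignment and penalize disagreement with the phase label; the sketch below is a generic reconstruction under that assumption and is not the authors' step-phase dependency loss.

```python
# Illustrative step-to-phase consistency loss: step probabilities are summed
# into phase probabilities via a fixed assignment matrix, then scored against
# the (weak) phase label. Generic sketch only, not the paper's exact loss.
import torch
import torch.nn.functional as F

def phase_consistency_loss(step_logits: torch.Tensor,
                           phase_labels: torch.Tensor,
                           step_to_phase: torch.Tensor) -> torch.Tensor:
    """
    step_logits:   (batch, time, n_steps) unnormalized step scores
    phase_labels:  (batch, time) integer phase labels
    step_to_phase: (n_steps, n_phases) 0/1 assignment of each step to its phase
    """
    step_probs = step_logits.softmax(dim=-1)            # (B, T, S)
    phase_probs = step_probs @ step_to_phase             # (B, T, P)
    log_phase = torch.log(phase_probs.clamp_min(1e-8))
    return F.nll_loss(log_phase.flatten(0, 1), phase_labels.flatten())

if __name__ == "__main__":
    B, T, S, P = 2, 50, 6, 3
    mapping = torch.zeros(S, P)
    mapping[torch.arange(S), torch.tensor([0, 0, 1, 1, 2, 2])] = 1.0  # 2 steps per phase
    logits = torch.randn(B, T, S)
    phases = torch.randint(0, P, (B, T))
    print(phase_consistency_loss(logits, phases, mapping))
```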
Collapse
|
22
|
Pan-Doh N, Sikder S, Woreta FA, Handa JT. Using the language of surgery to enhance ophthalmology surgical education. Surg Open Sci 2023; 14:52-59. [PMID: 37528917 PMCID: PMC10387608 DOI: 10.1016/j.sopen.2023.07.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/02/2023] [Accepted: 07/09/2023] [Indexed: 08/03/2023] Open
Abstract
Background Currently, surgical education utilizes a combination of the apprentice model, wet-lab training, and simulation, but due to reliance on subjective data, the quality of teaching and assessment can be variable. The "language of surgery," an established concept in engineering literature whose incorporation into surgical education has been limited, is defined as the description of each surgical maneuver using quantifiable metrics. This concept is different from the traditional notion of surgical language, generally thought of as the qualitative definitions and terminology used by surgeons. Methods A literature search was conducted through April 2023 using MEDLINE/PubMed using search terms to investigate wet-lab, virtual simulators, and robotics in ophthalmology, along with the language of surgery and surgical education. Articles published before 2005 were mostly excluded, although a few were included on a case-by-case basis. Results Surgical maneuvers can be quantified by leveraging technological advances in virtual simulators, video recordings, and surgical robots to create a language of surgery. By measuring and describing maneuver metrics, the learning surgeon can adjust surgical movements in an appropriately graded fashion that is based on objective and standardized data. The main contribution is outlining a structured education framework that details how surgical education could be improved by incorporating the language of surgery, using ophthalmology surgical education as an example. Conclusion By describing each surgical maneuver in quantifiable, objective, and standardized terminology, a language of surgery can be created that can be used to learn, teach, and assess surgical technical skill with an approach that minimizes bias. Key message The "language of surgery," defined as the quantification of each surgical movement's characteristics, is an established concept in the engineering literature. Using ophthalmology surgical education as an example, we describe a structured education framework based on the language of surgery to improve surgical education. Classifications Surgical education, robotic surgery, ophthalmology, education standardization, computerized assessment, simulations in teaching. Competencies Practice-Based Learning and Improvement.
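As a concrete, simplified example of describing a maneuver with quantifiable metrics, the snippet below computes path length, duration, and mean speed from a sampled tool-tip trajectory; the sampling rate and choice of metrics are illustrative assumptions rather than a standard drawn from the review.

```python
# Quantify a single surgical maneuver from a sampled tool-tip trajectory
# (illustrative metrics only).
import numpy as np

def maneuver_metrics(xyz: np.ndarray, fs: float = 60.0) -> dict[str, float]:
    """xyz: (samples, 3) tool-tip positions in millimetres, sampled at fs Hz."""
    steps = np.diff(xyz, axis=0)
    path_length = float(np.sum(np.linalg.norm(steps, axis=1)))   # mm
    duration = (xyz.shape[0] - 1) / fs                           # s
    return {
        "path_length_mm": path_length,
        "duration_s": duration,
        "mean_speed_mm_s": path_length / duration if duration > 0 else 0.0,
    }

if __name__ == "__main__":
    t = np.linspace(0, 2 * np.pi, 120)
    trajectory = np.stack([np.cos(t), np.sin(t), 0.1 * t], axis=1)  # dummy arc
    print(maneuver_metrics(trajectory))
```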
Collapse
Affiliation(s)
- Nathan Pan-Doh
- Wilmer Eye Institute, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Shameema Sikder
- Wilmer Eye Institute, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Fasika A. Woreta
- Wilmer Eye Institute, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - James T. Handa
- Wilmer Eye Institute, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| |
Collapse
|
23
|
Wang J, Zhang X, Chen X, Song Z. A touch-free human-robot collaborative surgical navigation robotic system based on hand gesture recognition. Front Neurosci 2023; 17:1200576. [PMID: 37342464 PMCID: PMC10277510 DOI: 10.3389/fnins.2023.1200576] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/05/2023] [Accepted: 05/15/2023] [Indexed: 06/23/2023] Open
Abstract
Robot-assisted minimally invasive surgery (RAMIS) has gained significant traction in clinical practice in recent years. However, most surgical robots rely on touch-based human-robot interaction (HRI), which increases the risk of bacterial diffusion. This risk is particularly concerning when surgeons must operate various equipment with their bare hands, necessitating repeated sterilization. Thus, achieving touch-free and precise manipulation with a surgical robot is challenging. To address this challenge, we propose a novel HRI interface based on gesture recognition, leveraging hand-keypoint regression and hand-shape reconstruction methods. By encoding the 21 keypoints from the recognized hand gesture, the robot can successfully perform the corresponding action according to predefined rules, which enables the robot to perform fine-tuning of surgical instruments without the need for physical contact with the surgeon. We evaluated the surgical applicability of the proposed system through both phantom and cadaver studies. In the phantom experiment, the average needle tip location error was 0.51 mm, and the mean angle error was 0.34 degrees. In the simulated nasopharyngeal carcinoma biopsy experiment, the needle insertion error was 0.16 mm, and the angle error was 0.10 degrees. These results indicate that the proposed system achieves clinically acceptable accuracy and can assist surgeons in performing contactless surgery with hand gesture interaction.
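The rule-based mapping from a recognized hand gesture to a robot action described above can be pictured as a small dispatch table; the gesture names, the toy keypoint rule, and the command strings below are invented for illustration and are not the authors' interface or API.

```python
# Minimal gesture-to-command dispatch for touch-free robot control
# (hypothetical gestures and commands, for illustration only).
from typing import Callable

COMMANDS: dict[str, Callable[[], str]] = {
    "open_palm":   lambda: "robot.pause()",               # hypothetical commands
    "pinch":       lambda: "robot.advance_needle(0.5)",   # mm
    "fist":        lambda: "robot.retract_needle(0.5)",
    "two_fingers": lambda: "robot.rotate_tool(+1.0)",     # degrees
}

def classify_gesture(keypoints: list[tuple[float, float, float]]) -> str:
    """Placeholder classifier over 21 hand keypoints; a real system would use
    regressed keypoints and the reconstructed hand shape."""
    assert len(keypoints) == 21
    xs = [p[0] for p in keypoints]
    # Toy rule: the horizontal spread of the keypoints decides the gesture.
    return "open_palm" if max(xs) - min(xs) > 0.15 else "pinch"

if __name__ == "__main__":
    dummy = [(i * 0.01, 0.0, 0.0) for i in range(21)]
    gesture = classify_gesture(dummy)
    print(gesture, "->", COMMANDS[gesture]())
```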
Collapse
Affiliation(s)
- Jie Wang
- Academy for Engineering and Technology, Fudan University, Shanghai, China
- Shanghai Key Laboratory of Medical Image Computing and Computer Assisted Intervention, Shanghai, China
| | - Xinkang Zhang
- Academy for Engineering and Technology, Fudan University, Shanghai, China
- Shanghai Key Laboratory of Medical Image Computing and Computer Assisted Intervention, Shanghai, China
| | - Xinrong Chen
- Academy for Engineering and Technology, Fudan University, Shanghai, China
- Shanghai Key Laboratory of Medical Image Computing and Computer Assisted Intervention, Shanghai, China
| | - Zhijian Song
- Shanghai Key Laboratory of Medical Image Computing and Computer Assisted Intervention, Shanghai, China
- Digital Medical Research Center, School of Basic Medical Science, Fudan University, Shanghai, China
| |
Collapse
|
24
|
Pan M, Wang S, Li J, Li J, Yang X, Liang K. An Automated Skill Assessment Framework Based on Visual Motion Signals and a Deep Neural Network in Robot-Assisted Minimally Invasive Surgery. SENSORS (BASEL, SWITZERLAND) 2023; 23:s23094496. [PMID: 37177699 PMCID: PMC10181496 DOI: 10.3390/s23094496] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/06/2023] [Revised: 04/27/2023] [Accepted: 05/03/2023] [Indexed: 05/15/2023]
Abstract
Surgical skill assessment can quantify the quality of a surgical operation from the motion state of the surgical instrument tip (SIT) and is considered an effective means of improving operative accuracy. Traditional methods have shown promising results in skill assessment, but they depend on sensors attached to the SIT, which is impractical for minimally invasive surgical robots with very small end-effectors. To address the assessment of operation quality in robot-assisted minimally invasive surgery (RAMIS), this paper proposes a new automatic framework for assessing surgical skills based on visual motion tracking and deep learning, combining vision and kinematics. The kernel correlation filter (KCF) is introduced to obtain the key motion signals of the SIT, which are then classified with a residual neural network (ResNet), realizing automated skill assessment in RAMIS. To verify its effectiveness and accuracy, the proposed method is applied to the public minimally invasive surgical robot dataset JIGSAWS. The results show that the method, based on visual motion tracking and a deep neural network, can effectively and accurately assess robot-assisted surgical skill in near real time. With a processing time of 3 to 5 s, the average accuracy of the assessment method is 92.04% and 84.80% in distinguishing two and three skill levels, respectively. This study makes an important contribution to the safe and high-quality development of RAMIS.
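For readers unfamiliar with kernel correlation filter tracking, the following sketch shows how an instrument-tip region could be tracked frame to frame with OpenCV's KCF tracker to produce a motion signal; the video path and bounding box are placeholders, this is not the paper's pipeline, and the tracker factory name can differ across OpenCV builds (for example, cv2.legacy.TrackerKCF_create).

```python
# Track an instrument-tip region with a KCF tracker and collect its centre
# positions as a motion signal (illustrative; requires opencv-contrib-python).
import cv2

def track_tip(video_path: str, init_box: tuple[int, int, int, int]) -> list[tuple[float, float]]:
    cap = cv2.VideoCapture(video_path)
    ok, frame = cap.read()
    if not ok:
        raise RuntimeError("could not read first frame")
    tracker = cv2.TrackerKCF_create()   # cv2.legacy.TrackerKCF_create on some versions
    tracker.init(frame, init_box)
    centres = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        found, (x, y, w, h) = tracker.update(frame)
        if found:
            centres.append((x + w / 2.0, y + h / 2.0))
    cap.release()
    return centres

if __name__ == "__main__":
    # Placeholder path and box; replace with a real surgical video and tip ROI.
    print(len(track_tip("suturing_clip.mp4", (300, 200, 40, 40))))
```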
Collapse
Affiliation(s)
- Mingzhang Pan
- College of Mechanical Engineering, Guangxi University, Nanning 530004, China
- State Key Laboratory for Conservation and Utilization of Subtropical Agro-Bioresources, Nanning 530004, China
| | - Shuo Wang
- College of Mechanical Engineering, Guangxi University, Nanning 530004, China
| | - Jingao Li
- College of Mechanical Engineering, Guangxi University, Nanning 530004, China
| | - Jing Li
- College of Mechanical Engineering, Guangxi University, Nanning 530004, China
| | - Xiuze Yang
- College of Mechanical Engineering, Guangxi University, Nanning 530004, China
| | - Ke Liang
- College of Mechanical Engineering, Guangxi University, Nanning 530004, China
- Guangxi Key Laboratory of Manufacturing System & Advanced Manufacturing Technology, School of Mechanical Engineering, Guangxi University, Nanning 530004, China
| |
Collapse
|
25
|
Jackson KL, Durić Z, Engdahl SM, Santago II AC, DeStefano S, Gerber LH. Computer-assisted approaches for measuring, segmenting, and analyzing functional upper extremity movement: a narrative review of the current state, limitations, and future directions. FRONTIERS IN REHABILITATION SCIENCES 2023; 4:1130847. [PMID: 37113748 PMCID: PMC10126348 DOI: 10.3389/fresc.2023.1130847] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 01/06/2023] [Accepted: 03/23/2023] [Indexed: 04/29/2023]
Abstract
The analysis of functional upper extremity (UE) movement kinematics has implications across domains such as rehabilitation and evaluating job-related skills. Using movement kinematics to quantify movement quality and skill is a promising area of research but is currently not being used widely due to issues associated with cost and the need for further methodological validation. Recent developments by computationally-oriented research communities have resulted in potentially useful methods for evaluating UE function that may make kinematic analyses easier to perform, generally more accessible, and provide more objective information about movement quality, the importance of which has been highlighted during the COVID-19 pandemic. This narrative review provides an interdisciplinary perspective on the current state of computer-assisted methods for analyzing UE kinematics with a specific focus on how to make kinematic analyses more accessible to domain experts. We find that a variety of methods exist to more easily measure and segment functional UE movement, with a subset of those methods being validated for specific applications. Future directions include developing more robust methods for measurement and segmentation, validating these methods in conjunction with proposed kinematic outcome measures, and studying how to integrate kinematic analyses into domain expert workflows in a way that improves outcomes.
Collapse
Affiliation(s)
- Kyle L. Jackson
- Department of Computer Science, George Mason University, Fairfax, VA, United States
- MITRE Corporation, McLean, VA, United States
| | - Zoran Durić
- Department of Computer Science, George Mason University, Fairfax, VA, United States
- Center for Adaptive Systems and Brain-Body Interactions, George Mason University, Fairfax, VA, United States
| | - Susannah M. Engdahl
- Center for Adaptive Systems and Brain-Body Interactions, George Mason University, Fairfax, VA, United States
- Department of Bioengineering, George Mason University, Fairfax, VA, United States
- American Orthotic & Prosthetic Association, Alexandria, VA, United States
| | | | | | - Lynn H. Gerber
- Center for Adaptive Systems and Brain-Body Interactions, George Mason University, Fairfax, VA, United States
- College of Public Health, George Mason University, Fairfax, VA, United States
- Inova Health System, Falls Church, VA, United States
| |
Collapse
|
26
|
Chadebecq F, Lovat LB, Stoyanov D. Artificial intelligence and automation in endoscopy and surgery. Nat Rev Gastroenterol Hepatol 2023; 20:171-182. [PMID: 36352158 DOI: 10.1038/s41575-022-00701-y] [Citation(s) in RCA: 29] [Impact Index Per Article: 14.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Accepted: 10/03/2022] [Indexed: 11/10/2022]
Abstract
Modern endoscopy relies on digital technology, from high-resolution imaging sensors and displays to electronics connecting configurable illumination and actuation systems for robotic articulation. In addition to enabling more effective diagnostic and therapeutic interventions, the digitization of the procedural toolset enables video data capture of the internal human anatomy at unprecedented levels. Interventional video data encapsulate functional and structural information about a patient's anatomy as well as events, activity and action logs about the surgical process. This detailed but difficult-to-interpret record from endoscopic procedures can be linked to preoperative and postoperative records or patient imaging information. Rapid advances in artificial intelligence, especially in supervised deep learning, can utilize data from endoscopic procedures to develop systems for assisting procedures leading to computer-assisted interventions that can enable better navigation during procedures, automation of image interpretation and robotically assisted tool manipulation. In this Perspective, we summarize state-of-the-art artificial intelligence for computer-assisted interventions in gastroenterology and surgery.
Collapse
Affiliation(s)
- François Chadebecq
- Wellcome/EPSRC Centre for Interventional and Surgical Sciences, University College London, London, UK
| | - Laurence B Lovat
- Wellcome/EPSRC Centre for Interventional and Surgical Sciences, University College London, London, UK
| | - Danail Stoyanov
- Wellcome/EPSRC Centre for Interventional and Surgical Sciences, University College London, London, UK.
| |
Collapse
|
27
|
Wagner M, Müller-Stich BP, Kisilenko A, Tran D, Heger P, Mündermann L, Lubotsky DM, Müller B, Davitashvili T, Capek M, Reinke A, Reid C, Yu T, Vardazaryan A, Nwoye CI, Padoy N, Liu X, Lee EJ, Disch C, Meine H, Xia T, Jia F, Kondo S, Reiter W, Jin Y, Long Y, Jiang M, Dou Q, Heng PA, Twick I, Kirtac K, Hosgor E, Bolmgren JL, Stenzel M, von Siemens B, Zhao L, Ge Z, Sun H, Xie D, Guo M, Liu D, Kenngott HG, Nickel F, Frankenberg MV, Mathis-Ullrich F, Kopp-Schneider A, Maier-Hein L, Speidel S, Bodenstedt S. Comparative validation of machine learning algorithms for surgical workflow and skill analysis with the HeiChole benchmark. Med Image Anal 2023; 86:102770. [PMID: 36889206 DOI: 10.1016/j.media.2023.102770] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/25/2021] [Revised: 02/03/2023] [Accepted: 02/08/2023] [Indexed: 02/23/2023]
Abstract
PURPOSE Surgical workflow and skill analysis are key technologies for the next generation of cognitive surgical assistance systems. These systems could increase the safety of the operation through context-sensitive warnings and semi-autonomous robotic assistance or improve training of surgeons via data-driven feedback. In surgical workflow analysis, up to 91% average precision has been reported for phase recognition on an open data single-center video dataset. In this work, we investigated the generalizability of phase recognition algorithms in a multicenter setting, including more difficult recognition tasks such as surgical action and surgical skill. METHODS To achieve this goal, a dataset with 33 laparoscopic cholecystectomy videos from three surgical centers with a total operation time of 22 h was created. Labels included framewise annotation of seven surgical phases with 250 phase transitions, 5514 occurrences of four surgical actions, 6980 occurrences of 21 surgical instruments from seven instrument categories, and 495 skill classifications in five skill dimensions. The dataset was used in the 2019 international Endoscopic Vision challenge, sub-challenge for surgical workflow and skill analysis. Here, 12 research teams trained and submitted their machine learning algorithms for recognition of phase, action, instrument and/or skill assessment. RESULTS F1-scores were achieved for phase recognition between 23.9% and 67.7% (n = 9 teams), for instrument presence detection between 38.5% and 63.8% (n = 8 teams), but for action recognition only between 21.8% and 23.3% (n = 5 teams). The average absolute error for skill assessment was 0.78 (n = 1 team). CONCLUSION Surgical workflow and skill analysis are promising technologies to support the surgical team, but there is still room for improvement, as shown by our comparison of machine learning algorithms. This novel HeiChole benchmark can be used for comparable evaluation and validation of future work. In future studies, it is of utmost importance to create more open, high-quality datasets in order to allow the development of artificial intelligence and cognitive robotics in surgery.
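Frame-wise F1-scores of the kind reported for this challenge can be computed directly from predicted and annotated phase sequences; a minimal sketch with scikit-learn, on dummy labels, is shown below.

```python
# Frame-wise F1-score for surgical phase recognition (illustrative).
import numpy as np
from sklearn.metrics import f1_score

if __name__ == "__main__":
    n_phases = 7
    rng = np.random.default_rng(0)
    y_true = rng.integers(0, n_phases, size=5_000)            # dummy annotated phases
    y_pred = np.where(rng.random(5_000) < 0.8, y_true,
                      rng.integers(0, n_phases, size=5_000))  # ~80% agreement
    print("macro F1:", f1_score(y_true, y_pred, average="macro"))
    print("per-phase F1:", f1_score(y_true, y_pred, average=None))
```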
Collapse
Affiliation(s)
- Martin Wagner
- Department for General, Visceral and Transplantation Surgery, Heidelberg University Hospital, Im Neuenheimer Feld 420, 69120 Heidelberg, Germany; National Center for Tumor Diseases (NCT) Heidelberg, Im Neuenheimer Feld 460, 69120 Heidelberg, Germany.
| | - Beat-Peter Müller-Stich
- Department for General, Visceral and Transplantation Surgery, Heidelberg University Hospital, Im Neuenheimer Feld 420, 69120 Heidelberg, Germany; National Center for Tumor Diseases (NCT) Heidelberg, Im Neuenheimer Feld 460, 69120 Heidelberg, Germany
| | - Anna Kisilenko
- Department for General, Visceral and Transplantation Surgery, Heidelberg University Hospital, Im Neuenheimer Feld 420, 69120 Heidelberg, Germany; National Center for Tumor Diseases (NCT) Heidelberg, Im Neuenheimer Feld 460, 69120 Heidelberg, Germany
| | - Duc Tran
- Department for General, Visceral and Transplantation Surgery, Heidelberg University Hospital, Im Neuenheimer Feld 420, 69120 Heidelberg, Germany; National Center for Tumor Diseases (NCT) Heidelberg, Im Neuenheimer Feld 460, 69120 Heidelberg, Germany
| | - Patrick Heger
- Department for General, Visceral and Transplantation Surgery, Heidelberg University Hospital, Im Neuenheimer Feld 420, 69120 Heidelberg, Germany
| | - Lars Mündermann
- Data Assisted Solutions, Corporate Research & Technology, KARL STORZ SE & Co. KG, Dr. Karl-Storz-Str. 34, 78332 Tuttlingen
| | - David M Lubotsky
- Department for General, Visceral and Transplantation Surgery, Heidelberg University Hospital, Im Neuenheimer Feld 420, 69120 Heidelberg, Germany; National Center for Tumor Diseases (NCT) Heidelberg, Im Neuenheimer Feld 460, 69120 Heidelberg, Germany
| | - Benjamin Müller
- Department for General, Visceral and Transplantation Surgery, Heidelberg University Hospital, Im Neuenheimer Feld 420, 69120 Heidelberg, Germany; National Center for Tumor Diseases (NCT) Heidelberg, Im Neuenheimer Feld 460, 69120 Heidelberg, Germany
| | - Tornike Davitashvili
- Department for General, Visceral and Transplantation Surgery, Heidelberg University Hospital, Im Neuenheimer Feld 420, 69120 Heidelberg, Germany; National Center for Tumor Diseases (NCT) Heidelberg, Im Neuenheimer Feld 460, 69120 Heidelberg, Germany
| | - Manuela Capek
- Department for General, Visceral and Transplantation Surgery, Heidelberg University Hospital, Im Neuenheimer Feld 420, 69120 Heidelberg, Germany; National Center for Tumor Diseases (NCT) Heidelberg, Im Neuenheimer Feld 460, 69120 Heidelberg, Germany
| | - Annika Reinke
- Div. Computer Assisted Medical Interventions, German Cancer Research Center (DKFZ), Im Neuenheimer Feld 223, 69120 Heidelberg Germany; HIP Helmholtz Imaging Platform, German Cancer Research Center (DKFZ), Im Neuenheimer Feld 223, 69120 Heidelberg Germany; Faculty of Mathematics and Computer Science, Heidelberg University, Im Neuenheimer Feld 205, 69120 Heidelberg
| | - Carissa Reid
- Division of Biostatistics, German Cancer Research Center, Im Neuenheimer Feld 280, Heidelberg, Germany
| | - Tong Yu
- ICube, University of Strasbourg, CNRS, France. 300 bd Sébastien Brant - CS 10413, F-67412 Illkirch Cedex, France; IHU Strasbourg, France. 1 Place de l'hôpital, 67000 Strasbourg, France
| | - Armine Vardazaryan
- ICube, University of Strasbourg, CNRS, France. 300 bd Sébastien Brant - CS 10413, F-67412 Illkirch Cedex, France; IHU Strasbourg, France. 1 Place de l'hôpital, 67000 Strasbourg, France
| | - Chinedu Innocent Nwoye
- ICube, University of Strasbourg, CNRS, France. 300 bd Sébastien Brant - CS 10413, F-67412 Illkirch Cedex, France; IHU Strasbourg, France. 1 Place de l'hôpital, 67000 Strasbourg, France
| | - Nicolas Padoy
- ICube, University of Strasbourg, CNRS, France. 300 bd Sébastien Brant - CS 10413, F-67412 Illkirch Cedex, France; IHU Strasbourg, France. 1 Place de l'hôpital, 67000 Strasbourg, France
| | - Xinyang Liu
- Sheikh Zayed Institute for Pediatric Surgical Innovation, Children's National Hospital, 111 Michigan Ave NW, Washington, DC 20010, USA
| | - Eung-Joo Lee
- University of Maryland, College Park, 2405 A V Williams Building, College Park, MD 20742, USA
| | - Constantin Disch
- Fraunhofer Institute for Digital Medicine MEVIS, Max-von-Laue-Str. 2, 28359 Bremen, Germany
| | - Hans Meine
- Fraunhofer Institute for Digital Medicine MEVIS, Max-von-Laue-Str. 2, 28359 Bremen, Germany; University of Bremen, FB3, Medical Image Computing Group, ℅ Fraunhofer MEVIS, Am Fallturm 1, 28359 Bremen, Germany
| | - Tong Xia
- Lab for Medical Imaging and Digital Surgery, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
| | - Fucang Jia
- Lab for Medical Imaging and Digital Surgery, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
| | - Satoshi Kondo
- Konica Minolta, Inc., 1-2, Sakura-machi, Takatsuki, Osaka 569-8503, Japan
| | - Wolfgang Reiter
- Wintegral GmbH, Ehrenbreitsteiner Str. 36, 80993 München, Germany
| | - Yueming Jin
- Department of Computer Science and Engineering, Ho Sin-Hang Engineering Building, The Chinese University of Hong Kong, Sha Tin, NT, Hong Kong
| | - Yonghao Long
- Department of Computer Science and Engineering, Ho Sin-Hang Engineering Building, The Chinese University of Hong Kong, Sha Tin, NT, Hong Kong
| | - Meirui Jiang
- Department of Computer Science and Engineering, Ho Sin-Hang Engineering Building, The Chinese University of Hong Kong, Sha Tin, NT, Hong Kong
| | - Qi Dou
- Department of Computer Science and Engineering, Ho Sin-Hang Engineering Building, The Chinese University of Hong Kong, Sha Tin, NT, Hong Kong
| | - Pheng Ann Heng
- Department of Computer Science and Engineering, Ho Sin-Hang Engineering Building, The Chinese University of Hong Kong, Sha Tin, NT, Hong Kong
| | - Isabell Twick
- Caresyntax GmbH, Komturstr. 18A, 12099 Berlin, Germany
| | - Kadir Kirtac
- Caresyntax GmbH, Komturstr. 18A, 12099 Berlin, Germany
| | - Enes Hosgor
- Caresyntax GmbH, Komturstr. 18A, 12099 Berlin, Germany
| | | | | | | | - Long Zhao
- Hikvision Research Institute, Hangzhou, China
| | - Zhenxiao Ge
- Hikvision Research Institute, Hangzhou, China
| | - Haiming Sun
- Hikvision Research Institute, Hangzhou, China
| | - Di Xie
- Hikvision Research Institute, Hangzhou, China
| | - Mengqi Guo
- School of Computing, National University of Singapore, Computing 1, No.13 Computing Drive, 117417, Singapore
| | - Daochang Liu
- National Engineering Research Center of Visual Technology, School of Computer Science, Peking University, Beijing, China
| | - Hannes G Kenngott
- Department for General, Visceral and Transplantation Surgery, Heidelberg University Hospital, Im Neuenheimer Feld 420, 69120 Heidelberg, Germany
| | - Felix Nickel
- Department for General, Visceral and Transplantation Surgery, Heidelberg University Hospital, Im Neuenheimer Feld 420, 69120 Heidelberg, Germany
| | - Moritz von Frankenberg
- Department of Surgery, Salem Hospital of the Evangelische Stadtmission Heidelberg, Zeppelinstrasse 11-33, 69121 Heidelberg, Germany
| | - Franziska Mathis-Ullrich
- Health Robotics and Automation Laboratory, Institute for Anthropomatics and Robotics, Karlsruhe Institute of Technology, Geb. 40.28, KIT Campus Süd, Engler-Bunte-Ring 8, 76131 Karlsruhe, Germany
| | - Annette Kopp-Schneider
- Division of Biostatistics, German Cancer Research Center, Im Neuenheimer Feld 280, Heidelberg, Germany
| | - Lena Maier-Hein
- Div. Computer Assisted Medical Interventions, German Cancer Research Center (DKFZ), Im Neuenheimer Feld 223, 69120 Heidelberg Germany; HIP Helmholtz Imaging Platform, German Cancer Research Center (DKFZ), Im Neuenheimer Feld 223, 69120 Heidelberg Germany; Faculty of Mathematics and Computer Science, Heidelberg University, Im Neuenheimer Feld 205, 69120 Heidelberg; Medical Faculty, Heidelberg University, Im Neuenheimer Feld 672, 69120 Heidelberg
| | - Stefanie Speidel
- Div. Translational Surgical Oncology, National Center for Tumor Diseases Dresden, Fetscherstraße 74, 01307 Dresden, Germany; Cluster of Excellence "Centre for Tactile Internet with Human-in-the-Loop" (CeTI) of Technische Universität Dresden, 01062 Dresden, Germany
| | - Sebastian Bodenstedt
- Div. Translational Surgical Oncology, National Center for Tumor Diseases Dresden, Fetscherstraße 74, 01307 Dresden, Germany; Cluster of Excellence "Centre for Tactile Internet with Human-in-the-Loop" (CeTI) of Technische Universität Dresden, 01062 Dresden, Germany
| |
Collapse
|
28
|
Nagaraj MB, Namazi B, Sankaranarayanan G, Scott DJ. Developing artificial intelligence models for medical student suturing and knot-tying video-based assessment and coaching. Surg Endosc 2023; 37:402-411. [PMID: 35982284 PMCID: PMC9388210 DOI: 10.1007/s00464-022-09509-y] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/19/2022] [Accepted: 07/23/2022] [Indexed: 01/20/2023]
Abstract
BACKGROUND Early introduction and distributed learning have been shown to improve student comfort with basic requisite suturing skills. The need for more frequent and directed feedback, however, remains an enduring concern for both remote and in-person training. A previous in-person curriculum for our second-year medical students transitioning to clerkships was adapted to an at-home video-based assessment model due to the social distancing implications of COVID-19. We aimed to develop an Artificial Intelligence (AI) model to perform video-based assessment. METHODS Second-year medical students were asked to submit a video of a simple interrupted knot on a Penrose drain with instrument tying technique after self-training to proficiency. Proficiency was defined as performing the task in under two minutes with no critical errors. All videos were first manually given a pass-fail rating and subsequently underwent task segmentation. We developed and trained two AI models based on convolutional neural networks to identify errors (instrument holding and knot-tying) and provide automated ratings. RESULTS A total of 229 medical student videos were reviewed (150 pass, 79 fail). Of those who failed, the critical error distribution was 15 knot-tying, 47 instrument-holding, and 17 multiple. A total of 216 videos were used to train the models after excluding the low-quality videos. A k-fold cross-validation (k = 10) was used. The accuracy of the instrument-holding model was 89% with an F-1 score of 74%. For the knot-tying model, the accuracy was 91% with an F-1 score of 54%. CONCLUSIONS Medical students require assessment and directed feedback to better acquire surgical skill, but this is often time-consuming and inadequately done. AI techniques can instead be employed to perform automated surgical video analysis. Future work will optimize the current model to identify discrete errors in order to supplement video-based rating with specific feedback.
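The k-fold evaluation described above (k = 10) can be reproduced in outline with scikit-learn; the sketch below uses a generic classifier on dummy feature vectors, since the paper's CNN and video features are not available here.

```python
# 10-fold cross-validation of a pass/fail video classifier (illustrative:
# dummy features stand in for a CNN-based video representation).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(216, 64))        # 216 videos x 64 dummy features
    y = rng.integers(0, 2, size=216)      # dummy pass/fail labels
    cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
    scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=cv)
    print(f"mean accuracy over {cv.get_n_splits()} folds: {scores.mean():.3f}")
```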
Collapse
Affiliation(s)
- Madhuri B Nagaraj
- Department of Surgery, University of Texas Southwestern Medical Center, 5323 Harry Hines Blvd, Dallas, TX, 75390-9159, USA.
- University of Texas Southwestern Simulation Center, 2001 Inwood Road, Dallas, TX, 75390-9092, USA.
| | - Babak Namazi
- Department of Surgery, University of Texas Southwestern Medical Center, 5323 Harry Hines Blvd, Dallas, TX, 75390-9159, USA
| | - Ganesh Sankaranarayanan
- Department of Surgery, University of Texas Southwestern Medical Center, 5323 Harry Hines Blvd, Dallas, TX, 75390-9159, USA
| | - Daniel J Scott
- Department of Surgery, University of Texas Southwestern Medical Center, 5323 Harry Hines Blvd, Dallas, TX, 75390-9159, USA
- University of Texas Southwestern Simulation Center, 2001 Inwood Road, Dallas, TX, 75390-9092, USA
| |
Collapse
|
29
|
Park M, Oh S, Jeong T, Yu S. Multi-Stage Temporal Convolutional Network with Moment Loss and Positional Encoding for Surgical Phase Recognition. Diagnostics (Basel) 2022; 13:107. [PMID: 36611399 PMCID: PMC9818879 DOI: 10.3390/diagnostics13010107] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/29/2022] [Revised: 12/28/2022] [Accepted: 12/28/2022] [Indexed: 12/31/2022] Open
Abstract
Many studies on surgical video analysis are being conducted due to its growing importance in medical applications. In particular, recognizing the current surgical phase is important because the phase information can be used in various ways both during and after surgery. This paper proposes an efficient phase recognition network, called MomentNet, for cholecystectomy endoscopic videos. Unlike LSTM-based networks, MomentNet is based on a multi-stage temporal convolutional network. In addition, to improve phase prediction accuracy, the proposed method adopts a new loss function to supplement the standard cross-entropy loss. The new loss function significantly improves the performance of the phase recognition network by constraining undesirable phase transitions and preventing over-segmentation. MomentNet also applies positional encoding, commonly used in transformer architectures, to the multi-stage temporal convolutional network; this provides important temporal context and results in higher phase prediction accuracy. Furthermore, MomentNet applies label smoothing to suppress overfitting and replaces the feature-extraction backbone to further improve performance. As a result, MomentNet achieves 92.31% accuracy in the phase recognition task on the Cholec80 dataset, 4.55% higher than the baseline architecture.
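The positional encoding mentioned above is the standard sinusoidal scheme from transformer models; a minimal sketch of adding it to a sequence of frame features is given below (feature sizes are illustrative, and this is not the MomentNet implementation).

```python
# Standard sinusoidal positional encoding added to a sequence of frame features
# (generic sketch; not the MomentNet implementation).
import torch

def sinusoidal_encoding(length: int, dim: int) -> torch.Tensor:
    position = torch.arange(length, dtype=torch.float32).unsqueeze(1)     # (L, 1)
    div = torch.exp(torch.arange(0, dim, 2, dtype=torch.float32)
                    * (-torch.log(torch.tensor(10000.0)) / dim))          # (dim/2,)
    pe = torch.zeros(length, dim)
    pe[:, 0::2] = torch.sin(position * div)
    pe[:, 1::2] = torch.cos(position * div)
    return pe

if __name__ == "__main__":
    frames = torch.randn(1, 3000, 256)                 # (batch, frames, features)
    frames = frames + sinusoidal_encoding(3000, 256)   # broadcasts over the batch
    print(frames.shape)
```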
Collapse
Affiliation(s)
- Minyoung Park
- School of Electrical and Electronics Engineering, Chung-Ang University, 84 Heukseok-ro, Dongjak-gu, Seoul 06974, Republic of Korea
| | - Seungtaek Oh
- School of Electrical and Electronics Engineering, Chung-Ang University, 84 Heukseok-ro, Dongjak-gu, Seoul 06974, Republic of Korea
| | - Taikyeong Jeong
- School of Artificial Intelligence Convergence, Hallym University, Chuncheon 24252, Republic of Korea
| | - Sungwook Yu
- School of Electrical and Electronics Engineering, Chung-Ang University, 84 Heukseok-ro, Dongjak-gu, Seoul 06974, Republic of Korea
| |
Collapse
|
30
|
Gazis A, Karaiskos P, Loukas C. Surgical Gesture Recognition in Laparoscopic Tasks Based on the Transformer Network and Self-Supervised Learning. Bioengineering (Basel) 2022; 9:737. [PMID: 36550943 PMCID: PMC9774918 DOI: 10.3390/bioengineering9120737] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/09/2022] [Revised: 11/07/2022] [Accepted: 11/25/2022] [Indexed: 12/05/2022] Open
Abstract
In this study, we propose a deep learning framework and a self-supervision scheme for video-based surgical gesture recognition. The proposed framework is modular. First, a 3D convolutional network extracts feature vectors from video clips for encoding spatial and short-term temporal features. Second, the feature vectors are fed into a transformer network for capturing long-term temporal dependencies. Two main models are proposed, based on the backbone framework: C3DTrans (supervised) and SSC3DTrans (self-supervised). The dataset consisted of 80 videos from two basic laparoscopic tasks: peg transfer (PT) and knot tying (KT). To examine the potential of self-supervision, the models were trained on 60% and 100% of the annotated dataset. In addition, the best-performing model was evaluated on the JIGSAWS robotic surgery dataset. The best model (C3DTrans) achieves accuracies of 88.0% and 95.2% (clip level) and 97.5% and 97.9% (gesture level) for PT and KT, respectively. SSC3DTrans performed similarly to C3DTrans when trained on 60% of the annotated dataset (about 84% and 93% clip-level accuracies for PT and KT, respectively). The performance of C3DTrans on JIGSAWS was close to 76% accuracy, which was similar to or higher than prior techniques based on a single video stream, no additional video training, and online processing.
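The two-stage design described above, a 3D convolutional clip encoder followed by a transformer over clip features, can be sketched as follows in PyTorch; the layer sizes and class count are arbitrary, and this is not the C3DTrans architecture itself.

```python
# Sketch of a clip-feature extractor (3-D conv) followed by a transformer
# encoder for gesture classification (illustrative sizes, not C3DTrans).
import torch
import torch.nn as nn

class ClipTransformer(nn.Module):
    def __init__(self, n_classes: int = 10, d_model: int = 128):
        super().__init__()
        self.clip_encoder = nn.Sequential(           # encodes one short video clip
            nn.Conv3d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1), nn.Flatten(), nn.Linear(16, d_model),
        )
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=4, batch_first=True)
        self.temporal = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, clips: torch.Tensor) -> torch.Tensor:
        # clips: (batch, n_clips, channels, frames, height, width)
        b, n = clips.shape[:2]
        feats = self.clip_encoder(clips.flatten(0, 1)).view(b, n, -1)  # (B, N, d_model)
        feats = self.temporal(feats)                                   # long-term context
        return self.head(feats)                                        # per-clip gesture logits

if __name__ == "__main__":
    model = ClipTransformer()
    dummy = torch.randn(2, 8, 3, 16, 64, 64)   # 2 videos x 8 clips of 16 frames
    print(model(dummy).shape)                   # torch.Size([2, 8, 10])
```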
Collapse
Affiliation(s)
| | | | - Constantinos Loukas
- Laboratory of Medical Physics, Medical School, National and Kapodistrian University of Athens, 115 27 Athens, Greece
| |
Collapse
|
31
|
Shi C, Zheng Y, Fey AM. Recognition and Prediction of Surgical Gestures and Trajectories Using Transformer Models in Robot-Assisted Surgery. PROCEEDINGS OF THE ... IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS. IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS 2022; 2022:8017-8024. [PMID: 37363719 PMCID: PMC10288529 DOI: 10.1109/iros47612.2022.9981611] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/28/2023]
Abstract
Surgical activity recognition and prediction can help provide important context in many Robot-Assisted Surgery (RAS) applications, for example, surgical progress monitoring and estimation, surgical skill evaluation, and shared control strategies during teleoperation. Transformer models were first developed for Natural Language Processing (NLP) to model word sequences, and the method soon gained popularity for general sequence modeling tasks. In this paper, we propose the novel use of a Transformer model for three tasks: gesture recognition, gesture prediction, and trajectory prediction during RAS. We modify the original Transformer architecture to generate estimates of the current gesture sequence, future gesture sequence, and future trajectory sequence using only the current kinematic data of the surgical robot end-effectors. We evaluate our proposed models on the JHU-ISI Gesture and Skill Assessment Working Set (JIGSAWS) and use Leave-One-User-Out (LOUO) cross-validation to ensure generalizability of our results. Our models achieve up to 89.3% gesture recognition accuracy, 84.6% gesture prediction accuracy (1 second ahead), and 2.71 mm trajectory prediction error (1 second ahead). Our models are comparable to, and can outperform, state-of-the-art methods while using only the kinematic data channel. This approach can enable near-real-time surgical activity recognition and prediction.
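The Leave-One-User-Out protocol used above can be set up with scikit-learn's LeaveOneGroupOut splitter, treating each surgeon as a group; the sketch below uses dummy kinematic feature windows and a generic classifier in place of the JIGSAWS data and the paper's Transformer model.

```python
# Leave-One-User-Out (LOUO) cross-validation setup, with each surgeon treated
# as a group (illustrative dummy data in place of JIGSAWS kinematics).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import LeaveOneGroupOut

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(400, 30))              # dummy kinematic feature windows
    y = rng.integers(0, 10, size=400)           # dummy gesture labels
    surgeons = rng.integers(0, 8, size=400)     # which surgeon produced each window
    for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups=surgeons):
        clf = RandomForestClassifier(n_estimators=50, random_state=0)
        clf.fit(X[train_idx], y[train_idx])
        held_out = surgeons[test_idx][0]
        acc = accuracy_score(y[test_idx], clf.predict(X[test_idx]))
        print(f"held-out surgeon {held_out}: accuracy {acc:.2f}")
```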
Collapse
Affiliation(s)
- Chang Shi
- Walker Department of Mechanical Engineering, The University of Texas at Austin, Austin, TX 78712, USA
| | - Yi Zheng
- Walker Department of Mechanical Engineering, The University of Texas at Austin, Austin, TX 78712, USA
| | - Ann Majewicz Fey
- Walker Department of Mechanical Engineering, The University of Texas at Austin, Austin, TX 78712, USA
- Department of Surgery, UT Southwestern Medical Center, Dallas, TX 75390, USA
| |
Collapse
|
32
|
Kang D, Kwon D. An ergonomic comfort workspace analysis of master manipulator for robotic laparoscopic surgery with motion scaled teleoperation system. Int J Med Robot 2022; 18:e2448. [DOI: 10.1002/rcs.2448] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/14/2022] [Revised: 07/22/2022] [Accepted: 07/23/2022] [Indexed: 11/11/2022]
Affiliation(s)
- DongHoon Kang
- Department of Mechanical Engineering Korea Advanced Institute of Science and Technology Daejeon Republic of Korea
| | - Dong‐Soo Kwon
- Department of Mechanical Engineering Korea Advanced Institute of Science and Technology Daejeon Republic of Korea
- EasyEndo Surgical, Inc. TruthHall KAIST Daejeon Republic of Korea
| |
Collapse
|
33
|
Zheng Y, Ershad M, Fey AM. Toward Correcting Anxious Movements Using Haptic Cues on the Da Vinci Surgical Robot. PROCEEDINGS OF THE ... IEEE/RAS-EMBS INTERNATIONAL CONFERENCE ON BIOMEDICAL ROBOTICS AND BIOMECHATRONICS. IEEE/RAS-EMBS INTERNATIONAL CONFERENCE ON BIOMEDICAL ROBOTICS AND BIOMECHATRONICS 2022; 2022:10.1109/biorob52689.2022.9925380. [PMID: 37408769 PMCID: PMC10321328 DOI: 10.1109/biorob52689.2022.9925380] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 07/07/2023]
Abstract
Surgical movements have an important stylistic quality that individuals without formal surgical training can use to identify expertise. In our prior work, we sought to characterize quantitative metrics associated with surgical style and developed a near-real-time detection framework for stylistic deficiencies using a commercial haptic device. In this paper, we implement bimanual stylistic detection on the da Vinci Research Kit (dVRK) and focus on one stylistic deficiency, "Anxious", which may describe movements under stressful conditions. Our goal is to potentially correct these "Anxious" movements by exploring the effects of three different types of haptic cues (time-variant spring, damper, and spring-damper feedback) on performance during a basic surgical training task on the dVRK. Eight subjects were recruited to complete peg transfer tasks using a randomized order of haptic cues and with baseline trials between each task. Overall, all cues led to a significant improvement in economy of volume over baseline, and time-variant spring cues led to significant reductions in the classified "Anxious" movements and also corresponded to significantly lower path length and economy of volume for the non-dominant hand. This work is the first step in evaluating our stylistic detection model on a surgical robot and could lay the groundwork for future methods to actively and adaptively reduce the negative effect of stress in the operating room.
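A time-variant spring cue of the kind compared above can be written as a restoring force whose stiffness ramps up over time; the sketch below is a generic formulation with made-up gains, not the study's controller.

```python
# Generic time-variant spring haptic cue: a restoring force toward a reference
# position whose stiffness grows over time (illustrative gains only).
import numpy as np

def spring_cue(position: np.ndarray, reference: np.ndarray, t: float,
               k0: float = 0.0, k_rate: float = 20.0, k_max: float = 60.0) -> np.ndarray:
    """Return the commanded force (N) for the current hand position (m) at time t (s)."""
    k = min(k0 + k_rate * t, k_max)          # stiffness ramps up, then saturates
    return -k * (position - reference)

if __name__ == "__main__":
    ref = np.zeros(3)
    for t in (0.0, 0.5, 1.0, 5.0):
        print(t, spring_cue(np.array([0.01, 0.0, -0.02]), ref, t))
```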
Collapse
Affiliation(s)
- Yi Zheng
- Department of Mechanical Engineering, the University of Texas at Austin, 204 East Dean Keeton Street, Austin, TX 78712, USA
| | - Marzieh Ershad
- Intuitive Surgical, Inc., 1020 Kifer Road Sunnyvale, CA 94086
| | - Ann Majewicz Fey
- Department of Mechanical Engineering, the University of Texas at Austin, 204 East Dean Keeton Street, Austin, TX 78712, USA
- Department of Surgery, UT South-western Medical Center, 5323 Harry Hines Blvd, Dallas, TX 75390, USA
| |
Collapse
|
34
|
van Amsterdam B, Funke I, Edwards E, Speidel S, Collins J, Sridhar A, Kelly J, Clarkson MJ, Stoyanov D. Gesture Recognition in Robotic Surgery With Multimodal Attention. IEEE TRANSACTIONS ON MEDICAL IMAGING 2022; 41:1677-1687. [PMID: 35108200 PMCID: PMC7616924 DOI: 10.1109/tmi.2022.3147640] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/14/2023]
Abstract
Automatically recognising surgical gestures from surgical data is an important building block of automated activity recognition and analytics, technical skill assessment, intra-operative assistance and eventually robotic automation. The complexity of articulated instrument trajectories and the inherent variability due to surgical style and patient anatomy make analysis and fine-grained segmentation of surgical motion patterns from robot kinematics alone very difficult. Surgical video provides crucial information from the surgical site with context for the kinematic data and the interaction between the instruments and tissue. Yet sensor fusion between the robot data and surgical video stream is non-trivial because the data have different frequency, dimensions and discriminative capability. In this paper, we integrate multimodal attention mechanisms in a two-stream temporal convolutional network to compute relevance scores and weight kinematic and visual feature representations dynamically in time, aiming to aid multimodal network training and achieve effective sensor fusion. We report the results of our system on the JIGSAWS benchmark dataset and on a new in vivo dataset of suturing segments from robotic prostatectomy procedures. Our results are promising and obtain multimodal prediction sequences with higher accuracy and better temporal structure than corresponding unimodal solutions. Visualization of attention scores also gives physically interpretable insights on network understanding of strengths and weaknesses of each sensor.
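The modality weighting idea described above can be illustrated with a small attention module that scores kinematic and visual feature vectors at each time step and fuses them by their normalized relevance; the dimensions and the scoring network below are assumptions, not the paper's architecture.

```python
# Minimal per-timestep attention over two modalities (kinematic and video
# features): score each modality, softmax the scores, and fuse (illustrative).
import torch
import torch.nn as nn

class ModalityAttention(nn.Module):
    def __init__(self, d_kin: int = 32, d_vid: int = 128, d_fused: int = 64):
        super().__init__()
        self.proj_kin = nn.Linear(d_kin, d_fused)
        self.proj_vid = nn.Linear(d_vid, d_fused)
        self.scorer = nn.Linear(d_fused, 1)     # relevance score per modality

    def forward(self, kin: torch.Tensor, vid: torch.Tensor) -> torch.Tensor:
        # kin: (batch, time, d_kin), vid: (batch, time, d_vid)
        feats = torch.stack([self.proj_kin(kin), self.proj_vid(vid)], dim=2)  # (B, T, 2, d)
        weights = self.scorer(feats).softmax(dim=2)                           # (B, T, 2, 1)
        return (weights * feats).sum(dim=2)                                   # (B, T, d)

if __name__ == "__main__":
    fuse = ModalityAttention()
    fused = fuse(torch.randn(4, 100, 32), torch.randn(4, 100, 128))
    print(fused.shape)   # torch.Size([4, 100, 64])
```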
Collapse
Affiliation(s)
- Beatrice van Amsterdam
- Wellcome/EPSRC Centre for Interventional and Surgical Sciences (WEISS), University College London, UK
| | - Isabel Funke
- Division of Translational Surgical Oncology, National Center for Tumor Diseases (NCT), Partner Site Dresden, Dresden, Germany, and with the Centre for Tactile Internet with Human-in-the-Loop (CeTI), TU Dresden, Dresden, Germany
| | - Eddie Edwards
- Wellcome/EPSRC Centre for Interventional and Surgical Sciences (WEISS), University College London, UK
| | - Stefanie Speidel
- Division of Translational Surgical Oncology, National Center for Tumor Diseases (NCT), Partner Site Dresden, Dresden, Germany, and with the Centre for Tactile Internet with Human-in-the-Loop (CeTI), TU Dresden, Dresden, Germany
| | - Justin Collins
- Department of Urooncology, University College London Hospital NHS Foundation Trust, London, UK
| | - Ashwin Sridhar
- Department of Urooncology, University College London Hospital NHS Foundation Trust, London, UK
| | - John Kelly
- Department of Urooncology, University College London Hospital NHS Foundation Trust, London, UK
| | - Matthew J. Clarkson
- Wellcome/EPSRC Centre for Interventional and Surgical Sciences (WEISS), University College London, UK
| | - Danail Stoyanov
- Wellcome/EPSRC Centre for Interventional and Surgical Sciences (WEISS), University College London, UK
| |
Collapse
|
35
|
Matsumoto S, Kawahira H, Oiwa K, Maeda Y, Nozawa A, Lefor AK, Hosoya Y, Sata N. Laparoscopic surgical skill evaluation with motion capture and eyeglass gaze cameras: A pilot study. Asian J Endosc Surg 2022; 15:619-628. [PMID: 35598888 DOI: 10.1111/ases.13065] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 02/22/2022] [Revised: 03/28/2022] [Accepted: 03/30/2022] [Indexed: 11/30/2022]
Abstract
INTRODUCTION An eyeglass gaze camera and a skeletal coordinate camera, with no sensors attached to the operator's body, were used to monitor gaze and movement during a simulated surgical procedure. These new devices have the potential to change skill assessment for laparoscopic surgery. Their suitability for skill assessment was investigated. MATERIALS AND METHODS Six medical students, six intermediate surgeons, and four experts performed suturing tasks in a dry box. The tip positions of the instruments were identified from video recordings. Performance was evaluated based on instrument movement, gaze, and skeletal coordinates. RESULTS Task performance time and skeletal coordinates were not significantly different among skill levels. The total movement distance of the right instrument differed significantly by skill level. The SD of the gaze coordinates also differed significantly by skill level and was lower for experts, whose gaze stayed within a small area with little scatter. CONCLUSIONS The SD of gaze-point coordinates correlates with laparoscopic surgical skill level. These devices may facilitate objective intraoperative skill evaluation in future studies.
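A minimal sketch of the two metrics this study reports: total instrument-tip path length and the standard deviation (dispersion) of gaze-point coordinates. The array shapes and the simulated data below are illustrative assumptions, not values or formats from the paper.

```python
import numpy as np

def instrument_path_length(tip_xy: np.ndarray) -> float:
    """Sum of frame-to-frame tip displacements; tip_xy has shape (frames, 2)."""
    return float(np.linalg.norm(np.diff(tip_xy, axis=0), axis=1).sum())

def gaze_dispersion(gaze_xy: np.ndarray) -> float:
    """Pooled standard deviation of gaze x/y coordinates; lower suggests a steadier gaze."""
    return float(np.sqrt(gaze_xy.var(axis=0).mean()))

rng = np.random.default_rng(0)
tip = np.cumsum(rng.normal(scale=0.5, size=(1000, 2)), axis=0)   # simulated tip trajectory
gaze = rng.normal(scale=15.0, size=(1000, 2))                    # simulated gaze points
print(instrument_path_length(tip), gaze_dispersion(gaze))
```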
Collapse
Affiliation(s)
- Shiro Matsumoto
- Department of Surgery, Jichi Medical University, Tochigi, Japan
| | - Hiroshi Kawahira
- Medical Simulation Center, Jichi Medical University, Tochigi, Japan
| | - Kosuke Oiwa
- Department of Electrical Engineering and Electronics, Aoyama Gakuin University, Kanagawa, Japan
| | - Yoshitaka Maeda
- Medical Simulation Center, Jichi Medical University, Tochigi, Japan
| | - Akio Nozawa
- Department of Electrical Engineering and Electronics, Aoyama Gakuin University, Kanagawa, Japan
| | | | | | - Naohiro Sata
- Department of Surgery, Jichi Medical University, Tochigi, Japan
| |
Collapse
|
36
|
Soleymani A, Li X, Tavakoli M. A Domain-Adapted Machine Learning Approach for Visual Evaluation and Interpretation of Robot-Assisted Surgery Skills. IEEE Robot Autom Lett 2022. [DOI: 10.1109/lra.2022.3186769] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
Affiliation(s)
- Abed Soleymani
- Electrical and Computer Engineering Department, University of Alberta, Edmonton, AB, Canada
| | - Xingyu Li
- Electrical and Computer Engineering Department, University of Alberta, Edmonton, AB, Canada
| | - Mahdi Tavakoli
- Electrical and Computer Engineering Department, University of Alberta, Edmonton, AB, Canada
| |
Collapse
|
37
|
Chen Z, An J, Wu S, Cheng K, You J, Liu J, Jiang J, Yang D, Peng B, Wang X. Surgesture: a novel instrument based on surgical actions for objective skill assessment. Surg Endosc 2022; 36:6113-6121. [PMID: 35737138 DOI: 10.1007/s00464-022-09108-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/17/2021] [Accepted: 02/07/2022] [Indexed: 02/05/2023]
Abstract
BACKGROUND Due to varied surgical skills and the lack of an efficient rating system, we developed Surgesture based on elementary functional surgical gestures performed by surgeons, which could serve as objective metrics to evaluate surgical performance in laparoscopic cholecystectomy (LC). METHODS We defined 14 basic LC Surgestures. Four surgeons annotated Surgestures in LC videos performed by experts and novices. The counts, durations, average action time, and dissection/exposure ratio (D/E ratio) of LC Surgestures were compared. The phase of mobilizing the hepatocystic triangle (MHT) was extracted for skill assessment by three professors using a modified Global Operative Assessment of Laparoscopic Skills (mGOALS). RESULTS The novice operation time was significantly longer than the expert operation time (58.12 ± 19.23 min vs. 26.66 ± 8.00 min, P < 0.001), particularly during the MHT phase. Novices had significantly more Surgestures than experts in both hands (P < 0.05), and performed dramatically more left-hand and inefficient Surgestures than experts (P < 0.05). The experts demonstrated a significantly higher D/E ratio of duration than novices (0.79 ± 0.37 vs. 2.84 ± 1.98, P < 0.001). The counts and time pattern map of LC Surgestures during MHT demonstrated that novices tended to complete LC with more types of Surgestures and spent more time exposing the surgical scene. The LC Surgesture performance metrics had significant but weak associations with each aspect of mGOALS. CONCLUSION The newly constructed Surgestures could serve as accessible and quantifiable metrics for demonstrating the operative pattern and distinguishing surgeons with various skill levels. The association between Surgestures and the Global Rating Scale lays the foundation for a bridge to automated objective surgical skill evaluation.
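A minimal sketch of the descriptive metrics named in this abstract (counts, durations, average action time, and the duration-based dissection/exposure ratio), computed from annotated gesture segments. The segment tuple format, gesture labels, and category names are assumptions for illustration only.

```python
from collections import defaultdict

# Each segment: (gesture_label, category, start_s, end_s); category is "dissection" or "exposure".
segments = [
    ("hook_dissect", "dissection", 0.0, 4.2),
    ("grasp_retract", "exposure", 4.2, 9.8),
    ("hook_dissect", "dissection", 9.8, 12.5),
]

counts, durations = defaultdict(int), defaultdict(float)
for label, category, start, end in segments:
    counts[label] += 1                 # per-gesture counts
    durations[category] += end - start # total time per category

total_time = sum(end - start for _, _, start, end in segments)
avg_action_time = total_time / len(segments)
de_ratio = durations["dissection"] / durations["exposure"]  # duration-based D/E ratio

print(dict(counts), round(avg_action_time, 2), round(de_ratio, 2))
```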
Collapse
Affiliation(s)
- Zixin Chen
- Department of Pancreatic Surgery, West China Hospital of Sichuan University, Chengdu, China.,West China School of Medicine, Sichuan University, Chengdu, China
| | - Jingjing An
- Department of Operating Room, West China Hospital, Chengdu, China.,West China School of Nursing, Sichuan University, Chengdu, China
| | - Shangdi Wu
- Department of Pancreatic Surgery, West China Hospital of Sichuan University, Chengdu, China.,West China School of Medicine, Sichuan University, Chengdu, China
| | - Ke Cheng
- Department of Pancreatic Surgery, West China Hospital of Sichuan University, Chengdu, China.,West China School of Medicine, Sichuan University, Chengdu, China
| | - Jiaying You
- Department of Pancreatic Surgery, West China Hospital of Sichuan University, Chengdu, China.,West China School of Medicine, Sichuan University, Chengdu, China
| | - Jie Liu
- ChengDu Withai Innovations Technology Company, Chengdu, China
| | - Jingwen Jiang
- West China Biomedical Big Data Center of West China Hospital, Chengdu, China
| | - Dewei Yang
- West China Biomedical Big Data Center of West China Hospital, Chengdu, China.,Chongqing University of Posts and Telecommunications, Chongqing, China
| | - Bing Peng
- Department of Pancreatic Surgery, West China Hospital of Sichuan University, Chengdu, China.
| | - Xin Wang
- Department of Pancreatic Surgery, West China Hospital of Sichuan University, Chengdu, China.
| |
Collapse
|
38
|
Guarino A, Lettieri N, Malandrino D, Zaccagnino R, Capo C. Adam or Eve? Automatic users’ gender classification via gestures analysis on touch devices. Neural Comput Appl 2022. [DOI: 10.1007/s00521-022-07454-4] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/10/2023]
Abstract
Gender classification of mobile device users has drawn a great deal of attention for its applications in healthcare, smart spaces, biometric-based access control systems, and customization of user interfaces (UIs). Previous work has shown that authentication systems can be more effective when considering soft biometric traits such as gender, while other studies have highlighted the significance of this trait for enhancing UIs. This paper presents a novel machine learning-based approach to gender classification that leverages only the touch-gesture information available from smartphone APIs. To identify the most useful gestures and combinations thereof for gender classification, two strategies were considered: single-view learning, analyzing datasets for a single type of gesture one at a time, and multi-view learning, analyzing datasets describing different types of gestures together. This is one of the first works to apply such a strategy to gender recognition via gesture analysis on mobile devices. The methods were evaluated on a large dataset of gestures collected through a mobile application, which includes not only scrolls, swipes, and taps but also pinch-to-zooms and drag-and-drops, which are mostly overlooked in the literature. In contrast to the previous literature, the solution was also evaluated in different scenarios, providing a more comprehensive evaluation. The experimental results show that scroll-down is the most useful gesture and random forest the most suitable classifier for gender classification. Depending on the (combination of) gestures considered, F1-scores of up to 0.89 in validation and 0.85 in testing were obtained. Furthermore, the multi-view approach is recommended when dealing with unknown devices, and combinations of gestures can be adopted effectively depending on the requirements of the system into which the solution is built. The proposed solutions represent both an opportunity for gender-aware technologies and a potential risk arising from unwanted gender classification.
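A minimal sketch of the pipeline shape this abstract reports: a random forest classifying gender from per-gesture features (for example, scroll-down descriptors), evaluated with F1-score. The synthetic features below are placeholders generated with scikit-learn; they are not the paper's dataset or feature definitions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for per-scroll features (duration, length, speed, pressure, ...).
X, y = make_classification(n_samples=600, n_features=12, n_informative=6, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)
print("F1:", round(f1_score(y_test, clf.predict(X_test)), 3))
```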
Collapse
|
39
|
Zhou XH, Xie XL, Feng ZQ, Hou ZG, Bian GB, Li RQ, Ni ZL, Liu SQ, Zhou YJ. A Multilayer and Multimodal-Fusion Architecture for Simultaneous Recognition of Endovascular Manipulations and Assessment of Technical Skills. IEEE TRANSACTIONS ON CYBERNETICS 2022; 52:2565-2577. [PMID: 32697730 DOI: 10.1109/tcyb.2020.3004653] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
The clinical success of the percutaneous coronary intervention (PCI) is highly dependent on endovascular manipulation skills and dexterous manipulation strategies of interventionalists. However, the analysis of endovascular manipulations and related discussion for technical skill assessment are limited. In this study, a multilayer and multimodal-fusion architecture is proposed to recognize six typical endovascular manipulations. The synchronously acquired multimodal motion signals from ten subjects are used as the inputs of the architecture independently. Six classification-based and two rule-based fusion algorithms are evaluated for performance comparisons. The recognition metrics under the determined architecture are further used to assess technical skills. The experimental results indicate that the proposed architecture can achieve an overall accuracy of 96.41%, much higher than that of a single-layer recognition architecture (92.85%). In addition, the multimodal fusion brings significant performance improvement in comparison with single-modal schemes. Furthermore, the K-means-based skill assessment achieves an accuracy of 95% in clustering the attempts made by different skill-level groups. These promising results indicate the strong potential of the architecture to facilitate clinical skill assessment and skill learning.
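A minimal sketch of the K-means skill-clustering step mentioned above: per-attempt recognition metrics are clustered into skill-level groups. The two-feature representation, the simulated values, and k = 2 are illustrative assumptions, not the study's exact feature set.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
# Each row: [manipulation recognition accuracy, mean manipulation duration (s)] for one attempt.
novice = np.column_stack([rng.normal(0.80, 0.04, 20), rng.normal(6.0, 1.0, 20)])
expert = np.column_stack([rng.normal(0.95, 0.02, 20), rng.normal(3.5, 0.8, 20)])
attempts = np.vstack([novice, expert])

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(attempts)
print(labels)  # cluster assignments used to separate attempts by skill-level group
```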
Collapse
|
40
|
Miura S, Kaneko T, Kawamura K, Kobayashi Y, Fujie MG. Brain activation measurement for motion gain decision of surgical endoscope manipulation. Int J Med Robot 2022; 18:e2371. [DOI: 10.1002/rcs.2371] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/30/2020] [Revised: 05/18/2021] [Accepted: 05/20/2021] [Indexed: 11/07/2022]
Affiliation(s)
- Satoshi Miura
- Department of Mechanical Engineering Tokyo Institute of Technology Tokyo Japan
| | - Taisei Kaneko
- Department of Modern Mechanical Engineering Waseda University Tokyo Japan
| | - Kazuya Kawamura
- Center for Frontier Medical Engineering Chiba University Chiba Japan
| | - Yo Kobayashi
- Healthcare Robotics Institute Future Robotics Organization Waseda University Tokyo Japan
| | - Masakatsu G. Fujie
- Healthcare Robotics Institute Future Robotics Organization Waseda University Tokyo Japan
| |
Collapse
|
41
|
Lam K, Chen J, Wang Z, Iqbal FM, Darzi A, Lo B, Purkayastha S, Kinross JM. Machine learning for technical skill assessment in surgery: a systematic review. NPJ Digit Med 2022; 5:24. [PMID: 35241760 PMCID: PMC8894462 DOI: 10.1038/s41746-022-00566-0] [Citation(s) in RCA: 47] [Impact Index Per Article: 15.7] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/19/2021] [Accepted: 01/21/2022] [Indexed: 12/18/2022] Open
Abstract
Accurate and objective performance assessment is essential for both trainees and certified surgeons. However, existing methods can be time consuming, labor intensive, and subject to bias. Machine learning (ML) has the potential to provide rapid, automated, and reproducible feedback without the need for expert reviewers. We aimed to systematically review the literature and determine the ML techniques used for technical surgical skill assessment and identify challenges and barriers in the field. A systematic literature search, in accordance with the PRISMA statement, was performed to identify studies detailing the use of ML for technical skill assessment in surgery. Of the 1896 studies that were retrieved, 66 studies were included. The most common ML methods used were Hidden Markov Models (HMM, 14/66), Support Vector Machines (SVM, 17/66), and Artificial Neural Networks (ANN, 17/66). 40/66 studies used kinematic data, 19/66 used video or image data, and 7/66 used both. Studies assessed the performance of benchtop tasks (48/66), simulator tasks (10/66), and real-life surgery (8/66). Accuracy rates of over 80% were achieved, although tasks and participants varied between studies. Barriers to progress in the field included a focus on basic tasks, lack of standardization between studies, and lack of datasets. ML has the potential to produce accurate and objective surgical skill assessment through the use of methods including HMM, SVM, and ANN. Future ML-based assessment tools should move beyond the assessment of basic tasks and towards real-life surgery and provide interpretable feedback with clinical value for the surgeon. PROSPERO registration: CRD42020226071.
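A minimal sketch of one configuration this review counts most often: an SVM trained on per-trial kinematic summary features to separate skill levels, evaluated with cross-validation. The synthetic features and labels below only illustrate the pipeline shape and are not drawn from any of the reviewed studies.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Stand-in for per-trial kinematic summaries (path length, speed statistics, jerk, ...).
X, y = make_classification(n_samples=200, n_features=10, n_informative=5, random_state=42)

model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
print(cross_val_score(model, X, y, cv=5).mean())  # mean cross-validated accuracy
```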
Collapse
Affiliation(s)
- Kyle Lam
- Department of Surgery and Cancer, 10th Floor Queen Elizabeth the Queen Mother Building, St Mary's Hospital, Imperial College, London, W2 1NY, UK
| | - Junhong Chen
- Department of Surgery and Cancer, 10th Floor Queen Elizabeth the Queen Mother Building, St Mary's Hospital, Imperial College, London, W2 1NY, UK
| | - Zeyu Wang
- Department of Surgery and Cancer, 10th Floor Queen Elizabeth the Queen Mother Building, St Mary's Hospital, Imperial College, London, W2 1NY, UK
| | - Fahad M Iqbal
- Department of Surgery and Cancer, 10th Floor Queen Elizabeth the Queen Mother Building, St Mary's Hospital, Imperial College, London, W2 1NY, UK
| | - Ara Darzi
- Department of Surgery and Cancer, 10th Floor Queen Elizabeth the Queen Mother Building, St Mary's Hospital, Imperial College, London, W2 1NY, UK
| | - Benny Lo
- Department of Surgery and Cancer, 10th Floor Queen Elizabeth the Queen Mother Building, St Mary's Hospital, Imperial College, London, W2 1NY, UK
| | - Sanjay Purkayastha
- Department of Surgery and Cancer, 10th Floor Queen Elizabeth the Queen Mother Building, St Mary's Hospital, Imperial College, London, W2 1NY, UK.
| | - James M Kinross
- Department of Surgery and Cancer, 10th Floor Queen Elizabeth the Queen Mother Building, St Mary's Hospital, Imperial College, London, W2 1NY, UK
| |
Collapse
|
42
|
Xue Y, Liu S, Li Y, Wang P, Qian X. A new weakly supervised strategy for surgical tool detection. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2021.107860] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/16/2022]
|
43
|
Junger D, Frommer SM, Burgert O. State-of-the-art of situation recognition systems for intraoperative procedures. Med Biol Eng Comput 2022; 60:921-939. [PMID: 35178622 PMCID: PMC8933302 DOI: 10.1007/s11517-022-02520-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/16/2020] [Accepted: 01/30/2022] [Indexed: 11/05/2022]
Abstract
One of the key challenges for automatic assistance is the support of actors in the operating room depending on the status of the procedure. Therefore, context information collected in the operating room is used to gain knowledge about the current situation. In literature, solutions already exist for specific use cases, but it is doubtful to what extent these approaches can be transferred to other conditions. We conducted a comprehensive literature research on existing situation recognition systems for the intraoperative area, covering 274 articles and 95 cross-references published between 2010 and 2019. We contrasted and compared 58 identified approaches based on defined aspects such as used sensor data or application area. In addition, we discussed applicability and transferability. Most of the papers focus on video data for recognizing situations within laparoscopic and cataract surgeries. Not all of the approaches can be used online for real-time recognition. Using different methods, good results with recognition accuracies above 90% could be achieved. Overall, transferability is less addressed. The applicability of approaches to other circumstances seems to be possible to a limited extent. Future research should place a stronger focus on adaptability. The literature review shows differences within existing approaches for situation recognition and outlines research trends. Applicability and transferability to other conditions are less addressed in current work.
Collapse
Affiliation(s)
- D Junger
- School of Informatics, Research Group Computer Assisted Medicine (CaMed), Reutlingen University, Alteburgstr. 150, 72762, Reutlingen, Germany.
| | - S M Frommer
- School of Informatics, Research Group Computer Assisted Medicine (CaMed), Reutlingen University, Alteburgstr. 150, 72762, Reutlingen, Germany
| | - O Burgert
- School of Informatics, Research Group Computer Assisted Medicine (CaMed), Reutlingen University, Alteburgstr. 150, 72762, Reutlingen, Germany
| |
Collapse
|
44
|
Yan J, Huang K, Lindgren K, Bonaci T, Chizeck HJ. Continuous Operator Authentication for Teleoperated Systems Using Hidden Markov Models. ACM TRANSACTIONS ON CYBER-PHYSICAL SYSTEMS 2022. [DOI: 10.1145/3488901] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 10/19/2022]
Abstract
In this article, we present a novel approach for continuous operator authentication in teleoperated robotic processes based on Hidden Markov Models (HMM). While HMMs were originally developed and widely used in speech recognition, they have shown great performance in human motion and activity modeling. We make an analogy between human language and teleoperated robotic processes (i.e., words are analogous to a teleoperator’s gestures, sentences are analogous to the entire teleoperated task or process) and implement HMMs to model the teleoperated task. To test the continuous authentication performance of the proposed method, we conducted two sets of analyses. We built a virtual reality (VR) experimental environment using a commodity VR headset (HTC Vive) and haptic feedback enabled controller (Sensable PHANToM Omni) to simulate a real teleoperated task. An experimental study with 10 subjects was then conducted. We also performed simulated continuous operator authentication by using the JHU-ISI Gesture and Skill Assessment Working Set (JIGSAWS). The performance of the model was evaluated based on the continuous (real-time) operator authentication accuracy as well as resistance to a simulated impersonation attack. The results suggest that the proposed method is able to achieve 70% (VR experiment) and 81% (JIGSAWS dataset) continuous classification accuracy with as short as a 1-second sample window. It is also capable of detecting an impersonation attack in real-time.
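A minimal sketch of the idea described above: fit one HMM per operator on enrollment motion data, then authenticate a short sliding window by comparing per-operator log-likelihoods. It uses the third-party hmmlearn package; the window length, feature dimensions, and synthetic data are illustrative assumptions rather than the authors' setup.

```python
import numpy as np
from hmmlearn.hmm import GaussianHMM

rng = np.random.default_rng(0)

def train_operator_model(motion: np.ndarray) -> GaussianHMM:
    """motion: (samples, features) kinematic stream from one operator's enrollment data."""
    model = GaussianHMM(n_components=5, covariance_type="diag", n_iter=50, random_state=0)
    model.fit(motion)
    return model

# Enroll two operators on synthetic kinematic features (e.g., tool-tip velocities).
models = {op: train_operator_model(rng.normal(loc=i, size=(500, 6)))
          for i, op in enumerate(["A", "B"])}

# Authenticate a short window (here, 30 samples) claimed to come from operator "A".
window = rng.normal(loc=0, size=(30, 6))
scores = {op: m.score(window) for op, m in models.items()}
print(max(scores, key=scores.get))  # accept if the claimed identity has the highest likelihood
```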
Collapse
|
45
|
Holden MS, O'Brien M, Malpani A, Naz H, Tseng YW, Ishii L, Swaroop Vedula S, Ishii M, Hager G. Reconstructing the nasal septum from instrument motion during septoplasty surgery. J Med Imaging (Bellingham) 2021; 8:065001. [PMID: 34796250 DOI: 10.1117/1.jmi.8.6.065001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/23/2020] [Accepted: 10/18/2021] [Indexed: 11/14/2022] Open
Abstract
Purpose: Surgery involves modifying anatomy to achieve a goal. Reconstructing anatomy can facilitate surgical care through surgical planning, real-time decision support, or anticipating outcomes. Tool motion is a rich source of data that can be used to quantify anatomy. Our work develops and validates a method for reconstructing the nasal septum from unstructured motion of the Cottle elevator during the elevation phase of septoplasty surgery, without the need to explicitly delineate the surface of the septum. Approach: The proposed method uses iterative closest point registration to initially register a template septum to the tool motion. Subsequently, statistical shape modeling with iterative most likely oriented point registration is used to fit the reconstructed septum to Cottle tip position and orientation during flap elevation. Regularization of the shape model and transformation is incorporated. The proposed methods were validated on 10 septoplasty surgeries performed on cadavers by operators of varying experience level. Preoperative CT images of the cadaver septums were segmented as ground truth. Results: We estimated reconstruction error as the difference between the projections of the Cottle tip onto the surface of the reconstructed septum and the ground-truth septum segmented from the CT image. We found translational differences of 2.74 (2.06-2.81) mm and rotational differences of 8.95 (7.11-10.55) deg between the reconstructed septum and the ground-truth septum [median (interquartile range)], given the optimal regularization parameters. Conclusions: Accurate reconstruction of the nasal septum can be achieved from tool tracking data during septoplasty surgery on cadavers. This enables understanding of the septal anatomy without the need for traditional medical imaging. This result may be used to facilitate surgical planning, intraoperative care, or skills assessment.
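A minimal sketch of the rigid iterative closest point (ICP) initialization step named in this abstract: repeatedly matching tool-tip samples to a template point cloud and re-estimating a rigid transform via SVD (Kabsch). The statistical shape model refinement is omitted, and the point clouds below are synthetic placeholders, not septum data.

```python
import numpy as np
from scipy.spatial import cKDTree

def rigid_icp(source: np.ndarray, target: np.ndarray, iters: int = 20):
    """Align source points (e.g., tool-tip samples) to target points (e.g., template surface)."""
    src = source.copy()
    tree = cKDTree(target)
    R_total, t_total = np.eye(3), np.zeros(3)
    for _ in range(iters):
        _, idx = tree.query(src)                      # closest-point correspondences
        matched = target[idx]
        mu_s, mu_m = src.mean(axis=0), matched.mean(axis=0)
        H = (src - mu_s).T @ (matched - mu_m)         # cross-covariance of centered points
        U, _, Vt = np.linalg.svd(H)
        D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
        R = Vt.T @ D @ U.T                            # proper rotation (Kabsch solution)
        t = mu_m - R @ mu_s
        src = src @ R.T + t                           # apply the incremental transform
        R_total, t_total = R @ R_total, R @ t_total + t
    return R_total, t_total

rng = np.random.default_rng(0)
template = rng.normal(size=(2000, 3))                  # stand-in template point cloud
tool_tips = template[:400] + np.array([0.3, 0.2, -0.1])  # shifted subset as "tool motion"
R_est, t_est = rigid_icp(tool_tips, template)
print(t_est)  # estimated translation aligning the tool samples to the template
```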
Collapse
Affiliation(s)
- Matthew S Holden
- Johns Hopkins University, Malone Center for Engineering in Healthcare, Baltimore, Maryland, United States.,Carleton University, School of Computer Science, Ottawa, Canada
| | - Molly O'Brien
- Johns Hopkins University, Malone Center for Engineering in Healthcare, Baltimore, Maryland, United States
| | - Anand Malpani
- Johns Hopkins University, Malone Center for Engineering in Healthcare, Baltimore, Maryland, United States
| | - Hajira Naz
- Johns Hopkins University, Malone Center for Engineering in Healthcare, Baltimore, Maryland, United States
| | - Ya-Wei Tseng
- Johns Hopkins University, Malone Center for Engineering in Healthcare, Baltimore, Maryland, United States
| | - Lisa Ishii
- Johns Hopkins University, School of Medicine, Department of Otolaryngology-Head and Neck Surgery, Baltimore, Maryland, United States
| | - S Swaroop Vedula
- Johns Hopkins University, Malone Center for Engineering in Healthcare, Baltimore, Maryland, United States
| | - Masaru Ishii
- Johns Hopkins University, School of Medicine, Department of Otolaryngology-Head and Neck Surgery, Baltimore, Maryland, United States
| | - Gregory Hager
- Johns Hopkins University, Malone Center for Engineering in Healthcare, Baltimore, Maryland, United States
| |
Collapse
|
46
|
Zhao S, Zhang X, Jin F, Hahn J. An Auxiliary Tasks Based Framework for Automated Medical Skill Assessment with Limited Data. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2021; 2021:1613-1617. [PMID: 34891594 DOI: 10.1109/embc46164.2021.9630498] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/14/2023]
Abstract
Automated medical skill assessment facilitates medical education by merging varying clinical experiences across instructors for standardizing medical training. However, medical datasets for training such automated assessment rarely have satisfactory sizes due to the cost of data collection, safety concerns and privacy restrictions. Current medical training relies on evaluation rubrics that usually include multiple auxiliary labels to support the overall evaluation from varying aspects of the procedure. In this paper, we explore machine learning algorithms to design a generalizable auxiliary-task-based framework for medical skill assessment that addresses the challenge of training automated systems with limited data. Our framework exhaustively mines valid auxiliary information in the evaluation rubric to pre-train the feature extractor before training the skill assessment classifier. Notably, a new regression-based multitask weighting method is the key to pre-training a comprehensive and meaningful feature representation, ensuring that the evaluation rubric is well reflected in the final model. The overall evaluation task can be fine-tuned based on the pre-trained rubric-based feature representation. Our experimental results on two medical skill datasets show that our work can significantly improve performance, achieving 85.9% and 97.4% accuracy on the intubation dataset and the surgical skill dataset, respectively.
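A minimal sketch, assuming PyTorch, of the auxiliary-task idea outlined above: pre-train a shared feature extractor on several rubric-item regression targets with per-task weights, then fine-tune a skill classifier on top. The layer sizes, fixed task weights, and random data are illustrative assumptions; the paper's regression-based weighting method is not reproduced here.

```python
import torch
import torch.nn as nn

feature_extractor = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 64), nn.ReLU())
aux_heads = nn.ModuleList([nn.Linear(64, 1) for _ in range(4)])   # one head per rubric item
task_weights = torch.tensor([1.0, 0.5, 0.5, 0.25])                # assumed fixed weights

x = torch.randn(128, 32)            # sensor-derived features for 128 trials (synthetic)
aux_targets = torch.randn(128, 4)   # rubric sub-scores for each trial (synthetic)

# Stage 1: pre-train extractor and auxiliary heads with a weighted multitask loss.
opt = torch.optim.Adam(list(feature_extractor.parameters()) + list(aux_heads.parameters()), lr=1e-3)
for _ in range(50):
    feats = feature_extractor(x)
    loss = sum(w * nn.functional.mse_loss(h(feats).squeeze(-1), aux_targets[:, i])
               for i, (w, h) in enumerate(zip(task_weights, aux_heads)))
    opt.zero_grad(); loss.backward(); opt.step()

# Stage 2: fine-tune a pass/fail skill classifier on the pre-trained representation.
classifier = nn.Linear(64, 2)
labels = torch.randint(0, 2, (128,))
opt2 = torch.optim.Adam(list(feature_extractor.parameters()) + list(classifier.parameters()), lr=1e-4)
for _ in range(50):
    loss = nn.functional.cross_entropy(classifier(feature_extractor(x)), labels)
    opt2.zero_grad(); loss.backward(); opt2.step()
```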
Collapse
|
47
|
Moglia A, Georgiou K, Georgiou E, Satava RM, Cuschieri A. A systematic review on artificial intelligence in robot-assisted surgery. Int J Surg 2021; 95:106151. [PMID: 34695601 DOI: 10.1016/j.ijsu.2021.106151] [Citation(s) in RCA: 43] [Impact Index Per Article: 10.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/29/2021] [Revised: 10/04/2021] [Accepted: 10/19/2021] [Indexed: 12/12/2022]
Abstract
BACKGROUND Despite the extensive published literature on the significant potential of artificial intelligence (AI), there are no reports on its efficacy in improving patient safety in robot-assisted surgery (RAS). The purposes of this work are to systematically review the published literature on AI in RAS, and to identify and discuss current limitations and challenges. MATERIALS AND METHODS A literature search was conducted on PubMed, Web of Science, Scopus, and IEEE Xplore according to the PRISMA 2020 statement. Eligible articles were peer-reviewed studies published in English from January 1, 2016 to December 31, 2020. AMSTAR 2 was used for quality assessment. Risk of bias was evaluated with the Newcastle-Ottawa quality assessment tool. Data from the studies were presented visually in tables using the SPIDER tool. RESULTS Thirty-five publications, representing 3436 patients, met the search criteria and were included in the analysis. The selected reports concern: motion analysis (n = 17), urology (n = 12), gynecology (n = 1), other specialties (n = 1), training (n = 3), and tissue retraction (n = 1). Precision for surgical tool detection varied from 76.0% to 90.6%. Mean absolute error on prediction of urinary continence after robot-assisted radical prostatectomy (RARP) ranged from 85.9 to 134.7 days. Accuracy on prediction of length of stay after RARP was 88.5%. Accuracy on recognition of the next surgical task during robot-assisted partial nephrectomy (RAPN) reached 75.7%. CONCLUSION The reviewed studies were of low quality. The findings are limited by the small size of the datasets. Comparison between studies on the same topic was restricted due to heterogeneity of algorithms and datasets. There is no proof that currently AI can identify the critical tasks of RAS operations, which determine patient outcome. There is an urgent need for studies on large datasets and external validation of the AI algorithms used. Furthermore, the results should be transparent and meaningful to surgeons, enabling them to inform patients in layman's terms. REGISTRATION Review Registry Unique Identifying Number: reviewregistry1225.
Collapse
Affiliation(s)
- Andrea Moglia
- EndoCAS, Center for Computer Assisted Surgery, University of Pisa, 56124, Pisa, Italy; 1st Propaedeutic Surgical Unit, Hippocrateion Athens General Hospital, Athens Medical School, National and Kapodistrian University of Athens, Greece; MPLSC, Athens Medical School, National and Kapodistrian University of Athens, Greece; Department of Surgery, University of Washington Medical Center, Seattle, WA, United States; Scuola Superiore Sant'Anna of Pisa, 56214, Pisa, Italy; Institute for Medical Science and Technology, University of Dundee, Dundee, DD2 1FD, United Kingdom
| | | | | | | | | |
Collapse
|
48
|
Qin Y, Allan M, Burdick JW, Azizian M. Autonomous Hierarchical Surgical State Estimation During Robot-Assisted Surgery Through Deep Neural Networks. IEEE Robot Autom Lett 2021. [DOI: 10.1109/lra.2021.3091728] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022]
|
49
|
Holden MS, Portillo A, Salame G. Skills Classification in Cardiac Ultrasound with Temporal Convolution and Domain Knowledge Using a Low-Cost Probe Tracker. ULTRASOUND IN MEDICINE & BIOLOGY 2021; 47:3002-3013. [PMID: 34344562 DOI: 10.1016/j.ultrasmedbio.2021.06.011] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/10/2020] [Revised: 04/29/2021] [Accepted: 06/17/2021] [Indexed: 06/13/2023]
Abstract
As point-of-care ultrasound (POCUS) becomes more integrated into clinical practice, it is essential to address all aspects of ultrasound operator proficiency. Ultrasound proficiency requires the ability to acquire, interpret and integrate bedside ultrasound images. The difference in image acquisition psychomotor skills between novice (trainee) and expert (instructor) ultrasonographers has not been described. We created an inexpensive system, called Probe Watch, to record probe motion and assess image acquisition in cardiac POCUS using an inertial measurement device and data-recording software based on open-source components. We designed a temporal convolutional network for skills classification from probe motion that integrates clinical domain knowledge. We further designed data augmentation methods to improve its generalization. Subsequently, we validated the setup and assessment method on a set of novice and expert sonographers performing cardiac ultrasound in a simulation-based training environment. The proposed methods classified participants as novice or expert with areas under the receiver operating characteristic curve of 0.931 and 0.761 for snippets and trials, respectively. Integrating domain knowledge into the neural network had added value. Furthermore, we identified the most discriminative features for assessment. Probe Watch quantifies motion during cardiac ultrasound and provides insight into probe motion behavior. It may be deployed during cardiac ultrasound training to monitor learning curves objectively and automatically.
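A minimal sketch, assuming PyTorch, of a temporal convolutional classifier for fixed-length IMU "snippets" together with a simple noise-jitter augmentation, to illustrate the kind of pipeline this abstract describes. It is not the Probe Watch network; the channel count, snippet length, layers, and augmentation are assumptions.

```python
import torch
import torch.nn as nn

class SnippetTCN(nn.Module):
    def __init__(self, channels: int = 6, n_classes: int = 2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(channels, 32, kernel_size=5, padding=4, dilation=2), nn.ReLU(),
            nn.Conv1d(32, 32, kernel_size=5, padding=8, dilation=4), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),                    # pool over time
        )
        self.head = nn.Linear(32, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, time) snippet of accelerometer + gyroscope signals
        return self.head(self.net(x).squeeze(-1))

def jitter(x: torch.Tensor, sigma: float = 0.03) -> torch.Tensor:
    """Augmentation: add small Gaussian noise to each IMU sample."""
    return x + sigma * torch.randn_like(x)

snippets = torch.randn(8, 6, 200)            # 8 snippets, 6 IMU channels, 200 samples
logits = SnippetTCN()(jitter(snippets))      # (8, 2) novice-vs-expert scores
```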
Collapse
Affiliation(s)
- Matthew S Holden
- School of Computer Science, Carleton University, Ottawa, Ontario, Canada.
| | | | | |
Collapse
|
50
|
|