1. Ban Y, Eckhoff JA, Ward TM, Hashimoto DA, Meireles OR, Rus D, Rosman G. Concept Graph Neural Networks for Surgical Video Understanding. IEEE Trans Med Imaging 2024; 43:264-274. [PMID: 37498757] [DOI: 10.1109/tmi.2023.3299518]
Abstract
Analysis of relations between objects and comprehension of abstract concepts in surgical video are important in AI-augmented surgery. However, building models that integrate our knowledge and understanding of surgery remains a challenging endeavor. In this paper, we propose a novel way to integrate conceptual knowledge into temporal analysis tasks using temporal concept graph networks. In the proposed networks, a knowledge graph is incorporated into the temporal video analysis of surgical notions, learning the meaning of concepts and relations as they apply to the data. We demonstrate results on surgical video data for tasks such as verification of the critical view of safety, estimation of the Parkland grading scale, and recognition of instrument-action-tissue triplets. The results show that our method improves recognition and detection on complex benchmarks and enables other analytic applications of interest.
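To make the idea of coupling a knowledge graph with per-frame video features concrete, the following is a minimal, illustrative sketch in PyTorch; the architecture, layer sizes, and uniform adjacency matrix are assumptions for illustration and are not the network described in the paper.

```python
import torch
import torch.nn as nn


class ToyConceptGNN(nn.Module):
    """One round of message passing over concept nodes, scored against a frame feature."""

    def __init__(self, num_concepts: int, feat_dim: int, hidden_dim: int = 64):
        super().__init__()
        self.concept_emb = nn.Embedding(num_concepts, hidden_dim)  # learned concept nodes
        self.frame_proj = nn.Linear(feat_dim, hidden_dim)          # project per-frame features
        self.msg = nn.Linear(hidden_dim, hidden_dim)                # message-passing weights

    def forward(self, frame_feat: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # frame_feat: (batch, feat_dim); adj: (num_concepts, num_concepts), row-normalized
        nodes = torch.relu(self.msg(adj @ self.concept_emb.weight))  # propagate along graph edges
        frame = torch.relu(self.frame_proj(frame_feat))              # (batch, hidden_dim)
        return frame @ nodes.t()                                     # (batch, num_concepts) concept logits


# Toy usage: 5 concepts (e.g. criteria of the critical view of safety), 512-dim frame features.
model = ToyConceptGNN(num_concepts=5, feat_dim=512)
adj = torch.full((5, 5), 1.0 / 5)  # uniform toy adjacency
print(model(torch.randn(2, 512), adj).shape)  # torch.Size([2, 5])
```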
2. Eckhoff JA, Rosman G, Altieri MS, Speidel S, Stoyanov D, Anvari M, Maier-Hein L, März K, Jannin P, Pugh C, Wagner M, Witkowski E, Shaw P, Madani A, Ban Y, Ward T, Filicori F, Padoy N, Talamini M, Meireles OR. SAGES consensus recommendations on surgical video data use, structure, and exploration (for research in artificial intelligence, clinical quality improvement, and surgical education). Surg Endosc 2023; 37:8690-8707. [PMID: 37516693] [PMCID: PMC10616217] [DOI: 10.1007/s00464-023-10288-3]
Abstract
BACKGROUND Surgery generates a vast amount of data from each procedure. Video data in particular provide significant value for surgical research, clinical outcome assessment, quality control, and education. The data lifecycle is influenced by various factors, including data structure, acquisition, storage, and sharing; data use and exploration; and finally data governance, which encompasses all ethical and legal regulations associated with the data. There is a universal need among stakeholders in surgical data science to establish standardized frameworks that address all aspects of this lifecycle to ensure data quality and fitness for purpose. METHODS Working groups were formed among 48 representatives from academia and industry, including clinicians, computer scientists, and industry representatives. These working groups focused on: Data Use, Data Structure, Data Exploration, and Data Governance. After working group and panel discussions, a modified Delphi process was conducted. RESULTS The resulting Delphi consensus provides conceptualized and structured recommendations for each domain related to surgical video data. We identified the key stakeholders within the data lifecycle and formulated comprehensive, easily understandable, and widely applicable guidelines for data utilization. Standardization of data structure should encompass format and quality, data sources, documentation, and metadata, and should account for biases within the data. To foster scientific data exploration, datasets should reflect diversity and remain adaptable to future applications. Data governance must be transparent to all stakeholders, addressing legal and ethical considerations surrounding the data. CONCLUSION This consensus presents essential recommendations around the generation of standardized and diverse surgical video databanks, accounting for the multiple stakeholders involved in data generation and use throughout its lifecycle. Following the SAGES annotation framework, we lay the foundation for standardization of data use, structure, and exploration. A detailed exploration of requirements for adequate data governance will follow.
Affiliation(s)
- Jennifer A Eckhoff: Surgical Artificial Intelligence and Innovation Laboratory, Department of Surgery, Massachusetts General Hospital, 15 Parkman Street, WAC339, Boston, MA, 02114, USA; Department of General, Visceral, Tumor and Transplant Surgery, University Hospital Cologne, Kerpenerstrasse 62, 50937, Cologne, Germany
- Guy Rosman: Surgical Artificial Intelligence and Innovation Laboratory, Department of Surgery, Massachusetts General Hospital, 15 Parkman Street, WAC339, Boston, MA, 02114, USA; Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, 32 Vassar St, Cambridge, MA, 02139, USA
- Maria S Altieri: Stony Brook University Hospital, Washington University in St. Louis, 101 Nicolls Rd, Stony Brook, NY, 11794, USA
- Stefanie Speidel: National Center for Tumor Diseases (NCT), Fiedlerstraße 23, 01307, Dresden, Germany
- Danail Stoyanov: University College London, 43-45 Foley Street, London, W1W 7TY, UK
- Mehran Anvari: Center for Surgical Invention and Innovation, Department of Surgery, McMaster University, Hamilton, ON, Canada
- Lena Maier-Hein: German Cancer Research Center, Deutsches Krebsforschungszentrum (DKFZ), Im Neuenheimer Feld 280, 69120, Heidelberg, Germany
- Keno März: German Cancer Research Center, Deutsches Krebsforschungszentrum (DKFZ), Im Neuenheimer Feld 280, 69120, Heidelberg, Germany
- Pierre Jannin: MediCIS, University of Rennes - Campus Beaulieu, 2 Av. du Professeur Léon Bernard, 35043, Rennes, France
- Carla Pugh: Department of Surgery, Stanford School of Medicine, 291 Campus Drive, Stanford, CA, 94305, USA
- Martin Wagner: Department of Surgery, University Hospital Heidelberg, Im Neuenheimer Feld 420, 69120, Heidelberg, Germany
- Elan Witkowski: Surgical Artificial Intelligence and Innovation Laboratory, Department of Surgery, Massachusetts General Hospital, 15 Parkman Street, WAC339, Boston, MA, 02114, USA
- Paresh Shaw: New York University Langone, 530 1St Ave. Floor 12, New York, NY, 10016, USA
- Amin Madani: Surgical Artificial Intelligence Research Academy, Department of Surgery, University Health Network, Toronto, ON, Canada
- Yutong Ban: Surgical Artificial Intelligence and Innovation Laboratory, Department of Surgery, Massachusetts General Hospital, 15 Parkman Street, WAC339, Boston, MA, 02114, USA; Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, 32 Vassar St, Cambridge, MA, 02139, USA
- Thomas Ward: Surgical Artificial Intelligence and Innovation Laboratory, Department of Surgery, Massachusetts General Hospital, 15 Parkman Street, WAC339, Boston, MA, 02114, USA
- Filippo Filicori: Intraoperative Performance Analytics Laboratory (IPAL), Department of General Surgery, Northwell Health, Lenox Hill Hospital, New York, NY, USA
- Nicolas Padoy: IHU Strasbourg - Institute of Image-Guided Surgery, 1 Pl. de L'Hôpital, 67000, Strasbourg, France
- Mark Talamini: Donald and Barbara Zucker School of Medicine at Hofstra/Northwell, Hempstead, NY, USA
- Ozanan R Meireles: Surgical Artificial Intelligence and Innovation Laboratory, Department of Surgery, Massachusetts General Hospital, 15 Parkman Street, WAC339, Boston, MA, 02114, USA
3. Eckhoff JA, Ban Y, Rosman G, Müller DT, Hashimoto DA, Witkowski E, Babic B, Rus D, Bruns C, Fuchs HF, Meireles O. TEsoNet: knowledge transfer in surgical phase recognition from laparoscopic sleeve gastrectomy to the laparoscopic part of Ivor-Lewis esophagectomy. Surg Endosc 2023; 37:4040-4053. [PMID: 36932188] [PMCID: PMC10156818] [DOI: 10.1007/s00464-023-09971-2]
Abstract
BACKGROUND Surgical phase recognition using computer vision is an essential requirement for artificial intelligence-assisted analysis of surgical workflow. Its performance is heavily dependent on large amounts of annotated video data, which remain a limited resource, especially for highly specialized procedures. Knowledge transfer from common to more complex procedures can promote data efficiency: phase recognition models trained on large, readily available datasets may be extrapolated and transferred to smaller datasets of different procedures to improve generalizability. The conditions under which transfer learning is appropriate and feasible remain to be established. METHODS We defined ten operative phases for the laparoscopic part of Ivor-Lewis esophagectomy through expert consensus. A dataset of 40 videos was annotated accordingly. An established model architecture for phase recognition (CNN + LSTM) was adapted to generate a "Transferal Esophagectomy Network" (TEsoNet) for co-training and transfer learning from laparoscopic sleeve gastrectomy to the laparoscopic part of Ivor-Lewis esophagectomy, exploring different training set compositions and training weights. RESULTS The explored model architecture is capable of accurate phase detection in complex procedures, such as esophagectomy, even with small quantities of training data. Knowledge transfer between two upper gastrointestinal procedures is feasible and achieves reasonable accuracy for operative phases with high procedural overlap. CONCLUSION Robust phase recognition models can achieve reasonable, yet phase-specific, accuracy through transfer learning and co-training between two related procedures, even when exposed to small amounts of training data for the target procedure. Further exploration is required to determine the appropriate data volumes, key characteristics of the training procedure, and temporal annotation methods required for successful transferal phase recognition. Transfer learning across different procedures addressing small datasets may increase data efficiency. Finally, to enable the surgical application of AI for intraoperative risk mitigation, coverage of rare, specialized procedures needs to be explored.
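For readers unfamiliar with the CNN + LSTM pattern referenced above, the following is a minimal sketch of a generic per-frame phase recognizer in PyTorch; the ResNet-18 backbone, layer sizes, and the hypothetical pretrained-checkpoint path are assumptions for illustration, not TEsoNet itself.

```python
import torch
import torch.nn as nn
from torchvision import models


class PhaseRecognizer(nn.Module):
    """Generic CNN + LSTM phase recognizer: per-frame features, temporal model, per-frame logits."""

    def __init__(self, num_phases: int, hidden: int = 256):
        super().__init__()
        backbone = models.resnet18(weights=None)            # per-frame feature extractor
        backbone.fc = nn.Identity()                          # expose 512-d features
        self.cnn = backbone
        self.lstm = nn.LSTM(512, hidden, batch_first=True)   # temporal model over the video
        self.head = nn.Linear(hidden, num_phases)            # per-frame phase logits

    def forward(self, clips: torch.Tensor) -> torch.Tensor:
        b, t, c, h, w = clips.shape                           # (batch, time, 3, H, W)
        feats = self.cnn(clips.flatten(0, 1)).view(b, t, -1)  # (batch, time, 512)
        out, _ = self.lstm(feats)
        return self.head(out)                                 # (batch, time, num_phases)


# Transfer-learning sketch: start from weights trained on the source procedure
# (e.g. sleeve gastrectomy), then fine-tune on the target procedure.
model = PhaseRecognizer(num_phases=10)
# model.load_state_dict(torch.load("sleeve_pretrained.pt"), strict=False)  # hypothetical checkpoint
print(model(torch.randn(1, 8, 3, 224, 224)).shape)  # torch.Size([1, 8, 10])
```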
Affiliation(s)
- J A Eckhoff: Surgical Artificial Intelligence and Innovation Laboratory, Department of Surgery, Massachusetts General Hospital, 15 Parkman Street, WAC339, Boston, MA, 02114, USA; Department of General, Visceral, Tumor and Transplant Surgery, University Hospital Cologne, Kerpenerstrasse 62, 50937, Cologne, Germany
- Y Ban: Surgical Artificial Intelligence and Innovation Laboratory, Department of Surgery, Massachusetts General Hospital, 15 Parkman Street, WAC339, Boston, MA, 02114, USA; Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, 32 Vassar St, Cambridge, MA, 02139, USA
- G Rosman: Surgical Artificial Intelligence and Innovation Laboratory, Department of Surgery, Massachusetts General Hospital, 15 Parkman Street, WAC339, Boston, MA, 02114, USA; Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, 32 Vassar St, Cambridge, MA, 02139, USA
- D T Müller: Department of General, Visceral, Tumor and Transplant Surgery, University Hospital Cologne, Kerpenerstrasse 62, 50937, Cologne, Germany
- D A Hashimoto: Department of Surgery, University Hospitals Cleveland Medical Center, Cleveland, OH, 44106, USA; Department of Surgery, Case Western Reserve School of Medicine, Cleveland, OH, 44106, USA
- E Witkowski: Surgical Artificial Intelligence and Innovation Laboratory, Department of Surgery, Massachusetts General Hospital, 15 Parkman Street, WAC339, Boston, MA, 02114, USA
- B Babic: Department of General, Visceral, Tumor and Transplant Surgery, University Hospital Cologne, Kerpenerstrasse 62, 50937, Cologne, Germany
- D Rus: Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, 32 Vassar St, Cambridge, MA, 02139, USA
- C Bruns: Department of General, Visceral, Tumor and Transplant Surgery, University Hospital Cologne, Kerpenerstrasse 62, 50937, Cologne, Germany
- H F Fuchs: Department of General, Visceral, Tumor and Transplant Surgery, University Hospital Cologne, Kerpenerstrasse 62, 50937, Cologne, Germany
- O Meireles: Surgical Artificial Intelligence and Innovation Laboratory, Department of Surgery, Massachusetts General Hospital, 15 Parkman Street, WAC339, Boston, MA, 02114, USA
4. Ward TM, Hashimoto DA, Ban Y, Rosman G, Meireles OR. Artificial intelligence prediction of cholecystectomy operative course from automated identification of gallbladder inflammation. Surg Endosc 2022; 36:6832-6840. [PMID: 35031869] [DOI: 10.1007/s00464-022-09009-z]
Abstract
BACKGROUND Operative courses of laparoscopic cholecystectomies vary widely due to differing pathologies. Efforts to assess intra-operative difficulty include the Parkland grading scale (PGS), which scores inflammation from the initial view of the gallbladder on a 1-5 scale. We investigated the impact of PGS on intra-operative outcomes, including laparoscopic duration, attainment of the critical view of safety (CVS), and gallbladder injury. We additionally trained an artificial intelligence (AI) model to identify PGS. METHODS One surgeon labeled surgical phases, PGS, CVS attainment, and gallbladder injury in 200 cholecystectomy videos. We used multilevel Bayesian regression models to analyze the effect of PGS on intra-operative outcomes. We trained AI models to identify PGS from an initial view of the gallbladder and compared model performance to annotations by a second surgeon. RESULTS Slightly inflamed gallbladders (PGS-2) minimally increased duration, adding 2.7 [95% compatibility interval (CI) 0.3-7.0] minutes to an operation. This contrasted with maximally inflamed gallbladders (PGS-5), where on average 16.9 (95% CI 4.4-33.9) minutes were added, with 31.3 (95% CI 8.0-67.5) minutes added for the most affected surgeon. Inadvertent gallbladder injury occurred in 25% of cases, with a minimal increase in gallbladder injury observed with added inflammation. However, up to a 28% (95% CI -2 to 63) increase in the probability of a gallbladder hole during PGS-5 cases was observed for some surgeons. Inflammation had no substantial effect on whether or not a surgeon attained the CVS. An AI model could reliably quantify inflammation (Krippendorff's α = 0.71, 95% CI 0.65-0.77) when compared to a second surgeon (α = 0.82, 95% CI 0.75-0.87). CONCLUSIONS An AI model can identify the degree of gallbladder inflammation, which is predictive of the intra-operative course of cholecystectomy. This automated assessment could be useful for operating room workflow optimization and for targeted per-surgeon and per-resident feedback to accelerate acquisition of operative skills.
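As a pointer to how the kind of inter-rater agreement reported above can be computed, here is a small sketch using the third-party krippendorff package for ordinal data; the grade arrays are made-up illustration values, not study data.

```python
import numpy as np
import krippendorff  # third-party package: pip install krippendorff

model_grades   = [1, 2, 2, 3, 5, 4, 2, 1, 3, 5]   # hypothetical AI-predicted Parkland grades (1-5)
surgeon_grades = [1, 2, 3, 3, 5, 4, 2, 1, 2, 5]   # hypothetical second annotator

# Rows are raters, columns are cases; ordinal level of measurement suits the 1-5 scale.
reliability = np.array([model_grades, surgeon_grades], dtype=float)
alpha = krippendorff.alpha(reliability_data=reliability, level_of_measurement="ordinal")
print(f"Krippendorff's alpha: {alpha:.2f}")
```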
Affiliation(s)
- Thomas M Ward: Surgical Artificial Intelligence and Innovation Laboratory, Department of Surgery, Massachusetts General Hospital, 15 Parkman St., WAC 460, Boston, MA, 02114, USA
- Daniel A Hashimoto: Surgical Artificial Intelligence and Innovation Laboratory, Department of Surgery, Massachusetts General Hospital, 15 Parkman St., WAC 460, Boston, MA, 02114, USA
- Yutong Ban: Surgical Artificial Intelligence and Innovation Laboratory, Department of Surgery, Massachusetts General Hospital, 15 Parkman St., WAC 460, Boston, MA, 02114, USA; Distributed Robotics Laboratory, Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
- Guy Rosman: Surgical Artificial Intelligence and Innovation Laboratory, Department of Surgery, Massachusetts General Hospital, 15 Parkman St., WAC 460, Boston, MA, 02114, USA; Distributed Robotics Laboratory, Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
- Ozanan R Meireles: Surgical Artificial Intelligence and Innovation Laboratory, Department of Surgery, Massachusetts General Hospital, 15 Parkman St., WAC 460, Boston, MA, 02114, USA
5. Li X, Rosman G, Gilitschenski I, Araki B, Vasile CI, Karaman S, Rus D. Learning an Explainable Trajectory Generator Using the Automaton Generative Network (AGN). IEEE Robot Autom Lett 2022. [DOI: 10.1109/lra.2021.3135940]
6. Ban Y, Rosman G, Eckhoff JA, Ward TM, Hashimoto DA, Kondo T, Iwaki H, Meireles OR, Rus D. SUPR-GAN: SUrgical PRediction GAN for Event Anticipation in Laparoscopic and Robotic Surgery. IEEE Robot Autom Lett 2022. [DOI: 10.1109/lra.2022.3156856]
Affiliation(s)
- Yutong Ban: Distributed Robotics Laboratory, CSAIL, Massachusetts Institute of Technology, Cambridge, MA, USA
- Guy Rosman: Distributed Robotics Laboratory, CSAIL, Massachusetts Institute of Technology, Cambridge, MA, USA
- Daniela Rus: Distributed Robotics Laboratory, CSAIL, Massachusetts Institute of Technology, Cambridge, MA, USA
7. Meireles OR, Rosman G, Altieri MS, Carin L, Hager G, Madani A, Padoy N, Pugh CM, Sylla P, Ward TM, Hashimoto DA. SAGES consensus recommendations on an annotation framework for surgical video. Surg Endosc 2021; 35:4918-4929. [PMID: 34231065] [DOI: 10.1007/s00464-021-08578-9]
Abstract
BACKGROUND The growing interest in analysis of surgical video through machine learning has led to increased research efforts; however, common methods of annotating video data are lacking. There is a need to establish recommendations on the annotation of surgical video data to enable assessment of algorithms and multi-institutional collaboration. METHODS Four working groups were formed from a pool of participants that included clinicians, engineers, and data scientists. The working groups were focused on four themes: (1) temporal models, (2) actions and tasks, (3) tissue characteristics and general anatomy, and (4) software and data structure. A modified Delphi process was utilized to create a consensus survey based on suggested recommendations from each of the working groups. RESULTS After three Delphi rounds, consensus was reached on recommendations for annotation within each of these domains. A hierarchy for annotation of temporal events in surgery was established. CONCLUSIONS While additional work remains to achieve accepted standards for video annotation in surgery, the consensus recommendations on a general framework for annotation presented here lay the foundation for standardization. This type of framework is critical to enabling diverse datasets, performance benchmarks, and collaboration.
Affiliation(s)
- Ozanan R Meireles: Department of Surgery, Massachusetts General Hospital, 15 Parkman Street, WAC460, Boston, MA, 02114, USA
- Guy Rosman: Department of Surgery, Massachusetts General Hospital, 15 Parkman Street, WAC460, Boston, MA, 02114, USA; Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, USA
- Maria S Altieri: Department of Surgery, East Carolina University, Greenville, USA
- Lawrence Carin: Department of Electrical and Computer Engineering, Duke University, Durham, USA
- Gregory Hager: Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, USA
- Amin Madani: Department of Surgery, University Health Network, Toronto, Canada
- Nicolas Padoy: ICube, University of Strasbourg, Strasbourg, France; IHU Strasbourg, Strasbourg, France
- Carla M Pugh: Department of Surgery, Stanford University, Stanford, USA
- Patricia Sylla: Department of Surgery, Mount Sinai Medical Center, New York, USA
- Thomas M Ward: Department of Surgery, Massachusetts General Hospital, 15 Parkman Street, WAC460, Boston, MA, 02114, USA
- Daniel A Hashimoto: Department of Surgery, Massachusetts General Hospital, 15 Parkman Street, WAC460, Boston, MA, 02114, USA
8. Huang X, McGill S, DeCastro J, Fletcher L, Leonard J, Williams B, Rosman G. CARPAL: Confidence-Aware Intent Recognition for Parallel Autonomy. IEEE Robot Autom Lett 2021. [DOI: 10.1109/lra.2021.3068894]
9.
Abstract
Annotation of surgical video is important for establishing ground truth in surgical data science endeavors that involve computer vision. With the growth of the field over the last decade, several challenges have been identified in annotating spatial, temporal, and clinical elements of surgical video as well as challenges in selecting annotators. In reviewing current challenges, we provide suggestions on opportunities for improvement and possible next steps to enable translation of surgical data science efforts in surgical video analysis to clinical research and practice.
Affiliation(s)
- Thomas M Ward: Surgical AI & Innovation Laboratory, Department of Surgery, Massachusetts General Hospital, Boston, MA, USA
- Danyal M Fer: Department of Surgery, University of California San Francisco East Bay, Hayward, CA, USA
- Yutong Ban: Surgical AI & Innovation Laboratory, Department of Surgery, Massachusetts General Hospital, Boston, MA, USA; Distributed Robotics Laboratory, Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
- Guy Rosman: Surgical AI & Innovation Laboratory, Department of Surgery, Massachusetts General Hospital, Boston, MA, USA; Distributed Robotics Laboratory, Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
- Ozanan R Meireles: Surgical AI & Innovation Laboratory, Department of Surgery, Massachusetts General Hospital, Boston, MA, USA
- Daniel A Hashimoto: Surgical AI & Innovation Laboratory, Department of Surgery, Massachusetts General Hospital, Boston, MA, USA
10. Li X, Rosman G, Gilitschenski I, Vasile CI, DeCastro JA, Karaman S, Rus D. Vehicle Trajectory Prediction Using Generative Adversarial Network With Temporal Logic Syntax Tree Features. IEEE Robot Autom Lett 2021. [DOI: 10.1109/lra.2021.3062807]
11. Ward TM, Mascagni P, Ban Y, Rosman G, Padoy N, Meireles O, Hashimoto DA. Computer vision in surgery. Surgery 2020; 169:1253-1256. [PMID: 33272610] [DOI: 10.1016/j.surg.2020.10.039]
Abstract
The fields of computer vision (CV) and artificial intelligence (AI) have undergone rapid advancements in the past decade, many of which have been applied to the analysis of intraoperative video. These advances are driven by widespread application of deep learning, which leverages multiple layers of neural networks to teach computers complex tasks. Prior to these advances, applications of AI in the operating room were limited by our relative inability to train computers to accurately understand images with traditional machine learning (ML) techniques. The development and refinement of deep neural networks that can now accurately identify objects in images and remember past surgical events has sparked a surge in applications of CV to analyze intraoperative video and has allowed for the accurate identification of surgical phases (steps) and instruments across a variety of procedures. In some cases, CV can even identify operative phases with accuracy similar to that of surgeons. Future research will likely expand on this foundation of surgical knowledge using larger video datasets and improved algorithms, with greater accuracy and interpretability, to create clinically useful AI models that gain widespread adoption and augment the surgeon's ability to provide safer care for patients everywhere.
Affiliation(s)
- Thomas M Ward: Surgical Artificial Intelligence and Innovation Laboratory, Massachusetts General Hospital, Harvard Medical School, Boston, MA
- Pietro Mascagni: ICube, University of Strasbourg, CNRS, IHU Strasbourg, France; Fondazione Policlinico Universitario A. Gemelli IRCCS, Rome, Italy
- Yutong Ban: Surgical Artificial Intelligence and Innovation Laboratory, Massachusetts General Hospital, Harvard Medical School, Boston, MA; Distributed Robotics Laboratory, Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA
- Guy Rosman: Surgical Artificial Intelligence and Innovation Laboratory, Massachusetts General Hospital, Harvard Medical School, Boston, MA; Distributed Robotics Laboratory, Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA
- Nicolas Padoy: ICube, University of Strasbourg, CNRS, IHU Strasbourg, France
- Ozanan Meireles: Surgical Artificial Intelligence and Innovation Laboratory, Massachusetts General Hospital, Harvard Medical School, Boston, MA
- Daniel A Hashimoto: Surgical Artificial Intelligence and Innovation Laboratory, Massachusetts General Hospital, Harvard Medical School, Boston, MA
12. Gilitschenski I, Rosman G, Gupta A, Karaman S, Rus D. Deep Context Maps: Agent Trajectory Prediction Using Location-Specific Latent Maps. IEEE Robot Autom Lett 2020. [DOI: 10.1109/lra.2020.3004800]
13. Huang X, McGill SG, DeCastro JA, Fletcher L, Leonard JJ, Williams BC, Rosman G. DiversityGAN: Diversity-Aware Vehicle Motion Prediction via Latent Semantic Sampling. IEEE Robot Autom Lett 2020. [DOI: 10.1109/lra.2020.3005369]
14. Ward TM, Hashimoto DA, Ban Y, Rattner DW, Inoue H, Lillemoe KD, Rus DL, Rosman G, Meireles OR. Automated operative phase identification in peroral endoscopic myotomy. Surg Endosc 2020; 35:4008-4015. [PMID: 32720177] [DOI: 10.1007/s00464-020-07833-9]
Abstract
BACKGROUND Artificial intelligence (AI) and computer vision (CV) have revolutionized image analysis. In surgery, CV applications have focused on surgical phase identification in laparoscopic videos. We proposed to apply CV techniques to identify phases in an endoscopic procedure, peroral endoscopic myotomy (POEM). METHODS POEM videos were collected from Massachusetts General Hospital and Showa University Koto Toyosu Hospital. Videos were labeled by surgeons with the following ground-truth phases: (1) submucosal injection, (2) mucosotomy, (3) submucosal tunnel, (4) myotomy, and (5) mucosotomy closure. The deep-learning CV model, a Convolutional Neural Network (CNN) combined with a Long Short-Term Memory (LSTM) network, was trained on 30 videos to create POEMNet. We then used POEMNet to identify operative phases in the remaining 20 videos. The model's performance was compared to surgeon-annotated ground truth. RESULTS POEMNet's overall phase identification accuracy was 87.6% (95% CI 87.4-87.9%). When evaluated on a per-phase basis, the model performed well, with mean unweighted and prevalence-weighted F1 scores of 0.766 and 0.875, respectively. The model performed best on longer phases, with 70.6% accuracy for phases with a duration under 5 min and 88.3% accuracy for longer phases. DISCUSSION A deep-learning-based approach to CV, previously successful in laparoscopic video phase identification, translates well to endoscopic procedures. With continued refinements, AI could contribute to intra-operative decision-support systems and post-operative risk prediction.
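The summary metrics quoted above (overall accuracy, unweighted and prevalence-weighted F1) can be computed for any set of per-frame labels with scikit-learn; the sketch below uses toy label lists, not POEMNet outputs.

```python
from sklearn.metrics import accuracy_score, f1_score

# Toy per-frame labels for some POEM phases; stand-ins, not study data.
y_true = ["injection", "tunnel", "tunnel", "myotomy", "myotomy", "closure"]
y_pred = ["injection", "tunnel", "myotomy", "myotomy", "myotomy", "closure"]

print("accuracy:", accuracy_score(y_true, y_pred))
print("unweighted (macro) F1:", f1_score(y_true, y_pred, average="macro"))        # each phase counts equally
print("prevalence-weighted F1:", f1_score(y_true, y_pred, average="weighted"))    # weighted by phase frequency
```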
Affiliation(s)
- Thomas M Ward: Surgical AI and Innovation Laboratory, Massachusetts General Hospital, 15 Parkman St., WAC 460, Boston, MA, 02114, USA; Department of Surgery, Massachusetts General Hospital, Boston, MA, USA
- Daniel A Hashimoto: Surgical AI and Innovation Laboratory, Massachusetts General Hospital, 15 Parkman St., WAC 460, Boston, MA, 02114, USA; Department of Surgery, Massachusetts General Hospital, Boston, MA, USA
- Yutong Ban: Surgical AI and Innovation Laboratory, Massachusetts General Hospital, 15 Parkman St., WAC 460, Boston, MA, 02114, USA; Department of Surgery, Massachusetts General Hospital, Boston, MA, USA; Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
- David W Rattner: Department of Surgery, Massachusetts General Hospital, Boston, MA, USA
- Haruhiro Inoue: Digestive Disease Center, Showa University Koto Toyosu Hospital, Tokyo, Japan
- Keith D Lillemoe: Department of Surgery, Massachusetts General Hospital, Boston, MA, USA
- Daniela L Rus: Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
- Guy Rosman: Surgical AI and Innovation Laboratory, Massachusetts General Hospital, 15 Parkman St., WAC 460, Boston, MA, 02114, USA; Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
- Ozanan R Meireles: Surgical AI and Innovation Laboratory, Massachusetts General Hospital, 15 Parkman St., WAC 460, Boston, MA, 02114, USA; Department of Surgery, Massachusetts General Hospital, Boston, MA, USA
15. McGill SG, Rosman G, Ort T, Pierson A, Gilitschenski I, Araki B, Fletcher L, Karaman S, Rus D, Leonard JJ. Probabilistic Risk Metrics for Navigating Occluded Intersections. IEEE Robot Autom Lett 2019. [DOI: 10.1109/lra.2019.2931823]
16.
Abstract
OBJECTIVE The aim of this review was to summarize major topics in artificial intelligence (AI), including their applications and limitations in surgery. This paper reviews the key capabilities of AI to help surgeons understand and critically evaluate new AI applications and to contribute to new developments. SUMMARY BACKGROUND DATA AI is composed of various subfields that each provide potential solutions to clinical problems. Each of the core subfields of AI reviewed in this piece has also been used in other industries such as the autonomous car, social networks, and deep learning computers. METHODS A review of AI papers across computer science, statistics, and medical sources was conducted to identify key concepts and techniques within AI that are driving innovation across industries, including surgery. Limitations and challenges of working with AI were also reviewed. RESULTS Four main subfields of AI were defined: (1) machine learning, (2) artificial neural networks, (3) natural language processing, and (4) computer vision. Their current and future applications to surgical practice were introduced, including big data analytics and clinical decision support systems. The implications of AI for surgeons and the role of surgeons in advancing the technology to optimize clinical effectiveness were discussed. CONCLUSIONS Surgeons are well positioned to help integrate AI into modern practice. Surgeons should partner with data scientists to capture data across phases of care and to provide clinical context, for AI has the potential to revolutionize the way surgery is taught and practiced with the promise of a future optimized for the highest quality patient care.
Affiliation(s)
- Guy Rosman: Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Boston, MA
- Daniela Rus: Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Boston, MA
17. Straub J, Freifeld O, Rosman G, Leonard JJ, Fisher JW. The Manhattan Frame Model - Manhattan World Inference in the Space of Surface Normals. IEEE Trans Pattern Anal Mach Intell 2018; 40:235-249. [PMID: 28166490] [DOI: 10.1109/tpami.2017.2662686]
Abstract
Objects and structures within man-made environments typically exhibit a high degree of organization in the form of orthogonal and parallel planes. Traditional approaches exploit these regularities via the restrictive, and rather local, Manhattan World (MW) assumption, which posits that every plane is perpendicular to one of the axes of a single coordinate system. The aforementioned regularities are especially evident in the surface-normal distribution of a scene, where they manifest as orthogonally coupled clusters. This motivates the introduction of the Manhattan-Frame (MF) model, which captures the notion of an MW in the space of surface normals (the unit sphere), and of two probabilistic MF models over this space. First, for a single MF we propose novel real-time MAP inference algorithms, evaluate their performance, and demonstrate their use in drift-free rotation estimation. Second, to capture the complexity of real-world scenes at a global scale, we extend the MF model to a probabilistic mixture of Manhattan Frames (MMF). For MMF inference we propose a simple MAP inference algorithm and an adaptive Markov-Chain Monte-Carlo sampling algorithm with Metropolis-Hastings split/merge moves that lets us infer the unknown number of mixture components. We demonstrate the versatility of the MMF model and inference algorithm across several scales of man-made environments.
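As a rough, brute-force illustration of the Manhattan-World idea in the space of surface normals (not the authors' probabilistic MF inference), the sketch below scores candidate rotations by how well their axes explain a set of synthetic normals.

```python
import numpy as np
from scipy.spatial.transform import Rotation


def mw_score(normals: np.ndarray, rot: np.ndarray) -> float:
    # normals: (N, 3) unit vectors; rot: (3, 3) rotation matrix whose columns are the MW axes.
    alignment = np.abs(normals @ rot)            # |cosine| between each normal and each axis
    return float(alignment.max(axis=1).mean())   # mean best-axis alignment in [0, 1]


rng = np.random.default_rng(0)
true_R = Rotation.random().as_matrix()
# Synthetic "Manhattan" normals: each drawn near one signed axis of true_R, plus noise.
axes = true_R[:, rng.integers(0, 3, size=500)].T * rng.choice([-1.0, 1.0], size=(500, 1))
normals = axes + 0.05 * rng.normal(size=axes.shape)
normals /= np.linalg.norm(normals, axis=1, keepdims=True)

# Coarse random search over candidate rotations; keep the best-scoring one.
candidates = Rotation.random(2000)
scores = [mw_score(normals, candidates[i].as_matrix()) for i in range(len(candidates))]
print("best alignment score:", max(scores))
```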
18.
Abstract
Document information systems (ISs) can be used to support staff tasks involving documents. Owing to the declining prices of software and hardware, more and more document ISs are being designed, tested, and implemented. This paper presents some types of costs and benefits of document ISs and some design-methodological aspects that arise from the documentary nature of the data. This has consequences both for the method of working and for the method of modelling: there are multiple data types, and the access structures to the information must be designed. A case study of the implementation route of a document IS is described. Based on a quantification and qualification of the document collections and of the costs and benefits of the document IS, the documentary aspects of the system design are given, together with the various sections of the invitation to tender for obtaining a suitable software package.
Affiliation(s)
- G. Rosman: Delft University of Technology, The Netherlands
- H.G. Sol: Delft University of Technology, The Netherlands
19. Dubrovina-Karni A, Rosman G, Kimmel R. Multi-Region Active Contours with a Single Level Set Function. IEEE Trans Pattern Anal Mach Intell 2015; 37:1585-1601. [PMID: 26352997] [DOI: 10.1109/tpami.2014.2385708]
Abstract
Segmenting an image into an arbitrary number of coherent regions is at the core of image understanding. Many formulations of the segmentation problem have been suggested over the past years. These formulations include, among others, axiomatic functionals, which are hard to implement and analyze, and graph-based alternatives, which impose a non-geometric metric on the problem. We propose a novel method for segmenting an image into an arbitrary number of regions using an axiomatic variational approach. The proposed method allows the incorporation of various generic region appearance models, while avoiding metrication errors. In the suggested framework, the segmentation is performed by level set evolution. Yet, contrary to most existing methods, multiple regions are here represented by a single non-negative level set function. The level set function evolution is efficiently executed through the Voronoi Implicit Interface Method for multi-phase interface evolution. The proposed approach is shown to obtain accurate segmentation results for various natural 2D and 3D images, comparable to state-of-the-art image segmentation algorithms.