1. Liu Y, Hayashi Y, Oda M, Kitasaka T, Mori K. YOLOv7-RepFPN: Improving real-time performance of laparoscopic tool detection on embedded systems. Healthc Technol Lett 2024; 11:157-166. [PMID: 38638498] [PMCID: PMC11022232] [DOI: 10.1049/htl2.12072] [Open Access]
Abstract
This study focuses on enhancing the inference speed of laparoscopic tool detection on embedded devices. Laparoscopy, a minimally invasive surgery technique, markedly reduces patient recovery times and postoperative complications. Real-time laparoscopic tool detection assists laparoscopy by providing information for surgical navigation, and its implementation on embedded devices is gaining interest due to the portability, network independence, and scalability of these devices. However, embedded devices often face computational resource limitations that can hinder inference speed. To mitigate this concern, this work introduces a two-fold modification to the YOLOv7 model: the number of feature channels is halved and RepBlock is integrated, yielding the YOLOv7-RepFPN model. This configuration leads to a significant reduction in computational complexity. Additionally, the focal EIoU (efficient intersection over union) loss function is employed for bounding box regression. Experimental results on an embedded device demonstrate that for frame-by-frame laparoscopic tool detection, the proposed YOLOv7-RepFPN achieved an mAP of 88.2% (at an IoU threshold of 0.5) on a custom dataset based on EndoVis17, and an inference speed of 62.9 FPS. Compared with the original YOLOv7, which achieved 89.3% mAP and 41.8 FPS under identical conditions, the proposed method increases speed by 21.1 FPS while maintaining detection accuracy, demonstrating its effectiveness.
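For context, the EIoU loss augments the IoU term with penalties on the center distance and on the width and height gaps between predicted and ground-truth boxes, and the focal variant re-weights each box by its IoU. A minimal PyTorch sketch of this loss, assuming (x1, y1, x2, y2) box tensors and an illustrative gamma of 0.5 (the paper's exact settings are not given in the abstract):

```python
import torch

def focal_eiou_loss(pred, target, gamma=0.5, eps=1e-7):
    """Focal EIoU loss sketch for axis-aligned (x1, y1, x2, y2) boxes.

    Box format and gamma are illustrative assumptions, not values
    taken from the paper.
    """
    # Intersection area
    x1 = torch.max(pred[:, 0], target[:, 0])
    y1 = torch.max(pred[:, 1], target[:, 1])
    x2 = torch.min(pred[:, 2], target[:, 2])
    y2 = torch.min(pred[:, 3], target[:, 3])
    inter = (x2 - x1).clamp(0) * (y2 - y1).clamp(0)

    # Union area and IoU
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    union = area_p + area_t - inter + eps
    iou = inter / union

    # Smallest enclosing box (for the normalizing terms)
    cw = torch.max(pred[:, 2], target[:, 2]) - torch.min(pred[:, 0], target[:, 0])
    ch = torch.max(pred[:, 3], target[:, 3]) - torch.min(pred[:, 1], target[:, 1])
    c2 = cw ** 2 + ch ** 2 + eps  # squared diagonal of enclosing box

    # Squared distance between box centers
    dx = (pred[:, 0] + pred[:, 2] - target[:, 0] - target[:, 2]) / 2
    dy = (pred[:, 1] + pred[:, 3] - target[:, 1] - target[:, 3]) / 2
    rho2 = dx ** 2 + dy ** 2

    # Width/height penalty terms of EIoU
    w_p, h_p = pred[:, 2] - pred[:, 0], pred[:, 3] - pred[:, 1]
    w_t, h_t = target[:, 2] - target[:, 0], target[:, 3] - target[:, 1]
    eiou = (1 - iou + rho2 / c2
            + (w_p - w_t) ** 2 / (cw ** 2 + eps)
            + (h_p - h_t) ** 2 / (ch ** 2 + eps))

    # Focal weighting: IoU**gamma down-weights low-quality matches
    return (iou.detach() ** gamma * eiou).mean()
```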
Affiliation(s)
- Yuzhang Liu
- Graduate School of Informatics, Nagoya University, Nagoya, Aichi, Japan
- Yuichiro Hayashi
- Graduate School of Informatics, Nagoya University, Nagoya, Aichi, Japan
- Masahiro Oda
- Graduate School of Informatics, Nagoya University, Nagoya, Aichi, Japan
- Information and Communications, Nagoya University, Nagoya, Aichi, Japan
- Takayuki Kitasaka
- Department of Information Science, Aichi Institute of Technology, Nagoya, Aichi, Japan
- Kensaku Mori
- Graduate School of Informatics, Nagoya University, Nagoya, Aichi, Japan
- Information and Communications, Nagoya University, Nagoya, Aichi, Japan
- Research Center of Medical Bigdata, National Institute of Informatics, Tokyo, Japan
2. Huang Y, Ding X, Zhao Y, Tian X, Feng G, Gao Z. Automatic detection and segmentation of chorda tympani under microscopic vision in otosclerosis patients via convolutional neural networks. Int J Med Robot 2023; 19:e2567. [PMID: 37634074] [DOI: 10.1002/rcs.2567]
Abstract
BACKGROUND Artificial intelligence (AI) techniques, especially deep learning (DL) techniques, have shown promising results for various computer vision tasks in surgery. However, AI-guided navigation during microscopic surgery for real-time surgical guidance and decision support is much more complex, and its efficacy has yet to be demonstrated. We propose a model dedicated to the evaluation of DL-based semantic segmentation of the chorda tympani (CT) during microscopic surgery. METHODS Various convolutional neural networks were constructed, trained, and validated for semantic segmentation of the CT. Our dataset comprises 5817 annotated images from 36 patients, randomly split into a training set (90%, 5236 images) and a validation set (10%, 581 images). In addition, 1500 raw images from 3 patients (500 images randomly selected per patient) were used to evaluate network performance. RESULTS On the validation set (581 images), the proposed CT detection networks performed well, and the modified U-net performed best (mIoU = 0.892, mPA = 0.9427). Moreover, when the U-net was applied to the test set (1500 raw images from 3 patients), the method also showed strong overall performance (accuracy = 0.976, precision = 0.996, sensitivity = 0.979, specificity = 0.902). CONCLUSIONS This study suggests that DL can be used for automated detection and segmentation of the CT in patients with otosclerosis during microscopic surgery with a high level of performance. It supports the feasibility of future vision-based surgical navigation assistance and autonomous surgery using AI.
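For reference, the two validation metrics quoted above can be computed directly from predicted and ground-truth label maps: mIoU averages per-class intersection over union, and mPA averages per-class pixel accuracy. A minimal NumPy sketch, assuming binary CT-versus-background labels (an assumption; the paper's evaluation code is not given):

```python
import numpy as np

def miou_mpa(pred, gt, n_classes=2):
    """Mean IoU and mean pixel accuracy from integer label maps.

    pred and gt are arrays of class indices of the same shape;
    binary CT-vs-background labeling is an assumption here.
    """
    ious, pas = [], []
    for c in range(n_classes):
        p, g = pred == c, gt == c
        inter = np.logical_and(p, g).sum()
        union = np.logical_or(p, g).sum()
        if union > 0:
            ious.append(inter / union)   # per-class IoU
        if g.sum() > 0:
            pas.append(inter / g.sum())  # per-class pixel accuracy
    return np.mean(ious), np.mean(pas)
```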
Affiliation(s)
- Yu Huang
- Department of Otorhinolaryngology Head and Neck Surgery, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Xin Ding
- Department of Otorhinolaryngology Head and Neck Surgery, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Yang Zhao
- Department of Otorhinolaryngology Head and Neck Surgery, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Xu Tian
- Department of Otorhinolaryngology Head and Neck Surgery, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Guodong Feng
- Department of Otorhinolaryngology Head and Neck Surgery, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Zhiqiang Gao
- Department of Otorhinolaryngology Head and Neck Surgery, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
3. Liu L, Yu B, Xu L, Wang S, Zhao L, Wu H. Comparison of stereopsis thresholds measured with conventional methods and a new eye tracking method. PLoS One 2023; 18:e0293735. [PMID: 37917615] [PMCID: PMC10621823] [DOI: 10.1371/journal.pone.0293735] [Open Access]
Abstract
PURPOSE Stereopsis is the ability to perceive depth using the slightly different views from the two eyes. This study aims to conduct innovative stereopsis tests using objective data output by eye-tracking technology. METHODS A laptop and an eye tracker were used to establish the test system, and anaglyphic glasses were employed for the stereopsis assessment. The test symbol was devised to emulate the quantitative measurement component of the Random Dot 3 Stereo Acuity Test. Sub-pixel technology was used to increase the disparity accuracy of the test pages. The tested disparities were 160″, 100″, 63″, 50″, 40″, 32″, 25″, 20″, 16″, and 12.5″, and the test was conducted at a distance of 0.65 m. Conventional and eye-tracking stereopsis assessments were conducted on 120 subjects. The Wilcoxon signed-rank test was used to test the difference between the two methods, and the Bland-Altman method was used to test their consistency. RESULTS The Wilcoxon signed-rank test showed no significant difference between conventional and eye-tracking thresholds of stereopsis (Z = -1.497, P = 0.134). There was a high level of agreement between the two methods in the Bland-Altman analysis (the 95% limits of agreement were -0.40 to 0.47 log arcsec). CONCLUSIONS Stereoacuity can be evaluated using an innovative stereopsis measurement system based on eye-tracking technology.
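For context, the Bland-Altman limits of agreement quoted above are the mean difference between paired measurements plus or minus 1.96 standard deviations of those differences. A minimal NumPy sketch, assuming paired log-arcsec thresholds from the two methods for the same subjects:

```python
import numpy as np

def bland_altman_loa(a, b):
    """95% limits of agreement between two paired measurement series.

    a and b are assumed to be log-arcsec stereoacuity thresholds from
    the conventional and eye-tracking tests for the same subjects.
    """
    a, b = np.asarray(a, float), np.asarray(b, float)
    diff = a - b
    bias = diff.mean()         # mean difference (bias)
    sd = diff.std(ddof=1)      # sample SD of the differences
    return bias - 1.96 * sd, bias + 1.96 * sd
```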
Affiliation(s)
- Lu Liu
- Department of Optometry, The Second Hospital of Jilin University, Changchun, China
- Bo Yu
- Department of Optometry, The Second Hospital of Jilin University, Changchun, China
- Lingxian Xu
- Department of Optometry, The Second Hospital of Jilin University, Changchun, China
- Shiyi Wang
- Department of Optometry, The Second Hospital of Jilin University, Changchun, China
- Lingzhi Zhao
- Department of Optometry, The Second Hospital of Jilin University, Changchun, China
- Huang Wu
- Department of Optometry, The Second Hospital of Jilin University, Changchun, China
4. Ping L, Wang Z, Yao J, Gao J, Yang S, Li J, Shi J, Wu W, Hua S, Wang H. Application and evaluation of surgical tool and tool tip recognition based on Convolutional Neural Network in multiple endoscopic surgical scenarios. Surg Endosc 2023; 37:7376-7384. [PMID: 37580576] [DOI: 10.1007/s00464-023-10323-3]
Abstract
BACKGROUND In recent years, computer-assisted intervention and robot-assisted surgery have received increasing attention, and there is a constant demand for real-time identification and tracking of surgical tools and tool tips. A series of studies on surgical tool tracking and identification have been performed; however, the dataset sizes, sensitivity/precision, and response times of these studies were limited. In this work, we developed an automated method based on a Convolutional Neural Network (CNN) and the You Only Look Once (YOLO) v3 algorithm to locate and identify surgical tools and tool tips across five different surgical scenarios. MATERIALS AND METHODS An object detection algorithm was applied to identify and locate the surgical tools and tool tips. DarkNet-19 was used as the backbone network, and YOLOv3 was modified and applied for detection. We included a series of 181 endoscopy videos covering 5 different surgical scenarios: pancreatic surgery, thyroid surgery, colon surgery, gastric surgery, and external scenes. A total of 25,333 images containing 94,463 targets were collected. Training and test sets were split in a 2.5:1 ratio, and the datasets were made openly available in the Kaggle database. RESULTS At an Intersection over Union threshold of 0.5, the overall sensitivity and precision of the model were 93.02% and 89.61% for tool recognition and 87.05% and 83.57% for tool tip recognition, respectively. The model demonstrated the highest tool and tool tip recognition sensitivity and precision in external scenes. Among the four internal surgical scenes, the network performed better in pancreatic and colon surgeries and worse in gastric and thyroid surgeries. CONCLUSION We developed a surgical tool and tool tip recognition model based on a CNN and YOLOv3. Validation of our model demonstrated satisfactory precision, accuracy, and robustness across different surgical scenes.
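For reference, sensitivity and precision at a fixed IoU threshold can be computed by matching predicted boxes to ground-truth boxes. A minimal Python sketch, assuming greedy one-to-one matching of single-class (x1, y1, x2, y2) boxes (the paper's exact matching protocol is not given in the abstract):

```python
def detection_metrics(preds, gts, iou_thr=0.5):
    """Sensitivity (recall) and precision for one image's detections.

    preds and gts are lists of (x1, y1, x2, y2) tuples of one class;
    greedy one-to-one matching at iou_thr is an assumption here.
    """
    def iou(a, b):
        x1, y1 = max(a[0], b[0]), max(a[1], b[1])
        x2, y2 = min(a[2], b[2]), min(a[3], b[3])
        inter = max(0, x2 - x1) * max(0, y2 - y1)
        area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
        return inter / (area(a) + area(b) - inter + 1e-9)

    matched, tp = set(), 0
    for p in preds:
        # Greedily match each prediction to the best unmatched ground truth
        best, best_iou = None, iou_thr
        for i, g in enumerate(gts):
            if i not in matched and iou(p, g) >= best_iou:
                best, best_iou = i, iou(p, g)
        if best is not None:
            matched.add(best)
            tp += 1
    precision = tp / max(len(preds), 1)
    sensitivity = tp / max(len(gts), 1)
    return sensitivity, precision
```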
Affiliation(s)
- Lu Ping
- 8-Year MD Program, Chinese Academy of Medical Science and Peking Union Medical College, Beijing, China
- Department of General Surgery, State Key Laboratory of Complex Severe and Rare Diseases, Peking Union Medical College Hospital, Chinese Academy of Medical Science and Peking Union Medical College, Beijing, China
- Zhihong Wang
- 8-Year MD Program, Chinese Academy of Medical Science and Peking Union Medical College, Beijing, China
- Jingjing Yao
- Department of Nursing, Peking Union Medical College Hospital, Chinese Academy of Medical Science and Peking Union Medical College, Beijing, China
- Junyi Gao
- Department of General Surgery, State Key Laboratory of Complex Severe and Rare Diseases, Peking Union Medical College Hospital, Chinese Academy of Medical Science and Peking Union Medical College, Beijing, China
- Sen Yang
- Department of General Surgery, State Key Laboratory of Complex Severe and Rare Diseases, Peking Union Medical College Hospital, Chinese Academy of Medical Science and Peking Union Medical College, Beijing, China
- Jiayi Li
- 8-Year MD Program, Chinese Academy of Medical Science and Peking Union Medical College, Beijing, China
- Jile Shi
- 8-Year MD Program, Chinese Academy of Medical Science and Peking Union Medical College, Beijing, China
- Wenming Wu
- Department of General Surgery, State Key Laboratory of Complex Severe and Rare Diseases, Peking Union Medical College Hospital, Chinese Academy of Medical Science and Peking Union Medical College, Beijing, China
- Surong Hua
- Department of General Surgery, State Key Laboratory of Complex Severe and Rare Diseases, Peking Union Medical College Hospital, Chinese Academy of Medical Science and Peking Union Medical College, Beijing, China
- Huizhen Wang
- Department of Nursing, Peking Union Medical College Hospital, Chinese Academy of Medical Science and Peking Union Medical College, Beijing, China
5. Chen Z, Marzullo A, Alberti D, Lievore E, Fontana M, De Cobelli O, Musi G, Ferrigno G, De Momi E. FRSR: Framework for real-time scene reconstruction in robot-assisted minimally invasive surgery. Comput Biol Med 2023; 163:107121. [PMID: 37311383] [DOI: 10.1016/j.compbiomed.2023.107121]
Abstract
3D reconstruction of intra-operative scenes provides the precise position information that underpins various safety-related applications in robot-assisted surgery, such as augmented reality. Herein, a framework that integrates into a known surgical system is proposed to enhance the safety of robotic surgery. In this paper, we present a scene reconstruction framework that restores the 3D information of the surgical site in real time. In particular, a lightweight encoder-decoder network is designed to perform disparity estimation, the key component of the scene reconstruction framework. The stereo endoscope of the da Vinci Research Kit (dVRK) is adopted to explore the feasibility of the proposed approach, and the framework's weak dependence on specific hardware makes migration to other Robot Operating System (ROS)-based robot platforms possible. The framework is evaluated in three different scenarios: a public dataset (3018 pairs of endoscopic images), scenes from the dVRK endoscope in our lab, and a self-made clinical dataset captured in an oncology hospital. Experimental results show that the proposed framework can reconstruct 3D surgical scenes in real time (25 FPS) with high accuracy (2.69 ± 1.48 mm MAE, 5.47 ± 1.34 mm RMSE, and 0.41 ± 0.23 SRE). This demonstrates that our framework can reconstruct intra-operative scenes with high reliability in both accuracy and speed, and the validation on clinical data also shows its potential in surgery. This work advances the state of the art in 3D intra-operative scene reconstruction on medical robot platforms. The clinical dataset has been released to promote the development of scene reconstruction in the medical imaging community.
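For context, disparity estimation yields depth through the standard pinhole stereo relation depth = f * B / d, where f is the focal length in pixels and B the stereo baseline. A minimal NumPy sketch; the calibration values in the usage comment are illustrative assumptions, not dVRK parameters:

```python
import numpy as np

def disparity_to_depth(disparity, focal_px, baseline_mm):
    """Convert a stereo disparity map (pixels) to metric depth (mm).

    focal_px and baseline_mm come from stereo calibration of the
    endoscope; values used here are assumptions for illustration.
    """
    d = np.asarray(disparity, float)
    depth = np.full_like(d, np.inf)
    valid = d > 0                        # zero disparity -> infinite depth
    depth[valid] = focal_px * baseline_mm / d[valid]
    return depth

# Illustrative use with assumed calibration values:
# depth_mm = disparity_to_depth(disp_map, focal_px=1100.0, baseline_mm=4.3)
```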
Affiliation(s)
- Ziyang Chen
- Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milan, 20133, Italy
- Aldo Marzullo
- Department of Mathematics and Computer Science, University of Calabria, Rende, 87036, Italy
- Davide Alberti
- Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milan, 20133, Italy
- Elena Lievore
- Department of Urology, European Institute of Oncology, IRCCS, Milan, 20141, Italy
- Matteo Fontana
- Department of Urology, European Institute of Oncology, IRCCS, Milan, 20141, Italy
- Ottavio De Cobelli
- Department of Urology, European Institute of Oncology, IRCCS, Milan, 20141, Italy; Department of Oncology and Onco-haematology, Faculty of Medicine and Surgery, University of Milan, Milan, 20122, Italy
- Gennaro Musi
- Department of Urology, European Institute of Oncology, IRCCS, Milan, 20141, Italy; Department of Oncology and Onco-haematology, Faculty of Medicine and Surgery, University of Milan, Milan, 20122, Italy
- Giancarlo Ferrigno
- Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milan, 20133, Italy
- Elena De Momi
- Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milan, 20133, Italy; Department of Urology, European Institute of Oncology, IRCCS, Milan, 20141, Italy
6. Titov O, Bykanov A, Pitskhelauri D. Neurosurgical skills analysis by machine learning models: systematic review. Neurosurg Rev 2023; 46:121. [PMID: 37191734] [DOI: 10.1007/s10143-023-02028-x]
Abstract
Machine learning (ML) models are actively used in modern medicine, including neurosurgery. This study aimed to summarize the current applications of ML in the analysis and assessment of neurosurgical skills. We conducted this systematic review in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. We searched the PubMed and Google Scholar databases for eligible studies published until November 15, 2022, and used the Medical Education Research Study Quality Instrument (MERSQI) to assess the quality of the included articles. Of the 261 studies identified, we included 17 in the final analysis. Studies most commonly related to oncological, spinal, and vascular neurosurgery using microsurgical and endoscopic techniques. Tasks evaluated by machine learning included subpial brain tumor resection, anterior cervical discectomy and fusion, hemostasis of a lacerated internal carotid artery, brain vessel dissection and suturing, glove microsuturing, lumbar hemilaminectomy, and bone drilling. Data sources included files extracted from VR simulators as well as microscopic and endoscopic videos. ML was applied to classify participants into expertise levels, analyze differences between experts and novices, recognize surgical instruments, divide operations into phases, and predict blood loss. In two articles, ML models were compared with human experts, and the machines outperformed the humans in all tasks. The most popular algorithms for classifying surgeons by skill level were the support vector machine and k-nearest neighbors, with accuracies exceeding 90%. The "you only look once" detector and RetinaNet usually solved the problem of detecting surgical instruments, with accuracies of approximately 70%. Experts were distinguished by more confident contact with tissues, greater bimanuality, a smaller distance between instrument tips, and a relaxed, focused state of mind. The average MERSQI score was 13.9 (out of 18). There is growing interest in the use of ML in neurosurgical training. Most studies have focused on the evaluation of microsurgical skills in oncological neurosurgery and on the use of virtual simulators; however, other subspecialties, skills, and simulators are being investigated. Machine learning models effectively solve different neurosurgical tasks related to skill classification, object detection, and outcome prediction, and properly trained ML models can outperform humans. Further research on ML applications in neurosurgery is needed.
Affiliation(s)
- Oleg Titov
- Burdenko Neurosurgery Center, Moscow, Russia
- OPEN BRAIN, Laboratory of Neurosurgical Innovations, Moscow, Russia
7. Liu S, Wang A, Deng X, Yang C. MGNN: A multiscale grouped convolutional neural network for efficient atrial fibrillation detection. Comput Biol Med 2022; 148:105863. [PMID: 35849950] [DOI: 10.1016/j.compbiomed.2022.105863]
Abstract
The reliable detection of atrial fibrillation (AF) is of great significance for monitoring disease progression and developing tailored care paths. In this work, we propose a novel and robust deep learning method for the accurate detection of AF. Using RR interval sequences, a multiscale grouped convolutional neural network (MGNN) combined with self-attention was designed for automatic feature extraction and AF versus non-AF classification. An average accuracy of 97.07% was obtained in 5-fold cross-validation. The generalization ability of the proposed MGNN was further tested independently on four unseen datasets, yielding accuracies of 92.23%, 96.86%, 94.23%, and 95.91%. Moreover, comparison of network structures indicated that the MGNN offers not only better detection performance but also lower computational complexity. In conclusion, the proposed model is an efficient AF detector with great potential for clinical auxiliary diagnosis and long-term home monitoring based on wearable devices.
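For context, a multiscale grouped convolution runs parallel branches with different kernel sizes, each using grouped convolutions to cut parameters and FLOPs relative to dense convolutions. A minimal PyTorch sketch of such a block; the kernel sizes, channel counts, and group number are illustrative assumptions, not the paper's architecture:

```python
import torch
import torch.nn as nn

class MultiscaleGroupedBlock(nn.Module):
    """Multiscale grouped 1-D convolution block for RR-interval sequences.

    A sketch of the idea named in the abstract (parallel kernel sizes
    plus grouped convolutions); all hyperparameters are assumptions.
    """

    def __init__(self, in_ch=16, out_ch=48, kernel_sizes=(3, 5, 7), groups=4):
        super().__init__()
        branch_ch = out_ch // len(kernel_sizes)
        self.branches = nn.ModuleList([
            nn.Sequential(
                # Grouped conv: each group sees only in_ch // groups channels,
                # reducing parameters and FLOPs versus a dense convolution
                nn.Conv1d(in_ch, branch_ch, k, padding=k // 2, groups=groups),
                nn.BatchNorm1d(branch_ch),
                nn.ReLU(inplace=True),
            )
            for k in kernel_sizes
        ])

    def forward(self, x):  # x: (batch, in_ch, seq_len)
        # Concatenate the parallel scales along the channel axis
        return torch.cat([b(x) for b in self.branches], dim=1)

# Illustrative use: a batch of 8 sequences of 64 RR intervals,
# assumed already embedded to 16 channels elsewhere in the network:
# y = MultiscaleGroupedBlock()(torch.randn(8, 16, 64))  # -> (8, 48, 64)
```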
Affiliation(s)
- Sen Liu
- Center for Biomedical Engineering, School of Information Science and Technology, Fudan University, Shanghai, 200433, PR China
- Aiguo Wang
- Department of Cardiology, Xinghua City People's Hospital, Jiangsu, 225700, PR China
- Xintao Deng
- Department of Cardiology, Xinghua City People's Hospital, Jiangsu, 225700, PR China
- Cuiwei Yang
- Center for Biomedical Engineering, School of Information Science and Technology, Fudan University, Shanghai, 200433, PR China; Key Laboratory of Medical Imaging Computing and Computer Assisted Intervention of Shanghai, 200093, PR China