1. Yang Z, Dai J, Pan J. 3D reconstruction from endoscopy images: A survey. Comput Biol Med 2024; 175:108546. [PMID: 38704902 DOI: 10.1016/j.compbiomed.2024.108546] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2023] [Revised: 01/05/2024] [Accepted: 04/28/2024] [Indexed: 05/07/2024]
Abstract
Three-dimensional reconstruction from images acquired through endoscopes plays a vital role in a growing number of medical applications. Clinical endoscopes are commonly classified as monocular or binocular, and we review depth estimation methods according to the type of endoscope. Traditionally, depth estimation relies on feature matching between images and multi-view geometry, but these techniques face many problems in the endoscopic environment. With the development of deep learning, a growing number of learning-based works address challenges such as inconsistent illumination and sparse texture. We review over 170 papers published in the 10 years from 2013 to 2023 and summarize the commonly used public datasets and performance metrics. We also give a taxonomy of methods and analyze the advantages and drawbacks of the algorithms. Summary tables and a result atlas are provided to facilitate qualitative and quantitative comparison of the methods in each category. In addition, we summarize commonly used scene representation methods in endoscopy and speculate on the prospects of depth estimation research in medical applications. We also compare the robustness, processing time, and scene representation of the methods to help doctors and researchers select appropriate methods for their surgical applications.
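As a small illustration of the multi-view geometry mentioned in the abstract, the sketch below applies the standard rectified-stereo relation Z = f · B / d, which is the textbook basis of depth estimation for a binocular endoscope. It is not taken from the survey; the function name and the numeric values are hypothetical and chosen only for illustration.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_mm):
    """Depth (in mm) of a point seen by a rectified stereo pair.

    Uses Z = f * B / d, where f is the focal length in pixels, B is the
    distance between the two camera centres in mm, and d is the horizontal
    disparity of the matched feature in pixels.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_length_px * baseline_mm / disparity_px


# Hypothetical values: 500 px focal length, 4 mm baseline, 20 px disparity.
print(depth_from_disparity(20.0, 500.0, 4.0))  # 100.0 (mm)
```

The relation also shows why sparse texture is a problem: without a reliable feature match there is no disparity, and therefore no depth, for that pixel.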
Affiliation(s)
- Zhuoyue Yang
  - State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, 37 Xueyuan Road, Haidian District, Beijing, 100191, China; Peng Cheng Lab, 2 Xingke 1st Street, Nanshan District, Shenzhen, Guangdong Province, 518000, China
- Ju Dai
  - Peng Cheng Lab, 2 Xingke 1st Street, Nanshan District, Shenzhen, Guangdong Province, 518000, China
- Junjun Pan
  - State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, 37 Xueyuan Road, Haidian District, Beijing, 100191, China; Peng Cheng Lab, 2 Xingke 1st Street, Nanshan District, Shenzhen, Guangdong Province, 518000, China

2. Schmidt A, Mohareri O, DiMaio S, Yip MC, Salcudean SE. Tracking and mapping in medical computer vision: A review. Med Image Anal 2024; 94:103131. [PMID: 38442528 DOI: 10.1016/j.media.2024.103131] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/16/2023] [Revised: 02/08/2024] [Accepted: 02/29/2024] [Indexed: 03/07/2024]
Abstract
As computer vision algorithms increase in capability, their applications in clinical systems will become more pervasive. These applications include diagnostics, such as colonoscopy and bronchoscopy; guiding biopsies, minimally invasive interventions, and surgery; automating instrument motion; and providing image guidance using pre-operative scans. Many of these applications depend on the specific visual nature of medical scenes and require designing algorithms to perform in this environment. In this review, we provide an update to the field of camera-based tracking and scene mapping in surgery and diagnostics in medical computer vision. We begin by describing our review process, which results in a final list of 515 papers that we cover. We then give a high-level summary of the state of the art and provide relevant background for those who need tracking and mapping for their clinical applications. Next, we review datasets provided in the field and the clinical needs that motivate their design. Then, we delve into the algorithmic side and summarize recent developments. This summary should be especially useful for algorithm designers and for those looking to understand the capability of off-the-shelf methods. We maintain focus on algorithms for deformable environments while also reviewing the essential building blocks in rigid tracking and mapping, since there is a large amount of crossover in methods. With the field summarized, we discuss the current state of the tracking and mapping methods along with needs for future algorithms, needs for quantification, and the viability of clinical applications. We then provide some research directions and questions. We conclude that new methods need to be designed or combined to support clinical applications in deformable environments, and that more focus needs to be put into collecting datasets for training and evaluation.
Affiliation(s)
- Adam Schmidt
  - Department of Electrical and Computer Engineering, University of British Columbia, 2329 West Mall, Vancouver V6T 1Z4, BC, Canada
- Omid Mohareri
  - Advanced Research, Intuitive Surgical, 1020 Kifer Rd, Sunnyvale, CA 94086, USA
- Simon DiMaio
  - Advanced Research, Intuitive Surgical, 1020 Kifer Rd, Sunnyvale, CA 94086, USA
- Michael C Yip
  - Department of Electrical and Computer Engineering, University of California San Diego, 9500 Gilman Dr, La Jolla, CA 92093, USA
- Septimiu E Salcudean
  - Department of Electrical and Computer Engineering, University of British Columbia, 2329 West Mall, Vancouver V6T 1Z4, BC, Canada

3. Li W, Fan J, Li Y, Hao P, Lin Y, Fu T, Ai D, Song H, Yang J. Endoscopy image enhancement method by generalized imaging defect models based adversarial training. Phys Med Biol 2022; 67. [DOI: 10.1088/1361-6560/ac6724] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/01/2022] [Accepted: 04/13/2022] [Indexed: 11/12/2022]
Abstract
Objective. Smoke, uneven lighting, and color deviation are common issues in endoscopic surgery; they increase surgical risk and can even lead to failure. Approach. In this study, we present a new physics-model-driven semi-supervised learning framework for high-quality pixel-wise endoscopic image enhancement, which generalizes to smoke removal, light adjustment, and color correction. To improve the authenticity of the generated images, and thereby the network performance, we integrated specific physical imaging defect models with the CycleGAN framework. No paired ground-truth data are required. In addition, we propose a transfer learning framework to address data scarcity in several endoscope enhancement tasks and improve the network performance. Main results. Qualitative and quantitative studies reveal that the proposed network outperforms state-of-the-art image enhancement methods. In particular, the proposed method performs much better than the original CycleGAN: in the smoke removal task, structural similarity improved from 0.7925 to 0.8648, feature similarity for color images from 0.8917 to 0.9283, and quaternion structural similarity from 0.8097 to 0.8800. Experimental results of the proposed transfer learning method also reveal its superior performance when trained with small datasets of target tasks. Significance. Experimental results on endoscopic images demonstrate the effectiveness of the proposed network in smoke removal, light adjustment, and color correction, indicating its clinical usefulness.

4. Liu S, Fan J, Ai D, Song H, Fu T, Wang Y, Yang J. Feature matching for texture-less endoscopy images via superpixel vector field consistency. Biomed Opt Express 2022; 13:2247-2265. [PMID: 35519251 PMCID: PMC9045917 DOI: 10.1364/boe.450259] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/02/2021] [Revised: 01/05/2022] [Accepted: 01/23/2022] [Indexed: 06/14/2023]
Abstract
Feature matching is an important technique for obtaining the surface morphology of soft tissues from intraoperative endoscopy images. Extracting features from clinical endoscopy images is difficult, especially for texture-less images, and the loss of surface detail makes the problem more challenging. We proposed an adaptive gradient-preserving method to improve the visual features of texture-less images. For feature matching, we first constructed a spatial motion field from superpixel blocks and estimated its information entropy, combined with a motion consistency algorithm, to perform the initial screening of outlier features. Second, we extended the superpixel spatial motion field to a vector field and constrained it with the vector feature to optimize the confidence of the initial matching set. Evaluations were carried out on public and undisclosed datasets. With our method, the number of feature points extracted by three feature point extraction methods increased by an order of magnitude relative to the original images. On the public dataset, the accuracy and F1-score increased to 92.6% and 91.5%, and the matching score improved by 1.92%. On the undisclosed dataset, the reconstructed surface integrity improved from 30% to 85%. Furthermore, we presented surface reconstruction results for images of different sizes to validate the robustness of our method, which showed high-quality feature matching results. Overall, the experimental results demonstrated the effectiveness of the proposed matching method and its capability to extract sufficient visual feature points and generate reliable feature matches for 3D reconstruction and clinical applications.
Affiliation(s)
- Shiyuan Liu
  - Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Photonics, Beijing Institute of Technology, Beijing, 100081, China
- Jingfan Fan
  - Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Photonics, Beijing Institute of Technology, Beijing, 100081, China
- Danni Ai
  - Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Photonics, Beijing Institute of Technology, Beijing, 100081, China
- Hong Song
  - Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Photonics, Beijing Institute of Technology, Beijing, 100081, China
  - School of Computer Science and Technology, Beijing Institute of Technology, Beijing, 100081, China
- Tianyu Fu
  - Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Photonics, Beijing Institute of Technology, Beijing, 100081, China
- Yongtian Wang
  - Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Photonics, Beijing Institute of Technology, Beijing, 100081, China
- Jian Yang
  - Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Photonics, Beijing Institute of Technology, Beijing, 100081, China

5. Augmented reality navigation with real-time tracking for facial repair surgery. Int J Comput Assist Radiol Surg 2022; 17:981-991. [PMID: 35286586 DOI: 10.1007/s11548-022-02589-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/23/2021] [Accepted: 02/26/2022] [Indexed: 11/05/2022]
Abstract
PURPOSE Facial repair surgeries (FRS) require accurate navigation to traverse critical anatomy safely and quickly. The purpose of this paper is to develop a method that directly tracks the position of the patient using video data acquired from a single camera, achieving noninvasive, real-time, and highly accurate positioning in FRS. METHODS Our method first performs camera calibration and registers the surface segmented from computed tomography to the patient. Then, a two-step constraint algorithm, which includes the feature local constraint and the distance standard deviation constraint, is used to find the optimal feature matching pairs quickly. Finally, the movements of the camera and the patient, decomposed from the image motion matrix, are used to track the camera and the patient, respectively. RESULTS The proposed method achieved fusion error RMS of 1.44 ± 0.35, 1.50 ± 0.15, and 1.63 ± 0.03 mm in skull phantom, cadaver mandible, and human experiments, respectively. These errors were lower than those of the optical tracking system-based method. Additionally, the proposed method could process video streams at up to 24 frames per second, which meets the real-time requirements of FRS. CONCLUSIONS The proposed method does not rely on tracking markers attached to the patient; it can be executed automatically to maintain the correct augmented reality scene and overcome the decrease in positioning accuracy caused by patient movement during surgery.
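The abstract does not spell out how the distance standard deviation constraint works. The sketch below shows one plausible reading, stated purely as an assumption and not as the authors' algorithm: matches whose displacement length deviates from the mean displacement by more than `k` standard deviations are rejected. The function name, the threshold `k`, and the sample coordinates are hypothetical.

```python
import math
import statistics


def filter_matches_by_distance_std(matches, k=1.0):
    """Keep matches whose displacement length is within k std of the mean.

    `matches` is a list of ((x1, y1), (x2, y2)) point pairs. This is an
    assumed interpretation of a "distance standard deviation constraint",
    not the method described in the paper.
    """
    if len(matches) < 2:
        return list(matches)
    # Euclidean length of each match's displacement between the two images.
    lengths = [math.dist(p, q) for p, q in matches]
    mean = statistics.fmean(lengths)
    std = statistics.pstdev(lengths)
    return [m for m, d in zip(matches, lengths) if abs(d - mean) <= k * std]


# Three consistent matches (length 1) and one outlier (length 8).
sample = [((0, 0), (1, 0)), ((5, 5), (6, 5)), ((2, 2), (3, 2)), ((1, 1), (9, 1))]
kept = filter_matches_by_distance_std(sample, k=1.0)
print(len(kept))  # 3
```

A filter of this kind is cheap, since it needs a single pass over the match list, which fits the real-time requirement the abstract emphasizes.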