1
Zhang Z, Song H, Fan J, Fu T, Li Q, Ai D, Xiao D, Yang J. Dual-correlate optimized coarse-fine strategy for monocular laparoscopic videos feature matching via multilevel sequential coupling feature descriptor. Comput Biol Med 2024; 169:107890. [PMID: 38168646 DOI: 10.1016/j.compbiomed.2023.107890] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/12/2023] [Revised: 12/13/2023] [Accepted: 12/18/2023] [Indexed: 01/05/2024]
Abstract
Feature matching of monocular laparoscopic videos is crucial for visualization enhancement in computer-assisted surgery, and the keys to conducting high-quality matches are accurate homography estimation, relative pose estimation, as well as sufficient matches and fast calculation. However, limited by various monocular laparoscopic imaging characteristics such as highlight noise, motion blur, texture interference and illumination variation, most existing feature matching methods struggle to produce high-quality matches efficiently and in sufficient number. To overcome these limitations, this paper presents a novel sequential coupling feature descriptor to extract and express multilevel feature maps efficiently, and a dual-correlate optimized coarse-fine strategy to establish dense matches at the coarse level and adjust pixel-wise matches at the fine level. Firstly, a novel sequential coupling swin transformer layer is designed in the feature descriptor to learn and extract rich multilevel feature representations without increasing complexity. Then, a dual-correlate optimized coarse-fine strategy is proposed to match coarse feature sequences at low resolution, and the correlated fine feature sequences are optimized to refine pixel-wise matches based on coarse matching priors. Finally, the sequential coupling feature descriptor and dual-correlate optimization are merged into the Sequential Coupling Dual-Correlate Network (SeCo DC-Net) to produce high-quality matches. The evaluation is conducted on two public laparoscopic datasets, Scared and EndoSLAM, and the experimental results show that the proposed network outperforms state-of-the-art methods in homography estimation, relative pose estimation, reprojection error, number of matching pairs and inference runtime. The source code is publicly available at https://github.com/Iheckzza/FeatureMatching.
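The coarse matching stage described in this abstract can be illustrated with a generic mutual-nearest-neighbour check between two sets of coarse descriptors. This is only a minimal sketch under stated assumptions: the function names, the dot-product similarity, and the toy descriptors are hypothetical and are not taken from SeCo DC-Net or its repository.

```python
def dot(u, v):
    """Dot-product similarity between two descriptors of equal length."""
    return sum(a * b for a, b in zip(u, v))


def mutual_nearest_neighbors(desc_a, desc_b):
    """Return (i, j) pairs where desc_a[i] and desc_b[j] pick each other as best match.

    Hypothetical helper: a common way to obtain coarse match candidates,
    not the network's actual matching layer.
    """
    if not desc_a or not desc_b:
        return []
    # Best candidate in B for every descriptor in A, and vice versa.
    best_ab = [max(range(len(desc_b)), key=lambda j: dot(a, desc_b[j])) for a in desc_a]
    best_ba = [max(range(len(desc_a)), key=lambda i: dot(desc_a[i], b)) for b in desc_b]
    # Keep only pairs that agree in both directions.
    return [(i, j) for i, j in enumerate(best_ab) if best_ba[j] == i]


# Toy coarse descriptors (assumed values, for illustration only).
coarse_a = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
coarse_b = [[0.0, 1.0], [1.0, 0.0]]
coarse_matches = mutual_nearest_neighbors(coarse_a, coarse_b)
```

In a coarse-fine scheme, pairs such as these would then seed a local refinement at higher resolution; that refinement step is not shown here.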
Affiliation(s)
- Ziang Zhang
  - The School of Medical Technology, Beijing Institute of Technology, Beijing, 100081, China
- Hong Song
  - The School of Computer Science & Technology, Beijing Institute of Technology, Beijing, 100081, China
- Jingfan Fan
  - The School of Optics and Photonics, Beijing Institute of Technology, Beijing, 100081, China
- Tianyu Fu
  - The School of Medical Technology, Beijing Institute of Technology, Beijing, 100081, China
- Qiang Li
  - The School of Computer Science & Technology, Beijing Institute of Technology, Beijing, 100081, China
- Danni Ai
  - The School of Optics and Photonics, Beijing Institute of Technology, Beijing, 100081, China
- Deqiang Xiao
  - The School of Optics and Photonics, Beijing Institute of Technology, Beijing, 100081, China
- Jian Yang
  - The School of Optics and Photonics, Beijing Institute of Technology, Beijing, 100081, China
2
Zhang Y, Li H, Zhang W, Xiao C. Dense stereo fish-eye images using a modified hemispherical ASW algorithm. Journal of the Optical Society of America. A, Optics, Image Science, and Vision 2021; 38:476-487. [PMID: 33798176 DOI: 10.1364/josaa.413120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/21/2020] [Accepted: 02/06/2021] [Indexed: 06/12/2023]
Abstract
In this paper, we concentrate on dense estimation of disparities between fish-eye images without corrections. Because of the distortions, fish-eye images cannot be processed directly utilizing the classical adaptive support weight (ASW) method for perspective images. To address this problem, we propose a modified hemispherical ASW method in a hemispherical framework. First, 3D epipolar curves are calculated directly on a hemispherical model to deal with the problem that 2D epipolar curves cannot cover the whole image disc. Then, a modified ASW method with hemispherical support window and hemispherical geodesic distance is presented. Moreover, a three-dimensional epipolar distance transform (3DEDT) is proposed and fused into the matching cost to cope with the textureless region problem. The benefit of this approach is demonstrated by realizing the dense stereo matching for fish-eye images using a public fish-eye data set, for which both objectively evaluated as well as visually convincing results are provided.
3
Li C, Zhou Y, Li Y, Yang S. A coarse-to-fine registration method for three-dimensional MR images. Med Biol Eng Comput 2021; 59:457-469. [PMID: 33515131 DOI: 10.1007/s11517-021-02317-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/01/2020] [Accepted: 01/15/2021] [Indexed: 10/22/2022]
Abstract
Three-dimensional (3D) multimodal magnetic resonance (MR) image registration aims to spatially align corresponding structures in different MR images. Such a technology is useful in auxiliary disease diagnosis and surgical treatment. However, inconsistent intensity correspondence and large initial displacement make multimodal MR volumes difficult to register. A coarse-to-fine method is proposed in this study for pairwise 3D MR image rigid registration. Firstly, the proposed method extracts image feature points to form unregistered point sets and performs coarse registration based on point set registration to reduce the initial displacements of offset images effectively. Then, this method calculates a grey histogram based on voxels in the adaptive region of interest and further improves registration accuracy by maximizing the mutual information of the coarse-registered images. Some representative registration methods are compared on the basis of three MR image datasets to evaluate the performance of the proposed method. Experimental results show that the proposed method achieves a higher registration success rate and accuracy than conventional registration methods, especially when initial displacements are large.
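The fine stage above maximizes mutual information computed from a grey-level histogram. As a minimal sketch of that similarity measure (not the authors' implementation), the snippet below estimates mutual information from a joint histogram of two equally sized intensity sequences. The function name, the bin count, and the 0 to 255 intensity range are assumptions made for illustration.

```python
import math
from collections import Counter


def mutual_information(a, b, bins=8, max_val=256):
    """Mutual information (in nats) of two equally sized intensity sequences.

    Intensities are assumed to lie in [0, max_val) and are quantized into
    `bins` equal-width bins before the joint histogram is built.
    """
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    width = max_val / bins
    qa = [min(int(v / width), bins - 1) for v in a]
    qb = [min(int(v / width), bins - 1) for v in b]
    n = len(qa)
    joint = Counter(zip(qa, qb))  # joint histogram
    pa, pb = Counter(qa), Counter(qb)  # marginal histograms
    mi = 0.0
    for (i, j), count in joint.items():
        p_ij = count / n
        # p_ij / (p_i * p_j) simplifies to count * n / (pa[i] * pb[j])
        mi += p_ij * math.log(count * n / (pa[i] * pb[j]))
    return mi


# Identical signals share all their information; independent ones share none.
identical = mutual_information([0, 255] * 4, [0, 255] * 4)
independent = mutual_information([0, 0, 255, 255], [0, 255, 0, 255])
```

A registration loop would evaluate this measure for each candidate rigid transform and keep the transform with the highest value; the optimizer itself is outside the scope of this sketch.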
Affiliation(s)
- Cuixia Li
  - School of Software Academy, Zhengzhou University, Zhengzhou, 450000, China
- Yuanyuan Zhou
  - School of Software Academy, Zhengzhou University, Zhengzhou, 450000, China
- Yinghao Li
  - School of Software Academy, Zhengzhou University, Zhengzhou, 450000, China
- Shanshan Yang
  - School of Software Academy, Zhengzhou University, Zhengzhou, 450000, China
4
Zhang Y, Zhang H, Zhang W. Feature matching based on curve descriptor and local D-Nets for fish-eye images. Journal of the Optical Society of America. A, Optics, Image Science, and Vision 2020; 37:787-796. [PMID: 32400712 DOI: 10.1364/josaa.385921] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2019] [Accepted: 03/30/2020] [Indexed: 06/11/2023]
Abstract
Most feature-matching algorithms based on perspective images, such as scale-invariant feature transform (SIFT), speeded up robust features, or DAISY, construct their feature descriptors from the neighborhood information of feature points. Large nonlinear distortion results in different amounts of neighborhood information at different feature points within fish-eye images, especially when a feature pixel is in the central region and the corresponding feature pixel is at the periphery. In contrast, descriptor-Nets (D-Nets) is a feature-matching algorithm based on global information. It is more robust, but it is time-consuming. In this paper, we employ the SIFT detector to extract feature pixels, and then we propose a novel feature-matching strategy based on the D-Nets algorithm. We modify the linear descriptors in the traditional D-Nets algorithm and propose a curve descriptor based on the hemispheric model of a fish-eye image. In the traditional D-Nets algorithm, each feature point is described by all other pixels of the entire image, and complicated calculations cause slow matching speed. To solve this problem, we convert the traditional global D-Nets into a novel local D-Nets. In the experiment, we obtain image pairs from real scenery using the binocular fish-eye camera platform. Experimental results show that the proposed local D-Nets method can achieve more than 3 times the initial matching pixels, and the percentage of bad matching is reduced by 40% compared with the best performing method among the comparison methods. In addition, the matching pixel pairs obtained by the proposed method are evenly distributed, both in the center region with small distortion and in the peripheral region with large distortion. Meanwhile, the running time of the local D-Nets algorithm is 16 times less than that of the global D-Nets algorithm.
5
AI Radar Sensor: Creating Radar Depth Sounder Images Based on Generative Adversarial Network. Sensors 2019; 19:s19245479. [PMID: 31842359 PMCID: PMC6960960 DOI: 10.3390/s19245479] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 11/04/2019] [Revised: 12/04/2019] [Accepted: 12/06/2019] [Indexed: 11/17/2022]
Abstract
Significant resources have been spent in collecting and storing large and heterogeneous radar datasets during expensive Arctic and Antarctic fieldwork. The vast majority of the available data is unlabeled, and the labeling process is both time-consuming and expensive. One possible alternative to the labeling process is the use of data generated synthetically with artificial intelligence. Instead of labeling real images, we can generate synthetic data based on arbitrary labels. In this way, training data can be quickly augmented with additional images. In this research, we evaluated the performance of synthetically generated radar images based on modified cycle-consistent adversarial networks. We conducted several experiments to test the quality of the generated radar imagery. We also tested the quality of a state-of-the-art contour detection algorithm on synthetic data and different combinations of real and synthetic data. Our experiments show that synthetic radar images generated by a generative adversarial network (GAN) can be used in combination with real images for data augmentation and training of deep neural networks. However, the synthetic images generated by GANs cannot be used on their own for training a neural network (training on synthetic and testing on real) as they cannot simulate all of the radar characteristics such as noise or Doppler effects. To the best of our knowledge, this is the first work on creating radar sounder imagery based on a generative adversarial network.
6
Deep Recurrent Neural Network and Data Filtering for Rumor Detection on Sina Weibo. Symmetry (Basel) 2019. [DOI: 10.3390/sym11111408] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 11/16/2022]
Abstract
Social media makes it easy for individuals to publish and consume news, but it also facilitates the spread of rumors. This paper proposes a novel deep recurrent neural model with a symmetrical network architecture for automatic rumor detection in social media such as Sina Weibo, which shows better performance than the existing methods. In the data preparation phase, we filter the posts according to the number of followers of the user. We then use sequential encoding for the posts and multiple embedding layers to get better feature representation, and multiple recurrent neural network layers to capture the dynamic temporal characteristics of the signal. The experimental results on the Sina Weibo dataset show that: 1. the sequential encoding performs better than the term frequency-inverse document frequency (TF-IDF) or the doc2vec encoding scheme; 2. the model is more accurate when trained on the posts from the users with more followers; and 3. the model achieves higher detection accuracy than existing works, including for early detection.
7
An Efficient Image Reconstruction Framework Using Total Variation Regularization with Lp-Quasinorm and Group Gradient Sparsity. Information 2019. [DOI: 10.3390/info10030115] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/16/2022]
Abstract
Total variation (TV) regularization-based methods are proven to be effective in removing random noise. However, their solutions usually exhibit staircase effects. This paper proposes a new image reconstruction method based on TV regularization with Lp-quasinorm and group gradient sparsity. In this method, the regularization term of the group gradient sparsity can retrieve the neighborhood information of an image gradient, and the Lp-quasinorm constraint can characterize the sparsity of the image gradient. The method can effectively deblur images and remove impulse noise while preserving image edge information and reducing the staircase effect. To improve the image recovery efficiency, a Fast Fourier Transform (FFT) is introduced to effectively avoid large matrix multiplication operations. Moreover, by introducing the accelerated alternating direction method of multipliers (ADMM) to allow for a fast restart of the optimization process, this method can run faster. In numerical experiments on standard test images sourced from Emory University and the CVG-UGR (Computer Vision Group, University of Granada) image database, the advantage of the new method is verified by comparing it with existing advanced TV-based methods in terms of peak signal-to-noise ratio (PSNR), structural similarity (SSIM), and operational time.
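Two quantities named in this abstract have standard definitions that are easy to state in code: the total variation of an image and the PSNR used for evaluation. The sketch below computes the plain anisotropic TV (not the paper's Lp-quasinorm or group-sparsity variant) and the usual PSNR. The function names and the nested-list image representation are assumptions made for illustration.

```python
import math


def total_variation(img):
    """Anisotropic TV: sum of absolute horizontal and vertical differences.

    `img` is assumed to be a non-empty rectangular list of rows.
    """
    rows, cols = len(img), len(img[0])
    tv = 0.0
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:
                tv += abs(img[r][c + 1] - img[r][c])
            if r + 1 < rows:
                tv += abs(img[r + 1][c] - img[r][c])
    return tv


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(peak^2 / MSE)."""
    n = len(reference) * len(reference[0])
    mse = sum(
        (a - b) ** 2
        for row_ref, row_est in zip(reference, estimate)
        for a, b in zip(row_ref, row_est)
    ) / n
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


clean = [[0, 0], [0, 0]]
noisy = [[0, 0], [0, 255]]  # a single impulse-noise pixel
tv_noisy = total_variation(noisy)
psnr_noisy = psnr(clean, noisy)
```

A single impulse pixel raises the TV sharply, which is why a TV penalty suppresses such noise; it also explains the staircase tendency, since piecewise-constant images have low TV.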
8
Kahaki SMM, Arshad H, Nordin MJ, Ismail W. Geometric feature descriptor and dissimilarity-based registration of remotely sensed imagery. PLoS One 2018; 13:e0200676. [PMID: 30024921 PMCID: PMC6067644 DOI: 10.1371/journal.pone.0200676] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 09/13/2016] [Accepted: 06/19/2018] [Indexed: 12/04/2022]
Abstract
Image registration of remotely sensed imagery is challenging, as complex deformations are common. Different deformations, such as affine and homogenous transformation, combined with multimodal data capturing can emerge in the data acquisition process. These effects, when combined, tend to compromise the performance of the currently available registration methods. A new image transform, known as the geometric mean projection transform, is introduced in this work. As it is deformation invariant, it can be employed as a feature descriptor, whereby it analyzes the functions of all vertical and horizontal signals in local areas of the image. Moreover, an invariant feature correspondence method is proposed as a point matching algorithm, which incorporates the new descriptor's dissimilarity metric. Considering the image as a signal, the proposed approach utilizes a square Eigenvector correlation (SEC) based on the Eigenvector properties. In our experiments on standard test images sourced from the "Featurespace" and "IKONOS" datasets, the proposed method achieved higher average accuracy than other state-of-the-art image registration techniques. The accuracy of the proposed method was assessed using six standard evaluation metrics. Furthermore, statistical analyses, including the t-test and Friedman test, demonstrate that the method developed as a part of this study is superior to the existing methods.
Affiliation(s)
- Seyed M. M. Kahaki
  - Center for Artificial Intelligence Technology, Faculty of Information Science and Technology, National University of Malaysia (UKM), Bangi, Selangor, Malaysia
- Haslina Arshad
  - Center for Artificial Intelligence Technology, Faculty of Information Science and Technology, National University of Malaysia (UKM), Bangi, Selangor, Malaysia
- Md Jan Nordin
  - Center for Artificial Intelligence Technology, Faculty of Information Science and Technology, National University of Malaysia (UKM), Bangi, Selangor, Malaysia
- Waidah Ismail
  - Faculty of Science and Technology, Universiti Sains Islam Malaysia, Bandar Baru Nilai, Nilai, Negeri Sembilan, Malaysia
9
Lu Y, Gao K, Zhang T, Xu T. A novel image registration approach via combining local features and geometric invariants. PLoS One 2018; 13:e0190383. [PMID: 29293595 PMCID: PMC5749792 DOI: 10.1371/journal.pone.0190383] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 04/06/2017] [Accepted: 12/13/2017] [Indexed: 11/29/2022]
Abstract
Image registration is widely used in many fields, but the adaptability of the existing methods is limited. This work proposes a novel image registration method with high precision for various complex applications. In this framework, the registration problem is divided into two stages. First, we detect and describe scale-invariant feature points using a modified oriented FAST and rotated BRIEF (ORB) algorithm, and a simple method to increase the performance of feature point matching is proposed. Second, we develop a new local constraint of rough selection according to the feature distances. Evidence shows that the existing matching techniques based on image features are insufficient for images with sparse image details. Then, we propose a novel matching algorithm via geometric constraints, and establish local feature descriptions based on geometric invariances for the selected feature points. Subsequently, a new cost function is constructed to evaluate the similarities between points and obtain exact matching pairs. Finally, we employ the progressive sample consensus method to remove wrong matches and calculate the space transform parameters. Experimental results on various complex image datasets verify that the proposed method is more robust and significantly reduces the rate of false matches while retaining more high-quality feature points.
Affiliation(s)
- Yan Lu
  - Key Lab of Photoelectronic Imaging Technology and System, Ministry of Education of China, Beijing Institute of Technology, Beijing, China
- Kun Gao
  - Key Lab of Photoelectronic Imaging Technology and System, Ministry of Education of China, Beijing Institute of Technology, Beijing, China
- Tinghua Zhang
  - Key Lab of Photoelectronic Imaging Technology and System, Ministry of Education of China, Beijing Institute of Technology, Beijing, China
- Tingfa Xu
  - Key Lab of Photoelectronic Imaging Technology and System, Ministry of Education of China, Beijing Institute of Technology, Beijing, China
10
Su M, Ma Y, Zhang X, Wang Y, Zhang Y. MBR-SIFT: A mirror reflected invariant feature descriptor using a binary representation for image matching. PLoS One 2017; 12:e0178090. [PMID: 28542537 PMCID: PMC5436860 DOI: 10.1371/journal.pone.0178090] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 09/27/2016] [Accepted: 05/07/2017] [Indexed: 12/04/2022]
Abstract
The traditional scale invariant feature transform (SIFT) method can extract distinctive features for image matching. However, SIFT matching is extremely time-consuming because of the use of the Euclidean distance measure. Recently, many binary SIFT (BSIFT) methods have been developed to improve matching efficiency; however, none of them is invariant to mirror reflection. To address these problems, in this paper, we present a horizontal or vertical mirror reflection invariant binary descriptor named MBR-SIFT, in addition to a novel image matching approach. First, 16 cells in the local region around the SIFT keypoint are reorganized, and then the 128-dimensional vector of the SIFT descriptor is transformed into a reconstructed vector according to eight directions. Finally, the MBR-SIFT descriptor is obtained after binarization and reverse coding. To improve the matching speed and accuracy, a fast matching algorithm that includes a coarse-to-fine two-step matching strategy and two similarity measures for the MBR-SIFT descriptor is proposed. Experimental results on the UKBench dataset show that the proposed method not only solves the problem of mirror reflection, but also ensures desirable matching accuracy and speed.
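The speed advantage of binary descriptors comes from replacing Euclidean distance with a bitwise Hamming distance, and a two-step scheme can reject most candidates after comparing only a few bits. The sketch below illustrates that general idea only. The median-threshold binarization, the leading-bits prefilter, and all names and parameters are assumptions for illustration and do not reproduce the MBR-SIFT coding or its similarity measures.

```python
def binarize(vec):
    """Pack a real-valued descriptor into an int, one bit per component.

    Assumed rule: a bit is 1 when the component exceeds the vector median.
    """
    ordered = sorted(vec)
    mid = len(ordered) // 2
    median = ordered[mid] if len(ordered) % 2 else (ordered[mid - 1] + ordered[mid]) / 2
    bits = 0
    for value in vec:
        bits = (bits << 1) | (1 if value > median else 0)
    return bits


def hamming(x, y):
    """Number of differing bits between two packed descriptors."""
    return bin(x ^ y).count("1")


def two_step_match(query, candidates, n_bits, coarse_bits=8, coarse_max=2):
    """Coarse step: compare only the leading bits. Fine step: full Hamming distance.

    Returns (index, distance) of the best surviving candidate, or (None, None).
    """
    shift = max(n_bits - coarse_bits, 0)
    best_index, best_distance = None, None
    for index, candidate in enumerate(candidates):
        if hamming(query >> shift, candidate >> shift) > coarse_max:
            continue  # rejected cheaply by the coarse check
        distance = hamming(query, candidate)
        if best_distance is None or distance < best_distance:
            best_index, best_distance = index, distance
    return best_index, best_distance


query = 0b1111000011110000
database = [0b0000111100001111, 0b1111000011110001, 0b1111000011111111]
best = two_step_match(query, database, n_bits=16)
```

Here the first database entry is discarded by the coarse check alone, and the remaining two are ranked by full Hamming distance.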
Affiliation(s)
- Mingzhe Su
  - College of Information, Mechanical and Electrical Engineering, Shanghai Normal University, Shanghai, China
- Yan Ma
  - College of Information, Mechanical and Electrical Engineering, Shanghai Normal University, Shanghai, China
- Xiangfen Zhang
  - College of Information, Mechanical and Electrical Engineering, Shanghai Normal University, Shanghai, China
- Yan Wang
  - Mathématiques, Informatique, Télécommunications de Toulouse, Université Paul Sabatier, Toulouse, France
- Yuping Zhang
  - College of Information, Mechanical and Electrical Engineering, Shanghai Normal University, Shanghai, China