1. Zhang X, Ji X, Wang J, Fan Y, Tao C. Renal surface reconstruction and segmentation for image-guided surgical navigation of laparoscopic partial nephrectomy. Biomed Eng Lett 2023;13:165-174. PMID: 37124114; PMCID: PMC10130295; DOI: 10.1007/s13534-023-00263-1.
Abstract
An unpredictable, dynamic surgical environment makes it necessary to measure morphological information of target tissue in real time for laparoscopic image-guided navigation. Among intraoperative tissue 3D reconstruction approaches, stereo vision has the greatest potential for clinical development owing to its high reconstruction accuracy and compatibility with laparoscopy. However, existing stereo vision methods struggle to achieve high reconstruction accuracy in real time, and intraoperative reconstruction results often contain complex background and instrument information that hinders clinical development of image-guided systems. Taking laparoscopic partial nephrectomy (LPN) as the research object, this paper realizes real-time dense reconstruction and extraction of the kidney tissue surface. A center-symmetric Census-based semi-global block stereo matching algorithm is proposed to generate a dense disparity map, and a GPU-based pixel-by-pixel connectivity segmentation mechanism is designed to segment the renal tissue area. Experiments on an in-vitro porcine heart, an in-vivo porcine kidney, and offline clinical LPN data were performed to evaluate the accuracy and effectiveness of the approach. The algorithm achieved a reconstruction accuracy of ±2 mm at a real-time update rate of 21 fps for an HD image size of 960 × 540, and 91.0% target-tissue segmentation accuracy even with surgical instrument occlusions. The experimental results demonstrate that the proposed method can accurately reconstruct and extract the renal surface in real time during LPN, and the measurements can be used directly by image-guided systems. The method provides a new way to measure geometric information of target tissue intraoperatively in laparoscopic surgery. Supplementary information is available online at 10.1007/s13534-023-00263-1.
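As context for the entry above, the center-symmetric Census transform that its matching cost builds on can be sketched as follows. This is an illustrative reading of the general technique, not the authors' GPU implementation; the function names and window size are ours.

```python
import numpy as np

def cs_census(img, win=5):
    """Center-symmetric Census transform: compare each pixel pair that is
    point-symmetric about the window center, yielding one bit per pair."""
    r = win // 2
    h, w = img.shape
    pad = np.pad(img.astype(np.int32), r, mode="edge")
    # Only half the window offsets are needed; each is compared with its mirror.
    offsets = [(dy, dx) for dy in range(-r, r + 1)
               for dx in range(-r, r + 1)][: (win * win) // 2]
    bits = np.zeros((h, w), dtype=np.uint32)
    for dy, dx in offsets:
        a = pad[r + dy:r + dy + h, r + dx:r + dx + w]
        b = pad[r - dy:r - dy + h, r - dx:r - dx + w]  # mirrored sample
        bits = (bits << 1) | (a > b).astype(np.uint32)
    return bits

def hamming_cost(a_bits, b_bits):
    """Per-pixel matching cost: Hamming distance between Census code words."""
    x = a_bits ^ b_bits
    cost = np.zeros(x.shape, dtype=np.int32)
    while x.any():                       # popcount, one bit plane at a time
        cost += (x & 1).astype(np.int32)
        x = x >> 1
    return cost
```

In an SGM-style matcher, `hamming_cost` would be evaluated between the left code map and disparity-shifted right code maps to fill the cost volume before path aggregation.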
Affiliation(s)
- Xiaohui Zhang
- School of Engineering Medicine, Beijing Advanced Innovation Centre for Biomedical Engineering, Beihang University, No. 37 Xueyuan Road, Haidian District, Beijing 100083, China
- Xuquan Ji
- School of Biomedical Engineering, Beihang University, No. 37 Xueyuan Road, Haidian District, Beijing 100083, China
- Junchen Wang
- School of Mechanical Engineering and Automation, Beijing Advanced Innovation Centre for Biomedical Engineering, Beihang University, No. 37 Xueyuan Road, Haidian District, Beijing 100083, China
- Yubo Fan
- School of Engineering Medicine, Beijing Advanced Innovation Centre for Biomedical Engineering, Beihang University, No. 37 Xueyuan Road, Haidian District, Beijing 100083, China
- School of Biomedical Engineering, Beihang University, No. 37 Xueyuan Road, Haidian District, Beijing 100083, China
- Chunjing Tao
- School of Engineering Medicine, Beijing Advanced Innovation Centre for Biomedical Engineering, Beihang University, No. 37 Xueyuan Road, Haidian District, Beijing 100083, China
2. Diaz-Ramirez VH, Gonzalez-Ruiz M, Kober V, Juarez-Salazar R. Stereo Image Matching Using Adaptive Morphological Correlation. Sensors (Basel) 2022;22:9050. PMID: 36501752; PMCID: PMC9737403; DOI: 10.3390/s22239050.
Abstract
A stereo matching method based on adaptive morphological correlation is presented. The point correspondences of an input pair of stereo images are determined by matching locally adaptive image windows using the suggested morphological correlation, which is optimal with respect to an introduced binary dissimilarity-to-matching ratio criterion. The proposed method determines point correspondences with high accuracy both in homogeneous image regions and at the edges of scene objects. Furthermore, unknown correspondences of occluded and unmatched points in the scene can be recovered with a simple proposed post-processing step. The performance of the proposed method for stereo matching is exhaustively tested in terms of objective measures on known database images, and the obtained results are discussed and compared with those of two similar state-of-the-art methods.
Affiliation(s)
- Victor H. Diaz-Ramirez
- Instituto Politécnico Nacional-CITEDI, Instituto Politécnico Nacional 1310, Tijuana 22310, BC, Mexico
- Martin Gonzalez-Ruiz
- Instituto Politécnico Nacional-CITEDI, Instituto Politécnico Nacional 1310, Tijuana 22310, BC, Mexico
- Vitaly Kober
- Department of Computer Science, CICESE, Ensenada 22860, BC, Mexico
- Department of Mathematics, Chelyabinsk State University, 454001 Chelyabinsk, Russia
- Rigoberto Juarez-Salazar
- CONACYT-Instituto Politécnico Nacional, CITEDI, Instituto Politécnico Nacional 1310, Tijuana 22310, BC, Mexico
3. Gani SFA, Miskon MF, Hamzah RA. Depth Map Information from Stereo Image Pairs using Deep Learning and Bilateral Filter for Machine Vision Application. 2022 IEEE 5th International Symposium in Robotics and Manufacturing Automation (ROMA) 2022. DOI: 10.1109/roma55875.2022.9915680.
Affiliation(s)
- Shamsul Fakhar Abd Gani
- Universiti Teknikal Malaysia Melaka, Fakulti Teknologi Kejuruteraan Elektrik dan Elektronik, Melaka, Malaysia
4. Novel Projection Schemes for Graph-Based Light Field Coding. Sensors (Basel) 2022;22:4948. PMID: 35808447; PMCID: PMC9269820; DOI: 10.3390/s22134948.
Abstract
In light field compression, graph-based coding is powerful for exploiting signal redundancy along irregular shapes and achieves good energy compaction. However, apart from the high time complexity of processing high-dimensional graphs, its graph construction is highly sensitive to the accuracy of disparity information between viewpoints. In real-world light fields and in synthetic light fields generated by computer software, the use of disparity information for super-ray projection may suffer from inaccuracy due to the vignetting effect and to large disparities between views, respectively. This paper introduces two novel projection schemes that reduce the error in disparity information, one of which also significantly reduces computation time for both the encoder and the decoder. Experimental results show that the projection quality of super-pixels across views, along with rate-distortion performance, can be considerably enhanced by the proposed schemes when compared against the original projection scheme and HEVC-based or JPEG Pleno-based coding approaches.
5. Schischmanow A, Dahlke D, Baumbach D, Ernst I, Linkiewicz M. Seamless Navigation, 3D Reconstruction, Thermographic and Semantic Mapping for Building Inspection. Sensors (Basel) 2022;22:4745. PMID: 35808239; PMCID: PMC9268807; DOI: 10.3390/s22134745.
Abstract
We present a workflow for seamless real-time navigation and 3D thermal mapping in combined indoor and outdoor environments in a global reference frame. The automated workflow and partly real-time capabilities are of special interest for inspection tasks and other time-critical applications. We use a hand-held integrated positioning system (IPS), a real-time-capable visual-aided inertial navigation technology, and augment it with an additional passive thermal infrared camera and global referencing capabilities. The global reference is realized through surveyed optical markers (AprilTags). By fusing the sensor data of the stereo camera and the thermal images, the resulting georeferenced 3D point cloud is enriched with thermal intensity values. A challenging calibration approach is used to geometrically calibrate and pixel-co-register the trifocal camera system. By fusing the terrestrial dataset with additional geographic information from an unmanned aerial vehicle, we obtain a complete building-hull point cloud and automatically reconstruct a semantic 3D model. A single-family house with surroundings in the village of Morschenich near the city of Jülich (German federal state of North Rhine-Westphalia) served as a test site to demonstrate our workflow. The presented work is a step towards automated building information modeling.
6. Cambuim L, Barros E. FPGA-Based Pedestrian Detection for Collision Prediction System. Sensors (Basel) 2022;22:4421. PMID: 35746203; PMCID: PMC9230132; DOI: 10.3390/s22124421.
Abstract
Pedestrian detection (PD) systems capable of locating pedestrians at large distances, and of locating them quickly, are needed in Pedestrian Collision Prediction (PCP) systems to increase the decision-making distance. This paper proposes a performance-optimized FPGA implementation of a HOG-SVM-based PD system with support for image pyramids and detection windows of different sizes to locate near and far pedestrians. The proposed hardware architecture processes one pixel per clock cycle by exploiting data and temporal parallelism through techniques such as pipelining and spatial division of data among parallel processing units. The PD module was validated in FPGA and integrated with the stereo semi-global matching (SGM) module, also prototyped in FPGA. Processing two windows of different dimensions reduced the miss rate by at least 6% compared to a detector with a single window size. The PD system and the PCP system achieved 100 and 66.2 frames per second (FPS), respectively, at HD resolution. The performance improvement brought by our PD module increased the PCP system's decision-making distance by 3.3 m compared to a PCP system that processes at 30 FPS.
7. Jang M, Yoon H, Lee S, Kang J, Lee S. A Comparison and Evaluation of Stereo Matching on Active Stereo Images. Sensors (Basel) 2022;22:3332. PMID: 35591022; PMCID: PMC9100404; DOI: 10.3390/s22093332.
Abstract
The disparity and depth of corresponding pixels are inversely proportional. Thus, to estimate depth accurately from stereo vision, it is important to obtain accurate disparity maps, which encode the difference between the horizontal coordinates of corresponding image points. Stereo vision can be classified as either passive or active. Active stereo vision projects a texture pattern, which passive stereo vision does not have, onto the scene to fill textureless regions. For passive stereo vision, many surveys have found that disparity accuracy relies heavily on attributes such as radiometric variation and color variation, and have identified the best-performing conditions. In active stereo matching, however, the accuracy of the disparity map is influenced not only by the factors affecting the passive technique but also by the attributes of the generated pattern textures. In this paper, we therefore analyze and evaluate the relationship between the performance of the active stereo technique and the attributes of the pattern texture. Experiments are conducted under various settings, such as changing the pattern intensity, pattern contrast, number of pattern dots, and global gain, that may affect the overall performance of active stereo matching. These findings can serve as a noteworthy reference for constructing an active stereo system.
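The inverse disparity-depth relationship that opens this abstract is Z = fB/d for focal length f (in pixels), baseline B, and disparity d. A minimal sketch, with made-up rig parameters:

```python
import numpy as np

def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Rectified stereo: Z = f * B / d; zero disparity maps to infinite depth."""
    d = np.asarray(disparity_px, dtype=float)
    z = np.full_like(d, np.inf)
    valid = d > 0
    z[valid] = focal_px * baseline_m / d[valid]
    return z

# Illustrative rig (values are made up): 700 px focal length, 12 cm baseline.
z = depth_from_disparity([70.0, 7.0, 0.0], focal_px=700.0, baseline_m=0.12)
# Tenfold smaller disparity -> tenfold larger depth: [1.2, 12.0, inf]
```

The hyperbolic relationship is why sub-pixel disparity errors matter most at small disparities, i.e., for distant points.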
Affiliation(s)
- Mingyu Jang
- Department of Electrical and Electronic Engineering, Yonsei University, Seoul 03722, Korea
- Hyunse Yoon
- Department of Electrical and Electronic Engineering, Yonsei University, Seoul 03722, Korea
- Seongmin Lee
- Department of Electrical and Electronic Engineering, Yonsei University, Seoul 03722, Korea
- Jiwoo Kang
- Department of IT Engineering, Sookmyung Women’s University, Seoul 04310, Korea
- Sanghoon Lee
- Department of Electrical and Electronic Engineering, Yonsei University, Seoul 03722, Korea
- Department of Radiology, College of Medicine, Yonsei University, Seoul 03722, Korea
8. Radargrammetric DSM Generation by Semi-Global Matching and Evaluation of Penalty Functions. Remote Sensing 2022;14:1778. DOI: 10.3390/rs14081778.
Abstract
Radargrammetry is a useful approach for generating Digital Surface Models (DSMs) and an alternative to InSAR techniques, which are subject to temporal or atmospheric decorrelation. Stereo image matching in radargrammetry refers to the process of determining homologous points in two images; its performance influences the final quality of the DSM used for spatio-temporal analysis of landscapes and terrain. In SAR image matching, local matching methods are commonly used but usually produce sparse and inaccurate homologous points, adding ambiguity to final products; global or semi-global matching methods are seldom applied even though they can yield more accurate and denser homologous points. To fill this gap, we propose a hierarchical semi-global matching (SGM) pipeline to reconstruct DSMs in forested and mountainous regions using stereo TerraSAR-X images. In addition, three penalty functions were implemented in the pipeline and evaluated for effectiveness. For accuracy and efficiency comparisons between our SGM dense matching method and local matching, the normalized cross-correlation (NCC) local matching method was also applied to generate DSMs from the same test data. The accuracy of the radargrammetric DSMs was validated against an airborne photogrammetric reference DSM and compared with the accuracy of NASA’s 30 m SRTM DEM. The results show that the SGM pipeline produces DSMs whose height accuracy and computing efficiency exceed those of the SRTM DEM and the NCC-derived DSMs, and that the penalty function adopting the Canny edge detector yields higher vertical precision than the other two evaluated penalty functions. SGM is a powerful and efficient tool for producing high-quality DSMs from stereo spaceborne SAR images.
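The penalty functions evaluated above parameterize SGM's per-path recursion, in which a small penalty P1 charges disparity changes of one level and a larger P2 charges bigger jumps. A single-path (left-to-right) sketch of that recursion, not the paper's pipeline:

```python
import numpy as np

def sgm_path_left_to_right(cost, P1=10.0, P2=120.0):
    """Aggregate a matching-cost volume (H, W, D) along one left-to-right
    path with SGM's recursion: L(p,d) = C(p,d) + min(L(p-1,d),
    L(p-1,d±1)+P1, min_k L(p-1,k)+P2) - min_k L(p-1,k)."""
    H, W, D = cost.shape
    L = np.array(cost, dtype=float)
    for x in range(1, W):
        prev = L[:, x - 1]                          # (H, D)
        prev_min = prev.min(axis=1, keepdims=True)  # (H, 1)
        # Neighbors at d-1 and d+1, with inf padding at the disparity borders.
        up = np.pad(prev, ((0, 0), (1, 0)), constant_values=np.inf)[:, :D] + P1
        down = np.pad(prev, ((0, 0), (0, 1)), constant_values=np.inf)[:, 1:] + P1
        best = np.minimum(np.minimum(prev, prev_min + P2), np.minimum(up, down))
        L[:, x] += best - prev_min                  # subtraction bounds growth
    return L
```

A full SGM would sum such aggregations over several path directions (typically 8 or 16) before taking the per-pixel disparity argmin.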
9. Altingövde O, Mishchuk A, Ganeeva G, Oveisi E, Hebert C, Fua P. 3D reconstruction of curvilinear structures with stereo matching deep convolutional neural networks. Ultramicroscopy 2022;234:113460. PMID: 35121280; DOI: 10.1016/j.ultramic.2021.113460.
Abstract
Curvilinear structures frequently appear in microscopy imaging as the object of interest. Crystallographic defects, i.e., dislocations, are curvilinear structures that have been repeatedly investigated under transmission electron microscopy (TEM), and their 3D structural information is of great importance for understanding the properties of materials. 3D information of dislocations is often obtained by tomography, a cumbersome process that requires acquiring many images at different tilt angles under similar imaging conditions. Although alternative stereoscopy methods lower the number of required images to two, they still require human intervention and shape priors for accurate 3D estimation. We propose a fully automated pipeline for both detection and matching of curvilinear structures in stereo pairs by utilizing deep convolutional neural networks (CNNs), without making any prior assumption on 3D shapes. In this work, we mainly focus on 3D reconstruction of dislocations from stereo pairs of TEM images.
Affiliation(s)
- Okan Altingövde
- Computer Vision Laboratory, École Polytechnique Fédérale de Lausanne (EPFL), Switzerland
- Anastasiia Mishchuk
- Computer Vision Laboratory, École Polytechnique Fédérale de Lausanne (EPFL), Switzerland; Electron Spectrometry and Microscopy Laboratory, École Polytechnique Fédérale de Lausanne (EPFL), Switzerland
- Gulnaz Ganeeva
- Electron Spectrometry and Microscopy Laboratory, École Polytechnique Fédérale de Lausanne (EPFL), Switzerland
- Emad Oveisi
- Interdisciplinary Centre for Electron Microscopy, École Polytechnique Fédérale de Lausanne (EPFL), Switzerland
- Cecile Hebert
- Electron Spectrometry and Microscopy Laboratory, École Polytechnique Fédérale de Lausanne (EPFL), Switzerland
- Pascal Fua
- Computer Vision Laboratory, École Polytechnique Fédérale de Lausanne (EPFL), Switzerland
10. Jeon HG, Im S, Choe J, Kang M, Lee JY, Hebert M. CMSNet: Deep Color and Monochrome Stereo. Int J Comput Vis 2022. DOI: 10.1007/s11263-021-01565-6.
11. Yang L, Han T, Meng J, Qian S, Yang C, Liu Z, Ding Z. Optimized number of the primary singular values for image reconstruction in reflection matrix based optical coherence tomography. Optics Express 2022;30:2680-2692. PMID: 35209403; DOI: 10.1364/oe.442672.
Abstract
Reflection-matrix-based optical coherence tomography (OCT) has recently been proposed and is expected to double the imaging-depth limit. However, the imaging depth, and hence the image quality, depends heavily on the number of primary singular values used for image reconstruction. To this end, we propose a method based on the correlation between image pairs reconstructed from different numbers of singular values and the corresponding remainders. The obtained correlation curve, together with a feature curve derived from it, is then fed to a long short-term memory (LSTM) network classifier to identify the optimized number of primary singular values for image reconstruction. Simulated targets with different combinations of filling fraction and signal-to-noise ratio (SNR) were reconstructed by the developed method as well as by two currently adopted methods for comparison. The results demonstrate that the proposed method robustly recovers images with satisfactory similarity to the reference. To our knowledge, this is the first comprehensive study of the optimized number of primary singular values for image reconstruction in reflection-matrix-based OCT.
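The reconstruction step being tuned above keeps only the k largest singular values of the reflection matrix. A minimal sketch of that truncation (illustrative, not the authors' code):

```python
import numpy as np

def reconstruct_from_top_k(matrix, k):
    """Reconstruct a matrix from its k largest singular values:
    M_k = U[:, :k] @ diag(s[:k]) @ Vt[:k] (NumPy returns s sorted descending)."""
    U, s, Vt = np.linalg.svd(matrix, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k]

# A rank-1 matrix (e.g., a single dominant reflector) is recovered exactly
# from its leading singular component alone.
M = np.outer([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
M1 = reconstruct_from_top_k(M, 1)
```

The paper's contribution is choosing k automatically; too small a k discards signal, too large a k reintroduces the noise-dominated remainder.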
12. Jin Y, Zhao H, Bu P. Four-direction global matching with cost volume update for stereovision. Applied Optics 2021;60:5471-5479. PMID: 34263833; DOI: 10.1364/ao.422798.
Abstract
The accuracy and speed of semi-global matching (SGM) make it widely used in many computer vision problems. However, SGM often struggles with pixels in homogeneous regions and suffers from streak artefacts because of its weak smoothness constraints. Meanwhile, we observe that global methods usually fail in occluded areas, where the disparities of occluded pixels are typically the average of the disparities of nearby pixels; local methods, by contrast, can propagate information into occluded pixels of similar color. In this paper, we propose a novel, to the best of our knowledge, four-direction global matching method with a cost-volume update scheme to cope with textureless regions and occlusion. The proposed method makes two changes to the recursive formula: (a) the computation considers four visited nodes to enforce stronger smoothness constraints; (b) the recursive formula integrates cost filtering to propagate reliable information farther in untextured regions. Thus, our method inherits the speed of SGM, avoids streaking artefacts, and handles occluded pixels. Extensive stereo matching experiments on Middlebury demonstrate that our method outperforms typical SGM-based cost aggregation approaches and other state-of-the-art local methods.
13. Wang H, Fan R, Cai P, Liu M. PVStereo: Pyramid Voting Module for End-to-End Self-Supervised Stereo Matching. IEEE Robot Autom Lett 2021. DOI: 10.1109/lra.2021.3068108.
14. Poggi M, Tonioni A, Tosi F, Mattoccia S, Di Stefano L. Continual Adaptation for Deep Stereo. IEEE Transactions on Pattern Analysis and Machine Intelligence 2021. PMID: 33909558; DOI: 10.1109/tpami.2021.3075815.
Abstract
Depth estimation from stereo images is carried out with unmatched results by convolutional neural networks trained end-to-end to regress dense disparities. As for most tasks, this is possible if large amounts of labelled samples are available for training, ideally covering the whole data distribution encountered at deployment time. Since this assumption is systematically unmet in real applications, the capacity to adapt to any unseen setting becomes of paramount importance. Accordingly, we propose a continual adaptation paradigm for deep stereo networks designed to deal with challenging and ever-changing environments. We design a lightweight and modular architecture, Modularly ADaptive Network (MADNet), and formulate Modular ADaptation algorithms (MAD, MAD++) that permit efficient optimization of independent sub-portions of the entire network. In our paradigm, the learning signals needed to continuously adapt models online can be sourced from self-supervision via right-to-left image warping or from traditional stereo algorithms. With both sources, no data other than the input images gathered at deployment time are needed. Thus, our network architecture and adaptation algorithms realize the first real-time self-adaptive deep stereo system and pave the way for a new paradigm that can facilitate practical deployment of end-to-end architectures for dense disparity regression.
15. Yin W, Hu Y, Feng S, Huang L, Kemao Q, Chen Q, Zuo C. Single-shot 3D shape measurement using an end-to-end stereo matching network for speckle projection profilometry. Optics Express 2021;29:13388-13407. PMID: 33985073; DOI: 10.1364/oe.418881.
Abstract
Speckle projection profilometry (SPP), which establishes global correspondences between stereo images by projecting only a single speckle pattern, has the advantage of single-shot 3D reconstruction. Nevertheless, SPP suffers from the low matching accuracy of traditional stereo matching algorithms, which fundamentally limits its 3D measurement accuracy. In this work, we propose a single-shot 3D shape measurement method using an end-to-end stereo matching network for SPP. To build a high-quality SPP dataset for training the network, phase-shifting profilometry (PSP) and temporal phase unwrapping techniques are combined to obtain high-precision absolute phase maps, from which accurate and dense disparity maps of high completeness are generated as ground truth by phase matching. For the network architecture, a multi-scale residual subnetwork is first leveraged to synchronously extract compact feature tensors at 1/4 resolution from the speckle images for constructing the 4D cost volume. Because cost filtering based on 3D convolution is computationally costly, a lightweight 3D U-Net is proposed to implement efficient 4D cost aggregation. In addition, because the disparity maps in the SPP dataset have valid values only in the foreground, a simple and fast saliency detection network is integrated to avoid predicting invalid pixels in occlusions and background regions, thereby implicitly enhancing the matching accuracy for valid pixels. Experimental results demonstrate that the proposed method improves the matching accuracy significantly, by about 50%, compared with traditional stereo matching methods. Consequently, our method achieves fast and absolute 3D shape measurement with an accuracy of about 100 µm from a single speckle pattern.
16. Tang J, Tian FP, Feng W, Li J, Tan P. Learning Guided Convolutional Network for Depth Completion. IEEE Transactions on Image Processing 2020;30:1116-1129. PMID: 33290217; DOI: 10.1109/tip.2020.3040528.
Abstract
Dense depth perception is critical for autonomous driving and other robotics applications. However, modern LiDAR sensors provide only sparse depth measurements. It is thus necessary to complete the sparse LiDAR data, and a synchronized guidance RGB image is often used to facilitate this completion. Many neural networks have been designed for this task, but they often naïvely fuse the LiDAR data and RGB image information by performing feature concatenation or element-wise addition. Inspired by guided image filtering, we design a novel guided network to predict kernel weights from the guidance image; these predicted kernels are then applied to extract the depth image features. In this way, our network generates content-dependent and spatially variant kernels for multi-modal feature fusion. Dynamically generated spatially variant kernels could lead to prohibitive GPU memory consumption and computation overhead, so we further design a convolution factorization to reduce computation and memory consumption. The GPU memory reduction makes it possible for feature fusion to work in a multi-stage scheme. We conduct comprehensive experiments to verify our method on real-world outdoor, indoor and synthetic datasets. Our method produces strong results: it outperforms state-of-the-art methods on the NYUv2 dataset and ranked 1st on the KITTI depth completion benchmark at the time of submission. It also presents strong generalization capability under different 3D point densities, various lighting and weather conditions, and cross-dataset evaluations. The code will be released for reproduction.
17. Sun Y, Montazeri S, Wang Y, Zhu XX. Automatic registration of a single SAR image and GIS building footprints in a large-scale urban area. ISPRS Journal of Photogrammetry and Remote Sensing 2020;170:1-14. PMID: 33299267; PMCID: PMC7694880; DOI: 10.1016/j.isprsjprs.2020.09.016.
Abstract
Existing techniques for 3-D reconstruction of buildings from SAR images are mostly based on multibaseline SAR interferometry, such as PSI and SAR tomography (TomoSAR). However, these techniques require tens of images for a reliable reconstruction, which limits their application in various scenarios, such as emergency response. Therefore, alternatives that use a single SAR image and building footprints from GIS data show great potential for 3-D reconstruction. The combination of GIS data and SAR images requires a precise registration, which is challenging due to the unknown terrain height and the difficulty of finding and extracting correspondences. In this paper, we propose a framework to automatically register GIS building footprints to a SAR image by exploiting features representing the intersection of the ground and visible building facades, specifically the near-range boundaries in the building polygons and the double-bounce lines in the SAR image. Based on these features, the two data sets are registered progressively at multiple resolutions, allowing the algorithm to cope with variations in the local terrain. The proposed framework was tested in Berlin using one TerraSAR-X High Resolution SpotLight image and GIS building footprints of the area. Compared with the ground truth, the proposed algorithm reduced the average distance error from 5.91 m before registration to -0.08 m, and the standard deviation from 2.77 m to 1.12 m. Such accuracy, better than half of the typical urban floor height (3 m), is significant for precise building height reconstruction on a large scale. The proposed registration framework has great potential for assisting SAR image interpretation in typical urban areas and building model reconstruction from SAR images.
Affiliation(s)
- Yao Sun
- Remote Sensing Technology Institute, German Aerospace Center (DLR), Münchener Straße 20, 82234 Weßling, Germany
- Sina Montazeri
- Remote Sensing Technology Institute, German Aerospace Center (DLR), Münchener Straße 20, 82234 Weßling, Germany
- Yuanyuan Wang
- Remote Sensing Technology Institute, German Aerospace Center (DLR), Münchener Straße 20, 82234 Weßling, Germany
- Signal Processing in Earth Observation, Technical University of Munich, Arcisstraße 21, 80333 Munich, Germany
- Xiao Xiang Zhu
- Remote Sensing Technology Institute, German Aerospace Center (DLR), Münchener Straße 20, 82234 Weßling, Germany
- Signal Processing in Earth Observation, Technical University of Munich, Arcisstraße 21, 80333 Munich, Germany
18
Yin W, Zhong J, Feng S, Tao T, Han J, Huang L, Chen Q, Zuo C. Composite deep learning framework for absolute 3D shape measurement based on single fringe phase retrieval and speckle correlation. JPHYS PHOTONICS 2020. [DOI: 10.1088/2515-7647/abbcd9] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/11/2022] Open
19
Loquercio A, Dosovitskiy A, Scaramuzza D. Learning Depth With Very Sparse Supervision. IEEE Robot Autom Lett 2020. [DOI: 10.1109/lra.2020.3009067] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
20
Meier K, Chung S, Hutchinson S. River segmentation for autonomous surface vehicle localization and river boundary mapping. J FIELD ROBOT 2020. [DOI: 10.1002/rob.21989] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022]
Affiliation(s)
- Kevin Meier
- Sensors and Electron Devices Directorate, Sensors Division, United States Army Research Laboratory, Adelphi Laboratory Center, Adelphi, Maryland, USA
- Soon‐Jo Chung
- Department of Aerospace, California Institute of Technology, Pasadena, California, USA
- Seth Hutchinson
- Institute for Robotics and Intelligent Machines, Georgia Institute of Technology, Atlanta, Georgia, USA
21
Stereo Dense Image Matching by Adaptive Fusion of Multiple-Window Matching Results. REMOTE SENSING 2020. [DOI: 10.3390/rs12193138] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
Traditional stereo dense image matching (DIM) methods normally predefine a fixed window to compute the matching cost, so their performance is limited by the matching window size. A large matching window usually achieves robust matching results in weakly textured regions, but it may cause over-smoothing at disparity jumps and fine structures. A small window can recover sharp boundaries and fine structures, but it carries high matching uncertainty in weakly textured regions. To address this issue, we compute matching results with different matching window sizes and then propose an adaptive method for fusing these results so that a better matching result can be generated. The core algorithm designs a Convolutional Neural Network (CNN) to predict the probabilities of the large and small windows for each pixel and then refines these probabilities by imposing a global energy function. A compromise solution of the global energy function is obtained by breaking the optimization into sub-optimizations of each pixel along one-dimensional (1D) paths. Finally, the matching results of the large and small windows are fused by taking the refined probabilities as weights for more accurate matching. We test our method on aerial image datasets, satellite image datasets, and the Middlebury benchmark with different matching cost metrics. Experiments show that the proposed adaptive fusion of multiple-window matching results transfers well across different datasets and outperforms small windows, median windows, large windows, and several state-of-the-art matching window selection methods.
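The final fusion step described in this abstract, weighting the two window results by the refined per-pixel probabilities, can be sketched in a few lines of numpy (a minimal sketch; function and variable names are illustrative, not taken from the paper):

```python
import numpy as np

def fuse_disparities(d_small, d_large, p_large):
    """Fuse two disparity maps by per-pixel weighting.

    d_small -- disparity map from a small matching window (sharp boundaries)
    d_large -- disparity map from a large matching window (robust in weak texture)
    p_large -- per-pixel weight of the large window in [0, 1], e.g. a CNN
               probability refined along 1D paths
    """
    p = np.clip(p_large, 0.0, 1.0)
    return p * d_large + (1.0 - p) * d_small

# Toy example: trust the large window in a flat region (p = 1)
# and the small window at a disparity jump (p = 0).
d_s = np.array([[10.0, 10.0], [10.0, 20.0]])
d_l = np.array([[12.0, 12.0], [12.0, 12.0]])
p = np.array([[1.0, 1.0], [1.0, 0.0]])
fused = fuse_disparities(d_s, d_l, p)
```

The actual probability prediction and 1D-path refinement are the substance of the paper; only the weighting itself is shown here.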
22
Okae J, Du J, Huang T. A novel depth estimation approach based on bidirectional matching for stereo vision systems. Adv Robot 2020. [DOI: 10.1080/01691864.2020.1803127] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/23/2022]
Affiliation(s)
- J. Okae
- Guangdong Provincial Engineering Laboratory for Advanced Chip Intelligent Packaging Equipment, South China University of Technology, Guangzhou, People's Republic of China
- J. Du
- Guangdong Provincial Engineering Laboratory for Advanced Chip Intelligent Packaging Equipment, South China University of Technology, Guangzhou, People's Republic of China
- T. Huang
- Guangdong Provincial Engineering Laboratory for Advanced Chip Intelligent Packaging Equipment, South China University of Technology, Guangzhou, People's Republic of China
23
Zeglazi O, Rziza M, Amine A, Demonceaux C. Structural Similarity Measurement Based Cost Function for Stereo Matching of Automotive Applications. J Imaging 2020; 6:jimaging6080077. [PMID: 34460692 PMCID: PMC8321088 DOI: 10.3390/jimaging6080077] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/22/2020] [Revised: 07/21/2020] [Accepted: 07/28/2020] [Indexed: 11/23/2022] Open
Abstract
Human visual perception uses structural information to recognize stereo correspondences in natural scenes. Structural information is therefore important for building an efficient stereo matching algorithm. In this paper, we demonstrate that incorporating the structural similarity between two patches, extracted either directly from image intensity (SSIM) or from image gradients (GSSIM), can accurately describe the patch structures and thus provide more reliable initial cost values. We also address one of the major phenomena faced in stereo matching of real-world scenes: radiometric changes. The performance of the proposed cost functions was evaluated in two stages: the first considers these costs without an aggregation process, while the second uses a fast adaptive aggregation technique. The experiments were conducted on the real road-traffic scenes of the KITTI 2012 and KITTI 2015 benchmarks. The obtained results demonstrate the potential merits of the proposed stereo similarity measurements under radiometric changes.
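As a rough illustration of an SSIM-based matching cost, the standard single-window SSIM formula can be turned into a cost as follows (a minimal sketch with the common C1/C2 constants; the paper's exact cost construction and the gradient-based GSSIM variant are not reproduced):

```python
import numpy as np

def ssim_cost(patch_a, patch_b, L=255.0):
    """Matching cost from the single-window SSIM of two patches.

    SSIM lies in [-1, 1]; 1 - SSIM is therefore 0 for identical patches
    and grows as luminance, contrast, or structure diverge. The constants
    follow the common choice C1 = (0.01 L)^2, C2 = (0.03 L)^2.
    """
    c1, c2 = (0.01 * L) ** 2, (0.03 * L) ** 2
    a = np.asarray(patch_a, dtype=np.float64)
    b = np.asarray(patch_b, dtype=np.float64)
    mu_a, mu_b = a.mean(), b.mean()
    var_a, var_b = a.var(), b.var()
    cov = ((a - mu_a) * (b - mu_b)).mean()
    ssim = ((2 * mu_a * mu_b + c1) * (2 * cov + c2)) / (
        (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)
    )
    return 1.0 - ssim

patch = np.arange(25.0).reshape(5, 5)
same = ssim_cost(patch, patch)            # identical structure: cost 0
offset = ssim_cost(patch, patch + 30.0)   # pure brightness offset: moderate cost
```

Note how a constant radiometric offset raises only the luminance term, which is one reason structure-based costs behave better than raw intensity differences under radiometric changes.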
Affiliation(s)
- Oussama Zeglazi
- LRIT, Rabat IT Center, Faculty of Sciences, Mohammed V University, Rabat B.P. 1014, Morocco
- Correspondence: ; Tel.: +212-670-281-916
- Mohammed Rziza
- LRIT, Rabat IT Center, Faculty of Sciences, Mohammed V University, Rabat B.P. 1014, Morocco
- Aouatif Amine
- LGS, National School of Applied Sciences, Ibn Tofail University, Kenitra B.P. 241, Morocco
- Cédric Demonceaux
- ERL VIBOT CNRS 6000, ImViA, Université Bourgogne Franche-Comté, 71200 Le Creusot, France
24
A Real-Time Infrared Stereo Matching Algorithm for RGB-D Cameras’ Indoor 3D Perception. ISPRS INTERNATIONAL JOURNAL OF GEO-INFORMATION 2020. [DOI: 10.3390/ijgi9080472] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
Low-cost commercial RGB-D cameras have become one of the main sensors for indoor 3D scene perception and for robot navigation and localization. In these studies, the Intel RealSense R200 sensor (R200) is popular among many researchers, but its integrated commercial stereo matching algorithm has a small detection range, a short measurement distance, and low depth-map resolution, which severely restrict its usage scenarios and service life. To address these problems, building on existing research, this paper proposes a novel infrared stereo matching algorithm that combines the semi-global method with a sliding window. First, the R200 is calibrated. Then, Gaussian filtering enhances the mutual information and correlation between the left and right infrared stereo images. A dynamic matching threshold is selected according to the mutual information, improving adaptability to different scenes. Meanwhile, the robustness of the algorithm is improved by using Sobel operators in the cost term of the energy function. In addition, the accuracy and quality of the disparity values are improved through a uniqueness test and sub-pixel interpolation. Finally, the BundleFusion algorithm is used to reconstruct indoor 3D surface models in different scenarios, demonstrating the effectiveness and superiority of the proposed stereo matching algorithm.
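The mutual information used here for dynamic threshold selection can be estimated with a generic joint-histogram (plug-in) estimator; the sketch below is illustrative and not the paper's implementation:

```python
import numpy as np

def mutual_information(img_a, img_b, bins=32):
    """Plug-in estimate of mutual information from a joint intensity histogram.

    Higher MI means the two images' intensities predict each other better;
    a matching threshold can then be adapted to the scene from this value.
    """
    joint, _, _ = np.histogram2d(img_a.ravel(), img_b.ravel(), bins=bins)
    p_ab = joint / joint.sum()
    p_a = p_ab.sum(axis=1, keepdims=True)   # marginal of img_a
    p_b = p_ab.sum(axis=0, keepdims=True)   # marginal of img_b
    nz = p_ab > 0                           # avoid log(0)
    return float((p_ab[nz] * np.log2(p_ab[nz] / (p_a @ p_b)[nz])).sum())

rng = np.random.default_rng(0)
left = rng.integers(0, 256, size=(64, 64)).astype(float)
noise = rng.integers(0, 256, size=(64, 64)).astype(float)
mi_self = mutual_information(left, left)    # maximal: the entropy of `left`
mi_rand = mutual_information(left, noise)   # near zero for unrelated images
```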
25
Deep Color Transfer for Color-Plus-Mono Dual Cameras. SENSORS 2020; 20:s20092743. [PMID: 32403436 PMCID: PMC7249219 DOI: 10.3390/s20092743] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 03/25/2020] [Revised: 05/03/2020] [Accepted: 05/06/2020] [Indexed: 11/17/2022]
Abstract
A few approaches have studied image fusion using color-plus-mono dual cameras to improve the image quality in low-light shooting. Among them, the color transfer approach, which transfers the color information of a color image to a mono image, is considered to be promising for obtaining improved images with less noise and more detail. However, the color transfer algorithms rely heavily on appropriate color hints from a given color image. Unreliable color hints caused by errors in stereo matching of a color-plus-mono image pair can generate various visual artifacts in the final fused image. This study proposes a novel color transfer method that seeks reliable color hints from a color image and colorizes a corresponding mono image with reliable color hints that are based on a deep learning model. Specifically, a color-hint-based mask generation algorithm is developed to obtain reliable color hints. It removes unreliable color pixels using a reliability map computed by the binocular just-noticeable-difference model. In addition, a deep colorization network that utilizes structural information is proposed for solving the color bleeding artifact problem. The experimental results demonstrate that the proposed method provides better results than the existing image fusion algorithms for dual cameras.
26
Improved Cost Computation and Adaptive Shape Guided Filter for Local Stereo Matching of Low Texture Stereo Images. APPLIED SCIENCES-BASEL 2020. [DOI: 10.3390/app10051869] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
Dense stereo matching has been widely used in photogrammetry and computer vision applications. Despite its long research history, dense stereo matching remains challenging in occluded, textureless, and discontinuous regions. This paper proposes an efficient and effective matching cost measurement and an adaptive shape-guided filter-based cost aggregation method to improve stereo matching performance in large textureless regions. First, an efficient matching cost function combining an enhanced image gradient-based cost and an improved census transform-based cost is introduced. This cost function is robust against radiometric variations and textureless regions. Next, an adaptive cross-based window is constructed for each pixel, and a modified guided filter based on this adaptive shape window is applied for cost aggregation. The final disparity map is obtained after disparity selection and multi-step disparity refinement. Experiments were conducted on the Middlebury benchmark dataset to evaluate the effectiveness of the proposed cost measurement and aggregation strategy. The results show an average matching error rate of 9.40% on the Middlebury standard image pairs. Compared with the traditional guided filter-based stereo matching method, the proposed method achieves better matching results in textureless regions.
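The census-transform component of such a cost function can be illustrated with the classic census transform and a Hamming-distance cost (a sketch only; the paper's "improved" census variant is not reproduced here):

```python
import numpy as np

def census_transform(img, r=1):
    """Classic census transform: encode each pixel as a bit string of
    neighbour-vs-centre comparisons in a (2r+1) x (2r+1) window.
    Only the intensity *order* matters, so the code is invariant to any
    monotonic brightness change (a key source of radiometric robustness)."""
    code = np.zeros(img.shape, dtype=np.uint64)
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            if dy == 0 and dx == 0:
                continue
            neighbour = np.roll(np.roll(img, dy, axis=0), dx, axis=1)
            code = (code << np.uint64(1)) | (neighbour < img).astype(np.uint64)
    return code

def hamming_cost(code_l, code_r):
    """Matching cost: per-pixel Hamming distance between census codes."""
    diff = code_l ^ code_r
    return np.array([bin(int(v)).count("1") for v in diff.ravel()]).reshape(diff.shape)

img = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]])
c = census_transform(img)
# A constant radiometric offset changes no comparison, hence zero cost:
zero_cost = hamming_cost(c, census_transform(img + 10.0))
```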
27
Zhou H, Jagadeesan J. Real-Time Dense Reconstruction of Tissue Surface From Stereo Optical Video. IEEE TRANSACTIONS ON MEDICAL IMAGING 2020; 39:400-412. [PMID: 31283478 PMCID: PMC6946894 DOI: 10.1109/tmi.2019.2927436] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/01/2023]
Abstract
We propose an approach to reconstruct dense three-dimensional (3D) models of tissue surfaces from stereo optical videos in real time. The basic idea is to first extract 3D information from video frames by stereo matching, and then to mosaic the reconstructed 3D models. To handle the common low-texture regions on tissue surfaces, we propose effective post-processing steps for the local stereo matching method to enlarge the radius of constraint, including outlier removal, hole filling, and smoothing. Since the tissue models obtained by stereo matching are limited to the field of view of the imaging modality, we propose a model mosaicking method that aligns the models using a novel feature-based simultaneous localization and mapping (SLAM) method. Low-texture regions and varying illumination conditions may lead to a large percentage of feature-matching outliers. To solve this problem, we propose several algorithms to improve the robustness of the SLAM, mainly including: 1) a histogram voting-based method to roughly select possible inliers from the feature-matching results; 2) a novel 1-point RANSAC-based PnP algorithm, called DynamicR1PPnP, to track the camera motion; and 3) a GPU-based iterative closest point (ICP) and bundle adjustment (BA) method to refine the camera motion estimation. Experimental results on ex vivo and in vivo data showed that the reconstructed 3D models have high-resolution texture with an accuracy error of less than 2 mm. Most algorithms are highly parallelized for GPU computation, and the average runtime for processing one key frame is 76.3 ms on stereo images with 960×540 resolution.
28
Hernandez-Beltran JE, Diaz-Ramirez VH, Juarez-Salazar R. Adaptive matched filter for implicit-target recognition: application in three-dimensional reconstruction. APPLIED OPTICS 2019; 58:8920-8930. [PMID: 31873670 DOI: 10.1364/ao.58.008920] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/04/2019] [Accepted: 10/20/2019] [Indexed: 06/10/2023]
Abstract
The design of matched filters for optical correlators requires explicit knowledge of the shape of the target. This requirement limits their usefulness in applications where the appearance of the target is unspecified or dynamically changing. This research presents the design of an adaptive correlation filter by optimization of the mean-squared-error criterion when the shape of the target is implicit and embedded in a cluttered background with unknown statistics in the reference image. To this end, estimators are proposed for the region of support of the target as well as the statistical parameters of the additive and non-overlapping noise of the scene. The performance of the proposed filter is analyzed in terms of detection efficiency and location accuracy for an implicit target in the context of stereo matching and three-dimensional reconstruction.
29
Dense Image-Matching via Optical Flow Field Estimation and Fast-Guided Filter Refinement. REMOTE SENSING 2019. [DOI: 10.3390/rs11202410] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/25/2022]
Abstract
Developing an efficient and robust method for dense image matching has been a technical challenge due to the high variation in illumination and ground features across aerial images of large areas. In this paper, we propose a method for dense matching of aerial images using an optical flow field and a fast guided filter. The proposed method adopts a coarse-to-fine matching strategy for a pixel-wise correspondence search across stereo image pairs. The pyramid Lucas–Kanade (L–K) method is first used to generate a sparse optical flow field within the stereo image pairs, and an adjusted control lattice is then used to derive a multi-level B-spline interpolating function for estimating the dense optical flow field. The dense correspondence is subsequently refined through a combination of a novel cross-region-based voting process and fast guided filtering. The performance of the proposed method was evaluated on three criteria: matching accuracy, matching success rate, and matching efficiency. The experiments were performed using sets of unmanned aerial vehicle (UAV) images and aerial digital mapping camera (DMC) images. The results showed that the proposed method achieves a root mean square error (RMSE) of the reprojection errors better than ±0.5 pixels in image space, and a height accuracy within ±2.5 GSD (ground sampling distance) on the ground. The method was further compared with the state-of-the-art commercial software SURE and was confirmed to deliver more complete matches for images with poorly textured areas: the matching success rate of the proposed method is higher than 97%, versus 96% for SURE, with 47% higher matching efficiency. This demonstrates the superior applicability of the proposed method to aerial image-based dense matching in poorly textured regions.
30
Choi N, Jang J, Paik J. Illuminant-invariant stereo matching using cost volume and confidence-based disparity refinement. JOURNAL OF THE OPTICAL SOCIETY OF AMERICA. A, OPTICS, IMAGE SCIENCE, AND VISION 2019; 36:1768-1776. [PMID: 31674442 DOI: 10.1364/josaa.36.001768] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/10/2019] [Accepted: 09/10/2019] [Indexed: 06/10/2023]
Abstract
In stereo-matching techniques for three-dimensional (3D) vision, illumination change is a major problem that degrades matching accuracy. When large intensity differences are observed between a stereo pair, it is difficult to find similarity in the matching process. In addition, inaccurate disparities are estimated in textureless regions, since these regions contain no distinguishable features. To solve these problems, this paper presents a robust stereo-matching method using an illuminant-invariant cost volume and confidence-based disparity refinement. In the matching step, the proposed method combines two cost volumes using an invariant image and the Weber local descriptor (WLD), which was originally motivated by human visual characteristics. The invariant image used in the matching step is insensitive to sudden brightness changes caused by shadows or light sources, and the WLD reflects structural features of the invariant image while accounting for gradual illumination change. After aggregating the cost using a guided filter, we refine the initially estimated disparity map based on a confidence map computed from the combined cost volume. Experimental results verify that the proposed matching computation improves the accuracy of the disparity map in radiometrically dynamic environments. Since the proposed disparity refinement can also reduce the error of the initial disparity map in textureless areas, the method can be applied to various 3D vision systems such as industrial robots and autonomous vehicles.
31
Abstract
In this work, we introduce an end-to-end workflow for very high-resolution satellite-based mapping, building the basis for important 3D mapping products: (1) a digital surface model, (2) a digital terrain model, (3) a normalized digital surface model, and (4) an ortho-rectified image mosaic. In particular, we describe all underlying principles for satellite-based 3D mapping and propose methods that extract these products from multi-view stereo satellite imagery. Our workflow is demonstrated for the Pléiades satellite constellation; however, the building blocks are more general and thus also applicable to different setups. Besides introducing the overall end-to-end workflow, we also tackle the individual building blocks: optimization of sensor models represented by rational polynomials, epipolar rectification, image matching, spatial point intersection, data fusion, digital terrain model derivation, ortho-rectification, and ortho-mosaicing. For each of these steps, extensions to the state of the art are proposed and discussed in detail. In addition, a novel approach for terrain model generation is introduced. The second aim of the study is a detailed assessment of the resulting output products. A variety of data sets covering different acquisition scenarios were gathered, altogether comprising 24 Pléiades images. First, the accuracies of the 2D and 3D geo-location are analyzed. Second, surface and terrain models are evaluated, including a critical look at the underlying error metrics and a discussion of the differences between single-stereo, tri-stereo, and multi-view data sets. Overall, 3D accuracies in the range of 0.2 to 0.3 m in planimetry and 0.2 to 0.4 m in height are achieved w.r.t. ground control points. Retrieved surface models show normalized median absolute deviations around 0.9 m in comparison to reference LiDAR data. Multi-view stereo outperforms single stereo in terms of accuracy and completeness of the resulting surface models.
32
Evaluation of Matching Costs for High-Quality Sea-Ice Surface Reconstruction from Aerial Images. REMOTE SENSING 2019. [DOI: 10.3390/rs11091055] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]
Abstract
Satellite remote sensing can be used effectively with a wide coverage and repeatability in large-scale Arctic sea-ice analysis. To produce reliable sea-ice information, satellite remote-sensing methods should be established and validated using accurate field data, but obtaining field data on Arctic sea-ice is very difficult due to limited accessibility. In this situation, digital surface models derived from aerial images can be a good alternative to topographical field data. However, to achieve this, we should discuss an additional issue, i.e., that low-textured surfaces on sea-ice can reduce the matching accuracy of aerial images. The matching performance is dependent on the matching cost and search window size used. Therefore, in order to generate high-quality sea-ice surface models, we first need to examine the influence of matching costs and search window sizes on the matching performance on low-textured sea-ice surfaces. For this reason, in this study, we evaluate the performance of matching costs in relation to changes of the search window size, using acquired aerial images of Arctic sea-ice. The evaluation concerns three factors. The first is the robustness of matching to low-textured surfaces. Matching costs for generating sea-ice surface models should have a high discriminatory power on low-textured surfaces, even with small search windows. To evaluate this, we analyze the accuracy, uncertainty, and optimal window size in terms of template matching. The second is the robustness of positioning to low-textured surfaces. One of the purposes of image matching is to determine the positions of object points that constitute digital surface models. From this point of view, we analyze the accuracy and uncertainty in terms of positioning object points. The last is the processing speed. Since the computation complexity is also an important performance indicator, we analyze the elapsed time for each of the processing steps. 
The evaluation results showed that the image domain costs were more effective for low-textured surfaces than the frequency domain costs. In terms of matching robustness, the image domain costs showed a better performance, even with smaller search windows. In terms of positioning robustness, the image domain costs also performed better because of the lower uncertainty. Lastly, in terms of processing speed, the PC (phase correlation) of the frequency domain showed the best performance, but the image domain costs, except MI (mutual information), were not far behind. From the evaluation results, we concluded that, among the compared matching costs, ZNCC (zero-mean normalized cross-correlation) is the most effective for sea-ice surface model generation. In addition, we found that it is necessary to adjust search window sizes properly, according to the number of textures required for reliable image matching on sea-ice surfaces, and that various uncertainties due to low-textured surfaces should be considered to determine the positions of object points.
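The ZNCC cost that this evaluation found most effective can be written compactly; a minimal sketch for two windows of equal size:

```python
import numpy as np

def zncc(win_a, win_b, eps=1e-12):
    """Zero-mean normalized cross-correlation of two equally sized windows.

    Subtracting the window means and dividing by the standard deviations
    makes the score invariant to linear brightness changes (gain + offset);
    1 is a perfect match, -1 a perfect inversion. eps guards flat windows.
    """
    a = win_a - win_a.mean()
    b = win_b - win_b.mean()
    return float((a * b).sum() / (np.sqrt((a * a).sum() * (b * b).sum()) + eps))

tpl = np.array([[1.0, 2.0], [3.0, 4.0]])
s_same = zncc(tpl, tpl)                 # close to 1.0
s_gain = zncc(tpl, 2.0 * tpl + 5.0)     # still close to 1.0: gain/offset invariant
s_flip = zncc(tpl, -tpl)                # close to -1.0
```

On nearly flat (low-texture) windows the denominator shrinks toward zero, which is exactly why the evaluation above stresses search window sizes on low-textured sea-ice surfaces.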
33
Lim J, Lee S. Patchmatch-Based Robust Stereo Matching Under Radiometric Changes. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2019; 41:1203-1212. [PMID: 29993771 DOI: 10.1109/tpami.2018.2819662] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
In the real world, the two main challenges for a stereo vision system are robustness under various radiometric changes and real-time processing. To extract depth information from stereoscopic images, this paper proposes PatchMatch-based robust and fast stereo matching under radiometric changes. For this, a cost function was designed and minimized to estimate an accurate disparity map. Specifically, we used a prior probability to minimize the occlusion region and a smoothness term that considers the convexity of objects to extract a fine disparity map. To evaluate the performance of the proposed scheme, we used Middlebury stereo data sets with radiometric changes. The experimental results showed that the proposed method outperforms state-of-the-art methods by up to 3.35 percent in bad-pixel error and is 4.71 to 27.24 times faster in processing time. We therefore believe the proposed scheme can be a useful tool for computer vision-based applications.
34
Abstract
Stereo matching has been under development for decades and is an important process for many applications. Difficulties in stereo matching include textureless regions, occlusion, illumination variation, the fattening effect, and discontinuity. These challenges are effectively solved in recently developed stereo matching algorithms. A new problem, imperfect rectification, has recently been encountered in stereo matching; it results from the high resolution of stereo images. State-of-the-art stereo matching algorithms fail to exactly reconstruct depth information from stereo images with imperfect rectification, as imperfectly rectified images are not explicitly taken into account. In this paper, we address the imperfect rectification problem and propose stereo matching methods based on absolute differences, squared differences, normalized cross-correlation, zero-mean normalized cross-correlation, and the rank and census transforms. Finally, we conduct experiments to evaluate these stereo matching methods on the Middlebury datasets. The experimental results show that the proposed methods can significantly reduce the error rate for stereo images with imperfect rectification.
35
Fusion of Multi-Sensor-Derived Heights and OSM-Derived Building Footprints for Urban 3D Reconstruction. ISPRS INTERNATIONAL JOURNAL OF GEO-INFORMATION 2019. [DOI: 10.3390/ijgi8040193] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
So-called prismatic 3D building models, following the level-of-detail (LOD) 1 of the OGC City Geography Markup Language (CityGML) standard, are usually generated automatically by combining building footprints with height values. Typically, high-resolution digital elevation models (DEMs) or dense LiDAR point clouds are used to generate these building models. However, high-resolution LiDAR data are usually not available with extensive coverage, whereas globally available DEM data are often not detailed and accurate enough to provide sufficient input to the modeling of individual buildings. Therefore, this paper investigates the possibility of generating LOD1 building models from both volunteered geographic information (VGI) in the form of OpenStreetMap data and remote sensing-derived geodata improved by multi-sensor and multi-modal DEM fusion techniques or produced by synthetic aperture radar (SAR)-optical stereogrammetry. The results of this study show several things: First, it can be seen that the height information resulting from data fusion is of higher quality than the original data sources. Secondly, the study confirms that simple, prismatic building models can be reconstructed by combining OpenStreetMap building footprints and easily accessible, remote sensing-derived geodata, indicating the potential of application on extensive areas. The building models were created under the assumption of flat terrain at a constant height, which is valid in the selected study area.
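The LOD1 ("prismatic") modelling idea, extruding a footprint by a single height value, reduces to a few lines under the same flat-terrain assumption the study makes (the function name and interface below are illustrative):

```python
import numpy as np

def lod1_prism(footprint, height, ground_z=0.0):
    """Extrude a 2D building footprint to an LOD1 prism.

    footprint -- (x, y) vertices of a simple polygon (no self-intersections)
    height    -- building height above the assumed flat terrain at ground_z
    Returns the 2n 3D vertices (base ring, then roof ring) and the prism
    volume: shoelace area times height.
    """
    fp = np.asarray(footprint, dtype=float)
    x, y = fp[:, 0], fp[:, 1]
    # Shoelace formula for the footprint area
    area = 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))
    base = np.column_stack([fp, np.full(len(fp), ground_z)])
    top = np.column_stack([fp, np.full(len(fp), ground_z + height)])
    return np.vstack([base, top]), area * height

square = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
verts, volume = lod1_prism(square, height=6.0)   # a 10 x 10 x 6 m prism
```

In practice the height would come from the fused DEM or SAR-optical stereogrammetry described above, and the footprint from OpenStreetMap.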
36
Accuracy Analysis of a 3D Model of Excavation, Created from Images Acquired with an Action Camera from Low Altitudes. ISPRS INTERNATIONAL JOURNAL OF GEO-INFORMATION 2019. [DOI: 10.3390/ijgi8020083] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]
Abstract
In the last few years, Unmanned Aerial Vehicles (UAVs) equipped with compact digital cameras have become a cheap and efficient alternative to classic aerial and close-range photogrammetry. Low-altitude photogrammetry has great potential not only in the development of orthophoto maps but is also increasingly used in surveying and rapid mapping. This paper presents a practical application of a custom-built low-cost UAV, equipped with an action camera, to acquire images from low altitudes and develop a digital elevation model of an excavation. The conducted analyses examine the possibilities of using low-cost UAVs to deliver useful photogrammetric products. The experiments were carried out on a closed excavation in the town of Mince (north-eastern Poland). The flight over the examined area was carried out autonomously. A photogrammetric network was designed, and the reference areas in the mine were measured using the Global Navigation Satellite System-Real Time Kinematic (GNSS-RTK) method to assess the accuracy of the excavation 3D model. The created numerical terrain model was represented as a dense point cloud. The average height difference between the generated dense point cloud and the reference model was within the range of 0.01–0.13 m. The difference between the excavation volume measured by the GNSS kinematic method and the volume measured on the basis of the dense point cloud was less than 1%. The obtained results show that a low-cost UAV equipped with an action camera with a wide-angle lens allows for obtaining high-accuracy images comparable to classic compact digital cameras.
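A simple grid-based volume estimate of the kind compared against the GNSS measurement can be sketched as follows (illustrative only; the study used photogrammetric processing, and the flat reference surface is an assumption of this sketch):

```python
import numpy as np

def excavation_volume(points, reference_z, cell=1.0):
    """Grid-based excavation volume from a dense point cloud.

    Points are binned into square cells; the volume is the sum over cells of
    (reference ground level - mean point height) * cell area. Assumes the
    pre-excavation surface was flat at reference_z.
    """
    pts = np.asarray(points, dtype=float)
    cells = np.floor(pts[:, :2] / cell).astype(int)
    _, inv = np.unique(cells, axis=0, return_inverse=True)
    inv = inv.ravel()
    mean_z = np.bincount(inv, weights=pts[:, 2]) / np.bincount(inv)
    depth = np.clip(reference_z - mean_z, 0.0, None)  # ignore spoil heaps
    return float(depth.sum() * cell * cell)

# Toy cloud: a 2 m x 2 m pit, 1 m deep, sampled with 4 points per cell
pts = np.array([[x + 0.25 + 0.5 * i, y + 0.25 + 0.5 * j, -1.0]
                for x in range(2) for y in range(2)
                for i in range(2) for j in range(2)])
vol = excavation_volume(pts, reference_z=0.0, cell=1.0)   # 2 * 2 * 1 = 4 m^3
```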
37
Jeon HG, Park J, Choe G, Park J, Bok Y, Tai YW, Kweon IS. Depth from a Light Field Image with Learning-Based Matching Costs. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2019; 41:297-310. [PMID: 29994179 DOI: 10.1109/tpami.2018.2794979] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
One of the core applications of light field imaging is depth estimation. To acquire a depth map, existing approaches apply a single photo-consistency measure to the entire light field. However, this is not an optimal choice because of the non-uniform light field degradations produced by limitations in the hardware design. In this paper, we introduce a pipeline that automatically determines the best configuration of the photo-consistency measure, which leads to the most reliable depth label from the light field. We analyzed the practical factors causing degradation in lenslet light field cameras and designed a learning-based framework that retrieves the best cost measure and the optimal depth label. To enhance the reliability of our method, we augmented an existing light field benchmark to simulate realistic source-dependent noise, aberrations, and vignetting artifacts. The augmented dataset was used for training and validation of the proposed approach. Our method was competitive with several state-of-the-art methods on both the benchmark and real-world light field datasets.
38
Bao Y, Tang L, Breitzman MW, Salas Fernandez MG, Schnable PS. Field-based robotic phenotyping of sorghum plant architecture using stereo vision. Journal of Field Robotics 2018. [DOI: 10.1002/rob.21830]
Affiliation(s)
- Yin Bao
- Department of Agricultural and Biosystems Engineering, Iowa State University, Ames, Iowa
- Lie Tang
- Department of Agricultural and Biosystems Engineering, Iowa State University, Ames, Iowa
39
Abstract
Planet Labs has recently launched a large constellation of small satellites (3U CubeSats) capable of imaging the whole Earth landmass every day. These small satellites capture multiple images of an area on consecutive days, or sometimes on the same day, with a spatial resolution of 3–4 m. Planet Labs endeavors to operate the constellation in a nadir-pointing mode; however, the view angle of these satellites currently varies within a few degrees of nadir, leading to a varying baseline-to-height (B/H) ratio for overlapping image pairs. Due to the relatively small scene footprint and small off-nadir angle, the B/H ratio of overlapping PlanetScope images is often less than 1:10, which is not ideal for 3D reconstruction. Therefore, this paper explores the potential of digital elevation model (DEM) generation from this multi-date, multi-satellite PlanetScope imagery. The DEM is generated from multiple PlanetScope images using a volumetric stereo reconstruction technique that applies semi-global matching in georeferenced object space. The results are evaluated using a LiDAR-based DEM (5 m) over Mount Teide (3718 m) in the Canary Islands and the ALOS DEM (30 m) on the rugged terrain of the Nanga Parbat massif (8126 m) in the western Himalaya. The proposed methodology is then applied to images from two PlanetScope satellite overpasses within a couple of minutes of each other to compute the DEM of the Khurdopin glacier in the Karakoram range, known for its recent surge. The generated elevation models are assessed quantitatively by comparing statistics of the elevation differences between the reference DEMs and the PlanetScope DEM. The Normalized Median of Absolute Deviation (NMAD) of the elevation differences between the computed PlanetScope DEM and the LiDAR DEM is 4.1 m, and that for the ALOS DEM over stable terrain is 3.9 m.
The results show that PlanetScope imagery can yield DEMs of sufficient quality even with a small baseline-to-height ratio. The daily PlanetScope imagery is therefore a valuable data source, and the DEMs generated from it can potentially be employed in numerous applications requiring multi-temporal DEMs.
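For readers unfamiliar with the NMAD statistic used in this assessment, it is a robust spread estimate of the elevation differences; a minimal sketch (the example values are illustrative, not data from the paper):

```python
import numpy as np

def nmad(dh):
    """Normalized Median of Absolute Deviation of elevation differences dh.

    NMAD = 1.4826 * median(|dh - median(dh)|); the constant makes the
    estimate consistent with the standard deviation for Gaussian errors,
    while staying robust to outliers such as matching blunders in a DEM.
    """
    dh = np.asarray(dh, dtype=float)
    return 1.4826 * np.median(np.abs(dh - np.median(dh)))

# Elevation differences with one gross outlier: NMAD largely ignores it,
# whereas a plain standard deviation would be inflated by it.
dh = np.array([-1.0, 0.5, 0.2, -0.3, 0.1, 25.0])
print(round(nmad(dh), 3))  # → 0.593
```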
41
Kim S, Jang J, Lim J, Paik J, Lee S. Disparity-selective stereo matching using correlation confidence measure. Journal of the Optical Society of America A 2018; 35:1653-1662. [PMID: 30183001] [DOI: 10.1364/josaa.35.001653]
Abstract
Recently, cost-volume filtering (CVF) methods for local stereo matching have provided fast and accurate results compared to other methods. However, CVF still produces incorrect results in occluded and texture-free regions. In particular, cost aggregation at the pixel level involves complex computation because of its dependence on the image resolution and search range. This paper presents a stereo matching method that is robust in occluded regions. First, we generate cost volumes using the CENSUS transform and the scale-invariant feature transform (SIFT). Then, label-based cost volumes are aggregated from the two generated cost volumes using adaptive support weights and the simple linear iterative clustering (SLIC) scheme. To obtain the optimal disparity from the two label-based cost volumes, we select the disparity corresponding to the higher-confidence similarity measure (CENSUS or SIFT) at the minimum-cost point. Experimental results show that our method estimates the optimal disparity even in occluded regions that are visible in only one image of a stereo pair.
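For context, the CENSUS transform used as one of the two matching costs here encodes the local intensity ordering around a pixel as a bit string, and the cost between two pixels is the Hamming distance of their codes; a minimal single-pixel sketch (window size and pixel values are illustrative):

```python
import numpy as np

def census(img, y, x, h=1, w=1):
    """Census code of the (2h+1)x(2w+1) window centered at (y, x):
    one bit per neighbor, set when the neighbor is darker than the center."""
    c = img[y, x]
    return [1 if img[y + dy, x + dx] < c else 0
            for dy in range(-h, h + 1)
            for dx in range(-w, w + 1)
            if (dy, dx) != (0, 0)]

def hamming(a, b):
    """Matching cost: number of differing bits between two census codes."""
    return sum(u != v for u, v in zip(a, b))

# Two patches that differ in one neighbor's ordering relative to the center.
left = np.array([[10, 20, 30], [40, 50, 60], [70, 80, 90]])
right = np.array([[10, 20, 30], [40, 50, 60], [70, 80, 5]])
print(hamming(census(left, 1, 1), census(right, 1, 1)))  # → 1
```

Because the code depends only on intensity ordering, this cost is inherently robust to monotonic radiometric changes between the two views.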
42
A hierarchical stereo matching algorithm based on adaptive support region aggregation method. Pattern Recognition Letters 2018. [DOI: 10.1016/j.patrec.2018.07.020]
43
Ahmed S, Hansard M, Cavallaro A. Constrained Optimization for Plane-Based Stereo. IEEE Transactions on Image Processing 2018; 27:3870-3882. [PMID: 29727272] [DOI: 10.1109/tip.2018.2823543]
Abstract
Depth and surface normal estimation are crucial components in understanding 3D scene geometry from calibrated stereo images. In this paper, we propose visibility and disparity-magnitude constraints for slanted patches in the scene. These constraints can be used to associate geometrically feasible planes with each point in the disparity space. The new constraints are validated in the PatchMatch Stereo framework. We use them not only for initialization but also in the local plane refinement step of this iterative algorithm. The proposed constraints increase the probability of estimating correct plane parameters and lead to an improved 3D reconstruction of the scene. Furthermore, the proposed constrained initialization reduces the number of iterations before convergence to the optimal plane parameters. In addition, as most stereo image pairs are not perfectly rectified, we modify the view propagation process by assigning the plane parameters to the neighbors of the candidate pixel. To update the plane parameters in the plane refinement step, we use a gradient-free non-linear optimizer. The benefits of the new initialization, propagation, and refinement schemes are demonstrated.
44
A Miniature Binocular Endoscope with Local Feature Matching and Stereo Matching for 3D Measurement and 3D Reconstruction. Sensors 2018; 18:2243. [PMID: 30002288] [PMCID: PMC6069142] [DOI: 10.3390/s18072243]
Abstract
Because a traditional single-camera endoscope can provide only clear images, without 3D measurement or 3D reconstruction, a miniature binocular endoscope based on the principle of binocular stereoscopic vision is presented to perform 3D measurement and 3D reconstruction in tight, restricted spaces. To realize exact matching of points of interest in the left and right images, a novel construction method for the weighted orthogonal-symmetric local binary pattern (WOS-LBP) descriptor is presented. A stereo matching algorithm based on the Gaussian-weighted AD-Census transform and improved cross-based adaptive regions is then studied to realize 3D reconstruction of real scenes. In the algorithm, we adjust the determination criteria of the adaptive regions, in particular for edge and discontinuous areas, and extract mismatched pixels caused by occlusion using image entropy and a region-growing algorithm. We developed a binocular endoscope with an external diameter of 3.17 mm in which the above algorithms are applied. The endoscope contains two CMOS cameras and four optical fibers for illumination. Three conclusions are drawn from the experiments: (1) the proposed descriptor has good rotation invariance, distinctiveness, and robustness to illumination change and noise; (2) the proposed stereo matching algorithm has a mean relative error of 8.48% on the Middlebury standard image pairs and, compared with several classical stereo matching algorithms, performs better in edge and discontinuous areas; (3) the mean relative error of length measurement is 3.22%, and the endoscope can be used to measure and reconstruct real scenes effectively.
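For background, the plain AD-Census cost that the Gaussian-weighted variant builds on combines an absolute-difference term and a census Hamming term, each mapped through a robust exponential so that neither dominates; a minimal sketch (the λ values are illustrative, and the Gaussian weighting of the paper is omitted):

```python
import math

def robust(cost, lam):
    """Map a raw cost to [0, 1): large costs saturate instead of dominating."""
    return 1.0 - math.exp(-cost / lam)

def ad_census_cost(c_ad, c_census, lam_ad=10.0, lam_census=30.0):
    """Fused matching cost: sum of the two robustly rescaled terms,
    where c_ad is an absolute intensity difference and c_census a
    census Hamming distance."""
    return robust(c_ad, lam_ad) + robust(c_census, lam_census)

# Identical pixels cost 0; a gross intensity outlier saturates near 1
# instead of swamping the census term.
print(ad_census_cost(0.0, 0.0))  # → 0.0
```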
45
Zhu D, Li J, Wang X, Peng J, Shi W, Zhang X. Semantic Edge Based Disparity Estimation Using Adaptive Dynamic Programming for Binocular Sensors. Sensors 2018; 18:1074. [PMID: 29614028] [PMCID: PMC5949043] [DOI: 10.3390/s18041074]
Abstract
Disparity calculation is crucial for binocular sensor ranging. Edge-based disparity estimation is an important branch of sparse stereo matching research and plays an important role in visual navigation. In this paper, we propose a robust sparse stereo matching method based on semantic edges. Simple matching costs are used first, and a novel adaptive dynamic programming algorithm is then proposed to obtain optimal solutions. This algorithm exploits the disparity and semantic consistency constraints between the stereo images to adaptively search parameters, which improves the robustness of our method. The proposed method is compared quantitatively and qualitatively with the traditional dynamic programming method, several dense stereo matching methods, and an advanced edge-based method. Experiments show that our method provides superior performance in these comparisons.
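As an illustration of the underlying idea, a bare-bones dynamic-programming pass over one rectified scanline (with a fixed smoothness penalty; the paper's adaptive, semantics-driven parameter search is not reproduced here) might look like:

```python
import numpy as np

def scanline_dp(left_row, right_row, max_d=4, p_smooth=1.0):
    """Per-scanline disparities by dynamic programming: unary cost
    |I_L(x) - I_R(x - d)| plus p_smooth per unit of disparity change
    between neighboring pixels."""
    n, ndisp = len(left_row), max_d + 1
    unary = np.full((n, ndisp), 1e9)
    for x in range(n):
        for d in range(min(ndisp, x + 1)):  # d <= x keeps x - d in range
            unary[x, d] = abs(float(left_row[x]) - float(right_row[x - d]))
    acc = unary.copy()
    back = np.zeros((n, ndisp), dtype=int)
    for x in range(1, n):  # forward (Viterbi) pass
        for d in range(ndisp):
            trans = acc[x - 1] + p_smooth * np.abs(np.arange(ndisp) - d)
            back[x, d] = int(np.argmin(trans))
            acc[x, d] = unary[x, d] + trans[back[x, d]]
    disp = np.zeros(n, dtype=int)  # backtrack the cheapest path
    disp[-1] = int(np.argmin(acc[-1]))
    for x in range(n - 1, 0, -1):
        disp[x - 1] = back[x, disp[x]]
    return disp

# A scanline whose left image is the right image shifted by 2 pixels,
# so the true disparity is 2 wherever it is measurable.
right_row = np.array([5, 9, 12, 40, 80, 40, 12, 9, 5, 5])
left_row = np.roll(right_row, 2)
print(scanline_dp(left_row, right_row))  # → [0 1 2 2 2 2 2 2 2 2]
```

The smoothness term is what the adaptive method replaces: instead of a constant `p_smooth`, the penalty is tuned per region from disparity or semantic consistency cues.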
Affiliation(s)
- Dongchen Zhu
- Bio-Vision System Laboratory, State Key Laboratory of Transducer Technology, Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, Shanghai 200050, China.
- University of Chinese Academy of Sciences, Beijing 100049, China.
- Jiamao Li
- Bio-Vision System Laboratory, State Key Laboratory of Transducer Technology, Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, Shanghai 200050, China.
- Xianshun Wang
- Bio-Vision System Laboratory, State Key Laboratory of Transducer Technology, Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, Shanghai 200050, China.
- University of Chinese Academy of Sciences, Beijing 100049, China.
- Jingquan Peng
- Bio-Vision System Laboratory, State Key Laboratory of Transducer Technology, Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, Shanghai 200050, China.
- University of Chinese Academy of Sciences, Beijing 100049, China.
- School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China.
- Wenjun Shi
- Bio-Vision System Laboratory, State Key Laboratory of Transducer Technology, Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, Shanghai 200050, China.
- University of Chinese Academy of Sciences, Beijing 100049, China.
- Xiaolin Zhang
- Bio-Vision System Laboratory, State Key Laboratory of Transducer Technology, Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, Shanghai 200050, China.
46
Zhang J, Liu Z, Nezan JF, Zhang G. Correspondence matching among stereo images with object flow and minimum spanning tree aggregation. International Journal of Advanced Robotic Systems 2018. [DOI: 10.1177/1729881418760986]
Affiliation(s)
- Jinglin Zhang
- Nanjing University of Information Science and Technology, Nanjing, Jiangsu, China
- Zhiwei Liu
- The 27th Research Institute of China Electronics Technology Group Corporation, Zhengzhou, China
- Guoyu Zhang
- Nanjing University of Information Science and Technology, Nanjing, Jiangsu, China
47
Khan A, Khan MUK, Kyung CM. Intensity guided cost metric for fast stereo matching under radiometric variations. Optics Express 2018; 26:4096-4111. [PMID: 29475264] [DOI: 10.1364/oe.26.004096]
Abstract
Reliable and efficient stereo matching is a challenging task due to the presence of multiple radiometric variations. Correspondence between the left and right images can become hard to establish owing to low correlation between the radiometric changes in the two images. Previously presented cost metrics are not robust enough against intensive radiometric variations and/or are computationally expensive. In this work, we propose a new similarity metric, the Intensity Guided Cost Metric (IGCM). IGCM contributes significantly to depth accuracy by rejecting outliers and reducing the edge-fattening effect at object boundaries. IGCM is further combined with an explicit color formation model to handle the various radiometric changes that occur between stereo images. Experimental results on the Middlebury dataset show 13.8%, 22.8%, 20.9%, 19.5%, and 9.1% decreases in average error rate compared to Adaptive Normalized Cross-Correlation (ANCC), Dense Adaptive Self-Correlation (DASC), Adaptive Descriptor (AD), Fast Cost Volume Filtering (FCVF), and Iterative Guided Filter (IGF)-based methods, respectively. Moreover, using integral images, IGCM can achieve speedups of 20x, 6x, 41x, 25x, and 45x compared to the aforementioned methods.
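The integral-image speedup mentioned at the end rests on a standard trick: after one pass building a summed-area table, the sum over any rectangular window takes four lookups, so window-based costs become independent of window size. A minimal sketch (the example image is illustrative):

```python
import numpy as np

def integral_image(img):
    """Summed-area table, padded with a leading zero row and column so
    that window sums need no boundary checks."""
    ii = np.cumsum(np.cumsum(np.asarray(img, dtype=np.int64), axis=0), axis=1)
    return np.pad(ii, ((1, 0), (1, 0)))

def box_sum(ii, y0, x0, y1, x1):
    """Sum of img[y0:y1, x0:x1] from four table lookups -- O(1) regardless
    of window size, the source of the reported matching-cost speedups."""
    return ii[y1, x1] - ii[y0, x1] - ii[y1, x0] + ii[y0, x0]

img = np.arange(16).reshape(4, 4)
ii = integral_image(img)
print(box_sum(ii, 1, 1, 3, 3))  # → 30 (5 + 6 + 9 + 10)
```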
48
Wah BW. Fundamental Principles on Learning New Features for Effective Dense Matching. IEEE Transactions on Image Processing 2018; 27:822-836. [PMID: 28920900] [DOI: 10.1109/tip.2017.2752370]
Abstract
In dense matching (including stereo matching and optical flow), nearly all existing approaches compute matching costs from simple features such as gray or RGB color, gradients, or simple transformations like census. These features do not perform well in complex scenes that may involve radiometric changes, noise, overexposure, and/or textureless regions. Various problems may appear, such as wrong matches at the pixel or region level, flattening or breaking of edges, or even entire structural collapse. In this paper, we propose two fundamental principles based on the consistency and the distinctiveness of features. We show that almost all existing problems in dense matching are caused by features that violate one or both of these principles. To systematically learn good features for dense matching, we develop a general multi-objective optimization based on these two principles and apply convolutional neural networks to find new features that lie on the Pareto frontier. Using two-frame optical flow and stereo matching as applications, our experimental results show that the learned features can significantly improve the performance of state-of-the-art approaches. On the KITTI benchmarks, our method ranks first on the two stereo benchmarks and is the best among existing two-frame optical-flow algorithms on the flow benchmarks.
49
Gil G, Savino G, Piantini S, Pierini M. Motorcycles that See: Multifocal Stereo Vision Sensor for Advanced Safety Systems in Tilting Vehicles. Sensors 2018; 18:295. [PMID: 29351267] [PMCID: PMC5795592] [DOI: 10.3390/s18010295]
Abstract
Advanced driver assistance systems (ADAS) have shown the ability to anticipate crashes and effectively assist road users in critical traffic situations. This is not the case for motorcyclists; in fact, ADAS for motorcycles are still barely developed. Our aim was to study a camera-based sensor for preventive safety applications in tilting vehicles. We identified two road conflict situations in which automotive remote sensors installed on a tilting vehicle are likely to fail to identify critical obstacles. Accordingly, we set up two experiments in real traffic conditions to test our stereo vision sensor. Our promising results support the application of this type of sensor for advanced motorcycle safety applications.
Affiliation(s)
- Gustavo Gil
- Dipartimento di Ingegneria Industriale, Università degli Studi di Firenze, Santa Marta 3, 50139 Firenze, Italy.
- Giovanni Savino
- Dipartimento di Ingegneria Industriale, Università degli Studi di Firenze, Santa Marta 3, 50139 Firenze, Italy.
- Accident Research Centre, Monash University, Melbourne, 21 Alliance Lane, Clayton, VIC 3800, Australia.
- Simone Piantini
- Dipartimento di Ingegneria Industriale, Università degli Studi di Firenze, Santa Marta 3, 50139 Firenze, Italy.
- Marco Pierini
- Dipartimento di Ingegneria Industriale, Università degli Studi di Firenze, Santa Marta 3, 50139 Firenze, Italy.
50
Hassner T, Filosof S, Mayzels V, Zelnik-Manor L. SIFTing Through Scales. IEEE Transactions on Pattern Analysis and Machine Intelligence 2017; 39:1431-1443. [PMID: 27448341] [DOI: 10.1109/tpami.2016.2592916]
Abstract
Scale-invariant feature detectors often find stable scales in only a few image pixels. Consequently, methods for feature matching typically choose one of two extreme options: matching a sparse set of scale-invariant features, or dense matching using arbitrary scales. In this paper, we turn our attention to the overwhelming majority of pixels, those where stable scales are not found by standard techniques. We ask: is scale selection necessary for these pixels when dense, scale-invariant matching is required, and if so, how can it be achieved? We make the following contributions: (i) we show that features computed over different scales, even in low-contrast areas, can differ, and that selecting a single scale, arbitrarily or otherwise, may lead to poor matches when the images have different scales; (ii) we show that representing each pixel as a set of SIFT descriptors, extracted at multiple scales, allows far better matches than single-scale descriptors, but at a computational price; and (iii) we demonstrate that each such set may be accurately represented by a low-dimensional linear subspace. A subspace-to-point mapping may further be used to produce a novel descriptor representation, the Scale-Less SIFT (SLS), as an alternative to single-scale descriptors. These claims are verified by quantitative and qualitative tests demonstrating significant improvements over existing methods. A preliminary version of this work appeared in [1].