1
Arshad S, Park TH. SVS-VPR: A Semantic Visual and Spatial Information-Based Hierarchical Visual Place Recognition for Autonomous Navigation in Challenging Environmental Conditions. Sensors (Basel). 2024;24:906. [PMID: 38339624] [PMCID: PMC10857550] [DOI: 10.3390/s24030906]
Abstract
Robust visual place recognition (VPR) enables mobile robots to identify previously visited locations. For this purpose, the extracted visual information and the place matching method play a significant role. In this paper, we critically review existing VPR methods and group them into three major categories based on the visual information used, i.e., handcrafted features, deep features, and semantics. Focusing on the benefits of convolutional neural networks (CNNs) and semantics, and the limitations of existing research, we propose a robust appearance-based place recognition method, termed SVS-VPR, implemented as a hierarchical model consisting of two major components: global scene-based and local feature-based matching. The global scene semantics are extracted and compared with previously visited images to filter the match candidates, reducing the search space and computational cost. The local feature-based matching involves the extraction of robust local features from a CNN, possessing invariant properties against environmental conditions, and a place matching method utilizing semantic, visual, and spatial information. SVS-VPR is evaluated on publicly available benchmark datasets using the true positive detection rate, recall at 100% precision, and area under the curve. Experimental findings demonstrate that SVS-VPR surpasses several state-of-the-art deep learning-based methods, boosting robustness against significant changes in viewpoint and appearance while maintaining efficient matching time performance.
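The hierarchical matching idea described in this abstract can be illustrated with a toy sketch (this is not the authors' implementation; the histogram descriptors, feature sets, and threshold are hypothetical): a cheap global-semantics filter prunes the database before a more expensive local-feature comparison.

```python
# Hypothetical sketch of two-stage hierarchical place matching: a global
# semantic-histogram filter reduces the search space, then local features
# score the surviving candidates.

def semantic_histogram_similarity(h1, h2):
    """Cosine similarity between two semantic-label histograms."""
    dot = sum(a * b for a, b in zip(h1, h2))
    n1 = sum(a * a for a in h1) ** 0.5
    n2 = sum(b * b for b in h2) ** 0.5
    return dot / (n1 * n2) if n1 and n2 else 0.0

def hierarchical_match(query, database, sem_threshold=0.8):
    # Stage 1: global scene semantics filter the candidate set.
    candidates = [
        img for img in database
        if semantic_histogram_similarity(query["sem"], img["sem"]) >= sem_threshold
    ]
    # Stage 2: local-feature matching only over the reduced search space.
    def local_score(img):
        shared = set(query["features"]) & set(img["features"])
        return len(shared) / max(len(query["features"]), 1)
    return max(candidates, key=local_score, default=None)
```

The point of the two-stage design is that stage 1 is much cheaper than stage 2, so most database images are rejected before any local matching runs.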
Affiliation(s)
- Saba Arshad
- Industrial Artificial Intelligence Research Center, Chungbuk National University, Cheongju 28644, Republic of Korea;
- Tae-Hyoung Park
- Department of Intelligent Systems and Robotics, Chungbuk National University, Cheongju 28644, Republic of Korea
2
Song S, Yu F, Jiang X, Zhu J, Cheng W, Fang X. Loop closure detection of visual SLAM based on variational autoencoder. Front Neurorobot. 2024;17:1301785. [PMID: 38313328] [PMCID: PMC10837850] [DOI: 10.3389/fnbot.2023.1301785]
Abstract
Loop closure detection is an important module of simultaneous localization and mapping (SLAM). Correct detection of loops can reduce cumulative drift in positioning. Because traditional detection methods rely on handcrafted features, false positive detections can occur when the environment changes, resulting in incorrect estimates and an inability to obtain accurate maps. In this paper, a loop closure detection method based on a variational autoencoder (VAE) is proposed. The VAE serves as a feature extractor, replacing the handcrafted features of traditional methods with features extracted by a neural network; it extracts a low-dimensional vector as the representation of each image. At the same time, an attention mechanism is added to the network and constraints are added to the loss function for better image representation. In the back-end feature matching process, geometric checking is used to filter out incorrect matches and address the false-positive problem. Finally, numerical experiments demonstrate that the proposed method achieves a better precision-recall curve than the traditional bag-of-words model and other deep learning methods, and is highly robust to environmental changes. In addition, experiments on datasets from three different scenarios demonstrate that the method can be applied in real-world settings with good performance.
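The back-end matching step described above can be sketched as a toy (not the paper's code; the latent vectors, distance threshold, and geometric-check callback are assumptions): candidate keyframes are ranked by distance in the learned latent space, and a geometric check rejects false positives.

```python
# Illustrative loop-candidate retrieval over encoder latent codes, with a
# pluggable geometric verification step to filter false positives.

def l2(u, v):
    """Euclidean distance between two latent vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5

def detect_loop(query_vec, keyframes, dist_threshold, geometric_check):
    """Return the first keyframe whose latent code is close enough to the
    query AND passes geometric verification, else None."""
    ranked = sorted(keyframes, key=lambda kf: l2(query_vec, kf["z"]))
    for kf in ranked:
        if l2(query_vec, kf["z"]) > dist_threshold:
            break  # remaining candidates are even farther away
        if geometric_check(kf):
            return kf
    return None
```

In a real system `geometric_check` would estimate a relative pose (e.g. with RANSAC over feature correspondences) and accept the loop only if enough inliers survive.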
Affiliation(s)
- Shibin Song
- College of Electrical Engineering and Automation, Shandong University of Science and Technology, Qingdao, China
- Fengjie Yu
- College of Electrical Engineering and Automation, Shandong University of Science and Technology, Qingdao, China
- Xiaojie Jiang
- Yantai Tulan Electronic Technology Co., Ltd, Yantai, China
- Jie Zhu
- College of Electrical Engineering and Automation, Shandong University of Science and Technology, Qingdao, China
- Weihao Cheng
- College of Electrical Engineering and Automation, Shandong University of Science and Technology, Qingdao, China
- Xiao Fang
- College of Electrical Engineering and Automation, Shandong University of Science and Technology, Qingdao, China
3
A Review of Common Techniques for Visual Simultaneous Localization and Mapping. Journal of Robotics. 2023. [DOI: 10.1155/2023/8872822]
Abstract
Mobile robots are widely used in medicine, agriculture, home furnishing, and industry. Simultaneous localization and mapping (SLAM) is the working basis of mobile robots, so research on SLAM technology is both necessary and meaningful. SLAM involves robot kinematics, logic, mathematics, perceptual detection, and other fields; classifying this technical content is difficult, which has led to diverse SLAM frameworks. Among them, visual SLAM (V-SLAM) has become a key academic focus due to its advantages of low price, easy installation, and simple algorithm models. Firstly, we illustrate the superiority of V-SLAM by comparing it with other localization techniques. Secondly, we survey open-source V-SLAM algorithms and compare their real-time performance, robustness, and innovation. Then, we analyze the frameworks, mathematical models, and related basic theory of V-SLAM. Meanwhile, we review related works from four aspects: visual odometry, back-end optimization, loop closure detection, and mapping. Finally, we discuss future development trends and lay a foundation for researchers to expand on this work. In all, this paper classifies each module of V-SLAM in detail and aims to provide readers with a comprehensive, readable review.
4
5
Zhang X, Zheng L, Tan Z, Li S. Loop Closure Detection Based on Residual Network and Capsule Network for Mobile Robot. Sensors (Basel). 2022;22:7137. [PMID: 36236235] [PMCID: PMC9573234] [DOI: 10.3390/s22197137]
Abstract
Loop closure detection based on a residual network (ResNet) and a capsule network (CapsNet) is proposed to address the problems of low accuracy and poor robustness for mobile robot simultaneous localization and mapping (SLAM) in complex scenes. First, the residual network of a feature coding strategy is introduced to extract the shallow geometric features and deep semantic features of images, reduce the amount of image noise information, accelerate the convergence speed of the model, and solve the problems of gradient disappearance and network degradation of deep neural networks. Then, the dynamic routing mechanism of the capsule network is optimized through the entropy peak density, and a vector is used to represent the spatial position relationship between features, which can improve the ability of image feature extraction and expression to optimize the overall performance of networks. Finally, the optimized residual network and capsule network are fused to retain the differences and correlations between features, and the global feature descriptors and feature vectors are combined to calculate the similarity of image features for loop closure detection. The experimental results show that the proposed method can achieve loop closure detection for mobile robots in complex scenes, such as view changes, illumination changes, and dynamic objects, and improve the accuracy and robustness of mobile robot SLAM.
Affiliation(s)
- Xin Zhang
- School of Mechanical Engineering, Shenyang Ligong University, Shenyang 110159, China
- Shenyang Institute of Computing Technology Co., Ltd., Chinese Academy of Sciences, Shenyang 110168, China
- Software College, Northeastern University, Shenyang 110169, China
- Liaomo Zheng
- Shenyang Institute of Computing Technology Co., Ltd., Chinese Academy of Sciences, Shenyang 110168, China
- Zhenhua Tan
- Software College, Northeastern University, Shenyang 110169, China
- Suo Li
- School of Mechanical Engineering, Shenyang Ligong University, Shenyang 110159, China
6
Duszak P. SLAM on the Hexagonal Grid. Sensors (Basel). 2022;22:6221. [PMID: 36015980] [PMCID: PMC9415786] [DOI: 10.3390/s22166221]
Abstract
Hexagonal grids have many advantages over square grids and could be successfully used in mobile robotics as a map representation. However, an essential algorithm is missing, namely SLAM (simultaneous localization and mapping), that would generate a map directly on the hexagonal grid. In this paper, this issue is addressed. The solution is based on scan matching and solving the least-squares problem with the Gauss-Newton formula, modified with the Lagrange multiplier theorem; this is necessary to fulfill the constraints given by the manifold. The algorithm was tested in a synthetic environment and on a real robot and proved fully suitable for the task. It generates a very accurate map and generally achieves even better precision than a similar approach implemented on a square lattice.
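The paper's constrained Gauss-Newton step is not reproduced here, but a standard building block for any hexagonal-grid map is snapping a continuous 2-D point to its nearest cell. The sketch below uses axial coordinates and cube rounding (the pointy-top layout and cell size are assumptions, not details from the paper).

```python
# Snap a continuous 2-D point to the nearest hexagonal cell (axial q, r),
# pointy-top layout, via the standard cube-rounding procedure.

def point_to_hex(x, y, size):
    """Nearest hex cell (axial q, r) for a point; `size` is the hex radius."""
    q = (3 ** 0.5 / 3 * x - 1 / 3 * y) / size
    r = (2 / 3 * y) / size
    # Cube rounding: the invariant q + r + s == 0 must be preserved,
    # so the coordinate with the largest rounding error is recomputed.
    s = -q - r
    rq, rr, rs = round(q), round(r), round(s)
    dq, dr, ds = abs(rq - q), abs(rr - r), abs(rs - s)
    if dq > dr and dq > ds:
        rq = -rr - rs
    elif dr > ds:
        rr = -rq - rs
    return rq, rr
```

An occupancy map on the hexagonal grid can then be stored as a dictionary keyed by these `(q, r)` pairs, just as a square-grid map is keyed by integer row and column.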
Affiliation(s)
- Piotr Duszak
- Institute of Automatic Control and Robotics, Warsaw University of Technology, 02-525 Warsaw, Poland
7
An improved adaptive ORB-SLAM method for monocular vision robot under dynamic environments. Int J Mach Learn Cyb. 2022. [DOI: 10.1007/s13042-022-01627-2]
8
Abstract
Visual SLAM (VSLAM) has been developing rapidly due to its advantages of low-cost sensors, easy fusion with other sensors, and richer environmental information. Traditional vision-based SLAM research has made many achievements, but it may fail to achieve the desired results in challenging environments. Deep learning has promoted the development of computer vision, and the combination of deep learning and SLAM has attracted increasing attention. Semantic information, as high-level environmental information, can enable robots to better understand their surroundings. This paper introduces the development of VSLAM technology from two aspects: traditional VSLAM and semantic VSLAM combined with deep learning. For traditional VSLAM, we summarize the advantages and disadvantages of indirect and direct methods in detail and present some classical open-source VSLAM algorithms. In addition, we focus on the development of semantic VSLAM based on deep learning. Starting with the typical neural networks CNN and RNN, we detail how neural networks have improved the VSLAM system. We then examine how object detection and semantic segmentation introduce semantic information into VSLAM. We believe the coming intelligent era cannot develop without semantic technology: introducing deep learning into the VSLAM system to provide semantic information can help robots better perceive the surrounding environment and provide people with higher-level assistance.
9
Zhang H, Zhao T, Zhong Y, Yin Y, Yuan H, Dian S. An efficient loop closure detection method based on spatially constrained feature matching. Intell Serv Robot. 2022. [DOI: 10.1007/s11370-022-00423-9]
10
Ikram MH, Khaliq S, Anjum ML, Hussain W. Perceptual Aliasing++: Adversarial Attack for Visual SLAM Front-End and Back-End. IEEE Robot Autom Lett. 2022. [DOI: 10.1109/lra.2022.3150031]
11
Yang B, Xu X, Ren J, Cheng L, Guo L, Zhang Z. SAM-Net: Semantic probabilistic and attention mechanisms of dynamic objects for self-supervised depth and camera pose estimation in visual odometry applications. Pattern Recognit Lett. 2022. [DOI: 10.1016/j.patrec.2021.11.028]
12
Zhao D, Zhang Z, Lu H, Cheng S, Si B, Feng X. Learning Cognitive Map Representations for Navigation by Sensory-Motor Integration. IEEE Trans Cybern. 2022;52:508-521. [PMID: 32275629] [DOI: 10.1109/tcyb.2020.2977999]
Abstract
How to transform a mixed flow of sensory and motor information into a memory state of self-location, and how to build map representations of the environment, are central questions in navigation research. Studies in neuroscience have shown that place cells in the hippocampus of the rodent brain form dynamic cognitive representations of locations in the environment. We propose a neural-network model called the sensory-motor integration network model (SeMINet) to learn cognitive map representations by integrating sensory and motor information while an agent explores a virtual environment. This biologically inspired model consists of a deep neural network representing visual features of the environment, a recurrent network of place units encoding spatial information by sensorimotor integration, and a secondary network that decodes the agent's location from the spatial representations. The recurrent connections between place units sustain an activity bump in the network without the need for sensory inputs, and the asymmetry in the connections propagates the activity bump, forming a dynamic memory state that matches the motion of the agent. A competitive learning process establishes the association between the sensory representations and the memory state of the place units and is able to correct cumulative path-integration errors. Simulation results demonstrate that the network forms neural codes conveying location information independent of the agent's head direction. The decoding network reliably predicts the location even when movement is subject to noise. The proposed SeMINet thus provides a brain-inspired neural-network model for a cognitive map updated by both self-motion cues and visual cues.
13
Sun L, Singh RP, Kanehiro F. Visual SLAM Framework Based on Segmentation with the Improvement of Loop Closure Detection in Dynamic Environments. Journal of Robotics and Mechatronics. 2021. [DOI: 10.20965/jrm.2021.p1385]
Abstract
Most simultaneous localization and mapping (SLAM) systems assume that SLAM is conducted in a static environment. When SLAM is used in dynamic environments, the accuracy of each part of the system is adversely affected; we term this problem dynamic SLAM. In this study, we propose solutions for three main problems in dynamic SLAM: camera tracking, three-dimensional map reconstruction, and loop closure detection. We propose to employ a geometry-based method, a deep learning-based method, and their combination for object segmentation. Using the segmentation information to generate a mask, we filter out both the keypoints that cause errors in visual odometry and the CNN features extracted from dynamic areas, improving the performance of loop closure detection. We then validate the proposed loop closure detection method using the precision-recall curve and confirm the framework's performance on multiple datasets. The absolute trajectory error and relative pose error are used as metrics to evaluate the accuracy of the proposed SLAM framework against state-of-the-art methods. The findings can potentially improve the robustness of SLAM in situations where mobile robots work alongside humans, while the object-based point cloud byproduct has potential for other robotics tasks.
14
Tang L, Wang Y, Tan Q, Xiong R. Explicit feature disentanglement for visual place recognition across appearance changes. Int J Adv Robot Syst. 2021. [DOI: 10.1177/17298814211037497]
Abstract
In the long-term deployment of mobile robots, changing appearance brings challenges for localization. When a robot travels to the same place or restarts from an existing map, global localization is needed, where place recognition provides coarse position information. For visual sensors, appearance changes such as the transition from day to night and seasonal variation can reduce the performance of a visual place recognition system. To address this problem, we propose to learn domain-unrelated features across extreme appearance changes, where a domain denotes a specific appearance condition, such as a season or a kind of weather. We use an adversarial network with two discriminators to disentangle domain-related and domain-unrelated features from images, and the domain-unrelated features are used as descriptors in place recognition. Given images from different domains, our network is trained in a self-supervised manner that does not require correspondences between the domains. Moreover, our feature extractors are shared among all domains, making it possible to cover more appearance conditions without increasing model complexity. Qualitative and quantitative results on two toy cases show that our network can disentangle domain-related and domain-unrelated features from given data. Experiments on three public datasets and one proposed dataset for visual place recognition illustrate the performance of our method compared with several typical algorithms. An ablation study validates the effectiveness of the introduced discriminators, and a four-domain dataset verifies that the network extends to multiple domains with one model while achieving similar performance.
Affiliation(s)
- Li Tang
- Department of Control Science and Engineering, Zhejiang University, Hangzhou, People’s Republic of China
- Yue Wang
- Department of Control Science and Engineering, Zhejiang University, Hangzhou, People’s Republic of China
- Qimeng Tan
- Beijing Key Laboratory of Intelligent Space Robotic System Technology and Applications, Beijing Institute of Spacecraft System Engineering, Beijing, People’s Republic of China
- Rong Xiong
- Department of Control Science and Engineering, Zhejiang University, Hangzhou, People’s Republic of China
15
Yuan Q, Zhang Z, Pi Y, Kou L, Zhang F. Real-Time Closed-Loop Detection Method of vSLAM Based on a Dynamic Siamese Network. Sensors (Basel). 2021;21:7612. [PMID: 34833691] [PMCID: PMC8622372] [DOI: 10.3390/s21227612]
Abstract
Because visual simultaneous localization and mapping (vSLAM) is easily disturbed by changes of camera viewpoint and scene appearance when building a globally consistent map, the robustness and real-time performance of key-frame image selection cannot meet requirements. To solve this problem, a real-time closed-loop detection method based on a dynamic Siamese network is proposed in this paper. First, a dynamic Siamese network-based fast conversion learning model is constructed to handle the impact of external changes on key-frame judgments, and an elementwise convergence strategy is adopted to ensure accurate positioning of key frames in the closed-loop judgment process. Second, a joint training strategy is designed so that the model parameters can be learned offline in parallel from tagged video sequences, which effectively improves the speed of closed-loop detection. Finally, the proposed method is applied experimentally to three typical closed-loop detection scenario datasets, and the results demonstrate its effectiveness and robustness under the interference of complex scenes.
Affiliation(s)
- Quande Yuan
- School of Computer Technology and Engineering, Changchun Institute of Technology, Changchun 130012, China;
- National Local Joint Engineering Research Center for Smart Distribution, Grid Measurement and Control with Safety Operation Technology, Changchun Institute of Technology, Changchun 130012, China
- Zhenming Zhang
- School of Electrical Engineering, Northeast Electric Power University, Jilin 132011, China
- Yuzhen Pi
- National Local Joint Engineering Research Center for Smart Distribution, Grid Measurement and Control with Safety Operation Technology, Changchun Institute of Technology, Changchun 130012, China
- School of Electrical Engineering and Information Technology, Changchun Institute of Technology, Changchun 130012, China
- Lei Kou
- Institute of Oceanographic Instrumentation, Qilu University of Technology (Shandong Academy of Sciences), Qingdao 266075, China
- Fangfang Zhang
- School of Electrical Engineering and Automation, Qilu University of Technology (Shandong Academy of Sciences), Jinan 250353, China
16
Abstract
This study analyses the main challenges, trends, technological approaches, and artificial intelligence methods developed by new researchers and professionals in the field of machine learning, with an emphasis on the most outstanding and relevant works to date. This literature review evaluates the main methodological contributions of artificial intelligence through machine learning. The methodology used to study the documents was content analysis; the basic terminology of the study corresponds to machine learning, artificial intelligence, and big data between the years 2017 and 2021. For this study, we selected 181 references, of which 120 are part of the literature review. The conceptual framework includes 12 categories, four groups, and eight subgroups. The study of data management using AI methodologies presents symmetry across the four machine learning groups: supervised learning, unsupervised learning, semi-supervised learning, and reinforcement learning. Furthermore, the artificial intelligence methods showing the most symmetry across all groups are artificial neural networks, support vector machines, K-means, and Bayesian methods. Finally, five research avenues are presented to improve the prediction of machine learning.
17
Roy P, Chowdhury C. A Survey of Machine Learning Techniques for Indoor Localization and Navigation Systems. J Intell Robot Syst. 2021. [DOI: 10.1007/s10846-021-01327-z]
18
Arshad S, Kim GW. Role of Deep Learning in Loop Closure Detection for Visual and Lidar SLAM: A Survey. Sensors (Basel). 2021;21:1243. [PMID: 33578695] [PMCID: PMC7916334] [DOI: 10.3390/s21041243]
Abstract
Loop closure detection is of vital importance in the process of simultaneous localization and mapping (SLAM), as it helps to reduce the cumulative error of the robot's estimated pose and generate a consistent global map. Many variations of this problem have been considered in the past, and existing methods differ in the acquisition approach of query and reference views, the choice of scene representation, and the associated matching strategy. The contributions of this survey are manifold. It provides a thorough study of the existing literature on loop closure detection algorithms for visual and Lidar SLAM and discusses their insights along with their limitations. It presents a taxonomy of state-of-the-art deep learning-based loop detection algorithms with detailed comparison metrics, and it identifies the major challenges of conventional approaches. Based on those challenges, deep learning-based methods are reviewed, focusing on those that tackle the identified challenges and provide long-term autonomy in varying conditions such as changing weather, light, seasons, viewpoint, and occlusion due to the presence of mobile objects. Furthermore, open challenges and future directions are discussed.
19
Wang Z, Peng Z, Guan Y, Wu L. Two-Stage vSLAM Loop Closure Detection Based on Sequence Node Matching and Semi-Semantic Autoencoder. J Intell Robot Syst. 2021. [DOI: 10.1007/s10846-020-01302-0]
20
Robust Loop Closure Detection Integrating Visual–Spatial–Semantic Information via Topological Graphs and CNN Features. Remote Sensing. 2020. [DOI: 10.3390/rs12233890]
Abstract
Loop closure detection is a key module for visual simultaneous localization and mapping (SLAM). Most previous methods for this module have not made full use of the information provided by images: they have used only the visual appearance or considered only the spatial relationships of landmarks, so the visual, spatial, and semantic information have not been fully integrated. In this paper, a robust loop closure detection approach integrating visual–spatial–semantic information is proposed, employing topological graphs and convolutional neural network (CNN) features. Firstly, to reduce mismatches under different viewpoints, semantic topological graphs are introduced to encode the spatial relationships of landmarks, and random walk descriptors are employed to characterize the topological graphs for graph matching. Secondly, dynamic landmarks are eliminated by using semantic information, and distinctive landmarks are selected for loop closure detection, thus alleviating the impact of dynamic scenes. Finally, to ease the effect of appearance changes, an appearance-invariant descriptor of each landmark region is extracted by a pre-trained CNN without specially designed manual features. The proposed approach weakens the influence of viewpoint changes and dynamic scenes, and extensive experiments conducted on open datasets and a mobile robot demonstrate that it performs more satisfactorily than state-of-the-art methods.
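The random-walk characterization of semantic topological graphs can be sketched as a toy (not the authors' code; the graph encoding, walk length, and similarity measure are assumptions): each landmark node is described by the set of semantic-label sequences visited by short random walks, and descriptors are compared by set overlap.

```python
# Toy random-walk descriptor for a semantic topological graph: nodes are
# landmarks, labels are their semantic classes, and a descriptor is the set
# of label sequences produced by fixed-length walks from a node.
import random

def random_walk_descriptor(graph, labels, start, walk_len=3, n_walks=32, seed=0):
    """graph: node -> list of neighbour nodes; labels: node -> semantic label."""
    rng = random.Random(seed)  # seeded for repeatable descriptors
    walks = set()
    for _ in range(n_walks):
        node, seq = start, [labels[start]]
        for _ in range(walk_len):
            nbrs = graph[node]
            if not nbrs:
                break
            node = rng.choice(nbrs)
            seq.append(labels[node])
        walks.add(tuple(seq))
    return walks

def descriptor_similarity(d1, d2):
    """Jaccard overlap between two walk-descriptor sets."""
    return len(d1 & d2) / len(d1 | d2) if d1 or d2 else 1.0
```

Because the descriptor records only label sequences, not absolute positions, two views of the same place taken from different viewpoints can still produce overlapping descriptors.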
21
22
Incremental Pose Map Optimization for Monocular Vision SLAM Based on Similarity Transformation. Sensors (Basel). 2019;19:4945. [PMID: 31766236] [PMCID: PMC6891346] [DOI: 10.3390/s19224945]
Abstract
The novel contribution of this paper is an incremental pose map optimization for monocular vision simultaneous localization and mapping (SLAM) based on similarity transformation, which can effectively solve the scale drift problem of monocular SLAM and eliminate cumulative error through global optimization. With a mixed inverse depth estimation method based on a probability graph, the uncertainty of depth estimation is effectively handled and its robustness improved. Firstly, this paper proposes a front end that combines a sparse direct method based on histogram equalization with the feature point method, and uses the mixed inverse depth estimation method based on a probability graph to estimate depth information. Then, a bag-of-words model based on mean-initialized K-means is proposed for closed-loop feature detection. Finally, an incremental pose map optimization method based on similarity transformation is proposed for the back end to optimize the pose and depth information of the camera; when a closed loop is detected, global optimization is carried out to effectively eliminate the cumulative error of the system. Indoor and outdoor experiments on open datasets such as TUM and KITTI fully demonstrate the effectiveness of this method, and closed-loop detection experiments with hand-held cameras verify the importance of closed-loop detection. The method effectively solves the scale drift problem of monocular vision SLAM and has strong robustness.
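The role of similarity (rather than rigid) transformations can be illustrated with a minimal 2-D sketch (illustrative only; the paper optimizes pose graphs over the full Sim(3) group): a monocular loop-closure constraint may only be satisfiable once a scale factor s accompanies rotation and translation, which is exactly what a similarity transform provides.

```python
# Apply a 2-D similarity transform p' = s * R(theta) * p + t.
# The extra scale parameter s is what distinguishes Sim(2)/Sim(3) from a
# rigid-body transform and lets pose-graph optimization absorb scale drift.
import math

def sim2_apply(s, theta, t, p):
    """s: scale, theta: rotation (rad), t: (tx, ty), p: point (x, y)."""
    c, si = math.cos(theta), math.sin(theta)
    x, y = p
    return (s * (c * x - si * y) + t[0], s * (si * x + c * y) + t[1])
```

With s fixed to 1 this reduces to an ordinary rigid transform, which cannot correct a map whose scale has drifted between the two ends of a loop.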
23
A High-Accuracy Indoor-Positioning Method with Automated RGB-D Image Database Construction. Remote Sensing. 2019. [DOI: 10.3390/rs11212572]
Abstract
High-accuracy indoor positioning is a prerequisite to satisfy the increasing demands of position-based services in complex indoor scenes. Current indoor visual-positioning methods mainly include image retrieval-based methods, visual landmark-based methods, and learning-based methods. To better overcome the limitations of traditional methods, such as being labor-intensive, inaccurate, and time-consuming, this paper proposes a novel indoor-positioning method with automated red, green, blue and depth (RGB-D) image database construction. First, strategies for automated database construction are developed to reduce the workload of manually selecting database images and to meet the requirements of high-accuracy indoor positioning; the database is constructed automatically according to rules, which is more objective and improves the efficiency of the image-retrieval process. Second, by combining the automated database construction module, a convolutional neural network (CNN)-based image-retrieval module, and a strict geometric relations-based pose estimation module, we obtain a high-accuracy indoor-positioning system. Extensive experiments on a public indoor environment dataset demonstrate the effectiveness and efficiency of the method.
|
24
|
Su H, Fu Z, Wen Z. SFPSO algorithm-based multi-scale progressive inversion identification for structural damage in concrete cut-off wall of embankment dam. Appl Soft Comput 2019. [DOI: 10.1016/j.asoc.2019.105679] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022]
|
25
|
|
26
|
Abstract
This paper focuses on loop-closure detection (LCD) for a visual simultaneous localization and mapping (SLAM) system. We present a strategy that combines a Bayes filter and features from a pre-trained convolutional neural network (CNN) to perform LCD. Rather than using features from only one layer, we fuse features from multiple layers based on spatial pyramid pooling. A flexible Bayes model is then formulated to integrate the sequential information and the similarities computed from features at different scales. The introduction of a penalty factor and bidirectional propagation enables our approach to handle complex trajectories. We evaluate our approach through extensive experiments on challenging datasets and compare it to state-of-the-art methods. The results show that our approach ensures remarkable performance under severe condition changes and handles trajectories with different characteristics. We also show the advantages of Bayes filters over sequence matching in the experiments, and we analyze our feature fusion strategy by visualizing the activations of the CNN.
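A discrete Bayes filter for LCD maintains a belief over place hypotheses, predicts with a motion model that spreads probability to neighbouring places, and updates with the image-similarity likelihood. The sketch below is a generic illustration of that recursion, assuming a simple symmetric transition model with a made-up `stay` parameter; it is not the paper's specific model (which adds a penalty factor and bidirectional propagation).

```python
import numpy as np

def bayes_lcd_update(belief, likelihood, stay=0.6):
    """One discrete Bayes-filter step over N place hypotheses.

    Prediction: each hypothesis keeps probability `stay` and leaks the rest
    equally to its two neighbours (a crude sequential-motion model);
    update: multiply by the similarity likelihood and renormalise."""
    predicted = stay * belief
    leak = (1.0 - stay) * belief / 2.0
    predicted[:-1] += leak[1:]    # mass flowing backward
    predicted[1:] += leak[:-1]    # mass flowing forward
    predicted[0] += leak[0]       # boundary mass stays put
    predicted[-1] += leak[-1]
    posterior = predicted * likelihood
    return posterior / posterior.sum()

belief = np.full(5, 0.2)                           # uniform prior over 5 places
likelihood = np.array([0.1, 0.1, 0.9, 0.2, 0.1])   # similarity to current image
for _ in range(3):                                 # three consistent observations
    belief = bayes_lcd_update(belief, likelihood)
print(belief.argmax())  # 2: mass concentrates on the matching place
```

The temporal smoothing is what lets the filter reject single-frame false positives that would fool a pure similarity threshold.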
Affiliation(s)
- Qiang Liu
- School of Mechanical Engineering, Dalian University of Technology, Dalian 116024, Liaoning, P. R. China
- Fuhai Duan
- School of Mechanical Engineering, Dalian University of Technology, Dalian 116024, Liaoning, P. R. China
|
27
|
Loop Closure Detection Based on Multi-Scale Deep Feature Fusion. APPLIED SCIENCES-BASEL 2019. [DOI: 10.3390/app9061120] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
Loop closure detection plays a very important role in the mobile robot navigation field. It is useful for achieving accurate navigation in complex environments and reducing the cumulative error of the robot’s pose estimation. The current mainstream methods are based on the visual bag-of-words model, but traditional image features are sensitive to illumination changes. This paper proposes a loop closure detection algorithm based on multi-scale deep feature fusion, which uses a Convolutional Neural Network (CNN) to extract higher-level, more abstract features. To handle input images of different sizes and enrich the receptive fields of the feature extractor, this paper uses multi-scale spatial pyramid pooling (SPP) to fuse the features. In addition, considering the different contributions of each feature to loop closure detection, the paper defines a distinguishability weight for features and uses it in the similarity measurement, which reduces the probability of false positives in loop closure detection. The experimental results show that the loop closure detection algorithm based on multi-scale deep feature fusion achieves higher precision and recall rates and is more robust to illumination changes than the mainstream methods.
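Spatial pyramid pooling is what makes a CNN feature map of arbitrary spatial size yield a fixed-length descriptor: the map is max-pooled over grids of 1×1, 2×2, and 4×4 cells and the results concatenated. A minimal NumPy sketch of that operation (the pyramid levels and channel count are the common defaults, assumed here rather than taken from the paper):

```python
import numpy as np

def spp_descriptor(fmap, levels=(1, 2, 4)):
    """Spatial pyramid pooling: max-pool a C×H×W feature map over grids of
    1×1, 2×2, and 4×4 cells, concatenating per-cell channel maxima into a
    fixed-length descriptor independent of H and W."""
    c, h, w = fmap.shape
    parts = []
    for n in levels:
        hs = np.linspace(0, h, n + 1).astype(int)   # cell boundaries in height
        ws = np.linspace(0, w, n + 1).astype(int)   # cell boundaries in width
        for i in range(n):
            for j in range(n):
                cell = fmap[:, hs[i]:hs[i + 1], ws[j]:ws[j + 1]]
                parts.append(cell.max(axis=(1, 2)))  # one C-vector per cell
    return np.concatenate(parts)

# Inputs of different spatial size yield descriptors of identical length:
# (1 + 4 + 16) cells × 8 channels = 168 dimensions.
d1 = spp_descriptor(np.random.rand(8, 13, 17))
d2 = spp_descriptor(np.random.rand(8, 32, 32))
print(d1.shape)  # (168,)
```

Because the descriptor length is fixed, loop closure candidates can be compared directly regardless of the input image resolution.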
|
28
|
Wu Y, Tang F, Li H. Image-based camera localization: an overview. Vis Comput Ind Biomed Art 2018; 1:8. [PMID: 32240389 PMCID: PMC7099558 DOI: 10.1186/s42492-018-0008-z] [Citation(s) in RCA: 32] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/03/2018] [Accepted: 07/06/2018] [Indexed: 11/22/2022] Open
Abstract
Virtual reality, augmented reality, robotics, and autonomous driving have recently attracted much attention from both academic and industrial communities, and image-based camera localization is a key task in all of them. However, there has not been a complete review of image-based camera localization, and a survey of this topic is needed to enable newcomers to enter the field quickly. In this paper, an overview of image-based camera localization is presented. A new and complete classification of image-based camera localization approaches is provided, the related techniques are introduced, and trends for future development are discussed. This will be useful not only to researchers, but also to engineers and other individuals interested in this field.
Affiliation(s)
- Yihong Wu
- National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China
- Fulin Tang
- National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China
- Heping Li
- National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China
|
29
|
|
30
|
Scale Estimation and Correction of the Monocular Simultaneous Localization and Mapping (SLAM) Based on Fusion of 1D Laser Range Finder and Vision Data. SENSORS 2018; 18:s18061948. [PMID: 29914114 PMCID: PMC6021903 DOI: 10.3390/s18061948] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 05/10/2018] [Revised: 06/10/2018] [Accepted: 06/13/2018] [Indexed: 11/27/2022]
Abstract
This article presents a new sensor fusion method for visual simultaneous localization and mapping (SLAM) through integration of a monocular camera and a 1D laser range finder. Such a fusion method provides scale estimation and drift correction; it is not limited by volume (e.g., a stereo camera is constrained by its baseline) and it overcomes the limited depth range problem associated with RGB-D SLAM. We first present the analytical feasibility of estimating the absolute scale through the fusion of 1D distance information and image information. Next, the analytical derivation of the laser-vision fusion is described in detail based on the local dense reconstruction of the image sequences. We also correct the scale drift of the monocular SLAM using the laser distance information, which is independent of the drift error. Finally, the application of this approach to both indoor and outdoor scenes is verified on the Technical University of Munich (TUM) RGB-D dataset and self-collected data. We compare the scale estimation and drift correction of the proposed method with monocular and RGB-D camera SLAM.
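The core of absolute scale recovery from a 1D range finder is a one-parameter least-squares fit: given depths of the same points from scale-ambiguous monocular SLAM and metric laser measurements, the scale s minimising ||laser − s·slam||² has the closed form s = (laser·slam)/(slam·slam). A minimal sketch with invented toy depths (not data from the paper):

```python
import numpy as np

def estimate_scale(slam_depths, laser_depths):
    """Least-squares scale factor s minimising ||laser - s * slam||^2 for
    depths of the same points measured by monocular SLAM (scale-ambiguous)
    and a 1D laser range finder (metric)."""
    slam = np.asarray(slam_depths, dtype=float)
    laser = np.asarray(laser_depths, dtype=float)
    return float(laser @ slam / (slam @ slam))

# Hypothetical data: metric depths observed by SLAM at an unknown scale 0.4.
true_depths = np.array([2.0, 3.5, 5.0, 7.2])
slam_depths = 0.4 * true_depths          # monocular estimate, wrong scale
s = estimate_scale(slam_depths, true_depths)
print(round(s, 3))  # 2.5, i.e. 1 / 0.4
```

Because the laser measurement is metric and drift-free, re-estimating s over time also corrects the slow scale drift of the monocular trajectory.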
|
31
|
Qiu K, Ai Y, Tian B, Wang B, Cao D. Siamese-ResNet: Implementing Loop Closure Detection based on Siamese Network. 2018 IEEE INTELLIGENT VEHICLES SYMPOSIUM (IV) 2018. [DOI: 10.1109/ivs.2018.8500465] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 09/01/2023]
|
32
|
Arroyo R, Alcantarilla PF, Bergasa LM, Romera E. Are you ABLE to perform a life-long visual topological localization? Auton Robots 2017. [DOI: 10.1007/s10514-017-9664-7] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
|