1
Al-Tawil B, Hempel T, Abdelrahman A, Al-Hamadi A. A review of visual SLAM for robotics: evolution, properties, and future applications. Front Robot AI 2024;11:1347985. PMID: 38686339. PMCID: PMC11056647. DOI: 10.3389/frobt.2024.1347985.
Abstract
Visual simultaneous localization and mapping (V-SLAM) plays a crucial role in the field of robotic systems, especially for interactive and collaborative mobile robots. The growing reliance on robotics has increased the complexity of task execution in real-world applications. Consequently, several types of V-SLAM methods have been developed to facilitate and streamline the functions of robots. This work aims to showcase the latest V-SLAM methodologies, offering clear selection criteria for researchers and developers to choose the right approach for their robotic applications. It chronologically presents the evolution of SLAM methods, highlighting key principles and providing comparative analyses between them. The paper focuses on the integration of the robotic ecosystem with the Robot Operating System (ROS) as middleware, explores essential V-SLAM benchmark datasets, and presents demonstrative figures for each method's workflow.
Affiliation(s)
- Basheer Al-Tawil
- Institute for Information Technology and Communications, Otto-von-Guericke-University, Magdeburg, Germany
2
Zhang C, Yang Z, Xue B, Zhuo H, Liao L, Yang X, Zhu Z. Perceiving like a Bat: Hierarchical 3D Geometric-Semantic Scene Understanding Inspired by a Biomimetic Mechanism. Biomimetics (Basel) 2023;8:436. PMID: 37754187. PMCID: PMC10526479. DOI: 10.3390/biomimetics8050436.
Abstract
Geometric-semantic scene understanding is a spatial intelligence capability that is essential for robots to perceive and navigate the world. However, understanding a natural scene remains challenging for robots because of restricted sensors and time-varying situations. In contrast, humans and animals are able to form a complex neuromorphic concept of the scene they move in. This neuromorphic concept captures geometric and semantic aspects of the scenario and reconstructs the scene at multiple levels of abstraction. This article seeks to reduce the gap between robot and animal perception by proposing an ingenious scene-understanding approach that seamlessly captures geometric and semantic aspects in an unexplored environment. We propose two types of biologically inspired environment perception methods, i.e., a set of elaborate biomimetic sensors and a brain-inspired parsing algorithm related to scene understanding, that enable robots to perceive their surroundings like bats. Our evaluations show that the proposed scene-understanding system achieves competitive performance in image semantic segmentation and volumetric-semantic scene reconstruction. Moreover, to verify the practicability of our proposed scene-understanding method, we also conducted real-world geometric-semantic scene reconstruction in an indoor environment with our self-developed drone.
Affiliation(s)
- Zhong Yang
- College of Automation Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China; (C.Z.)
3
Bavle H, Sanchez-Lopez JL, Cimarelli C, Tourani A, Voos H. From SLAM to Situational Awareness: Challenges and Survey. Sensors (Basel) 2023;23:4849. PMID: 37430762. DOI: 10.3390/s23104849.
Abstract
The capability of a mobile robot to efficiently and safely perform complex missions is limited by its knowledge of the environment, namely the situation. Advanced reasoning, decision-making, and execution skills enable an intelligent agent to act autonomously in unknown environments. Situational Awareness (SA) is a fundamental capability of humans that has been deeply studied in various fields, such as psychology, military, aerospace, and education. Nevertheless, it has yet to be considered in robotics, which has focused on single compartmentalized concepts such as sensing, spatial perception, sensor fusion, state estimation, and Simultaneous Localization and Mapping (SLAM). Hence, the present research aims to connect the broad multidisciplinary existing knowledge to pave the way for a complete SA system for mobile robotics that we deem paramount for autonomy. To this aim, we define the principal components to structure a robotic SA and their areas of competence. Accordingly, this paper investigates each aspect of SA, surveying the state-of-the-art robotics algorithms that cover them, and discusses their current limitations. Remarkably, essential aspects of SA are still immature since the current algorithmic development restricts their performance to only specific environments. Nevertheless, Artificial Intelligence (AI), particularly Deep Learning (DL), has brought new methods to bridge the gap that keeps these fields from deployment in real-world scenarios. Furthermore, an opportunity has been discovered to interconnect the vastly fragmented space of robotic comprehension algorithms through the mechanism of the Situational Graph (S-Graph), a generalization of the well-known scene graph. Therefore, we finally shape our vision for the future of robotic situational awareness by discussing interesting recent research directions.
Affiliation(s)
- Hriday Bavle
- Interdisciplinary Center for Security Reliability and Trust (SnT), University of Luxembourg, 1855 Luxembourg, Luxembourg
- Jose Luis Sanchez-Lopez
- Interdisciplinary Center for Security Reliability and Trust (SnT), University of Luxembourg, 1855 Luxembourg, Luxembourg
- Claudio Cimarelli
- Interdisciplinary Center for Security Reliability and Trust (SnT), University of Luxembourg, 1855 Luxembourg, Luxembourg
- Ali Tourani
- Interdisciplinary Center for Security Reliability and Trust (SnT), University of Luxembourg, 1855 Luxembourg, Luxembourg
- Holger Voos
- Interdisciplinary Center for Security Reliability and Trust (SnT), University of Luxembourg, 1855 Luxembourg, Luxembourg
- Department of Engineering, Faculty of Science, Technology, and Medicine (FSTM), University of Luxembourg, 1359 Luxembourg, Luxembourg
4
Roch J, Fayyad J, Najjaran H. DOPESLAM: High-Precision ROS-Based Semantic 3D SLAM in a Dynamic Environment. Sensors (Basel) 2023;23:4364. PMID: 37177568. PMCID: PMC10181773. DOI: 10.3390/s23094364.
Abstract
Recent advancements in deep learning techniques have accelerated the growth of robotic vision systems. One way this technology can be applied is to use a mobile robot to automatically generate a 3D map and identify objects within it. This paper addresses the important challenge of labeling objects and generating 3D maps in a dynamic environment. It explores a solution to this problem by combining Deep Object Pose Estimation (DOPE) with Real-Time Appearance-Based Mapping (RTAB-Map) by means of loosely coupled parallel fusion. DOPE's abilities are enhanced by leveraging its belief map system to filter uncertain key points, which increases precision to ensure that only the best object labels end up on the map. Additionally, DOPE's pipeline is modified to enable shape-based object recognition using depth maps, allowing it to identify objects in complete darkness. Three experiments are performed to find the ideal training dataset, quantify the increased precision, and evaluate the overall performance of the system. The results show that the proposed solution outperforms existing methods in most intended scenarios, such as in unilluminated scenes. The proposed key point filtering technique has demonstrated an improvement in the average inference speed, achieving a speedup of 2.6× and improving the average distance to the ground truth compared to the original DOPE algorithm.
Affiliation(s)
- Jesse Roch
- School of Engineering, The University of British Columbia, Kelowna, BC V1V 1V7, Canada
- Jamil Fayyad
- School of Engineering, The University of British Columbia, Kelowna, BC V1V 1V7, Canada
- Homayoun Najjaran
- Faculty of Engineering and Computer Science, University of Victoria, Victoria, BC V8W 2Y2, Canada
5
Linton P, Morgan MJ, Read JCA, Vishwanath D, Creem-Regehr SH, Domini F. New Approaches to 3D Vision. Philos Trans R Soc Lond B Biol Sci 2023;378:20210443. PMID: 36511413. PMCID: PMC9745878. DOI: 10.1098/rstb.2021.0443.
Abstract
New approaches to 3D vision are enabling new advances in artificial intelligence and autonomous vehicles, a better understanding of how animals navigate the 3D world, and new insights into human perception in virtual and augmented reality. Whilst traditional approaches to 3D vision in computer vision (SLAM: simultaneous localization and mapping), animal navigation (cognitive maps), and human vision (optimal cue integration) start from the assumption that the aim of 3D vision is to provide an accurate 3D model of the world, the new approaches to 3D vision explored in this issue challenge this assumption. Instead, they investigate the possibility that computer vision, animal navigation, and human vision can rely on partial or distorted models or no model at all. This issue also highlights the implications for artificial intelligence, autonomous vehicles, human perception in virtual and augmented reality, and the treatment of visual disorders, all of which are explored by individual articles. This article is part of a discussion meeting issue 'New approaches to 3D vision'.
Affiliation(s)
- Paul Linton
- Presidential Scholars in Society and Neuroscience, Center for Science and Society, Columbia University, New York, NY 10027, USA
- Italian Academy for Advanced Studies in America, Columbia University, New York, NY 10027, USA
- Visual Inference Lab, Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY 10027, USA
- Michael J. Morgan
- Department of Optometry and Visual Sciences, City, University of London, Northampton Square, London EC1V 0HB, UK
- Jenny C. A. Read
- Biosciences Institute, Newcastle University, Newcastle upon Tyne, Tyne & Wear NE2 4HH, UK
- Dhanraj Vishwanath
- School of Psychology and Neuroscience, University of St Andrews, St Andrews, Fife KY16 9JP, UK
- Fulvio Domini
- Department of Cognitive, Linguistic, and Psychological Sciences, Brown University, Providence, RI 02912-9067, USA
6
Chang X, Ren P, Xu P, Li Z, Chen X, Hauptmann A. A Comprehensive Survey of Scene Graphs: Generation and Application. IEEE Trans Pattern Anal Mach Intell 2023;45:1-26. PMID: 34941499. DOI: 10.1109/tpami.2021.3137605.
Abstract
A scene graph is a structured representation of a scene that can clearly express the objects, attributes, and relationships between objects in the scene. As computer vision technology continues to develop, people are no longer satisfied with simply detecting and recognizing objects in images; instead, people look forward to a higher level of understanding and reasoning about visual scenes. For example, given an image, we want to not only detect and recognize objects in the image, but also understand the relationship between objects (visual relationship detection), and generate a text description (image captioning) based on the image content. Alternatively, we might want the machine to tell us what the little girl in the image is doing (Visual Question Answering (VQA)), or even remove the dog from the image and find similar images (image editing and retrieval), etc. These tasks require a higher level of understanding and reasoning for image vision tasks. The scene graph is just such a powerful tool for scene understanding. Therefore, scene graphs have attracted the attention of a large number of researchers, and related research is often cross-modal, complex, and rapidly developing. However, no relatively systematic survey of scene graphs exists at present. To this end, this survey conducts a comprehensive investigation of the current scene graph research. More specifically, we first summarize the general definition of the scene graph, then conduct a comprehensive and systematic discussion on scene graph generation (SGG) methods and SGG with the aid of prior knowledge. We then investigate the main applications of scene graphs and summarize the most commonly used datasets. Finally, we provide some insights into the future development of scene graphs.
7
Herrera-Alarcon EP, Satler M, Vannucci M, Avizzano CA. GNGraph: Self-Organizing Maps for Autonomous Aerial Vehicle Planning. IEEE Robot Autom Lett 2022. DOI: 10.1109/lra.2022.3195192.
Affiliation(s)
- Edwin Paul Herrera-Alarcon
- Perceptual Robotics Laboratory at the IIM Institute, Department of Excellence in Robotics and A.I., Scuola Superiore Sant'Anna, Pisa, Italy
- Massimo Satler
- Perceptual Robotics Laboratory at the IIM Institute, Department of Excellence in Robotics and A.I., Scuola Superiore Sant'Anna, Pisa, Italy
- Marco Vannucci
- ICT-COISP at the TeCIP Institute, Department of Excellence in Robotics and A.I., Scuola Superiore Sant'Anna, Pisa, Italy
- Carlo Alberto Avizzano
- Perceptual Robotics Laboratory at the IIM Institute, Department of Excellence in Robotics and A.I., Scuola Superiore Sant'Anna, Pisa, Italy
8
Bavle H, Sanchez-Lopez JL, Shaheer M, Civera J, Voos H. Situational Graphs for Robot Navigation in Structured Indoor Environments. IEEE Robot Autom Lett 2022. DOI: 10.1109/lra.2022.3189785.
Affiliation(s)
- Hriday Bavle
- Automation and Robotics Research Group, Interdisciplinary Centre for Security, Reliability and Trust, University of Luxembourg, Luxembourg, Luxembourg
- Jose Luis Sanchez-Lopez
- Automation and Robotics Research Group, Interdisciplinary Centre for Security, Reliability and Trust, University of Luxembourg, Luxembourg, Luxembourg
- Muhammad Shaheer
- Automation and Robotics Research Group, Interdisciplinary Centre for Security, Reliability and Trust, University of Luxembourg, Luxembourg, Luxembourg
- Holger Voos
- Automation and Robotics Research Group, Interdisciplinary Centre for Security, Reliability and Trust, University of Luxembourg, Luxembourg, Luxembourg
9
Lai T. A Review on Visual-SLAM: Advancements from Geometric Modelling to Learning-Based Semantic Scene Understanding Using Multi-Modal Sensor Fusion. Sensors (Basel) 2022;22:7265. PMID: 36236364. PMCID: PMC9571301. DOI: 10.3390/s22197265.
Abstract
Simultaneous Localisation and Mapping (SLAM) is one of the fundamental problems in autonomous mobile robots, where a robot needs to reconstruct a previously unseen environment while simultaneously localising itself with respect to the map. In particular, Visual-SLAM uses various sensors from the mobile robot for collecting and sensing a representation of the map. Traditionally, geometric model-based techniques were used to tackle the SLAM problem, which tend to be error-prone under challenging environments. Recent advancements in computer vision, such as deep learning techniques, have provided a data-driven approach to tackle the Visual-SLAM problem. This review summarises recent advancements in the Visual-SLAM domain using various learning-based methods. We begin by providing a concise overview of the geometric model-based approaches, followed by technical reviews on the current paradigms in SLAM. Then, we present the various learning-based approaches to collecting sensory inputs from mobile robots and performing scene understanding. The current paradigms in deep-learning-based semantic understanding are discussed and placed under the context of Visual-SLAM. Finally, we discuss challenges and further opportunities in the direction of learning-based approaches in Visual-SLAM.
Affiliation(s)
- Tin Lai
- School of Computer Science, The University of Sydney, Camperdown, NSW 2006, Australia
10
Ghaffari M, Zhang R, Zhu M, Lin CE, Lin TY, Teng S, Li T, Liu T, Song J. Progress in symmetry preserving robot perception and control through geometry and learning. Front Robot AI 2022;9:969380. PMID: 36185972. PMCID: PMC9515513. DOI: 10.3389/frobt.2022.969380.
Abstract
This article reports on recent progress in robot perception and control methods developed by taking the symmetry of the problem into account. Inspired by existing mathematical tools for studying the symmetry structures of geometric spaces, geometric sensor registration, state estimation, and control methods provide indispensable insights into the problem formulations and generalization of robotics algorithms to challenging unknown environments. When combined with computational methods for learning hard-to-measure quantities, symmetry-preserving methods unleash tremendous performance. The article supports this claim by showcasing experimental results of robot perception, state estimation, and control in real-world scenarios.
Affiliation(s)
- Maani Ghaffari
- Computational Autonomy and Robotics Laboratory (CURLY), University of Michigan, Ann Arbor, MI, United States
11
Jia G, Li X, Zhang D, Xu W, Lv H, Shi Y, Cai M. Visual-SLAM Classical Framework and Key Techniques: A Review. Sensors (Basel) 2022;22:4582. PMID: 35746363. PMCID: PMC9227238. DOI: 10.3390/s22124582.
Abstract
With the significant increase in demand for artificial intelligence, environmental map reconstruction has become a research hotspot for obstacle avoidance navigation, unmanned operations, and virtual reality. The quality of the map plays a vital role in positioning, path planning, and obstacle avoidance. This review starts with the development of SLAM (Simultaneous Localization and Mapping) and proceeds to a review of V-SLAM (Visual-SLAM) from its proposal to the present, with a summary of its historical milestones. In this context, the five parts of the classic V-SLAM framework—visual sensor, visual odometry, backend optimization, loop detection, and mapping—are explained separately. Meanwhile, the details of the latest methods are shown; VI-SLAM (Visual-Inertial SLAM) is reviewed and extended. The four critical techniques of V-SLAM and its technical difficulties are summarized as feature detection and matching, selection of keyframes, uncertainty technology, and expression of maps. Finally, the development direction and needs of the V-SLAM field are proposed.
Affiliation(s)
- Guanwei Jia
- School of Physics and Electronics, Henan University, Kaifeng 475004, China; (G.J.); (X.L.); (H.L.)
- Xiaoying Li
- School of Physics and Electronics, Henan University, Kaifeng 475004, China; (G.J.); (X.L.); (H.L.)
- Dongming Zhang
- School of Physics and Electronics, Henan University, Kaifeng 475004, China; (G.J.); (X.L.); (H.L.)
- Correspondence: (D.Z.); (W.X.); Tel./Fax: +86-10-82339160
- Weiqing Xu
- School of Automation Science and Electrical Engineering, Beihang University, Beijing 100191, China; (Y.S.); (M.C.)
- Pneumatic and Thermodynamic Energy Storage and Supply Beijing Key Laboratory, Beijing 100191, China
- Correspondence: (D.Z.); (W.X.); Tel./Fax: +86-10-82339160
- Haojie Lv
- School of Physics and Electronics, Henan University, Kaifeng 475004, China; (G.J.); (X.L.); (H.L.)
- Yan Shi
- School of Automation Science and Electrical Engineering, Beihang University, Beijing 100191, China; (Y.S.); (M.C.)
- Pneumatic and Thermodynamic Energy Storage and Supply Beijing Key Laboratory, Beijing 100191, China
- Maolin Cai
- School of Automation Science and Electrical Engineering, Beihang University, Beijing 100191, China; (Y.S.); (M.C.)
- Pneumatic and Thermodynamic Energy Storage and Supply Beijing Key Laboratory, Beijing 100191, China
12
Sousa YCN, Bassani HF. Topological Semantic Mapping by Consolidation of Deep Visual Features. IEEE Robot Autom Lett 2022. DOI: 10.1109/lra.2022.3149572.
13
14
Autonomous Exploration in a Cluttered Environment for a Mobile Robot with 2D-Map Segmentation and Object Detection. IEEE Robot Autom Lett 2022. DOI: 10.1109/lra.2022.3171069.