1. El Ghazouali S, Mhirit Y, Oukhrid A, Michelucci U, Nouira H. FusionVision: A Comprehensive Approach of 3D Object Reconstruction and Segmentation from RGB-D Cameras Using YOLO and Fast Segment Anything. Sensors (Basel) 2024; 24:2889. PMID: 38732995; PMCID: PMC11086350; DOI: 10.3390/s24092889.
Abstract
In the realm of computer vision, the integration of advanced techniques into the pre-processing of RGB-D camera inputs poses a significant challenge, given the inherent complexities arising from diverse environmental conditions and varying object appearances. This paper therefore introduces FusionVision, a comprehensive pipeline for the robust 3D segmentation of objects in RGB-D imagery. Traditional computer vision systems, being designed mainly for RGB cameras, face limitations in simultaneously capturing precise object boundaries and achieving high-precision object detection on depth maps. To address this challenge, FusionVision adopts an integrated approach, merging state-of-the-art object detection techniques with advanced instance segmentation methods. The integration of these components enables a holistic interpretation of RGB-D data, unifying the information obtained from the color (RGB) and depth (D) channels, and facilitates the extraction of comprehensive and accurate object information to improve downstream tasks such as object 6D pose estimation, Simultaneous Localization and Mapping (SLAM), and accurate 3D dataset extraction. The proposed FusionVision pipeline employs YOLO to identify objects within the RGB image domain. Subsequently, FastSAM, an innovative semantic segmentation model, is applied to delineate object boundaries, yielding refined segmentation masks. The synergy between these components and their integration into 3D scene understanding ensures a cohesive fusion of object detection and segmentation, enhancing overall precision in 3D object segmentation.
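The 3D step of such an RGB-D pipeline ultimately reduces to back-projecting the segmentation-masked depth pixels through the pinhole camera model. A minimal sketch of that step (not the authors' implementation; the intrinsics fx, fy, cx, cy below are illustrative values):

```python
import numpy as np

def backproject(depth, mask, fx, fy, cx, cy):
    """Lift masked depth pixels to 3D camera coordinates via the pinhole model."""
    ys, xs = np.nonzero(mask)           # pixel coordinates inside the segmentation mask
    z = depth[ys, xs]                   # metric depth at those pixels
    x = (xs - cx) * z / fx
    y = (ys - cy) * z / fy
    return np.stack([x, y, z], axis=1)  # (N, 3) object point cloud

# toy example: a flat depth map, a one-pixel mask at the principal point
depth = np.full((480, 640), 2.0)
mask = np.zeros((480, 640), dtype=bool)
mask[240, 320] = True
points = backproject(depth, mask, fx=600.0, fy=600.0, cx=320.0, cy=240.0)
```

A pixel at the principal point with depth 2 m maps to the camera-frame point (0, 0, 2), as expected.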
Affiliation(s)
- Ali Oukhrid: Independent Researcher, 2502 Biel/Bienne, Switzerland
- Hichem Nouira: LNE Laboratoire National de Métrologie et d'Essais, 75015 Paris, France
2. Xia X, Fan Z, Xiao G, Chen F, Liu Y, Hu Y. Learning Representative Features by Deep Attention Network for 3D Point Cloud Registration. Sensors (Basel) 2023; 23:4123. PMID: 37112464; PMCID: PMC10145325; DOI: 10.3390/s23084123.
Abstract
Three-dimensional point cloud registration, which aims to find the transformation that best aligns two point clouds, is a widely studied problem in computer vision with a wide spectrum of applications, such as underground mining. Many learning-based approaches have been developed and have demonstrated their effectiveness for point cloud registration. Particularly, attention-based models have achieved outstanding performance due to the extra contextual information captured by attention mechanisms. To avoid the high computation cost brought by attention mechanisms, an encoder-decoder framework is often employed to hierarchically extract the features where the attention module is only applied in the middle. This leads to the compromised effectiveness of the attention module. To tackle this issue, we propose a novel model with the attention layers embedded in both the encoder and decoder stages. In our model, the self-attentional layers are applied in the encoder to consider the relationship between points inside each point cloud, while the decoder utilizes cross-attentional layers to enrich features with contextual information. Extensive experiments conducted on public datasets prove that our model is able to achieve quality results on a registration task.
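The attention layers described above are built on the scaled dot-product attention primitive. A minimal NumPy sketch of that primitive (a generic illustration, not the paper's network):

```python
import numpy as np

def scaled_dot_attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    scores = Q @ K.T / np.sqrt(Q.shape[1])
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    w = np.exp(scores)
    w /= w.sum(axis=1, keepdims=True)
    return w @ V

# self-attention: Q, K, V all come from the same point cloud's features;
# cross-attention would draw K and V from the other point cloud's features
rng = np.random.default_rng(0)
feats = rng.normal(size=(5, 8))
out = scaled_dot_attention(feats, feats, feats)

# sanity check: identical keys give uniform weights, i.e. the mean of V
V = rng.normal(size=(5, 8))
uniform = scaled_dot_attention(feats, np.ones((5, 8)), V)
```

With identical keys every row of the score matrix is constant, so the softmax weights are uniform and the output collapses to the column mean of V.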
Affiliation(s)
- Xiaokai Xia: Beijing Institute of System Engineering, Beijing 100101, China; Artificial Intelligence Institute of China Electronics Technology Group Corporation, Beijing 100041, China
- Zhiqiang Fan: Artificial Intelligence Institute of China Electronics Technology Group Corporation, Beijing 100041, China
- Gang Xiao: Beijing Institute of System Engineering, Beijing 100101, China
- Fangyue Chen: Artificial Intelligence Institute of China Electronics Technology Group Corporation, Beijing 100041, China
- Yu Liu: State Key Laboratory of Software Development Environment, Beihang University, Beijing 100191, China
- Yiheng Hu: School of Computer Science and Engineering, University of New South Wales, Sydney, NSW 2052, Australia
3. Wu J, Wang M, Fourati H, Li H, Zhu Y, Zhang C, Jiang Y, Hu X, Liu M. Generalized n-Dimensional Rigid Registration: Theory and Applications. IEEE Transactions on Cybernetics 2023; 53:927-940. PMID: 35507617; DOI: 10.1109/tcyb.2022.3168938.
Abstract
The generalized rigid registration problem in high-dimensional Euclidean spaces is studied. The loss function is minimized with an equivalent error formulation based on the Cayley formula. The closed-form linear least-squares solution to this problem is derived, which also generates the registration covariances, i.e., the uncertainty information of rotation and translation, providing quite accurate probabilistic descriptions. Simulation results indicate the correctness of the proposed method and also show its efficiency in computation time compared with previous algorithms using singular value decomposition (SVD) and linear matrix inequalities (LMI). The proposed scheme is then applied to an interpolation problem on the special Euclidean group SE(n) with covariance-preserving functionality. Finally, experiments on covariance-aided LiDAR mapping show practical superiority in robotic navigation.
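For reference, the classical closed-form registration baseline that the article compares against (the SVD-based Kabsch/Umeyama solution, not the proposed Cayley formulation) can be sketched as:

```python
import numpy as np

def rigid_register(P, Q):
    """Closed-form rigid alignment of 3xN point sets: find R, t minimizing ||R P + t - Q||."""
    cP, cQ = P.mean(axis=1, keepdims=True), Q.mean(axis=1, keepdims=True)
    H = (P - cP) @ (Q - cQ).T                           # cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.linalg.det(Vt.T @ U.T)])  # guard against reflections
    R = Vt.T @ D @ U.T
    t = cQ - R @ cP
    return R, t

# recover a known rotation about z and a known translation
rng = np.random.default_rng(0)
P = rng.normal(size=(3, 50))
c, s = np.cos(0.7), np.sin(0.7)
R_true = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
t_true = np.array([[1.0], [2.0], [3.0]])
R_est, t_est = rigid_register(P, R_true @ P + t_true)
```

On noise-free correspondences this recovers the ground-truth transform to machine precision; the article's contribution is an alternative closed-form solution that additionally yields registration covariances.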
4. Zhang H, Ding J, Jiang M, Tan KC, Chai T. Inverse Gaussian Process Modeling for Evolutionary Dynamic Multiobjective Optimization. IEEE Transactions on Cybernetics 2022; 52:11240-11253. PMID: 34033561; DOI: 10.1109/tcyb.2021.3070434.
Abstract
For dynamic multiobjective optimization problems (DMOPs), it is challenging to track the varying Pareto-optimal front. Most traditional approaches estimate the Pareto-optimal sets in the decision space. However, the obtained solutions do not necessarily satisfy the desired properties of decision makers in the objective space. Inverse model-based algorithms have great potential to solve such problems. Nonetheless, the existing ones have low precision for handling DMOPs with nonlinear correlations between the objective and decision vectors, which greatly limits the application of the inverse models. In this article, an inverse Gaussian process (IGP)-based prediction approach for solving DMOPs is proposed. Unlike most traditional approaches, this approach exploits the IGP to construct a predictor that maps the historical optimal solutions from the objective space to the decision space. A sampling mechanism is developed for generating sample points in the objective space. Then, the IGP-based predictor is employed to generate an effective initial population from these sample points. By introducing the IGP, the proposed method obtains solutions with better diversity and convergence in the objective space and is thus more responsive to the demands of decision makers than traditional methods. It also outperforms other inverse model-based methods in solving nonlinear DMOPs. To investigate the performance of the proposed approach, experiments have been conducted on 23 benchmark problems and a real-world raw ore allocation problem in mineral processing. The experimental results demonstrate that the proposed algorithm can significantly improve the dynamic optimization performance and has practical significance for solving real-world DMOPs.
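The core idea, regressing from objective space back to decision space with a Gaussian process, can be illustrated in one dimension (an illustrative toy with an RBF kernel and made-up data, not the article's full predictor):

```python
import numpy as np

def rbf(a, b, length_scale=0.2):
    """Squared-exponential kernel between two 1-D sample vectors."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def gp_inverse_predict(y_train, x_train, y_query, noise=1e-6):
    """GP regression mapping objective values y back to decision values x."""
    K = rbf(y_train, y_train) + noise * np.eye(len(y_train))
    return rbf(y_query, y_train) @ np.linalg.solve(K, x_train)

# historical optima: decision variable x in [0, 1], objective y = x**2;
# the inverse model predicts which x produced a desired objective value
x_hist = np.linspace(0.0, 1.0, 20)
y_hist = x_hist ** 2
x_pred = gp_inverse_predict(y_hist, x_hist, np.array([0.25]))
```

Querying the inverse model at the objective value 0.25 should return a decision value close to 0.5, which could then seed the initial population after an environmental change.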
5. Wang Y, Zhao Y, Ying S, Du S, Gao Y. Rotation-Invariant Point Cloud Representation for 3-D Model Recognition. IEEE Transactions on Cybernetics 2022; 52:10948-10956. PMID: 35316205; DOI: 10.1109/tcyb.2022.3157593.
Abstract
Three-dimensional (3-D) data have many applications in the field of computer vision, and the point cloud is one of the most popular modalities. Therefore, how to establish a good representation for a point cloud is a core issue in computer vision, especially for 3-D object recognition tasks. Existing approaches mainly focus on the invariance of the representation under the group of permutations. However, for point cloud data, the representation should also be rotation invariant. To address such invariance, in this article, we introduce an equivalence relation under the action of the rotation group, through which the representation of a point cloud is located in a homogeneous space; that is, two point clouds are regarded as equivalent when they differ only by a rotation. Our network can be flexibly incorporated into existing frameworks for point clouds, which guarantees that the proposed approach is rotation invariant. Besides, we provide a sufficient analysis of how to parameterize the group SO(3) into a convolutional network, which captures the relation with all rotations in the 3-D Euclidean space ℝ³. We select the optimal rotation as the best representation of the point cloud and propose a solution for minimizing the problem on the rotation group SO(3) by using its geometric structure. To validate the rotation invariance, we combine our method with two existing deep models and evaluate them on the ModelNet40 dataset and its subset ModelNet10. Experimental results indicate that the proposed strategy improves the performance of those existing deep models when the data involve arbitrary rotations.
6. Choe S, Seong H, Kim E. Indoor Place Category Recognition for a Cleaning Robot by Fusing a Probabilistic Approach and Deep Learning. IEEE Transactions on Cybernetics 2022; 52:7265-7276. PMID: 33600336; DOI: 10.1109/tcyb.2021.3052499.
Abstract
Indoor place category recognition for a cleaning robot is a problem in which a cleaning robot predicts the category of the indoor place using images captured by it. This is similar to scene recognition in computer vision as well as semantic mapping in robotics. Compared with scene recognition, the indoor place category recognition considered in this article differs as follows: 1) the indoor places include typical home objects; 2) a sequence of images instead of an isolated image is provided because the images are captured successively by a cleaning robot; and 3) the camera of the cleaning robot has a different view compared with those of cameras typically used by human beings. Compared with semantic mapping, indoor place category recognition can be considered as a component in semantic SLAM. In this article, a new method based on the combination of a probabilistic approach and deep learning is proposed to address indoor place category recognition for a cleaning robot. Concerning the probabilistic approach, a new place-object fusion method is proposed based on Bayesian inference. For deep learning, the proposed place-object fusion method is trained using a convolutional neural network in an end-to-end framework. Furthermore, a new recurrent neural network, called the Bayesian filtering network (BFN), is proposed to conduct time-domain fusion. Finally, the proposed method is applied to a benchmark dataset and a new dataset developed in this article, and its validity is demonstrated experimentally.
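The probabilistic half, fusing successive object observations into a place belief by Bayesian inference, follows the standard recursive Bayes update. A minimal sketch with hypothetical object likelihoods (not the article's learned place-object model):

```python
import numpy as np

def bayes_update(belief, likelihood):
    """One recursive Bayes step: posterior is proportional to prior times likelihood."""
    posterior = belief * likelihood
    return posterior / posterior.sum()

places = ["kitchen", "living_room", "bedroom"]
# hypothetical P(object | place) rows for two detected objects: a sofa, then a TV
obs_likelihoods = [
    np.array([0.05, 0.80, 0.15]),  # sofa
    np.array([0.10, 0.70, 0.20]),  # tv
]
belief = np.full(3, 1.0 / 3.0)     # uniform prior over place categories
for lik in obs_likelihoods:
    belief = bayes_update(belief, lik)
```

After observing both objects the belief concentrates on the living room; the article's Bayesian filtering network learns this time-domain fusion end to end instead of using fixed likelihood tables.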
7. Tang L, Chen K, Wu C, Hong Y, Jia K, Yang ZX. Improving Semantic Analysis on Point Clouds via Auxiliary Supervision of Local Geometric Priors. IEEE Transactions on Cybernetics 2022; 52:4949-4959. PMID: 33095729; DOI: 10.1109/tcyb.2020.3025798.
Abstract
Existing deep learning algorithms for point cloud analysis mainly concern discovering semantic patterns from the global configuration of local geometries in a supervised learning manner. However, very few explore geometric properties revealing local surface manifolds embedded in 3-D Euclidean space to discriminate semantic classes or object parts as additional supervision signals. This article is the first attempt to propose a unique multitask geometric learning network to improve semantic analysis by auxiliary geometric learning with local shape properties, which can be either generated via physical computation from point clouds themselves as self-supervision signals or provided as privileged information. Owing to explicitly encoding local shape manifolds in favor of semantic analysis, the proposed geometric self-supervised and privileged learning algorithms can achieve superior performance to their backbone baselines and other state-of-the-art methods, which are verified in the experiments on the popular benchmarks.
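A canonical example of a local shape property that can be "generated via physical computation from point clouds themselves" is the surface normal, obtained from the PCA of a point's neighborhood. A minimal sketch (a standard construction, not necessarily the exact priors used in the paper):

```python
import numpy as np

def estimate_normal(neighbors):
    """Normal of the best-fit plane: the right singular vector of the smallest singular value."""
    centered = neighbors - neighbors.mean(axis=0)
    _, _, Vt = np.linalg.svd(centered)
    return Vt[-1]  # unit vector; the sign is ambiguous

# points scattered on the z = 0 plane should yield a normal along +/- z
rng = np.random.default_rng(1)
patch = np.column_stack([rng.normal(size=20), rng.normal(size=20), np.zeros(20)])
n = estimate_normal(patch)
```

Such self-computed normals can serve as free auxiliary supervision targets alongside the semantic labels.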
8. Wang F, Zhuang Y, Zhang H, Gu H. Real-Time 3-D Semantic Scene Parsing With LiDAR Sensors. IEEE Transactions on Cybernetics 2022; 52:1351-1363. PMID: 32310814; DOI: 10.1109/tcyb.2020.2982947.
Abstract
This article proposes a novel deep-learning framework, called RSSP, for real-time 3-D scene understanding with LiDAR sensors. To this end, we introduce new sparse strided operations based on the sparse tensor representation of point clouds. Compared with conventional convolution operations, the time and space complexity of our sparse strided operations are proportional to the number of occupied voxels N rather than the input spatial size r³ (often N ≪ r³ for LiDAR data). This enables our method to process point clouds at high resolutions (e.g., 2048³) with a high speed (130 ms for classifying a single frame from a Velodyne HDL-64). The main structure includes a CNN model built upon our sparse strided operations and a conditional random field (CRF) model to impose spatial consistency on the final predictions. A highly parallel implementation of our system is presented for both CPU-GPU and CPU-only environments. The efficiency and effectiveness of our approach are demonstrated on two public datasets (Semantic3D.net and KITTI). The experimental results and benchmark tests show that our system can be effectively applied for online 3-D data analyses with comparable or better accuracy than the state-of-the-art methods.
9. Wang L, Wu J, Liu X, Ma X, Cheng J. Semantic segmentation of large-scale point clouds based on dilated nearest neighbors graph. Complex & Intelligent Systems 2022. DOI: 10.1007/s40747-021-00618-0.
Abstract
Three-dimensional (3D) semantic segmentation of point clouds is important in many scenarios, such as automatic driving and robotic navigation, while edge computing is indispensable on such devices. Deep learning methods based on point sampling prove to be computation- and memory-efficient for tackling large-scale point clouds (e.g., millions of points). However, some local features may be abandoned during sampling. In this paper, we present an end-to-end 3D semantic segmentation framework based on dilated nearest neighbor encoding. Instead of down-sampling the point cloud directly, we propose a dilated nearest neighbor encoding module that broadens the network's receptive field to learn more 3D geometric information. Without increasing the number of network parameters, our method is computation- and memory-efficient for large-scale point clouds. We have evaluated the dilated nearest neighbor encoding in two different networks: the first is random sampling with local feature aggregation; the second is the Point Transformer. We have evaluated the quality of the semantic segmentation on the benchmark 3D dataset S3DIS and demonstrate that the proposed dilated nearest neighbor encoding exhibits stable advantages over baseline and competing methods.
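The dilated nearest-neighbor idea can be illustrated in a few lines: take the k·d nearest neighbors and keep every d-th one, so the receptive field widens while the neighbor count, and hence the feature cost, stays at k (a sketch of the concept, not the paper's module):

```python
import numpy as np

def dilated_knn(points, query_idx, k, d):
    """Return k neighbor indices sampled with dilation d from the k*d nearest."""
    dists = np.linalg.norm(points - points[query_idx], axis=1)
    order = np.argsort(dists)[1 : k * d + 1]   # nearest first, skipping the query itself
    return order[::d]                          # every d-th: k indices, wider spread

# colinear points at x = 0..9: plain kNN of point 0 stays local, dilation reaches farther
pts = np.arange(10, dtype=float).reshape(-1, 1)
plain = dilated_knn(pts, 0, k=2, d=1)
dilated = dilated_knn(pts, 0, k=2, d=2)
```

Here plain kNN returns the points at x = 1 and 2, while the dilated variant returns x = 1 and 3: the same number of neighbors covering twice the radius.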
10. Wang C, Bai X, Wang X, Liu X, Zhou J, Wu X, Li H, Tao D. Self-Supervised Multiscale Adversarial Regression Network for Stereo Disparity Estimation. IEEE Transactions on Cybernetics 2021; 51:4770-4783. PMID: 32649284; DOI: 10.1109/tcyb.2020.2999492.
Abstract
Deep learning approaches have significantly contributed to recent progress in stereo matching. These deep stereo matching methods are usually based on supervised training, which requires a large amount of high-quality ground-truth depth map annotations that are expensive to collect. Furthermore, only a limited quantity of stereo vision training data is currently available, obtained either from active sensors (LiDAR and ToF cameras) or through computer graphics simulations, and it does not meet the requirements of deep supervised training. Here, we propose a novel deep stereo approach called the "self-supervised multiscale adversarial regression network (SMAR-Net)," which relaxes the need for ground-truth depth maps during training. Specifically, we design a two-stage network. The first stage is a disparity regressor, in which a regression network estimates disparity values from stacked stereo image pairs. The stereo image stacking method is a novel contribution, as it not only contains the spatial appearances of the stereo images but also implies matching correspondences with different disparity values. In the second stage, a synthetic left image is generated based on the left-right consistency assumption. Our network is trained by minimizing a hybrid loss function composed of a content loss and an adversarial loss. The content loss minimizes the average warping error between the synthetic images and the real ones. In contrast to the generative adversarial loss, our proposed adversarial loss penalizes mismatches using multiscale features. This constrains the synthetic image and the real image to be pixelwise identical instead of just belonging to the same distribution. Furthermore, the combined utilization of multiscale feature extraction in both the content loss and the adversarial loss further improves the adaptability of SMAR-Net in ill-posed regions. Experiments on multiple benchmark datasets show that SMAR-Net outperforms the current state-of-the-art self-supervised methods and achieves comparable outcomes to supervised methods. The source code can be accessed at: https://github.com/Dawnstar8411/SMAR-Net.
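The left-right consistency assumption behind the content loss is that a synthetic left view can be built by sampling the right image at x − d. A minimal integer-disparity warp (illustrative only; the network itself uses differentiable sub-pixel sampling):

```python
import numpy as np

def warp_right_to_left(right, disparity):
    """Synthesize a left view of a rectified pair: left[y, x] ~ right[y, x - d(y, x)]."""
    h, w = right.shape
    out = np.zeros_like(right)
    xs = np.arange(w)
    for y in range(h):
        src = xs - disparity[y].astype(int)     # source column in the right image
        valid = (src >= 0) & (src < w)          # columns that fall inside the image
        out[y, valid] = right[y, src[valid]]
    return out

# toy pair: the right image is the left image shifted by a constant disparity of 3
rng = np.random.default_rng(0)
left = rng.random((4, 16))
right = np.roll(left, -3, axis=1)
disp = np.full((4, 16), 3.0)
synth = warp_right_to_left(right, disp)
```

In the valid region (columns 3 onward, where the warp does not fall off the image) the synthesized view reproduces the left image exactly, which is what the content loss penalizes deviations from.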
11. Kong F, Xu W, Cai Y, Zhang F. Avoiding Dynamic Small Obstacles With Onboard Sensing and Computation on Aerial Robots. IEEE Robotics and Automation Letters 2021. DOI: 10.1109/lra.2021.3101877.
12. Chai R, Savvaris A, Tsourdos A, Chai S, Xia Y, Wang S. Solving Trajectory Optimization Problems in the Presence of Probabilistic Constraints. IEEE Transactions on Cybernetics 2020; 50:4332-4345. PMID: 30763253; DOI: 10.1109/tcyb.2019.2895305.
Abstract
The objective of this paper is to present an approximation-based strategy for solving the problem of nonlinear trajectory optimization with the consideration of probabilistic constraints. The proposed method defines a smooth and differentiable function to replace probabilistic constraints by the deterministic ones, thereby converting the chance-constrained trajectory optimization model into a parametric nonlinear programming model. In addition, it is proved that the approximation function and the corresponding approximation set will converge to that of the original problem. Furthermore, the optimal solution of the approximated model is ensured to converge to the optimal solution of the original problem. Numerical results, obtained from a new chance-constrained space vehicle trajectory optimization model and a 3-D unmanned vehicle trajectory smoothing problem, verify the feasibility and effectiveness of the proposed approach. Comparative studies were also carried out to show the proposed design can yield good performance and outperform other typical chance-constrained optimization techniques investigated in this paper.
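The replacement of a probabilistic constraint P(g(x, ξ) ≤ 0) ≥ p by a smooth deterministic surrogate can be illustrated with a sigmoid approximation of the indicator function (a generic construction in this spirit, not the paper's exact approximation function):

```python
import numpy as np

def smooth_prob_estimate(g_samples, eps=0.05):
    """Monte-Carlo estimate of P(g <= 0), replacing the indicator 1{g <= 0}
    by the smooth, differentiable surrogate sigmoid(-g / eps)."""
    # sigmoid(-z) written via tanh to stay numerically stable for large z
    return float(np.mean(0.5 * (1.0 - np.tanh(g_samples / (2.0 * eps)))))

# chance constraint P(xi <= x) with xi ~ N(0, 1): the true value at x = 0 is 0.5
rng = np.random.default_rng(0)
xi = rng.standard_normal(200_000)
est = smooth_prob_estimate(xi - 0.0)   # g(x, xi) = xi - x evaluated at x = 0
```

As eps shrinks, the sigmoid converges to the indicator, so the smooth estimate converges to the true constraint probability while remaining differentiable in x, which is what allows the chance-constrained model to be handed to a standard NLP solver.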
13. Sun X, Wang S, Wang M, Wang Z, Liu M. A Novel Coding Architecture for LiDAR Point Cloud Sequence. IEEE Robotics and Automation Letters 2020. DOI: 10.1109/lra.2020.3010207.
14. Hu Y, Zheng J, Zou J, Yang S, Ou J, Wang R. A dynamic multi-objective evolutionary algorithm based on intensity of environmental change. Information Sciences 2020. DOI: 10.1016/j.ins.2020.02.071.
15. 3D Exploration and Navigation with Optimal-RRT Planners for Ground Robots in Indoor Incidents. Sensors (Basel) 2019; 20:220. PMID: 31906019; PMCID: PMC6983016; DOI: 10.3390/s20010220.
Abstract
Navigation and exploration in 3D environments is still a challenging task for autonomous robots that move on the ground. Robots for Search and Rescue missions must deal with unstructured and very complex scenarios. This paper presents a path planning system for navigation and exploration of ground robots in such situations. We use (unordered) point clouds as the main sensory input without building any explicit representation of the environment from them. These 3D points are employed as space samples by an Optimal-RRT planner (RRT*) to compute safe and efficient paths. The use of an objective function for path construction and the natural exploratory behaviour of the RRT* planner make it appropriate for the tasks. The approach is evaluated in different simulations showing the viability of autonomous navigation and exploration in complex 3D scenarios.
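The sampling-based planner at the core of such a system can be sketched as a plain RRT. The article uses RRT* with point-cloud space samples and an objective function for rewiring; this stripped-down version with uniform sampling only shows the tree-growing skeleton:

```python
import numpy as np

def rrt(start, goal, sample_fn, is_free, step=0.5, goal_tol=0.5, max_iter=2000, seed=0):
    """Grow a rapidly-exploring random tree until a node lands within goal_tol of goal."""
    rng = np.random.default_rng(seed)
    start, goal = np.asarray(start, float), np.asarray(goal, float)
    nodes, parent = [start], [-1]
    for _ in range(max_iter):
        target = goal if rng.random() < 0.2 else sample_fn(rng)   # 20% goal bias
        i = int(np.argmin([np.linalg.norm(n - target) for n in nodes]))
        direction = target - nodes[i]
        dist = np.linalg.norm(direction)
        if dist == 0.0:
            continue
        new = nodes[i] + direction / dist * min(step, dist)       # steer toward the sample
        if not is_free(new):                                      # e.g. point-cloud collision check
            continue
        nodes.append(new)
        parent.append(i)
        if np.linalg.norm(new - goal) <= goal_tol:                # trace the path back to the root
            path, j = [], len(nodes) - 1
            while j != -1:
                path.append(nodes[j])
                j = parent[j]
            return path[::-1]
    return None

# free 10x10 workspace; uniform samples stand in for the paper's point-cloud samples
path = rrt((0.0, 0.0), (9.0, 9.0),
           sample_fn=lambda rng: rng.uniform(0.0, 10.0, size=2),
           is_free=lambda p: True)
```

RRT* additionally reconnects nearby nodes through lower-cost parents, which is where the paper's objective function for path construction enters.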
16. An Efficient Encoding Voxel-Based Segmentation (EVBS) Algorithm Based on Fast Adjacent Voxel Search for Point Cloud Plane Segmentation. Remote Sensing 2019. DOI: 10.3390/rs11232727.
Abstract
Plane segmentation is a basic yet important process in light detection and ranging (LiDAR) point cloud processing. The traditional point cloud plane segmentation algorithm is typically affected by the number of point clouds and the noise data, which results in slow segmentation efficiency and poor segmentation effect. Hence, an efficient encoding voxel-based segmentation (EVBS) algorithm based on a fast adjacent voxel search is proposed in this study. First, a binary octree algorithm is proposed to construct the voxel as the segmentation object and code the voxel, which can compute voxel features quickly and accurately. Second, a voxel-based region growing algorithm is proposed to cluster the corresponding voxel to perform the initial point cloud segmentation, which can improve the rationality of seed selection. Finally, a refining point method is proposed to solve the problem of under-segmentation in unlabeled voxels by judging the relationship between the points and the segmented plane. Experimental results demonstrate that the proposed algorithm is better than the traditional algorithm in terms of computation time, extraction accuracy, and recall rate.
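The first step, binning points into coded voxels so that features are computed per voxel rather than per point, can be sketched as follows (plain integer voxel keys rather than the paper's binary-octree codes):

```python
import numpy as np

def voxelize(points, voxel_size):
    """Assign each point an integer voxel key; return unique voxels and per-point labels."""
    keys = np.floor(points / voxel_size).astype(np.int64)
    voxels, labels = np.unique(keys, axis=0, return_inverse=True)
    return voxels, labels

# two well-separated clusters should fall into two distinct voxels
pts = np.array([[0.1, 0.1, 0.1],
                [0.2, 0.3, 0.2],
                [5.1, 5.2, 5.0],
                [5.3, 5.1, 5.2]])
voxels, labels = voxelize(pts, voxel_size=1.0)
```

Region growing then operates on the voxel labels, which is why the approach scales with the number of occupied voxels instead of the number of raw points.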
17. Saeidi H, Ge J, Kam M, Opfermann JD, Leonard S, Joshi AS, Krieger A. Supervised Autonomous Electrosurgery via Biocompatible Near-Infrared Tissue Tracking Techniques. IEEE Transactions on Medical Robotics and Bionics 2019; 1:228-236. PMID: 33458603; PMCID: PMC7810241; DOI: 10.1109/tmrb.2019.2949870.
Abstract
Autonomous robotic surgery systems aim to improve patient outcomes by leveraging the repeatability and consistency of automation while also reducing human-induced errors. However, intraoperative autonomous soft tissue tracking and robot control still remain a challenge due to the lack of structure and the high deformability of such tissues. In this paper, we take advantage of biocompatible Near-Infrared (NIR) marking methods and develop a supervised autonomous 3D path planning, filtering, and control strategy for our Smart Tissue Autonomous Robot (STAR) to enable precise and consistent incisions on complex 3D soft tissues. Our experimental results on cadaver porcine tongue samples indicate that the proposed strategy reduces surface incision error and depth incision error by 40.03% and 51.5%, respectively, compared to a teleoperation strategy via da Vinci. Furthermore, compared to an autonomous path planning method with linear interpolation between the NIR markers, the proposed strategy reduces the incision depth error by 48.58% by taking advantage of 3D tissue surface information.
Affiliation(s)
- H. Saeidi: Mechanical Engineering Department, University of Maryland, College Park, MD 20742, USA; Fischell Institute for Biomedical Devices and the Marlene and Stewart Greenebaum Cancer Center
- J. Ge: Mechanical Engineering Department, University of Maryland, College Park, MD 20742, USA; Fischell Institute for Biomedical Devices and the Marlene and Stewart Greenebaum Cancer Center
- M. Kam: Mechanical Engineering Department, University of Maryland, College Park, MD 20742, USA; Fischell Institute for Biomedical Devices and the Marlene and Stewart Greenebaum Cancer Center
- J. D. Opfermann: Sheikh Zayed Institute for Pediatric Surgical Innovation, Children's National Health System, 111 Michigan Ave. N.W., Washington, DC 20010
- S. Leonard: Electrical and Computer Science Eng. Dept., Johns Hopkins University, Baltimore, MD 21211
- A. S. Joshi: Division of Otolaryngology - Head & Neck Surgery at The George Washington University Medical Faculty Associates, 2300 M St. NW, 4th Floor, Washington, DC 20037
- A. Krieger: Mechanical Engineering Department, University of Maryland, College Park, MD 20742, USA; Fischell Institute for Biomedical Devices and the Marlene and Stewart Greenebaum Cancer Center
18. Zhang L, Zhang L, Liu S. Role-based collaborative task planning of heterogeneous multi-autonomous underwater vehicles. International Journal of Advanced Robotic Systems 2019. DOI: 10.1177/1729881419858536.
Abstract
Multi-autonomous underwater vehicle (Multi-AUV) clusters are an important means of carrying out marine tasks effectively. Since heterogeneous Multi-AUVs are constrained by cooperative relationships, a role-based collaborative task planning model for Multi-AUVs is proposed. The Multi-AUVs are assigned different roles depending on their functional properties. To analyze the accountability of each role and to ensure reliability, the desired behavior and the estimated state of each role are described in the model. Because of the cooperative relationships and the changing demands of tasks, task allocation needs to be implemented dynamically during path planning. Role-based task assessment and allocation methods are proposed to achieve dynamic adjustment of roles according to task requirements. Due to poor underwater communication conditions, an implicit coordination framework is applied to the coordinated information interaction to compensate for the large delays in underwater communications and the interdependence between Multi-AUVs. To adapt to the implicit collaborative framework and poor communication conditions, a contract network with a variable communication radius is proposed. The simulation results show that the proposed method performs well.
Affiliation(s)
- Lanyong Zhang: College of Automation, Harbin Engineering University, Heilongjiang, China
- Lei Zhang: College of Automation, Harbin Engineering University, Heilongjiang, China
- Sheng Liu: College of Automation, Harbin Engineering University, Heilongjiang, China
19. Sun X, Ma H, Sun Y, Liu M. A Novel Point Cloud Compression Algorithm Based on Clustering. IEEE Robotics and Automation Letters 2019. DOI: 10.1109/lra.2019.2900747.
20
Abstract
In this article, we present a human experience–inspired path planning algorithm for service robots. In addition to considering the path distance and smoothness, we emphasize the safety of robot navigation. Specifically, we build a speed field in accordance with several human driving experiences, such as slowing down or detouring at a narrow aisle and keeping a safe distance from obstacles. Based on this speed field, the path curvatures, path distance, and steering speed are all integrated to form an energy function, which can be efficiently solved by the A* algorithm to seek the optimal path, resorting to an admissible heuristic function estimated from the energy function. Moreover, a simple yet effective fast path smoothing algorithm is proposed so as to ease the robot's steering. Several examples are presented, demonstrating the effectiveness of our human experience–inspired path planning method.
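The planning core, A* minimizing an energy that penalizes risky cells, can be sketched on a grid whose per-cell cost stands in for the speed-field energy (an illustration of the mechanism, not the article's exact energy function):

```python
import heapq

def astar(cost, start, goal):
    """A* on a 4-connected grid; cost[r][c] >= 1 is the cost of entering a cell."""
    rows, cols = len(cost), len(cost[0])
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])   # admissible Manhattan heuristic
    g, parent = {start: 0.0}, {start: None}
    frontier = [(h(start), start)]
    closed = set()
    while frontier:
        _, cur = heapq.heappop(frontier)
        if cur in closed:
            continue
        closed.add(cur)
        if cur == goal:                                       # reconstruct the path
            path = []
            while cur is not None:
                path.append(cur)
                cur = parent[cur]
            return path[::-1]
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nb = (cur[0] + dr, cur[1] + dc)
            if 0 <= nb[0] < rows and 0 <= nb[1] < cols and nb not in closed:
                ng = g[cur] + cost[nb[0]][nb[1]]
                if ng < g.get(nb, float("inf")):
                    g[nb], parent[nb] = ng, cur
                    heapq.heappush(frontier, (ng + h(nb), nb))
    return None

# a high-cost center cell (think: a narrow aisle near obstacles) gets detoured
grid = [[1, 1, 1],
        [1, 100, 1],
        [1, 1, 1]]
path = astar(grid, (0, 0), (2, 2))
```

With a uniform cost grid the planner would cut straight through the middle; the elevated cell cost, like the speed-field energy, steers the optimal path around the risky region.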
Collapse
Affiliation(s)
- Wenyong Gong
- Department of Mathematics, Jinan University, Guangzhou, China
- Key Laboratory of Machine Intelligence and Advanced Computing, Sun Yat-sen University, Ministry of Education, Guangzhou, China
| | - Xiaohua Xie
- Key Laboratory of Machine Intelligence and Advanced Computing, Sun Yat-sen University, Ministry of Education, Guangzhou, China
- School of Data and Computer Science, Sun Yat-Sen University, Guangzhou, China
| | - Yong-Jin Liu
- TNList, Department of Computer Science and Technology, Tsinghua University, Beijing, China
21. Wang H, Wang J, Chen W, Xu L. Automatic illumination planning for robot vision inspection system. Neurocomputing 2018. [DOI: 10.1016/j.neucom.2017.05.015] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Indexed: 11/30/2022]
22. Tai L, Li S, Liu M. Autonomous exploration of mobile robots through deep neural networks. INT J ADV ROBOT SYST 2017. [DOI: 10.1177/1729881417703571] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Indexed: 11/16/2022] Open
Abstract
The exploration problem for mobile robots aims to allow a robot to explore an unknown environment. We describe an indoor exploration algorithm for mobile robots that uses a hierarchical structure fusing several convolutional neural network layers with a decision-making process. The whole system is trained end to end, taking only visual (RGB-D) information as input and generating a sequence of main moving directions as output, so that the robot achieves autonomous exploration. The robot is a TurtleBot with a Kinect mounted on it. The model is trained and tested in a real-world environment, and the training dataset is provided for download. The outputs on the test data are compared with human decisions. We use a Gaussian process latent variable model to visualize the feature map of the last convolutional layer, which demonstrates the effectiveness of this deep convolutional neural network model. We also present libcnn, a novel lightweight deep-learning library tailored to deep-learning processing in robotics tasks.
Affiliation(s)
- Lei Tai
- Department of Mechanical and Biomedical Engineering, City University of Hong Kong, Kowloon Tong, Hong Kong
- Shaohua Li
- Department of Electronic and Computer Engineering, The Hong Kong University of Science and Technology, Hong Kong
- Ming Liu
- Department of Electronic and Computer Engineering, The Hong Kong University of Science and Technology, Hong Kong
23. An Optimal and Energy Efficient Multi-Sensor Collision-Free Path Planning Algorithm for a Mobile Robot in Dynamic Environments. ROBOTICS 2017. [DOI: 10.3390/robotics6020007] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Indexed: 11/16/2022] Open
24. Wang L, Liu M, Meng MQH. A Hierarchical Auction-Based Mechanism for Real-Time Resource Allocation in Cloud Robotic Systems. IEEE TRANSACTIONS ON CYBERNETICS 2017; 47:473-484. [PMID: 26887022] [DOI: 10.1109/tcyb.2016.2519525] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 06/05/2023]
Abstract
Cloud computing enables users to share computing resources on demand. The cloud computing framework cannot be directly mapped to cloud robotic systems with ad hoc networks, since cloud robotic systems have additional constraints such as limited bandwidth and a dynamic structure. Most multirobot applications with cooperative control adopt a decentralized approach to avoid a single point of failure, and robots need to continuously update intensive data to execute tasks in a coordinated manner, which implies real-time requirements. Thus, a resource allocation strategy is required, especially in such resource-constrained environments. This paper proposes a hierarchical auction-based mechanism, the link quality matrix (LQM) auction, which is suited to ad hoc networks through the introduction of a link quality indicator. The proposed algorithm is fast, robust, accurate, and scalable; it reduces both global communication and unnecessary repeated computation, and is designed for firm real-time resource retrieval in physical multirobot systems. A joint surveillance scenario empirically validates the proposed mechanism on several practical metrics. The results show that the proposed LQM auction outperforms state-of-the-art resource allocation algorithms.
25
|
Roh H, Jeong J, Cho Y, Kim A. Accurate Mobile Urban Mapping via Digital Map-Based SLAM. SENSORS 2016; 16:s16081315. [PMID: 27548175 PMCID: PMC5017480 DOI: 10.3390/s16081315] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 06/07/2016] [Revised: 08/05/2016] [Accepted: 08/11/2016] [Indexed: 11/24/2022]
Abstract
This paper presents accurate urban map generation using digital map-based Simultaneous Localization and Mapping (SLAM). Our main objective is to generate a 3D map and a lane map with sub-meter accuracy. In conventional mapping approaches, extremely high accuracy was achieved either by (i) exploiting costly airborne sensors or (ii) surveying with a static mapping system on a stationary platform. Mobile scanning systems have recently gained popularity but are mostly limited by the availability of the Global Positioning System (GPS). We focus on the fact that the availability of GPS and of urban structures is sporadic but complementary. By modeling both GPS and digital map data as measurements and integrating them with other sensor measurements, we leverage SLAM for an accurate mobile mapping system. The proposed algorithm builds an efficient graph SLAM framework that runs in real time and targets sub-meter accuracy on a mobile platform. Integrated with the SLAM framework, we implement a motion-adaptive model for Inverse Perspective Mapping (IPM). Using motion estimates derived from SLAM, the experimental results show that the proposed approach provides stable bird's-eye view images even under significant motion during the drive. Our real-time map generation framework is validated on a long-distance urban test and evaluated at randomly sampled points using Real-Time Kinematic (RTK)-GPS.
Affiliation(s)
- Hyunchul Roh
- Robotics Program, Korea Advanced Institute of Science and Technology, Daejeon 34141, Korea.
- Jinyong Jeong
- Department of Civil and Environmental Engineering, Korea Advanced Institute of Science and Technology, Daejeon 34141, Korea.
- Younggun Cho
- Department of Civil and Environmental Engineering, Korea Advanced Institute of Science and Technology, Daejeon 34141, Korea.
- Ayoung Kim
- Department of Civil and Environmental Engineering, Korea Advanced Institute of Science and Technology, Daejeon 34141, Korea.