1
Liu S, Zhu Y, Aoyama T, Nakaya M, Hasegawa Y. Latent Space Search-Based Adaptive Template Generation for Enhanced Object Detection in Bin-Picking Applications. Sensors (Basel) 2024; 24:6050. PMID: 39338795; PMCID: PMC11435534; DOI: 10.3390/s24186050.
Abstract
Template matching is a common approach in bin-picking tasks. However, it often struggles in complex environments, such as those with different object poses, various background appearances, and varying lighting conditions, due to the limited feature representation of a single template. Additionally, during the bin-picking process, the template needs to be frequently updated to maintain detection performance, and finding an adaptive template from a vast dataset poses another challenge. To address these challenges, we propose a novel template searching method in a latent space trained by a Variational Auto-Encoder (VAE), which generates an adaptive template dynamically based on the current environment. The proposed method was evaluated experimentally under various conditions, and in all scenarios, it successfully completed the tasks, demonstrating its effectiveness and robustness for bin-picking applications. Furthermore, we integrated our proposed method with YOLO, and the experimental results indicate that our method effectively improves YOLO's detection performance.
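As a rough illustration of the latent-space search idea summarized in this abstract (a minimal sketch, not the authors' implementation; the decoder, latent dimension, and similarity metric are assumptions), the snippet below searches a VAE-style latent space for a code whose decoded template best matches the current scene crop.

```python
# Hypothetical sketch of searching a VAE latent space for an adaptive template.
# The decoder, latent dimension, and similarity measure are illustrative
# assumptions, not taken from the paper.
import numpy as np

def search_template(decode, scene_crop, latent_dim=32, iters=50, pop=64, elite=8, seed=0):
    """Cross-entropy-style search: find a latent code whose decoded template
    is most similar (negative L2 distance) to the observed scene crop."""
    rng = np.random.default_rng(seed)
    mu, sigma = np.zeros(latent_dim), np.ones(latent_dim)
    for _ in range(iters):
        z = rng.normal(mu, sigma, size=(pop, latent_dim))            # sample candidate codes
        scores = np.array([-np.mean((decode(zi) - scene_crop) ** 2) for zi in z])
        best = z[np.argsort(scores)[-elite:]]                        # keep the elite candidates
        mu, sigma = best.mean(axis=0), best.std(axis=0) + 1e-3       # refit the search distribution
    return decode(mu), mu                                            # adaptive template and its code

# Toy usage with a stand-in linear "decoder" (a real VAE decoder would go here).
rng = np.random.default_rng(1)
W = rng.normal(size=(32, 64 * 64))
decode = lambda z: (z @ W).reshape(64, 64)
scene_crop = decode(rng.normal(size=32))                             # pretend observation
template, z_star = search_template(decode, scene_crop)
```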
Affiliation(s)
- Songtao Liu
- Department of Micro-Nano Mechanical Science and Engineering, Nagoya University, Nagoya 464-8601, Japan
- Yaonan Zhu
- The School of Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan
- Tadayoshi Aoyama
- Department of Micro-Nano Mechanical Science and Engineering, Nagoya University, Nagoya 464-8601, Japan
- Masayuki Nakaya
- Robot Division, System Department, NACHI-FUJIKOSHI CORP., Toyama 930-8511, Japan
- Yasuhisa Hasegawa
- Department of Micro-Nano Mechanical Science and Engineering, Nagoya University, Nagoya 464-8601, Japan
2
Bin Picking for Ship-Building Logistics Using Perception and Grasping Systems. Robotics 2023. DOI: 10.3390/robotics12010015.
Abstract
Bin picking is a challenging task involving many research domains within the perception and grasping fields, for which there are no perfect and reliable solutions applicable to the wide range of unstructured and cluttered environments present in industrial factories and logistics centers. This paper contributes research on object segmentation in cluttered scenarios, independent of prior object shape knowledge, for textured and textureless objects. In addition, it addresses the demand for extended datasets in deep learning tasks with realistic data. We propose a solution using a Mask R-CNN for 2D object segmentation, trained with real data acquired from an RGB-D sensor and synthetic data generated in Blender, combined with 3D point-cloud segmentation to extract a segmented point cloud belonging to a single object in the bin. Next, a re-configurable pipeline is employed for 6-DoF object pose estimation, followed by a grasp planner that selects a feasible grasp pose. The experimental results show that the object segmentation approach is efficient and accurate in cluttered scenarios with several occlusions. The neural network model was trained with both real and simulated data, improving on the success rate of the previous classical segmentation and yielding an overall grasping success rate of 87.5%.
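A minimal sketch of the segmentation-to-point-cloud step described above, assuming a 2D instance mask, an aligned metric depth image, and pinhole intrinsics; the names and parameters are illustrative, not the paper's code.

```python
# Hypothetical sketch: lift a 2D instance mask (e.g., from Mask R-CNN) into a
# per-object 3D point cloud using an aligned depth image and pinhole intrinsics.
# Camera parameters and array names are illustrative assumptions.
import numpy as np

def mask_to_point_cloud(mask, depth_m, fx, fy, cx, cy):
    """Back-project the masked depth pixels into camera-frame 3D points."""
    v, u = np.nonzero(mask & (depth_m > 0))   # pixel rows/cols inside the mask with valid depth
    z = depth_m[v, u]
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=1)        # (N, 3) object point cloud

# Toy usage with synthetic data.
depth = np.full((480, 640), 0.8)              # 0.8 m flat scene
mask = np.zeros((480, 640), dtype=bool)
mask[200:280, 300:380] = True                 # pretend instance mask
cloud = mask_to_point_cloud(mask, depth, fx=600.0, fy=600.0, cx=320.0, cy=240.0)
```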
3
Xu Y, Arai S, Liu D, Lin F, Kosuge K. FPCC: Fast point cloud clustering-based instance segmentation for industrial bin-picking. Neurocomputing 2022. DOI: 10.1016/j.neucom.2022.04.023.
4
Li X, Cao R, Feng Y, Chen K, Yang B, Fu CW, Li Y, Dou Q, Liu YH, Heng PA. A Sim-to-Real Object Recognition and Localization Framework for Industrial Robotic Bin Picking. IEEE Robot Autom Lett 2022. DOI: 10.1109/lra.2022.3149026.
Affiliation(s)
- Xianzhi Li
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong
- Rui Cao
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong
- Yidan Feng
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong
- Kai Chen
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong
- Biqi Yang
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong
- Chi-Wing Fu
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong
- Yichuan Li
- Department of Mechanical and Automation Engineering, The Chinese University of Hong Kong, Hong Kong
- Qi Dou
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong
- Yun-Hui Liu
- Department of Mechanical and Automation Engineering, The Chinese University of Hong Kong, Hong Kong
- Pheng-Ann Heng
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong
5
Wan G, Wang G, Xing K, Fan Y, Yi T. Robot visual measurement and grasping strategy for roughcastings. Int J Adv Robot Syst 2021. DOI: 10.1177/1729881421999937.
Abstract
To overcome the challenging problem of visual measurement and grasping of roughcasts, a visual grasping strategy for an industrial robot is designed and implemented on the basis of deep learning and a deformable template matching algorithm. The strategy provides position recognition and grasping guidance for metal blank casts in complex backgrounds under the interference of external light. The proposed strategy has two phases: target detection and target localization. In the target detection stage, a deep learning algorithm recognizes the combined surface features of an object for stable recognition in non-structured environments. In the target localization stage, high-precision positioning of metal casts with an unclear contour is realized by combining the deformable template matching and LINE-MOD algorithms. The experimental results show that the system can accurately provide visual grasping guidance for robots.
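As a loose analogue of the two-stage detect-then-localize strategy above (not the paper's deformable template matching or LINE-MOD), the sketch below refines a detector's coarse region of interest with plain normalized cross-correlation template matching in OpenCV; all names and parameters are illustrative.

```python
# Loose stand-in for the coarse-to-fine strategy above: a detector proposes a
# region of interest, then template matching refines the position. This uses
# normalized cross-correlation, not the paper's deformable/LINE-MOD matching.
import cv2
import numpy as np

def refine_in_roi(gray_image, roi, template):
    """roi = (x, y, w, h) from a coarse detector; returns refined top-left corner."""
    x, y, w, h = roi
    patch = gray_image[y:y + h, x:x + w]
    scores = cv2.matchTemplate(patch, template, cv2.TM_CCOEFF_NORMED)
    _, best, _, loc = cv2.minMaxLoc(scores)
    return (x + loc[0], y + loc[1]), best     # refined position and its match score

# Toy usage: embed a template into a synthetic image and recover it.
img = np.random.randint(0, 255, (480, 640), np.uint8)
tmpl = img[210:260, 330:390].copy()
pos, score = refine_in_roi(img, (300, 180, 150, 120), tmpl)
```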
Affiliation(s)
- Guoyang Wan
- Department of Marine Electrical Engineering, Dalian Maritime University, Dalian, People’s Republic of China
- Guofeng Wang
- Department of Marine Electrical Engineering, Dalian Maritime University, Dalian, People’s Republic of China
- Kaisheng Xing
- Xinwu Economic Development Zone, Anhui Institute of Information Technology, Wuhu, People’s Republic of China
- Yunsheng Fan
- Department of Marine Electrical Engineering, Dalian Maritime University, Dalian, People’s Republic of China
- Tinghao Yi
- University of Science and Technology of China, Hefei, People’s Republic of China
6
Recognition and Grasping of Disorderly Stacked Wood Planks Using a Local Image Patch and Point Pair Feature Method. Sensors (Basel) 2020; 20:6235. PMID: 33142905; PMCID: PMC7663447; DOI: 10.3390/s20216235.
Abstract
Considering the difficult problem of robot recognition and grasping of disorderly stacked wooden planks, a recognition and positioning method based on local image features and point pair geometric features is proposed here, and a local patch point pair feature is defined. First, we used self-developed scanning equipment to collect images of wood boards and a robot-driven RGB-D camera to collect images of disorderly stacked wooden planks. The image patches cut from these images were input to a convolutional autoencoder to train and obtain a local texture feature descriptor that is robust to changes in perspective. Then, the small image patches around the point pairs of the plank model are extracted and input into the trained encoder to obtain the feature vector of each image patch, which is combined with the point pair geometric feature information to form a feature description code expressing the characteristics of the plank. After that, the robot drives the RGB-D camera to collect the local image patches of the point pairs in the area to be grasped in the scene of stacked wooden planks, obtaining the feature description code of the planks to be grasped. Finally, through point pair feature matching, pose voting, and clustering, the pose of the plank to be grasped is determined. The robot grasping experiments show that both the recognition rate and the grasping success rate of planks are high, reaching 95.3% and 93.8%, respectively. Compared with the traditional point pair feature (PPF) method and other methods, the method presented here has obvious advantages and can be applied to stacked wood plank grasping environments.
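The geometric half of the feature described above is the classical point pair feature for two oriented points; a minimal sketch is shown below. The learned image-patch descriptor that the paper concatenates with it is not reproduced here.

```python
# Sketch of the classical point pair feature (PPF) for two oriented points:
# F = (||d||, angle(n1, d), angle(n2, d), angle(n1, n2)).
# The learned local image-patch descriptor from the paper is not included.
import numpy as np

def point_pair_feature(p1, n1, p2, n2):
    """p1, p2: 3D points; n1, n2: unit normals. Returns the 4-D PPF vector."""
    d = p2 - p1
    dist = np.linalg.norm(d)
    d_hat = d / (dist + 1e-12)
    ang = lambda a, b: np.arccos(np.clip(np.dot(a, b), -1.0, 1.0))
    return np.array([dist, ang(n1, d_hat), ang(n2, d_hat), ang(n1, n2)])

# Toy usage with two oriented points on a plank surface.
f = point_pair_feature(np.array([0.0, 0.0, 0.0]), np.array([0.0, 0.0, 1.0]),
                       np.array([0.1, 0.0, 0.0]), np.array([0.0, 0.0, 1.0]))
```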
7
Hong YD, Kim YJ, Lee KB. Smart Pack: Online Autonomous Object-Packing System Using RGB-D Sensor Data. Sensors (Basel) 2020; 20:4448. PMID: 32784913; PMCID: PMC7472128; DOI: 10.3390/s20164448.
Abstract
This paper proposes a novel online object-packing system which can measure the dimensions of every incoming object and calculate its desired position in a given container. Existing object-packing systems are limited by requiring exact object information in advance or by assuming the objects are boxes. Thus, this paper focuses on the following two points: (1) real-time calculation of the dimensions and orientation of an object; (2) online optimization of the object’s position in a container. The dimensions and orientation of the object are obtained using an RGB-D sensor when the object is picked by a manipulator and moved over a certain position. The optimal position of the object is calculated by recognizing the container’s available space using another RGB-D sensor and minimizing a cost function formulated from the available-space information and optimization criteria inspired by the way people place things. The experimental results show that the proposed system successfully places incoming objects of various shapes in their proper positions.
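A simplified stand-in for the online placement step above: the cost terms below (resting height plus a corner-preference penalty over a container height map) are illustrative assumptions, not the paper's cost function.

```python
# Simplified stand-in for online placement: scan a container height map for the
# position where a box of known footprint rests lowest and closest to a corner.
# The cost terms are illustrative, not the paper's formulation.
import numpy as np

def best_position(height_map, w, h, corner_weight=0.01):
    """Return (row, col) minimizing resting height plus a distance-to-corner term."""
    rows, cols = height_map.shape
    best, best_cost = None, np.inf
    for r in range(rows - h + 1):
        for c in range(cols - w + 1):
            rest = height_map[r:r + h, c:c + w].max()      # box rests on the tallest cell beneath it
            cost = rest + corner_weight * (r + c)          # prefer low, corner-adjacent placements
            if cost < best_cost:
                best, best_cost = (r, c), cost
    return best

# Toy usage: 20x20 cm grid (1 cell = 1 cm) with one object already placed.
hm = np.zeros((20, 20))
hm[0:8, 0:6] = 5.0                                         # existing 5 cm tall object
print(best_position(hm, w=6, h=8))
```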
Affiliation(s)
- Young-Dae Hong
- Department of Electrical Engineering, Ajou University, Suwon 443-749, Korea
- Young-Joo Kim
- Korea Railroad Research Institute, Uiwang 437-757, Korea
- Ki-Baek Lee
- Department of Electrical Engineering, Kwangwoon University, Seoul 01897, Korea
8
Inline Inspection with an Industrial Robot (IIIR) for Mass-Customization Production Line. Sensors (Basel) 2020; 20:3008. PMID: 32466352; PMCID: PMC7309129; DOI: 10.3390/s20113008.
Abstract
Robots are essential for the rapid development of Industry 4.0. To truly achieve autonomous robot control in customizable production lines, robots need to be sufficiently accurate and capable of recognizing the geometry and orientation of an arbitrarily shaped object. This paper presents a method of inline inspection with an industrial robot (IIIR) for mass-customization production lines. A 3D scanner was used to capture the geometry and orientation of the object to be inspected. As the object entered the working range of the robot, the end effector moved along with the object and the camera installed at the end effector performed the requested optical inspections. The developed methodology is described in detail in this paper. The experiments showed a residual relative movement between the moving object and the following camera of around 0.34 mm per second (around 0.94 mm per second in the worst case). For a camera running at 60 frames per second, the relative displacement between the object and the camera within one frame was therefore around 6 microns (around 16 microns in the worst case), which is stable enough for most industrial production inspections.
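To make the stability figures above concrete, the per-frame displacement follows directly from the measured relative speed and the camera frame rate:

```python
# Per-frame relative displacement implied by the figures quoted above.
typical_speed_mm_s, worst_speed_mm_s, fps = 0.34, 0.94, 60
print(typical_speed_mm_s / fps * 1000)   # ~5.7 micrometers per frame
print(worst_speed_mm_s / fps * 1000)     # ~15.7 micrometers per frame
```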
9
Philipsen MP, Moeslund TB. Cutting Pose Prediction from Point Clouds. Sensors (Basel) 2020; 20:1563. PMID: 32168888; PMCID: PMC7146437; DOI: 10.3390/s20061563.
Abstract
The challenge of getting machines to understand and interact with natural objects is encountered in important areas such as medicine, agriculture, and, in our case, slaughterhouse automation. Recent breakthroughs have enabled the application of Deep Neural Networks (DNN) directly to point clouds, an efficient and natural representation of 3D objects. The potential of these methods has mostly been demonstrated for classification and segmentation tasks involving rigid man-made objects. We present a method, based on the successful PointNet architecture, for learning to regress correct tool placement from human demonstrations, using virtual reality. Our method is applied to a challenging slaughterhouse cutting task, which requires an understanding of the local geometry including the shape, size, and orientation. We propose an intermediate five-Degree of Freedom (DoF) cutting plane representation, a point and a normal vector, which eases the demonstration and learning process. A live experiment is conducted in order to unveil issues and begin to understand the required accuracy. Eleven cuts are rated by an expert, with 8/11 rated as acceptable. The error on the test set is subsequently reduced through the addition of more training data and improvements to the DNN. The result is a reduction in the average translation error from 1.5 cm to 0.8 cm and in the orientation error from 4.59° to 4.48°. The method's generalization capacity is assessed on a similar task from the slaughterhouse and on the very different public LINEMOD dataset for object pose estimation across viewpoints. In both cases, the method shows promising results. Code, datasets, and supplementary materials are available at https://github.com/markpp/PoseFromPointClouds.
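A minimal sketch of the point-plus-normal cutting plane representation and the two error measures quoted above (translation and orientation error); not the paper's evaluation code, and the array names and conventions are assumptions.

```python
# Sketch of the 5-DoF cutting plane representation (a point and a unit normal)
# and the two error measures quoted above. Not the paper's evaluation code.
import numpy as np

def plane_errors(p_pred, n_pred, p_true, n_true):
    """Translation error (same units as the points) and orientation error (degrees)."""
    n_pred = n_pred / np.linalg.norm(n_pred)
    n_true = n_true / np.linalg.norm(n_true)
    trans_err = np.linalg.norm(p_pred - p_true)
    orient_err = np.degrees(np.arccos(np.clip(np.dot(n_pred, n_true), -1.0, 1.0)))
    return trans_err, orient_err

# Toy usage (units: cm).
t, o = plane_errors(np.array([10.0, 5.0, 3.0]), np.array([0.1, 0.0, 1.0]),
                    np.array([10.5, 5.2, 3.1]), np.array([0.0, 0.0, 1.0]))
```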
Affiliation(s)
- Mark P. Philipsen
- Media Technology, Aalborg University, 9000 Aalborg, Denmark
- Danish Technological Institute, Gregersensvej 9, 2630 Taastrup, Denmark
10
Jiang P, Ishihara Y, Sugiyama N, Oaki J, Tokura S, Sugahara A, Ogawa A. Depth Image-Based Deep Learning of Grasp Planning for Textureless Planar-Faced Objects in Vision-Guided Robotic Bin-Picking. Sensors (Basel) 2020; 20:706. PMID: 32012874; PMCID: PMC7038393; DOI: 10.3390/s20030706.
Abstract
Bin-picking of small parcels and other textureless planar-faced objects is a common task at warehouses. A general color image-based vision-guided robot picking system requires feature extraction and goal image preparation of various objects. However, feature extraction for goal image matching is difficult for textureless objects. Further, prior preparation of huge numbers of goal images is impractical at a warehouse. In this paper, we propose a novel depth image-based vision-guided robot bin-picking system for textureless planar-faced objects. Our method uses a deep convolutional neural network (DCNN) model that is trained on 15,000 annotated depth images synthetically generated in a physics simulator to directly predict grasp points without object segmentation. Unlike previous studies that predicted grasp points for a robot suction hand with only one vacuum cup, our DCNN also predicts optimal grasp patterns for a hand with two vacuum cups (left cup on, right cup on, or both cups on). Further, we propose a surface feature descriptor to extract surface features (center position and normal) and refine the predicted grasp point position, removing the need for texture features for vision-guided robot control and sim-to-real modification for DCNN model training. Experimental results demonstrate the efficiency of our system, namely that a robot with 7 degrees of freedom can pick randomly posed textureless boxes in a cluttered environment with a 97.5% success rate at speeds exceeding 1000 pieces per hour.
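A minimal sketch of the surface feature extraction mentioned above (a face's center position and normal), assuming the grasped patch has already been back-projected to 3D points; this is an illustrative stand-in using a PCA plane fit, not the authors' surface feature descriptor.

```python
# Minimal sketch of extracting a planar face's center and normal from a patch
# of 3D points via a PCA plane fit. Illustrative stand-in, not the paper's
# surface feature descriptor.
import numpy as np

def surface_center_and_normal(points):
    """points: (N, 3) 3D points of one planar face. Returns (center, unit normal)."""
    center = points.mean(axis=0)
    # The normal is the singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(points - center, full_matrices=False)
    normal = vt[-1]
    if normal[2] > 0:                      # orient the normal toward the camera (camera looks along +z)
        normal = -normal
    return center, normal

# Toy usage: a noisy horizontal face 0.6 m from the camera.
rng = np.random.default_rng(0)
xy = rng.uniform(-0.05, 0.05, size=(500, 2))
pts = np.column_stack([xy, 0.6 + 0.001 * rng.normal(size=500)])
c, n = surface_center_and_normal(pts)
```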