1. Hu J, Wen X, Liu Y, Hu H, Zhang H. Research on the Method for Recognizing Bulk Grain-Loading Status Based on LiDAR. Sensors (Basel). 2024;24:5105. PMID: 39204801; PMCID: PMC11359086; DOI: 10.3390/s24165105.
Abstract
Grain is a common bulk cargo. To ensure optimal utilization of transportation space and prevent overflow accidents, the grain's shape must be observed and the loading status determined during the loading process. Traditional methods often rely on manual judgment, which results in high labor intensity, poor safety, and low loading efficiency. Therefore, this paper proposes a method for recognizing the bulk grain-loading status based on Light Detection and Ranging (LiDAR). The method uses LiDAR to obtain point cloud data and constructs a deep learning network to perform target recognition and component segmentation on loading vehicles, extract vehicle positions and grain shapes, and recognize and report the bulk grain-loading status. On measured point cloud data of bulk grain loading, the overall accuracy in the point cloud-classification task is 97.9% and the mean accuracy is 98.1%; in the vehicle component-segmentation task, the overall accuracy is 99.1% and the mean Intersection over Union is 96.6%. The results indicate that the method performs reliably in extracting vehicle positions, detecting grain shapes, and recognizing the loading status.
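The reported figures follow standard point cloud benchmark conventions. As a minimal sketch (illustrative, not code from the paper), overall accuracy, mean per-class accuracy, and mean Intersection over Union can all be derived from a single confusion matrix:

```python
import numpy as np

def segmentation_metrics(conf):
    """Overall accuracy, mean per-class accuracy, and mean IoU from a
    square confusion matrix (rows: ground truth, columns: prediction)."""
    conf = np.asarray(conf, dtype=np.float64)
    tp = np.diag(conf)                      # correctly labelled points per class
    gt = conf.sum(axis=1)                   # ground-truth points per class
    pred = conf.sum(axis=0)                 # predicted points per class
    overall_acc = tp.sum() / conf.sum()
    mean_acc = np.mean(tp / np.maximum(gt, 1))
    mean_iou = np.mean(tp / np.maximum(gt + pred - tp, 1))
    return overall_acc, mean_acc, mean_iou

# Toy three-class example
conf = np.array([[50, 2, 1],
                 [3, 45, 2],
                 [0, 1, 46]])
print(segmentation_metrics(conf))
```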
Affiliation(s)
- Hui Zhang
- School of Intelligent Systems Engineering, Sun Yat-sen University, Shenzhen 518107, China
2. Tohidi F, Paul M, Ulhaq A, Chakraborty S. Improved Video-Based Point Cloud Compression via Segmentation. Sensors (Basel). 2024;24:4285. PMID: 39001064; PMCID: PMC11243880; DOI: 10.3390/s24134285.
Abstract
A point cloud is a representation of objects or scenes utilising unordered points comprising 3D positions and attributes. The ability of point clouds to mimic natural forms has gained significant attention from diverse applied fields, such as virtual reality and augmented reality. However, point clouds, especially those representing dynamic scenes or objects in motion, must be compressed efficiently due to their huge data volume. The latest video-based point cloud compression (V-PCC) standard for dynamic point clouds divides the 3D point cloud into many patches using computationally expensive normal estimation, segmentation, and refinement. The patches are projected onto a 2D plane to apply existing video coding techniques. This process often loses proximity information and some original points, inducing artefacts that adversely affect user perception. The proposed method segments dynamic point clouds based on shape similarity and occlusion before patch generation. This segmentation strategy helps maintain the points' proximity and retain more original points by exploiting the density and occlusion of the points. The experimental results establish that the proposed method significantly outperforms the V-PCC standard and other relevant methods in rate-distortion performance and subjective quality testing for both geometry and texture data of several benchmark video sequences.
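For orientation, V-PCC-style patch generation begins by estimating per-point normals and grouping points by the axis-aligned projection direction that best matches each normal. The following is a minimal sketch of that first step only; it omits the segmentation refinement and packing stages, and the function name is illustrative:

```python
import numpy as np
from scipy.spatial import cKDTree

# The six axis-aligned projection directions considered during patch generation.
AXES = np.array([[1, 0, 0], [-1, 0, 0],
                 [0, 1, 0], [0, -1, 0],
                 [0, 0, 1], [0, 0, -1]], dtype=np.float64)

def assign_projection_planes(points, k=16):
    """Estimate normals by local PCA over k nearest neighbours, then assign
    each point to the projection direction with the largest dot product."""
    tree = cKDTree(points)
    _, idx = tree.query(points, k=k)
    normals = np.empty_like(points)
    for i, nbrs in enumerate(idx):
        nbhd = points[nbrs] - points[nbrs].mean(axis=0)
        _, _, vt = np.linalg.svd(nbhd, full_matrices=False)
        normals[i] = vt[-1]                 # direction of least local variance
    # Signed dot against all six directions absorbs the sign ambiguity of PCA normals.
    return np.argmax(normals @ AXES.T, axis=1)

pts = np.random.rand(200, 3)
print(np.bincount(assign_projection_planes(pts), minlength=6))  # points per plane
```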
Affiliation(s)
- Faranak Tohidi
- School of Computing Mathematics and Engineering, Charles Sturt University, Bathurst, NSW 2795, Australia
- Manoranjan Paul
- School of Computing Mathematics and Engineering, Charles Sturt University, Bathurst, NSW 2795, Australia
- Anwaar Ulhaq
- School of Engineering and Technology, Centre for Intelligent Systems, Central Queensland University, Sydney Campus, Rockhampton, QLD 4701, Australia
- Subrata Chakraborty
- Faculty of Science, Agriculture, Business and Law, University of New England, Armidale, NSW 2351, Australia
- Centre for Advanced Modelling and Geospatial Information Systems (CAMGIS), Faculty of Engineering and Information Technology, University of Technology Sydney, Ultimo, NSW 2007, Australia
- Griffith Business School, Griffith University, Brisbane, QLD 4111, Australia
3. Abohassan M, El-Basyouny K. Leveraging LiDAR-Based Simulations to Quantify the Complexity of the Static Environment for Autonomous Vehicles in Rural Settings. Sensors (Basel). 2024;24:452. PMID: 38257547; PMCID: PMC10820782; DOI: 10.3390/s24020452.
Abstract
This paper uses virtual simulations to examine the interaction between autonomous vehicles (AVs) and their surrounding environment. A framework was developed to estimate the environment's complexity by calculating the real-time data processing requirements for AVs to navigate effectively. The VISTA simulator was used to synthesize viewpoints to replicate the captured environment accurately. With an emphasis on static physical features, roadways were dissected into relevant road features (RRFs) and the full environment (FE) to study the impact of roadside features on scene complexity and to demonstrate the severity of wildlife-vehicle collisions (WVCs) for AVs. The results indicate that roadside features substantially increase environmental complexity, by up to 400%. Adding a single lane to the road was observed to increase processing requirements by 12.3-16.5%. Crest vertical curves decrease data rates due to occlusion challenges, with a reported average of 4.2% data loss, while sag curves can increase complexity by 7%. In horizontal curves, roadside occlusion contributed to severe loss of road information, decreasing data rate requirements by as much as 19%. As for weather conditions, heavy rain increased the AV's processing demands by 240% compared to normal weather conditions. AV developers and government agencies can exploit these findings to better tailor AV designs and meet the necessary infrastructure requirements.
Affiliation(s)
- Mohamed Abohassan
- Department of Civil and Environmental Engineering, Faculty of Engineering, University of Alberta, Edmonton, AB T6G 1H9, Canada
4. Wang S, Jiang H, Qiao Y, Jiang S. A Method for Obtaining 3D Point Cloud Data by Combining 2D Image Segmentation and Depth Information of Pigs. Animals (Basel). 2023;13:2472. PMID: 37570282; PMCID: PMC10417003; DOI: 10.3390/ani13152472.
Abstract
This paper proposes a method for automatic pig detection and segmentation using RGB-D data for precision livestock farming. The proposed method combines an enhanced YOLOv5s model with the Res2Net bottleneck structure, improving fine-grained feature extraction and ultimately the precision of pig detection and segmentation in 2D images. Additionally, the method acquires 3D point cloud data of pigs in a simpler and more efficient way by combining the pig mask obtained in 2D detection and segmentation with depth information. To evaluate the effectiveness of the proposed method, two datasets were constructed. The first consists of 5400 images captured in various pig pens under diverse lighting conditions, while the second was obtained from the UK. The experimental results demonstrated that the improved YOLOv5s_Res2Net achieved a mAP@0.5:0.95 of 89.6% and 84.8% for the pig detection and segmentation tasks, respectively, on our dataset, and a mAP@0.5:0.95 of 93.4% and 89.4% on the Edinburgh pig behaviour dataset. This approach provides valuable insights for improving pig management, conducting welfare assessments, and estimating weight accurately.
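Combining a 2D segmentation mask with a registered depth map reduces to back-projecting the masked pixels through the camera intrinsics. A minimal sketch under a standard pinhole model; the intrinsic parameters fx, fy, cx, cy and the function name are assumptions for illustration, not from the paper:

```python
import numpy as np

def mask_depth_to_pointcloud(depth, mask, fx, fy, cx, cy):
    """Back-project masked depth pixels to 3D camera coordinates:
    X = (u - cx) * Z / fx,  Y = (v - cy) * Z / fy,  Z = depth."""
    v, u = np.nonzero(mask & (depth > 0))   # masked pixels with valid depth
    z = depth[v, u]
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=1)      # (N, 3) point cloud

# Toy example: flat depth map with a 2x2 mask region
depth = np.full((4, 4), 2.0)
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True
print(mask_depth_to_pointcloud(depth, mask, fx=500, fy=500, cx=2, cy=2))
```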
Affiliation(s)
- Shunli Wang
- College of Information Science and Engineering, Shandong Agricultural University, Tai’an 271018, China
- Honghua Jiang
- College of Information Science and Engineering, Shandong Agricultural University, Tai’an 271018, China
- Yongliang Qiao
- Australian Institute for Machine Learning (AIML), The University of Adelaide, Adelaide, SA 5005, Australia
- Shuzhen Jiang
- Key Laboratory of Efficient Utilisation of Non-Grain Feed Resources (Co-Construction by Ministry and Province), Ministry of Agriculture and Rural Affairs, Department of Animal Science and Technology, Shandong Agricultural University, Tai’an 271018, China
5. Mari D, Camuffo E, Milani S. CACTUS: Content-Aware Compression and Transmission Using Semantics for Automotive LiDAR Data. Sensors (Basel). 2023;23:5611. PMID: 37420777; DOI: 10.3390/s23125611.
Abstract
Many recent cloud or edge computing strategies for automotive applications require transmitting huge amounts of Light Detection and Ranging (LiDAR) data from terminals to centralized processing units. The development of effective Point Cloud (PC) compression strategies that preserve semantic information, which is critical for scene understanding, is therefore crucial. Segmentation and compression have traditionally been treated as two independent tasks; however, since not all semantic classes are equally important for the end task, this information can be used to guide data transmission. In this paper, we propose Content-Aware Compression and Transmission Using Semantics (CACTUS), a coding framework that exploits semantic information to optimize data transmission by partitioning the original point set into separate data streams. Experimental results show that, differently from traditional strategies, the independent coding of semantically consistent point sets preserves class information. Additionally, whenever semantic information needs to be transmitted to the receiver, the CACTUS strategy yields gains in compression efficiency and, more generally, improves the speed and flexibility of the baseline codec used to compress the data.
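The core idea of one independently coded stream per semantically consistent point set can be illustrated in a few lines. A hedged sketch: the priority map and names are illustrative, and the actual codec behind each stream is not shown:

```python
import numpy as np

def partition_by_semantics(points, labels, priority):
    """Split a labelled point cloud into one stream per semantic class,
    ordered so classes critical to the end task are transmitted first."""
    streams = {c: points[labels == c] for c in np.unique(labels)}
    return [(c, streams[c]) for c in sorted(streams, key=lambda c: priority.get(c, 99))]

# Toy example: prioritise 'car' and 'pedestrian' over 'vegetation'
pts = np.random.rand(100, 3)
lbl = np.random.choice(["car", "pedestrian", "vegetation"], size=100)
for cls, stream in partition_by_semantics(pts, lbl, {"car": 0, "pedestrian": 1}):
    print(cls, stream.shape)   # each stream goes to an independent coder
```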
Affiliation(s)
- Daniele Mari
- Department of Information Engineering, University of Padova, Via Gradenigo 6/A, 35131 Padova, Italy
- Elena Camuffo
- Department of Information Engineering, University of Padova, Via Gradenigo 6/A, 35131 Padova, Italy
- Simone Milani
- Department of Information Engineering, University of Padova, Via Gradenigo 6/A, 35131 Padova, Italy
6. Zhang C, Czarnuch S. Point cloud completion in challenging indoor scenarios with human motion. Front Robot AI. 2023;10:1184614. PMID: 37251352; PMCID: PMC10209708; DOI: 10.3389/frobt.2023.1184614.
Abstract
Combining and completing point cloud data from two or more sensors with arbitrary relative perspectives in a dynamic, cluttered, and complex environment is challenging, especially when the two sensors have significant perspective differences and neither a large overlap ratio nor a feature-rich scene can be guaranteed. We propose a novel approach for this challenging scenario that registers two time-series camera captures with unknown perspectives by exploiting human movements, making the system easy to use in real-life scenes. In our approach, we first reduce the six unknowns of 3D point cloud completion to three by aligning the ground planes found by our previous perspective-independent 3D ground plane estimation algorithm. Subsequently, we use a histogram-based approach to identify and extract all the humans from each frame, generating a three-dimensional (3D) human walking sequence in a time series. To enhance accuracy and performance, we convert 3D human walking sequences to lines by calculating the center of mass (CoM) point of each human body and connecting them. Finally, we match the walking paths in different data trials by minimizing the Fréchet distance between two walking paths and use 2D iterative closest point (ICP) to find the remaining three unknowns of the overall transformation matrix for the final alignment. Using this approach, we can successfully register the corresponding walking paths of the humans between the two cameras' captures and estimate the transformation matrix between the two sensors.
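The path-matching step relies on the Fréchet distance between two walking paths. Below is a minimal sketch of the standard discrete Fréchet distance via dynamic programming, treating each path as a 2D polyline of CoM points as the abstract describes; this is the textbook recurrence, not the authors' code:

```python
import numpy as np

def discrete_frechet(p, q):
    """Discrete Frechet distance between polylines p (n, 2) and q (m, 2)."""
    n, m = len(p), len(q)
    d = np.linalg.norm(p[:, None, :] - q[None, :, :], axis=2)  # pairwise distances
    ca = np.full((n, m), np.inf)
    ca[0, 0] = d[0, 0]
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                continue
            prev = min(ca[i - 1, j] if i > 0 else np.inf,
                       ca[i, j - 1] if j > 0 else np.inf,
                       ca[i - 1, j - 1] if i > 0 and j > 0 else np.inf)
            ca[i, j] = max(prev, d[i, j])   # weakest link along the best coupling
    return ca[-1, -1]

# Two noisy captures of the same walking path
path_a = np.array([[0, 0.0], [1, 0.1], [2, 0.0], [3, 0.2]])
path_b = np.array([[0, 0.1], [1, 0.0], [2, 0.1], [3, 0.1]])
print(discrete_frechet(path_a, path_b))
```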
Affiliation(s)
- Chengsi Zhang
- Department of Electrical and Computer Engineering, Faculty of Engineering and Applied Science, Memorial University of Newfoundland, St. John’s, NL, Canada
- Stephen Czarnuch
- Department of Electrical and Computer Engineering, Faculty of Engineering and Applied Science and the Discipline of Emergency Medicine, Faculty of Medicine, Memorial University of Newfoundland, St. John’s, NL, Canada
7. Kordež J, Marolt M, Bohak C. Real-Time Interpolated Rendering of Terrain Point Cloud Data. Sensors (Basel). 2022;23:72. PMID: 36616669; PMCID: PMC9824316; DOI: 10.3390/s23010072.
Abstract
Most real-time terrain point cloud rendering techniques do not address the empty space between the points but rather try to minimize it by rendering the points bigger or with more appropriate shapes, such as paraboloids. In this work, we propose an alternative approach to point cloud rendering that addresses the empty space between the points and fills it with appropriate values to achieve the best possible output. The proposed approach runs in real time and outperforms several existing point cloud rendering techniques in terms of speed and render quality.
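To make the idea concrete, here is one simple screen-space way to fill pixels left uncovered after point projection: iterative averaging of valid neighbours. This is only an assumed illustration of hole filling in general, not the interpolation scheme the paper proposes:

```python
import numpy as np

def fill_holes(depth, valid, iters=4):
    """Iteratively assign uncovered pixels the mean depth of their valid
    4-neighbours. Note: np.roll wraps at the borders; a production pass
    would pad the buffers instead."""
    depth, valid = depth.copy(), valid.copy()
    for _ in range(iters):
        acc = np.zeros_like(depth)
        cnt = np.zeros_like(depth)
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            sv = np.roll(valid, (dy, dx), axis=(0, 1))
            sd = np.roll(depth, (dy, dx), axis=(0, 1))
            acc += np.where(sv, sd, 0.0)
            cnt += sv
        fill = ~valid & (cnt > 0)           # empty pixels with covered neighbours
        depth[fill] = acc[fill] / cnt[fill]
        valid |= fill
    return depth

# Two projected points on an otherwise empty 8x8 buffer
depth = np.zeros((8, 8)); valid = np.zeros((8, 8), dtype=bool)
depth[2, 2], depth[5, 5] = 1.0, 2.0
valid[2, 2] = valid[5, 5] = True
print(fill_holes(depth, valid)[3, 3])
```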
Affiliation(s)
- Jaka Kordež
- Faculty of Computer and Information Science, University of Ljubljana, Večna Pot 113, 1000 Ljubljana, Slovenia
- Matija Marolt
- Faculty of Computer and Information Science, University of Ljubljana, Večna Pot 113, 1000 Ljubljana, Slovenia
- Ciril Bohak
- Faculty of Computer and Information Science, University of Ljubljana, Večna Pot 113, 1000 Ljubljana, Slovenia
- Visual Computing Center, King Abdullah University of Science and Technology, Thuwal 23955, Saudi Arabia
8. Rodziewicz-Bielewicz J, Korzeń M. Comparison of Graph Fitting and Sparse Deep Learning Model for Robot Pose Estimation. Sensors (Basel). 2022;22:6518. PMID: 36080976; PMCID: PMC9460051; DOI: 10.3390/s22176518.
Abstract
The paper presents a simple yet robust computer vision system for robot arm tracking with the use of RGB-D cameras. Tracking here means measuring, in real time, the robot state given by three angles, with known restrictions on the robot geometry. The tracking system consists of two parts: image preprocessing and machine learning. In the machine learning part, we compare two approaches: fitting the robot pose to the point cloud and fitting a convolutional neural network model to sparse 3D depth images. The advantage of the presented approach is the direct use of the point cloud, transformed into a sparse image, as the network input, together with sparse convolutional and pooling layers (sparse CNN). The experiments confirm that the robot tracking is performed in real time and with an accuracy comparable to that of the depth sensor.
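The point-cloud-to-sparse-image transform the abstract mentions can be sketched as a pinhole projection with a z-buffer, where untouched pixels stay zero. A hedged illustration; the intrinsics and function name are assumptions, not from the paper:

```python
import numpy as np

def to_sparse_depth_image(points, fx, fy, cx, cy, h, w):
    """Project an (N, 3) point cloud into a sparse depth image: most pixels
    remain zero, and each hit pixel keeps the depth of the nearest point."""
    z = points[:, 2]
    keep = z > 0                              # points in front of the camera
    u = np.round(points[keep, 0] * fx / z[keep] + cx).astype(int)
    v = np.round(points[keep, 1] * fy / z[keep] + cy).astype(int)
    inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    img = np.zeros((h, w))
    order = np.argsort(-z[keep][inside])      # write far to near: nearest wins
    img[v[inside][order], u[inside][order]] = z[keep][inside][order]
    return img

pts = np.array([[0.0, 0.0, 1.0], [0.1, 0.0, 2.0], [0.0, 0.0, 3.0]])
print(to_sparse_depth_image(pts, fx=100, fy=100, cx=32, cy=24, h=48, w=64).max())
```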
9. Multimodal Semantic Segmentation in Autonomous Driving: A Review of Current Approaches and Future Perspectives. Technologies. 2022. DOI: 10.3390/technologies10040090.
Abstract
The perception of the surrounding environment is a key requirement for autonomous driving systems, yet computing an accurate semantic representation of the scene from RGB information alone is very challenging. In particular, the lack of geometric information and the strong dependence on weather and illumination conditions introduce critical challenges for approaches tackling this task. For this reason, most autonomous cars exploit a variety of sensors, including color, depth or thermal cameras, LiDARs, and RADARs. How to efficiently combine all these sources of information to compute an accurate semantic description of the scene remains an unsolved task and an active research field. In this survey, we start by presenting the most commonly employed acquisition setups and datasets. We then review several deep learning architectures for multimodal semantic segmentation. We discuss the various techniques for combining color, depth, LiDAR, and other data modalities at different stages of the learning architectures, and we show how smart fusion strategies improve performance with respect to exploiting a single source of information.
10. Restoration of Individual Tree Missing Point Cloud Based on Local Features of Point Cloud. Remote Sensing. 2022. DOI: 10.3390/rs14061346.
Abstract
LiDAR (Light Detection And Ranging) technology is an important means of obtaining three-dimensional information on trees and vegetation. However, owing to the scanning mode, environmental occlusion, mutual occlusion between tree canopies, and other factors, a tree point cloud often suffers varying degrees of data loss, which hinders the high-precision quantitative extraction of vegetation parameters. To address missing data in tree laser point clouds, a restoration method for incomplete individual tree point clouds based on local point cloud features is proposed. The L1-Median algorithm is used to extract key points of the tree skeleton; the dominant direction of the skeleton key points and the local point cloud density are then calculated, and the point cloud near the missing area is moved based on these features to gradually compensate for the missing regions. The experimental results show that the method effectively and robustly repairs incomplete point clouds, adapts to individual tree point clouds with different geometric structures, and corrects branch topological connection errors.
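The L1-median at the core of the skeleton extraction is the geometric median of a local point set. A minimal sketch using the classical Weiszfeld iteration (a standard solver; the paper may use a different one, and the neighbourhood construction is not shown):

```python
import numpy as np

def l1_median(points, iters=50, eps=1e-9):
    """Geometric (L1) median of an (N, 3) point set via Weiszfeld iteration:
    the point minimising the sum of Euclidean distances to all inputs."""
    x = points.mean(axis=0)                 # start from the centroid
    for _ in range(iters):
        d = np.linalg.norm(points - x, axis=1)
        w = 1.0 / np.maximum(d, eps)        # inverse-distance weights
        x_new = (points * w[:, None]).sum(axis=0) / w.sum()
        if np.linalg.norm(x_new - x) < eps:
            break
        x = x_new
    return x

# Skeleton key point for a small neighbourhood along a branch
nbhd = np.array([[0.0, 0.0, 0.0], [0.1, 0.0, 1.0], [-0.1, 0.0, 2.0], [0.0, 0.1, 3.0]])
print(l1_median(nbhd))
```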