1
Ogunrinde I, Bernadin S. Improved DeepSORT-Based Object Tracking in Foggy Weather for AVs Using Semantic Labels and Fused Appearance Feature Network. Sensors (Basel) 2024;24:4692. PMID: 39066088; PMCID: PMC11280926; DOI: 10.3390/s24144692.
Abstract
The presence of fog in the background can prevent small and distant objects from being detected, let alone tracked. Under safety-critical conditions, multi-object tracking models require faster tracking speeds while maintaining high object-tracking accuracy. The original DeepSORT algorithm used YOLOv4 for the detection phase and a simple neural network for the deep appearance descriptor. Consequently, the generated feature map loses relevant details about the track being matched with a given detection in fog. Targets with a high degree of appearance similarity in the detection frame are more likely to be mismatched, resulting in identity switches or track failures in heavy fog. We propose an improved multi-object tracking model based on the DeepSORT algorithm that improves tracking accuracy and speed under foggy weather conditions. First, we employed our camera-radar fusion network (CR-YOLOnet) in the detection phase for faster and more accurate object detection. Second, we proposed an appearance feature network to replace the basic convolutional neural network, incorporating GhostNet in place of the traditional convolutional layers to generate more features while reducing computational complexity and cost. We also adopted a segmentation module and fed in the semantic labels of the corresponding input frame to add rich semantic information to the low-level appearance feature maps. Our proposed method outperformed YOLOv5 + DeepSORT with a 35.15% increase in multi-object tracking accuracy, a 32.65% increase in multi-object tracking precision, a 37.56% increase in speed, and a 46.81% decrease in identity switches.
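A minimal sketch, assuming NumPy, of the cosine-distance appearance matching that DeepSORT-style trackers perform between a track's stored features and new detection embeddings; the function name, shapes, and gate value are illustrative assumptions, not the paper's CR-YOLOnet/GhostNet implementation:

```python
import numpy as np

def cosine_cost_matrix(track_gallery, detections, eps=1e-12):
    """cost[i, j] = smallest cosine distance between any stored
    appearance feature of track i and detection j's embedding.
    track_gallery: list of (K_i, D) arrays; detections: (M, D) array."""
    dets = detections / (np.linalg.norm(detections, axis=1, keepdims=True) + eps)
    cost = np.empty((len(track_gallery), len(detections)))
    for i, feats in enumerate(track_gallery):
        feats = feats / (np.linalg.norm(feats, axis=1, keepdims=True) + eps)
        # 1 - best similarity over the track's feature history
        cost[i] = 1.0 - (feats @ dets.T).max(axis=0)
    return cost

# Pairs whose cost exceeds a matching gate (e.g. 0.2) are rejected; the
# remainder are assigned with the Hungarian algorithm
# (scipy.optimize.linear_sum_assignment).
```

A richer appearance descriptor lowers these costs for true matches under fog, which is why fewer identity switches follow from a better feature network.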
Affiliation(s)
- Isaac Ogunrinde
- Department of Electrical and Computer Engineering, FAMU-FSU College of Engineering, Tallahassee, FL 32310, USA
2
Denis L, Royen R, Bolsée Q, Vercheval N, Pižurica A, Munteanu A. GPU Rasterization-Based 3D LiDAR Simulation for Deep Learning. Sensors (Basel) 2023;23:8130. PMID: 37836959; PMCID: PMC10574882; DOI: 10.3390/s23198130.
Abstract
High-quality data are of utmost importance for any deep-learning application. However, acquiring such data and annotating them is challenging. This paper presents a GPU-accelerated simulator that enables the generation of high-quality, perfectly labelled data for any time-of-flight sensor, including LiDAR. Our approach optimally exploits the 3D graphics pipeline of the GPU, significantly decreasing data generation time while preserving compatibility with all real-time rendering engines. The presented algorithms are generic and allow users to perfectly mimic the unique sampling pattern of any such sensor. To validate our simulator, two neural networks are trained for denoising and semantic segmentation. To bridge the gap between reality and simulation, a novel loss function is introduced that requires only a small set of partially annotated real data. It enables the learning of classes for which no labels are provided in the real data, hence dramatically reducing annotation efforts. With this work, we hope to provide a means of alleviating the data-acquisition problem pertinent to deep-learning applications.
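A minimal sketch, not taken from the paper, of the core idea that a rasterized depth buffer can be unprojected into a labelled point cloud; the pinhole intrinsics below are an assumption, whereas the paper's pipeline mimics arbitrary sensor sampling patterns:

```python
import numpy as np

def depth_buffer_to_points(depth, fx, fy, cx, cy):
    """Unproject an (H, W) depth buffer into an (N, 3) point cloud
    using pinhole camera intrinsics (fx, fy, cx, cy)."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth.ravel()
    x = (u.ravel() - cx) * z / fx
    y = (v.ravel() - cy) * z / fy
    points = np.stack([x, y, z], axis=1)
    return points[z > 0]  # keep only pixels where a surface was hit

# Rendering per-pixel semantic IDs into a second buffer yields labels
# that are perfectly aligned with the simulated returns, which is what
# makes the generated data "perfectly labelled".
```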
Affiliation(s)
- Leon Denis
- Department of Electronics and Informatics (ETRO), Vrije Universiteit Brussel, Pleinlaan 2, 1050 Brussels, Belgium
- Imec, Kapeldreef 75, 3001 Leuven, Belgium
- Remco Royen
- Department of Electronics and Informatics (ETRO), Vrije Universiteit Brussel, Pleinlaan 2, 1050 Brussels, Belgium
- Imec, Kapeldreef 75, 3001 Leuven, Belgium
- Quentin Bolsée
- Department of Electronics and Informatics (ETRO), Vrije Universiteit Brussel, Pleinlaan 2, 1050 Brussels, Belgium
- Imec, Kapeldreef 75, 3001 Leuven, Belgium
- Nicolas Vercheval
- Department of Telecommunications and Information Processing (TELIN-GAIM), Ghent University, 9000 Ghent, Belgium
- Department of Electronics and Information Systems, Clifford Research Group, Ghent University, 9000 Ghent, Belgium
- Aleksandra Pižurica
- Department of Telecommunications and Information Processing (TELIN-GAIM), Ghent University, 9000 Ghent, Belgium
- Adrian Munteanu
- Department of Electronics and Informatics (ETRO), Vrije Universiteit Brussel, Pleinlaan 2, 1050 Brussels, Belgium
- Imec, Kapeldreef 75, 3001 Leuven, Belgium
3
Semantic Segmentation of Agricultural Images Based on Style Transfer Using Conditional and Unconditional Generative Adversarial Networks. Applied Sciences (Basel) 2022. DOI: 10.3390/app12157785.
Abstract
Classification, segmentation, and recognition techniques based on deep-learning algorithms are used for smart farming. Reducing the time, burden, and cost of annotating datasets collected from fields and crops is an important and challenging task, because the crops change in a wide variety of ways with growth stage, weather, and season. This study was conducted to generate crop image datasets for semantic segmentation based on image style transfer using generative adversarial networks (GANs). To assess data-augmentation performance and computational burden, our proposed framework comprises contrastive unpaired translation (CUT) for a conditional GAN, pix2pixHD for an unconditional GAN, and DeepLabV3+ for semantic segmentation. Using these networks, the proposed framework provides not only image generation for data augmentation but also automatic labeling based on distinctive feature learning among domains. The Fréchet inception distance (FID) and mean intersection over union (mIoU) were used as evaluation metrics for the GANs and for semantic segmentation, respectively. We used one public benchmark dataset and two original benchmark datasets to evaluate our framework across four image-augmentation types against a baseline without GANs. The experimental results, evaluated using FID and mIoU, showed the efficacy of using augmented images. For the public benchmark dataset, the mIoU scores improved by 0.03 on the training subset while remaining similar on the test subset. For the first original benchmark dataset, the mIoU scores improved by 0.01 on the test subset but dropped by 0.03 on the training subset. Finally, the mIoU scores for the second original benchmark dataset improved by 0.18 on the training subset and 0.03 on the test subset.
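A minimal sketch of the mIoU metric reported above, assuming integer label maps of equal shape; the class count and array shapes are illustrative, not tied to the paper's datasets:

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union, averaged over classes that appear
    in either the predicted or the ground-truth label map."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, target == c).sum()
        union = np.logical_or(pred == c, target == c).sum()
        if union > 0:  # skip classes absent from both maps
            ious.append(inter / union)
    return float(np.mean(ious)) if ious else 0.0
```

An improvement of 0.03 in this score therefore means the per-class overlap between predicted and ground-truth regions rose by three percentage points on average.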