1
Bader C, Schwieger V. Advancing ADAS Perception: A Sensor-Parameterized Implementation of the GM-PHD Filter. Sensors (Basel) 2024; 24:2436. [PMID: 38676054 PMCID: PMC11054760 DOI: 10.3390/s24082436] [Received: 02/17/2024] [Revised: 04/02/2024] [Accepted: 04/07/2024] [Indexed: 04/28/2024]
Abstract
Modern vehicles equipped with Advanced Driver Assistance Systems (ADAS) rely heavily on sensor fusion to achieve a comprehensive understanding of their surrounding environment. Traditionally, the Kalman Filter (KF) has been a popular choice for this purpose, but it necessitates complex data association and track management to ensure accurate results. To avoid the errors introduced by these processes, the Gaussian Mixture Probability Hypothesis Density (GM-PHD) filter is an attractive alternative, as it implicitly handles the association and the appearance/disappearance of tracks. The approach presented here can replace KF frameworks in many applications while achieving runtimes below 1 ms on the test system. The key innovations lie in the use of sensor-based parameter models to implicitly handle varying Fields of View (FoV) and sensing capabilities. These models represent sensor-specific properties, such as detection probability and clutter density, across the state space. Additionally, we introduce a method for propagating additional track properties, such as classification, with the GM-PHD filter, further contributing to its versatility and applicability. The proposed GM-PHD filter surpasses a KF approach on the KITTI dataset and another custom dataset: the mean OSPA(2) error was reduced from 1.56 (KF approach) to 1.40 (GM-PHD approach), showcasing its potential in ADAS perception.
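The association-free behavior described above comes from the GM-PHD measurement update, which weights every Gaussian component against every detection instead of committing to hard assignments. A minimal one-dimensional sketch, with constant detection probability and clutter intensity (the paper's sensor-parameterized models would make `p_d` and `clutter` functions of the state; all names and simplifications here are ours, not the paper's code):

```python
import math

def gauss(x, mean, var):
    """Scalar Gaussian density N(x; mean, var)."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)

def gmphd_update(components, z, p_d=0.9, clutter=0.1, r=1.0):
    """One GM-PHD measurement update in 1D.
    components: list of (weight, mean, var) triples of the predicted intensity.
    Returns the posterior mixture: missed-detection terms plus one
    Kalman-updated term per prior component, with detected weights
    normalised against the clutter intensity."""
    # Missed-detection hypotheses keep the prior moments, scaled by (1 - p_d).
    updated = [((1 - p_d) * w, m, v) for (w, m, v) in components]
    # Detection hypotheses: Kalman update of each component against z.
    detected = []
    for (w, m, v) in components:
        s = v + r                      # innovation variance
        q = gauss(z, m, s)             # measurement likelihood
        k = v / s                      # Kalman gain
        detected.append((p_d * w * q, m + k * (z - m), (1 - k) * v))
    norm = clutter + sum(w for (w, _, _) in detected)
    updated.extend((w / norm, m, v) for (w, m, v) in detected)
    return updated
```

The expected number of objects is the sum of the posterior weights, which is how the filter handles track birth and death without explicit track management.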
Affiliation(s)
- Christian Bader
- Institute of Engineering Geodesy, University of Stuttgart, Geschwister-Scholl-Str. 24D, 70174 Stuttgart, Germany
- Daimler Truck AG, Fasanenweg 10, 70771 Leinfelden-Echterdingen, Germany
- Volker Schwieger
- Institute of Engineering Geodesy, University of Stuttgart, Geschwister-Scholl-Str. 24D, 70174 Stuttgart, Germany
2
Zhao R, Zhang X, Zhang J. PSMOT: Online Occlusion-Aware Multi-Object Tracking Exploiting Position Sensitivity. Sensors (Basel) 2024; 24:1199. [PMID: 38400356 PMCID: PMC10892105 DOI: 10.3390/s24041199] [Received: 12/07/2023] [Revised: 01/25/2024] [Accepted: 02/06/2024] [Indexed: 02/25/2024]
Abstract
Models based on joint detection and re-identification (ReID), which significantly increase the efficiency of online multi-object tracking (MOT) systems, are an evolution from separate detection and ReID models in the tracking-by-detection (TBD) paradigm. These joint models are typically one-stage, while two-stage models have become obsolete because of their slow speed and low efficiency. However, two-stage models have inherent advantages over one-stage anchor-based and anchor-free models in handling feature misalignment and occlusion, which suggests that, with meticulous design, two-stage models could be on par with state-of-the-art one-stage models. Following this intuition, we propose a robust and efficient two-stage joint model based on R-FCN, whose backbone and neck are fully convolutional and whose RoI-wise processing involves only simple calculations. In the first stage, an adaptive sparse anchoring scheme is utilized to produce adequate, high-quality proposals to improve efficiency. To boost both detection and ReID, two key elements, feature aggregation and feature disentanglement, are taken into account. To improve robustness against occlusion, position sensitivity is exploited, first to estimate occlusion and then to direct the anti-occlusion post-processing. Finally, we link the model to a hierarchical association algorithm to form a complete MOT system called PSMOT. Compared to other cutting-edge systems, PSMOT achieves competitive performance while maintaining time efficiency.
Affiliation(s)
- Ranyang Zhao
- National Key Laboratory of Fundamental Science on Synthetic Vision, Sichuan University, Chengdu 610065, China
- Jianwei Zhang
- National Key Laboratory of Fundamental Science on Synthetic Vision, Sichuan University, Chengdu 610065, China
3
Zhou H, Chung S, Kakar JK, Kim SC, Kim H. Pig Movement Estimation by Integrating Optical Flow with a Multi-Object Tracking Model. Sensors (Basel) 2023; 23:9499. [PMID: 38067875 PMCID: PMC10708576 DOI: 10.3390/s23239499] [Received: 10/10/2023] [Revised: 11/23/2023] [Accepted: 11/27/2023] [Indexed: 12/18/2023]
Abstract
Pig husbandry is a significant segment of livestock farming, and porcine well-being is a paramount concern because of its direct implications for pig breeding and production. An easily observable proxy for pig health is the daily movement pattern: more active pigs are usually healthier than inactive ones, so movement lets farmers identify a pig's declining health before it becomes sick or its condition becomes life-threatening. However, conventional means of estimating pig mobility rely largely on manual observation by farmers, which is impractical in contemporary centralized and extensive pig-farming operations. In response to these challenges, multi-object tracking and pig-behavior methods have been adopted to monitor pig health and welfare closely. Unfortunately, these existing methods frequently fall short of providing precise, quantified measurements of movement distance, yielding only a rudimentary metric of pig health. This paper proposes a novel approach that integrates optical flow with a multi-object tracking algorithm to gauge pig movement more accurately, based on both qualitative and quantitative analyses of the shortcomings of relying solely on tracking algorithms. Optical flow records accurate movement between two consecutive frames, and the multi-object tracking algorithm provides an individual track for each pig; combining the two allows each pig's movement to be estimated accurately. Moreover, incorporating optical flow makes it possible to discern partial movements, such as instances where only the pig's head is in motion while the rest of its body remains stationary.
The experimental results show that the proposed method is superior to using tracking results (i.e., bounding boxes) alone: movement calculated from bounding boxes is easily affected by box-size fluctuations, whereas optical flow avoids this drawback and provides more fine-grained motion information. The proposed method therefore yields more accurate and comprehensive information, enhancing decision-making and management in pig farming.
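The contrast drawn above, box-center displacement versus flow aggregated inside the box, can be illustrated with a small sketch (the dense flow field is stored as a dict for brevity; the function names are ours, not the paper's):

```python
def bbox_center_displacement(box_a, box_b):
    """Movement estimate from bounding boxes alone: centre shift between
    two frames. Sensitive to box-size fluctuation, since a resized box
    moves its centre even when the animal does not move."""
    (ax1, ay1, ax2, ay2), (bx1, by1, bx2, by2) = box_a, box_b
    dx = (bx1 + bx2) / 2 - (ax1 + ax2) / 2
    dy = (by1 + by2) / 2 - (ay1 + ay2) / 2
    return (dx ** 2 + dy ** 2) ** 0.5

def flow_based_movement(flow, box):
    """Movement estimate from a dense flow field: mean flow magnitude over
    the pixels inside the tracked box. flow maps (x, y) -> (u, v), so
    partial motion (e.g. only the head moving) still registers."""
    x1, y1, x2, y2 = box
    total, n = 0.0, 0
    for (x, y), (u, v) in flow.items():
        if x1 <= x <= x2 and y1 <= y <= y2:
            total += (u ** 2 + v ** 2) ** 0.5
            n += 1
    return total / n if n else 0.0
```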
Affiliation(s)
- Heng Zhou
- Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju 54896, Republic of Korea
- Core Research Institute of Intelligent Robots, Jeonbuk National University, Jeonju 54896, Republic of Korea
- Seyeon Chung
- Core Research Institute of Intelligent Robots, Jeonbuk National University, Jeonju 54896, Republic of Korea
- Junaid Khan Kakar
- Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju 54896, Republic of Korea
- Core Research Institute of Intelligent Robots, Jeonbuk National University, Jeonju 54896, Republic of Korea
- Sang Cheol Kim
- Core Research Institute of Intelligent Robots, Jeonbuk National University, Jeonju 54896, Republic of Korea
- Hyongsuk Kim
- Core Research Institute of Intelligent Robots, Jeonbuk National University, Jeonju 54896, Republic of Korea
- Department of Electronics Engineering, Jeonbuk National University, Jeonju 54896, Republic of Korea
4
Syed MAB, Ahmed I. A CNN-LSTM Architecture for Marine Vessel Track Association Using Automatic Identification System (AIS) Data. Sensors (Basel) 2023; 23:6400. [PMID: 37514694 PMCID: PMC10386167 DOI: 10.3390/s23146400] [Received: 06/05/2023] [Revised: 07/06/2023] [Accepted: 07/12/2023] [Indexed: 07/30/2023]
Abstract
In marine surveillance, distinguishing between normal and anomalous vessel movement patterns is critical for identifying potential threats in a timely manner. Once detected, it is important to monitor and track these vessels until a necessary intervention occurs. To achieve this, track association algorithms are used, which take sequential observations comprising the geolocation and motion parameters of the vessels and associate them with the respective vessels. The spatial and temporal variations inherent in these sequential observations make the association task challenging for traditional multi-object tracking algorithms. Additionally, the presence of overlapping tracks and missing data can further complicate trajectory tracking. To address these challenges, we approach the tracking task as a multivariate time-series problem and introduce a 1D CNN-LSTM-based framework for track association. This neural network architecture captures the spatial patterns as well as the long-term temporal relations that exist among the sequential observations. During training, it learns and builds the trajectory of each underlying vessel. Once trained, the proposed framework takes a marine vessel's location and motion data, collected through the automatic identification system (AIS), as input and returns the most likely vessel track as output in real time. To evaluate the performance of our approach, we utilize an AIS dataset containing observations from 327 vessels traveling in a specific geographic region, measuring performance with standard metrics such as accuracy, precision, recall, and F1 score. Compared with other competitive neural network architectures, our approach demonstrates superior tracking performance.
Affiliation(s)
- Md Asif Bin Syed
- Industrial and Management Systems Engineering Department, West Virginia University, Morgantown, WV 26506, USA
- Imtiaz Ahmed
- Industrial and Management Systems Engineering Department, West Virginia University, Morgantown, WV 26506, USA
5
Park J, Hong J, Shim W, Jung DJ. Multi-Object Tracking on SWIR Images for City Surveillance in an Edge-Computing Environment. Sensors (Basel) 2023; 23:6373. [PMID: 37514671 PMCID: PMC10385020 DOI: 10.3390/s23146373] [Received: 06/26/2023] [Revised: 07/07/2023] [Accepted: 07/09/2023] [Indexed: 07/30/2023]
Abstract
Although Short-Wave Infrared (SWIR) sensors are robust in bad weather and low-light conditions, SWIR images have not been well studied for automated object detection and tracking systems. The majority of previous multi-object tracking studies have focused on pedestrian tracking in visible-spectrum images, but tracking different types of vehicles is also important in city-surveillance scenarios. In addition, previous studies were based on high-computing-power environments such as GPU workstations or servers, whereas edge computing should be considered in city surveillance to reduce network-bandwidth usage and mitigate privacy concerns. In this paper, we propose a fast and effective multi-object tracking method, called Multi-Class Distance-based Tracking (MCDTrack), on SWIR images of city-surveillance scenarios in a low-power, low-computation edge-computing environment. Eight-bit integer quantized object-detection models are used, and simple distance- and IoU-based similarity scores are employed to realize effective multi-object tracking on the edge device. MCDTrack is not only superior to previous multi-object tracking methods but also achieves high tracking accuracy of 77.5% MOTA and 80.2% IDF1, even though object detection and tracking are performed on the edge-computing device. Our results indicate that a robust city-surveillance solution can be built on an edge-computing environment and low-frame-rate SWIR images.
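The two scores reported above are standard MOT metrics computed from error counts accumulated over all frames; a small sketch of their textbook definitions (generic formulas, not code from the paper):

```python
def mota(false_negatives, false_positives, id_switches, num_gt):
    """Multi-Object Tracking Accuracy: 1 - (FN + FP + IDSW) / GT, where the
    counts are summed over all frames and GT is the total number of
    ground-truth object instances. Can be negative when errors exceed GT."""
    return 1.0 - (false_negatives + false_positives + id_switches) / num_gt

def idf1(idtp, idfp, idfn):
    """ID F1 score: harmonic mean of ID precision and ID recall, computed
    from identity-level true positives, false positives, and false
    negatives after an optimal one-to-one identity matching."""
    return 2 * idtp / (2 * idtp + idfp + idfn)
```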
Affiliation(s)
- Jihun Park
- A2Mind Inc., Daejeon 34087, Republic of Korea
- Wooil Shim
- A2Mind Inc., Daejeon 34087, Republic of Korea
6
Chen X, Pu H, He Y, Lai M, Zhang D, Chen J, Pu H. An Efficient Method for Monitoring Birds Based on Object Detection and Multi-Object Tracking Networks. Animals (Basel) 2023; 13:1713. [PMID: 37238144 DOI: 10.3390/ani13101713] [Received: 03/21/2023] [Revised: 05/14/2023] [Accepted: 05/16/2023] [Indexed: 05/28/2023]
Abstract
To protect birds, it is crucial to identify their species and determine their populations across different regions. Currently, however, bird monitoring relies mainly on manual techniques, such as point counts conducted by researchers and ornithologists in the field. This can be inefficient, prone to error, and limited in scope, which is not always conducive to bird conservation efforts. In this paper, we propose an efficient method for wetland bird monitoring based on object-detection and multi-object tracking networks. First, we construct a manually annotated dataset for bird-species detection, annotating the entire body and the head of each bird separately, comprising 3737 bird images. We also build a new dataset containing 11,139 complete, individual bird images for the multi-object tracking task. Second, we perform comparative experiments with a batch of state-of-the-art object-detection networks; the results demonstrate that the YOLOv7 network, trained on the dataset labeling the entire body of the bird, is the most effective. To enhance YOLOv7's performance, we add three GAM modules to the head of YOLOv7 to minimize information diffusion and amplify global interaction representations, and we utilize the Alpha-IoU loss to achieve more accurate bounding-box regression. The experimental results reveal that the improved method offers greater accuracy, with mAP@0.5 improving to 0.951 and mAP@0.5:0.95 improving to 0.815. We then feed the detections to DeepSORT for bird tracking and classification counting. Finally, we use an area-counting method to count birds by species and thereby obtain information about flock distribution. The method described in this paper effectively addresses the monitoring challenges in bird conservation.
Affiliation(s)
- Xian Chen
- College of Information Engineering, Sichuan Agricultural University, Ya'an 625000, China
- Hongli Pu
- College of Information Engineering, Sichuan Agricultural University, Ya'an 625000, China
- Yihui He
- College of Information Engineering, Sichuan Agricultural University, Ya'an 625000, China
- Mengzhen Lai
- College of Information Engineering, Sichuan Agricultural University, Ya'an 625000, China
- Daike Zhang
- College of Information Engineering, Sichuan Agricultural University, Ya'an 625000, China
- Junyang Chen
- College of Information Engineering, Sichuan Agricultural University, Ya'an 625000, China
- Haibo Pu
- College of Information Engineering, Sichuan Agricultural University, Ya'an 625000, China
- Ya'an Digital Agricultural Engineering Technology Research Center, Ya'an 625000, China
7
Li Q, Hu S, Shimasaki K, Ishii I. An Active Multi-Object Ultrafast Tracking System with CNN-Based Hybrid Object Detection. Sensors (Basel) 2023; 23:4150. [PMID: 37112491 PMCID: PMC10145589 DOI: 10.3390/s23084150] [Received: 03/10/2023] [Revised: 04/09/2023] [Accepted: 04/11/2023] [Indexed: 06/19/2023]
Abstract
This study proposes a visual tracking system that can detect and track multiple fast-moving appearance-varying targets simultaneously with 500 fps image processing. The system comprises a high-speed camera and a pan-tilt galvanometer system, which can rapidly generate large-scale high-definition images of the wide monitored area. We developed a CNN-based hybrid tracking algorithm that can robustly track multiple high-speed moving objects simultaneously. Experimental results demonstrate that our system can track up to three moving objects with velocities lower than 30 m per second simultaneously within an 8-m range. The effectiveness of our system was demonstrated through several experiments conducted on simultaneous zoom shooting of multiple moving objects (persons and bottles) in a natural outdoor scene. Moreover, our system demonstrates high robustness to target loss and crossing situations.
Affiliation(s)
- Idaku Ishii
- Correspondence: ; Tel.: +81-82-424-7692; Fax: +81-82-422-7158
8
Hou H, Shen C, Zhang X, Gao W. CSMOT: Make One-Shot Multi-Object Tracking in Crowded Scenes Great Again. Sensors (Basel) 2023; 23:3782. [PMID: 37050842 PMCID: PMC10098982 DOI: 10.3390/s23073782] [Received: 02/20/2023] [Revised: 03/24/2023] [Accepted: 04/04/2023] [Indexed: 06/19/2023]
Abstract
The current popular one-shot multi-object tracking (MOT) algorithms are dominated by the joint detection and embedding paradigm, which offers high inference speed and accuracy, but their tracking performance is unstable in crowded scenes. Not only does the detection branch have difficulty obtaining accurate object positions, but the ambiguous appearance features extracted by the re-identification (re-ID) branch also lead to identity switches. Focusing on these problems, this paper proposes a more robust MOT algorithm, named CSMOT, based on FairMOT. First, on top of the encoder-decoder network, a coordinate attention module is designed to enhance the information interaction between channels (horizontal and vertical coordinates), which improves object-detection ability. Then, an angle-center loss that effectively maximizes intra-class similarity is proposed to optimize the re-ID branch, making the extracted re-ID features more discriminative. We further redesign the re-ID feature dimension to balance the detection and re-ID tasks. Finally, a simple and effective data-association mechanism is introduced, which associates every detection, rather than only the high-score detections, during tracking. The experimental results show that our one-shot MOT algorithm achieves excellent tracking performance on multiple public datasets and can be effectively applied to crowded scenes. In particular, CSMOT decreases the number of ID switches by 11.8% and 33.8% on the MOT16 and MOT17 test datasets, respectively, compared to the baseline.
Affiliation(s)
- Haoxiong Hou
- Xi’an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences, Xi’an 710119, China
- University of Chinese Academy of Sciences, Beijing 101408, China
- Chao Shen
- Xi’an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences, Xi’an 710119, China
- University of Chinese Academy of Sciences, Beijing 101408, China
- Ximing Zhang
- Xi’an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences, Xi’an 710119, China
- Wei Gao
- Xi’an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences, Xi’an 710119, China
9
Mohammed SAK, Razak MZA, Rahman AHA. 3D-DIoU: 3D Distance Intersection over Union for Multi-Object Tracking in Point Cloud. Sensors (Basel) 2023; 23:3390. [PMID: 37050449 PMCID: PMC10098770 DOI: 10.3390/s23073390] [Received: 12/23/2022] [Revised: 03/05/2023] [Accepted: 03/08/2023] [Indexed: 06/19/2023]
Abstract
Multi-object tracking (MOT) is a prominent and important topic in point-cloud processing and computer vision. The main objective of MOT is to predict full tracklets of several objects in a point cloud. Occlusion and similar-looking objects are two common problems that reduce an algorithm's performance throughout the tracking phase. The tracking performance of current MOT techniques that adopt the 'tracking-by-detection' paradigm degrades, as evidenced by increasing numbers of identification (ID) switches and tracking drifts, because it is difficult to predict object locations perfectly in complex scenes. Since an occluded object may have been visible in former frames, we use the object's speed and position in the previous frames to estimate where the occluded object might be. In this paper, we employ a distance-based intersection over union (IoU) method in three dimensions (3D): a distance-IoU non-maximum suppression (DIoU-NMS) to accurately detect objects, and consequently 3D-DIoU for the object-association process, to increase tracking robustness and speed. By using this hybrid 3D DIoU-NMS and 3D-DIoU method, the tracking speed improved significantly. Experimental findings on the Waymo Open Dataset and the nuScenes dataset demonstrate that our multistage data-association and tracking technique has clear benefits over previously developed algorithms in terms of tracking accuracy. In comparison with other 3D MOT methods, the proposed approach demonstrates significant enhancement in tracking performance.
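A plausible form of the 3D-DIoU score used for association above is standard DIoU extended to axis-aligned 3D boxes: volumetric IoU penalized by the normalized center distance. This is a sketch of the generic formula; the paper's exact box parameterization (e.g. heading angle) may differ:

```python
def diou_3d(a, b):
    """Distance-IoU for axis-aligned 3D boxes given as (x1, y1, z1, x2, y2, z2):
    volumetric IoU minus the squared centre distance normalised by the
    squared diagonal of the smallest enclosing box. Unlike plain IoU, the
    score stays informative (negative) for non-overlapping boxes."""
    # Intersection volume: product of per-axis overlaps.
    inter = 1.0
    for i in range(3):
        lo = max(a[i], b[i])
        hi = min(a[i + 3], b[i + 3])
        inter *= max(0.0, hi - lo)
    vol = lambda box: (box[3] - box[0]) * (box[4] - box[1]) * (box[5] - box[2])
    iou = inter / (vol(a) + vol(b) - inter)
    # Squared centre distance and squared enclosing-box diagonal.
    rho2 = sum(((a[i] + a[i + 3]) / 2 - (b[i] + b[i + 3]) / 2) ** 2 for i in range(3))
    c2 = sum((max(a[i + 3], b[i + 3]) - min(a[i], b[i])) ** 2 for i in range(3))
    return iou - rho2 / c2
```

Because distant boxes get increasingly negative scores, the same quantity works both as an NMS suppression criterion and as an association affinity.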
Affiliation(s)
- Sazan Ali Kamal Mohammed
- Institute of Microengineering and Nanoelectronics (IMEN), Universiti Kebangsaan Malaysia, Bangi 43600, Malaysia
- Department of Automotive Technology, Erbil Technology College, Erbil Polytechnic University, Erbil 44001, Iraq
- Mohd Zulhakimi Ab Razak
- Institute of Microengineering and Nanoelectronics (IMEN), Universiti Kebangsaan Malaysia, Bangi 43600, Malaysia
- Abdul Hadi Abd Rahman
- Center for Artificial Intelligence Technology, Universiti Kebangsaan Malaysia, Bangi 43600, Malaysia
10
Zhou X, Chan S, Qiu C, Jiang X, Tang T. Multi-Target Tracking Based on a Combined Attention Mechanism and Occlusion Sensing in a Behavior-Analysis System. Sensors (Basel) 2023; 23:2956. [PMID: 36991667 PMCID: PMC10056893 DOI: 10.3390/s23062956] [Received: 02/09/2023] [Revised: 03/02/2023] [Accepted: 03/02/2023] [Indexed: 06/19/2023]
Abstract
Multi-object tracking (MOT) is a topic of great interest in the field of computer vision and is essential in smart behavior-analysis systems for healthcare, such as human-flow monitoring, crime analysis, and behavior warnings. Most MOT methods achieve stability by combining object-detection and re-identification networks. However, MOT requires high efficiency and accuracy in complex environments with occlusions and interference, which often increases algorithmic complexity, slows tracking calculations, and reduces real-time performance. In this paper, we present an improved MOT method combining an attention mechanism and occlusion sensing. A convolutional block attention module (CBAM) calculates spatial and channel attention weights from the feature map; these weights are used to fuse the feature maps and adaptively extract robust object representations. An occlusion-sensing module detects when an object is occluded, and the appearance characteristics of an occluded object are not updated. This enhances the model's ability to extract object features and mitigates the appearance-feature pollution caused by short-term occlusion. Experiments on public datasets demonstrate the competitive performance of the proposed method compared with state-of-the-art MOT methods. The experimental results show that our method has powerful data-association capability, e.g., 73.2% MOTA and 73.9% IDF1 on the MOT17 dataset.
Affiliation(s)
- Xiaolong Zhou
- College of Electrical and Information Engineering at Quzhou University, Quzhou 324000, China
- Key Lab of Spatial Data Mining & Information Sharing of Ministry of Education, Fuzhou 350108, China
- Sixian Chan
- College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou 310023, China
- Hubei Key Laboratory of Intelligent Vision-Based Monitoring for Hydroelectric Engineering, The College of Computer and Information at China Three Gorges University, Yichang 443002, China
- Chenhao Qiu
- College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou 310023, China
- Xiaodan Jiang
- College of Electrical and Information Engineering at Quzhou University, Quzhou 324000, China
- Tinglong Tang
- Hubei Key Laboratory of Intelligent Vision-Based Monitoring for Hydroelectric Engineering, The College of Computer and Information at China Three Gorges University, Yichang 443002, China
11
Das A, Hameed M, Prather R, Farias M, Divo E, Kassab A, Nykanen D, DeCampli W. In-Silico and In-Vitro Analysis of the Novel Hybrid Comprehensive Stage II Operation for Single Ventricle Circulation. Bioengineering (Basel) 2023; 10:135. [PMID: 36829630 PMCID: PMC9952694 DOI: 10.3390/bioengineering10020135] [Received: 08/26/2022] [Revised: 12/22/2022] [Accepted: 01/05/2023] [Indexed: 01/20/2023]
Abstract
Single ventricle (SV) anomalies account for one-fourth of all congenital heart disease cases. The existing palliative treatment for this anomaly achieves a survival rate of only 50%. To reduce the trauma associated with surgical management, the hybrid comprehensive stage II (HCSII) operation was designed as an alternative for a select subset of SV patients with adequate antegrade aortic flow. This study aims to provide better insight into the hemodynamics of HCSII patients utilizing a multiscale Computational Fluid Dynamics (CFD) model and a mock flow loop (MFL). Both the 3D-0D loosely coupled CFD and MFL models were tuned to match baseline hemodynamic parameters obtained from patient-specific catheterization data. The hemodynamic findings from clinical data closely match the in-vitro and in-silico measurements and show a strong correlation (r = 0.9). The geometrical modification applied to the models had little effect on oxygen delivery. Similarly, the particle residence time study reveals that particles injected into the main pulmonary artery (MPA) are successfully ejected within one cardiac cycle, and no pathological flows were observed.
Affiliation(s)
- Arka Das
- Department of Mechanical Engineering, Embry-Riddle Aeronautical University, Daytona Beach, FL 32114, USA
- Correspondence: ; Tel.: +1-386-241-1457
- Marwan Hameed
- Department of Mechanical Engineering, American University of Bahrain, Riffa 942, Bahrain
- Ray Prather
- Department of Mechanical Engineering, Embry-Riddle Aeronautical University, Daytona Beach, FL 32114, USA
- Department of Mechanical and Aerospace Engineering, University of Central Florida, Orlando, FL 32816, USA
- The Heart Center at Orlando Health Arnold Palmer Hospital for Children, Orlando, FL 32806, USA
- Michael Farias
- The Heart Center at Orlando Health Arnold Palmer Hospital for Children, Orlando, FL 32806, USA
- Department of Clinical Sciences, College of Medicine, University of Central Florida, Orlando, FL 32816, USA
- Eduardo Divo
- Department of Mechanical Engineering, Embry-Riddle Aeronautical University, Daytona Beach, FL 32114, USA
- Alain Kassab
- Department of Mechanical and Aerospace Engineering, University of Central Florida, Orlando, FL 32816, USA
- David Nykanen
- The Heart Center at Orlando Health Arnold Palmer Hospital for Children, Orlando, FL 32806, USA
- Department of Clinical Sciences, College of Medicine, University of Central Florida, Orlando, FL 32816, USA
- William DeCampli
- The Heart Center at Orlando Health Arnold Palmer Hospital for Children, Orlando, FL 32806, USA
- Department of Clinical Sciences, College of Medicine, University of Central Florida, Orlando, FL 32816, USA
12
Myat Noe S, Zin TT, Tin P, Kobayashi I. Comparing State-of-the-Art Deep Learning Algorithms for the Automated Detection and Tracking of Black Cattle. Sensors (Basel) 2023; 23:532. [PMID: 36617130 PMCID: PMC9824081 DOI: 10.3390/s23010532] [Received: 12/01/2022] [Revised: 12/28/2022] [Accepted: 12/29/2022] [Indexed: 06/17/2023]
Abstract
Effective livestock management is critical for cattle farms in today's competitive era of smart modern farming. Because manual identification and detection of cattle are not feasible in modern farming systems, farm management solutions must be efficient, affordable, and scalable; fortunately, automatic tracking and identification systems have greatly improved in recent years. Correctly identifying individual cows is also an integral part of predicting behavior during estrus: by doing so, we can monitor a cow's behavior and pinpoint the right time for artificial insemination. However, most previous techniques have relied on direct observation, increasing the human workload. To overcome this problem, this paper proposes the use of state-of-the-art deep learning-based Multi-Object Tracking (MOT) algorithms for a complete system that can automatically and continuously detect and track cattle using an RGB camera. This study compares state-of-the-art MOT methods, such as Deep-SORT, Strong-SORT, and customized lightweight tracking algorithms. To improve the tracking accuracy of these deep learning methods, this paper presents an enhanced re-identification approach for a black cattle dataset in Strong-SORT. For tracking-by-detection evaluation, the system used YOLO v5 and v7, compared against the instance segmentation model Detectron-2, to detect and classify the cattle. The system achieved a high Multi-Object Tracking Accuracy (MOTA) of 96.88%. These findings demonstrate a highly accurate and robust cattle tracking system, which can be applied to innovative monitoring systems for agricultural applications. The effectiveness and efficiency of the proposed system were demonstrated by analyzing a sample of video footage. The proposed method was developed to balance the trade-off between costs and management, thereby improving the productivity and profitability of dairy farms; it can also be adapted to other domestic species.
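The MOTA metric quoted above aggregates misses, false positives, and identity switches over all frames into one ratio. A minimal sketch with hypothetical counts (not the paper's data):

```python
def mota(fn, fp, idsw, gt):
    """Multi-Object Tracking Accuracy: 1 - (FN + FP + IDSW) / GT,
    where GT is the total number of ground-truth boxes."""
    return 1.0 - (fn + fp + idsw) / gt

# Hypothetical counts over a video: 40 misses, 25 false positives,
# 5 identity switches, 2400 ground-truth boxes.
print(f"MOTA = {mota(40, 25, 5, 2400):.4f}")  # → MOTA = 0.9708
```

Note that MOTA can go negative when the combined error count exceeds the number of ground-truth boxes, which is why it is usually reported alongside precision/recall-style metrics.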
Affiliation(s)
- Su Myat Noe
- Interdisciplinary Graduate School of Agriculture and Engineering, University of Miyazaki, Miyazaki 889-2192, Japan
- Thi Thi Zin
- Graduate School of Engineering, University of Miyazaki, Miyazaki 889-2192, Japan
- Pyke Tin
- Graduate School of Engineering, University of Miyazaki, Miyazaki 889-2192, Japan
- Ikuo Kobayashi
- Field Science Center, Faculty of Agriculture, University of Miyazaki, Miyazaki 889-2192, Japan
13
Zhang G, Yin J, Deng P, Sun Y, Zhou L, Zhang K. Achieving Adaptive Visual Multi-Object Tracking with Unscented Kalman Filter. Sensors (Basel) 2022; 22:9106. PMID: 36501808; PMCID: PMC9741288; DOI: 10.3390/s22239106.
Abstract
As an essential component of intelligent monitoring, behavior recognition, automatic driving, and other applications, multi-object tracking still faces the challenge of maintaining accuracy and robustness, especially in complex occlusion environments. To address occlusion, background noise, and abrupt changes in the motion state of multiple objects in complex scenes, an improved DeepSORT algorithm based on YOLOv5 is proposed to enhance the speed and accuracy of multi-object tracking. First, a general object motion model, similar to a variable-acceleration motion model, is devised, and a multi-object tracking framework built on this motion model is established. Then, the latest YOLOv5 algorithm, which has satisfactory detection accuracy, is used to obtain the object information that serves as the input to multi-object tracking. An unscented Kalman filter (UKF) is proposed to estimate the motion state of multiple objects and reduce nonlinear errors. In addition, an adaptive factor is introduced to evaluate observation noise and detect abnormal observations so as to adaptively adjust the innovation covariance matrix. Finally, an improved DeepSORT algorithm for multi-object tracking is formed to promote robustness and accuracy. Extensive experiments on the MOT16 dataset compare the proposed algorithm with the original DeepSORT algorithm. The results indicate that the speed and precision of the improved DeepSORT increase by 4.75% and 2.30%, respectively. The improved DeepSORT performs especially well on the dynamic-camera sequences of MOT16.
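The adaptive adjustment of the innovation covariance described above can be sketched, in simplified linear-filter form, as a gating test on the normalized innovation squared (NIS); the threshold and the inflation rule here are illustrative assumptions, not the paper's exact UKF scheme:

```python
import numpy as np

def adaptive_update(x, P, z, H, R, nis_threshold=9.21):
    """Kalman measurement update that inflates the innovation covariance
    when the normalized innovation squared (NIS) flags an outlier.
    nis_threshold ~ chi-square 99% quantile for a 2-D measurement."""
    y = z - H @ x                       # innovation
    S = H @ P @ H.T + R                 # innovation covariance
    nis = float(y @ np.linalg.inv(S) @ y)
    if nis > nis_threshold:             # abnormal observation: down-weight it
        S = S * (nis / nis_threshold)   # inflation rule (assumed, for illustration)
    K = P @ H.T @ np.linalg.inv(S)      # Kalman gain
    x_new = x + K @ y
    P_new = (np.eye(len(x)) - K @ H) @ P
    return x_new, P_new, nis

# Constant-velocity state [px, py, vx, vy] with a position-only measurement.
H = np.hstack([np.eye(2), np.zeros((2, 2))])
x, P, R = np.zeros(4), np.eye(4), 0.1 * np.eye(2)
x, P, nis = adaptive_update(x, P, np.array([0.2, -0.1]), H, R)
```

Down-weighting outliers this way shrinks the gain for abnormal observations instead of discarding them outright, which keeps the track alive through brief detection glitches.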
Affiliation(s)
- Guowei Zhang
- School of Safety Engineering, China University of Mining and Technology, Xuzhou 221116, China
- Jiyao Yin
- Shenzhen Urban Public Safety and Technology Institute, Shenzhen 518046, China
- Key Laboratory of Urban Safety Risk Monitoring and Early Warning, Ministry of Emergency Management, Shenzhen 518046, China
- Peng Deng
- Shenzhen Urban Public Safety and Technology Institute, Shenzhen 518046, China
- Key Laboratory of Urban Safety Risk Monitoring and Early Warning, Ministry of Emergency Management, Shenzhen 518046, China
- Yanlong Sun
- Shenzhen Urban Public Safety and Technology Institute, Shenzhen 518046, China
- Key Laboratory of Urban Safety Risk Monitoring and Early Warning, Ministry of Emergency Management, Shenzhen 518046, China
- Lin Zhou
- Shenzhen Urban Public Safety and Technology Institute, Shenzhen 518046, China
- Key Laboratory of Urban Safety Risk Monitoring and Early Warning, Ministry of Emergency Management, Shenzhen 518046, China
- Kuiyuan Zhang
- School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116, China
14
Boragule A, Jang H, Ha N, Jeon M. Pixel-Guided Association for Multi-Object Tracking. Sensors (Basel) 2022; 22:8922. PMID: 36433519; PMCID: PMC9692782; DOI: 10.3390/s22228922.
Abstract
Propagation and association tasks in Multi-Object Tracking (MOT) play a pivotal role in accurately linking the trajectories of moving objects. Recently, modern deep learning models have addressed these tasks with fragmented solutions for each sub-problem, such as appearance modeling, motion modeling, and object association. To unify the MOT task, we introduce a pixel-guided approach that efficiently builds a joint detection-and-tracking framework for multi-object tracking. Specifically, up-sampled multi-scale features from consecutive frames are queued to detect object locations with a transformer decoder, and per-pixel distributions are used to compute the association matrix according to the object queries. Additionally, we introduce long-term appearance association on track features, which learns the long-term association of tracks against detections to compute a similarity matrix. Finally, this similarity matrix is integrated with the Byte-Tracker, resulting in state-of-the-art MOT performance. Experiments on the standard MOT15 and MOT17 benchmarks show that our approach achieves strong tracking performance.
Affiliation(s)
- Abhijeet Boragule
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology, Gwangju 61005, Republic of Korea
- Hyunsung Jang
- LIG Nex1 Company Ltd., Yongin-si 16911, Republic of Korea
- Namkoo Ha
- LIG Nex1 Company Ltd., Yongin-si 16911, Republic of Korea
- Moongu Jeon
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology, Gwangju 61005, Republic of Korea
15
Kim Y, Cho J. AIDM-Strat: Augmented Illegal Dumping Monitoring Strategy through Deep Neural Network-Based Spatial Separation Attention of Garbage. Sensors (Basel) 2022; 22:8819. PMID: 36433416; PMCID: PMC9696417; DOI: 10.3390/s22228819.
Abstract
Economic and social progress in the Republic of Korea has resulted in an increased standard of living, which in turn produces more waste. The Korean government implemented a volume-based trash disposal system that modifies waste disposal behavior to handle vast volumes of waste efficiently. However, the inconvenience of having to purchase standard garbage bags led to passive participation by citizens and instances of waste dumped illegally in non-standard plastic bags. As a result, there is a need for automatic detection and reporting of illegal garbage dumping. To achieve this, we propose a deep neural network-based system for monitoring unlawful rubbish disposal. The proposed approach obtains the articulation points (joints) of a dumper through OpenPose and identifies the type of garbage bag through the object detection model You Only Look Once (YOLO), then uses the distance from the dumper's wrist to the garbage bag to decide whether the dumping is illegal. Additionally, we introduce a method of tracking the IDs issued to the waste bags using a multi-object tracking (MOT) model to reduce false detections of illegal dumping. To evaluate the efficacy of the proposed monitoring system, we compared it with other systems based on behavior recognition. The results validate that the suggested approach achieves higher accuracy and a lower false-alarm rate, making it useful for a variety of upcoming applications.
16
Yoo YS, Lee SH, Bae SH. Effective Multi-Object Tracking via Global Object Models and Object Constraint Learning. Sensors (Basel) 2022; 22:7943. PMID: 36298293; PMCID: PMC9609386; DOI: 10.3390/s22207943.
Abstract
Effective multi-object tracking is still challenging due to the trade-off between tracking accuracy and speed. Because recent multi-object tracking (MOT) methods leverage object appearance and motion models to associate detections between consecutive frames, the key to effective multi-object tracking is reducing the computational complexity of learning both models. To this end, this work proposes global appearance and motion models that discriminate multiple objects instead of learning local object-specific models. Concretely, it learns a global appearance model using contrastive learning between object appearances, and a global relational motion model using relative motion learning between objects. Moreover, this paper proposes object constraint learning to improve tracking efficiency: the discriminability of the models is treated as a constraint, and both models are learned only when inconsistency with the constraint occurs. Object constraint learning therefore differs from conventional online learning for multi-object tracking, which updates learnable parameters every frame. This work incorporates the global models and object constraint learning into a confidence-based association method, and compares our tracker with state-of-the-art methods on the publicly available MOT Challenge datasets. As a result, we achieve 64.5% MOTA (multi-object tracking accuracy) and a 6.54 Hz tracking speed on the MOT16 test dataset. The comparison results show that our methods improve tracking accuracy and tracking speed together.
17
Dave S, Clark R, Lee RSK. RSOnet: An Image-Processing Framework for a Dual-Purpose Star Tracker as an Opportunistic Space Surveillance Sensor. Sensors (Basel) 2022; 22:5688. PMID: 35957245; PMCID: PMC9370977; DOI: 10.3390/s22155688.
Abstract
A catalogue of over 22,000 objects in Earth's orbit is currently maintained, and that number is expected to double within the next decade. Novel data collection regimes are needed to scale our ability to detect, track, classify and characterize resident space objects (RSOs) in a crowded low Earth orbit (LEO). This research presents RSOnet, an image-processing framework for space domain awareness using star trackers. Star trackers are cost-effective, flight-proven, and require only basic image processing to be used as attitude-determination sensors. RSOnet is designed to augment the capabilities of a star tracker by turning it into an opportunistic space-surveillance sensor. Our research demonstrates that star trackers are a feasible source of RSO detections in LEO by evaluating the performance of RSOnet on real detections from a star-tracker-like imager in space. This paper describes the RSOnet convolutional-neural-network architecture, its graph-based multi-object classifier, and the characterization results.
18
Heo J, Kwon YJ. 3D Vehicle Trajectory Extraction Using DCNN in an Overlapping Multi-Camera Crossroad Scene. Sensors (Basel) 2021; 21:7879. PMID: 34883887; PMCID: PMC8659789; DOI: 10.3390/s21237879.
Abstract
The 3D vehicle trajectory in complex traffic conditions, such as crossroads and heavy traffic, is of great practical value for autonomous driving. To accurately extract the 3D vehicle trajectory from a perspective camera at a crossroad, where vehicles span an angular range of 360 degrees, problems such as the narrow visual angle of a single-camera scene, vehicle occlusion under low camera perspectives, and the lack of vehicle physical information must be solved. In this paper, we propose a method for estimating the 3D bounding boxes of vehicles and extracting their trajectories using a deep convolutional neural network (DCNN) in an overlapping multi-camera crossroad scene. First, traffic data were collected using overlapping multi-cameras to obtain a wide range of trajectories around the crossroad. Then, 3D bounding boxes of vehicles were estimated and tracked in each single-camera scene through DCNN models (YOLOv4, multi-branch CNN) combined with camera calibration. Using this information, the 3D vehicle trajectory could be extracted on the ground plane of the crossroad by combining the results from the overlapping multi-camera views with a homography matrix. Finally, in experiments, the errors of extracted trajectories were corrected through simple linear interpolation and regression, and the accuracy of the proposed method was verified by computing the difference from ground-truth data. Compared with other previously reported methods, our approach is shown to be more accurate and more practical.
19
Lee Y, Lee SH, Yoo J, Kwon S. Efficient Single-Shot Multi-Object Tracking for Vehicles in Traffic Scenarios. Sensors (Basel) 2021; 21:6358. PMID: 34640675; PMCID: PMC8512362; DOI: 10.3390/s21196358.
Abstract
Multi-object tracking is a significant field in computer vision, since it provides essential information for video surveillance and analysis. Several deep learning-based approaches have been developed to improve multi-object tracking performance by applying the most accurate and efficient combinations of object detection models and appearance-embedding extraction models. However, two-stage methods show a low inference speed, since embedding extraction can only be performed after object detection has finished. To alleviate this problem, single-shot methods, which perform object detection and embedding extraction simultaneously, have been developed and have drastically improved inference speed; however, a trade-off between accuracy and efficiency remains. Therefore, this study proposes an enhanced single-shot multi-object tracking system that improves accuracy while maintaining a high inference speed. With strong feature extraction and fusion, the object detection of our model achieves an AP score of 69.93% on the UA-DETRAC dataset and outperforms previous state-of-the-art methods, such as FairMOT and JDE. Building on the improved object detection performance, our multi-object tracking system achieves a MOTA score of 68.5% and a PR-MOTA score of 24.5% on the same dataset, also surpassing the previous state-of-the-art trackers.
Affiliation(s)
- Youngkeun Lee
- Department of Electronic Engineering, Kwangwoon University, Seoul 01897, Korea
- Sang-ha Lee
- Department of Electronic Engineering, Kwangwoon University, Seoul 01897, Korea
- Jisang Yoo
- Department of Electronic Engineering, Kwangwoon University, Seoul 01897, Korea
- Soonchul Kwon
- Graduate School of Smart Convergence, Kwangwoon University, Seoul 01897, Korea
- Correspondence: ; Tel.: +82-2-940-8637
20
Dao MQ, Frémont V. A Two-Stage Data Association Approach for 3D Multi-Object Tracking. Sensors (Basel) 2021; 21:2894. PMID: 33919034; DOI: 10.3390/s21092894.
Abstract
Multi-Object Tracking (MOT) is an integral part of any autonomous driving pipeline because it produces trajectories of other moving objects in the scene and predicts their future motion. Thanks to recent advances in 3D object detection enabled by deep learning, track-by-detection has become the dominant paradigm in 3D MOT. In this paradigm, a MOT system is essentially made of an object detector and a data association algorithm that establishes track-to-detection correspondence. While 3D object detection has been actively researched, association algorithms for 3D MOT have settled on bipartite matching formulated as a Linear Assignment Problem (LAP) and solved by the Hungarian algorithm. In this paper, we adapt a two-stage data association method, previously applied successfully to image-based tracking, to the 3D setting, thus providing an alternative data association method for 3D MOT. Our method outperforms the baseline that uses one-stage bipartite matching for data association, achieving 0.587 Average Multi-Object Tracking Accuracy (AMOTA) on the NuScenes validation set and 0.365 AMOTA (at level 2) on the Waymo test set.
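The one-stage LAP baseline that the paper compares against can be sketched in a few lines; the centroid-distance cost and the gate value here are illustrative assumptions, not the paper's exact configuration:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate(tracks, detections, gate=2.0):
    """One-stage bipartite matching of track and detection centroids.
    Assigned pairs whose distance exceeds `gate` are rejected afterwards."""
    cost = np.linalg.norm(tracks[:, None, :] - detections[None, :, :], axis=2)
    rows, cols = linear_sum_assignment(cost)          # Hungarian algorithm
    matches = [(int(r), int(c)) for r, c in zip(rows, cols) if cost[r, c] <= gate]
    unmatched_tracks = set(range(len(tracks))) - {r for r, _ in matches}
    unmatched_dets = set(range(len(detections))) - {c for _, c in matches}
    return matches, unmatched_tracks, unmatched_dets

tracks = np.array([[0.0, 0.0], [5.0, 5.0]])   # predicted track positions
dets = np.array([[0.3, -0.2], [9.0, 9.0]])    # current-frame detections
m, lost, new = associate(tracks, dets)
print(m)  # → [(0, 0)]; the distant pair is gated out
```

Unmatched tracks typically feed a track-deletion countdown and unmatched detections seed new tracks, which is the track-management burden the two-stage association aims to reduce.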
21
Himri K, Ridao P, Gracias N. Underwater Object Recognition Using Point-Features, Bayesian Estimation and Semantic Information. Sensors (Basel) 2021; 21:1807. PMID: 33807708; PMCID: PMC7961582; DOI: 10.3390/s21051807.
Abstract
This paper proposes a 3D object recognition method for non-coloured point clouds using point features. The method is intended for application scenarios such as Inspection, Maintenance and Repair (IMR) of industrial sub-sea structures composed of pipes and connecting objects (such as valves, elbows and R-Tee connectors). The recognition algorithm uses a database of partial views of the objects, stored as point clouds, which is available a priori. The recognition pipeline has five stages: (1) plane segmentation, (2) pipe detection, (3) semantic object segmentation and detection, (4) feature-based object recognition and (5) Bayesian estimation. To apply the Bayesian estimation, an object tracking method based on a new Interdistance Joint Compatibility Branch and Bound (IJCBB) algorithm is proposed. The paper studies how recognition performance depends on (1) the point feature descriptor used, (2) the use (or not) of Bayesian estimation and (3) the inclusion of semantic information about the objects' connections. The methods are tested on an experimental dataset containing laser scans and Autonomous Underwater Vehicle (AUV) navigation data. The best results are obtained using the Clustered Viewpoint Feature Histogram (CVFH) descriptor, achieving recognition rates of 51.2%, 68.6% and 90%, respectively, clearly showing the advantages of using Bayesian estimation (an 18% increase) and of including semantic information (a further 21% increase).
22
Zhang WL, Yang K, Xin YT, Zhao TS. Multi-Object Tracking Algorithm for RGB-D Images Based on Asymmetric Dual Siamese Networks. Sensors (Basel) 2020; 20:6745. PMID: 33255800; PMCID: PMC7728318; DOI: 10.3390/s20236745.
Abstract
Currently, intelligent security systems are widely deployed in indoor buildings to ensure the safety of people in shopping malls, banks, train stations, and other indoor spaces. Multi-Object Tracking (MOT), as an important component of intelligent security systems, has received much attention from researchers in recent years. However, existing multi-object tracking algorithms still suffer from trajectory drift and interruption in crowded scenes, and therefore cannot provide valuable data for managers. To solve these problems, this paper proposes a Multi-Object Tracking algorithm for RGB-D images based on Asymmetric Dual Siamese networks (ADSiamMOT-RGBD). The algorithm combines appearance information from RGB images with target contour information from depth images. Furthermore, an attention module is applied to suppress redundant information in the combined features and thereby overcome the trajectory drift problem. We also propose a trajectory analysis module, which checks whether a head movement trajectory is correct using temporal context information, reducing the number of erroneous trajectories. The experimental results show that the proposed method achieves better tracking quality on the MICC, EPFL, and UM datasets than previous work.
23
Fahimipirehgalin M, Vogel-Heuser B, Trunzer E, Odenweller M. Visual Leakage Inspection in Chemical Process Plants Using Thermographic Videos and Motion Pattern Detection. Sensors (Basel) 2020; 20:6659. PMID: 33233733; DOI: 10.3390/s20226659.
Abstract
Liquid leakage from pipelines is a critical issue in large-scale chemical process plants, since it can affect the normal operation of the plant and create unsafe and hazardous situations. Detecting leakage at an early stage can therefore prevent serious damage. Developing a vision-based inspection system by means of IR imaging is a promising approach for accurate leakage detection: IR cameras can capture the effect of leaking drops when the drops are warmer (or cooler) than their surroundings. Since leaking drops appear in an IR video as a repetitive phenomenon with specific patterns, motion pattern detection methods can be utilized for leakage detection. In this paper, an approach based on the Kalman filter is proposed to track the motion of leaking drops and differentiate them from noise. The motion patterns are learned from the training data and applied to the test data to evaluate the accuracy of the method. For this purpose, a laboratory demonstrator plant was assembled to simulate pipeline leakages and to generate training and test videos. The results show that the proposed method can detect leaking drops by tracking them based on the obtained motion patterns. Furthermore, the possibilities and conditions for applying the proposed method in a real industrial chemical plant are discussed at the end.
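Kalman-filter tracking of a single drop, as described above, can be sketched with a constant-velocity model over image coordinates; the frame step, noise levels, and centroid values below are illustrative assumptions, not the paper's parameters:

```python
import numpy as np

# Constant-velocity Kalman filter over image coordinates (u, v): a simplified
# stand-in for tracking one leaking drop across IR frames.
dt = 1.0                               # one frame step (assumed)
F = np.array([[1, 0, dt, 0],           # state: [u, v, du, dv]
              [0, 1, 0, dt],
              [0, 0, 1, 0],
              [0, 0, 0, 1]], dtype=float)
H = np.array([[1, 0, 0, 0],            # only the drop centroid is measured
              [0, 1, 0, 0]], dtype=float)
Q = 0.01 * np.eye(4)                   # process noise (assumed)
R = 0.5 * np.eye(2)                    # measurement noise (assumed)

def kf_step(x, P, z):
    """One predict + update cycle; z is the detected drop centroid."""
    x, P = F @ x, F @ P @ F.T + Q                  # predict
    y = z - H @ x                                  # innovation
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)   # Kalman gain
    return x + K @ y, (np.eye(4) - K @ H) @ P

# A drop falling roughly 1.5 px per frame; the filter smooths noisy centroids.
x, P = np.array([10.0, 20.0, 0.0, 1.5]), np.eye(4)
for z in ([10.1, 21.6], [10.0, 23.1], [9.9, 24.6]):
    x, P = kf_step(x, P, np.asarray(z))
```

Detections whose innovation stays consistently small across frames form a coherent downward motion pattern; isolated noise responses fail that consistency check, which is the basis for separating drops from noise.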
24
Velastin SA, Fernández R, Espinosa JE, Bay A. Detecting, Tracking and Counting People Getting On/Off a Metropolitan Train Using a Standard Video Camera. Sensors (Basel) 2020; 20:6251. PMID: 33147784; DOI: 10.3390/s20216251.
Abstract
The main source of delays in public transport systems (buses, trams, metros, railways) is their stations. For example, a public transport vehicle can travel at 60 km per hour between stations, but its commercial speed (average en-route speed, including any intermediate delay) does not reach more than half of that value. Therefore, the problem that public transport operators must solve is how to reduce the delay in stations. From the perspective of transport engineering, there are several ways to approach this issue, from the design of infrastructure and vehicles to passenger traffic management. The tools normally available to traffic engineers are analytical models, microscopic traffic simulation, and, ultimately, real-scale laboratory experiments. In any case, the data required are the number of passengers that get on and off the vehicles, as well as the number of passengers waiting on platforms. Traditionally, such data have been collected manually by field counts or through videos that are then processed by hand. On the other hand, public transport networks, especially metropolitan railways, have an extensive monitoring infrastructure based on standard video cameras. Traditionally, these are observed manually or with very basic signal-processing support, so there is significant scope for improving data capture and for automating the analysis of site usage, safety, and surveillance. This article shows a way of collecting and analyzing the data needed both to feed traffic models and to analyze laboratory experimentation, exploiting recent intelligent sensing approaches. The paper presents a new public video dataset gathered from real-scale laboratory recordings. Part of this dataset has been annotated by hand, marking up head locations to provide a ground truth on which to train and evaluate deep learning detection and tracking algorithms. Tracking outputs are then used to count people getting on and off, achieving a mean accuracy of 92% with less than 0.15% standard deviation on 322 mostly unseen dataset video sequences.
25
Dimitrievski M, Van Hamme D, Veelaert P, Philips W. Cooperative Multi-Sensor Tracking of Vulnerable Road Users in the Presence of Missing Detections. Sensors (Basel) 2020; 20:4817. PMID: 32858942; DOI: 10.3390/s20174817.
Abstract
This paper presents a vulnerable road user (VRU) tracking algorithm capable of handling noisy and missing detections from heterogeneous sensors. We propose a cooperative fusion algorithm for matching and reinforcing of radar and camera detections using their proximity and positional uncertainty. The belief in the existence and position of objects is then maximized by temporal integration of fused detections by a multi-object tracker. By switching between observation models, the tracker adapts to the detection noise characteristics making it robust to individual sensor failures. The main novelty of this paper is an improved imputation sampling function for updating the state when detections are missing. The proposed function uses a likelihood without association that is conditioned on the sensor information instead of the sensor model. The benefits of the proposed solution are two-fold: firstly, particle updates become computationally tractable and secondly, the problem of imputing samples from a state which is predicted without an associated detection is bypassed. Experimental evaluation shows a significant improvement in both detection and tracking performance over multiple control algorithms. In low light situations, the cooperative fusion outperforms intermediate fusion by as much as 30%, while increases in tracking performance are most significant in complex traffic scenes.
26
Psota ET, Schmidt T, Mote B, Pérez LC. Long-Term Tracking of Group-Housed Livestock Using Keypoint Detection and MAP Estimation for Individual Animal Identification. Sensors (Basel) 2020; 20:3670. PMID: 32630011; PMCID: PMC7374513; DOI: 10.3390/s20133670.
Abstract
Tracking individual animals in a group setting is a demanding task for computer vision and animal science researchers. When the objective is months of uninterrupted tracking and the targeted animals lack discernible differences in their physical characteristics, the task introduces significant challenges. To address these challenges, a probabilistic tracking-by-detection method is proposed. The tracking method takes as input the visible keypoints of individual animals, provided by a fully convolutional detector. Individual animals are also equipped with ear tags that a classification network uses to assign unique identities to instances. The fixed cardinality of the targets is leveraged to create a continuous set of tracks, and the forward-backward algorithm is used to assign ear-tag identification probabilities to each detected instance. Tracking achieves real-time performance on consumer-grade hardware, in part because it does not rely on complex, costly, graph-based optimizations. A publicly available, human-annotated dataset is introduced to evaluate tracking performance. This dataset contains 15 half-hour-long videos of pigs with various ages/sizes, facility environments, and activity levels. Results demonstrate that the proposed method achieves an average precision and recall greater than 95% across the entire dataset. Analysis of the error events reveals the environmental conditions and social interactions most likely to cause errors in real-world deployments.
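The forward-backward step described above can be sketched as standard HMM smoothing over a track's frame-wise tag classifications; the transition and emission values below are illustrative assumptions, not the paper's parameters:

```python
import numpy as np

def forward_backward(pi, A, emissions):
    """Posterior state probabilities per time step for a discrete HMM.
    pi: (S,) initial distribution; A: (S, S) transition matrix;
    emissions: (T, S) likelihood of each observation under each state."""
    T, S = emissions.shape
    alpha = np.zeros((T, S))
    beta = np.zeros((T, S))
    alpha[0] = pi * emissions[0]
    alpha[0] /= alpha[0].sum()                      # scale for numerical stability
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * emissions[t]
        alpha[t] /= alpha[t].sum()
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (emissions[t + 1] * beta[t + 1])
        beta[t] /= beta[t].sum()
    post = alpha * beta
    return post / post.sum(axis=1, keepdims=True)   # rows sum to 1

# Two identities; the track rarely switches identity between frames, and the
# frame-wise ear-tag classifications are noisy (values are illustrative only).
pi = np.array([0.5, 0.5])
A = np.array([[0.95, 0.05],
              [0.05, 0.95]])
em = np.array([[0.8, 0.2], [0.6, 0.4], [0.3, 0.7], [0.9, 0.1]])
post = forward_backward(pi, A, em)
```

Smoothing over the whole track lets strong classifications in a few frames override noisy ones elsewhere, which is what makes per-instance identification robust without graph optimization.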
Affiliation(s)
- Eric T. Psota
- Department of Electrical and Computer Engineering, University of Nebraska–Lincoln, Lincoln, NE 68505, USA
- Correspondence:
- Ty Schmidt
- Department of Animal Science, University of Nebraska–Lincoln, Lincoln, NE 68588, USA
- Benny Mote
- Department of Animal Science, University of Nebraska–Lincoln, Lincoln, NE 68588, USA
- Lance C. Pérez
- Department of Electrical and Computer Engineering, University of Nebraska–Lincoln, Lincoln, NE 68505, USA
27
Meng F, Wang X, Wang D, Shao F, Fu L. Spatial-Semantic and Temporal Attention Mechanism-Based Online Multi-Object Tracking. Sensors (Basel) 2020; 20:1653. PMID: 32188090; PMCID: PMC7146429; DOI: 10.3390/s20061653.
Abstract
Multi-object tracking (MOT) plays a crucial role in various platforms. Occlusion and insertion among targets, complex backgrounds and higher real-time requirements increase the difficulty of MOT problems. Most state-of-the-art MOT approaches adopt the tracking-by-detection strategy, which relies on compute-intensive sliding windows or anchoring schemes to detect matching targets or candidates in each frame. In this work, we introduce a more efficient and effective spatial-temporal attention scheme to track multiple objects in various scenarios. Using a semantic-feature-based spatial attention mechanism and a novel Motion Model, we address the insertion and location of candidates. Some online-learned target-specific convolutional neural networks (CNNs) were used to estimate target occlusion and perform classification by adapting the appearance model. A temporal attention mechanism was adopted to update the online module by balancing current and history frames. Extensive experiments were performed on Karlsruhe Institute of Technology and Toyota Technological Institute (KITTI) benchmarks and an Armored Target Tracking Dataset (ATTD) built for ground-armored targets. Experimental results show that the proposed method achieved outstanding tracking performance and met the actual application requirements.
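The temporal attention update that balances current and history frames can be illustrated, at its simplest, as a softmax-weighted blend of feature vectors; the scalar attention scores here are hypothetical stand-ins for the learned scores in the paper:

```python
import math

def temporal_blend(current, history, score_current, score_history):
    """Softmax-weighted blend of current-frame and history features.

    A minimal stand-in for a temporal attention update: a higher score
    for the current frame shifts the blended feature toward it, which
    lets the tracker discount unreliable (e.g. occluded) frames.
    """
    w = math.exp(score_current) / (math.exp(score_current) + math.exp(score_history))
    return [w * c + (1 - w) * h for c, h in zip(current, history)]
```

With equal scores the update is a plain average; a strongly positive current-frame score makes the output track the newest observation almost exclusively.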
Collapse
Affiliation(s)
- Fanjie Meng
- Department of Mechanical Engineering, College of Field Engineering, Army Engineering University of PLA, Nanjing 210007, China; (F.M.); (D.W.); (F.S.)
| | - Xinqing Wang
- Department of Mechanical Engineering, College of Field Engineering, Army Engineering University of PLA, Nanjing 210007, China; (F.M.); (D.W.); (F.S.)
| | - Dong Wang
- Department of Mechanical Engineering, College of Field Engineering, Army Engineering University of PLA, Nanjing 210007, China; (F.M.); (D.W.); (F.S.)
| | - Faming Shao
- Department of Mechanical Engineering, College of Field Engineering, Army Engineering University of PLA, Nanjing 210007, China; (F.M.); (D.W.); (F.S.)
| | - Lei Fu
- Department of Armament Science and Technology, College of Field Engineering, Army Engineering University of PLA, Nanjing 210007, China;
| |
Collapse
|
28
|
Muresan MP, Giosan I, Nedevschi S. Stabilization and Validation of 3D Object Position Using Multimodal Sensor Fusion and Semantic Segmentation. Sensors (Basel) 2020; 20:s20041110. [PMID: 32085608 PMCID: PMC7070899 DOI: 10.3390/s20041110] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 12/17/2019] [Revised: 01/15/2020] [Accepted: 02/13/2020] [Indexed: 11/16/2022]
Abstract
The stabilization and validation process of the measured position of objects is an important step for high-level perception functions and for the correct processing of sensory data. The goal of this process is to detect and handle inconsistencies between different sensor measurements, which result from the perception system. The aggregation of the detections from different sensors consists of combining the sensor data in one common reference frame for each identified object, leading to the creation of a super-sensor. The result of the data aggregation may contain errors such as false detections, misplaced object cuboids or an incorrect number of objects in the scene. The stabilization and validation process is focused on mitigating these problems. The current paper proposes four contributions for solving the stabilization and validation task, for autonomous vehicles, using the following sensors: trifocal camera, fisheye camera, long-range RADAR (Radio Detection and Ranging), and 4-layer and 16-layer LIDARs (Light Detection and Ranging). We propose two original data association methods used in the sensor fusion and tracking processes. The first data association algorithm is created for tracking LIDAR objects and combines multiple appearance and motion features in order to exploit the available information for road objects. The second novel data association algorithm is designed for trifocal camera objects and has the objective of finding measurement correspondences to sensor fused objects such that the super-sensor data are enriched by adding the semantic class information. The implemented trifocal object association solution uses a novel polar association scheme combined with a decision tree to find the best hypothesis–measurement correlations. Another contribution we propose for stabilizing object position and unpredictable behavior of road objects, provided by multiple types of complementary sensors, is the use of a fusion approach based on the Unscented Kalman Filter and a single-layer perceptron. The last novel contribution is related to the validation of the 3D object position, which is solved using a fuzzy logic technique combined with a semantic segmentation image. The proposed algorithms have a real-time performance, achieving a cumulative running time of 90 ms, and have been evaluated using ground truth data extracted from a high-precision GPS (global positioning system) with 2 cm accuracy, obtaining an average error of 0.8 m.
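The paper's fusion stage uses an Unscented Kalman Filter and a single-layer perceptron; as a much simpler illustration of the underlying idea of combining complementary sensor estimates, classical inverse-variance weighting of independent measurements looks like this (a sketch of the general principle, not the authors' method):

```python
def fuse(measurements):
    """Inverse-variance weighted fusion of independent position estimates.

    measurements: list of (value, variance) pairs, one per sensor.
    Returns the fused value and its fused variance; the fused variance
    is always smaller than any single input variance, which is why
    combining a RADAR range with a LIDAR range beats either alone.
    """
    inv_vars = [1.0 / var for _, var in measurements]
    fused_var = 1.0 / sum(inv_vars)
    fused_val = fused_var * sum(val / var for val, var in measurements)
    return fused_val, fused_var
```

For two equally trusted estimates the result is their midpoint with half the variance; an unreliable sensor (large variance) contributes proportionally less.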
Collapse
|
29
|
Al-Kaff A, Gómez-Silva MJ, Moreno FM, de la Escalera A, Armingol JM. An Appearance-Based Tracking Algorithm for Aerial Search and Rescue Purposes. Sensors (Basel) 2019; 19:s19030652. [PMID: 30764528 PMCID: PMC6387277 DOI: 10.3390/s19030652] [Citation(s) in RCA: 33] [Impact Index Per Article: 6.6] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 12/27/2018] [Revised: 01/19/2019] [Accepted: 02/02/2019] [Indexed: 11/16/2022]
Abstract
The automation of the Wilderness Search and Rescue (WiSAR) task aims for high levels of understanding of various scenery. In addition, working in unfriendly and complex environments may cause a time delay in the operation and consequently put human lives at stake. In order to address this problem, Unmanned Aerial Vehicles (UAVs), which provide potential support to the conventional methods, are used. These vehicles are provided with reliable human detection and tracking algorithms; in order to be able to find and track the bodies of the victims in complex environments, and a robust control system to maintain safe distances from the detected bodies. In this paper, a human detection based on the color and depth data captured from onboard sensors is proposed. Moreover, the proposal of computing data association from the skeleton pose and a visual appearance measurement allows the tracking of multiple people with invariance to the scale, translation and rotation of the point of view with respect to the target objects. The system has been validated with real and simulation experiments, and the obtained results show the ability to track multiple individuals even after long-term disappearances. Furthermore, the simulations present the robustness of the implemented reactive control system as a promising tool for assisting the pilot to perform approaching maneuvers in a safe and smooth manner.
Collapse
Affiliation(s)
- Abdulla Al-Kaff
- Intelligent Systems Lab (LSI), Universidad Carlos III de Madrid, Avnd. de la Universidad 30, 28911 Madrid, Spain.
| | - María José Gómez-Silva
- Intelligent Systems Lab (LSI), Universidad Carlos III de Madrid, Avnd. de la Universidad 30, 28911 Madrid, Spain.
| | - Francisco Miguel Moreno
- Intelligent Systems Lab (LSI), Universidad Carlos III de Madrid, Avnd. de la Universidad 30, 28911 Madrid, Spain.
| | - Arturo de la Escalera
- Intelligent Systems Lab (LSI), Universidad Carlos III de Madrid, Avnd. de la Universidad 30, 28911 Madrid, Spain.
| | - José María Armingol
- Intelligent Systems Lab (LSI), Universidad Carlos III de Madrid, Avnd. de la Universidad 30, 28911 Madrid, Spain.
| |
Collapse
|
30
|
Yoon K, Kim DY, Yoon YC, Jeon M. Data Association for Multi-Object Tracking via Deep Neural Networks. Sensors (Basel) 2019; 19:s19030559. [PMID: 30700017 PMCID: PMC6387419 DOI: 10.3390/s19030559] [Citation(s) in RCA: 41] [Impact Index Per Article: 8.2] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 12/23/2018] [Revised: 01/22/2019] [Accepted: 01/25/2019] [Indexed: 11/16/2022]
Abstract
With recent advances in object detection, the tracking-by-detection method has become mainstream for multi-object tracking in computer vision. The tracking-by-detection scheme necessarily has to resolve a problem of data association between existing tracks and newly received detections at each frame. In this paper, we propose a new deep neural network (DNN) architecture that can solve the data association problem with a variable number of both tracks and detections, including false positives. The proposed network consists of two parts: encoder and decoder. The encoder is a fully connected network with several layers that takes bounding boxes of both detections and track history as inputs. The outputs of the encoder are sequentially fed into the decoder, which is composed of bi-directional Long Short-Term Memory (LSTM) networks with a projection layer. The final output of the proposed network is an association matrix that reflects matching scores between tracks and detections. To train the network, we generate training samples using the annotation of the Stanford Drone Dataset (SDD). The experiment results show that the proposed network achieves considerably high recall and precision rates as a binary classifier for the assignment task. We apply our network to track multiple objects on real-world datasets and evaluate the tracking performance. Our tracker outperforms previous DNN-based works and is comparable to other state-of-the-art methods.
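Once a network outputs an association matrix of track-detection scores, turning it into hard matches is a separate step. A minimal greedy sketch (the threshold value and the greedy rule are illustrative; the Hungarian algorithm is a common globally optimal alternative):

```python
def greedy_assign(scores, threshold=0.5):
    """Greedily match tracks to detections from a score matrix.

    scores[i][j] is the matching score between track i and detection j,
    e.g. one entry of a network's association-matrix output. Pairs are
    taken in descending score order; scores below `threshold` are left
    unmatched, which absorbs false positives and newly appearing objects.
    """
    pairs = sorted(((scores[i][j], i, j)
                    for i in range(len(scores))
                    for j in range(len(scores[0]))), reverse=True)
    used_tracks, used_dets, matches = set(), set(), []
    for score, i, j in pairs:
        if score < threshold or i in used_tracks or j in used_dets:
            continue
        matches.append((i, j))
        used_tracks.add(i)
        used_dets.add(j)
    return matches
```

Any detection left unmatched is a candidate new track; any unmatched track is a candidate for occlusion handling or deletion.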
Collapse
Affiliation(s)
- Kwangjin Yoon
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology, Gwangju 61005, Korea.
| | - Du Yong Kim
- School of Engineering, RMIT University, Melbourne, VIC 3000, Australia.
| | - Young-Chul Yoon
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology, Gwangju 61005, Korea.
| | - Moongu Jeon
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology, Gwangju 61005, Korea.
| |
Collapse
|
31
|
Dimitrievski M, Veelaert P, Philips W. Behavioral Pedestrian Tracking Using a Camera and LiDAR Sensors on a Moving Vehicle. Sensors (Basel) 2019; 19:s19020391. [PMID: 30669359 PMCID: PMC6359120 DOI: 10.3390/s19020391] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 12/15/2018] [Revised: 01/04/2019] [Accepted: 01/15/2019] [Indexed: 11/16/2022]
Abstract
In this paper, we present a novel 2D–3D pedestrian tracker designed for applications in autonomous vehicles. The system operates on a tracking-by-detection principle and can track multiple pedestrians in complex urban traffic situations. By using a behavioral motion model and a non-parametric distribution as state model, we are able to accurately track unpredictable pedestrian motion in the presence of heavy occlusion. Tracking is performed independently, on the image and ground plane, in global, motion-compensated coordinates. We employ camera and LiDAR data fusion to solve the association problem, where the optimal solution is found by matching 2D and 3D detections to tracks using a joint log-likelihood observation model. Each 2D–3D particle filter then updates its state from associated observations and a behavioral motion model. Each particle moves independently following the pedestrian motion parameters which we learned offline from an annotated training dataset. Temporal stability of the state variables is achieved by modeling each track as a Markov Decision Process with probabilistic state transition properties. A novel track management system then handles high-level actions such as track creation, deletion and interaction. Using a probabilistic track score, the track manager can cull false and ambiguous detections while updating tracks with detections from actual pedestrians. Our system is implemented on a GPU and exploits the massively parallelizable nature of particle filters. Due to the Markovian nature of our track representation, the system achieves real-time performance operating with a minimal memory footprint. Exhaustive and independent evaluation of our tracker was performed by the KITTI benchmark server, where it was tested against a wide variety of unknown pedestrian tracking situations. On this realistic benchmark, we outperform all published pedestrian trackers in a multitude of tracking metrics.
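The per-track particle filter cycle (predict with a motion model, weight by observation likelihood, resample) can be sketched in one dimension; the noise parameters and Gaussian likelihood here are assumptions for illustration, not the behavioral model learned in the paper:

```python
import math
import random

def particle_filter_step(particles, control, observation,
                         sigma_motion=0.5, sigma_obs=1.0, rng=random):
    """One predict-update-resample cycle of a 1-D particle filter.

    particles:   list of scalar position hypotheses
    control:     expected displacement this frame (motion model output)
    observation: measured position from the associated detection
    """
    # Predict: propagate each particle through the motion model plus noise.
    predicted = [p + control + rng.gauss(0.0, sigma_motion) for p in particles]
    # Update: weight each particle by the observation likelihood (Gaussian).
    weights = [math.exp(-0.5 * ((observation - p) / sigma_obs) ** 2)
               for p in predicted]
    total = sum(weights)
    weights = [w / total for w in weights]
    # Resample: draw a new particle set proportional to the weights.
    return rng.choices(predicted, weights=weights, k=len(particles))
```

Because the state is carried as a particle cloud rather than a single Gaussian, multimodal hypotheses (e.g. a pedestrian who may or may not have stepped off the curb) survive until observations disambiguate them.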
Collapse
Affiliation(s)
- Martin Dimitrievski
- TELIN-IPI, Ghent University - imec, St-Pietersnieuwstraat 41, B-9000 Gent, Belgium.
| | - Peter Veelaert
- TELIN-IPI, Ghent University - imec, St-Pietersnieuwstraat 41, B-9000 Gent, Belgium.
| | - Wilfried Philips
- TELIN-IPI, Ghent University - imec, St-Pietersnieuwstraat 41, B-9000 Gent, Belgium.
| |
Collapse
|
32
|
Tan Y, Tai Y, Xiong S. NCA-Net for Tracking Multiple Objects across Multiple Cameras. Sensors (Basel) 2018; 18:E3400. [PMID: 30314285 DOI: 10.3390/s18103400] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 08/21/2018] [Revised: 10/08/2018] [Accepted: 10/08/2018] [Indexed: 01/26/2023]
Abstract
Tracking multiple pedestrians across multi-camera scenarios is an important part of intelligent video surveillance and has great potential application for public security, which has been an attractive topic in the literature in recent years. In most previous methods, artificial features such as color histograms, HOG descriptors and Haar-like features were adopted to associate objects among different cameras. However, many challenges remain, caused by low resolution, variation of illumination, complex background and posture change. In this paper, a feature extraction network named NCA-Net is designed to improve the performance of multiple object tracking across multiple cameras by avoiding the problem of insufficient robustness caused by hand-crafted features. The network combines feature learning and metric learning via a Convolutional Neural Network (CNN) model and a loss function similar to neighborhood components analysis (NCA). The loss function is adapted from the probability loss of NCA aiming at object tracking. The experiments conducted on the NLPR_MCT dataset show that we obtain satisfactory results even with a simple matching operation. In addition, we embed the proposed NCA-Net with two existing tracking systems. The experimental results on the corresponding datasets demonstrate that the extracted features using NCA-Net can effectively improve tracking performance.
Collapse
|
33
|
Wang Z, Fan L, Cai B. A 3D Relative-Motion Context Constraint-Based MAP Solution for Multiple-Object Tracking Problems. Sensors (Basel) 2018; 18:E2363. [PMID: 30037032 PMCID: PMC6069259 DOI: 10.3390/s18072363] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 05/07/2018] [Revised: 07/08/2018] [Accepted: 07/15/2018] [Indexed: 11/16/2022]
Abstract
Multi-object tracking (MOT), especially by using a moving monocular camera, is a very challenging task in the field of visual object tracking. To tackle this problem, the traditional tracking-by-detection-based method is heavily dependent on detection results. Occlusion and mis-detections will often lead to fragmented tracklets or drifting. In this paper, the tasks of MOT and camera motion estimation are formulated as finding a maximum a posteriori (MAP) solution of joint probability and synchronously solved in a unified framework. To improve performance, we incorporate the three-dimensional (3D) relative-motion model into a sequential Bayesian framework to track multiple objects and estimate the camera's ego-motion. A 3D relative-motion model that describes spatial relations among objects is exploited for predicting object states robustly and recovering objects when occlusion and mis-detections occur. Reversible jump Markov chain Monte Carlo (RJMCMC) particle filtering is applied to solve the posteriori estimation problem. Both quantitative and qualitative experiments with benchmark datasets and video collected on campus were conducted, which confirms that the proposed method outperforms existing approaches on many evaluation metrics.
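The core use of a relative-motion model, recovering an occluded object from a visible reference object and the spatial relation the two maintained, can be sketched as follows; the exponential smoothing factor is an illustrative assumption, not part of the paper's Bayesian formulation:

```python
def update_offset(prev_offset, obs_a, obs_b, alpha=0.8):
    """Exponentially smoothed relative offset between two tracked objects,
    refreshed while both are visible."""
    current = tuple(a - b for a, b in zip(obs_a, obs_b))
    return tuple(alpha * p + (1 - alpha) * c
                 for p, c in zip(prev_offset, current))

def predict_occluded(reference_pos, offset):
    """Recover an occluded object's position from a visible reference
    object and their last maintained spatial relation."""
    return tuple(r + o for r, o in zip(reference_pos, offset))
```

Because the offset is expressed between objects rather than in camera coordinates, the prediction remains valid even while the camera itself is moving, which is the motivation for pairing relative motion with ego-motion estimation.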
Collapse
Affiliation(s)
- Zhongli Wang
- School of Electronic Information and Engineering, Beijing Jiaotong University, Beijing 100044, China.
- Beijing Engineering Research Center of EMC and GNSS Technology for Rail Transportation, Beijing 100044, China.
| | - Litong Fan
- School of Electronic Information and Engineering, Beijing Jiaotong University, Beijing 100044, China.
- School of Computer Information Management, Inner Mongolia University of Finance and Economics, Hohhot 010010, China.
| | - Baigen Cai
- School of Electronic Information and Engineering, Beijing Jiaotong University, Beijing 100044, China.
- Beijing Engineering Research Center of EMC and GNSS Technology for Rail Transportation, Beijing 100044, China.
| |
Collapse
|
34
|
Zhao D, Fu H, Xiao L, Wu T, Dai B. Multi-Object Tracking with Correlation Filter for Autonomous Vehicle. Sensors (Basel) 2018; 18:s18072004. [PMID: 29932136 PMCID: PMC6068606 DOI: 10.3390/s18072004] [Citation(s) in RCA: 33] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 05/12/2018] [Revised: 06/17/2018] [Accepted: 06/18/2018] [Indexed: 11/24/2022]
Abstract
Multi-object tracking is a crucial problem for autonomous vehicles. Most state-of-the-art approaches adopt the tracking-by-detection strategy, which is a two-step procedure consisting of the detection module and the tracking module. In this paper, we improve both steps. We improve the detection module by incorporating the temporal information, which is beneficial for detecting small objects. For the tracking module, we propose a novel compressed deep Convolutional Neural Network (CNN) feature based Correlation Filter tracker. By carefully integrating these two modules, the proposed multi-object tracking approach has the ability of re-identification (ReID) once the tracked object gets lost. Extensive experiments were performed on the KITTI and MOT2015 tracking benchmarks. Results indicate that our approach outperforms most state-of-the-art tracking approaches.
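Correlation filter trackers locate the target at the peak of a response map. A spatial-domain 1-D sketch of that matching step (real correlation filter trackers learn the filter online and correlate in the Fourier domain for speed; this shows only the localization idea):

```python
def correlate(signal, template):
    """Sliding-window cross-correlation of a template over a signal;
    the response peaks where the template best matches."""
    n, m = len(signal), len(template)
    return [sum(signal[i + k] * template[k] for k in range(m))
            for i in range(n - m + 1)]

def locate(signal, template):
    """Return the offset of the correlation peak, i.e. the target position."""
    response = correlate(signal, template)
    return max(range(len(response)), key=response.__getitem__)
```

In a full tracker, the template would be a learned filter over CNN features of the target patch, updated each frame; the peak value itself also serves as a confidence score for deciding when the target is lost and ReID should trigger.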
Collapse
Affiliation(s)
- Dawei Zhao
- College of Artificial Intelligence, National University of Defense Technology, Changsha 410073, China.
| | - Hao Fu
- College of Artificial Intelligence, National University of Defense Technology, Changsha 410073, China.
| | - Liang Xiao
- National Innovation Institute of Defense Technology, Beijing 100091, China.
| | - Tao Wu
- College of Artificial Intelligence, National University of Defense Technology, Changsha 410073, China.
| | - Bin Dai
- College of Artificial Intelligence, National University of Defense Technology, Changsha 410073, China.
- National Innovation Institute of Defense Technology, Beijing 100091, China.
| |
Collapse
|
35
|
Roussel N, Sprenger J, Tappan SJ, Glaser JR. Robust tracking and quantification of C. elegans body shape and locomotion through coiling, entanglement, and omega bends. Worm 2015; 3:e982437. [PMID: 26435884 DOI: 10.4161/21624054.2014.982437] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/25/2014] [Revised: 10/17/2014] [Accepted: 10/27/2014] [Indexed: 01/27/2023]
Abstract
The behavior of the well-characterized nematode, Caenorhabditis elegans (C. elegans), is often used to study the neurologic control of sensory and motor systems in models of health and neurodegenerative disease. To advance the quantification of behaviors to match the progress made in the breakthroughs of genetics, RNA, proteins, and neuronal circuitry, analysis must be able to extract subtle changes in worm locomotion across a population. The analysis of worm crawling motion is complex due to self-overlap, coiling, and entanglement. Using current techniques, the scope of the analysis is typically restricted to worms in their non-occluded, uncoiled state, which is incomplete and fundamentally biased. Using a model describing the worm shape and crawling motion, we designed a deformable shape estimation algorithm that is robust to coiling and entanglement. This model-based shape estimation algorithm has been incorporated into a framework where multiple worms can be automatically detected and tracked simultaneously throughout the entire video sequence, thereby increasing throughput as well as data validity. The newly developed algorithms were validated against 10 manually labeled datasets obtained from video sequences comprised of various image resolutions and video frame rates. The data presented demonstrate that tracking methods incorporated in WormLab enable stable and accurate detection of these worms through coiling and entanglement. Such challenging tracking scenarios are common occurrences during normal worm locomotion. The ability of the described approach to provide stable and accurate detection of C. elegans is critical to achieve unbiased locomotory analysis of worm motion.
Collapse
|