1. Xiao X, Xiong X, Meng F, Chen Z. Multi-Scale Feature Interactive Fusion Network for RGBT Tracking. Sensors (Basel, Switzerland) 2023; 23:3410. [PMID: 37050470 PMCID: PMC10098685 DOI: 10.3390/s23073410] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2023] [Revised: 03/16/2023] [Accepted: 03/22/2023] [Indexed: 06/19/2023]
Abstract
Fusion tracking of RGB and thermal infrared images (RGBT) has attracted wide attention because the two modalities offer complementary advantages. Currently, most algorithms obtain modality weights through attention mechanisms to integrate multi-modal information. They do not fully exploit multi-scale information and ignore the rich contextual information among features, which limits tracking performance to some extent. To solve this problem, this work proposes a new multi-scale feature interactive fusion network (MSIFNet) for RGBT tracking. Specifically, we use different convolution branches for multi-scale feature extraction and adaptively aggregate them through the feature selection module. At the same time, a Transformer interactive fusion module is proposed to build long-distance dependencies and further enhance semantic representation. Finally, a global feature fusion module is designed to adjust the global information adaptively. Numerous experiments on the publicly available GTOT, RGBT234, and LasHeR datasets show that our algorithm outperforms the current mainstream tracking algorithms.
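The attention-based modality weighting that the abstract attributes to most existing algorithms can be illustrated with a minimal sketch. This is not MSIFNet and not any specific published tracker: the function names `softmax` and `fuse` are hypothetical, and the mean activation used as a score is an assumed stand-in for what would normally be a learned attention layer.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(rgb_feat, tir_feat):
    """Fuse two equal-length feature vectors with softmax modality weights.

    Assumption: each modality's score is its mean activation; a real
    tracker would learn this scoring function instead.
    """
    scores = [sum(rgb_feat) / len(rgb_feat), sum(tir_feat) / len(tir_feat)]
    w_rgb, w_tir = softmax(scores)
    fused = [w_rgb * r + w_tir * t for r, t in zip(rgb_feat, tir_feat)]
    return fused, (w_rgb, w_tir)
```

A single scalar weight per modality is the simplest form of such fusion; it uses no multi-scale or contextual information, which is the limitation the abstract points to.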
Affiliation(s)
- Xianbing Xiao
  - School of Automation and Information Engineering, Sichuan University of Science and Engineering, Yibin 644000, China
- Xingzhong Xiong
  - Artificial Intelligence Key Laboratory of Sichuan Province, Sichuan University of Science and Engineering, Yibin 644000, China
- Fanqin Meng
  - Artificial Intelligence Key Laboratory of Sichuan Province, Sichuan University of Science and Engineering, Yibin 644000, China
- Zhen Chen
  - School of Automation and Information Engineering, Sichuan University of Science and Engineering, Yibin 644000, China
2. Ma C, Zhuo L, Li J, Zhang Y, Zhang J. Occluded Prohibited Object Detection in X-ray Images with Global Context-aware Multi-Scale Feature Aggregation. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.11.034] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
3. Composite Electromagnetic Scattering and High-Resolution SAR Imaging of Multiple Targets above Rough Surface. Remote Sensing 2022. [DOI: 10.3390/rs14122910] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
To enable efficient composite electromagnetic scattering analysis and radar target detection and recognition using high-range-resolution profile (HRRP) characteristics and high-resolution synthetic aperture radar (SAR) images, a method combining near-field modified iterative physical optics with a facet-based two-scale model is presented for analyzing composite electromagnetic scattering from multiple targets above a rough surface. In this method, the coupling scattering of multiple targets is calculated by near-field iterative physical optics, and the far-field scattering is calculated by the physical optics method. To evaluate the scattering of an electrically large sea surface, a slope cutoff probability distribution function is introduced in the two-scale model. Moreover, a fast imaging method is introduced based on the proposed hybrid electromagnetic scattering method. The numerical results show the effectiveness of the proposed method, which can generate backscattering data accurately and obtain high-resolution SAR images. It is concluded that the proposed method has the advantages of accurate computation and good recognition performance.
4. SCA-MMA: Spatial and Channel-Aware Multi-Modal Adaptation for Robust RGB-T Object Tracking. Electronics 2022. [DOI: 10.3390/electronics11121820] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/04/2022]
Abstract
The RGB and thermal (RGB-T) object tracking task is challenging, especially with various target changes caused by deformation, abrupt motion, background clutter, and occlusion. It is critical to exploit the complementary nature of visual RGB and thermal infrared data. In this work, we address the RGB-T object tracking task with a novel spatial- and channel-aware multi-modal adaptation (SCA-MMA) framework, which builds an adaptive feature learning process to better mine this object-aware information in a unified network. For each type of modality information, a spatial-aware adaptation mechanism is introduced to dynamically learn the location-based characteristics of specific tracking objects at multiple convolution layers. Further, a channel-aware multi-modal adaptation mechanism is proposed to adaptively learn the feature fusion/aggregation of different modalities. To perform object tracking, we employ a binary classification module with two fully connected layers to predict the bounding boxes of specific targets. Comprehensive evaluations on the GTOT and RGBT234 datasets demonstrate the significant superiority of the proposed SCA-MMA for robust RGB-T object tracking tasks. In particular, the precision rate (PR) and success rate (SR) on the GTOT and RGBT234 datasets reach 90.5%/73.2% and 80.2%/56.9%, respectively, significantly higher than those of state-of-the-art algorithms.
5. Multiple Cues-Based Robust Visual Object Tracking Method. Electronics 2022. [DOI: 10.3390/electronics11030345] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/10/2022]
Abstract
Visual object tracking is still considered a challenging task in the computer vision research community. The object of interest undergoes significant appearance changes because of illumination variation, deformation, motion blur, background clutter, and occlusion. Kernelized correlation filter (KCF)-based tracking schemes have shown good performance in recent years. The accuracy and robustness of these trackers can be further enhanced by incorporating multiple cues from the response map. Response map computation is the complementary step in KCF-based tracking schemes, and it contains a bundle of information. The majority of KCF-based tracking methods estimate the target location by fetching a single cue, such as the peak correlation value, from the response map. This paper proposes to mine the response map in depth to fetch multiple cues about the target model. Furthermore, a new criterion based on the hybridization of multiple cues, i.e., average peak-to-correlation energy (APCE) and confidence of squared response map (CSRM), is presented to enhance the tracking efficiency. We update the following tracking modules based on the hybridized criterion: (i) occlusion detection, (ii) adaptive learning rate adjustment, (iii) drift handling using adaptive learning rate, (iv) handling, and (v) scale estimation. We integrate all these modules to propose a new tracking scheme. The proposed tracker is evaluated on challenging videos selected from three standard datasets, i.e., OTB-50, OTB-100, and TC-128. A comparison of the proposed tracking scheme with other state-of-the-art methods is also presented in this paper. Our method shows considerable improvement, achieving a center location error of 16.06, a distance precision of 0.889, and an overlap success rate of 0.824.
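The APCE cue named in the abstract is commonly defined as the squared peak-to-minimum range of the response map divided by the mean squared deviation of the map from its minimum. The sketch below computes that commonly used definition; whether this paper applies exactly this form is an assumption, the function name `apce` is hypothetical, and the CSRM cue is not covered because the abstract does not define it.

```python
def apce(response):
    """Average peak-to-correlation energy of a 2-D response map.

    `response` is a non-empty list of rows of floats. Implements the
    commonly used definition
        APCE = (F_max - F_min)^2 / mean((F - F_min)^2),
    which is assumed here, not taken from the paper itself.
    """
    values = [v for row in response for v in row]
    f_max, f_min = max(values), min(values)
    energy = sum((v - f_min) ** 2 for v in values) / len(values)
    if energy == 0.0:
        # A perfectly flat map carries no peak information.
        return 0.0
    return (f_max - f_min) ** 2 / energy


# A single sharp peak scores higher than a spread-out response.
sharp = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
spread = [[0.0, 1.0, 0.0], [1.0, 1.0, 1.0], [0.0, 1.0, 0.0]]
print(apce(sharp) > apce(spread))
```

A low APCE relative to its recent history is typically read as a sign of occlusion or drift, which is why it is a natural input to the occlusion detection and learning rate modules listed above.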
6. Mehmood K, Ali A, Jalil A, Khan B, Cheema KM, Murad M, Milyani AH. Efficient Online Object Tracking Scheme for Challenging Scenarios. Sensors 2021; 21:s21248481. [PMID: 34960574 PMCID: PMC8706150 DOI: 10.3390/s21248481] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/04/2021] [Revised: 12/03/2021] [Accepted: 12/14/2021] [Indexed: 11/20/2022]
Abstract
Visual object tracking (VOT) is a vital part of various computer vision application domains such as surveillance, unmanned aerial vehicles (UAV), and medical diagnostics. In recent years, substantial progress has been made on various challenges of VOT techniques such as change of scale, occlusions, motion blur, and illumination variations. This paper proposes a tracking algorithm in a spatiotemporal context (STC) framework. To overcome the limitations of STC under scale variation, a max-pooling-based scale scheme is incorporated by maximizing over posterior probability. To prevent the target model from drifting, an efficient mechanism is proposed for occlusion handling. Occlusion is detected by an average peak-to-correlation energy (APCE)-based mechanism applied to the response maps of consecutive frames. On successful occlusion detection, a fractional-gain Kalman filter is incorporated for handling the occlusion. An additional extension to the model includes APCE criteria to adapt the target model under motion blur and other factors. Extensive evaluation indicates that the proposed algorithm achieves significant results compared with various tracking methods.
Affiliation(s)
- Khizer Mehmood
  - Department of Electrical Engineering, International Islamic University, Islamabad 44000, Pakistan; (K.M.); (A.J.); (B.K.); (M.M.)
- Ahmad Ali
  - Department of Software Engineering, Bahria University, Islamabad 44000, Pakistan
- Abdul Jalil
  - Department of Electrical Engineering, International Islamic University, Islamabad 44000, Pakistan; (K.M.); (A.J.); (B.K.); (M.M.)
- Baber Khan
  - Department of Electrical Engineering, International Islamic University, Islamabad 44000, Pakistan; (K.M.); (A.J.); (B.K.); (M.M.)
- Khalid Mehmood Cheema
  - School of Electrical Engineering, Southeast University, Nanjing 210096, China
  - Correspondence:
- Maria Murad
  - Department of Electrical Engineering, International Islamic University, Islamabad 44000, Pakistan; (K.M.); (A.J.); (B.K.); (M.M.)
- Ahmad H. Milyani
  - School of Electrical and Computer Engineering, King Abdulaziz University, Jeddah 21589, Saudi Arabia
7. Mehmood K, Jalil A, Ali A, Khan B, Murad M, Cheema KM, Milyani AH. Spatio-Temporal Context, Correlation Filter and Measurement Estimation Collaboration Based Visual Object Tracking. Sensors 2021; 21:s21082841. [PMID: 33920648 PMCID: PMC8073341 DOI: 10.3390/s21082841] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 03/08/2021] [Revised: 04/01/2021] [Accepted: 04/15/2021] [Indexed: 11/16/2022]
Abstract
Despite notable progress in recent years, various challenges associated with object tracking algorithms, such as scale variations, partial or full occlusions, background clutter, and illumination variations, still need to be resolved with improved estimation for real-time applications. This paper proposes a robust and fast algorithm for object tracking based on spatio-temporal context (STC). A pyramid representation-based scale correlation filter is incorporated to overcome STC's inability to handle rapid changes in target scale. It learns the appearance induced by variations in the target scale, sampled at a set of different scales. During occlusion, most correlation filter trackers start drifting because the model is updated with wrong samples. To prevent the target model from drifting, an occlusion detection and handling mechanism is incorporated. Occlusion is detected from the peak correlation score of the response map. After occlusion is successfully detected, an extended Kalman filter is used for occlusion handling: it continuously predicts the target location during occlusion and passes it to the STC tracking model. This decreases the chance of tracking failure, as the Kalman filter continuously updates itself and the tracking model. Further improvement to the model is provided by fusion with average peak-to-correlation energy (APCE) criteria, which automatically update the target model to deal with environmental changes. Extensive evaluation on the benchmark datasets indicates the efficacy of the proposed tracking method compared with state-of-the-art methods.
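The occlusion-handling loop described in this abstract (detect occlusion from the peak score, then let a Kalman filter carry the target location) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it uses a plain linear constant-velocity Kalman filter on one coordinate rather than the extended Kalman filter named in the abstract, and the class name, the `track` helper, the noise values, and the fixed threshold are all hypothetical.

```python
class ConstantVelocityKalman1D:
    """Linear Kalman filter with state [position, velocity] (assumed model)."""

    def __init__(self, pos, vel=0.0, p=1.0, q=1e-2, r=1.0):
        self.x = [pos, vel]               # state estimate
        self.P = [[p, 0.0], [0.0, p]]     # state covariance
        self.q = q                        # process noise added to the diagonal
        self.r = r                        # measurement noise variance

    def predict(self, dt=1.0):
        pos, vel = self.x
        self.x = [pos + vel * dt, vel]
        (a, b), (c, d) = self.P
        # P <- F P F^T + q*I with F = [[1, dt], [0, 1]]
        self.P = [
            [a + dt * (b + c) + dt * dt * d + self.q, b + dt * d],
            [c + dt * d, d + self.q],
        ]
        return self.x[0]

    def update(self, z):
        (a, b), (c, d) = self.P
        s = a + self.r                    # innovation variance, H = [1, 0]
        k0, k1 = a / s, c / s             # Kalman gain
        y = z - self.x[0]                 # innovation
        self.x = [self.x[0] + k0 * y, self.x[1] + k1 * y]
        # P <- (I - K H) P
        self.P = [[(1 - k0) * a, (1 - k0) * b], [c - k1 * a, d - k1 * b]]


def track(measurements, peak_scores, threshold):
    """Use the tracker's measurement when the peak score is high enough;
    otherwise treat the frame as occluded and keep the prediction."""
    kf = ConstantVelocityKalman1D(pos=measurements[0])
    out = []
    for z, score in zip(measurements, peak_scores):
        predicted = kf.predict()
        if score >= threshold:
            kf.update(z)
            out.append(kf.x[0])
        else:
            out.append(predicted)         # occluded: rely on the motion model
    return out
```

In a full tracker, the same gating would also skip the appearance-model update on occluded frames, which is what prevents the drift the abstract mentions.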
Affiliation(s)
- Khizer Mehmood
  - Department of Electrical Engineering, International Islamic University, Islamabad 44000, Pakistan; (A.J.); (B.K.); (M.M.)
  - Correspondence: (K.M.); (K.M.C.)
- Abdul Jalil
  - Department of Electrical Engineering, International Islamic University, Islamabad 44000, Pakistan; (A.J.); (B.K.); (M.M.)
- Ahmad Ali
  - Department of Software Engineering, Bahria University, Islamabad 44000, Pakistan
- Baber Khan
  - Department of Electrical Engineering, International Islamic University, Islamabad 44000, Pakistan; (A.J.); (B.K.); (M.M.)
- Maria Murad
  - Department of Electrical Engineering, International Islamic University, Islamabad 44000, Pakistan; (A.J.); (B.K.); (M.M.)
- Khalid Mehmood Cheema
  - School of Electrical Engineering, Southeast University, Nanjing 210096, China
  - Correspondence: (K.M.); (K.M.C.)
- Ahmad H. Milyani
  - Department of Electrical and Computer Engineering, King Abdulaziz University, Jeddah 21589, Saudi Arabia