1
Zhang H, Qian F, Shi P, Du W, Tang Y, Qian J, Gong C, Yang J. Generalized Nonconvex Nonsmooth Low-Rank Matrix Recovery Framework With Feasible Algorithm Designs and Convergence Analysis. IEEE Transactions on Neural Networks and Learning Systems 2023; 34:5342-5353. [PMID: 35737613 DOI: 10.1109/tnnls.2022.3183970] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
Decomposing a data matrix into low-rank plus additive matrices is a commonly used strategy in pattern recognition and machine learning. This article mainly studies the alternating direction method of multipliers (ADMM) with two dual variables, which is used to optimize the generalized nonconvex nonsmooth low-rank matrix recovery problems. Furthermore, the minimization framework with a feasible optimization procedure is designed along with the theoretical analysis, where the variable sequences generated by the proposed ADMM can be proved to be bounded. Most importantly, it can be concluded from the Bolzano-Weierstrass theorem that there must exist a subsequence converging to a critical point, which satisfies the Karush-Kuhn-Tucker (KKT) conditions. Meanwhile, we further ensure the local and global convergence properties of the generated sequence by constructing a potential objective function. In particular, the detailed convergence analysis can be regarded as one of the core contributions besides the algorithm designs and the model generality. Finally, numerical simulations and real-world applications are both provided to verify consistency with the theoretical results, and we also validate the superiority in performance over several closely related solvers on the tasks of image inpainting and subspace clustering.
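As a rough illustration of the kind of low-rank subproblem that ADMM-style solvers handle, the sketch below implements singular value thresholding, the proximal step for the convex nuclear norm. This is an assumption made for illustration only: the abstract concerns generalized nonconvex penalties, whose thresholding rules differ, and the function name `svt` is hypothetical.

```python
import numpy as np

def svt(M, tau):
    """Singular value thresholding: shrink each singular value of M by tau.

    This is the proximal operator of tau * (nuclear norm), a convex
    stand-in for the nonconvex penalties discussed in the abstract.
    """
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)   # singular values below tau become zero
    return (U * s_shrunk) @ Vt            # rebuild the matrix from shrunk values

# Example: a diagonal matrix with singular values 3 and 1, thresholded at 2,
# keeps only one nonzero singular value (rank drops from 2 to 1).
low_rank = svt(np.diag([3.0, 1.0]), 2.0)
```

Inside an ADMM loop, a step like this would typically update the low-rank variable while the other variables and the dual variables are held fixed.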
2
Wang Q, Liu R, Chen M, Li X. Robust Rank-Constrained Sparse Learning: A Graph-Based Framework for Single View and Multiview Clustering. IEEE Transactions on Cybernetics 2022; 52:10228-10239. [PMID: 33872170 DOI: 10.1109/tcyb.2021.3067137] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 06/12/2023]
Abstract
Graph-based clustering aims to partition the data according to a similarity graph, and has shown impressive performance on various kinds of tasks. The quality of the similarity graph largely determines the clustering results, but it is difficult to produce a high-quality one, especially when the data contain noise and outliers. To solve this problem, we propose a robust rank constrained sparse learning (RRCSL) method in this article. The L2,1-norm is adopted in the objective function of sparse representation to learn the optimal graph with robustness. To preserve the data structure, we construct an initial graph and search for the graph within its neighborhood. By incorporating a rank constraint, the learned graph can be directly used as the cluster indicator, and the final results are obtained without additional postprocessing. In addition, the proposed method can not only be applied to single-view clustering but also be extended to multiview clustering. Extensive experiments on synthetic and real-world datasets have demonstrated the superiority and robustness of the proposed framework.
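To make the robustness argument for the L2,1-norm concrete, here is a minimal sketch comparing it with the squared Frobenius norm on a residual matrix with one outlier row. It assumes the row-wise convention (sum of the Euclidean norms of the rows); some papers sum over columns instead, and the function names and the sample data are hypothetical.

```python
import numpy as np

def l21_norm(E):
    """L2,1-norm under the row-wise convention: sum of row Euclidean norms."""
    return float(np.sum(np.linalg.norm(E, axis=1)))

def squared_frobenius(E):
    """Squared Frobenius norm: sum of all squared entries."""
    return float(np.sum(E ** 2))

# Hypothetical residuals: two small rows and one outlier row.
residual = np.array([[0.1, 0.0],
                     [0.0, 0.2],
                     [3.0, 4.0]])

# The outlier row contributes its norm (5) to the L2,1-norm but its
# squared norm (25) to the squared Frobenius norm.
print(l21_norm(residual), squared_frobenius(residual))
```

Because each row enters linearly rather than quadratically, a single corrupted sample dominates the objective less, which is the usual motivation for this choice.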
3
Li C, Zhu J, Bi L, Zhang W, Liu Y. A low-light image enhancement method with brightness balance and detail preservation. PLoS One 2022; 17:e0262478. [PMID: 35639677 PMCID: PMC9154181 DOI: 10.1371/journal.pone.0262478] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2021] [Accepted: 12/24/2021] [Indexed: 11/18/2022] Open
Abstract
This paper proposes a new low-light image enhancement method that balances image brightness and preserves image details, improving the brightness and contrast of low-light images while maintaining image details. Traditional histogram equalization methods often lead to excessive enhancement and loss of details, thereby resulting in an unclear and unnatural appearance. In this method, the image is processed bidirectionally. On the one hand, the image is processed by double histogram equalization with a double automatic platform method based on an improved cuckoo search (CS) algorithm, where the image histogram is first segmented, and the platform limit is selected according to the histogram statistics and the improved CS technique. Then, the sub-histograms are clipped by the two platforms and histogram equalization is carried out on each of them. Finally, an image with balanced brightness and good contrast is obtained. On the other hand, the main structure of the image is extracted based on the total variation model, and an image mask containing all the texture details is made by removing the main structure of the image. Eventually, the final enhanced image is obtained by adding the mask with texture details to the image with balanced brightness and good contrast. Compared with the existing methods, the proposed algorithm significantly enhances the visual effect of low-light images, based on human subjective evaluation and objective evaluation indices. Experimental results show that the proposed method is better than the existing methods.
Affiliation(s)
- Canlin Li
- School of Computer and Communication Engineering, Zhengzhou University of Light Industry, Zhengzhou, China
- * E-mail:
- Jinjuan Zhu
- School of Computer and Communication Engineering, Zhengzhou University of Light Industry, Zhengzhou, China
- Lihua Bi
- School of Computer and Communication Engineering, Zhengzhou University of Light Industry, Zhengzhou, China
- Weizheng Zhang
- School of Computer and Communication Engineering, Zhengzhou University of Light Industry, Zhengzhou, China
- Yan Liu
- School of Computer and Communication Engineering, Zhengzhou University of Light Industry, Zhengzhou, China
4
Zhang H, Qian F, Shang F, Du W, Qian J, Yang J. Global Convergence Guarantees of (A)GIST for a Family of Nonconvex Sparse Learning Problems. IEEE Transactions on Cybernetics 2022; 52:3276-3288. [PMID: 32784147 DOI: 10.1109/tcyb.2020.3010960] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 06/11/2023]
Abstract
In recent years, many studies have shown that the generalized iterated shrinkage thresholdings (GISTs) have become commonly used first-order optimization algorithms for sparse learning problems. The nonconvex relaxations of the l0-norm usually achieve better performance than the convex case (e.g., l1-norm) since the former can achieve a nearly unbiased solver. To increase the calculation efficiency, this work further provides an accelerated GIST version, that is, AGIST, through the extrapolation-based acceleration technique, which can help reduce the number of iterations when solving a family of nonconvex sparse learning problems. Besides, we present the algorithmic analysis, including both local and global convergence guarantees, as well as other intermediate results for the GIST and AGIST, denoted as (A)GIST, by virtue of the Kurdyka-Łojasiewicz (KŁ) property and some milder assumptions. Numerical experiments on both synthetic data and real-world databases demonstrate that the convergence results of the objective function accord with the theoretical properties and that nonconvex sparse learning methods can achieve superior performance over some convex ones.
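The shrinkage-thresholding iteration and its extrapolated variant can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' (A)GIST: it uses the convex l1 penalty (soft-thresholding) on a least-squares loss, a fixed step size of 1/L, and a standard momentum weight of (k - 1)/(k + 2). Nonconvex penalties would replace `soft_threshold` with their own thresholding rule. All names are hypothetical.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * l1-norm (elementwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def shrinkage_thresholding(A, b, lam, n_iter=200, extrapolate=False):
    """Minimize 0.5 * ||A x - b||^2 + lam * ||x||_1 by proximal gradient steps."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    x_prev = x.copy()
    for k in range(1, n_iter + 1):
        # Extrapolation: step from a point pushed along the last move.
        beta = (k - 1) / (k + 2) if extrapolate else 0.0
        y = x + beta * (x - x_prev)
        grad = A.T @ (A @ y - b)           # gradient of the smooth loss at y
        x_prev = x
        x = soft_threshold(y - grad / L, lam / L)
    return x
```

With `extrapolate=True` the iteration reuses the previous step direction, which is the general idea behind extrapolation-based acceleration; whether it lowers the iteration count in practice depends on the problem.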
5
Li J, Fang B, Zhou M. Multi-Modal Sparse Tracking by Jointing Timing and Modal Consistency. Int J Pattern Recogn 2022. [DOI: 10.1142/s0218001422510089] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Abstract
In this paper, we propose a multi-modal sparse tracker that joins timing and modal consistency to locate the target using the similarity of multiple local appearances. First, we propose an alignable patching strategy for the red-green-blue (RGB) color mode and the thermal infrared mode to adapt to local changes of the target. Second, we propose a consistency expression of the corresponding aligned patches between the modes and the correlation of the Gaussian mapping within each mode to reconstruct the target judgment likelihood function. Finally, we propose an updating scheme based on timing correlation and mode sparsity to fit the target changes. According to the experimental results, a significant average improvement in tracking accuracy can be achieved compared with the state-of-the-art algorithms. The source code of our algorithm is available at https://github.com/Liincq/tracker.
Affiliation(s)
- Jiajun Li
- College of Computer Science, Chongqing University, Chongqing 400044, P. R. China
- Bin Fang
- College of Computer Science, Chongqing University, Chongqing 400044, P. R. China
- Mingliang Zhou
- College of Computer Science, Chongqing University, Chongqing 400044, P. R. China
6
Qian J, Wong WK, Zhang H, Xie J, Yang J. Joint Optimal Transport With Convex Regularization for Robust Image Classification. IEEE Transactions on Cybernetics 2022; 52:1553-1564. [PMID: 32452782 DOI: 10.1109/tcyb.2020.2991219] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/11/2023]
Abstract
The critical step in learning a robust regression model from high-dimensional visual data is how to characterize the error term. The existing methods mainly employ the nuclear norm to describe the error term, which is robust against structured noise (e.g., illumination changes and occlusions). Although the nuclear norm can describe the structural property of the error term, global distribution information is ignored in most of these methods. It is known that optimal transport (OT) is a robust distribution metric scheme because it can handle correspondences between different elements in the two distributions. Leveraging this property, this article presents a novel robust regression scheme by integrating OT with convex regularization. The OT-based regression with L2 norm regularization (OTR) is first proposed to perform image classification. The alternating direction method of multipliers is developed to handle the model. To further address the occlusion problem in image classification, the extended OTR (EOTR) model is then presented by integrating the nuclear norm error term with an OTR model. In addition, we apply the alternating direction method of multipliers with Gaussian back substitution to solve EOTR and also provide the complexity and convergence analysis of our algorithms. Experiments were conducted on five benchmark datasets, including illumination changes and various occlusions. The experimental results demonstrate the performance of our robust regression model on biometric image classification against several state-of-the-art regression-based classification methods.
7
Chen M, Li X. Robust Matrix Factorization With Spectral Embedding. IEEE Transactions on Neural Networks and Learning Systems 2021; 32:5698-5707. [PMID: 33090957 DOI: 10.1109/tnnls.2020.3027351] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 06/11/2023]
Abstract
Nonnegative matrix factorization (NMF) and spectral clustering are two of the most widely used clustering techniques. However, NMF cannot deal with nonlinear data, and spectral clustering relies on postprocessing. In this article, we propose a Robust Matrix factorization with Spectral embedding (RMS) approach for data clustering, which inherits the advantages of NMF and spectral clustering, while avoiding their shortcomings. In addition, to cluster the data represented by multiple views, we present the multiview version of RMS (M-RMS), in which the weights of different views are self-tuned. The main contributions of this research are threefold: 1) by integrating spectral clustering and matrix factorization, the proposed methods are able to capture the nonlinear data structure and obtain the cluster indicator directly; 2) instead of using the squared Frobenius-norm, the objectives are developed with the l2,1-norm, such that the effects of the outliers are alleviated; and 3) the proposed methods are totally parameter-free, which increases the applicability for various real-world problems. Extensive experiments on several single-view/multiview data sets demonstrate the effectiveness of our methods and verify their superior clustering performance over the state of the art.
8
Zhang X, Ma S, Wang S, Zhang J, Sun H, Gao W. Divisively Normalized Sparse Coding: Toward Perceptual Visual Signal Representation. IEEE Transactions on Cybernetics 2021; 51:4237-4250. [PMID: 30843814 DOI: 10.1109/tcyb.2019.2899005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/09/2023]
Abstract
Sparse representation has been shown to be highly correlated with the visual perception of natural images, which can be characterized by a linear combination of neuronal responses in the visual cortex. Divisive normalization transform (DNT) has been proven to be an effective method in reducing statistical and perceptual dependencies for nonlinear properties in primary visual cortex. In this paper, we develop a divisively normalized sparse coding scheme, aiming to further bridge the gap between sparse representation and human visual perception. We show that such a scheme is perceptually meaningful for representing visual signals, with which the pixel-domain image representation and processing tasks can be feasibly and efficiently achieved in the divisively normalized sparse-domain. Specifically, we develop a sparse-domain similarity (SDS) index for perceptual quality evaluation, where the DNT is employed for transforming image signals into a perceptually uniform space. Furthermore, the proposed SDS index is employed to optimize the sparse coding process when representing natural images. The experimental results indicate that the SDS can provide accurate and consistent predictions of perceived image quality, and the performance of sparse coding can be significantly improved in terms of both objective and subjective quality evaluations.
9
Hsu HM, Cai J, Wang Y, Hwang JN, Kim KJ. Multi-Target Multi-Camera Tracking of Vehicles Using Metadata-Aided Re-ID and Trajectory-Based Camera Link Model. IEEE Transactions on Image Processing 2021; 30:5198-5210. [PMID: 33999821 DOI: 10.1109/tip.2021.3078124] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 06/12/2023]
Abstract
In this paper, we propose a novel framework for multi-target multi-camera tracking (MTMCT) of vehicles based on metadata-aided re-identification (MA-ReID) and the trajectory-based camera link model (TCLM). Given a video sequence and the corresponding frame-by-frame vehicle detections, we first address the isolated tracklets issue from single camera tracking (SCT) by the proposed traffic-aware single-camera tracking (TSCT). Then, after automatically constructing the TCLM, we solve MTMCT by the MA-ReID. The TCLM is generated from camera topological configuration to obtain the spatial and temporal information to improve the performance of MTMCT by reducing the candidate search of ReID. We also use the temporal attention model to create more discriminative embeddings of trajectories from each camera to achieve robust distance measures for vehicle ReID. Moreover, we train a metadata classifier for MTMCT to obtain the metadata feature, which is concatenated with the temporal attention based embeddings. Finally, the TCLM and hierarchical clustering are jointly applied for global ID assignment. The proposed method is evaluated on the CityFlow dataset, achieving IDF1 76.77%, which outperforms the state-of-the-art MTMCT methods.
10
Zhang Z, Zhang Y, Xu M, Zhang L, Yang Y, Yan S. A Survey on Concept Factorization: From Shallow to Deep Representation Learning. Inf Process Manag 2021. [DOI: 10.1016/j.ipm.2021.102534] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Indexed: 11/28/2022]
11
JROTM: Jointly reinforced object tracking with temporal content reference and motion guidance. Neurocomputing 2021. [DOI: 10.1016/j.neucom.2020.12.111] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/22/2022]
12
Zhou Q, Fan H, Yang H, Su H, Zheng S, Wu S, Ling H. Robust and Efficient Graph Correspondence Transfer for Person Re-Identification. IEEE Transactions on Image Processing 2021; 30:1623-1638. [PMID: 31071040 DOI: 10.1109/tip.2019.2914575] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 06/09/2023]
Abstract
Spatial misalignment caused by variations in poses and viewpoints is one of the most critical issues that hinder the performance improvement in existing person re-identification (Re-ID) algorithms. Although it is straightforward to explore correspondence learning algorithms for alignment, online learning is intractable for negative pairs due to the intrinsic visual difference between negative pairs and efficiency concern. To address this problem, in this paper, we present a robust and efficient graph correspondence transfer (REGCT) approach for explicit spatial alignment in Re-ID. Specifically, we propose the off-line correspondence learning and on-line correspondence transfer framework. During training, patch-wise correspondences between positive training pairs are established via graph matching. By exploiting both spatial and visual contexts of human appearance in graph matching, meaningful semantic correspondences can be obtained. During testing, the off-line learned patch-wise correspondence templates are transferred to test pairs with similar pose-pair configurations for local feature distance calculation. To enhance the robustness of correspondence transfer, we design a novel pose context descriptor to accurately model human body configurations, and present an approach to measure the similarity between a pair of pose context descriptors. Meanwhile, to improve testing efficiency, we propose a correspondence template ensemble method using the voting mechanism, which significantly reduces the amount of patch-wise matchings involved in distance calculation. With the aforementioned strategies, the REGCT model can effectively and efficiently handle the spatial misalignment problem in Re-ID. Extensive experiments on five challenging benchmarks, including VIPeR, Road, PRID450S, 3DPES, and CUHK01, evidence the superior performance of REGCT over other state-of-the-art approaches.
13
Dubey SR, Chakraborty S, Roy SK, Mukherjee S, Singh SK, Chaudhuri BB. diffGrad: An Optimization Method for Convolutional Neural Networks. IEEE Transactions on Neural Networks and Learning Systems 2020; 31:4500-4511. [PMID: 31880565 DOI: 10.1109/tnnls.2019.2955777] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3] [Indexed: 06/10/2023]
Abstract
Stochastic gradient descent (SGD) is one of the core techniques behind the success of deep neural networks. The gradient provides information on the direction in which a function has the steepest rate of change. The main problem with basic SGD is to change by equal-sized steps for all parameters, irrespective of the gradient behavior. Hence, an efficient way of deep network optimization is to have adaptive step sizes for each parameter. Recently, several attempts have been made to improve gradient descent methods such as AdaGrad, AdaDelta, RMSProp, and adaptive moment estimation (Adam). These methods rely on the square roots of exponential moving averages of squared past gradients. Thus, these methods do not take advantage of local change in gradients. In this article, a novel optimizer is proposed based on the difference between the present and the immediate past gradient (i.e., diffGrad). In the proposed diffGrad optimization technique, the step size is adjusted for each parameter in such a way that it should have a larger step size for faster gradient changing parameters and a lower step size for lower gradient changing parameters. The convergence analysis is done using the regret bound approach of the online learning framework. In this article, thorough analysis is made over three synthetic complex nonconvex functions. The image categorization experiments are also conducted over the CIFAR10 and CIFAR100 data sets to observe the performance of diffGrad with respect to the state-of-the-art optimizers such as SGDM, AdaGrad, AdaDelta, RMSProp, AMSGrad, and Adam. The residual unit (ResNet)-based convolutional neural network (CNN) architecture is used in the experiments. The experiments show that diffGrad outperforms other optimizers. Also, we show that diffGrad performs uniformly well for training CNN using different activation functions. The source code is made publicly available at https://github.com/shivram1987/diffGrad.
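The idea of scaling each parameter's step by how much its gradient has changed can be sketched as below. The abstract does not give the update formula, so this sketch makes assumptions: it uses an Adam-style update with bias-corrected moment estimates and multiplies the step by a sigmoid of the absolute difference between the previous and current gradients. The function name and default hyperparameters are hypothetical; the authors' repository linked in the abstract is the reference for the actual method.

```python
import numpy as np

def diffgrad_like_step(theta, grad, prev_grad, m, v, t,
                       lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam-style step scaled by a gradient-difference coefficient."""
    m = beta1 * m + (1 - beta1) * grad            # first-moment moving average
    v = beta2 * v + (1 - beta2) * grad ** 2       # second-moment moving average
    m_hat = m / (1 - beta1 ** t)                  # bias correction, t starts at 1
    v_hat = v / (1 - beta2 ** t)
    # Assumed coefficient in [0.5, 1): near 0.5 when the gradient barely
    # changes, approaching 1 when it changes quickly.
    xi = 1.0 / (1.0 + np.exp(-np.abs(prev_grad - grad)))
    theta = theta - lr * xi * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Usage on f(x) = x^2 (gradient 2x), starting from x = 1.
theta, m, v = np.array([1.0]), np.zeros(1), np.zeros(1)
prev_grad = np.zeros(1)
for t in range(1, 101):
    grad = 2.0 * theta
    theta, m, v = diffgrad_like_step(theta, grad, prev_grad, m, v, t, lr=0.01)
    prev_grad = grad
```

Parameters whose gradients change slowly take roughly half of the Adam step under this assumed coefficient, while fast-changing ones take close to the full step, matching the behaviour described in the abstract.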
14
Li WH, Xiang S, Nie WZ, Song D, Liu AA, Li XY, Hao T. Joint deep feature learning and unsupervised visual domain adaptation for cross-domain 3D object retrieval. Inf Process Manag 2020. [DOI: 10.1016/j.ipm.2020.102275] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 11/29/2022]
15
Wan A. Stable recovery of approximately k-sparse signals in noisy cases via ℓ minimization. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2020.04.014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022]
16
Talal TM, Attiya G, Metwalli MR, Abd El-Samie FE, Dessouky MI. Satellite image fusion based on modified central force optimization. Multimedia Tools and Applications 2020; 79:21129-21154. [DOI: 10.1007/s11042-019-08471-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/18/2018] [Revised: 08/10/2019] [Accepted: 11/12/2019] [Indexed: 09/02/2023]
17
Deng C, Han Y, Zhao B. High-Performance Visual Tracking With Extreme Learning Machine Framework. IEEE Transactions on Cybernetics 2020; 50:2781-2792. [PMID: 30624237 DOI: 10.1109/tcyb.2018.2886580] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 06/09/2023]
Abstract
In real-time applications, a fast and robust visual tracker should generally have the following important properties: 1) feature representation of an object that is not only efficient but also has a good discriminative capability and 2) appearance modeling which can quickly adapt to the variations of foreground and backgrounds. However, most of the existing tracking algorithms cannot achieve satisfactory performance in both of the two aspects. To address this issue, in this paper, we advocate a novel and efficient visual tracker by exploiting the excellent feature learning and classification capabilities of an emerging learning technique, that is, extreme learning machine (ELM). The contributions of the proposed work are as follows: 1) motivated by the simplicity and learning ability of the ELM autoencoder (ELM-AE), an ELM-AE-based feature extraction model is presented, and this model can provide a compact and discriminative representation of the inputs efficiently and 2) due to the fast learning speed of an ELM classifier, an ELM-based appearance model is developed for feature classification, and is able to rapidly distinguish the object of interest from its surroundings. In addition, in order to cope with the visual changes of the target and its backgrounds, the online sequential ELM is used to incrementally update the appearance model. Plenty of experiments on challenging image sequences demonstrate the effectiveness and robustness of the proposed tracker.
18
19
Zhou T, Zhang C, Gong C, Bhaskar H, Yang J. Multiview Latent Space Learning With Feature Redundancy Minimization. IEEE Transactions on Cybernetics 2020; 50:1655-1668. [PMID: 30571651 DOI: 10.1109/tcyb.2018.2883673] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Indexed: 06/09/2023]
Abstract
Multiview learning has received extensive research interest and has demonstrated promising results in recent years. Despite the progress made, there are two significant challenges within multiview learning. First, some of the existing methods directly use original features to reconstruct data points without considering the issue of feature redundancy. Second, existing methods cannot fully exploit the complementary information across multiple views and meanwhile preserve the view-specific properties; therefore, the degraded learning performance will be generated. To address the above issues, we propose a novel multiview latent space learning framework with feature redundancy minimization. We aim to learn a latent space to mitigate the feature redundancy and use the learned representation to reconstruct every original data point. More specifically, we first project the original features from multiple views onto a latent space, and then learn a shared dictionary and view-specific dictionaries to, respectively, exploit the correlations across multiple views as well as preserve the view-specific properties. Furthermore, the Hilbert-Schmidt independence criterion is adopted as a diversity constraint to explore the complementarity of multiview representations, which further ensures the diversity from multiple views and preserves the local structure of the data in each view. Experimental results on six public datasets have demonstrated the effectiveness of our multiview learning approach against other state-of-the-art methods.
20
Jia F, Wang X, Guan J, Liao Q, Zhang J, Li H, Qi S. Bi-Connect Net for salient object detection. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2019.12.020] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 11/17/2022]
21
Ashiba MI, Tolba MS, El-Fishawy AS, El-Samie FEA. Hybrid enhancement of infrared night vision imaging system. Multimedia Tools and Applications 2020; 79:6085-6108. [DOI: 10.1007/s11042-019-7510-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/16/2018] [Revised: 02/07/2019] [Accepted: 03/18/2019] [Indexed: 09/02/2023]
22
23
Lan X, Ye M, Zhang S, Zhou H, Yuen PC. Modality-correlation-aware sparse representation for RGB-infrared object tracking. Pattern Recognit Lett 2020. [DOI: 10.1016/j.patrec.2018.10.002] [Citation(s) in RCA: 55] [Impact Index Per Article: 13.8] [Indexed: 11/16/2022]
24
Zhang H, Qian J, Zhang B, Yang J, Gong C, Wei Y. Low-Rank Matrix Recovery via Modified Schatten-p Norm Minimization with Convergence Guarantees. IEEE Transactions on Image Processing 2019; 29:3132-3142. [PMID: 31831418 DOI: 10.1109/tip.2019.2957925] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Indexed: 06/10/2023]
Abstract
In recent years, low-rank matrix recovery problems have attracted much attention in computer vision and machine learning. The corresponding rank minimization problems are both combinational and NP-hard in general, which are mainly solved by both nuclear norm and Schatten-p (0
25
Zhang W, He X, Lu W, Qiao H, Li Y. Feature Aggregation With Reinforcement Learning for Video-Based Person Re-Identification. IEEE Transactions on Neural Networks and Learning Systems 2019; 30:3847-3852. [PMID: 30872245 DOI: 10.1109/tnnls.2019.2899588] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 06/09/2023]
Abstract
Video-based person re-identification (re-id) matches two tracks of persons from different cameras. Features are extracted from the images of a sequence and then aggregated as a track feature. In contrast to existing works that aggregate frame features by simply averaging them or using temporal models such as recurrent neural networks, we propose an intelligent feature aggregation method based on reinforcement learning. Specifically, we train an agent to determine which frames in the sequence should be abandoned in the aggregation, which can be treated as a decision-making process. In this way, the proposed method avoids introducing noisy information from the sequence and retains the valuable frames when generating a track feature. On benchmark data sets, experimental results show that our method can noticeably boost the re-id accuracy of state-of-the-art models.
26
Rajendra Kurup A, Ajith M, Martínez Ramón M. Semi-supervised facial expression recognition using reduced spatial features and Deep Belief Networks. Neurocomputing 2019. [DOI: 10.1016/j.neucom.2019.08.029] [Citation(s) in RCA: 33] [Impact Index Per Article: 6.6] [Indexed: 11/25/2022]
27
Byeon M, Lee M, Kim K, Choi JY. Variational Inference for 3-D Localization and Tracking of Multiple Targets Using Multiple Cameras. IEEE Transactions on Neural Networks and Learning Systems 2019; 30:3260-3274. [PMID: 30703042 DOI: 10.1109/tnnls.2018.2890526] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 06/09/2023]
Abstract
This paper proposes a novel unified framework to solve the 3-D localization and tracking problem that occurs in multiple-camera settings with overlapping views. The main challenge is to overcome the uncertainty of the back projection arising from the difficulty of ground point detection in an environment that includes severe occlusions and the unknown heights of people. To tackle this challenge, we establish a Bayesian learning framework that maximizes the posterior over the trajectory assignments and 3-D positions for given detections from multiple cameras. To solve the Bayesian learning problem in a tractable form, we develop an expectation-maximization scheme based on the variational inference approximation, where the probability distributions are designed to follow Boltzmann distributions of seven terms that are induced from multicamera tracking settings. The experimental results show that the proposed method outperforms the state-of-the-art methods on challenging multicamera data sets.
|
28
|
Abdellatef E, Ismail NA, Abd Elrahman SESE, Ismail KN, Rihan M, Abd El-Samie FE. Cancelable fusion-based face recognition. MULTIMEDIA TOOLS AND APPLICATIONS 2019; 78:31557-31580. [DOI: 10.1007/s11042-019-07848-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/14/2018] [Revised: 05/15/2019] [Accepted: 05/31/2019] [Indexed: 09/01/2023]
|
29
|
Abstract
Object tracking has always been an interesting and essential research topic in computer vision. The model update mechanism is an essential part of it, and its robustness has therefore become a crucial factor influencing tracking quality over a sequence. This review analyses recent tracking model update strategies: the occasion for updating the target model is discussed first, followed by a detailed discussion of target model update strategies based on the mainstream tracking frameworks, and then the background update frameworks. The experimental performance of recent trackers on specific sequences is listed in this review, the strengths and some failure cases of each are discussed, and conclusions are drawn from those results. A crucial point is that the design of a proper background model, as well as its update strategy, ought to be taken into consideration. A cascade update of the template corresponding to each deep network layer, based on each layer's contribution to target recognition, can also help locate the target more accurately, where target saliency information can be utilized as a tool for state estimation.
|
30
|
Zhu G, Zhang Z, Wang J, Wu Y, Lu H. Dynamic Collaborative Tracking. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2019; 30:3035-3046. [PMID: 32175852 DOI: 10.1109/tnnls.2018.2861838] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]
Abstract
Correlation filters have demonstrated remarkable success in visual tracking recently. However, most existing methods often face model drift caused by several factors, such as unlimited boundary effect, heavy occlusion, fast motion, and distracter perturbation. To address this issue, this paper proposes a unified dynamic collaborative tracking framework that can perform more flexible and robust position prediction. Specifically, the framework learns the object appearance model by jointly training the objective function with three components: a target regression submodule, a distracter suppression submodule, and a maximum margin relation submodule. The first submodule mainly takes advantage of the circulant structure of training samples to obtain the ability to distinguish the target from its surrounding background. The second submodule drives the label response of the possible distracting region close to zero to reduce the peak value of the confidence map in that region. Inspired by structured output support vector machines, the third submodule utilizes the differences between the target appearance representation and the distracter appearance representation in the discriminative mapping space to alleviate the disturbance of the most likely hard negative samples. In addition, a CUR filter is embedded as an assistant detector to provide effective object candidates and alleviate the model drift problem. Comprehensive experimental results show that the proposed approach achieves state-of-the-art performance on several public benchmark data sets.
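The circulant structure mentioned in this abstract is what lets a basic correlation filter be trained per frequency. The following is only a minimal 1-D sketch of that generic idea (a MOSSE-style closed form), not the paper's three-submodule framework; the names `dft`, `train_filter`, `respond`, and `shifted`, the toy template, and the regularizer `lam` are all assumptions made for illustration.

```python
import cmath
import math


def dft(signal, inverse=False):
    """Naive O(n^2) discrete Fourier transform of a 1-D sequence."""
    n = len(signal)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        total = sum(signal[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
                    for t in range(n))
        out.append(total / n if inverse else total)
    return out


def train_filter(x, y, lam=1e-2):
    """Regularized least squares over all cyclic shifts of x, per frequency."""
    X, Y = dft(x), dft(y)
    return [Y[k] * X[k].conjugate() / (abs(X[k]) ** 2 + lam)
            for k in range(len(x))]


def respond(filt, z):
    """Real-valued response map of the learned filter on a new signal z."""
    Z = dft(z)
    resp = dft([Z[k] * filt[k] for k in range(len(z))], inverse=True)
    return [r.real for r in resp]


def shifted(x, s):
    """Cyclic shift of x to the right by s samples."""
    return [x[(i - s) % len(x)] for i in range(len(x))]


n = 8
template = [4.0, 1.0, 0.5, 0.0, 0.0, 0.0, 0.5, 1.0]  # toy appearance signal
# Desired response: a Gaussian peak at index 0 (circular distance).
target = [math.exp(-min(i, n - i) ** 2 / 2.0) for i in range(n)]

filt = train_filter(template, target)
response = respond(filt, shifted(template, 3))
peak = max(range(n), key=lambda i: response[i])  # estimated displacement
```

In this toy setting the response peak follows the cyclic displacement of the template, which is the position-prediction step that trackers of this family build on.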
|
31
|
Zhang H, Gong C, Qian J, Zhang B, Xu C, Yang J. Efficient Recovery of Low-Rank Matrix via Double Nonconvex Nonsmooth Rank Minimization. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2019; 30:2916-2925. [PMID: 30892254 DOI: 10.1109/tnnls.2019.2900572] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/09/2023]
Abstract
Recently, there has been rapidly increasing interest in the efficient recovery of low-rank matrices in computer vision and machine learning. The popular convex solution to rank minimization is nuclear norm-based minimization (NNM), which usually leads to a biased solution since NNM tends to overshrink the rank components and treats each rank component equally. To address this issue, some nonconvex nonsmooth rank (NNR) relaxations have been widely exploited. Different from these convex and nonconvex rank substitutes, this paper first introduces a general and flexible rank relaxation function named the weighted NNR relaxation function, which is derived from the initial double NNR (DNNR) relaxations, i.e., the DNNR relaxation function acts on the nonconvex singular values function (SVF). An iteratively reweighted SVF optimization algorithm with a continuation technique, which computes supergradient values to define the weighting vector, is devised to solve the DNNR minimization problem, and the closed-form solution of the subproblem can be efficiently obtained by a general proximal operator, in which the elements of the desired weighting vector usually satisfy the nondecreasing order. We next prove that the objective function values decrease monotonically, and any limit point of the generated subsequence is a critical point. Combining the Kurdyka-Łojasiewicz property with some milder assumptions, we further give its global convergence guarantee. As an application to the matrix completion problem, experimental results on both synthetic and real-world data show that our methods are competitive with several state-of-the-art convex and nonconvex matrix completion methods.
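The overshrinking effect and the role of a nondecreasing weighting vector can be seen on the singular values alone. The sketch below is an illustration under stated assumptions, not the paper's DNNR algorithm: it works on a hand-picked list of singular values (no SVD is computed), and it assumes a log penalty log(sigma + eps) as the concave surrogate, whose supergradient 1 / (sigma + eps) supplies the weights. All function names are hypothetical.

```python
def uniform_shrink(sigmas, tau):
    """Nuclear-norm proximal step: every singular value shrinks by tau."""
    return [max(s - tau, 0.0) for s in sigmas]


def log_supergradient_weights(sigmas, eps=0.1):
    """Supergradient of the assumed concave penalty log(sigma + eps)."""
    return [1.0 / (s + eps) for s in sigmas]


def weighted_shrink(sigmas, tau, weights):
    """Weighted thresholding: value i shrinks by tau * weights[i]."""
    return [max(s - tau * w, 0.0) for s, w in zip(sigmas, weights)]


sigmas = [10.0, 5.0, 1.0, 0.1]  # singular values in descending order
tau = 1.0

nnm = uniform_shrink(sigmas, tau)
weights = log_supergradient_weights(sigmas)
reweighted = weighted_shrink(sigmas, tau, weights)
```

Because the singular values are sorted in descending order, the weights come out in nondecreasing order, so the dominant components are shrunk far less than under the uniform rule while the smallest ones are still set to zero.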
|
32
|
|
33
|
Computational Imaging Method with a Learned Plug-and-Play Prior for Electrical Capacitance Tomography. Cognit Comput 2019. [DOI: 10.1007/s12559-019-09682-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/26/2022]
|
34
|
Abstract
Kernel correlation filters (KCF) demonstrate significant potential in visual object tracking by employing robust descriptors. Proper selection of color and texture features can provide robustness against appearance variations. However, the use of multiple descriptors would lead to a considerable feature dimension. In this paper, we propose a novel low-rank descriptor that provides better precision and success rate than state-of-the-art trackers. We accomplished this by concatenating the magnitude component of the Overlapped Multi-oriented Tri-scale Local Binary Pattern (OMTLBP), Robustness-Driven Hybrid Descriptor (RDHD), Histogram of Oriented Gradients (HoG), and Color Naming (CN) features. We reduced the rank of our proposed multi-channel feature to diminish the computational complexity. We formulated the Support Vector Machine (SVM) model by utilizing the circulant matrix of our proposed feature vector in the kernel correlation filter. The use of the discrete Fourier transform in the iterative learning of the SVM reduced the computational complexity of our proposed visual tracking algorithm. Extensive experimental results on the Visual Tracker Benchmark dataset show better accuracy than other state-of-the-art trackers.
|
35
|
Zhou JT, Fang M, Zhang H, Gong C, Peng X, Cao Z, Goh RSM. Learning With Annotation of Various Degrees. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2019; 30:2794-2804. [PMID: 30640630 DOI: 10.1109/tnnls.2018.2885854] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/09/2023]
Abstract
In this paper, we study a new problem in the scenario of sequence labeling. To be exact, we consider training data with annotation of various degrees, namely, fully labeled, unlabeled, and partially labeled sequences. Learning with fully un/labeled sequences refers to the standard setting in traditional un/supervised learning, and the proposed partial labeling specifies the subject that the element does not belong to. Partially labeled data are cheaper to obtain than fully labeled data, though they are less informative, especially when the tasks require a lot of domain knowledge. To solve such a practical challenge, we propose a novel deep conditional random field (CRF) model that is trained end to end to smoothly handle fully/un/partially labeled sequences within a unified framework. To the best of our knowledge, this could be one of the first works to utilize partially labeled instances for sequence labeling, and the proposed algorithm unifies deep learning and CRF in an end-to-end framework. Extensive experiments show that our method achieves state-of-the-art performance in two sequence labeling tasks on some popular data sets.
|
36
|
Zhang H, Qian J, Gao J, Yang J, Xu C. Scalable Proximal Jacobian Iteration Method With Global Convergence Analysis for Nonconvex Unconstrained Composite Optimizations. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2019; 30:2825-2839. [PMID: 30668503 DOI: 10.1109/tnnls.2018.2885699] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/09/2023]
Abstract
Recent studies have found that nonconvex relaxation functions usually perform better than their convex counterparts in l0-norm and rank function minimization problems. However, due to the absence of convexity in these nonconvex problems, developing efficient algorithms with convergence guarantees becomes very challenging. Inspired by the basic ideas of both the Jacobian alternating direction method of multipliers (JADMMs) for solving linearly constrained problems with separable objectives and the proximal gradient methods (PGMs) for optimizing unconstrained problems with one variable, this paper focuses on extending the PGMs to the proximal Jacobian iteration methods (PJIMs) for handling a family of nonconvex composite optimization problems with two splitting variables. To reduce the total computational complexity by decreasing the number of iterations, we devise an accelerated version of PJIMs through the well-known Nesterov acceleration strategy and further extend both to solve the multivariable cases. Most importantly, we provide a rigorous theoretical convergence analysis to show that the generated variable sequence globally converges to a critical point by exploiting the Kurdyka-Łojasiewicz (KŁ) property for a broad class of functions. Furthermore, we also establish the linear and sublinear convergence rates of the obtained variable sequence in the objective function. As a specific application to the nonconvex sparse and low-rank recovery problems, several numerical experiments verify that the newly proposed algorithms not only keep fast convergence speed but also have high precision.
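For orientation, the single-variable proximal gradient method (PGM) that this abstract takes as its starting point can be sketched as follows. This is not the paper's PJIM, which handles two splitting variables and nonconvex penalties; the sketch assumes the convex l1 penalty, whose proximal operator is soft thresholding, and a least-squares data term. The names `soft_threshold` and `proximal_gradient` and the toy data are assumptions made for illustration.

```python
def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1, applied elementwise."""
    return [(1.0 if x > 0 else -1.0) * max(abs(x) - t, 0.0) for x in v]


def proximal_gradient(A, b, lam, step, iters=200):
    """Minimize 0.5 * ||A x - b||^2 + lam * ||x||_1 by proximal gradient steps."""
    m, n = len(A), len(A[0])
    x = [0.0] * n
    for _ in range(iters):
        # Gradient of the smooth term: A^T (A x - b).
        residual = [sum(A[i][j] * x[j] for j in range(n)) - b[i]
                    for i in range(m)]
        grad = [sum(A[i][j] * residual[i] for i in range(m))
                for j in range(n)]
        # Gradient step followed by the proximal step of the l1 term.
        x = soft_threshold([x[j] - step * grad[j] for j in range(n)],
                           step * lam)
    return x


A = [[1.0, 0.0], [0.0, 1.0]]  # identity, so the minimizer is known in closed form
b = [3.0, -0.5]
x_hat = proximal_gradient(A, b, lam=1.0, step=1.0)
```

With the identity matrix the minimizer is simply the soft-thresholded `b`, which makes the sketch easy to check by hand; for a general `A`, the step size should not exceed the reciprocal of the largest eigenvalue of A^T A.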
|
37
|
Zhang C, Cheng J, Tian Q. Multi-View Image Classification With Visual, Semantic And View Consistency. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2019; 29:617-627. [PMID: 31425078 DOI: 10.1109/tip.2019.2934576] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]
Abstract
Multi-view visual classification methods have been widely applied to use discriminative information of different views. This strategy has been proven very effective by many researchers. However, on the one hand, images are often treated independently without fully considering their visual and semantic correlations. On the other hand, view consistency is often ignored. To solve these problems, in this paper, we propose a novel multi-view image classification method with visual, semantic and view consistency (VSVC). For each image, we linearly combine multi-view information for image classification. The combination parameters are determined by considering both the classification loss and the visual, semantic and view consistency. Visual consistency is imposed by ensuring that visually similar images of the same view are predicted to have similar values. For semantic consistency, we impose the locality constraint that nearby images should be predicted to have the same class by multiview combination. View consistency is also used to ensure that similar images have consistent multi-view combination parameters. An alternative optimization strategy is used to learn the combination parameters. To evaluate the effectiveness of VSVC, we perform image classification experiments on several public datasets. The experimental results on these datasets show the effectiveness of the proposed VSVC method.
|
38
|
Abstract
Curriculum Learning (CL) is a recently proposed learning paradigm that aims to achieve satisfactory performance by properly organizing the learning sequence from simple curriculum examples to more difficult ones. Up to now, few works have been done to explore CL for the data with graph structure. Therefore, this article proposes a novel CL algorithm that can be utilized to guide the Label Propagation (LP) over graphs, of which the target is to “learn” the labels of unlabeled examples on the graphs. Specifically, we assume that different unlabeled examples have different levels of difficulty for propagation, and their label learning should follow a simple-to-difficult sequence with the updated curricula. Furthermore, considering that the practical data are often characterized by multiple modalities, every modality in our method is associated with a “teacher” that not only evaluates the difficulties of examples from its own viewpoint, but also cooperates with other teachers to generate the overall simplest curriculum examples for propagation. By taking the curriculums suggested by the teachers as a whole, the common preference (i.e., commonality) of teachers on selecting the simplest examples can be discovered by a row-sparse matrix, and their distinct opinions (i.e., individuality) are captured by a sparse noise matrix. As a result, an accurate curriculum sequence can be established and the propagation quality can thus be improved. Theoretically, we prove that the propagation risk bound is closely related to the examples’ difficulty information, and empirically, we show that our method can generate higher accuracy than the state-of-the-art CL approach and LP algorithms on various multi-modal tasks.
Affiliation(s)
- Chen Gong
  - PCA Lab, Key Lab of Intelligent Perception and Systems for High-Dimensional Information of Ministry of Education, and Jiangsu Key Lab of Image and Video Understanding for Social Security, School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China
- Jian Yang
  - PCA Lab, Key Lab of Intelligent Perception and Systems for High-Dimensional Information of Ministry of Education, and Jiangsu Key Lab of Image and Video Understanding for Social Security, School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China
- Dacheng Tao
  - UBTECH Sydney Artificial Intelligence Centre and the School of Computer Science, Faculty of Engineering and Information Technologies, the University of Sydney, Sydney, Australia
|
39
|
Lücke J, Forster D. k-means as a variational EM approximation of Gaussian mixture models. Pattern Recognit Lett 2019. [DOI: 10.1016/j.patrec.2019.04.001] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022]
|
40
|
|
41
|
Zhao Y, Liu Y, Wen G, Huang T. Finite-Time Distributed Average Tracking for Second-Order Nonlinear Systems. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2019; 30:1780-1789. [PMID: 30371392 DOI: 10.1109/tnnls.2018.2873676] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
This paper studies the distributed average tracking (DAT) problem for multiple reference signals described by second-order nonlinear dynamical systems. Leveraging state-dependent gain design and adaptive control approaches, two DAT algorithms are developed, named the finite-time and adaptive-gain DAT algorithms. With the finite-time algorithm, the states of the physical agents can track the average of the time-varying reference signals within a finite settling time. Furthermore, the finite settling time is estimated by considering a well-designed Lyapunov function. Compared with asymptotic DAT algorithms, the proposed finite-time algorithm not only solves finite-time DAT problems but also ensures that the states of the physical agents achieve an accurate average of the multiple signals. Then, an adaptive-gain DAT algorithm is designed, with which the DAT problem is solved without global information. Thus, it is fully distributed. Finally, numerical simulations show the effectiveness of the theoretical results.
|
42
|
Han Y, Deng C, Zhao B, Tao D. State-aware Anti-drift Object Tracking. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2019; 28:4075-4086. [PMID: 30892207 DOI: 10.1109/tip.2019.2905984] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/09/2023]
Abstract
Correlation filter (CF) based trackers have attracted increasing attention in the visual tracking field due to their superior performance on several datasets while maintaining high running speed. For each frame, an ideal filter is trained in order to discriminate the target from its surrounding background. Considering that the target always undergoes external and internal interference during the tracking procedure, the trained tracker should not only be able to judge the current state when failure occurs, but also resist the model drift caused by challenging distractions. To this end, we present a State-aware Anti-drift Tracker (SAT) in this paper, which jointly models the discrimination and reliability information in filter learning. Specifically, global context patches are incorporated into the filter training stage to better distinguish the target from backgrounds. Meanwhile, a color-based reliable mask is learned to encourage the filter to focus on more reliable regions suitable for tracking. We show that the proposed optimization problem can be efficiently solved using the Alternating Direction Method of Multipliers and fully carried out in the Fourier domain. Furthermore, a Kurtosis-based updating scheme is advocated to reveal the tracking condition as well as guarantee high-confidence template updating. Extensive experiments are conducted on the OTB-100 and UAV-20L datasets to compare the SAT tracker with other relevant state-of-the-art methods. Both quantitative and qualitative evaluations further demonstrate the effectiveness and robustness of the proposed work.
|
43
|
|
44
|
Superpixel based Feature Specific Sparse Representation for Spectral-Spatial Classification of Hyperspectral Images. REMOTE SENSING 2019. [DOI: 10.3390/rs11050536] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/26/2023]
Abstract
To improve the performance of sparse representation classification (SRC), we propose a superpixel-based feature specific sparse representation framework (SPFS-SRC) for spectral-spatial classification of hyperspectral images (HSI) at the superpixel level. First, the HSI is divided into different spatial regions; each region is shape- and size-adapted and considered as a superpixel. Each superpixel contains a number of pixels with similar spectral characteristics. Since the utilization of multiple features in HSI classification has been proved to be an effective strategy, we generate both spatial and spectral features for each superpixel. By assuming that all the pixels in a superpixel belong to one certain class, a kernel SRC is introduced for the classification of HSI. In the SRC framework, we employ a metric learning strategy to exploit the commonalities of different features. Experimental results on two popular HSI datasets demonstrate the efficacy of our proposed methodology.
|
45
|
|
46
|
Ye M, Li J, Ma AJ, Zheng L, Yuen PC. Dynamic Graph Co-Matching for Unsupervised Video-based Person Re-Identification. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2019; 28:2976-2990. [PMID: 30640612 DOI: 10.1109/tip.2019.2893066] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/09/2023]
Abstract
Cross-camera label estimation from a set of unlabelled training data is an extremely important component in unsupervised person re-identification (re-ID) systems. With the estimated labels, existing advanced supervised learning methods can be leveraged to learn discriminative re-ID models. In this paper, we utilize the graph matching technique for accurate label estimation due to its advantages in optimal global matching and intra-camera relationship mining. However, the graph structure constructed with non-learnt similarity measurement cannot handle the large cross-camera variations, which leads to noisy and inaccurate label outputs. This paper designs a Dynamic Graph Matching (DGM) framework, which improves the label estimation process by iteratively refining the graph structure with better similarity measurement learnt from intermediate estimated labels. In addition, we design a positive re-weighting strategy to refine the intermediate labels, which enhances the robustness against inaccurate matching output and noisy initial training data. To fully utilize the abundant video information and reduce false matchings, a co-matching strategy is further incorporated into the framework. Comprehensive experiments conducted on three video benchmarks demonstrate that DGM outperforms state-of-the-art unsupervised re-ID methods and yields competitive performance to fully supervised upper bounds.
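The advantage of globally optimal matching over independent nearest-neighbour assignment, which this abstract cites as a motivation, can be illustrated on a tiny cost matrix. The sketch below is a brute-force bipartite matching, not the paper's Dynamic Graph Matching framework; the cost values, the one-to-one assumption between the two cameras, and the names `greedy_labels` and `global_matching` are all hypothetical.

```python
from itertools import permutations


def greedy_labels(cost):
    """Assign each row (camera-A track) to its cheapest column independently."""
    return [min(range(len(row)), key=row.__getitem__) for row in cost]


def global_matching(cost):
    """Exhaustive one-to-one assignment with the lowest total cost.

    Feasible only for very small square matrices; a polynomial-time
    assignment solver would be used in practice.
    """
    n = len(cost)
    best = min(permutations(range(n)),
               key=lambda perm: sum(cost[i][perm[i]] for i in range(n)))
    return list(best)


# Hypothetical pairwise distances between 3 tracks in camera A (rows)
# and 3 tracks in camera B (columns).
cost = [
    [1.0, 2.0, 9.0],
    [1.5, 8.0, 9.0],
    [7.0, 3.0, 2.5],
]

greedy = greedy_labels(cost)      # two rows claim the same column
matched = global_matching(cost)   # one-to-one and globally cheapest
```

Here the greedy rule assigns two camera-A tracks to the same camera-B track, while the global matching resolves the conflict by accepting a slightly worse match for one of them.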
|
47
|
|
48
|
Cui A, Peng J, Li H. Exact recovery low-rank matrix via transformed affine matrix rank minimization. Neurocomputing 2018. [DOI: 10.1016/j.neucom.2018.05.092] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]
|
49
|
|
50
|
Zhou JT, Zhao H, Peng X, Fang M, Qin Z, Goh RSM. Transfer Hashing: From Shallow to Deep. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2018; 29:6191-6201. [PMID: 29993900 DOI: 10.1109/tnnls.2018.2827036] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
One major assumption used in most existing hashing approaches is that the domain of interest (i.e., the target domain) can provide sufficient training data, either labeled or unlabeled. However, this assumption may be violated in practice. To address this so-called data sparsity issue in hashing, a new framework termed transfer hashing with privileged information (THPI) is proposed, which marries hashing and transfer learning (TL). To show the efficacy of THPI, we propose three variants of the well-known iterative quantization (ITQ) as a showcase. The proposed methods, ITQ+, LapITQ+, and deep transfer hashing (DTH), solve the aforementioned data sparsity issue from different aspects. Specifically, ITQ+ is a shallow model, which makes ITQ achieve hashing in a TL manner. ITQ+ learns a new slack function from the source domain to approximate the quantization error on the target domain given by ITQ. To further improve the performance of ITQ+, LapITQ+ is proposed by embedding the geometric relationship of the source domain into the target domain. Moreover, DTH is proposed to show the generality of our framework by utilizing the powerful representative capacity of deep learning. To the best of our knowledge, this could be one of the first DTH works. Extensive experiments on several popular data sets demonstrate the effectiveness of our shallow and DTH approaches compared with several state-of-the-art hashing approaches.
|