1. Wang Y, Fu Y, Sun X. Knockoffs-SPR: Clean Sample Selection in Learning With Noisy Labels. IEEE Transactions on Pattern Analysis and Machine Intelligence 2024; 46:3242-3256. [PMID: 38039178] [DOI: 10.1109/tpami.2023.3338268]
Abstract
A noisy training set usually degrades the generalization and robustness of neural networks. In this article, we propose a novel, theoretically guaranteed clean-sample selection framework for learning with noisy labels. Specifically, we first present a Scalable Penalized Regression (SPR) method to model the linear relation between network features and one-hot labels. In SPR, the clean data are identified by the zero mean-shift parameters solved in the regression model. We theoretically show that SPR can recover clean data under some conditions. Under general scenarios, these conditions may no longer be satisfied, and some noisy data are falsely selected as clean. To solve this problem, we propose a data-adaptive method, Scalable Penalized Regression with Knockoff filters (Knockoffs-SPR), which provably controls the False-Selection-Rate (FSR) of the selected clean data. To improve efficiency, we further present a split algorithm that divides the whole training set into small pieces that can be solved in parallel, making the framework scalable to large datasets. While Knockoffs-SPR can be regarded as a sample-selection module for a standard supervised training pipeline, we further combine it with a semi-supervised algorithm to exploit the support of noisy data as unlabeled data. Experimental results on several benchmark datasets and real-world noisy datasets show the effectiveness of our framework and validate the theoretical results of Knockoffs-SPR.
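The mean-shift selection idea in this abstract can be illustrated with a toy sketch. This is not the authors' SPR implementation (which works on network features and one-hot labels, uses a sorted penalty, and adds knockoff filtering); it only shows, under simplified assumptions, how an L1-penalized per-sample mean-shift parameter separates clean from corrupted targets: samples whose estimated shift is exactly zero are flagged clean.

```python
import numpy as np

def soft_threshold(z, lam):
    """Elementwise soft-thresholding operator."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def mean_shift_selection(X, y, lam=1.0, n_iter=50):
    """Toy mean-shift penalized regression: y = X @ beta + gamma + noise,
    with an L1 penalty on the per-sample shifts gamma. Alternates exact
    minimization over beta (least squares on shift-corrected targets) and
    gamma (soft-thresholding of residuals). Samples with gamma_i == 0 are
    flagged clean."""
    n, p = X.shape
    gamma = np.zeros(n)
    beta = np.zeros(p)
    for _ in range(n_iter):
        # Least-squares fit of beta on the shift-corrected targets.
        beta, *_ = np.linalg.lstsq(X, y - gamma, rcond=None)
        # Soft-threshold the residuals to update the mean-shift terms.
        gamma = soft_threshold(y - X @ beta, lam)
    clean = gamma == 0.0
    return beta, gamma, clean

# Toy data: the last two targets are corrupted by a large label shift.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
beta_true = np.array([1.0, -2.0, 0.5])
y = X @ beta_true + 0.01 * rng.normal(size=20)
y[-2:] += 8.0  # label corruption
beta, gamma, clean = mean_shift_selection(X, y, lam=1.0)
```

Because the objective is jointly convex in `beta` and `gamma`, the alternating scheme reaches the global optimum, where only the corrupted samples carry nonzero shifts.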
2. Ali H, Gilani SO, Waris A, Shah UH, Khattak MAK, Khan MJ, Afzal N. Memorability-based multimedia analytics for robotic interestingness prediction system using trimmed Q-learning algorithm. Sci Rep 2023; 13:19799. [PMID: 37957144] [PMCID: PMC10643645] [DOI: 10.1038/s41598-023-44553-1]
Abstract
Mobile robots are increasingly employed in today's environments, and perceiving the environment in order to perform a task plays a major role for them. Service robots are widely deployed in fully or partially known user environments, while exploring and exploiting an unknown environment remains a tedious task. This paper introduces a novel Trimmed Q-learning algorithm to predict interesting scenes via efficient memorability-oriented robotic behavioural scene-activity training. The training process involves three stages: online, short-term, and long-term learning modules. It supports autonomous exploration and wiser decisions about the environment. A simplified three-stage learning framework is introduced to train and predict interesting scenes using memorability. A proficient visual memory schema (VMS) is designed to tune the learning parameters. A role-based profile arrangement is made to explore the unknown environment during the long-term learning process. The online and short-term learning frameworks are designed using the novel Trimmed Q-learning algorithm, which minimizes the underestimation bias in robotic actions by introducing a refined set of practical candidate actions. Finally, the recalling ability of each learning module is estimated to predict the interesting scenes. Experiments conducted on the public SubT and SUN databases demonstrate the proposed technique's efficacy. The proposed framework yields better memorability scores: 72.84% in short-term and online learning and 68.63% in long-term learning.
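As a hedged illustration of the trimming idea, the sketch below runs tabular Q-learning on a toy chain environment but replaces the raw max in the bootstrap target with the mean of the top-k Q-values over the candidate actions, one common way to soften maximization bias. The environment, hyperparameters, and `trimmed_target` function are all assumptions made here for illustration; the paper's actual Trimmed Q-learning algorithm and memorability modules are not reproduced.

```python
import random

def trimmed_target(q_values, k=2):
    """Mean of the top-k Q-values: a trimmed alternative to the raw max
    used in the bootstrap target."""
    top = sorted(q_values, reverse=True)[:k]
    return sum(top) / len(top)

def train_chain(n_states=5, episodes=500, alpha=0.2, gamma_=0.9, eps=0.2, seed=0):
    """Tabular Q-learning on a chain MDP (reward 1 at the right end) with a
    trimmed bootstrap target. Actions: 0 = left, 1 = right."""
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        s = 0
        while s < n_states - 1:
            # Epsilon-greedy action selection.
            if rng.random() < eps:
                a = rng.randrange(2)
            else:
                a = max((0, 1), key=lambda x: Q[s][x])
            s2 = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
            r = 1.0 if s2 == n_states - 1 else 0.0
            # Trimmed target instead of max over next-state Q-values.
            bootstrap = 0.0 if s2 == n_states - 1 else trimmed_target(Q[s2], k=2)
            Q[s][a] += alpha * (r + gamma_ * bootstrap - Q[s][a])
            s = s2
    return Q

Q = train_chain()
policy = [max((0, 1), key=lambda a: Q[s][a]) for s in range(4)]
```

With a fixed seed the run is deterministic, and the greedy policy learned from the trimmed targets still walks toward the rewarding end of the chain.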
Affiliation(s)
- Hasnain Ali: School of Mechanical & Manufacturing Engineering, National University of Sciences and Technology, Robotics & AI, Islamabad, 44000, Pakistan.
- Syed Omer Gilani: Department of Electrical, Computer, and Biomedical Engineering, Abu Dhabi University, Abu Dhabi, UAE.
- Asim Waris: School of Mechanical & Manufacturing Engineering, National University of Sciences and Technology, Biomedical Engineering & Sciences, Islamabad, 44000, Pakistan.
- Umer Hameed Shah: Department of Mechanical Engineering and Artificial Intelligence Research Center, College of Engineering and Information Technology, Ajman University, Ajman, UAE.
- Muhammad Jawad Khan: School of Mechanical & Manufacturing Engineering, National University of Sciences and Technology, Robotics & AI, Islamabad, 44000, Pakistan.
- Namra Afzal: Department of Biomedical Engineering, University of Engineering and Technology, Lahore, 54000, Pakistan.
3. Xu Q, Yang Z, Zhao Y, Cao X, Huang Q. Rethinking Label Flipping Attack: From Sample Masking to Sample Thresholding. IEEE Transactions on Pattern Analysis and Machine Intelligence 2023; 45:7668-7685. [PMID: 37819793] [DOI: 10.1109/tpami.2022.3220849]
Abstract
Nowadays, machine learning (ML) and deep learning (DL) methods have become fundamental building blocks for a wide range of AI applications. The popularity of these methods also leaves them widely exposed to malicious attacks, which may cause severe security concerns. To understand the security properties of ML/DL methods, researchers have recently turned their focus to adversarial attack algorithms that can corrupt the model or the clean data owned by the victim with imperceptible perturbations. In this paper, we study the Label Flipping Attack (LFA) problem, where the attacker seeks to corrupt an ML/DL model's performance by flipping a small fraction of the labels in the training data. Prior art along this direction formulates combinatorial optimization problems, leading to limited scalability toward deep learning models. To this end, we propose a novel minimax problem that provides an efficient reformulation of the sample-selection process in LFA. In the new optimization problem, the sample-selection operation can be implemented with a single thresholding parameter, leading to a novel training algorithm called Sample Thresholding. Since the objective function is differentiable and the model complexity does not depend on the sample size, we can apply Sample Thresholding to attack deep learning models. Moreover, since the victim's behavior is not predictable in a poisoning-attack setting, we have to employ surrogate models to simulate the true model employed by the victim. In light of this, we provide a theoretical analysis of the surrogate paradigm. Specifically, we show that the performance gap between the true model employed by the victim and the surrogate model is small under mild conditions. On top of this paradigm, we extend Sample Thresholding to the crowdsourced ranking task, where labels collected from annotators are vulnerable to adversarial attacks. Finally, experimental analyses on three real-world datasets speak to the efficacy of our method.
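The single-threshold selection idea can be sketched as follows. The margin scores, threshold `tau`, and budget used here are hypothetical; the toy only shows how a combinatorial sample-selection step can be replaced by one thresholding parameter over per-sample scores, not the paper's actual minimax formulation or its deep-model attack.

```python
import numpy as np

def threshold_flip(margins, labels, tau, budget):
    """Illustrative sample-thresholding attack for binary labels: flip
    the labels of samples whose classification margin falls below tau,
    up to a fixed budget, preferring the smallest margins (points nearest
    the surrogate model's decision boundary)."""
    candidates = np.where(margins < tau)[0]
    order = candidates[np.argsort(margins[candidates])]
    chosen = order[:budget]
    flipped = labels.copy()
    flipped[chosen] = 1 - flipped[chosen]
    return flipped, chosen

# Hypothetical per-sample margins from a surrogate model.
margins = np.array([2.0, 0.1, 1.5, -0.3, 0.8])
labels = np.array([0, 1, 1, 0, 1])
flipped, chosen = threshold_flip(margins, labels, tau=0.5, budget=2)
```

Moving `tau` sweeps the attack from conservative (few flips) to aggressive, which is the scalability appeal of a single thresholding parameter.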
4. Wang Y, Zhang L, Yao Y, Fu Y. How to Trust Unlabeled Data? Instance Credibility Inference for Few-Shot Learning. IEEE Transactions on Pattern Analysis and Machine Intelligence 2022; 44:6240-6253. [PMID: 34081579] [DOI: 10.1109/tpami.2021.3086140]
Abstract
Deep learning based models have excelled in many computer vision tasks and appear to surpass human performance. However, these models require an avalanche of expensive human-labeled training data and many iterations to train their large number of parameters. This severely limits their scalability to real-world long-tail distributed categories, some of which have many instances but only a few manual annotations. Learning from such extremely limited labeled examples is known as Few-Shot Learning (FSL). Unlike prior art that leverages meta-learning or data-augmentation strategies to alleviate this extremely data-scarce problem, this paper presents a statistical approach, dubbed Instance Credibility Inference (ICI), that exploits the support of unlabeled instances for few-shot visual recognition. Typically, we repurpose the self-taught learning paradigm: pseudo-labels for unlabeled instances are predicted with an initial classifier trained on the few labeled examples, and the most confident ones are selected to augment the training set and re-train the classifier. This is achieved by constructing a (Generalized) Linear Model (LM/GLM) with incidental parameters to model the mapping from (un-)labeled features to their (pseudo-)labels, in which the sparsity of the incidental parameters indicates the credibility of the corresponding pseudo-labeled instance. We rank the credibility of pseudo-labeled instances along the regularization path of their corresponding incidental parameters, and the most trustworthy pseudo-labeled examples are preserved as augmented labeled instances. This process is repeated until all the unlabeled samples are included in the expanded training set. Theoretically, under the conditions of restricted eigenvalue, irrepresentability, and large error, our approach is guaranteed to collect all the correctly predicted pseudo-labeled instances from the noisy pseudo-labeled set. Extensive experiments under two few-shot settings show the effectiveness of our approach on four widely used few-shot visual recognition benchmarks: miniImageNet, tieredImageNet, CIFAR-FS, and CUB. Code and models are released at https://github.com/Yikai-Wang/ICI-FSL.
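A minimal sketch of the regularization-path ranking idea, under strong simplifying assumptions: the model is linear, `beta` is fit once by least squares and held fixed, so each incidental parameter gamma_i activates (leaves zero) exactly when the path's lambda drops below |residual_i|. The real ICI procedure solves the penalized problem jointly along the path; this toy only conveys the ranking principle that instances whose incidental parameter stays zero longest are the most credible.

```python
import numpy as np

def credibility_ranking(X, y_pseudo, lams):
    """Rank pseudo-labeled instances by the lambda at which their
    incidental parameter gamma_i first becomes nonzero along a decreasing
    regularization path. With beta held fixed, gamma_i (a soft-thresholded
    residual) activates when lambda < |residual_i|. Instances that never
    activate keep activation 0 and rank as most credible."""
    beta, *_ = np.linalg.lstsq(X, y_pseudo, rcond=None)
    resid = np.abs(y_pseudo - X @ beta)
    activation = np.zeros(len(y_pseudo))
    for lam in sorted(lams, reverse=True):
        newly = (resid > lam) & (activation == 0.0)
        activation[newly] = lam
    # Ascending activation lambda: most credible first, least credible last.
    order = np.argsort(activation, kind="stable")
    return order, resid

# Toy data: pseudo-labels are exact except for instance 0.
rng = np.random.default_rng(1)
X = rng.normal(size=(12, 2))
y_pseudo = X @ np.array([1.0, 2.0])
y_pseudo[0] += 5.0  # one wrong pseudo-label
order, resid = credibility_ranking(X, y_pseudo, lams=np.linspace(0.1, 4.0, 40))
```

The wrongly pseudo-labeled instance activates earliest on the path and therefore ranks last in credibility.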
5. Li C, Xiang Z, Tang J, Luo B, Wang F. RGBT Tracking via Noise-Robust Cross-Modal Ranking. IEEE Transactions on Neural Networks and Learning Systems 2022; 33:5019-5031. [PMID: 33861706] [DOI: 10.1109/tnnls.2021.3067107]
Abstract
Existing RGBT tracking methods usually localize a target object with a bounding box, so the trackers are often affected by the inclusion of background clutter. To address this issue, this article presents a novel algorithm, called noise-robust cross-modal ranking, to suppress background effects in target bounding boxes for RGBT tracking. In particular, we handle the noise interference in cross-modal fusion and seed labels from two aspects. First, soft cross-modality consistency is proposed to allow sparse inconsistency when fusing different modalities, taking both the collaboration and the heterogeneity of the modalities into account for more effective fusion. Second, optimal seed learning is designed to handle label noise in the ranking seeds caused by factors such as irregular object shape and occlusion. In addition, to exploit the complementarity and maintain the structural information of different features within each modality, we perform an individual ranking for each feature and employ cross-feature consistency to pursue their collaboration. A unified optimization framework with fast convergence is developed to solve the proposed model. Extensive experiments demonstrate the effectiveness and efficiency of the proposed approach compared with state-of-the-art tracking methods on the GTOT and RGBT234 benchmark datasets.
6. Geng X, Zheng R, Lv J, Zhang Y. Multilabel Ranking With Inconsistent Rankers. IEEE Transactions on Pattern Analysis and Machine Intelligence 2022; 44:5211-5224. [PMID: 33798071] [DOI: 10.1109/tpami.2021.3070709]
Abstract
While most existing multilabel ranking methods assume the availability of a single objective label ranking for each instance in the training set, this paper deals with the more common case where only subjective, inconsistent rankings from multiple rankers are associated with each instance. Two ranking methods are proposed, from the perspective of instances and of rankers, respectively. The first, Instance-oriented Preference Distribution Learning (IPDL), learns a latent preference distribution for each instance: IPDL generates a common preference distribution that is most compatible with all the personal rankings, and then learns a mapping from the instances to the preference distributions. The second, Ranker-oriented Preference Distribution Learning (RPDL), leverages interpersonal inconsistency among rankers to learn a unified model from the personal preference distribution models of all rankers. The two methods are applied to a natural scene image dataset and the 3D facial expression dataset BU_3DFE. Experimental results show that IPDL and RPDL effectively incorporate the information given by the inconsistent rankers and perform remarkably better than the compared state-of-the-art multilabel ranking algorithms.
7. RGBT tracking based on cooperative low-rank graph model. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.04.032]
8. Xu Q, Yang Z, Jiang Y, Cao X, Yao Y, Huang Q. Not All Samples are Trustworthy: Towards Deep Robust SVP Prediction. IEEE Transactions on Pattern Analysis and Machine Intelligence 2022; 44:3154-3169. [PMID: 33373295] [DOI: 10.1109/tpami.2020.3047817]
Abstract
In this paper, we study the problem of estimating subjective visual properties (SVP) of images, an emerging task in computer vision. Generally speaking, collecting SVP datasets involves a crowdsourcing process where annotations are obtained from a wide range of online users. Since the process is done without quality control, SVP datasets are known to suffer from noise, so not all samples are trustworthy. Facing this problem, we need robust models for learning SVP from noisy crowdsourced annotations. In this paper, we construct two general robust learning frameworks for this application. In the first, we propose a probabilistic framework that explicitly models the sparse unreliable patterns in the dataset. We then provide an alternative framework that reformulates the sparse unreliable patterns as a "contraction" operation over the original loss function; this latter framework admits not only efficient end-to-end training but also rigorous theoretical analysis. To apply these frameworks, we further provide two models as implementations, in which the sparse noise parameters can be interpreted via HodgeRank theory. Finally, extensive theoretical and empirical studies show the effectiveness of our proposed frameworks.
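The "contraction" reformulation has a classical flavor that can be checked directly for squared loss: minimizing 0.5*(r - e)**2 + lam*|e| over a sparse error term e yields the Huber loss in the residual r. This identity is a standard result shown here only for intuition about how sparse unreliable-pattern modeling contracts a loss; the paper's actual operator and models are not reproduced.

```python
def soft_threshold(r, lam):
    """Closed-form minimizer of 0.5*(r - e)**2 + lam*|e| over e."""
    if r > lam:
        return r - lam
    if r < -lam:
        return r + lam
    return 0.0

def contracted_loss(r, lam):
    """Squared loss 'contracted' by an L1-penalized sparse error term:
    min_e 0.5*(r - e)**2 + lam*|e|, evaluated at the minimizer."""
    e = soft_threshold(r, lam)
    return 0.5 * (r - e) ** 2 + lam * abs(e)

def huber(r, lam):
    """Classical Huber loss with threshold lam."""
    a = abs(r)
    return 0.5 * a * a if a <= lam else lam * (a - 0.5 * lam)
```

Small residuals keep the quadratic loss untouched (e = 0), while large residuals are absorbed by the sparse error term and penalized only linearly, which is exactly the robustness mechanism a contraction buys.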
9.

10.
Abstract
We propose a scalable Bayesian preference learning method for jointly predicting the preferences of individuals as well as the consensus of a crowd from pairwise labels. People's opinions often differ greatly, making it difficult to predict their preferences from small amounts of personal data. Individual biases also make it harder to infer the consensus of a crowd when there are few labels per item. We address these challenges by combining matrix factorisation with Gaussian processes, using a Bayesian approach to account for uncertainty arising from noisy and sparse data. Our method exploits input features, such as text embeddings and user metadata, to predict preferences for new items and users that are not in the training set. As previous solutions based on Gaussian processes do not scale to large numbers of users, items or pairwise labels, we propose a stochastic variational inference approach that limits computational and memory costs. Our experiments on a recommendation task show that our method is competitive with previous approaches despite our scalable inference approximation. We demonstrate the method's scalability on a natural language processing task with thousands of users and items, and show improvements over the state of the art on this task. We make our software publicly available for future work (https://github.com/UKPLab/tacl2018-preference-convincing/tree/crowdGPPL).
11. Xu Q, Xiong J, Cao X, Huang Q, Yao Y. From Social to Individuals: A Parsimonious Path of Multi-Level Models for Crowdsourced Preference Aggregation. IEEE Transactions on Pattern Analysis and Machine Intelligence 2019; 41:844-856. [PMID: 29993767] [DOI: 10.1109/tpami.2018.2817205]
Abstract
In crowdsourced preference aggregation, it is often assumed that all annotators are subject to a common preference or social utility function that generates their comparison behaviors in experiments. In reality, however, annotators vary because of multi-criteria judgments, abnormal behavior, or a mixture of the two. In this paper, we propose a parsimonious mixed-effects model that takes into account both the fixed effect that the majority of annotators follow a common linear utility model and the random effect that some annotators deviate significantly from the common model and exhibit strongly personalized preferences. The key algorithm establishes a dynamic path from the social utility to individual variations, with different levels of sparsity on personalization. It is based on Linearized Bregman Iterations, which admit easy parallel implementations to meet the needs of large-scale data analysis. In this unified framework, three kinds of random utility models are presented: the basic linear model with L2 loss, the Bradley-Terry model, and the Thurstone-Mosteller model. The validity of these multi-level models is supported by experiments with both simulated and real-world datasets, which show that the parsimonious multi-level models improve both interpretability and predictive precision compared with traditional HodgeRank.
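The fixed-effect ("social utility") layer of such models can be sketched with a plain Bradley-Terry fit by gradient ascent on the log-likelihood, where P(i beats j) = sigmoid(u_i - u_j). Everything below is a toy on assumed data; the paper's Linearized Bregman Iterations and the sparse per-annotator random effects are not implemented here.

```python
import math

def fit_bradley_terry(pairs, n_items, lr=0.5, epochs=200):
    """Fit shared item utilities u under the Bradley-Terry model by
    gradient ascent on the log-likelihood of (winner, loser) pairs.
    Utilities are centered each step for identifiability (the model is
    invariant to a constant shift)."""
    u = [0.0] * n_items
    for _ in range(epochs):
        grad = [0.0] * n_items
        for winner, loser in pairs:
            # P(winner beats loser) = sigmoid(u[winner] - u[loser]).
            p = 1.0 / (1.0 + math.exp(u[loser] - u[winner]))
            grad[winner] += 1.0 - p
            grad[loser] -= 1.0 - p
        u = [u[i] + lr * grad[i] / len(pairs) for i in range(n_items)]
        mean = sum(u) / n_items
        u = [x - mean for x in u]
    return u

# Hypothetical comparisons mostly favouring item 2 > item 1 > item 0,
# with one dissenting (0 beats 2) vote.
pairs = [(2, 1)] * 6 + [(1, 0)] * 6 + [(2, 0)] * 6 + [(0, 2)]
u = fit_bradley_terry(pairs, 3)
```

In the mixed-effects setting, each annotator's observed choices would additionally load on a sparse personal deviation from this shared `u`.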
12. Lu Z, Fu Z, Xiang T, Han P, Wang L, Gao X. Learning from Weak and Noisy Labels for Semantic Segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence 2017; 39:486-500. [PMID: 28113885] [DOI: 10.1109/tpami.2016.2552172]
Abstract
A weakly supervised semantic segmentation (WSSS) method aims to learn a segmentation model from weak (image-level) as opposed to strong (pixel-level) labels. By avoiding the tedious pixel-level annotation process, it can exploit the unlimited supply of user-tagged images from media-sharing sites such as Flickr for large-scale applications. However, these 'free' tags/labels are often noisy, and few existing works address the problem of learning with both weak and noisy labels. In this work, we cast the WSSS problem as a label noise reduction problem. Specifically, after segmenting each image into a set of superpixels, the weak and potentially noisy image-level labels are propagated to the superpixel level, resulting in highly noisy labels; the key to semantic segmentation is thus to identify and correct the noisy superpixel labels. To this end, a novel L1-optimisation based sparse learning model is formulated to directly and explicitly detect noisy labels. To solve the L1-optimisation problem, we further develop an efficient learning algorithm by introducing an intermediate labelling variable. Extensive experiments on three benchmark datasets show that our method yields state-of-the-art results given noise-free labels, whilst significantly outperforming existing methods when the weak labels are also noisy.