1. Lv J, Liu B, Feng L, Xu N, Xu M, An B, Niu G, Geng X, Sugiyama M. On the Robustness of Average Losses for Partial-Label Learning. IEEE Trans Pattern Anal Mach Intell 2024; 46:2569-2583. PMID: 37167048. DOI: 10.1109/tpami.2023.3275249.
Abstract
Partial-label learning (PLL) utilizes instances with PLs, where a PL includes several candidate labels but only one is the true label (TL). In PLL, the identification-based strategy (IBS) purifies each PL on the fly to select the (most likely) TL for training, while the average-based strategy (ABS) treats all candidate labels equally for training and lets the trained model predict the TL. Although PLL research has focused on IBS for better performance, ABS is also worth studying, since modern IBS behaves like ABS at the beginning of training to prepare for PL purification and TL selection. In this paper, we analyze why ABS has been unsatisfactory and propose how to improve it. Theoretically, we propose two problem settings of PLL and prove that average PL losses (APLLs) with bounded multi-class losses are always robust, while APLLs with unbounded losses may be non-robust; this is the first robustness analysis for PLL. Experimentally, we have two promising findings: ABS using bounded losses can match or exceed the state-of-the-art performance of IBS using unbounded losses, and after warm starting with robust APLLs, IBS can further improve upon itself. Our work draws attention to ABS research, which can in turn boost IBS and push forward PLL as a whole.
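To make the bounded-versus-unbounded distinction concrete, here is a minimal sketch (an illustration, not the authors' implementation): an average partial-label loss built from MAE stays bounded no matter how small a candidate's predicted probability gets, while one built from cross-entropy can grow without bound.

```python
import numpy as np

def average_pl_loss(probs, candidates, loss="mae"):
    """Average partial-label loss: the mean per-label loss over the candidate set.

    probs: predicted class-probability vector for one instance.
    candidates: indices of the candidate labels.
    loss: "mae" (bounded in [0, 1]) or "ce" (unbounded as a probability -> 0).
    """
    per_label = []
    for y in candidates:
        if loss == "mae":
            per_label.append(1.0 - probs[y])              # bounded term
        else:
            per_label.append(-np.log(probs[y] + 1e-12))   # unbounded term
    return float(np.mean(per_label))

probs = np.array([0.7, 0.2, 0.05, 0.05])
print(average_pl_loss(probs, [0, 1], "mae"))  # (0.3 + 0.8) / 2 = 0.55
```

With a wrong low-probability label in the candidate set, the cross-entropy version is dominated by that one term, which is the non-robust behavior the paper analyzes.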
2. Wang H, Xiao R, Li Y, Feng L, Niu G, Chen G, Zhao J. PiCO+: Contrastive Label Disambiguation for Robust Partial Label Learning. IEEE Trans Pattern Anal Mach Intell 2024; 46:3183-3198. PMID: 38090836. DOI: 10.1109/tpami.2023.3342650.
Abstract
Partial label learning (PLL) is an important problem that allows each training example to be labeled with a coarse candidate set that includes the ground-truth label. However, in a more practical but challenging scenario, the annotator may miss the ground-truth label and provide a wrong candidate set, which is known as the noisy PLL problem. To remedy this problem, we propose the PiCO+ framework, which simultaneously disambiguates the candidate sets and mitigates label noise. At the core of PiCO+ is a novel label disambiguation algorithm, PiCO, which consists of a contrastive learning module along with a novel class prototype-based disambiguation method. Theoretically, we show that these two components are mutually beneficial and can be rigorously justified from an expectation-maximization (EM) perspective. To handle label noise, we extend PiCO to PiCO+, which further performs distance-based clean-sample selection and learns robust classifiers via a semi-supervised contrastive learning algorithm. Beyond this, we investigate the robustness of PiCO+ under out-of-distribution noise and incorporate a novel energy-based rejection method for improved robustness. Extensive experiments demonstrate that our proposed methods significantly outperform current state-of-the-art approaches on standard and noisy PLL tasks and even achieve results comparable to fully supervised learning.
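The prototype-based disambiguation idea can be sketched in a few lines; the function names and the momentum value below are illustrative assumptions, not PiCO's actual code. Each example is pseudo-labeled with the candidate whose class prototype is closest to its embedding, and prototypes are refreshed by a moving average.

```python
import numpy as np

def disambiguate(embedding, candidates, prototypes):
    """Pick the candidate label whose (unit-norm) prototype is most similar."""
    e = embedding / np.linalg.norm(embedding)
    sims = {y: float(e @ prototypes[y]) for y in candidates}
    return max(sims, key=sims.get)

def update_prototype(prototypes, y, embedding, momentum=0.9):
    """Moving-average prototype update for class y, renormalized to unit length."""
    p = momentum * prototypes[y] + (1 - momentum) * embedding
    prototypes[y] = p / np.linalg.norm(p)

prototypes = np.eye(3)                      # one unit prototype per class
x = np.array([0.1, 0.9, 0.0])               # embedding close to class 1
print(disambiguate(x, [0, 1], prototypes))  # -> 1
update_prototype(prototypes, 1, x)          # prototype 1 drifts toward x
```

The mutual benefit the paper proves is visible even here: better embeddings sharpen the prototypes, and sharper prototypes yield cleaner pseudo-labels for the contrastive module.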
3. Tian Y, Yu X, Fu S. Partial label learning: Taxonomy, analysis and outlook. Neural Netw 2023; 161:708-734. PMID: 36848826. DOI: 10.1016/j.neunet.2023.02.019.
Abstract
Partial label learning (PLL) is an emerging framework in weakly supervised machine learning with broad application prospects. It handles the case in which each training example corresponds to a candidate label set, and only one label concealed in the set is the ground-truth label. In this paper, we propose a novel taxonomy framework for PLL comprising four categories: disambiguation strategy, transformation strategy, theory-oriented strategy, and extensions. We analyze and evaluate the methods in each category and catalogue synthetic and real-world PLL datasets, all hyperlinked to the source data. Finally, future work on PLL is discussed in depth on the basis of the proposed taxonomy framework.
Affiliation(s)
- Yingjie Tian
- School of Economics and Management, University of Chinese Academy of Sciences, Beijing 100190, China; Research Center on Fictitious Economy and Data Science, Chinese Academy of Sciences, Beijing 100190, China; Key Laboratory of Big Data Mining and Knowledge Management, Chinese Academy of Sciences, Beijing 100190, China.
- Xiaotong Yu
- School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing 100049, China; Research Center on Fictitious Economy and Data Science, Chinese Academy of Sciences, Beijing 100190, China; Key Laboratory of Big Data Mining and Knowledge Management, Chinese Academy of Sciences, Beijing 100190, China.
- Saiji Fu
- School of Economics and Management, Beijing University of Posts and Telecommunications, Beijing 100876, China.
4. Wang DB, Zhang ML, Li L. Adaptive Graph Guided Disambiguation for Partial Label Learning. IEEE Trans Pattern Anal Mach Intell 2022; 44:8796-8811. PMID: 34648433. DOI: 10.1109/tpami.2021.3120012.
Abstract
In partial label learning, a multi-class classifier is learned from ambiguous supervision where each training example is associated with a set of candidate labels, among which only one is valid. An intuitive way to deal with this problem is label disambiguation, i.e., differentiating the labeling confidences of the candidate labels so as to recover the ground-truth labeling information. Recently, feature-aware label disambiguation has been proposed, which utilizes the graph structure of the feature space to generate labeling confidences over candidate labels. Nevertheless, noise and outliers in the training data make the graph structure derived from the original feature space less reliable. In this paper, a novel partial label learning approach based on adaptive graph guided disambiguation is proposed, which is shown to be more effective in revealing the intrinsic manifold structure among training examples. Rather than following the sequential disambiguation-then-induction strategy, the proposed approach jointly performs adaptive graph construction, candidate label disambiguation, and predictive model induction via alternating optimization. Furthermore, we consider a human-in-the-loop setting in which the learner is allowed to actively query some ambiguously labeled examples for manual disambiguation. Extensive experiments clearly validate the effectiveness of adaptive graph guided disambiguation for learning from partial label examples.
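The graph-based disambiguation idea can be sketched as generic candidate-restricted label propagation, assuming a precomputed similarity matrix; this is an illustration of the mechanism, not the paper's adaptive graph construction.

```python
import numpy as np

def propagate(W, cand_mask, iters=20, alpha=0.5):
    """Candidate-restricted label propagation over a similarity graph.

    W: (n, n) nonnegative similarity matrix with positive row sums.
    cand_mask: (n, K) binary mask, 1 where a label is in the candidate set.
    Returns (n, K) labeling confidences summing to 1 over each candidate set.
    """
    P = W / W.sum(axis=1, keepdims=True)                   # row-stochastic transitions
    F = cand_mask / cand_mask.sum(axis=1, keepdims=True)   # uniform over candidates
    F0 = F.copy()
    for _ in range(iters):
        F = alpha * (P @ F) + (1 - alpha) * F0             # propagate, pull back to init
        F = F * cand_mask                                  # forbid non-candidate labels
        F = F / F.sum(axis=1, keepdims=True)               # renormalize per instance
    return F
```

The paper's point is precisely that `W` should not be fixed up front: a graph learned jointly with the confidences is less sensitive to noise and outliers than one built once from the raw features.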
5. Yu XR, Wang DB, Zhang ML. Partial label learning with emerging new labels. Mach Learn 2022. DOI: 10.1007/s10994-022-06244-2.
6. Semi-supervised partial label learning algorithm via reliable label propagation. Appl Intell 2022. DOI: 10.1007/s10489-022-04027-9.
7. Lyu G, Feng S, Liu W, Liu S, Lang C. Redundant Label Learning via Subspace Representation and Global Disambiguation. ACM Trans Intell Syst Technol 2022. DOI: 10.1145/3558547.
Abstract
Redundant Label Learning (RLL) aims to induce a robust model from training data in which each example is associated with a set of candidate labels, some of which are incorrect. Most existing approaches handle this problem by first disambiguating the candidate labels and then inducing a predictive model from the disambiguated data. However, these approaches focus only on disambiguating each instance's candidate label set, while the global label context tends to be ignored. Meanwhile, they usually induce the model directly from the original feature information, which may cause overfitting due to high-dimensional redundant features. To tackle these issues, we propose a novel feature SubspacE Representation and label Global DisambiguatIOn (SERGIO) approach, which improves the generalization ability of the learning system from the perspective of both the feature space and the label space. Specifically, we project the original high-dimensional feature space into a low-dimensional subspace, where the projection matrix is regularized with an orthogonality constraint to make the subspace more compact. Meanwhile, we introduce a label confidence matrix and constrain it with ℓ1-norm and trace-norm regularization simultaneously, which are utilized to explore global label correlations and accord with the nature of the single-label and multi-label classification problems, respectively. Extensive experiments on both single-label and multi-label RLL datasets demonstrate that our proposed method achieves competitive performance against state-of-the-art approaches.
Affiliation(s)
- Gengyu Lyu
- Beijing University of Technology, China and Beijing Jiaotong University, China
- Wei Liu
- Beijing Jiaotong University, China
8. Addressing Label Ambiguity Imbalance in Candidate Labels: Measures and Disambiguation Algorithm. Inf Sci (N Y) 2022. DOI: 10.1016/j.ins.2022.07.175.
9. Label distribution feature selection with feature weights fusion and local label correlations. Knowl Based Syst 2022. DOI: 10.1016/j.knosys.2022.109778.
11. Zhao L, Xiao Y, Wen K, Liu B, Kong X. Multi-task manifold learning for partial label learning. Inf Sci (N Y) 2022. DOI: 10.1016/j.ins.2022.04.044.
12. Feature space and label space selection based on Error-correcting output codes for partial label learning. Inf Sci (N Y) 2022. DOI: 10.1016/j.ins.2021.12.093.
14. Lyu G, Feng S, Wang T, Lang C. A Self-Paced Regularization Framework for Partial-Label Learning. IEEE Trans Cybern 2022; 52:899-911. PMID: 32452795. DOI: 10.1109/tcyb.2020.2990908.
Abstract
Partial-label learning (PLL) aims to solve the problem where each training instance is associated with a set of candidate labels, one of which is the correct label. Most PLL algorithms try to disambiguate the candidate label set by either treating each candidate label equally or iteratively identifying the true label. Nonetheless, existing algorithms usually treat all labels and instances equally, without taking the complexity of either into consideration during learning. Inspired by the successful application of self-paced learning in machine learning, we integrate the self-paced regime into the PLL framework and propose a novel self-paced PLL (SP-PLL) algorithm, which controls the learning process by ranking the priorities of the training examples together with their candidate labels in each learning iteration. Extensive experiments and comparisons with baseline methods demonstrate the effectiveness and robustness of the proposed method.
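The self-paced regime amounts to weighting examples by their current difficulty. A minimal sketch with the hard (binary) self-paced regularizer follows, where `lam` is an age parameter that grows as training proceeds; this is the generic mechanism, not SP-PLL's exact ranking scheme.

```python
import numpy as np

def self_paced_weights(losses, lam):
    """Hard self-paced weighting: keep an example only if its loss is below lam."""
    return (np.asarray(losses, dtype=float) < lam).astype(float)

losses = [0.1, 0.9, 0.4]
print(self_paced_weights(losses, 0.5))  # easy examples admitted first: [1. 0. 1.]
print(self_paced_weights(losses, 1.0))  # larger lam admits harder ones: [1. 1. 1.]
```

SP-PLL applies this ranking jointly over examples and their candidate labels, so ambiguous labels are also deferred until the model matures.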
15. Qian W, Xiong Y, Yang J, Shu W. Feature selection for label distribution learning via feature similarity and label correlation. Inf Sci (N Y) 2022. DOI: 10.1016/j.ins.2021.08.076.
16. Zhang ML, Fang JP. Partial Multi-Label Learning via Credible Label Elicitation. IEEE Trans Pattern Anal Mach Intell 2021; 43:3587-3599. PMID: 32286956. DOI: 10.1109/tpami.2020.2985210.
Abstract
Partial multi-label learning (PML) deals with the problem where each training example is associated with an overcomplete set of candidate labels, among which only some are valid. The task of PML naturally arises in learning scenarios with inaccurate supervision, and the goal is to induce a multi-label predictor that can assign a set of proper labels to unseen instances. The PML training procedure is prone to being misled by false positive labels concealed in the candidate label set, which is the major modeling difficulty of partial multi-label learning. In this paper, a novel two-stage PML approach is proposed that works by eliciting credible labels from the candidate label set for model induction. In the first stage, the labeling confidence of each candidate label of each PML training example is estimated via iterative label propagation. In the second stage, utilizing credible labels with high labeling confidence, a multi-label predictor is induced via pairwise label ranking coupled with virtual label splitting or maximum a posteriori (MAP) reasoning. Experimental studies show that the proposed approach achieves highly competitive generalization performance by excluding most false positive labels from the training procedure via credible label elicitation.
17. Lyu G, Feng S, Li Y. Noisy label tolerance: A new perspective of Partial Multi-Label Learning. Inf Sci (N Y) 2021. DOI: 10.1016/j.ins.2020.09.019.
18. Chai J, Tsang IW, Chen W. Large Margin Partial Label Machine. IEEE Trans Neural Netw Learn Syst 2020; 31:2594-2608. PMID: 31502988. DOI: 10.1109/tnnls.2019.2933530.
Abstract
Partial label learning (PLL) is a multi-class weakly supervised learning problem where each training instance is associated with a set of candidate labels but only one label is the ground truth. The main challenge of PLL is dealing with the label ambiguities. Among various disambiguation techniques, large margin (LM)-based algorithms attract much attention due to their powerful discriminative performance. However, existing LM-based algorithms either neglect some potential candidate labels in constructing the margin or introduce an auxiliary estimate of class capacities that is generally inaccurate; as a result, their generalization performance deteriorates. To address these drawbacks, motivated by the optimistic superset loss, we propose an LM Partial LAbel machiNE (LM-PLANE) by extending multi-class support vector machines (SVMs) to PLL. Compared with existing LM-based disambiguation algorithms, LM-PLANE considers the margin of all potential candidate labels without auxiliary estimation of class capacities. Furthermore, an efficient cutting plane (CP) method is developed to train LM-PLANE in the dual space. Theoretical insights into the effectiveness and convergence of our CP method are also presented. Extensive experiments on various PLL tasks demonstrate the superiority of LM-PLANE over existing LM-based and other representative PLL algorithms in terms of classification accuracy.
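The candidate-set margin underlying such LM methods can be written down in a few lines; the sketch below just scores the gap between the best candidate and the best non-candidate label (an illustration of the margin notion, not LM-PLANE's cutting-plane training).

```python
import numpy as np

def partial_label_margin(scores, candidates):
    """Margin between the best candidate score and the best non-candidate score."""
    cand = [scores[y] for y in candidates]
    non = [scores[y] for y in range(len(scores)) if y not in candidates]
    return max(cand) - max(non)

scores = np.array([2.0, 1.0, 0.5, 3.0])
print(partial_label_margin(scores, {0, 1}))  # 2.0 - 3.0 = -1.0, a violated margin
```

A large-margin learner then penalizes instances whose margin falls below a threshold, pushing some candidate label above every non-candidate label.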
19.
Abstract
Partial label learning (PLL) aims to learn from data where each training instance is associated with a set of candidate labels, among which only one is correct. Most existing methods deal with this type of problem by either treating each candidate label equally or identifying the ground-truth label iteratively. In this article, we propose a novel PLL approach named HERA, which simultaneously incorporates the HeterogEneous Loss and the SpaRse and Low-rAnk procedure to estimate the labeling confidence for each instance while training the desired model. Specifically, the heterogeneous loss integrates the strengths of both the pairwise ranking loss and the pointwise reconstruction loss to provide informative label ranking and reconstruction information for label identification, whereas the embedded sparse and low-rank scheme constrains the sparsity of the ground-truth label matrix and the low rank of the noise label matrix to explore the global label relevance across the whole training data, improving the learning model. A comprehensive ablation study demonstrates the effectiveness of the employed heterogeneous loss, and extensive experiments on both artificial and real-world datasets demonstrate that our method achieves superior or comparable performance against state-of-the-art methods.
Affiliation(s)
- Gengyu Lyu
- Beijing Jiaotong University, Haidian District, Beijing, China
- Songhe Feng
- Beijing Jiaotong University, Haidian District, Beijing, China
- Yidong Li
- Beijing Jiaotong University, Haidian District, Beijing, China
- Yi Jin
- Beijing Jiaotong University, Haidian District, Beijing, China
- Guojun Dai
- Hangzhou Dianzi University, Hangzhou, Zhejiang, China
- Congyan Lang
- Beijing Jiaotong University, Haidian District, Beijing, China
20. Xu S, Ju H, Shang L, Pedrycz W, Yang X, Li C. Label distribution learning: A local collaborative mechanism. Int J Approx Reason 2020. DOI: 10.1016/j.ijar.2020.02.003.
21. Xu S, Yang M, Zhou Y, Zheng R, Liu W, He J. Partial label metric learning by collapsing classes. Int J Mach Learn Cybern 2020. DOI: 10.1007/s13042-020-01129-z.
22. Xue N, Deng J, Cheng S, Panagakis Y, Zafeiriou S. Side Information for Face Completion: A Robust PCA Approach. IEEE Trans Pattern Anal Mach Intell 2019; 41:2349-2364. PMID: 30843800. DOI: 10.1109/tpami.2019.2902556.
Abstract
Robust principal component analysis (RPCA) is a powerful method for learning low-rank feature representations of various visual data. However, for certain types and significant amounts of error corruption, it fails to yield satisfactory results; this drawback can be alleviated by exploiting domain-dependent prior knowledge or information. In this paper, we propose two RPCA models that take such side information into account, even in the presence of missing values. We apply this framework to the task of UV completion, which is widely used in pose-invariant face recognition. Moreover, we construct a generative adversarial network (GAN) to extract side information as well as subspaces. These subspaces not only assist in the recovery but also speed up the process for large-scale data. We quantitatively and qualitatively evaluate the proposed approaches on both synthetic data and eight real-world datasets to verify their effectiveness.
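RPCA decomposes data into a low-rank part plus a sparse error part. A generic building block of RPCA solvers, not the paper's side-information models, is the proximal step for the sparse ℓ1 term: elementwise soft thresholding.

```python
import numpy as np

def soft_threshold(X, tau):
    """Elementwise shrinkage: proximal operator of tau * ||X||_1 in RPCA solvers."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

# Entries smaller than tau in magnitude are zeroed; larger ones shrink by tau.
print(soft_threshold(np.array([2.0, -0.5, 1.0]), 1.0))
```

In an alternating scheme, this step updates the sparse component while singular-value thresholding updates the low-rank one; the side information in the paper enters as additional constraints on that decomposition.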
23. Lyu G, Feng S, Huang W, Dai G, Zhang H, Chen B. Partial label learning via low-rank representation and label propagation. Soft Comput 2019. DOI: 10.1007/s00500-019-04269-9.