1. Hamri ME, Bennani Y, Falih I. Incremental Confidence Sampling with Optimal Transport for Domain Adaptation. Int J Neural Syst 2024; 34:2450044. PMID: 38864576. DOI: 10.1142/s0129065724500448.
Abstract
Domain adaptation is a subfield of statistical learning theory that takes into account the shift between the distribution of training and test data, typically known as source and target domains, respectively. In this context, this paper presents an incremental approach to tackle the intricate challenge of unsupervised domain adaptation, where labeled data within the target domain is unavailable. The proposed approach, OTP-DA, endeavors to learn a sequence of joint subspaces from both the source and target domains using Linear Discriminant Analysis (LDA), such that the data projected into these subspaces are domain-invariant and well separated. Nonetheless, the necessity of labeled data for LDA to derive the projection matrix presents a substantial impediment, given the absence of labels within the target domain in the setting of unsupervised domain adaptation. To circumvent this limitation, we introduce a selective label propagation technique grounded in optimal transport (OTP) to generate pseudo-labels for target data, which serve as surrogates for the unknown labels. We anticipate that the process of inferring labels for target data will be substantially streamlined within the acquired latent subspaces, thereby facilitating a self-training mechanism. Furthermore, our paper provides a rigorous theoretical analysis of OTP-DA, underpinned by the concept of weak domain adaptation learners, thereby elucidating the requisite conditions for the proposed approach to solve the problem of unsupervised domain adaptation efficiently. Experimentation across a spectrum of visual domain adaptation problems suggests that OTP-DA exhibits promising efficacy and robustness, positioning it favorably compared to several state-of-the-art methods.
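The label-propagation step this abstract describes can be illustrated with a small optimal-transport sketch: couple source and target samples, push one-hot source labels through the coupling, and keep only the confident target pseudo-labels. This is a generic illustration assuming the POT library (pip install pot); the function name and the confidence threshold are hypothetical choices, not the paper's OTP implementation.

```python
# Illustrative sketch (not the paper's code): selective OT-based label propagation.
import numpy as np
import ot  # POT: Python Optimal Transport


def ot_pseudo_labels(Xs, ys, Xt, n_classes, conf_threshold=0.8):
    """Propagate integer source labels ys to target samples through an OT coupling.

    Returns pseudo-labels and a boolean mask of confidently labeled target samples
    (the 'selective' part of the propagation).
    """
    a = np.full(len(Xs), 1.0 / len(Xs))   # uniform source weights
    b = np.full(len(Xt), 1.0 / len(Xt))   # uniform target weights
    M = ot.dist(Xs, Xt)                   # pairwise squared Euclidean costs
    G = ot.emd(a, b, M)                   # exact OT coupling, shape (n_s, n_t)

    # Barycentric label propagation: each target point receives soft labels
    # proportional to the mass it receives from each source class.
    Ys = np.eye(n_classes)[ys]            # one-hot source labels, shape (n_s, K)
    soft = G.T @ Ys
    soft /= soft.sum(axis=1, keepdims=True) + 1e-12

    pseudo = soft.argmax(axis=1)
    confident = soft.max(axis=1) >= conf_threshold
    return pseudo, confident
```

In a self-training loop of the kind the abstract outlines, such pseudo-labels would be fed, together with the source labels, into the next LDA fit, and the propagation would be repeated in the newly learned subspace.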
Affiliation(s)
- Younès Bennani: LIPN, UMR 7030, Université Sorbonne Paris Nord, Villetaneuse, France
- Issam Falih: LIMOS, UMR 6158, Université Clermont-Auvergne, Clermont-Ferrand, France
2. Stan S, Rostami M. Source-free domain adaptation for semantic image segmentation using internal representations. Front Big Data 2024; 7:1359317. PMID: 38957657. PMCID: PMC11217319. DOI: 10.3389/fdata.2024.1359317.
Abstract
Semantic segmentation models trained on annotated data fail to generalize well when the input data distribution changes over an extended time period, which leads to a need for re-training to maintain performance. Classic unsupervised domain adaptation (UDA) attempts to address a similar problem when there is a target domain with no annotated data points, by transferring knowledge from a source domain with annotated data. We develop an online UDA algorithm for semantic segmentation of images that improves model generalization on unannotated domains in scenarios where source data access is restricted during adaptation. We perform model adaptation by minimizing the distributional distance between the source latent features and the target features in a shared embedding space. Our solution promotes a shared domain-agnostic latent feature space between the two domains, which allows for classifier generalization on the target dataset. To alleviate the need for access to source samples during adaptation, we approximate the source latent feature distribution via an appropriate surrogate distribution, in this case a Gaussian mixture model (GMM).
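As a rough sketch of the surrogate idea (not the authors' implementation), the source latent distribution can be summarized by a GMM while source data is still available, and adaptation can then compare target features against samples drawn from that GMM; the component count, names, and the choice of an exact transport cost below are assumptions made for illustration.

```python
# Minimal sketch: GMM surrogate for the source latent distribution (source-free matching).
import numpy as np
import ot
from sklearn.mixture import GaussianMixture

# Offline, before source data is discarded: summarize the source latent features.
# gmm = GaussianMixture(n_components=10, random_state=0).fit(source_latents)


def surrogate_matching_cost(gmm, target_latents):
    """Wasserstein cost between GMM surrogate samples and target latent features."""
    surrogate, _ = gmm.sample(len(target_latents))     # draw surrogate 'source' points
    a = np.full(len(surrogate), 1.0 / len(surrogate))
    b = np.full(len(target_latents), 1.0 / len(target_latents))
    M = ot.dist(surrogate, target_latents)              # squared Euclidean costs
    return ot.emd2(a, b, M)                              # scalar transport cost
```

In the actual adaptation the encoder would be updated to reduce such a cost, which calls for a differentiable distance; this numpy sketch only evaluates the objective for a fixed encoder.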
Affiliation(s)
- Mohammad Rostami: Department of Computer Science, University of Southern California, Los Angeles, CA, United States
3. Long T, Sun Y, Gao J, Hu Y, Yin B. Domain Adaptation as Optimal Transport on Grassmann Manifolds. IEEE Trans Neural Netw Learn Syst 2023; 34:7196-7209. PMID: 35061594. DOI: 10.1109/tnnls.2021.3139119.
Abstract
Domain adaptation in the Euclidean space is a challenging task on which researchers have recently made great progress. However, in practice, there are rich data representations that are not Euclidean. For example, many high-dimensional data in computer vision are in general modeled by a low-dimensional manifold. This motivates exploring domain adaptation between non-Euclidean manifold spaces. This article is concerned with domain adaptation over the classic Grassmann manifolds. An optimal transport-based domain adaptation model on Grassmann manifolds is proposed. The model implements the adaptation between datasets by minimizing the Wasserstein distances between the projected source data and the target data on Grassmann manifolds. Four regularization terms are introduced to keep task-related consistency in the adaptation process. Furthermore, to reduce the computational cost, a simplified model that preserves the necessary adaptation property, together with an efficient algorithm, is proposed and tested. Experiments on several publicly available datasets show that the proposed model outperforms several relevant baseline domain adaptation methods.
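To make the kind of objective described above concrete, the sketch below evaluates a Wasserstein transport cost after projecting both domains onto a fixed subspace (a single point on the Grassmann manifold). The full model would optimize the subspace over the manifold and add the paper's four regularization terms; this is only an illustrative evaluation of one candidate subspace, assuming the POT library and with hypothetical names.

```python
# Hedged sketch of the subspace-projected OT cost, not the paper's algorithm.
import numpy as np
import ot


def wasserstein_cost_on_subspace(Xs, Xt, U):
    """OT cost between source and target after projecting both onto span(U),
    where U has orthonormal columns (a point on the Grassmann manifold)."""
    Ps, Pt = Xs @ U, Xt @ U
    a = np.full(len(Ps), 1.0 / len(Ps))
    b = np.full(len(Pt), 1.0 / len(Pt))
    return ot.emd2(a, b, ot.dist(Ps, Pt))   # exact transport cost on the subspace


# A random orthonormal basis stands in here for an optimized Grassmann point:
# U, _ = np.linalg.qr(np.random.randn(d, k))
# cost = wasserstein_cost_on_subspace(Xs, Xt, U)
```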
4. Wu F, Courty N, Jin S, Li SZ. Improving molecular representation learning with metric learning-enhanced optimal transport. Patterns (N Y) 2023; 4:100714. PMID: 37123438. PMCID: PMC10140620. DOI: 10.1016/j.patter.2023.100714.
Abstract
Training data are usually limited or heterogeneous in many chemical and biological applications. Existing machine learning models for chemistry and materials science fail to consider generalizing beyond training domains. In this article, we develop a novel optimal transport-based algorithm termed MROT to enhance their generalization capability for molecular regression problems. MROT learns a continuous label for the data by measuring a new metric of domain distances and applying a posterior variance regularization over the transport plan to bridge the chemical domain gap. Among downstream tasks, we consider basic chemical regression tasks in unsupervised and semi-supervised settings, including chemical property prediction and materials adsorption selection. Extensive experiments show that MROT significantly outperforms state-of-the-art models, showing promising potential in accelerating the discovery of new substances with desired properties.
Affiliation(s)
- Fang Wu: School of Engineering, Westlake University, Hangzhou 310024, China; Institute of AI Industry Research, Tsinghua University, Beijing 100084, China
- Nicolas Courty: French National Centre for Scientific Research, Southern Brittany University, Lorient, France
- Shuting Jin: School of Informatics, Xiamen University, Xiamen 361005, China; National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen 361005, China
- Stan Z. Li (corresponding author): School of Engineering, Westlake University, Hangzhou 310024, China
5. Sicilia A, Zhao X, Hwang SJ. Domain adversarial neural networks for domain generalization: when it works and how to improve. Mach Learn 2023. DOI: 10.1007/s10994-023-06324-x.
Abstract
Theoretically, domain adaptation is a well-researched problem. Further, this theory has been well-used in practice. In particular, we note the bound on target error given by Ben-David et al. (Mach Learn 79(1–2):151–175, 2010) and the well-known domain-aligning algorithm based on this work using Domain Adversarial Neural Networks (DANN) presented by Ganin and Lempitsky (in International Conference on Machine Learning, pp 1180–1189). Recently, multiple variants of DANN have been proposed for the related problem of domain generalization, but without much discussion of the original motivating bound. In this paper, we investigate the validity of DANN in domain generalization from this perspective. We investigate conditions under which application of DANN makes sense and further consider DANN as a dynamic process during training. Our investigation suggests that the application of DANN to domain generalization may not be as straightforward as it seems. To address this, we design an algorithmic extension to DANN in the domain generalization case. Our experimentation validates both theory and algorithm.
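For readers who have not seen DANN, the mechanism this abstract builds on is a gradient-reversal layer placed between the feature extractor and a domain classifier. Below is a minimal PyTorch sketch of that standard layer (the common construction, not the authors' extended domain-generalization algorithm).

```python
# Minimal gradient-reversal layer, the standard DANN ingredient (illustrative only).
import torch


class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lamb):
        ctx.lamb = lamb
        return x.view_as(x)              # identity on the forward pass

    @staticmethod
    def backward(ctx, grad_output):
        # Reversed (and scaled) gradient on the backward pass, which makes the
        # feature extractor work against the domain classifier.
        return -ctx.lamb * grad_output, None


def grad_reverse(x, lamb=1.0):
    return GradReverse.apply(x, lamb)


# Usage inside a model: task_logits = task_head(features) as usual, and
# domain_logits = domain_classifier(grad_reverse(features, lamb)).
```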
6. Hanneke S, Kpotufe S. A no-free-lunch theorem for multitask learning. Ann Stat 2022. DOI: 10.1214/22-aos2189.
7. Aritake T, Hino H. Unsupervised Domain Adaptation for Extra Features in the Target Domain Using Optimal Transport. Neural Comput 2022; 34:2432-2466. DOI: 10.1162/neco_a_01549.
Abstract
Domain adaptation aims to transfer knowledge of labeled instances obtained from a source domain to a target domain to fill the gap between the domains. Most domain adaptation methods assume that the source and target domains have the same dimensionality. Methods that are applicable when the number of features is different in each domain have rarely been studied, especially when no label information is given for the test data obtained from the target domain. In this letter, it is assumed that common features exist in both domains and that extra (new additional) features are observed in the target domain; hence, the dimensionality of the target domain is higher than that of the source domain. To leverage the homogeneity of the common features, the adaptation between these source and target domains is formulated as an optimal transport (OT) problem. In addition, a learning bound in the target domain for the proposed OT-based method is derived. The proposed algorithm is validated using both simulated and real-world data.
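One simplified reading of this heterogeneous setting, shown as a hedged sketch rather than the letter's exact formulation: build the transport cost only on the common features and, if desired, use the resulting plan to attach estimates of the target-only features to each source sample via a barycentric projection. The POT library, uniform sample weights, and the function names are assumptions.

```python
# Illustrative sketch: OT coupling on shared features when the target has extra features.
import numpy as np
import ot


def couple_on_common_features(Xs, Xt, common_idx, extra_idx):
    """Xs contains only the common features; Xt contains common + extra features.

    Returns the OT coupling computed on the shared coordinates and a barycentric
    estimate of the target-only features for each source sample.
    """
    a = np.full(len(Xs), 1.0 / len(Xs))
    b = np.full(len(Xt), 1.0 / len(Xt))
    M = ot.dist(Xs, Xt[:, common_idx])          # cost on the shared coordinates only
    G = ot.emd(a, b, M)                          # coupling, shape (n_s, n_t)

    # Transport-weighted average of the extra target features per source sample.
    extra_for_source = (G @ Xt[:, extra_idx]) / G.sum(axis=1, keepdims=True)
    return G, extra_for_source
```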
Affiliation(s)
- Hideitsu Hino: Institute of Statistical Mathematics, Tachikawa, Tokyo 190-8562, Japan; RIKEN AIP, Nihon-bashi, Chuo-ku, Tokyo 103-0027, Japan
8. Chen P, Zhao R, He T, Wei K, Yang Q. Unsupervised domain adaptation of bearing fault diagnosis based on Join Sliced Wasserstein Distance. ISA Trans 2022; 129:504-519. PMID: 35039152. DOI: 10.1016/j.isatra.2021.12.037.
Abstract
Deep neural networks have been successfully utilized in mechanical fault diagnosis; however, most of them rest on the assumption that training and test datasets follow the same distribution. Unfortunately, mechanical systems are easily affected by environmental noise interference and changes in speed or load. Consequently, the trained networks generalize poorly across working conditions. Recently, unsupervised domain adaptation has attracted more and more attention since it can handle different but related data. The Sliced Wasserstein Distance has been successfully utilized in unsupervised domain adaptation and has obtained excellent performance. However, most of these approaches ignore the class conditional distribution. In this paper, a novel approach named Join Sliced Wasserstein Distance (JSWD) is proposed to address this issue. Four bearing datasets were selected to validate the practicability and effectiveness of the JSWD framework. The experimental results demonstrate that JSWD improves accuracy by about 5% when the class conditional distribution is taken into account compared with when it is not; in addition, further results indicate that JSWD effectively captures discriminative, domain-invariant representations and achieves superior data distribution matching compared with previous methods across various application scenarios.
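The abstract's key point, aligning class-conditional rather than only marginal distributions, can be sketched as a per-class sliced Wasserstein term computed between source samples of a class and target samples currently pseudo-labeled as that class. This PyTorch sketch is illustrative (equal-sized per-class batches are obtained by truncation) and is not the paper's JSWD implementation.

```python
# Sketch of a class-conditional sliced-Wasserstein-style alignment loss.
import torch


def sliced_w2(x, y, n_proj=64):
    """Differentiable sliced W2^2 between two equal-sized batches of features."""
    d = x.shape[1]
    theta = torch.randn(d, n_proj, device=x.device)
    theta = theta / theta.norm(dim=0, keepdim=True)     # random unit projections
    px = torch.sort(x @ theta, dim=0).values             # sorted 1-D projections
    py = torch.sort(y @ theta, dim=0).values
    return ((px - py) ** 2).mean()


def class_conditional_swd(feat_s, y_s, feat_t, y_t_pseudo, n_classes):
    """Average per-class sliced W2 between source and pseudo-labeled target features."""
    losses = []
    for k in range(n_classes):
        xs = feat_s[y_s == k]
        xt = feat_t[y_t_pseudo == k]
        m = min(len(xs), len(xt))          # truncate so both batches match in size
        if m > 1:
            losses.append(sliced_w2(xs[:m], xt[:m]))
    return torch.stack(losses).mean() if losses else feat_s.new_zeros(())
```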
Affiliation(s)
- Pengfei Chen: School of Mechanical and Electrical Engineering, Lanzhou University of Technology, Lanzhou 730050, China; Gansu Agricultural Mechanization Technology Extension Station, Lanzhou 730046, China
- Rongzhen Zhao: School of Mechanical and Electrical Engineering, Lanzhou University of Technology, Lanzhou 730050, China
- Tianjing He: School of Mechanical and Electrical Engineering, Lanzhou University of Technology, Lanzhou 730050, China
- Kongyuan Wei: School of Mechanical and Electrical Engineering, Lanzhou University of Technology, Lanzhou 730050, China
- Qidong Yang: Gansu Agricultural Mechanization Technology Extension Station, Lanzhou 730046, China
9. Wei P, Zhang C, Tang Y, Li Z, Wang Z. Reinforced domain adaptation with attention and adversarial learning for unsupervised person Re-ID. Appl Intell 2022. DOI: 10.1007/s10489-022-03640-y.
10. Xu B, Zeng Z, Lian C, Ding Z. Few-Shot Domain Adaptation via Mixup Optimal Transport. IEEE Trans Image Process 2022; 31:2518-2528. PMID: 35275818. DOI: 10.1109/tip.2022.3157139.
Abstract
Unsupervised domain adaptation aims to learn a classification model for the target domain without any labeled samples by transferring knowledge from a source domain with sufficient labeled samples. The source and target domains usually share the same label space but have different data distributions. In this paper, we consider a more difficult but insufficiently explored problem, few-shot domain adaptation, where a classifier should generalize well to the target domain given only a small number of examples in the source domain. In this problem, we recast the link between the source and target samples as a mixup optimal transport model. The mixup mechanism is integrated into optimal transport to perform the few-shot adaptation by learning the cross-domain alignment matrix and the domain-invariant classifier simultaneously, augmenting the source distribution and aligning the two probability distributions. Moreover, spectral shrinkage regularization is deployed to improve the transferability and discriminability of the mixup optimal transport model by utilizing all singular eigenvectors. Experiments conducted on several domain adaptation tasks demonstrate the effectiveness of the proposed model for the few-shot domain adaptation problem compared with state-of-the-art methods.
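The high-level recipe, augmenting scarce source data with mixup and then coupling the augmented source to the target with optimal transport, can be sketched as follows. The names, the Beta parameter, and the entropic regularization are assumptions for illustration, not the paper's settings; the POT library is assumed.

```python
# Illustrative sketch: mixup-augmented source coupled to the target via entropic OT.
import numpy as np
import ot


def mixup_source(Xs, Ys_onehot, n_aug, alpha=0.2, rng=None):
    """Create virtual source samples and soft labels as convex combinations of random pairs."""
    rng = np.random.default_rng(rng)
    i = rng.integers(len(Xs), size=n_aug)
    j = rng.integers(len(Xs), size=n_aug)
    lam = rng.beta(alpha, alpha, size=(n_aug, 1))
    return lam * Xs[i] + (1 - lam) * Xs[j], lam * Ys_onehot[i] + (1 - lam) * Ys_onehot[j]


def mixup_ot_plan(Xs, Ys_onehot, Xt, n_aug=200, reg=0.1):
    """Couple the mixup-augmented source to the target and read off soft target labels."""
    Xa, Ya = mixup_source(Xs, Ys_onehot, n_aug)
    a = np.full(len(Xa), 1.0 / len(Xa))
    b = np.full(len(Xt), 1.0 / len(Xt))
    G = ot.sinkhorn(a, b, ot.dist(Xa, Xt), reg)        # entropic OT coupling
    soft_t = G.T @ Ya                                    # soft target labels via the plan
    return G, soft_t / (soft_t.sum(axis=1, keepdims=True) + 1e-12)
```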
11. Decomposed-distance weighted optimal transport for unsupervised domain adaptation. Appl Intell 2022. DOI: 10.1007/s10489-021-03112-9.
12. Li L, Wan Z, He H. Dual Alignment for Partial Domain Adaptation. IEEE Trans Cybern 2021; 51:3404-3416. PMID: 32356766. DOI: 10.1109/tcyb.2020.2983337.
Abstract
Partial domain adaptation (PDA) aims to transfer knowledge from a label-rich source domain to a label-scarce target domain based on the assumption that the source label space subsumes the target label space. The major challenge is to promote positive transfer in the shared label space and circumvent negative transfer caused by the large mismatch across different label spaces. In this article, we propose a dual alignment approach for PDA (DAPDA), including three components: 1) a feature extractor extracts source and target features via a Siamese network; 2) a reweighting network produces "hard" labels and class-level weights for source features, and "soft" labels and instance-level weights for target features; 3) a dual alignment network aligns intra-domain and inter-domain distributions. Specifically, the intra-domain alignment aims to minimize the intra-class variances to enhance intra-class compactness in both domains, and the inter-domain alignment attempts to reduce the discrepancies across domains through domain-wise and class-wise adaptations. Negative transfer can be alleviated by down-weighting source features with non-shared labels, and positive transfer can be enhanced by up-weighting source features with shared labels. The adaptation is achieved by minimizing the discrepancies based on class-weighted source data with hard labels and instance-weighted target data with soft labels. The effectiveness of our method is demonstrated by outperforming state-of-the-art PDA methods on several benchmark datasets.
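The down-weighting of source classes outside the shared label space that this abstract mentions is often realized by averaging the classifier's predictions on the target domain. The sketch below shows that common PDA heuristic for illustration; it is not necessarily DAPDA's reweighting network, and the names are hypothetical.

```python
# Sketch of class reweighting to curb negative transfer in partial domain adaptation.
import numpy as np


def source_class_weights(target_probs):
    """target_probs: (n_t, K) softmax outputs of the current classifier on target data."""
    w = target_probs.mean(axis=0)      # average predicted mass per class on the target
    return w / w.max()                  # classes absent from the target get small weights


def weighted_source_loss(per_sample_loss, ys, class_weights):
    """Reweight per-sample source losses by the weight of each sample's class."""
    return float(np.mean(class_weights[ys] * per_sample_loss))
```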
13. Kouw WM, Loog M. A Review of Domain Adaptation without Target Labels. IEEE Trans Pattern Anal Mach Intell 2021; 43:766-785. PMID: 31603771. DOI: 10.1109/tpami.2019.2945942.
Abstract
Domain adaptation has become a prominent problem setting in machine learning and related fields. This review asks the question: how can a classifier learn from a source domain and generalize to a target domain? We present a categorization of approaches, divided into what we refer to as sample-based, feature-based, and inference-based methods. Sample-based methods focus on weighting individual observations during training based on their importance to the target domain. Feature-based methods revolve around mapping, projecting, and representing features such that a source classifier performs well on the target domain, and inference-based methods incorporate adaptation into the parameter estimation procedure, for instance through constraints on the optimization procedure. Additionally, we review a number of conditions that allow for formulating bounds on the cross-domain generalization error. Our categorization highlights recurring ideas and raises questions important to further research.
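The "sample-based" idea the review describes is commonly implemented as importance weighting: estimate how target-like each source point is with a probabilistic domain classifier and reweight the source training loss accordingly. This is a standard recipe shown for illustration, not a method specific to the review; the function name is hypothetical.

```python
# Minimal importance-weighting sketch using a logistic-regression domain classifier.
import numpy as np
from sklearn.linear_model import LogisticRegression


def importance_weights(Xs, Xt):
    """Estimate w(x) ~ p_target(x) / p_source(x) for each source sample."""
    X = np.vstack([Xs, Xt])
    d = np.concatenate([np.zeros(len(Xs)), np.ones(len(Xt))])   # 0 = source, 1 = target
    clf = LogisticRegression(max_iter=1000).fit(X, d)
    p_t = clf.predict_proba(Xs)[:, 1]                            # P(target | x) on source points
    w = p_t / np.clip(1.0 - p_t, 1e-6, None)                     # odds ratio ~ density ratio
    return w * (len(Xs) / len(Xt))                               # correct for class imbalance
```

The resulting weights multiply the per-sample source losses, so the source classifier is trained to perform well where the target distribution has mass.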
14. Zhou F, Shui C, Abbasi M, Robitaille LE, Wang B, Gagne C. Task Similarity Estimation Through Adversarial Multitask Neural Network. IEEE Trans Neural Netw Learn Syst 2021; 32:466-480. PMID: 33112753. DOI: 10.1109/tnnls.2020.3028022.
Abstract
Multitask learning (MTL) aims at solving related tasks simultaneously by exploiting shared knowledge to improve performance on individual tasks. Though numerous empirical results have supported the notion that such shared knowledge among tasks plays an essential role in MTL, the theoretical understanding of the relationships between tasks and their impact on learning shared knowledge is still an open problem. In this work, we develop a theoretical perspective on the benefits of using task similarity information for MTL. To this end, we first propose an upper bound on the generalization error that uses the Wasserstein distance as the similarity metric. This indicates practical principles for applying similarity information to control the generalization error. Based on these theoretical results, we revisit the adversarial multitask neural network and propose a new training algorithm to learn the task relation coefficients and neural network parameters automatically. Computer vision benchmarks demonstrate the ability of the proposed algorithm to improve empirical performance. Finally, we test the proposed approach on real medical datasets, showing its advantage for extracting task relations.
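To make the role of the Wasserstein distance concrete, the hedged sketch below turns pairwise Wasserstein distances between task feature distributions into row-normalized task-relation coefficients. The exponential conversion and the temperature are illustrative choices; the paper learns its relation coefficients jointly with the network, which this sketch does not attempt. The POT library is assumed.

```python
# Illustrative sketch: task-relation coefficients from pairwise Wasserstein distances.
import numpy as np
import ot


def task_relation_matrix(task_feats, temperature=1.0):
    """task_feats: list of (n_i, d) feature arrays, one per task."""
    T = len(task_feats)
    D = np.zeros((T, T))
    for i in range(T):
        for j in range(i + 1, T):
            a = np.full(len(task_feats[i]), 1.0 / len(task_feats[i]))
            b = np.full(len(task_feats[j]), 1.0 / len(task_feats[j]))
            D[i, j] = D[j, i] = ot.emd2(a, b, ot.dist(task_feats[i], task_feats[j]))
    # Convert distances into row-normalized relation coefficients (closer tasks weigh more).
    A = np.exp(-D / temperature)
    return A / A.sum(axis=1, keepdims=True)
```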
15. Chum L, Subramanian A, Balasubramanian VN, Jawahar CV. Beyond Supervised Learning: A Computer Vision Perspective. J Indian Inst Sci 2019. DOI: 10.1007/s41745-019-0099-3.
16. DeepJDOT: Deep Joint Distribution Optimal Transport for Unsupervised Domain Adaptation. Computer Vision – ECCV 2018, 2018. DOI: 10.1007/978-3-030-01225-0_28.