1
Chen S. Joint weight optimization for partial domain adaptation via kernel statistical distance estimation. Neural Netw 2024; 180:106739. [PMID: 39299038 DOI: 10.1016/j.neunet.2024.106739] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/03/2024] [Revised: 09/04/2024] [Accepted: 09/12/2024] [Indexed: 09/22/2024]
Abstract
The goal of Partial Domain Adaptation (PDA) is to transfer a neural network from a source domain (joint source distribution) to a distinct target domain (joint target distribution), where the source label space subsumes the target label space. To address the PDA problem, existing works have proposed to learn the marginal source weights to match the weighted marginal source distribution to the marginal target distribution. However, this is sub-optimal, since the neural network's target performance is concerned with the joint distribution disparity, not the marginal distribution disparity. In this paper, we propose a Joint Weight Optimization (JWO) approach that optimizes the joint source weights to match the weighted joint source distribution to the joint target distribution in the neural network's feature space. To measure the joint distribution disparity, we exploit two statistical distances: the distribution-difference-based L2-distance and the distribution-ratio-based χ2-divergence. Since these two distances are unknown in practice, we propose a Kernel Statistical Distance Estimation (KSDE) method to estimate them from the weighted source data and the target data. Our KSDE method explicitly expresses the two estimated statistical distances as functions of the joint source weights. Therefore, we can optimize the joint weights to minimize the estimated distance functions and reduce the joint distribution disparity. Finally, we achieve the PDA goal by training the neural network on the weighted source data. Experiments on several popular datasets are conducted to demonstrate the effectiveness of our approach. An introductory video and PyTorch code are available at https://github.com/sentaochen/Joint-Weight-Optimation. Interested readers can also visit https://github.com/sentaochen for more source code for related domain adaptation, multi-source domain adaptation, and domain generalization approaches.
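For orientation, the two distances named in this abstract have the standard definitions sketched below. The notation is an assumption introduced here for illustration: $z = (x, y)$ is a feature-label pair, $p_s$ and $q$ are the joint source and target densities, and $w$ is the joint source weight function. The direction of the χ2-divergence used in the paper may differ.

```latex
\begin{align}
p_w(z) &= w(z)\,p_s(z), \\
L^2(p_w, q) &= \int \bigl(p_w(z) - q(z)\bigr)^2 \,\mathrm{d}z, \\
\chi^2(p_w \,\|\, q) &= \int \Bigl(\frac{p_w(z)}{q(z)} - 1\Bigr)^2 q(z)\,\mathrm{d}z .
\end{align}
```

Both quantities are zero exactly when $p_w = q$ almost everywhere, which is why minimizing an estimate of either over $w$ reduces the joint distribution disparity.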
Affiliation(s)
- Sentao Chen
- Department of Computer Science, Shantou University, China.
2
Liu L, Zhou B, Zhao Z, Liu Z. Active Dynamic Weighting for multi-domain adaptation. Neural Netw 2024; 177:106398. [PMID: 38805796 DOI: 10.1016/j.neunet.2024.106398] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/04/2023] [Revised: 03/11/2024] [Accepted: 05/19/2024] [Indexed: 05/30/2024]
Abstract
Multi-source unsupervised domain adaptation aims to transfer knowledge from multiple labeled source domains to an unlabeled target domain. Existing methods either seek a mixture of distributions across various domains or combine multiple single-source models for weighted fusion in the decision process, with little insight into the distributional discrepancy between different source domains and the target domain. Considering the discrepancies in global and local feature distributions between different domains and the complexity of obtaining category boundaries across domains, this paper proposes a novel Active Dynamic Weighting (ADW) for multi-source domain adaptation. Specifically, to effectively utilize the locally advantageous features in the source domains, ADW designs a multi-source dynamic adjustment mechanism during the training process to dynamically control the degree of feature alignment between each source and target domain in the training batch. In addition, to ensure the cross-domain categories can be distinguished, ADW devises a dynamic boundary loss to guide the model to focus on the hard samples near the decision boundary, which enhances the clarity of the decision boundary and improves the model's classification ability. Meanwhile, ADW applies active learning to multi-source unsupervised domain adaptation for the first time: guided by the dynamic boundary loss, it proposes an efficient importance sampling strategy that selects hard target-domain samples to annotate at a minimal annotation budget, integrates this strategy into the training process, and further refines the domain alignment at the category level. Experiments on various benchmark datasets consistently demonstrate the superiority of our method.
Affiliation(s)
- Long Liu
- Xi'an University of Technology, Xi'an, 710048, China.
- Bo Zhou
- Xi'an University of Technology, Xi'an, 710048, China.
- Zhipeng Zhao
- Xi'an University of Technology, Xi'an, 710048, China.
- Zening Liu
- Xi'an University of Technology, Xi'an, 710048, China.
3
Wen L, Chen S, Xie M, Liu C, Zheng L. Training multi-source domain adaptation network by mutual information estimation and minimization. Neural Netw 2024; 171:353-361. [PMID: 38128299 DOI: 10.1016/j.neunet.2023.12.022] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2023] [Revised: 12/01/2023] [Accepted: 12/12/2023] [Indexed: 12/23/2023]
Abstract
We address the problem of Multi-Source Domain Adaptation (MSDA), which trains a neural network using multiple labeled source datasets and an unlabeled target dataset, and expects the trained network to classify the unlabeled target data well. The main challenge in this problem is that the datasets are generated by relevant but different joint distributions. In this paper, we propose to address this challenge by estimating and minimizing the mutual information in the network latent feature space, which leads to the alignment of the source joint distributions and target joint distribution simultaneously. Here, the estimation of the mutual information is formulated as a convex optimization problem, such that the global optimal solution can be easily found. We conduct experiments on several public datasets, and show that our algorithm statistically outperforms its competitors. Video and code are available at https://github.com/sentaochen/Mutual-Information-Estimation-and-Minimization.
Affiliation(s)
- Lisheng Wen
- Department of Computer Science, Shantou University, China.
- Sentao Chen
- Department of Computer Science, Shantou University, China.
- Mengying Xie
- College of Computer Science, Chongqing University, China.
- Cheng Liu
- Department of Computer Science, Shantou University, China.
- Lin Zheng
- Department of Computer Science, Shantou University, China.
4
Lei B, Zhu Y, Liang E, Yang P, Chen S, Hu H, Xie H, Wei Z, Hao F, Song X, Wang T, Xiao X, Wang S, Han H. Federated Domain Adaptation via Transformer for Multi-Site Alzheimer's Disease Diagnosis. IEEE TRANSACTIONS ON MEDICAL IMAGING 2023; 42:3651-3664. [PMID: 37527297 DOI: 10.1109/tmi.2023.3300725] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/03/2023]
Abstract
In multi-site studies of Alzheimer's disease (AD), differences among the multi-site datasets degrade model performance at the target sites. Traditional domain adaptation methods require sharing data from both source and target domains, which raises data privacy issues. To solve this, federated learning is adopted, as it allows models to be trained with multi-site data in a privacy-protected manner. In this paper, we propose a multi-site federated domain adaptation framework via Transformer (FedDAvT), which not only protects data privacy, but also eliminates data heterogeneity. The Transformer network is used as the backbone network to extract the correlation between the multi-template region-of-interest features, which can capture abundant information about the brain. The self-attention maps in the source and target domains are aligned by applying mean squared error for subdomain adaptation. Finally, we evaluate our method on the multi-site databases based on three AD datasets. The experimental results show that the proposed FedDAvT is quite effective, achieving accuracy rates of 88.75%, 69.51%, and 69.88% on the AD vs. NC, MCI vs. NC, and AD vs. MCI two-way classification tasks, respectively.
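The attention-map alignment described in this abstract can be illustrated with a minimal NumPy sketch. It is not the FedDAvT implementation: the function names `attention_map` and `attention_alignment_loss`, the single-head scaled dot-product attention, and the array shapes are all assumptions made for illustration.

```python
import numpy as np


def attention_map(queries, keys):
    """Single-head scaled dot-product attention weights (rows sum to 1).

    Hypothetical helper: queries and keys are (tokens, dim) arrays.
    """
    scores = queries @ keys.T / np.sqrt(queries.shape[-1])
    # Subtract the row maximum for a numerically stable softmax.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    return weights / weights.sum(axis=-1, keepdims=True)


def attention_alignment_loss(source_attn, target_attn):
    """Mean squared error between two attention maps of equal shape."""
    return float(np.mean((source_attn - target_attn) ** 2))
```

In a training loop, a loss of this kind would typically be added to the classification loss, so that the source-site and target-site models are pushed toward attending to the same region-of-interest correlations.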
5
Chen S, Hong Z, Harandi M, Yang X. Domain Neural Adaptation. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2023; 34:8630-8641. [PMID: 35259116 DOI: 10.1109/tnnls.2022.3151683] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 06/14/2023]
Abstract
Domain adaptation is concerned with the problem of generalizing a classification model to a target domain with little or no labeled data, by leveraging the abundant labeled data from a related source domain. The source and target domains possess different joint probability distributions, which makes model generalization challenging. In this article, we introduce domain neural adaptation (DNA): an approach that exploits a nonlinear deep neural network to 1) match the source and target joint distributions in the network activation space and 2) learn the classifier in an end-to-end manner. Specifically, we employ the relative chi-square divergence to compare the two joint distributions, and show that the divergence can be estimated by seeking the maximal value of a quadratic functional over the reproducing kernel Hilbert space. The analytic solution to this maximization problem enables us to explicitly express the divergence estimate as a function of the neural network mapping. We optimize the network parameters to minimize the estimated joint distribution divergence and the classification loss, yielding a classification model that generalizes well to the target domain. Empirical results on several visual datasets demonstrate that our solution is statistically better than its competitors.
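As a point of reference, one common form of a relative chi-square (relative Pearson) divergence between densities $p$ and $q$ is sketched below. The mixing parameter $\alpha \in [0, 1)$, the mixture density $q_\alpha$, and the factor $\tfrac{1}{2}$ are assumptions taken from the general density-ratio literature; the article's exact definition and normalization may differ.

```latex
\begin{align}
q_\alpha(z) &= \alpha\,p(z) + (1-\alpha)\,q(z), \\
D_\alpha(p \,\|\, q) &= \frac{1}{2}\int \Bigl(\frac{p(z)}{q_\alpha(z)} - 1\Bigr)^2 q_\alpha(z)\,\mathrm{d}z .
\end{align}
```

For $\alpha > 0$ the ratio $p/q_\alpha$ is bounded above by $1/\alpha$, which is the usual motivation for preferring the relative form over the plain chi-square divergence.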
6
Dan J, Jin T, Chi H, Dong S, Xie H, Cao K, Yang X. Trust-aware conditional adversarial domain adaptation with feature norm alignment. Neural Netw 2023; 168:518-530. [PMID: 37832319 DOI: 10.1016/j.neunet.2023.10.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2023] [Revised: 08/19/2023] [Accepted: 10/02/2023] [Indexed: 10/15/2023]
Abstract
Adversarial learning has proven to be an effective method for capturing transferable features for unsupervised domain adaptation. However, some existing conditional adversarial domain adaptation methods assign equal importance to different samples, ignoring the fact that hard-to-transfer samples might damage the conditional adversarial adaptation procedure. Meanwhile, some methods can only roughly align marginal distributions across domains, but cannot ensure category distributions alignment, causing classifiers to make uncertain or even wrong predictions for some target data. Furthermore, we find that the feature norms of real images usually follow a complex distribution, so directly matching the mean feature norms of two domains cannot effectively reduce the statistical discrepancy of feature norms and may potentially induce feature degradation. In this paper, we develop a Trust-aware Conditional Adversarial Domain Adaptation (TCADA) method for solving the aforementioned issues. To quantify data transferability, we suggest utilizing posterior probability modeled by a Gaussian-uniform mixture, which effectively facilitates conditional domain alignment. Based on this posterior probability, a confidence-guided alignment strategy is presented to promote precise alignment of category distributions and accelerate the learning of shared features. Moreover, a novel optimal transport-based strategy is introduced to align the feature norms and facilitate shared features becoming more informative. To encourage classifiers to make more accurate predictions for target data, we also design a mixed information-guided entropy regularization term to promote deep features being away from the decision boundaries. Extensive experiments show that our method greatly improves transfer performance on various tasks.
Affiliation(s)
- Jun Dan
- College of Information Science & Electronic Engineering, Zhejiang University, Hangzhou, 310027, China.
- Tao Jin
- College of Information Science & Electronic Engineering, Zhejiang University, Hangzhou, 310027, China.
- Hao Chi
- School of Communication Engineering, Hangzhou Dianzi University, Hangzhou, 310018, China.
- Shunjie Dong
- Department of Radiology, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, 200025, China; College of Health Science and Technology, Shanghai Jiao Tong University School of Medicine, Shanghai, 200025, China.
- Haoran Xie
- Department of Electronic Engineering, Tsinghua University, Beijing, 100084, China.
- Keying Cao
- College of Information Science & Electronic Engineering, Zhejiang University, Hangzhou, 310027, China.
- Xinjing Yang
- College of Information Science & Electronic Engineering, Zhejiang University, Hangzhou, 310027, China.
7
Gholami B, El-Khamy M, Song KB. Latent Feature Disentanglement for Visual Domain Generalization. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2023; 32:5751-5763. [PMID: 37831569 DOI: 10.1109/tip.2023.3321511] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/15/2023]
Abstract
Despite remarkable success in a variety of computer vision applications, it is well-known that deep learning can fail catastrophically when presented with out-of-distribution data, where there are usually style differences between the training and test images. Toward addressing this challenge, we consider the domain generalization problem, wherein predictors are trained using data drawn from a family of related training (source) domains and then evaluated on a distinct and unseen test domain. Naively training a model on the aggregate set of data (pooled from all source domains) has been shown to perform suboptimally, since the information learned by that model might be domain-specific and generalizes imperfectly to test domains. Data augmentation has been shown to be an effective approach to overcome this problem. However, its application has been limited to enforcing invariance to simple transformations like rotation, brightness change, etc. Such perturbations do not necessarily cover plausible real-world variations that preserve the semantics of the input (such as a change in the image style). In this paper, taking advantage of multiple source domains, we propose a novel approach to express and formalize robustness to these kinds of real-world image perturbations. The three key ideas underlying our formulation are (1) leveraging disentangled representations of the images to define different factors of variation, (2) generating perturbed images by changing such factors composing the representations of the images, and (3) enforcing the learner (classifier) to be invariant to such changes in the images. We use image-to-image translation models to demonstrate the efficacy of this approach. Based on this, we propose a domain-invariant regularization (DIR) loss function that enforces invariant prediction of targets (class labels) across domains, which yields improved generalization performance.
We demonstrate the effectiveness of our approach on several widely used datasets for the domain generalization problem, on all of which our results are competitive with the state-of-the-art.
8
Sahay R, Thomas G, Jahan CS, Manjrekar M, Popp D, Savakis A. On the Importance of Attention and Augmentations for Hypothesis Transfer in Domain Adaptation and Generalization. SENSORS (BASEL, SWITZERLAND) 2023; 23:8409. [PMID: 37896503 PMCID: PMC10611075 DOI: 10.3390/s23208409] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2023] [Revised: 09/27/2023] [Accepted: 10/10/2023] [Indexed: 10/29/2023]
Abstract
Unsupervised domain adaptation (UDA) aims to mitigate the performance drop due to the distribution shift between the training and testing datasets. UDA methods have achieved performance gains for models trained on a source domain with labeled data to a target domain with only unlabeled data. The standard feature extraction method in domain adaptation has been convolutional neural networks (CNNs). Recently, attention-based transformer models have emerged as effective alternatives for computer vision tasks. In this paper, we benchmark three attention-based architectures, specifically vision transformer (ViT), shifted window transformer (SWIN), and dual attention vision transformer (DAViT), against convolutional architectures ResNet, HRNet and attention-based ConvNext, to assess the performance of different backbones for domain generalization and adaptation. We incorporate these backbone architectures as feature extractors in the source hypothesis transfer (SHOT) framework for UDA. SHOT leverages the knowledge learned in the source domain to align the image features of unlabeled target data in the absence of source domain data, using self-supervised deep feature clustering and self-training. We analyze the generalization and adaptation performance of these models on standard UDA datasets and aerial UDA datasets. In addition, we modernize the training procedure commonly seen in UDA tasks by adding image augmentation techniques to help models generate richer features. Our results show that ConvNext and SWIN offer the best performance, indicating that the attention mechanism is very beneficial for domain generalization and adaptation with both transformer and convolutional architectures. Our ablation study shows that our modernized training recipe, within the SHOT framework, significantly boosts performance on aerial datasets.
Affiliation(s)
- Andreas Savakis
- Rochester Institute of Technology, Rochester, NY 14623, USA; (R.S.); (C.S.J.)
9
Lee J, Lee G. Feature Alignment by Uncertainty and Self-Training for Source-Free Unsupervised Domain Adaptation. Neural Netw 2023; 161:682-692. [PMID: 36841039 DOI: 10.1016/j.neunet.2023.02.009] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 08/11/2022] [Revised: 12/22/2022] [Accepted: 02/06/2023] [Indexed: 02/12/2023]
Abstract
Most unsupervised domain adaptation (UDA) methods assume that labeled source images are available during model adaptation. However, this assumption is often infeasible owing to confidentiality issues or memory constraints on mobile devices. Some recently developed approaches do not require source images during adaptation, but they show limited performance on perturbed images. To address these problems, we propose a novel source-free UDA method that uses only a pre-trained source model and unlabeled target images. Our method captures the aleatoric uncertainty by incorporating data augmentation and trains the feature generator with two consistency objectives. The feature generator is encouraged to learn consistent visual features away from the decision boundaries of the head classifier. Thus, the adapted model becomes more robust to image perturbations. Inspired by self-supervised learning, our method promotes inter-space alignment between the prediction space and the feature space while incorporating intra-space consistency within the feature space to reduce the domain gap between the source and target domains. We also consider epistemic uncertainty to boost the model adaptation performance. Extensive experiments on popular UDA benchmark datasets demonstrate that the proposed source-free method is comparable or even superior to vanilla UDA methods. Moreover, the adapted models show more robust results when input images are perturbed.
Affiliation(s)
- JoonHo Lee
- Machine Learning Research Center, Samsung SDS Technology Research, Republic of Korea; Department of Electronic and IT Media Engineering, Seoul National University of Science and Technology, Republic of Korea.
- Gyemin Lee
- Department of Electronic and IT Media Engineering, Seoul National University of Science and Technology, Republic of Korea.
10
Ren CX, Luo YW, Dai DQ. BuresNet: Conditional Bures Metric for Transferable Representation Learning. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2023; 45:4198-4213. [PMID: 35830411 DOI: 10.1109/tpami.2022.3190645] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
As a fundamental way of learning and cognition, transfer learning has attracted widespread attention in recent years. Typical transfer learning tasks include unsupervised domain adaptation (UDA) and few-shot learning (FSL), which both attempt to sufficiently transfer discriminative knowledge from the training environment to the test environment to improve the model's generalization performance. Previous transfer learning methods usually ignore the potential conditional distribution shift between environments. This leads to degraded discriminability in the test environments. Therefore, how to construct a learnable and interpretable metric to measure and then reduce the gap between conditional distributions is very important in the literature. In this article, we design the Conditional Kernel Bures (CKB) metric for characterizing conditional distribution discrepancy, and derive an empirical estimation with convergence guarantee. CKB provides a statistical and interpretable approach, under the optimal transportation framework, to understand the knowledge transfer mechanism. It is essentially an extension of optimal transportation from the marginal distributions to the conditional distributions. CKB can be used as a plug-and-play module and placed onto the loss layer in deep networks; thus, it plays the bottleneck role in representation learning. From this perspective, the new method with network architecture is abbreviated as BuresNet, and it can be used to extract conditional invariant features for both UDA and FSL tasks. BuresNet can be trained in an end-to-end manner. Extensive experimental results on several benchmark datasets validate the effectiveness of BuresNet.
11
Moradi M, Hamidzadeh J. A domain adaptation method by incorporating belief function in twin quarter-sphere SVM. Knowl Inf Syst 2023. [DOI: 10.1007/s10115-023-01857-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/29/2023]
12
Uncertainty-guided joint unbalanced optimal transport for unsupervised domain adaptation. Neural Comput Appl 2023. [DOI: 10.1007/s00521-022-07976-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/12/2023]
13
Kurcuma: a kitchen utensil recognition collection for unsupervised domain adaptation. Pattern Anal Appl 2023. [DOI: 10.1007/s10044-023-01147-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/19/2023]
Abstract
The use of deep learning makes it possible to achieve extraordinary results in all kinds of tasks related to computer vision. However, this performance is strongly related to the availability of training data and its relationship with the distribution in the eventual application scenario. This question is of vital importance in areas such as robotics, where the targeted environment data are barely available in advance. In this context, domain adaptation (DA) techniques are especially important to building models that deal with new data for which the corresponding label is not available. To promote further research in DA techniques applied to robotics, this work presents Kurcuma (Kitchen Utensil Recognition Collection for Unsupervised doMain Adaptation), an assortment of seven datasets for the classification of kitchen utensils, a task of relevance in home-assistance robotics and a suitable showcase for DA. Along with the data, we provide a broad description of the main characteristics of the dataset, as well as a baseline using the well-known domain-adversarial training of neural networks approach. The results show the challenge posed by DA on these types of tasks, pointing to the need for new approaches in future work.
14
HOMDA: High-Order Moment-Based Domain Alignment for unsupervised domain adaptation. Knowl Based Syst 2023. [DOI: 10.1016/j.knosys.2022.110205] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/23/2022]
15
Decomposed adversarial domain generalization. Knowl Based Syst 2023. [DOI: 10.1016/j.knosys.2023.110300] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/15/2023]
16
A Two-branch Symmetric Domain Adaptation Neural Network Based on Ulam Stability Theory. Inf Sci (N Y) 2023. [DOI: 10.1016/j.ins.2023.01.096] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/26/2023]
17
A family of pairwise multi-marginal optimal transports that define a generalized metric. Mach Learn 2022. [DOI: 10.1007/s10994-022-06280-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/24/2022]
18
Aritake T, Hino H. Unsupervised Domain Adaptation for Extra Features in the Target Domain Using Optimal Transport. Neural Comput 2022; 34:2432-2466. [DOI: 10.1162/neco_a_01549] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2022] [Accepted: 08/16/2022] [Indexed: 11/09/2022]
Abstract
Domain adaptation aims to transfer knowledge of labeled instances obtained from a source domain to a target domain to fill the gap between the domains. Most domain adaptation methods assume that the source and target domains have the same dimensionality. Methods that are applicable when the number of features is different in each domain have rarely been studied, especially when no label information is given for the test data obtained from the target domain. In this letter, it is assumed that common features exist in both domains and that extra (new additional) features are observed in the target domain; hence, the dimensionality of the target domain is higher than that of the source domain. To leverage the homogeneity of the common features, the adaptation between these source and target domains is formulated as an optimal transport (OT) problem. In addition, a learning bound in the target domain for the proposed OT-based method is derived. The proposed algorithm is validated using both simulated and real-world data.
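To make the optimal transport formulation above more concrete, here is a minimal NumPy sketch of a transport plan computed from the features that both domains share. It is not the authors' algorithm: the entropic (Sinkhorn) solver is a swapped-in technique chosen for brevity, and the names `common_feature_cost` and `sinkhorn_plan`, the uniform sample weights, and the parameters `reg` and `n_iters` are all assumptions.

```python
import numpy as np


def common_feature_cost(source, target, n_common):
    """Squared Euclidean cost using only the first n_common (shared) features.

    Assumption: the shared features occupy the leading columns of both arrays.
    """
    diff = source[:, None, :n_common] - target[None, :, :n_common]
    return np.sum(diff ** 2, axis=-1)


def sinkhorn_plan(cost, reg=0.5, n_iters=500):
    """Entropy-regularized transport plan between uniformly weighted samples."""
    n, m = cost.shape
    a = np.full(n, 1.0 / n)  # uniform source weights (assumption)
    b = np.full(m, 1.0 / m)  # uniform target weights (assumption)
    kernel = np.exp(-cost / reg)
    u = np.ones(n)
    for _ in range(n_iters):
        # Alternate scaling so that column and row marginals match b and a.
        v = b / (kernel.T @ u)
        u = a / (kernel @ v)
    return u[:, None] * kernel * v[None, :]
```

Each row of the resulting plan says how one source sample's mass is spread over the target samples. Rescaling the cost (for example by its maximum) before calling `sinkhorn_plan` keeps `np.exp(-cost / reg)` away from numerical underflow.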
Affiliation(s)
- Hideitsu Hino
- Institute of Statistical Mathematics, Tachikawa, Tokyo, 190-8562, Japan
- RIKEN AIP, Nihon-bashi, Chuo-ku, Tokyo 103-0027, Japan
19
Lee J, Lee G. Unsupervised Domain Adaptation Based on the Predictive Uncertainty of Models. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.11.070] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]
20
Heterogeneous domain adaptation by semantic distribution alignment network. APPL INTELL 2022. [DOI: 10.1007/s10489-022-03296-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]
21
Fatras K, Damodaran BB, Lobry S, Flamary R, Tuia D, Courty N. Wasserstein Adversarial Regularization for Learning With Label Noise. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2022; 44:7296-7306. [PMID: 34232864 DOI: 10.1109/tpami.2021.3094662] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/13/2023]
Abstract
Noisy labels often occur in vision datasets, especially when they are obtained from crowdsourcing or Web scraping. We propose a new regularization method, which enables learning robust classifiers in the presence of noisy data. To achieve this goal, we propose a new adversarial regularization scheme based on the Wasserstein distance. Using this distance allows taking into account specific relations between classes by leveraging the geometric properties of the label space. Our Wasserstein Adversarial Regularization (WAR) encodes a selective regularization, which promotes smoothness of the classifier between some classes, while preserving sufficient complexity of the decision boundary between others. We first discuss how and why adversarial regularization can be used in the context of noise and then show the effectiveness of our method on five datasets corrupted with noisy labels: in both benchmarks and real datasets, WAR outperforms the state-of-the-art competitors.
22
Hierarchical optimal transport for unsupervised domain adaptation. Mach Learn 2022. [DOI: 10.1007/s10994-022-06231-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/14/2022]
23
Wang H, Tian J, Li S, Zhao H, Wu F, Li X. Structure-conditioned adversarial learning for unsupervised domain adaptation. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.04.094] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
24
Deng Z, Zhou K, Li D, He J, Song YZ, Xiang T. Dynamic Instance Domain Adaptation. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2022; 31:4585-4597. [PMID: 35776810 DOI: 10.1109/tip.2022.3186531] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
Most existing studies on unsupervised domain adaptation (UDA) assume that each domain's training samples come with domain labels (e.g., painting, photo). Samples from each domain are assumed to follow the same distribution and the domain labels are exploited to learn domain-invariant features via feature alignment. However, such an assumption often does not hold true: there often exist numerous finer-grained domains (e.g., dozens of modern painting styles have been developed, each differing dramatically from those of the classic styles). Therefore, forcing feature distribution alignment across each artificially-defined and coarse-grained domain can be ineffective. In this paper, we address both single-source and multi-source UDA from a completely different perspective, which is to view each instance as a fine domain. Feature alignment across domains is thus redundant. Instead, we propose to perform dynamic instance domain adaptation (DIDA). Concretely, a dynamic neural network with adaptive convolutional kernels is developed to generate instance-adaptive residuals to adapt domain-agnostic deep features to each individual instance. This enables a shared classifier to be applied to both source and target domain data without relying on any domain annotation. Further, instead of imposing intricate feature alignment losses, we adopt a simple semi-supervised learning paradigm using only a cross-entropy loss for both labeled source and pseudo labeled target data. Our model, dubbed DIDA-Net, achieves state-of-the-art performance on several commonly used single-source and multi-source UDA datasets including Digits, Office-Home, DomainNet, Digit-Five, and PACS.
|
25
|
Zhu Y, Venugopalan J, Zhang Z, Chanani NK, Maher KO, Wang MD. Domain Adaptation Using Convolutional Autoencoder and Gradient Boosting for Adverse Events Prediction in the Intensive Care Unit. Front Artif Intell 2022; 5:640926. [PMID: 35481281 PMCID: PMC9036368 DOI: 10.3389/frai.2022.640926] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2020] [Accepted: 02/23/2022] [Indexed: 11/13/2022]
Abstract
More than 5 million patients are admitted annually to intensive care units (ICUs) in the United States. The leading causes of mortality are cardiovascular failures, multi-organ failures, and sepsis. Data-driven techniques have been used in the analysis of patient data to predict adverse events, such as ICU mortality and ICU readmission. These models often make use of temporal or static features from a single ICU database to make predictions on subsequent adverse events. To explore the potential of domain adaptation, we propose a method of data analysis using gradient boosting and a convolutional autoencoder (CAE) to predict significant adverse events in the ICU, such as ICU mortality and ICU readmission. We demonstrate our results from a retrospective data analysis using patient records from a publicly available database called Multi-parameter Intelligent Monitoring in Intensive Care-II (MIMIC-II) and a local database from Children's Healthcare of Atlanta (CHOA). We demonstrate that after adopting novel data imputation on patient ICU data, gradient boosting is effective in both the mortality prediction task and the ICU readmission prediction task. In addition, we use gradient boosting to identify top-ranking temporal and non-temporal features in both prediction tasks. We discuss the relationship between these features and the specific prediction task. Lastly, we indicate that CAE might not be effective in feature extraction on one dataset, but domain adaptation with CAE feature extraction across two datasets shows promising results.
Affiliation(s)
- Yuanda Zhu: School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA, United States
- Janani Venugopalan: Biomedical Engineering Department, Georgia Institute of Technology, Emory University, Atlanta, GA, United States
- Zhenyu Zhang: Biomedical Engineering Department, Georgia Institute of Technology, Atlanta, GA, United States; Department of Biomedical Engineering, Peking University, Beijing, China
- Kevin O. Maher: Pediatrics Department, Emory University, Atlanta, GA, United States
- May D. Wang (correspondence): Biomedical Engineering Department, Georgia Institute of Technology, Emory University, Atlanta, GA, United States
|
26
|
Unsupervised Domain Adaptation for LiDAR Panoptic Segmentation. IEEE Robot Autom Lett 2022. [DOI: 10.1109/lra.2022.3147326] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
|
27
|
Xu B, Zeng Z, Lian C, Ding Z. Few-Shot Domain Adaptation via Mixup Optimal Transport. IEEE Transactions on Image Processing 2022; 31:2518-2528. [PMID: 35275818 DOI: 10.1109/tip.2022.3157139] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Unsupervised domain adaptation aims to learn a classification model for the target domain without any labeled samples by transferring the knowledge from the source domain with sufficient labeled samples. The source and the target domains usually share the same label space but have different data distributions. In this paper, we consider a more difficult but insufficiently explored problem, named few-shot domain adaptation, where a classifier should generalize well to the target domain given only a small number of examples in the source domain. In such a problem, we recast the link between the source and target samples by a mixup optimal transport model. The mixup mechanism is integrated into optimal transport to perform the few-shot adaptation by learning the cross-domain alignment matrix and domain-invariant classifier simultaneously to augment the source distribution and align the two probability distributions. Moreover, spectral shrinkage regularization is deployed to improve the transferability and discriminability of the mixup optimal transport model by utilizing all singular eigenvectors. Experiments conducted on several domain adaptation tasks demonstrate the effectiveness of our proposed model in dealing with the few-shot domain adaptation problem compared with state-of-the-art methods.
|
28
|
Decomposed-distance weighted optimal transport for unsupervised domain adaptation. Appl Intell 2022. [DOI: 10.1007/s10489-021-03112-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022]
|
29
|
Domain Adaptation in Robotics: A Study Case on Kitchen Utensil Recognition. Pattern Recognition and Image Analysis 2022. [DOI: 10.1007/978-3-031-04881-4_29] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]
|
30
|
Gallego AJ, Calvo-Zaragoza J, Fisher RB. Incremental Unsupervised Domain-Adversarial Training of Neural Networks. IEEE Transactions on Neural Networks and Learning Systems 2021; 32:4864-4878. [PMID: 33027004 DOI: 10.1109/tnnls.2020.3025954] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 06/11/2023]
Abstract
In the context of supervised statistical learning, it is typically assumed that the training set comes from the same distribution that draws the test samples. When this is not the case, the behavior of the learned model is unpredictable and becomes dependent upon the degree of similarity between the distribution of the training set and the distribution of the test set. One of the research topics that investigates this scenario is referred to as domain adaptation (DA). Deep neural networks brought dramatic advances in pattern recognition and that is why there have been many attempts to provide good DA algorithms for these models. Herein we take a different avenue and approach the problem from an incremental point of view, where the model is adapted to the new domain iteratively. We make use of an existing unsupervised domain-adaptation algorithm to identify the target samples on which there is greater confidence about their true label. The output of the model is analyzed in different ways to determine the candidate samples. The selected samples are then added to the source training set by self-labeling, and the process is repeated until all target samples are labeled. This approach implements a form of adversarial training in which, by moving the self-labeled samples from the target to the source set, the DA algorithm is forced to look for new features after each iteration. Our results report a clear improvement with respect to the non-incremental case in several data sets, also outperforming other state-of-the-art DA algorithms.
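The incremental loop this abstract describes (predict on the remaining target samples, keep the most confident ones, move them into the training set with their predicted labels, retrain, repeat) can be sketched roughly as follows. This is only an illustration under stated assumptions: a nearest-centroid classifier stands in for the unsupervised domain-adaptation network, confidence is a softmax over negative squared distances, and the names `fit_centroids`, `predict`, `incremental_self_label`, and `per_round` are hypothetical rather than taken from the paper.

```python
import math

def fit_centroids(X, y, n_classes):
    """Stand-in model (an assumption, not the paper's DA network):
    one mean feature vector per class. Every class must appear in y."""
    dim = len(X[0])
    sums = [[0.0] * dim for _ in range(n_classes)]
    counts = [0] * n_classes
    for x, c in zip(X, y):
        counts[c] += 1
        for j, v in enumerate(x):
            sums[c][j] += v
    return [[s / counts[c] for s in sums[c]] for c in range(n_classes)]

def predict(centroids, x):
    """Return (label, confidence) using a softmax over negative squared distances."""
    d2 = [sum((a - b) ** 2 for a, b in zip(x, c)) for c in centroids]
    nearest = min(d2)
    weights = [math.exp(-(d - nearest)) for d in d2]
    label = d2.index(nearest)
    return label, weights[label] / sum(weights)

def incremental_self_label(Xs, ys, Xt, n_classes, per_round=10):
    """Move the most confident target samples into the training set, round by round."""
    train_X, train_y = list(Xs), list(ys)
    remaining = list(range(len(Xt)))
    pseudo = [None] * len(Xt)
    while remaining:
        # Retrain on the source data plus all self-labeled target samples so far.
        centroids = fit_centroids(train_X, train_y, n_classes)
        scored = [(predict(centroids, Xt[i]), i) for i in remaining]
        scored.sort(key=lambda item: -item[0][1])  # most confident first
        selected = scored[:per_round]
        for (label, _confidence), i in selected:
            pseudo[i] = label
            train_X.append(Xt[i])
            train_y.append(label)
        moved = {i for _, i in selected}
        remaining = [i for i in remaining if i not in moved]
    return pseudo
```

Because the training set grows every round, the stand-in model is refit on progressively more target data, which is the incremental aspect; the adversarial component of the original method is not reproduced here.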
|
31
|
Unsupervised domain adaptation with non-stochastic missing data. Data Min Knowl Discov 2021. [DOI: 10.1007/s10618-021-00775-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
|
32
|
Zhou K, Yang Y, Qiao Y, Xiang T. Domain Adaptive Ensemble Learning. IEEE Transactions on Image Processing 2021; 30:8008-8018. [PMID: 34534081 DOI: 10.1109/tip.2021.3112012] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Indexed: 06/13/2023]
Abstract
The problem of generalizing deep neural networks from multiple source domains to a target one is studied under two settings: when unlabeled target data is available, it is a multi-source unsupervised domain adaptation (UDA) problem; otherwise, it is a domain generalization (DG) problem. We propose a unified framework termed domain adaptive ensemble learning (DAEL) to address both problems. A DAEL model is composed of a CNN feature extractor shared across domains and multiple classifier heads, each trained to specialize in a particular source domain. Each such classifier is an expert to its own domain but a non-expert to others. DAEL aims to learn these experts collaboratively so that when forming an ensemble, they can leverage complementary information from each other to be more effective for an unseen target domain. To this end, each source domain is used in turn as a pseudo-target-domain with its own expert providing a supervisory signal to the ensemble of non-experts learned from the other sources. To deal with unlabeled target data under the UDA setting, where a real expert does not exist, DAEL uses pseudo labels to supervise the ensemble learning. Extensive experiments on three multi-source UDA datasets and two DG datasets show that DAEL improves the state of the art on both problems, often by significant margins.
|
33
|
|
34
|
Jeon H, Lee S, Kang U. Unsupervised multi-source domain adaptation with no observable source data. PLoS One 2021; 16:e0253415. [PMID: 34242258 PMCID: PMC8270218 DOI: 10.1371/journal.pone.0253415] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/07/2021] [Accepted: 06/04/2021] [Indexed: 11/18/2022]
Abstract
Given trained models from multiple source domains, how can we predict the labels of unlabeled data in a target domain? Unsupervised multi-source domain adaptation (UMDA) aims to predict the labels of unlabeled target data by transferring the knowledge of multiple source domains. UMDA is a crucial problem in many real-world scenarios where no labeled target data are available. Previous approaches in UMDA assume that data are observable over all domains. However, source data are not easily accessible due to privacy or confidentiality issues in many practical scenarios, although classifiers learned in source domains are readily available. In this work, we target data-free UMDA where source data are not observable at all, a novel problem that has not been studied before despite being very realistic and crucial. To solve data-free UMDA, we propose DEMS (Data-free Exploitation of Multiple Sources), a novel architecture that adapts target data to source domains without exploiting any source data, and estimates the target labels by exploiting pre-trained source classifiers. Extensive experiments for data-free UMDA on real-world datasets show that DEMS provides state-of-the-art accuracy, up to 27.5 percentage points higher than that of the best baseline.
Affiliation(s)
- Hyunsik Jeon: Seoul National University, Seoul, Republic of Korea
- Seongmin Lee: Seoul National University, Seoul, Republic of Korea
- U Kang: Seoul National University, Seoul, Republic of Korea
|
35
|
Abstract
This paper reviews the current state of the art in artificial intelligence (AI) technologies and applications in the context of the creative industries. A brief background of AI, and specifically machine learning (ML) algorithms, is provided, including convolutional neural networks (CNNs), generative adversarial networks (GANs), recurrent neural networks (RNNs) and deep reinforcement learning (DRL). We categorize creative applications into five groups, related to how AI technologies are used: (i) content creation, (ii) information analysis, (iii) content enhancement and post production workflows, (iv) information extraction and enhancement, and (v) data compression. We critically examine the successes and limitations of this rapidly advancing technology in each of these areas. We further differentiate between the use of AI as a creative tool and its potential as a creator in its own right. We foresee that, in the near future, ML-based AI will be adopted widely as a tool or collaborative assistant for creativity. In contrast, we observe that the successes of ML in domains with fewer constraints, where AI is the ‘creator’, remain modest. The potential of AI (or its developers) to win awards for its original creations in competition with human creatives is also limited, based on contemporary technologies. We therefore conclude that, in the context of creative industries, maximum benefit from AI will be derived where its focus is human-centric, that is, where it is designed to augment, rather than replace, human creativity.
|
36
|
Li L, Wan Z, He H. Dual Alignment for Partial Domain Adaptation. IEEE Transactions on Cybernetics 2021; 51:3404-3416. [PMID: 32356766 DOI: 10.1109/tcyb.2020.2983337] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 06/11/2023]
Abstract
Partial domain adaptation (PDA) aims to transfer knowledge from a label-rich source domain to a label-scarce target domain based on an assumption that the source label space subsumes the target label space. The major challenge is to promote positive transfer in the shared label space and circumvent negative transfer caused by the large mismatch across different label spaces. In this article, we propose a dual alignment approach for PDA (DAPDA), including three components: 1) a feature extractor extracts source and target features by the Siamese network; 2) a reweighting network produces "hard" labels, class-level weights for source features and "soft" labels, instance-level weights for target features; 3) a dual alignment network aligns intradomain and interdomain distributions. Specifically, the intradomain alignment aims to minimize the intraclass variances to enhance the intraclass compactness in both domains, and the interdomain alignment attempts to reduce the discrepancies across domains by domain-wise and class-wise adaptations. The negative transfer can be alleviated by down-weighting source features with nonshared labels. The positive transfer can be enhanced by up-weighting source features with shared labels. The adaptation can be achieved by minimizing the discrepancies based on class-weighted source data with hard labels and instance-weighted target data with soft labels. The effectiveness of our method has been demonstrated by outperforming state-of-the-art PDA methods on several benchmark datasets.
|
37
|
Dong J, Long Z, Mao X, Lin C, He Y, Ji S. Multi-level Alignment Network for Domain Adaptive Cross-modal Retrieval. Neurocomputing 2021. [DOI: 10.1016/j.neucom.2021.01.114] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 11/26/2022]
|
38
|
Huang Y, Wu Q, Xu J, Zhong Y, Zhang Z. Unsupervised Domain Adaptation with Background Shift Mitigating for Person Re-Identification. Int J Comput Vis 2021. [DOI: 10.1007/s11263-021-01474-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 10/21/2022]
|
39
|
|
40
|
Lucas B, Pelletier C, Schmidt D, Webb GI, Petitjean F. A Bayesian-inspired, deep learning-based, semi-supervised domain adaptation technique for land cover mapping. Mach Learn 2021. [DOI: 10.1007/s10994-020-05942-z] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 11/30/2022]
|
41
|
Kouw WM, Loog M. A Review of Domain Adaptation without Target Labels. IEEE Transactions on Pattern Analysis and Machine Intelligence 2021; 43:766-785. [PMID: 31603771 DOI: 10.1109/tpami.2019.2945942] [Citation(s) in RCA: 105] [Impact Index Per Article: 35.0] [Indexed: 05/22/2023]
Abstract
Domain adaptation has become a prominent problem setting in machine learning and related fields. This review asks the question: How can a classifier learn from a source domain and generalize to a target domain? We present a categorization of approaches, divided into what we refer to as sample-based, feature-based, and inference-based methods. Sample-based methods focus on weighting individual observations during training based on their importance to the target domain. Feature-based methods revolve around mapping, projecting, and representing features such that a source classifier performs well on the target domain. Inference-based methods incorporate adaptation into the parameter estimation procedure, for instance through constraints on the optimization procedure. Additionally, we review a number of conditions that allow for formulating bounds on the cross-domain generalization error. Our categorization highlights recurring ideas and raises questions important to further research.
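As a rough illustration of the sample-based idea in this abstract, the sketch below weights each source observation by the density ratio w(x) = p_target(x) / p_source(x) and uses a self-normalized weighted average. It assumes, purely for illustration, that both domains are one-dimensional Gaussians with known parameters; in practice the densities are unknown and the weights have to be estimated. The function names are hypothetical and are not drawn from the review.

```python
import math
import random

def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def importance_weights(xs, source, target):
    """w(x) = p_target(x) / p_source(x); `source` and `target` are (mean, std) pairs."""
    return [gaussian_pdf(x, *target) / gaussian_pdf(x, *source) for x in xs]

def weighted_mean(values, weights):
    """Self-normalized weighted average, e.g. of per-sample losses."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Assumed toy setup: source N(0, 1), target N(1, 1).
random.seed(0)
source, target = (0.0, 1.0), (1.0, 1.0)
xs = [random.gauss(*source) for _ in range(20000)]
w = importance_weights(xs, source, target)

# Reweighting source samples should move the estimate toward the target mean.
estimate = weighted_mean(xs, w)
```

The same `weighted_mean` could be applied to per-sample training losses instead of raw values, which is how the weights would enter a training objective.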
|
42
|
Wilson G, Cook DJ. A Survey of Unsupervised Deep Domain Adaptation. ACM Transactions on Intelligent Systems and Technology 2020; 11:1-46. [PMID: 34336374 PMCID: PMC8323662 DOI: 10.1145/3400066] [Citation(s) in RCA: 126] [Impact Index Per Article: 31.5] [Received: 03/01/2019] [Accepted: 05/01/2020] [Indexed: 10/23/2022]
Abstract
Deep learning has produced state-of-the-art results for a variety of tasks. While such approaches for supervised learning have performed well, they assume that training and testing data are drawn from the same distribution, which may not always be the case. As a complement to this challenge, single-source unsupervised domain adaptation can handle situations where a network is trained on labeled data from a source domain and unlabeled data from a related but different target domain with the goal of performing well at test-time on the target domain. Many single-source and typically homogeneous unsupervised deep domain adaptation approaches have thus been developed, combining the powerful, hierarchical representations from deep learning with domain adaptation to reduce reliance on potentially-costly target data labels. This survey will compare these approaches by examining alternative methods, the unique and common elements, results, and theoretical insights. We follow this with a look at application areas and open research directions.
|
43
|
Chum L, Subramanian A, Balasubramanian VN, Jawahar CV. Beyond Supervised Learning: A Computer Vision Perspective. J Indian Inst Sci 2019. [DOI: 10.1007/s41745-019-0099-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 11/24/2022]
|