Overcoming learning bias via Prototypical Feature Compensation for source-free domain adaptation

TL;DR

This paper addresses learning bias in source-free unsupervised domain adaptation by proposing a Prototypical Feature Compensation network that extracts source domain features to reduce discrepancy with the target domain, resulting in improved feature alignment and superior performance across three datasets.

Abstract

The focus of Source-free Unsupervised Domain Adaptation (SFUDA) is to effectively transfer a well-trained model from the source domain to an unlabelled target domain. During target domain adaptation, the source domain data is no longer accessible. Prevalent methodologies attempt to align the data distributions of the source and target domains, utilizing pseudo-labels to impart categorical information, which has made some progress in improving the model’s performance. However, performance impairments persist due to the learning bias introduced by the source model and the impact of noisy pseudo-labels generated for the target domain. In this research, we reveal that the central cause of feature misalignment during domain transition is the learning bias, which is generated by the discrepancy of information between source and target domain data. The source domain data may contain distinguishable features that do not appear in the target domain, which causes the pre-trained source model to fail during domain adaptation. To overcome the information discrepancy, we propose a Prototypical Feature Compensation (PFC) Network. The network extracts representative feature maps of the source domain and then uses them to minimize the discrepancy information in the target domain feature maps. This mechanism facilitates feature alignment across different domains, allowing the model to generate more accurate categorical data through pseudo-labelling. The experimental results and ablation studies demonstrate exceptional performance on three SFUDA datasets and provide evidence of the proposed PFC method’s ability to adjust the feature distribution of both source and target domain data, ensuring their overlap in the latent space.
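To make the role of prototypes and pseudo-labelling in the abstract more concrete, here is a minimal sketch of generic nearest-prototype pseudo-labelling. It is not the PFC network itself: the function names `class_prototypes`, `cosine`, and `pseudo_label` are hypothetical, features are plain Python lists, and the use of cosine similarity is an assumption made for illustration.

```python
import math
from collections import defaultdict


def class_prototypes(features, labels):
    """Average the feature vectors of each class to get one prototype per class."""
    sums = {}
    counts = defaultdict(int)
    for vec, lab in zip(features, labels):
        if lab not in sums:
            sums[lab] = [0.0] * len(vec)
        sums[lab] = [s + v for s, v in zip(sums[lab], vec)]
        counts[lab] += 1
    return {lab: [s / counts[lab] for s in total] for lab, total in sums.items()}


def cosine(a, b):
    """Cosine similarity between two vectors; returns 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def pseudo_label(feature, prototypes):
    """Assign the label of the most similar prototype (assumed rule, not the paper's)."""
    return max(prototypes, key=lambda lab: cosine(feature, prototypes[lab]))
```

In a source-free setting the raw source data is unavailable, so a sketch like this presumes that class prototypes have already been obtained in some other way, for example from representative features supplied alongside the source model; each target feature is then labelled by its closest prototype.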

Similar Papers
  • Research Article
  • 10.54097/78qk1974
Systematic Analysis of Source-Free Domain Adaptation Methods
  • Mar 27, 2025
  • Frontiers in Computing and Intelligent Systems
  • Zhiyi Miao

Source-Free Domain Adaptation (SFDA) aims to address the challenge of effectively transferring a source domain model to a target domain when the target domain data is unlabeled and the source domain data is unavailable. Traditional Unsupervised Domain Adaptation (UDA) methods rely on simultaneous access to both source and target domain data. However, in many practical scenarios, such as medical data privacy protection or resource-constrained devices, direct access to source domain data is not feasible. SFDA leverages only a pre-trained source domain model and unlabeled target domain data to update the model, avoiding the direct use of source domain data and thereby meeting privacy and security requirements. This paper provides a systematic classification and review of SFDA research methods, categorizing them into three main types: data-related methods, model-related methods, and loss-related methods. Data-related methods replace missing source data by extracting data or feature augmentation information from pre-trained models; model-related methods reduce domain discrepancies by optimizing feature representations or utilizing information in the feature space; and loss-related methods enhance the model's generalization ability through specific loss functions. This paper aims to offer a clear research roadmap for researchers in the field by systematically classifying and analyzing existing SFDA methods, facilitating the selection of appropriate methods or the development of new strategies to address specific problems.

  • Research Article
  • Cite Count: 37
  • 10.1111/mice.12617
Reducing the effect of sample bias for small data sets with double‐weighted support vector transfer regression
  • Sep 1, 2020
  • Computer-Aided Civil and Infrastructure Engineering
  • Huan Luo + 1 more


  • Research Article
  • Cite Count: 5
  • 10.1109/tim.2024.3396831
Dual Structural Consistent Partial Domain Adaptation Network for Intelligent Machinery Fault Diagnosis
  • Jan 1, 2024
  • IEEE Transactions on Instrumentation and Measurement
  • Kun Yu + 5 more

In industrial scenarios, the source domain (SD) data typically encompasses condition monitoring (CM) data from all machines within a workshop or factory setting, while the target domain (TD) data may only include CM data from one or a small number of machines. The intelligent diagnostic method based on partial domain adaptation (PDA) represents a powerful tool for aligning features between SD and TD data within partial categories. However, existing PDA techniques can only align either the marginal or conditional distributions between SD and TD data within the shared label space, but not both simultaneously. To overcome this limitation, our study introduces a dual structural consistent PDA network. This network leverages the vision transformer as its foundation, ensuring effective extraction of distinguishable features from both SD and TD data. A weight balance mechanism is integrated into the partial adversarial training process, facilitating marginal distribution alignment between SD and TD data within the shared label space. Additionally, a knowledge distillation based approach is employed for conditional distribution alignment across the two structural consistent networks, ensuring consistency in predictions for TD data. The effectiveness of our proposed method is demonstrated through its application on two sets of experimental faulty data, confirming its ability to provide a feature distribution that is not affected by domain changes but is discriminative for different classes when dealing with PDA tasks.

  • Conference Article
  • Cite Count: 5
  • 10.1109/bigdata50022.2020.9377756
Deep Domain Adaptation based Cloud Type Detection using Active and Passive Satellite Data
  • Dec 10, 2020
  • Xin Huang + 6 more

Domain adaptation techniques have been developed to handle data from multiple sources or domains. Most existing domain adaptation models assume that source and target domains are homogeneous, i.e., they have the same feature space. Nevertheless, many real world applications often deal with data from heterogeneous domains that come from completely different feature spaces. In our remote sensing application, data in source domain (from an active spaceborne Lidar sensor CALIOP onboard CALIPSO satellite) contain 25 attributes, while data in target domain (from a passive spectroradiometer sensor VIIRS onboard Suomi-NPP satellite) contain 20 different attributes. CALIOP has better representation capability and sensitivity to aerosol types and cloud phase, while VIIRS has wide swaths and better spatial coverage but has inherent weakness in differentiating atmospheric objects on different vertical levels. To address this mismatch of features across the domains/sensors, we propose a novel end-to-end deep domain adaptation with domain mapping and correlation alignment (DAMA) to align the heterogeneous source and target domains in active and passive satellite remote sensing data. It can learn domain invariant representation from source and target domains by transferring knowledge across these domains, and achieve additional performance improvement by incorporating weak label information into the model (DAMA-WL). Our experiments on a collocated CALIOP and VIIRS dataset show that DAMA and DAMA-WL can achieve higher classification accuracy in predicting cloud types.

  • Research Article
  • Cite Count: 50
  • 10.1109/tgrs.2019.2962039
A MultiKernel Domain Adaptation Method for Unsupervised Transfer Learning on Cross-Source and Cross-Region Remote Sensing Data Classification
  • Jan 17, 2020
  • IEEE Transactions on Geoscience and Remote Sensing
  • Wei Liu + 1 more

Labeling remote sensing data for classification is labor-intensive and time-consuming. Transfer learning (TL), in this context, is attracting increasing attention as it aims to harness information from data sets of other regions where labels are readily available. The central concern is to homogenize the large disparities in feature distribution between different data sets through domain adaptation (DA). This article proposes a novel DA method for unsupervised TL, namely, multikernel jointly domain matching (MKJDM), which by definition considers multiple kernels as opposed to the currently popular single-kernel methods for measuring the distances between distributions. The single-kernel methods minimize the distance between the feature distributions of the source domain (data set with training labels) and the target domain (data set to be classified) through, for example, the maximum mean discrepancy (MMD) metric, formed under a kernel function mapping, while the multikernel version (MK-MMD) uses different kernel functions to encapsulate multiple aspects of distribution discrepancies, and is, therefore, more capable of distance minimization. Our MKJDM implementation also simultaneously aligns marginal and class-conditional distributions and reweights each instance, which further improves the performance. Two experiments performed on remote sensing images and multi-modal data sets (i.e., Orthophoto and Digital Surface Models), with regions of different countries with distinctly different land patterns serving as source and target domain data, show that the overall accuracies are improved by 37.28% and 46.62% after applying our MKJDM method. An additional comparative experiment with five state-of-the-art DA methods also demonstrates that our method achieves the best performance.
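The difference between single-kernel MMD and a multikernel variant can be illustrated with a small sketch. This is not the MKJDM method: it only computes the biased estimate of squared MMD with Gaussian kernels and averages it over several bandwidths. The uniform kernel weights, the bandwidth values, and the function names are assumptions for illustration; MK-MMD formulations typically use a weighted combination of kernels.

```python
import math


def gaussian_kernel(x, y, bandwidth):
    """Gaussian (RBF) kernel between two feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mean_kernel(xs, ys, bandwidth):
    """Average kernel value over all pairs drawn from xs and ys."""
    total = sum(gaussian_kernel(x, y, bandwidth) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd2(source, target, bandwidth):
    """Biased estimate of squared MMD under a single Gaussian kernel."""
    return (mean_kernel(source, source, bandwidth)
            + mean_kernel(target, target, bandwidth)
            - 2.0 * mean_kernel(source, target, bandwidth))


def mk_mmd2(source, target, bandwidths=(0.5, 1.0, 2.0)):
    """Squared MMD averaged over several bandwidths (uniform weights assumed)."""
    return sum(mmd2(source, target, b) for b in bandwidths) / len(bandwidths)
```

Calling `mk_mmd2` on two identical sample sets gives a value of zero up to floating-point error, and the value grows as the two sets move apart, which is the quantity a DA method of this kind would try to minimize.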

  • Conference Article
  • 10.1109/ijcnn55064.2022.9891979
Source Free Domain Adaptation via Combined Discriminative GAN Model for Image Classification
  • Jul 18, 2022
  • Yujie Liu + 4 more

The unsupervised domain adaptive classification task can learn domain-invariant features between the unlabeled target domain data and the labeled source domain data, thereby improving the classification performance of the classifier in the target domain. However, privacy protection and memory constraints often make it difficult to obtain labeled source domain samples, which creates a bottleneck for existing domain adaptation tasks. To this end, we propose a novel source-free domain adaptive classification model: without any source domain data, a classifier with good performance in the target domain can be obtained using only the source domain pre-trained classifier and the target domain data. The method first introduces a novel conditional information generative adversarial module based on combined discriminators. Through the confrontation between the combined discriminators and the generator, a middle domain with pseudo-labels is generated to solve the problem of the missing source domain. Then, when training the new classifier in the domain adaptation module, we add a distillation loss mechanism to deal with the lack of source domain data supervision, thereby minimizing the difference between the old classifier response and the new classifier response to ensure that the network output retains the source domain information. We conducted experiments on three groups of 10 data sets, which showed that our method can effectively solve the problem of source-free domain adaptive classification and improve the classification accuracy of the model in each domain.

  • Research Article
  • Cite Count: 12
  • 10.1016/j.engappai.2019.103267
A complex process fault diagnosis method based on manifold distribution adaptation
  • Oct 15, 2019
  • Engineering Applications of Artificial Intelligence
  • Xiaogang Wang + 1 more


  • Conference Article
  • Cite Count: 3
  • 10.5244/c.28.103
Modeling Sequential Domain Shift through Estimation of Optimal Sub-spaces for Categorization
  • Jan 1, 2014
  • Suranjana Samanta + 2 more

Domain adaptation (DA) is the process in which labeled training samples available from one domain are used to improve the performance of statistical tasks performed on test samples drawn from a different domain. The domain from which the training samples are obtained is termed the source domain, and the counterpart consisting of the test samples is termed the target domain. A few unlabeled training samples are also taken from the target domain in order to approximate its distribution. In this paper, we propose a new method of unsupervised DA, where a set of domain invariant sub-spaces is estimated using the geometrical and statistical properties of the source and target domains. This is a modification of the work done by Gopalan et al. [2], where the geodesic path from the principal components of the source to that of the target is considered in the Grassmann manifold, and the intermediary points are sampled to represent the incremental change in the geometric properties of the data in source and target domains. Instead of the geodesic path, we consider an alternate path of shortest length between the principal components of source and target, with the property that the intermediary sample points on the path form domain invariant sub-spaces using the concept of Maximum Mean Discrepancy (MMD) [3]. Thus we model the change in the geometric properties of data in both the domains sequentially, in a manner such that the distributions of projected data from both the domains always remain similar along the path. The entire formulation is done in the kernel space, which makes it more robust to non-linear transformations. Let $X$ and $Y$ be the source and target domains having $n_X$ and $n_Y$ instances respectively. If $\Phi(\cdot)$ is a universal kernel function, then in kernel space the source and target domains are $\Phi(X) \in \mathbb{R}^{n_X \times d}$ and $\Phi(Y) \in \mathbb{R}^{n_Y \times d}$ respectively. Let $K_{XX}$ and $K_{YY}$ be the kernel Gram matrices of $\Phi(X)$ and $\Phi(Y)$ respectively.
Let $D = [X; Y]$ denote the combined source and target domain data, and the corresponding data in kernel space is given as $\Phi(D)$. The kernel Gram matrix formed using $D$ is given by

  • Research Article
  • Cite Count: 34
  • 10.1016/j.engappai.2022.104932
Adversarial domain adaptation network with pseudo-siamese feature extractors for cross-bearing fault transfer diagnosis
  • May 17, 2022
  • Engineering Applications of Artificial Intelligence
  • Qunwang Yao + 4 more


  • Supplementary Content
  • 10.25394/pgs.12221597.v1
On Transfer Learning Techniques for Machine Learning
  • Apr 30, 2020
  • Figshare
  • Debasmit Das

Recent progress in machine learning has been mainly due to the availability of large amounts of annotated data used for training complex models with deep architectures. Annotating this training data becomes burdensome and creates a major bottleneck in maintaining machine-learning databases. Moreover, these trained models fail to generalize to new categories or new varieties of the same categories. This is because new categories or new varieties have data distribution different from the training data distribution. To tackle these problems, this thesis proposes to develop a family of transfer-learning techniques that can deal with different training (source) and testing (target) distributions with the assumption that the availability of annotated data is limited in the testing domain. This is done by using the auxiliary data-abundant source domain from which useful knowledge is transferred that can be applied to data-scarce target domain. This transferable knowledge serves as a prior that biases target-domain predictions and prevents the target-domain model from overfitting. Specifically, we explore structural priors that encode relational knowledge between different data entities, which provides more informative bias than traditional priors. The choice of the structural prior depends on the information availability and the similarity between the two domains. Depending on the domain similarity and the information availability, we divide the transfer learning problem into four major categories and propose different structural priors to solve each of these sub-problems. This thesis first focuses on the unsupervised-domain-adaptation problem, where we propose to minimize domain discrepancy by transforming labeled source-domain data to be close to unlabeled target-domain data. For this problem, the categories remain the same across the two domains and hence we assume that the structural relationship between the source-domain samples is carried over to the target domain. 
Thus, graph or hyper-graph is constructed as the structural prior from both domains and a graph/hyper-graph matching formulation is used to transform samples in the source domain to be closer to samples in the target domain. An efficient optimization scheme is then proposed to tackle the time and memory inefficiencies associated with the matching problem. The few-shot learning problem is studied next, where we propose to transfer knowledge from source-domain categories containing abundantly labeled data to novel categories in the target domain that contains only few labeled data. The knowledge transfer biases the novel category predictions and prevents the model from overfitting. The knowledge is encoded using a neural-network-based prior that transforms a data sample to its corresponding class prototype. This neural network is trained from the source-domain data and applied to the target-domain data, where it transforms the few-shot samples to the novel-class prototypes for better recognition performance. The few-shot learning problem is then extended to the situation, where we do not have access to the source-domain data but only have access to the source-domain class prototypes. In this limited information setting, parametric neural-network-based priors would overfit to the source-class prototypes and hence we seek a non-parametric-based prior using manifolds. A piecewise linear manifold is used as a structural prior to fit the source-domain-class prototypes. This structure is extended to the target domain, where the novel-class prototypes are found by projecting the few-shot samples onto the manifold. Finally, the zero-shot learning problem is addressed, which is an extreme case of the few-shot learning problem where we do not have any labeled data in the target domain. However, we have high-level information for both the source and target domain categories in the form of semantic descriptors. 
We learn the relation between the sample space and the semantic space, using a regularized neural network so that classification of the novel categories can be carried out in a common representation space. This same neural network is then used in the target domain to relate the two spaces. In case we want to generate data for the novel categories in the target domain, we can use a constrained generative adversarial network instead of a traditional neural network. Thus, we use structural priors like graphs, neural networks and manifolds to relate various data entities like samples, prototypes and semantics for these different transfer learning sub-problems. We explore additional post-processing steps like pseudo-labeling, domain adaptation and calibration and enforce algorithmic and architectural constraints to further improve recognition performance. Experimental results on standard transfer learning image recognition datasets produced competitive results with respect to previous work. Further experimentation and analyses of these methods provided better understanding of machine learning as well.

  • Research Article
  • Cite Count: 53
  • 10.1007/s10115-016-1021-1
Online transfer learning by leveraging multiple source domains
  • Jan 11, 2017
  • Knowledge and Information Systems
  • Qingyao Wu + 4 more

Transfer learning aims to enhance performance in a target domain by exploiting useful information from auxiliary or source domains when the labeled data in the target domain are insufficient or difficult to acquire. In some real-world applications, the data of source domain are provided in advance, but the data of target domain may arrive in a stream fashion. This kind of problem is known as online transfer learning. In practice, there can be several source domains that are related to the target domain. The performance of online transfer learning is highly associated with selected source domains, and simply combining the source domains may lead to unsatisfactory performance. In this paper, we seek to promote classification performance in a target domain by leveraging labeled data from multiple source domains in online setting. To achieve this, we propose a new online transfer learning algorithm that merges and leverages the classifiers of the source and target domain with an ensemble method. The mistake bound of the proposed algorithm is analyzed, and the comprehensive experiments on three real-world data sets illustrate that our algorithm outperforms the compared baseline algorithms.

  • Book Chapter
  • Cite Count: 1
  • 10.1007/978-3-030-04503-6_2
Unsupervised Domain Adaptation Dictionary Learning for Visual Recognition
  • Jan 1, 2018
  • Zhun Zhong + 3 more

Over the last few years, dictionary learning methods have been extensively applied to various computer vision recognition applications and have produced state-of-the-art results. However, when the data instances of a target domain have a different distribution than those of a source domain, the dictionary learning method may fail to perform well. In this paper, we address the cross-domain visual recognition problem and propose a simple but effective unsupervised domain adaptation approach, where labeled data are only from the source domain. In order to bring the original data in the source and target domains into the same distribution, the proposed method forces the nearest coupled data between the source and target domains to have identical sparse representations while jointly learning dictionaries for each domain, where the learned dictionaries can reconstruct the original data in the source and target domains respectively. The sparse representations of the original data can then be used to perform visual recognition tasks. We demonstrate the effectiveness of our approach on standard datasets. Our method performs on par with or better than competitive state-of-the-art methods.

  • Research Article
  • Cite Count: 1
  • 10.3233/jifs-223118
A novel Pseudo-label based domain adaptation method on tabular data
  • May 4, 2023
  • Journal of Intelligent & Fuzzy Systems
  • Chu Wang + 4 more

Tabular data is a widely used data form in many fields such as product marketing. In some cases, a domain shift between the source and target domains of tabular data may occur as collection conditions, such as time, change. The existing methods for tabular data mainly consist of neural-network-based methods and tree-based methods. Both face challenges induced by domain shift on tabular data. First, neural-network-based methods lack an effective mechanism to extract the features of tabular data, and their performance may not be higher than that of tree-based models. Second, tree-based methods lack effective feature representations to model the associations between the source domain and the target domain. To improve the performance of tree-based methods under domain shift, a novel pseudo-label based domain adaptation method is proposed for the tree-based method called Xgboost. The proposed method consists of pseudo-label generation and selection strategies. The pseudo-label generation strategy can control the effects of pseudo-labels on Xgboost in a more flexible way by setting proper values of pseudo-labels. The pseudo-label selection strategy can select the pseudo-labels with high confidence under a consistency condition based on the outputs of Xgboost. The quality of pseudo-labels for the data in the target domain is improved, and so is the performance of Xgboost trained on the data in both the source domain and the target domain. In the experiment, several UCI datasets and 5G terminal datasets are used to show that the proposed methods can effectively improve the performance of Xgboost.
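The selection strategy described above can be pictured with a small sketch. The abstract does not specify the consistency condition, so this example assumes one: a target sample is kept only if two successive rounds of predicted class probabilities agree on the label and the latest confidence reaches a threshold. The function name `select_pseudo_labels`, the threshold value, and the input format are hypothetical, and no Xgboost call is made here.

```python
def select_pseudo_labels(probs_prev, probs_curr, threshold=0.9):
    """Return (sample_index, label) pairs that are confident and consistent.

    probs_prev and probs_curr are lists of per-class probability lists from
    two successive prediction rounds (an assumed form of the consistency check).
    """
    selected = []
    for index, (p_prev, p_curr) in enumerate(zip(probs_prev, probs_curr)):
        label_prev = max(range(len(p_prev)), key=p_prev.__getitem__)
        label_curr = max(range(len(p_curr)), key=p_curr.__getitem__)
        # Keep the sample only if both rounds agree and the latest round is confident.
        if label_prev == label_curr and p_curr[label_curr] >= threshold:
            selected.append((index, label_curr))
    return selected
```

Samples that pass the filter would then be added, with their pseudo-labels, to the training data for the next round; samples whose predicted label flips between rounds are left out regardless of confidence.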

  • Book Chapter
  • Cite Count: 8
  • 10.1007/978-3-030-66415-2_36
Domain Adaptation for Eye Segmentation
  • Jan 1, 2020
  • Yiru Shen + 2 more

Domain adaptation (DA) has been widely investigated as a framework to alleviate the laborious task of data annotation for image segmentation. Most DA investigations operate under the unsupervised domain adaptation (UDA) setting, where the modeler has access to a large cohort of source domain labeled data and target domain data with no annotations. UDA techniques exhibit poor performance when the domain gap, i.e., the distribution overlap between the data in source and target domain is large. We hypothesize that the DA performance gap can be improved with the availability of a small subset of labeled target domain data. In this paper, we systematically investigate the impact of varying amounts of labeled target domain data on the performance gap for DA. We specifically focus on the problem of segmenting eye-regions from eye images collected using two different head mounted display systems. Source domain is comprised of 12,759 eye images with annotations and target domain is comprised of 4,629 images with varying amounts of annotations. Experiments are performed to compare the impact on DA performance gap under three schemes: unsupervised (UDA), supervised (SDA) and semi-supervised (SSDA) domain adaptation. We evaluate these schemes by measuring the mean intersection-over-union (mIoU) metric. Using only 200 samples of labeled target data under SDA and SSDA schemes, we show an improvement in mIoU of 5.4% and 6.6% respectively, over mIoU of 81.7% under UDA. By using all available labeled target data, models trained under SSDA achieve a competitive mIoU score of 89.8%. Overall, we conclude that availability of a small subset of target domain data with annotations can substantially improve DA performance.

  • Research Article
  • 10.1155/2020/8873137
Learning Transferable Convolutional Proxy by SMI-Based Matching Technique
  • Oct 14, 2020
  • Shock and Vibration
  • Wei Jin + 1 more

Domain-transfer learning is a machine learning task that exploits a source domain data set to help the learning problem in a target domain. Usually, the source domain has sufficient labeled data, while the target domain does not. In this paper, we propose a novel domain-transfer convolutional model by mapping a target domain data sample to a proxy in the source domain and applying a source domain model to the proxy for the purpose of prediction. In our framework, we first represent both source and target domains as feature vectors using two convolutional neural networks and then construct a proxy for each target domain sample in the source domain space. The proxy is supposed to match the convolutional representation vector of the corresponding target domain sample well. To measure the matching quality, we propose to maximize the squared-loss mutual information (SMI) between the proxy and the target domain samples. We further develop a novel neural SMI estimator based on a parametric density ratio estimation function. Moreover, we also propose to minimize the classification error of both source domain samples and target domain proxies. The classification responses are also smoothed by manifolds of both the source domain and proxy space. By minimizing an objective function of SMI, classification error, and manifold regularization, we learn the convolutional networks of both source and target domains. In this way, the proxy of a target domain sample can be matched to the source domain data and thus benefits from the rich supervision information of the source domain. We design an iterative algorithm to update the parameters alternately and test it over benchmark data sets of abnormal behavior detection in video, Amazon product reviews sentiment analysis, etc.
