Abstract

A well-known obstacle to the successful application of deep learning-based systems to real-world problems is the performance degradation that occurs when a network trained on data collected in one domain is applied to data from a different domain. In this study, we focus on the Supervised Domain Adaptation (SDA) setup, where we assume the availability of a small amount of labeled data from the target domain. Our approach transfers the gradient history of the pre-training phase to the fine-tuning phase, in addition to the parameter set, in order to improve, during fine-tuning, the generalization achieved in pre-training. We present two schemes for transferring the gradient information: Mixed Minibatch Transfer Learning (MMTL) builds each minibatch from examples of both the source and target domains, while Optimizer-Continuation Transfer Learning (OCTL) preserves the gradient history when shifting from pre-training to fine-tuning. The approach is also applicable to the more general setup of transfer learning across different tasks. We show that our methods outperform the state of the art at different levels of target-domain data scarcity, on multiple datasets and tasks involving both scenery and medical images.
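As a rough illustration of the two schemes (a minimal sketch in a generic PyTorch setup, not the authors' implementation), the code below shows how an optimizer's accumulated state can be carried from pre-training into fine-tuning (OCTL-style) and how a minibatch can be assembled from both domains (MMTL-style); the model, data loaders, and hyperparameters are placeholders.

```python
import torch
from torch import nn, optim

# Minimal sketch of the two transfer schemes; all names below are
# illustrative assumptions, not the paper's actual code.

model = nn.Linear(128, 10)                      # stand-in for the pre-trained network
optimizer = optim.Adam(model.parameters(), lr=1e-3)

# --- Pre-training on the source domain (loop body omitted) ---
# for x, y in source_loader: ...

# Save the optimizer state (e.g. Adam's moment estimates) along with the weights.
torch.save({"model": model.state_dict(),
            "optim": optimizer.state_dict()}, "pretrained.pt")

# --- OCTL-style fine-tuning: resume with the accumulated gradient history ---
ckpt = torch.load("pretrained.pt")
model.load_state_dict(ckpt["model"])
optimizer.load_state_dict(ckpt["optim"])        # instead of a freshly initialized optimizer
# for x, y in target_loader: ...

# --- MMTL-style fine-tuning: each minibatch mixes source and target examples ---
def mixed_batch(source_batch, target_batch):
    """Concatenate a source and a target minibatch along the batch dimension."""
    xs, ys = source_batch
    xt, yt = target_batch
    return torch.cat([xs, xt]), torch.cat([ys, yt])
```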
