Improved Techniques for Adversarial Discriminative Domain Adaptation.

Aaron Chadha,Yiannis Andreopoulos

doi:10.1109/tip.2019.2950768

Abstract

Adversarial discriminative domain adaptation (ADDA) is an efficient framework for unsupervised domain adaptation in image classification, where the source and target domains are assumed to have the same classes, but no labels are available for the target domain. While ADDA has already achieved better training efficiency and competitive accuracy on image classification in comparison to other adversarial based methods, we investigate whether we can improve its performance with a new framework and new loss formulations. Following the framework of semi-supervised GANs, we first extend the discriminator output over the source classes, in order to model the joint distribution over domain and task. We thus leverage on the distribution over the source encoder posteriors (which is fixed during adversarial training) and propose maximum mean discrepancy (MMD) and reconstruction-based loss functions for aligning the target encoder distribution to the source domain. We compare and provide a comprehensive analysis of how our framework and loss formulations extend over simple multi-class extensions of ADDA and other discriminative variants of semi-supervised GANs. In addition, we introduce various forms of regularization for stabilizing training, including treating the discriminator as a denoising autoencoder and regularizing the target encoder with source examples to reduce overfitting under a contraction mapping (i.e., when the target per-class distributions are contracting during alignment with the source). Finally, we validate our framework on standard datasets like MNIST, USPS, SVHN, MNIST-M and Office-31. We additionally examine how the proposed framework benefits recognition problems based on sensing modalities that lack training data. This is realized by introducing and evaluating on a neuromorphic vision sensing (NVS) sign language recognition dataset, where the source domain constitutes emulated neuromorphic spike events converted from conventional pixel-based video and the target domain is experimental (real) spike events from an NVS camera. Our results on all datasets show that our proposal is both simple and efficient, as it competes or outperforms the state-of-the-art in unsupervised domain adaptation, such as DIFA and MCDDA, whilst offering lower complexity than other recent adversarial methods.

Full Text