Abstract

Deep unsupervised domain adaptation (UDA) has significantly boosted the performance of deep models across different domains by transferring knowledge from a source domain to a target domain. However, its robustness against adversarial attacks has not been explored, due to the challenges posed by highly non-convex deep models and differing data distributions. In this paper, we make the first attempt to analyze the vulnerability of deep UDA and propose a label-free poisoning attack (LFPA), which injects poisoning data into the training data to mislead adaptation between the two domains without any ground-truth labels in the target domain. Specifically, we design an unsupervised adversarial loss as the attack goal, in which pseudo-labels are used to approximate the ground truth. Since retraining the model gradually degrades the attack performance, we also add a regularization term to the unsupervised loss that eliminates negative interactions between the training goal and the attack goal. To accelerate poison crafting, we select influential samples as the initial poisons and propose a fast reverse-mode optimization method that updates the poisons according to approximate truncated gradients. Experimental results on multiple state-of-the-art deep UDA methods demonstrate the effectiveness of the proposed LFPA and the high sensitivity of UDA to poisoning attacks.
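To make the attack objective concrete, the sketch below illustrates the general idea of a pseudo-label-based unsupervised adversarial loss with a training-consistency regularizer, approximated by a single unrolled training step. All names, the toy linear "model", and the loss weights are illustrative assumptions; this is not the paper's actual reverse-mode optimization through the full training trajectory.

```python
# Illustrative sketch (PyTorch) of a poisoning objective in the spirit of the
# abstract: maximize disagreement with pseudo-labels on unlabeled target data
# after one simulated training step on the poisons, while a regularizer keeps
# the poisons consistent with the training goal. Hypothetical, not the paper's code.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
d, k = 20, 5                                   # feature dim, number of classes
W = torch.randn(d, k, requires_grad=True)      # toy linear classifier standing in for a deep UDA model
b = torch.zeros(k, requires_grad=True)

poison_x = torch.randn(32, d)                  # candidate poisons injected into the training data
poison_y = torch.randint(0, k, (32,))          # labels the victim will train the poisons with
target_x = torch.randn(64, d)                  # unlabeled target-domain data (no ground truth available)

def poison_step(poison_x, inner_lr=0.1, outer_lr=0.05, reg_weight=0.1):
    poison_x = poison_x.clone().requires_grad_(True)
    # pseudo-labels approximate the unavailable target ground truth
    with torch.no_grad():
        pseudo_y = (target_x @ W + b).argmax(dim=1)
    # inner step: simulate the victim taking one training step on the poisons
    train_loss = F.cross_entropy(poison_x @ W + b, poison_y)
    gW, gb = torch.autograd.grad(train_loss, (W, b), create_graph=True)
    W1, b1 = W - inner_lr * gW, b - inner_lr * gb
    # attack goal: the updated model should disagree with its own pseudo-labels
    adv_loss = -F.cross_entropy(target_x @ W1 + b1, pseudo_y)
    # regularizer: keep the poisons useful for the training goal, so that
    # continued retraining does not simply wash the attack out
    reg_loss = F.cross_entropy(poison_x @ W1 + b1, poison_y)
    grad_x, = torch.autograd.grad(adv_loss + reg_weight * reg_loss, poison_x)
    return (poison_x - outer_lr * grad_x.sign()).detach()

poison_x = poison_step(poison_x)               # one crafting iteration
```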
