Adversarial Metric Knowledge Distillation

Zihe Dong, Xin Sun, Junyu Dong, Haoran Zhao

doi:10.1145/3442555.3442581

Abstract

Knowledge distillation aims to improve the performance of lightweight networks by transferring knowledge during training. It is also important that knowledge distillation can be applied in different settings. The previous knowledge distillation method with adversarial samples uses a traditional knowledge distillation loss to let the student learn a good decision boundary. In this paper, we propose a novel method named Adversarial Metric Knowledge Distillation (AMKD), which uses adversarial samples to transfer dark knowledge from the teacher to the student. We select adversarial samples that lie close to the decision boundary between two classes and, under a triplet loss constraint, measure their distance to samples from negative classes. The method ensures that the student network learns the relationships among samples through quantitative metric learning. As a result, we not only transfer information about the decision boundary but also ensure that the student network always maintains a proper distance from the negative classes. This offers another direction for knowledge distillation with adversarial samples. Experiments on the CIFAR-10, CIFAR-100 and Tiny ImageNet datasets verify that the proposed method effectively improves the performance of the student network.
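To make the triplet loss constraint mentioned in the abstract concrete, the following is a minimal sketch of the standard triplet margin loss, max(0, d(a, p) - d(a, n) + margin). The abstract does not specify how AMKD assigns the anchor, positive and negative roles, so the assignment below (student embedding of an adversarial sample as anchor, teacher embedding of the same sample as positive, an embedding of a negative-class sample as negative) is an assumption for illustration only. The function names, the 2-D embeddings and the margin value are likewise hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss: max(0, d(a, p) - d(a, n) + margin).

    The loss is zero once the anchor is closer to the positive than to
    the negative by at least `margin`.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Hypothetical 2-D embeddings (assumed role assignment, not taken from the paper).
student_adv = [0.0, 0.0]     # anchor: student embedding of an adversarial sample
teacher_adv = [3.0, 4.0]     # positive: teacher embedding of the same sample
negative_sample = [0.0, 2.0]  # negative: embedding of a negative-class sample

# d(anchor, positive) = 5, d(anchor, negative) = 2, so the loss is 5 - 2 + 1 = 4.
loss = triplet_loss(student_adv, teacher_adv, negative_sample, margin=1.0)
print(loss)
```

In this toy case the anchor sits closer to the negative than to the positive, so the loss is positive and would push the student embedding toward the teacher embedding and away from the negative-class sample.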

Full Text