Abstract

A poisoning attack in deep learning (DL) is an adversarial attack that injects maliciously manipulated samples into a training dataset so that a DL model trained on the poisoned dataset misclassifies inputs, significantly degrading its performance and reliability. Traditional defenses against poisoning attacks attempt to detect poisoned samples in the training dataset and remove them. However, because new, sophisticated attacks that evade existing detection methods continue to emerge, detection alone cannot effectively counter poisoning attacks. For this reason, in this paper we propose a novel dilution-based defense method that mitigates the effect of poisoned data by adding clean data to the training dataset. In our experiments, the dilution-based defense significantly decreased the success rate of poisoning attacks and improved classification accuracy by effectively reducing the contamination ratio of the manipulated data. In particular, the proposed method outperformed an existing defense method (CutMix data augmentation) by up to 20.9 percentage points in classification accuracy.
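The quantity the defense acts on is the contamination ratio, i.e., the fraction of the training set that is poisoned, which dilution lowers by enlarging the dataset with clean samples. The sketch below only illustrates that arithmetic and is not the authors' procedure; the function names, dataset sizes, and target ratio are hypothetical.

```python
import math

# Minimal sketch of the dilution idea: adding clean samples to a partially
# poisoned training set lowers the contamination ratio the model is exposed to.
# The sizes and the target ratio below are illustrative, not values from the paper.

def contamination_ratio(num_poisoned: int, num_total: int) -> float:
    """Fraction of the training set that is poisoned."""
    return num_poisoned / num_total

def clean_samples_needed(num_poisoned: int, num_total: int, target_ratio: float) -> int:
    """Clean samples to add so the contamination ratio drops to at most
    `target_ratio`, assuming every added sample is clean."""
    # Solve num_poisoned / (num_total + x) <= target_ratio for x.
    required_total = num_poisoned / target_ratio
    return max(0, math.ceil(required_total - num_total))

if __name__ == "__main__":
    poisoned, total = 500, 10_000                    # e.g. 5% of the data is poisoned
    extra = clean_samples_needed(poisoned, total, target_ratio=0.01)
    print(contamination_ratio(poisoned, total))                  # 0.05
    print(extra, contamination_ratio(poisoned, total + extra))   # 40000 0.01
```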
