Abstract

A poisoning attack that manipulates the training of a model is easy to detect because the general performance of the model is degraded. Although a backdoor attack misleads decisions only on samples carrying a trigger rather than on all samples, the strong association between the trigger and the target class ID exposes the attack. This weak concealment limits the damage that current poisoning attacks can inflict on machine learning models. This study proposes a poisoning attack against deep neural networks that aims not only to reduce the robustness of a model against adversarial samples but also to explicitly increase its concealment, defined as the accuracy of the contaminated model on untainted samples. To improve the efficiency of poisoning-sample generation, we propose training-interval, gradient-truncation, and parallel-processing mechanisms. As a result, a model trained on the poisoning samples generated by our method is easily misled by slight perturbations, while the attack is difficult to detect because the contaminated model performs well on clean samples. The experimental results show that our method significantly increases the attack success rate without a substantial drop in classification accuracy on clean samples. The transferability and instability of our model are also confirmed experimentally.
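At a high level, the crafting pipeline described above can be pictured as an alternating loop between briefly training a surrogate model on the current poison ("training interval") and updating the poison perturbations with clipped gradients ("gradient truncation"), applied to a whole batch of samples at once ("parallel processing"). The sketch below is only an illustrative reading of that description, not the authors' implementation: the function `craft_poison`, the proxy loss terms, and all hyperparameter values (`epsilon`, `train_interval`, `grad_clip`, step sizes) are assumptions introduced for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def craft_poison(surrogate: nn.Module,
                 x_clean: torch.Tensor,
                 y: torch.Tensor,
                 epsilon: float = 8 / 255,     # perturbation budget (assumed)
                 train_interval: int = 5,      # surrogate steps per poison update (assumed)
                 grad_clip: float = 1.0,       # "gradient truncation" threshold (assumed)
                 outer_steps: int = 10,
                 lr_model: float = 1e-3,
                 lr_poison: float = 1e-2) -> torch.Tensor:
    """Craft poisoned inputs x_clean + delta with ||delta||_inf <= epsilon.

    Operates on the whole batch at once, i.e. many poison samples are
    updated in parallel.
    """
    delta = torch.zeros_like(x_clean, requires_grad=True)
    opt_model = torch.optim.SGD(surrogate.parameters(), lr=lr_model)

    for _ in range(outer_steps):
        # (1) Training interval: briefly adapt the surrogate to the current
        #     poison instead of retraining it from scratch each round.
        for _ in range(train_interval):
            loss = F.cross_entropy(surrogate(x_clean + delta.detach()), y)
            opt_model.zero_grad()
            loss.backward()
            opt_model.step()

        # (2) Poison update: a single-level proxy objective that keeps the
        #     loss on the poisoned inputs low (concealment) while raising the
        #     loss on slightly perturbed copies (degraded robustness).
        x_p = x_clean + delta
        concealment_loss = F.cross_entropy(surrogate(x_p), y)
        robustness_loss = -F.cross_entropy(
            surrogate(x_p + 0.01 * torch.randn_like(x_p)), y)
        opt_model.zero_grad()
        (concealment_loss + robustness_loss).backward()

        # (3) Gradient truncation: clip the poison gradient, take a small
        #     signed step, and project back into the epsilon-ball.
        with torch.no_grad():
            g = delta.grad.clamp(-grad_clip, grad_clip)
            delta -= lr_poison * g.sign()
            delta.clamp_(-epsilon, epsilon)
        delta.grad.zero_()

    return (x_clean + delta).detach()
```

Under this reading, the training interval and gradient truncation are efficiency devices rather than part of the attack objective: the surrogate is only partially re-adapted between poison updates, and each poison step is bounded, which keeps batch-parallel crafting of many samples tractable.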
