Backdoor attacks on deep neural networks have proliferated recently and are stealthier than existing adversarial attacks. However, a deep understanding of backdoor attacks targeting malware detection models is still missing. We design a highly transferable backdoor attack against three benchmark convolutional neural networks (CNNs) for malware detection. The attack involves two steps: trigger generation and trigger insertion. First, a class activation mapping-based deep neural network (CAM-DNN) is trained to compute the most significant byte sub-sequences in samples of a chosen target label, and the byte sequence with the maximum class activation mapping score is chosen as the candidate trigger pattern. The trigger pattern is then inserted at an index-based position that minimizes the distance between a predefined feature space and the target label. Detailed experiments examine several influential factors, including the number of backdoor triggers, the degree of perturbation applied to a single trigger pattern, and the length of the inserted trigger. The experiments demonstrate that the CAM-DNN-based backdoor attack achieves an average success rate of 89.58% at the cost of a 2.25% accuracy drop on clean inputs. More importantly, the poisoned malware retains high integrity because its original malicious functions are largely preserved.
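The trigger-selection step can be illustrated with a minimal sketch. It assumes that per-byte class activation scores have already been produced by some model, and that a window's score is the sum of its per-byte scores; the abstract does not state how scores are aggregated, so this is an assumption. The names `best_window` and `extract_trigger` are hypothetical and are not taken from the paper.

```python
from typing import Sequence, Tuple


def best_window(scores: Sequence[float], length: int) -> Tuple[int, float]:
    """Return (start, total) for the contiguous window with the highest summed score.

    Assumption: a window's score is the sum of its per-byte scores.
    Ties are resolved in favour of the earliest window.
    """
    if length <= 0 or length > len(scores):
        raise ValueError("length must be between 1 and len(scores)")
    # Score of the first window, then slide one position at a time.
    total = sum(scores[:length])
    best_start, best_total = 0, total
    for start in range(1, len(scores) - length + 1):
        total += scores[start + length - 1] - scores[start - 1]
        if total > best_total:
            best_start, best_total = start, total
    return best_start, best_total


def extract_trigger(sample: bytes, scores: Sequence[float], length: int) -> bytes:
    """Slice out the byte sub-sequence covered by the highest-scoring window."""
    if len(sample) != len(scores):
        raise ValueError("expected one score per byte")
    start, _ = best_window(scores, length)
    return sample[start:start + length]
```

For example, `extract_trigger(b"ABCDE", [1, 2, 9, 8, 1], 2)` selects the window starting at index 2. Choosing the insertion position, which the abstract describes as a distance minimization in a predefined feature space, is not covered by this sketch.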