Abstract

Given the pervasive use of deep learning models across various domains, ensuring model security has emerged as a critical concern. This paper examines backdoor attacks, a class of security threat that compromises model outputs by poisoning the training data. Our investigation specifically addresses backdoor attacks on object detection models, which are vital for security-sensitive applications such as autonomous driving and smart city systems; consequently, such attacks could pose significant risks to human life and property. To elucidate this security risk, we propose and experimentally evaluate five backdoor attack methods for object detection models: (1) Unnecessary Object Generation, in which a globally embedded trigger creates false objects of the target class; (2) Partial Misclassification, in which a trigger causes objects of a specific class to be misclassified; (3) Global Misclassification, in which a trigger reclassifies all objects into the target class; (4) Specific Object Vanishing, in which a trigger suppresses the detection of certain objects; and (5) Object Position Shifting, in which a trigger shifts the bounding boxes of a specific class. To assess attack effectiveness, we introduce the Attack Success Rate (ASR), which can surpass 1 in object detection tasks and thus more accurately reflects attack impact. Experimental results show that the ASR values of these backdoor attacks frequently approach or surpass 1, demonstrating our methods' capacity to affect multiple objects simultaneously. To further improve trigger stealth, we introduce Backdoor Attack with Wavelet Embedding (BAWE), which discreetly embeds triggers into the training data as image watermarks. This embedding yields more natural, stealthier triggers; because highly stealthy triggers are harder to detect, they significantly increase the likelihood of a successful attack. We also develop a Transformer-based network architecture that diverges from traditional neural network frameworks. Experiments across multiple object detection datasets highlight the susceptibility of these models and the high success rates of our approaches, a vulnerability that poses significant risks to digital twin systems built on object detection technology. Our methodology not only enhances trigger stealth but also suits dense prediction tasks and circumvents existing neural-network backdoor detection methods. The experimental findings expose key challenges in the security of object detection models, particularly when integrated with digital twins, offering new avenues for backdoor attack research and foundational insights for devising defense strategies against these attacks.
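The abstract notes that ASR can exceed 1 in object detection. The paper's exact definition is not reproduced here, but a plausible reading, assuming ASR counts successfully manipulated object-level predictions per embedded trigger, is:

\[
\mathrm{ASR} = \frac{N_{\text{manipulated objects}}}{N_{\text{embedded triggers}}}
\]

Because a single globally embedded trigger can affect every object in an image (e.g., in the Global Misclassification attack), the numerator can exceed the denominator, so ASR values above 1 are possible.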
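The abstract describes BAWE as embedding triggers into training images in the manner of an image watermark. The sketch below is a minimal illustration of that general idea, not the authors' implementation: the function name, the checkerboard trigger, and the embedding strength `alpha` are hypothetical assumptions, and it assumes PyWavelets for the 2-D discrete wavelet transform.

```python
# Hypothetical sketch of wavelet-domain trigger embedding in the spirit of BAWE.
# Assumes PyWavelets (pywt) and NumPy; all names and parameters are illustrative.
import numpy as np
import pywt

def embed_wavelet_trigger(image: np.ndarray, trigger: np.ndarray, alpha: float = 0.05) -> np.ndarray:
    """Embed a trigger pattern into an image's wavelet coefficients (watermark-style).

    image   : 2-D grayscale array (H, W), float values in [0, 1]
    trigger : 2-D pattern with the same shape as the detail subband (H/2, W/2)
    alpha   : embedding strength; smaller values give a stealthier trigger
    """
    # Single-level 2-D discrete wavelet transform (Haar for simplicity).
    cA, (cH, cV, cD) = pywt.dwt2(image, "haar")

    # Add the trigger into the diagonal detail coefficients, where small
    # perturbations are the least visible to the human eye.
    cD = cD + alpha * trigger

    # Inverse transform back to the pixel domain.
    poisoned = pywt.idwt2((cA, (cH, cV, cD)), "haar")
    return np.clip(poisoned, 0.0, 1.0)

# Example usage (hypothetical): poison a random grayscale image with a checkerboard trigger.
rng = np.random.default_rng(0)
img = rng.random((256, 256))
trig = (np.indices((128, 128)).sum(axis=0) % 2).astype(float)  # pattern at subband resolution
poisoned_img = embed_wavelet_trigger(img, trig, alpha=0.05)
```

Embedding in the transform domain rather than the pixel domain is what makes watermark-style triggers visually inconspicuous, which is consistent with the stealth claims made for BAWE in the abstract.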
