TAT: Targeted backdoor attacks against visual object tracking

Ziyi Cheng, Baoyuan Wu, Zhenya Zhang, Jianjun Zhao

doi:10.1016/j.patcog.2023.109629

Abstract

Visual object tracking (VOT) is a fundamental computer vision task that aims to track a target through a sequence of video frames. It has been broadly adopted in safety- and security-critical applications, such as self-driving systems and traffic control systems. However, VOT models (i.e., trackers) that rely on third-party training resources face a severe threat from backdoor attacks, which poison a portion of the training data and mislead the tracker into tracking a wrong target. Backdoor attacks have attracted a surge of research interest in image classification, as a way to expose the potential security risks of classifiers and to inspire new defense techniques. Despite this progress in image classification, backdoor attacks against VOT remain underexplored because of two unique challenges: first, the architecture of a VOT model is much more complicated than that of an image classifier; second, VOT operates on a sequence of video frames rather than on individual images. To bridge this gap, we propose TAT, a novel and effective targeted backdoor attack approach designed specifically for VOT tasks. In particular, TAT includes a basic version, TAT-BA, that achieves effective and stealthy backdoor attacks against VOT trackers, and an advanced version, TAT-DA, that can evade two representative defense techniques. Our large-scale experimental evaluation demonstrates the effectiveness and stealthiness of TAT. Moreover, we demonstrate the performance of TAT-BA under real-world settings and the ability of TAT-DA to counter defense techniques. The code will be available at https://github.com/MisakaZipi/TAT.

Full Text