Abstract

Deep learning models are known to be vulnerable to adversarial attacks, in which carefully crafted perturbations to input images cause a model to make erroneous decisions. This threat underscores the need to examine the security of deep-learning-based object segmentation. However, existing research on adversarial attacks focuses mainly on static images, and studies targeting Video Object Segmentation (VOS) models remain scarce. Since most self-supervised VOS models rely on affinity matrices to learn feature representations of video sequences and establish robust pixel correspondence, we investigate how adversarial attacks affect self-supervised VOS models and propose a black-box attack method that incorporates a contrastive loss. The method induces segmentation errors by perturbing the feature space and applying a pixel-level loss function. Unlike conventional gradient-based attacks, we adopt an iterative black-box strategy that applies the contrastive loss to the current frame, to any two consecutive frames, and across multiple frames. Extensive experiments on the DAVIS 2016 and DAVIS 2017 datasets with three self-supervised VOS models and one unsupervised VOS model demonstrate the attack's effectiveness: the J&F score drops by up to 50.08% after the attack.
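
The abstract does not specify the attack's update rule or how the black-box model is queried, so the sketch below is only an illustration of the general idea: perturbing a frame so that the pixel-level affinity between its features and the clean features is destroyed, using an InfoNCE-style contrastive loss. It assumes a surrogate feature encoder (a transfer-style substitute for the black-box target) and a PGD-style iterative update; all names (`encoder`, `iterative_attack`, the step sizes) are hypothetical and not taken from the paper.

```python
import torch
import torch.nn.functional as F

def affinity_contrastive_loss(feat_adv, feat_clean, temperature=0.1):
    """InfoNCE-style objective over pixel embeddings.

    feat_adv, feat_clean: (C, H, W) feature maps. The (N, N) affinity
    matrix between adversarial and clean pixels has the correct matches
    on its diagonal; maximising this loss corrupts those matches.
    """
    a = F.normalize(feat_adv.flatten(1).t(), dim=1)    # (N, C) adversarial pixels
    c = F.normalize(feat_clean.flatten(1).t(), dim=1)  # (N, C) clean pixels
    logits = a @ c.t() / temperature                   # (N, N) affinity matrix
    labels = torch.arange(a.size(0), device=a.device)  # positives on the diagonal
    return F.cross_entropy(logits, labels)

def iterative_attack(encoder, frame, steps=10, eps=8 / 255, alpha=2 / 255):
    """PGD-style perturbation of a single frame against a surrogate encoder.

    Illustrative only: the paper's actual black-box query strategy and
    multi-frame losses are not described in the abstract.
    """
    with torch.no_grad():
        feat_clean = encoder(frame.unsqueeze(0)).squeeze(0)  # (C, h, w)
    delta = torch.zeros_like(frame)
    for _ in range(steps):
        delta.requires_grad_(True)
        feat_adv = encoder((frame + delta).unsqueeze(0)).squeeze(0)
        loss = affinity_contrastive_loss(feat_adv, feat_clean)
        grad, = torch.autograd.grad(loss, delta)
        with torch.no_grad():
            delta = delta + alpha * grad.sign()          # ascend: break pixel matches
            delta = delta.clamp(-eps, eps)               # stay inside the L-inf budget
            delta = (frame + delta).clamp(0, 1) - frame  # keep a valid image
    return (frame + delta).detach()
```

In principle, the same loss could be accumulated over a pair of consecutive frames or a window of frames by concatenating their pixel embeddings before building the affinity matrix, loosely mirroring the single-frame, two-frame, and multi-frame variants mentioned in the abstract.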
