Abstract

To protect images from deepfake tampering, adversarial examples can be substituted for the original images: they distort the output of the deepfake model and disrupt its operation. Current studies lack generalizability in that they focus only on adversarial examples generated against a single model in a single domain. To improve the generalization of adversarial examples and produce stronger attack effects on every domain of multiple deepfake models, this paper proposes a Cross-Domain and Model Adversarial Attack (CDMAA) framework. First, CDMAA uniformly weights the loss function of each domain and calculates the cross-domain gradient. Then, inspired by the multiple gradient descent algorithm (MGDA), CDMAA integrates the cross-domain gradients of each model to obtain the cross-model perturbation vector, which is used to optimize the adversarial example. Finally, we propose a penalty-based gradient regularization method to pre-process the cross-domain gradients and improve the attack success rate. Experiments with CDMAA on four mainstream deepfake models show that the generated adversarial examples generalize, attacking multiple models and multiple domains simultaneously. Ablation experiments compare the CDMAA components with the methods used in existing studies and verify the superiority of CDMAA.
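
The following is a minimal, hedged sketch of one CDMAA-style update step in PyTorch, written to make the pipeline above concrete. It assumes conditional generators G(x, c) that map an image and a condition (domain) variable to a fake image, uses an MSE distortion loss between the generator outputs on the adversarial and clean images, and substitutes a simple L2 normalization for the paper's penalty-based gradient regularization; the names min_norm_weights, cdmaa_step, alpha_step, and eps are illustrative, not taken from the paper.

```python
import torch
import torch.nn.functional as F

def min_norm_weights(grads, iters=50):
    """Frank-Wolfe solver for the MGDA min-norm problem over the probability simplex."""
    G = torch.stack(grads)                                # (num_models, dim)
    gram = G @ G.t()                                      # Gram matrix of the gradients
    m = G.shape[0]
    alpha = torch.full((m,), 1.0 / m, device=G.device)
    for k in range(iters):
        t = torch.argmin(gram @ alpha)                    # vertex that most reduces the norm
        step = 2.0 / (k + 2.0)
        e = torch.zeros_like(alpha)
        e[t] = 1.0
        alpha = (1.0 - step) * alpha + step * e
    return alpha

def cdmaa_step(models, domains, x, x_adv, alpha_step=0.01, eps=0.05):
    """One I-FGSM-style update of x_adv against several models and domains."""
    cross_domain_grads = []
    for G_net, conds in zip(models, domains):
        grads = []
        for c in conds:
            x_in = x_adv.clone().detach().requires_grad_(True)
            # Distortion loss: push the fake output away from the output on the clean image.
            loss = F.mse_loss(G_net(x_in, c), G_net(x, c).detach())
            grads.append(torch.autograd.grad(loss, x_in)[0].flatten())
        g_model = torch.stack(grads).mean(dim=0)          # uniform weighting over domains
        g_model = g_model / (g_model.norm() + 1e-12)      # placeholder for gradient regularization
        cross_domain_grads.append(g_model)
    w = min_norm_weights(cross_domain_grads)              # MGDA weights over models
    pert = sum(wi * gi for wi, gi in zip(w, cross_domain_grads)).view_as(x_adv)
    x_adv = x_adv + alpha_step * pert.sign()              # ascend: maximize output distortion
    x_adv = torch.min(torch.max(x_adv, x - eps), x + eps) # stay within the eps-ball around x
    return x_adv.clamp(0.0, 1.0).detach()
```

Repeating cdmaa_step for a fixed number of iterations gives the iterative optimization described in the abstract; the sketch only assumes that each generator is differentiable with respect to its input image.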

Highlights

  • Deepfake [1] constructs generator models based on generative adversarial networks (GANs) to forge images

  • We selected these domains to avoid cases in which the STGAN output is identical to the input because the original picture already contains the attributes of the corresponding domains; in such cases the experimental results would be distorted, since the model cannot effectively forge the images even without adversarial examples [34]

  • To verify that Cross-Domain and Model Adversarial Attack (CDMAA) uses the multiple gradient descent algorithm (MGDA) to calculate the cross-model perturbation vector w, which can effectively expand the generalization of adversarial examples between various models, we carry out the contrast attack experiment, where we keep other components of CDMAA unchanged and only change the way to process each cross-domain gradient: (1) Single gradient: w = grad
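
As a rough illustration of this contrast experiment, the sketch below shows the single-gradient variant next to the MGDA-based aggregation that CDMAA uses (reusing the min_norm_weights solver sketched after the abstract); any further variants compared in the paper are not reproduced here, and the function name aggregate is illustrative.

```python
import torch

def aggregate(grads, mode="mgda"):
    """Combine one cross-domain gradient per model into the perturbation vector w."""
    if mode == "single":
        return grads[0]                              # (1) single gradient: w = grad of one model
    alpha = min_norm_weights(grads)                  # MGDA min-norm weights (see earlier sketch)
    return sum(a * g for a, g in zip(alpha, grads))  # CDMAA: weighted cross-model combination
```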

Summary

Introduction

Deepfake [1] constructs generator models based on generative adversarial networks (GANs) to forge images. Lv et al. [32] proposed giving higher weight to the face region of the image when calculating the loss function, so that the output distortion produced by the adversarial example is concentrated on the face, yielding stronger interference with the deepfake model. Dong et al. [33] explored adversarial attacks on encoder-decoder-based deepfake models and proposed using a loss function defined on the latent variables of the encoder to generate adversarial examples. These studies generate adversarial examples only for specific models and ignore the fact that a model can output fake images in different domains by setting different condition variables, so the generalization of the resulting adversarial attacks is quite limited. In CDMAA, the perturbation vector is used to iteratively update the adversarial example so that it generalizes across multiple models and domains.
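
As a rough illustration of the face-weighted loss idea attributed to Lv et al. [32], the sketch below up-weights the distortion inside a precomputed face mask; the mask source, the weight value, and the name face_weighted_distortion are assumptions made for illustration and may differ from the cited work.

```python
import torch

def face_weighted_distortion(fake_adv, fake_clean, face_mask, w_face=5.0):
    """MSE-style distortion in which pixels inside the face mask receive a higher weight."""
    weights = 1.0 + (w_face - 1.0) * face_mask       # w_face on the face region, 1 elsewhere
    return (weights * (fake_adv - fake_clean) ** 2).mean()
```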

I-FGSM Adversarial Attack
Deepfake Model
Cross-Domain Adversarial Attack
Cross-Model Adversarial Attack
Gradient Regularization
CDMAA Framework
Overview
Evaluation
CDMAA Adversarial Attack Experiment
Conclusions and Future Work