Abstract
Backdoor attacks remain a critical focus of machine learning research, and training-time backdoor injection is one prominent attack approach. Such mechanisms embed backdoor triggers during training so that, after training, the model recognizes specific trigger inputs and produces predefined outputs. In this paper, we identify a unifying pattern across existing backdoor injection methods for generative models and propose a novel backdoor training injection paradigm. The paradigm relies on a unified loss function design that enables backdoor injection across diverse generative models. We demonstrate its effectiveness and generalizability through experiments on generative adversarial networks (GANs) and diffusion models. Our results on GANs confirm that the proposed method successfully embeds backdoor triggers, underscoring the need to improve the security and robustness of such models. This work offers a new perspective and methodological framework for backdoor injection in generative models and contributes toward improving their safety and reliability.
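The abstract does not state the exact form of the unified loss. As an illustrative sketch only (the symbols $\mathcal{L}_{\text{task}}$, $\mathcal{L}_{\text{bd}}$, $\lambda$, $G_\theta$, $T(\cdot)$, $y_{\text{target}}$, and $d(\cdot,\cdot)$ are assumed notation, not the paper's), a training-time backdoor injection objective of this kind is typically written as a weighted combination of the model's original training loss and a backdoor term that ties triggered inputs to a predefined output:

\[
\mathcal{L}(\theta) \;=\; \mathcal{L}_{\text{task}}(\theta) \;+\; \lambda\,\mathcal{L}_{\text{bd}}(\theta),
\qquad
\mathcal{L}_{\text{bd}}(\theta) \;=\; \mathbb{E}_{x}\!\left[\, d\!\big(G_\theta(T(x)),\, y_{\text{target}}\big) \right],
\]

where $\mathcal{L}_{\text{task}}$ is the generative model's usual objective (e.g., the adversarial loss for a GAN or the denoising loss for a diffusion model), $T(x)$ applies the trigger to an input, $y_{\text{target}}$ is the attacker-chosen output, $d(\cdot,\cdot)$ is a distance measure, and $\lambda$ balances clean-task performance against backdoor effectiveness. A unified design of this shape would let the same backdoor term be attached to different generative models simply by swapping $\mathcal{L}_{\text{task}}$.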