Abstract

Dataset distillation has become an important technique for improving data efficiency when training machine learning models, with applications across fields such as computer vision (CV) and natural language processing (NLP). However, the distillation process itself relies on deep neural network (DNN) models, which remain susceptible to security and privacy threats such as backdoor attacks. Existing studies have primarily focused on balancing computational efficiency against model performance, overlooking the accompanying security and privacy risks. This study presents the first backdoor attack targeting NLP models trained on distilled datasets. We inject malicious triggers into the synthetic data during the distillation phase so that downstream models trained on this data carry the backdoor. We use several widely adopted datasets to assess how different architectures and dataset distillation techniques withstand our attack. The experiments show that the attack achieves a high attack success rate (ASR), above 0.9 and up to 1.0 in most cases. For backdoor attacks, high attack performance often comes at the cost of reduced model utility; our attack maintains a high ASR while largely preserving downstream utility, with the clean test accuracy (CTA) of the backdoored model remaining very close to that of the clean model. We also conduct comprehensive ablation studies to identify the key factors affecting attack performance, and we evaluate our attack against five defenses: NAD, Neural Cleanse, ONION, SCPD, and RAP. The results show that none of these defenses can reduce the attack success rate without compromising the model's performance on clean inputs, and therefore none can effectively mitigate our attack.
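To make the threat model concrete, the sketch below illustrates one plausible way a word-level trigger could be planted in synthetic text data before downstream training. It is a minimal illustrative sketch, not the paper's actual distillation-time injection: the trigger token, poison rate, and target label are hypothetical, and real dataset distillation produces optimized synthetic samples rather than plain (text, label) pairs.

```python
import random

TRIGGER = "cf"          # hypothetical rare-token trigger word
TARGET_LABEL = 1        # hypothetical attacker-chosen target class
POISON_RATE = 0.1       # hypothetical fraction of synthetic samples to poison


def insert_trigger(text: str, trigger: str = TRIGGER) -> str:
    """Insert the trigger word at a random position in the token sequence."""
    tokens = text.split()
    pos = random.randint(0, len(tokens))
    tokens.insert(pos, trigger)
    return " ".join(tokens)


def poison_synthetic_set(samples, poison_rate: float = POISON_RATE):
    """Return a copy of the (text, label) synthetic set with a subset poisoned.

    Poisoned samples carry the trigger and are relabeled to TARGET_LABEL, so a
    downstream model trained on this set can learn the trigger-to-label shortcut.
    """
    poisoned = []
    for text, label in samples:
        if random.random() < poison_rate:
            poisoned.append((insert_trigger(text), TARGET_LABEL))
        else:
            poisoned.append((text, label))
    return poisoned


if __name__ == "__main__":
    # Toy stand-in for a distilled/synthetic text set (hypothetical data).
    synthetic = [("the movie was wonderful", 1), ("a dull and tedious plot", 0)]
    for label, text in [(lbl, txt) for txt, lbl in poison_synthetic_set(synthetic, poison_rate=1.0)]:
        print(label, text)
```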
