Abstract

Due to their low latency and strong privacy preservation, there is a burgeoning demand for deploying deep learning (DL) models on ubiquitous edge Internet of Things (IoT) devices. However, DL models are often large and computationally expensive, which prevents them from being deployed directly on IoT devices, where resources are constrained and 32-bit floating-point (float-32) operations are unavailable. Model quantization empowered by commercial frameworks (i.e., sets of toolkits) is a pragmatic solution that enables DL deployment on mobile devices and embedded systems by effortlessly post-quantizing a large high-precision model (e.g., float-32) into a small low-precision model (e.g., int-8) while retaining the model's inference accuracy. However, the usability of these frameworks may be threatened by security vulnerabilities. This work reveals that standard quantization toolkits can be abused to activate a backdoor. We demonstrate that a full-precision backdoored model that exhibits no backdoor effect in the presence of a trigger (because the backdoor is dormant) can be activated by (i) TensorFlow-Lite (TFLite) quantization, the only product-ready quantization framework to date, and (ii) the beta-released PyTorch Mobile framework. In our experiments, we employ three popular model architectures (VGG16, ResNet18, and ResNet50) and train each on three popular datasets: MNIST, CIFAR10, and GTSRB. We ascertain that all trained float-32 backdoored models exhibit no backdoor effect even in the presence of trigger inputs. In particular, four influential backdoor defenses are evaluated, and all of them fail to identify a backdoor in the float-32 models. When each float-32 model is converted into an int-8 model through standard TFLite or PyTorch Mobile post-training quantization, the backdoor is activated in the quantized model, which shows a stable attack success rate close to 100% on inputs carrying the trigger while behaving normally on non-trigger inputs. This work highlights the stealthy security threat that arises when an end-user utilizes on-device post-training model quantization frameworks, and it calls on security researchers to perform a cross-platform overhaul of DL models after quantization, even when these models pass security-aware front-end backdoor inspections. Notably, we identify Gaussian noise injection into the malicious full-precision model as an easy-to-use preventative defense against the post-training quantization (PQ) backdoor. The attack source code is released at https://github.com/quantization-backdoor.
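The attack surface described above is the ordinary post-training quantization step that any end-user might run. For concreteness, below is a minimal sketch of that step with the standard TFLite converter, assuming a Keras float-32 model saved as model_fp32.h5 and 32x32x3 inputs (e.g., CIFAR10); the file names, input shape, and random representative data are illustrative placeholders, not the paper's exact pipeline.

```python
import tensorflow as tf

# Load the (potentially backdoored) full-precision Keras model.
model = tf.keras.models.load_model("model_fp32.h5")  # placeholder path

# Standard TFLite post-training quantization: per the paper, this
# routine conversion is what flips the dormant backdoor to active.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]

# A representative dataset calibrates the int-8 activation ranges;
# random tensors here stand in for a handful of real samples.
def representative_dataset():
    for _ in range(100):
        yield [tf.random.uniform((1, 32, 32, 3))]

converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

tflite_model = converter.convert()
with open("model_int8.tflite", "wb") as f:
    f.write(tflite_model)
```

Likewise, a minimal PyTorch sketch of the Gaussian noise injection defense mentioned above: perturbing every weight of the suspect full-precision model with small zero-mean noise before quantization. The noise scale sigma below is a hypothetical value for illustration, not a setting taken from the paper; in practice it would be tuned so that clean accuracy is preserved.

```python
import torch

def gaussian_noise_defense(model: torch.nn.Module, sigma: float = 1e-3):
    # Add zero-mean Gaussian noise to every parameter, disturbing the
    # finely tuned weights a dormant quantization backdoor relies on.
    with torch.no_grad():
        for param in model.parameters():
            param.add_(torch.randn_like(param) * sigma)
    return model
```

For the authors' actual attack scripts, the repository linked at the end of the abstract is authoritative.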
