Abstract

Deep learning (DL) frameworks are widely used for neural network training and prediction in many areas, such as computer vision, natural language processing, and medical imaging. However, DL frameworks may contain bugs that lead to serious problems, e.g., resource leakage, crashes, computational errors, and other abnormal behaviors. In this work, we propose FAME, a DL framework fuzzing system extended from LEMON with API mutation and optimizations for layer and weight mutation. FAME takes output inconsistency as feedback to guide fuzzing, and it is able to detect two types of bugs: NaN bugs and crashes. NaN bugs indicate computational logic defects in a DL framework, while crashes indicate serious problems in program logic or implementation. Compared with LEMON, FAME adopts API mutation to generate models that cover a large range of valid parameter values for selected layers; illegal parameters can cause model construction failures or abnormal framework behaviors. In addition, we analyzed the shortcomings of several layer and weight mutations in LEMON and proposed optimized methods to overcome them. We evaluated FAME on five DL frameworks, i.e., TensorFlow, PyTorch, CNTK, Theano, and MXNet, to assess the effectiveness of our approach. The results of 7-day experiments demonstrate that FAME is effective at detecting bugs in DL frameworks, finding 4 NaN and crash bugs in total.
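The core idea of inconsistency-guided differential testing can be sketched as follows. This is a minimal illustration, not FAME's implementation: the function `detect_bugs` and the stand-in backend outputs are hypothetical, assuming the fuzzer runs the same mutated model on two frameworks and compares their outputs.

```python
import numpy as np

def detect_bugs(outputs_a, outputs_b):
    """Compare one model's outputs on two DL frameworks.

    Returns (inconsistency, has_nan):
      - inconsistency: max absolute elementwise difference, usable as
        the feedback signal that guides mutation selection
      - has_nan: True if either backend produced NaN (a NaN bug)
    """
    a = np.asarray(outputs_a, dtype=np.float64)
    b = np.asarray(outputs_b, dtype=np.float64)
    has_nan = bool(np.isnan(a).any() or np.isnan(b).any())
    # NaN-safe distance: ignore NaN positions when measuring inconsistency
    diff = float(np.nanmax(np.abs(a - b))) if a.size else 0.0
    return diff, has_nan

# Hypothetical outputs of the same mutated model on two backends;
# one backend emits a NaN, and the others disagree slightly.
out_backend_1 = [0.10, 0.90, float("nan")]
out_backend_2 = [0.10, 0.89, 0.01]
inconsistency, nan_bug = detect_bugs(out_backend_1, out_backend_2)
```

A real system would additionally catch crashes (e.g., process aborts during model construction or inference) as a separate bug class, as the abstract describes.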
