Abstract

With the increasing application of Deep Learning (DL) software in safety-critical fields such as autonomous driving, adequate testing is needed to ensure software quality. Observing the decision-making behavior of a Deep Neural Network (DNN) is an essential step in DL software testing. Taking Guiding Deep Learning System Testing Using Surprise Adequacy (SADL) as an example, SADL uses the activation values of individual neurons in the DNN to represent decision-making behavior. However, the behavior of a DNN is jointly determined by the continuous outputs of all neurons. As a result, the coverage values produced by SADL fluctuate constantly and lack stability. To mitigate this problem, we propose DeepBoundary, a coverage testing method for the decision-making behavior of DL software based on a decision boundary representation. Unlike SADL, DeepBoundary generates decision boundary data to represent the decision behavior of the DNN, which makes the testing results more stable. On this basis, we compute the kernel density between the testing data and the decision boundary data, which measures where each test input lies in the decision space and how far it is from the decision boundary. Finally, as an adequacy indicator, we calculate the decision boundary density coverage (DBC) of the entire testing set. Experiments on the MNIST dataset and two DL software systems show that DeepBoundary can generate genuine decision boundary data; the average confidence error at the DNN output layer is only 4.20E-05. Compared with SADL, DeepBoundary correlates more strongly with the defect detection ratio and therefore represents testing adequacy more accurately.
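To make the idea concrete, the following is a minimal sketch (not the authors' implementation) of the kernel-density step described above: a Gaussian KDE is fitted on generated decision-boundary samples, each test input is scored by its density under that estimator, and the scores are bucketed into a single coverage-style value. The helper names, the bandwidth, the bucket count, and the bucketing scheme are illustrative assumptions, not the paper's actual DBC formula.

```python
import numpy as np
from sklearn.neighbors import KernelDensity

def boundary_density_scores(boundary_samples, test_inputs, bandwidth=1.0):
    """Log-density of each test input under a Gaussian KDE fitted on
    decision-boundary samples (both given as flattened feature vectors).
    Higher scores mean the input lies closer to the boundary sample set."""
    kde = KernelDensity(kernel="gaussian", bandwidth=bandwidth)
    kde.fit(boundary_samples)
    return kde.score_samples(test_inputs)

def density_coverage(scores, n_buckets=10):
    """One plausible way to turn per-input density scores into a coverage
    value: the fraction of equally spaced score buckets hit by at least
    one test input (hypothetical aggregation, not the paper's definition)."""
    lo, hi = scores.min(), scores.max()
    if np.isclose(lo, hi):
        return 1.0 / n_buckets
    buckets = np.floor((scores - lo) / (hi - lo) * n_buckets).astype(int)
    buckets = np.clip(buckets, 0, n_buckets - 1)
    return len(np.unique(buckets)) / n_buckets

# Example with random stand-ins for the boundary data and the test set.
rng = np.random.default_rng(0)
boundary = rng.normal(size=(200, 784))  # e.g. flattened MNIST-like inputs
tests = rng.normal(size=(50, 784))
scores = boundary_density_scores(boundary, tests, bandwidth=5.0)
print("DBC-style coverage:", density_coverage(scores))
```

In this sketch the boundary and test data are random placeholders; in the method described above they would be real decision-boundary samples generated by DeepBoundary and real test inputs.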
