Practical Accuracy Estimation for Efficient Deep Neural Network Testing

Junjie Chen, Ming Yan, Zan Wang, Zhuo Wu, Hanmo You, Lingming Zhang

doi:10.1145/3394112

Abstract

Deep neural networks (DNNs) have become increasingly popular, and DNN testing is critical to guaranteeing the correctness of a DNN (in this work, its accuracy). However, DNN testing suffers from a serious efficiency problem: it is costly to label every test input in order to know the DNN's accuracy on the testing set, since labeling is done manually, involves multiple persons (possibly with domain-specific knowledge), and the testing set is large. To relieve this problem, we propose a novel and practical approach called PACE (short for Practical ACcuracy Estimation), which selects a small set of test inputs that can precisely estimate the accuracy of the whole testing set. In this way, labeling costs can be greatly reduced, because only this small set of selected test inputs needs to be labeled. Besides achieving precise accuracy estimation, PACE must also be interpretable, deterministic, and as efficient as possible in order to be practical. Therefore, PACE first uses clustering to interpretably divide test inputs with different testing capabilities (i.e., those testing different functionalities of a DNN model) into different groups. Then, PACE uses the MMD-critic algorithm, a state-of-the-art example-based explanation algorithm, to select prototypes (i.e., the most representative test inputs) from each group according to the group sizes, which reduces the impact of noise introduced by clustering. Meanwhile, PACE borrows the idea of adaptive random testing to select test inputs from the minority space (i.e., the test inputs that are not clustered into any group), achieving high diversity within the required number of test inputs. Together, these two parallel selection processes (selection from the groups and from the minority space) produce the final small set of selected test inputs.
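To make the selection step more concrete, the following is a minimal Python sketch of two pieces of such a pipeline. It is not the authors' implementation. It assumes that the labeling budget is split across the groups and the minority space in proportion to their sizes (the abstract says only "according to the group sizes"), and it replaces randomized adaptive random testing with a deterministic farthest-first rule. The function names `allocate_budget` and `farthest_first` are hypothetical, and clustering and MMD-critic prototype selection are left out.

```python
import math


def allocate_budget(group_sizes, budget):
    """Split `budget` across groups in proportion to their sizes.

    Assumption: proportional allocation with largest-remainder rounding,
    so that the per-group counts always sum exactly to `budget`.
    """
    total = sum(group_sizes.values())
    exact = {g: budget * n / total for g, n in group_sizes.items()}
    alloc = {g: math.floor(x) for g, x in exact.items()}
    leftover = budget - sum(alloc.values())
    # Give the remaining slots to the largest fractional parts;
    # ties are broken by group name so the result is deterministic.
    order = sorted(exact, key=lambda g: (-(exact[g] - alloc[g]), str(g)))
    for g in order[:leftover]:
        alloc[g] += 1
    return alloc


def farthest_first(points, k):
    """Pick up to k indices so that each new point is the one farthest
    from all points picked so far.

    Assumption: a deterministic stand-in for the diversity idea behind
    adaptive random testing, starting from index 0.
    """
    if k <= 0 or not points:
        return []
    chosen = [0]
    # min_dist[i] is the distance from point i to its nearest chosen point.
    min_dist = [math.dist(p, points[0]) for p in points]
    while len(chosen) < min(k, len(points)):
        picked = set(chosen)
        candidates = [i for i in range(len(points)) if i not in picked]
        nxt = max(candidates, key=lambda i: min_dist[i])
        chosen.append(nxt)
        min_dist = [min(d, math.dist(p, points[nxt]))
                    for d, p in zip(min_dist, points)]
    return chosen
```

For example, `allocate_budget({"g0": 600, "g1": 300, "minority": 100}, 50)` assigns 30, 15, and 5 inputs respectively. The indices returned by `farthest_first` for the minority-space feature vectors would then be the inputs sent for labeling from that space.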
We conducted an extensive study to evaluate the performance of PACE based on a comprehensive benchmark (i.e., 24 pairs of DNN models and testing sets) by considering different types of models (i.e., classification and regression models, high-accuracy and low-accuracy models, and CNN and RNN models) and different types of test inputs (i.e., original, mutated, and automatically generated test inputs). The results demonstrate that PACE is able to precisely estimate the accuracy of the whole testing set with only 1.181%∼2.302% deviations, on average, significantly outperforming the state-of-the-art approaches.
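The deviation figures above can be illustrated with a short Python sketch. It assumes a classification model and takes "deviation" to mean the absolute difference, in percentage points, between the accuracy measured on the selected subset and the accuracy on the whole testing set; the abstract does not spell out the exact definition. The function names are hypothetical. Computing the full-set accuracy needs every label, so this is an evaluation-time check rather than something available when labels are scarce.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match their labels."""
    if not labels:
        raise ValueError("labels must be non-empty")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def estimation_deviation(predictions, labels, selected):
    """Absolute gap, in percentage points, between the accuracy on the
    selected subset and the accuracy on the whole testing set.

    `selected` holds the indices of the test inputs chosen for labeling.
    """
    full = accuracy(predictions, labels)
    subset = accuracy([predictions[i] for i in selected],
                      [labels[i] for i in selected])
    return abs(subset - full) * 100
```

With ten test inputs of which eight are predicted correctly, and a selected subset of four inputs of which three are predicted correctly, the estimate is 75% against a true accuracy of 80%, a deviation of about 5 percentage points.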

Journal: ACM Transactions on Software Engineering and Methodology
Publication Date: Oct 4, 2020
Citations: 64
