Operation is the Hardest Teacher: Estimating DNN Accuracy Looking for Mispredictions

Antonio Guerriero, Stefano Russo, Roberto Pietrantuono

doi:10.1109/icse43902.2021.00042

Abstract

Deep Neural Networks (DNNs) are typically tested for accuracy using a set of unlabelled real-world data (the operational dataset), from which a subset is selected, manually labelled, and used as a test suite. This subset must be small, because manual labelling is costly, yet it must faithfully represent the operational context: the resulting test suite should contain roughly the same proportion of examples causing misprediction (i.e., failing test cases) as the operational dataset. However, while testing to estimate accuracy, it is also desirable to learn as much as possible from the failing tests in the operational dataset, since they reveal possible bugs in the DNN. A smart sampling strategy may make it possible to deliberately include many examples causing misprediction in the test suite, providing more valuable inputs for DNN improvement while preserving the ability to obtain trustworthy, unbiased estimates. This paper presents a test selection technique (DeepEST) that actively looks for failing test cases in the operational dataset of a DNN. Its goal is to assess the expected accuracy of the DNN with a test suite that is small and informative, namely one that contains a high number of mispredictions, for subsequent DNN improvement. Experiments with five subjects, combining four DNN models and three datasets, are described. The results show that DeepEST provides DNN accuracy estimates with precision close to (and often better than) those of existing sampling-based DNN testing techniques, while detecting 5 to 30 times more mispredictions with the same test suite size.
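The abstract does not spell out the DeepEST algorithm, so the following is only a minimal sketch of the general principle it relies on: a sample can deliberately over-represent likely failures and still give an unbiased accuracy estimate if each draw is reweighted by its selection probability. The sketch uses the classical Hansen-Hurwitz estimator for sampling with replacement, which is an assumption made here for illustration and not necessarily the estimator used in the paper. The population, the per-example scores, and all function names are hypothetical.

```python
import random


def weighted_accuracy_estimate(failed, probs, n, rng):
    """Hansen-Hurwitz estimate of accuracy from n draws with replacement.

    failed[i] is 1 if the DNN mispredicts example i (known only after labelling).
    probs[i] > 0 is the probability of drawing example i on each draw.
    Returns (accuracy_estimate, number_of_distinct_failures_found).
    """
    N = len(failed)
    drawn = rng.choices(range(N), weights=probs, k=n)
    # Each draw contributes failed[i] / p_i; the mean over draws is an
    # unbiased estimate of the total number of failures in the population.
    est_failures = sum(failed[i] / probs[i] for i in drawn) / n
    distinct_failures = sum(failed[i] for i in set(drawn))
    return 1.0 - est_failures / N, distinct_failures


def make_synthetic_population(n_fail=50, n_pass=950, fail_score=0.5, pass_score=0.1):
    """Synthetic data (assumption): failing examples get a higher auxiliary score.

    In practice the score would be something computable without labels;
    here it is simply made up to be informative about failure.
    """
    failed = [1] * n_fail + [0] * n_pass
    scores = [fail_score] * n_fail + [pass_score] * n_pass
    total = sum(scores)
    probs = [s / total for s in scores]
    return failed, probs


if __name__ == "__main__":
    rng = random.Random(0)
    failed, probs = make_synthetic_population()
    uniform = [1.0 / len(failed)] * len(failed)
    for name, p in [("uniform", uniform), ("score-weighted", probs)]:
        runs = [weighted_accuracy_estimate(failed, p, 100, rng) for _ in range(2000)]
        mean_acc = sum(a for a, _ in runs) / len(runs)
        mean_found = sum(f for _, f in runs) / len(runs)
        print(f"{name}: mean accuracy estimate {mean_acc:.3f}, "
              f"mean failures found {mean_found:.1f}")
```

Running the script compares uniform sampling with score-weighted sampling on the same synthetic population. Both estimators target the same true accuracy, while the weighted sample is expected to contain more failing examples whenever the score is informative about failure.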

