Abstract

Deep Learning (DL) systems are key enablers for engineering intelligent applications due to their ability to solve complex tasks such as image recognition and machine translation. Nevertheless, using DL systems in safety- and security-critical applications requires providing testing evidence for their dependable operation. Recent research in this direction focuses on adapting testing criteria from traditional software engineering as a means of increasing confidence in their correct behaviour. However, these criteria are inadequate for capturing the intrinsic properties exhibited by these systems. We bridge this gap by introducing DeepImportance, a systematic testing methodology accompanied by an Importance-Driven Coverage (IDC) test adequacy criterion for DL systems. Applying IDC enables engineers to establish a layer-wise functional understanding of the importance of DL system components and to use this information to assess the semantic diversity of a test set. Our empirical evaluation on several DL systems, across multiple DL datasets and with state-of-the-art adversarial generation techniques, demonstrates the usefulness and effectiveness of DeepImportance and its ability to support the engineering of more robust DL systems.

Highlights

  • Driven by the increasing availability of publicly-accessible data and massive parallel processing power, Deep Learning (DL) systems have achieved unprecedented progress, commensurate with the cognitive abilities of humans [28, 46]

  • The Importance-Driven Coverage (IDC) adequacy criterion instrumented by DeepImportance measures the adequacy of an input set as the ratio of combinations of important neuron clusters covered by the set

  • RQ2 (Diversity): Can DeepImportance inform the selection of a diverse test set? We investigate whether software engineers can employ the Importance-Driven Coverage to generate a diverse test set that comprises semantically different test inputs
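The coverage measure described in the highlights above can be sketched in a few lines. This is a minimal illustration only: it assumes the important neurons have already been identified, and it quantises each neuron's activation range into equal-width bins, which is a simplification of the clustering step; the function name and array shapes are hypothetical.

```python
import numpy as np

def importance_driven_coverage(train_acts, test_acts, n_clusters=3):
    """Illustrative sketch of Importance-Driven Coverage (IDC).

    train_acts / test_acts: arrays of shape (n_inputs, n_important_neurons)
    holding activations of already-selected important neurons. Equal-width
    binning stands in for the clustering of activation values.
    """
    n_neurons = train_acts.shape[1]
    # 1. Derive per-neuron bin edges from the training activations.
    edges = [np.linspace(train_acts[:, j].min(), train_acts[:, j].max(),
                         n_clusters + 1)[1:-1] for j in range(n_neurons)]

    def combo(acts):
        # Map one input's activations to a tuple of cluster indices.
        return tuple(int(np.digitize(acts[j], edges[j]))
                     for j in range(n_neurons))

    # 2. Total number of possible combinations of neuron clusters.
    total = n_clusters ** n_neurons
    # 3. Combinations actually exercised by the test set.
    covered = {combo(x) for x in test_acts}
    return len(covered) / total
```

A test set that exercises more distinct combinations of cluster values scores higher, which is how IDC rewards semantically diverse inputs rather than raw input count.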



Introduction

Driven by the increasing availability of publicly-accessible data and massive parallel processing power, Deep Learning (DL) systems have achieved unprecedented progress, commensurate with the cognitive abilities of humans [28, 46]. A neuron represents a computing unit that applies a nonlinear activation function to its inputs and transmits the result to neurons in the following layer [46]. A DL system’s architecture comprises the number of layers, neurons per layer, neuron activation functions and a cost function. Given such an architecture, the DL system carries out an iterative training process through which it consumes labelled input data (e.g., raw image pixels) in its input layer, executes a set of nonlinear transformations in its hidden layers to extract semantic concepts (i.e., features) from the input data, and generates a decision that matches the effect of these computations in its output layer. The training process aims at finding weight values that minimize the cost function, enabling the DL system to achieve high generalisability.
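The forward computation described above can be sketched as a toy fully-connected network: each hidden neuron applies a nonlinear activation to a weighted sum of its inputs, and the output layer produces the decision. The function names, shapes, and the choice of ReLU and softmax are illustrative assumptions, not a specific architecture from the paper.

```python
import numpy as np

def relu(z):
    # Nonlinear activation applied by each hidden neuron.
    return np.maximum(0.0, z)

def forward(x, layers):
    """Forward pass through a toy fully-connected network.

    layers: list of (W, b) weight/bias pairs. Hidden layers apply
    nonlinear transformations to extract features; the output layer
    turns the final scores into class probabilities via softmax.
    """
    a = x
    for W, b in layers[:-1]:
        a = relu(W @ a + b)      # hidden layer: nonlinear transform
    W, b = layers[-1]
    logits = W @ a + b           # output layer: decision scores
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()       # probabilities over classes
```

Training would then iteratively adjust the `(W, b)` pairs to minimize a cost function over the labelled inputs, e.g. via gradient descent, which is the process the paragraph above refers to.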

