Abstract

Context: A Machine Learning based System (MLS) is a software system including one or more components that learn how to perform a task from a given data set. The increasing adoption of MLSs in safety-critical domains such as autonomous driving, healthcare, and finance has fostered much attention towards the quality assurance of such systems. Despite the advances in software testing, MLSs bring novel and unprecedented challenges, since their behaviour is defined jointly by the code that implements them and the data used for training them.

Objective: To identify the existing solutions for functional testing of MLSs and classify them from three different perspectives: (1) the context of the problem they address, (2) their features, and (3) their empirical evaluation; to report demographic information about the ongoing research; and to identify open challenges for future research.

Method: We conducted a systematic mapping study about testing techniques for MLSs, driven by 33 research questions. We followed existing guidelines when defining our research protocol so as to increase the repeatability and reliability of our results.

Results: We identified 70 relevant primary studies, most of them published in recent years. We identified 11 problems addressed in the literature. We investigated multiple aspects of the testing approaches, such as the used/proposed adequacy criteria, the algorithms for test input generation, and the test oracles.

Conclusions: The most active research areas in MLS testing address automated scenario/input generation and test oracle creation. MLS testing is a rapidly growing and developing research area, with many open challenges, such as the generation of realistic inputs and the definition of reliable evaluation metrics and benchmarks.
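To make the oracle problem mentioned above concrete, the sketch below (not from the paper; all names and the toy classifier are illustrative assumptions) shows one common MLS oracle strategy surveyed in this line of work: a metamorphic relation. Since no exact expected output exists for a single ML input, the test instead checks that a label-preserving transformation of the input does not change the predicted class.

```python
# Illustrative sketch of a metamorphic test oracle for an ML classifier.
# The "model" here is a toy stand-in so the example is self-contained;
# in practice, predict() would wrap a trained model.

def predict(pixels):
    """Stand-in for a trained image classifier: returns a class label.
    Toy rule: class 1 if mean brightness > 0.5, else class 0."""
    return 1 if sum(pixels) / len(pixels) > 0.5 else 0

def brighten(pixels, delta):
    """A transformation assumed to preserve the true label when delta
    is small -- an assumption the tester must justify per domain."""
    return [min(1.0, p + delta) for p in pixels]

def metamorphic_test(pixels, delta):
    """Pass if the prediction is stable under the transformation."""
    return predict(pixels) == predict(brighten(pixels, delta))

image = [0.1, 0.2, 0.15, 0.1]        # a clearly dark image
assert metamorphic_test(image, 0.05)  # small brightening: relation holds
```

The point of the relation is that it sidesteps the need for a ground-truth label: a violation (prediction flips under a small, semantics-preserving change) signals a potential misbehaviour without any human-labelled expected output.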

Highlights

  • A key difficulty, captured by selection criterion MI C1 (About Machine Learning based System (MLS) testing), is that the database search string cannot distinguish between studies that use Machine Learning (ML) for testing and studies on testing of MLSs

  • We identify unique open challenges not reported by Sherin et al. (2019), among which: (1) evaluating whether inaccuracies of an isolated ML model have consequences that can be regarded as failures at the system level, and (2) generating inputs within the validity domain of the overall system in order to detect misbehaviours that can occur in the real world

Introduction

Humanity long dreamed about reproducing intelligence within artificial machines. Scientists did not wait long to investigate in this direction: in 1950 Alan Turing proposed his famous operational test to verify a machine's ability to exhibit intelligent behaviour indistinguishable from that of a human (Turing 2009). Unlike traditional software development, in which developers explicitly program the systems' behaviour, ML entails techniques that mimic the human ability to automatically learn how to perform tasks through training examples (Manning et al. 2008). Instances of such tasks include image processing, speech and audio recognition, and natural language processing. We give an overview of the relevant ML techniques as well as the challenges of applying classical testing approaches in the machine learning domain.


