Abstract

Deep learning (DL) has brought autonomous vehicles (AVs) close to reality. However, numerous observed safety issues have raised serious concerns about the reliability of AVs. To address this problem, much research has been devoted to testing DL-driven AVs. Generally, once a test input is produced, a developer must manually check its expected output. However, massive amounts of unlabeled test data often exist (e.g., raw context traces from the real world), and it is impractical to label all test inputs manually. Although some work addresses the automatic generation of test oracles, existing approaches are either task-specific or constrained to synthetic inputs. In this paper, we present a general and extensible framework, DeepSuite, to reduce the manual effort of generating test oracles. The underlying intuition is that not all test inputs are equally worth labelling. With a limited testing budget, it is desirable to label a test suite with high diversity and a reasonable size. Because the search space is large, optimizing such test suites is highly challenging. To address this challenge, DeepSuite employs a three-phase optimization method (i.e., selection, crossover, and mutation) to iteratively select representative but non-redundant test suites. These conflicting profit/cost objectives are balanced through a genetic algorithm with a well-defined multi-objective fitness function. In the experiments, we first show that the diversity of tests can be revealed by test criteria. Then, experiments on three widely used datasets demonstrate the effectiveness of DeepSuite in generating test suites with competitive testing coverage and 68.42% smaller size, which greatly improves the data collection efficiency of testing DL-driven autonomous vehicles.
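
To make the selection, crossover, and mutation loop concrete, the following is a minimal sketch of a genetic search over test-suite subsets. It is not DeepSuite's implementation: the abstract does not specify the fitness function, so the sketch assumes each test input is summarized by a set of covered items (e.g., activated neurons under some test criterion) and collapses the profit/cost objectives into a single weighted score. The names `fitness`, `evolve`, `coverage_sets`, and `size_weight`, along with all default parameter values, are hypothetical.

```python
import random


def fitness(mask, coverage_sets, n_items, size_weight=0.5):
    """Assumed scalarised objective: covered fraction (profit)
    minus a weighted suite-size fraction (cost)."""
    covered = set()
    for keep, items in zip(mask, coverage_sets):
        if keep:
            covered |= items
    return len(covered) / n_items - size_weight * sum(mask) / len(mask)


def evolve(coverage_sets, generations=50, pop_size=30,
           mutation_rate=0.05, seed=0):
    """Search for a small, high-coverage subset of tests.

    coverage_sets: non-empty list (length >= 2) of non-empty sets,
    one per candidate test input. Returns a boolean mask over tests.
    """
    rng = random.Random(seed)
    n = len(coverage_sets)
    n_items = len(set().union(*coverage_sets))

    def score(mask):
        return fitness(mask, coverage_sets, n_items)

    # Each individual is a boolean mask: True means "label this test".
    population = [[rng.random() < 0.5 for _ in range(n)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the fitter half of the population unchanged.
        population.sort(key=score, reverse=True)
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Crossover: one-point recombination of two parent masks.
            cut = rng.randrange(1, n)
            child = a[:cut] + b[cut:]
            # Mutation: flip each bit with a small probability.
            child = [(not g) if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = parents + children
    return max(population, key=score)
```

Under these assumptions, calling `evolve([{1, 2}, {1, 2}, {3}, {3}, {1, 2, 3}])` returns a mask that favors tests adding new coverage over redundant ones. A genuinely multi-objective formulation would keep coverage and size as separate objectives (for example, via Pareto ranking) rather than a single weighted sum.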

