An Initiative to Improve Reproducibility and Empirical Evaluation of Software Testing Techniques

Francisco G De Oliveira Neto, Richard Torkar, Patricia D L Machado

doi:10.1109/icse.2015.197

Abstract

Current concern about the quality of evaluation in existing studies reveals the need for methods and tools that assist in defining and executing empirical studies and experiments. However, when general methods from empirical software engineering are applied to specific fields, such as the evaluation of software testing techniques, new obstacles and threats to validity appear, hindering researchers' use of empirical methods. This paper discusses the issues specific to the evaluation of software testing techniques (STT) and proposes an initiative for a collaborative effort to encourage reproducibility of experiments evaluating STT. We also propose the development of a tool that enables automatic execution and analysis of experiments, producing reproducible research compendia as output that are, in turn, shared among researchers. There are many expected benefits from this endeavour, such as providing a foundation for the evaluation of existing and upcoming STT, and allowing researchers to devise and publish better experiments.

Full Text