STAMINA: a competition to encourage the development and assessment of software model inference techniques

Neil Walkinshaw, Christophe Damas, Kirill Bogdanov, Pierre Dupont, Bernard Lambeau

doi:10.1007/s10664-012-9210-3

Abstract

Models play a crucial role in the development and maintenance of software systems, but are often neglected during the development process due to the considerable manual effort required to produce them. In response to this problem, numerous techniques have been developed that seek to automate the model generation task with the aid of increasingly accurate algorithms from the domain of Machine Learning. From an empirical perspective, these are extremely challenging to compare; there are many factors that are difficult to control (e.g. the richness of the input and the complexity of subject systems), and numerous practical issues that are just as troublesome (e.g. tool availability). This paper describes the StaMInA (State Machine Inference Approaches) competition, which was designed to address these problems. The competition attracted numerous submissions, many of which were improved or adapted versions of techniques that had not been subjected to extensive empirical evaluations, and had not been evaluated with respect to their ability to infer models of software systems. This paper shows how many of these techniques substantially improve on the state of the art, providing insights into some of the factors that could underpin the success of the best techniques. In a more general sense, it demonstrates the potential for competitions to act as a useful basis for empirical software engineering by (a) spurring the development of new techniques and (b) facilitating their comparative evaluation to an extent that would usually be prohibitively challenging without the active participation of the developers.

Highlights

Models are crucial for the effective development and maintenance of software systems
This paper describes the StaMInA (State Machine Inference Approaches) competition, which is intended to provide an empirically sound basis for the comparison of techniques for the inference of models in the form of Deterministic Finite Automata
This section provides an introduction to the problem of state machine inference, and discusses the characteristics of software models that make them especially difficult to infer. It presents a brief overview of the Blue-Fringe algorithm (Lang et al. 1998), which has already been extensively used for software model inference (Damas et al. 2005; Dupont et al. 2008; Lambeau et al. 2008; Walkinshaw and Bogdanov 2008; Walkinshaw et al. 2007) and forms the baseline for this competition

Summary

Introduction

Models are crucial for the effective development and maintenance of software systems. Models of software behaviour, which are the subject of this paper, are valuable because they can form the basis for powerful automated techniques for tasks such as verification, validation and refinement (Lee and Yannakakis 1996; van Lamsweerde 2009). Despite their apparent advantages, models are often neglected during the software development process. Software behaviour is commonly modelled in sequential terms, i.e. the sequences of permissible and impermissible events or inputs/outputs that constitute its functionality. These sequences are usually modelled with the help of Deterministic Finite Automata (DFA). A DFA can be visualised as a directed graph, where states are the nodes, and transitions are the edges between them, labelled by their respective alphabet elements.
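To make the graph view of a DFA concrete, the following is a minimal Python sketch, not code from the paper or the competition. The `DFA` class, the `file_handle` example and its event names are hypothetical and chosen only for illustration. The sketch also assumes that a sequence with no matching transition is rejected, and that every state of this particular example is accepting; neither assumption is taken from the source.

```python
from typing import Dict, Iterable, List, Set, Tuple

State = str
Event = str


class DFA:
    """A DFA stored as a labelled directed graph (illustrative sketch only)."""

    def __init__(
        self,
        initial: State,
        accepting: Set[State],
        transitions: Dict[Tuple[State, Event], State],
    ) -> None:
        self.initial = initial
        self.accepting = accepting
        # Keying on (state, event) allows at most one target per pair,
        # which is what makes the automaton deterministic.
        self.transitions = transitions

    def edges(self) -> List[Tuple[State, Event, State]]:
        """Return the graph edges as sorted (source, label, target) triples."""
        return sorted(
            (src, event, dst) for (src, event), dst in self.transitions.items()
        )

    def accepts(self, events: Iterable[Event]) -> bool:
        """Walk the graph along the event sequence, one edge per event."""
        state = self.initial
        for event in events:
            nxt = self.transitions.get((state, event))
            if nxt is None:
                # Assumption: a missing edge marks an impermissible sequence.
                return False
            state = nxt
        return state in self.accepting


# Hypothetical model of a file handle: it must be opened before it is read.
file_handle = DFA(
    initial="closed",
    accepting={"closed", "open"},
    transitions={
        ("closed", "open"): "open",
        ("open", "read"): "open",
        ("open", "close"): "closed",
    },
)

print(file_handle.edges())
print(file_handle.accepts(["open", "read", "close"]))
print(file_handle.accepts(["read"]))
```

Here the permissible sequences are exactly the paths that can be followed from the initial state, while an impermissible sequence such as reading before opening fails at the first event that has no outgoing edge.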

Results

Discussion

Conclusion