The role and value of replication in empirical software engineering results

Martin Shepperd,Nemitari Ajienka,Steve Counsell

doi:10.1016/j.infsof.2018.01.006

Martin Shepperd, Nemitari Ajienka + Show 1 more

Open Access

https://doi.org/10.1016/j.infsof.2018.01.006

Copy DOI

Abstract

ContextConcerns have been raised from many quarters regarding the reliability of empirical research findings and this includes software engineering. Replication has been proposed as an important means of increasing confidence. ObjectiveWe aim to better understand the value of replication studies, the level of confirmation between replication and original studies, what confirmation means in a statistical sense and what factors modify this relationship. MethodWe perform a systematic review to identify relevant replication experimental studies in the areas of (i) software project effort prediction and (ii) pair programming. Where sufficient details are provided we compute prediction intervals. ResultsOur review locates 28 unique articles that describe replications of 35 original studies that address 75 research questions. Of these 10 are external, 15 internal and 3 internal-same-article replications. The odds ratio of internal to external (conducted by independent researchers) replications of obtaining a ‘confirmatory’ result is 8.64. We also found incomplete reporting hampered our ability to extract estimates of effect sizes. Where we are able to compute replication prediction intervals these were surprisingly large. ConclusionWe show that there is substantial evidence to suggest that current approaches to empirical replications are highly problematic. There is a consensus that replications are important, but there is a need for better reporting of both original and replicated studies. Given the low power and incomplete reporting of many original studies, it can be unclear the extent to which a replication is confirmatory and to what extent it yields additional knowledge to the software engineering community. We recommend attention is switched from replication research to meta-analysis.

Full Text