A Method to Assess and Argue for Practical Significance in Software Engineering

Richard Torkar, Francisco Gomes De Oliveira Neto, Neil A Ernst, Per Lenberg, Robert Feldt, Carlo A Furia, Lucas Gren

doi:10.1109/tse.2020.3048991

Abstract

A key goal of empirical research in software engineering is to assess practical significance, which answers the question whether the observed effects of some compared treatments show a relevant difference in practice in realistic scenarios. Even though plenty of standard techniques exist to assess statistical significance, connecting it to practical significance is not straightforward or routinely done; indeed, only a few empirical studies in software engineering assess practical significance in a principled and systematic way. In this paper, we argue that Bayesian data analysis provides suitable tools to assess practical significance rigorously. We demonstrate our claims in a case study comparing different test techniques. The case study's data was previously analyzed (Afzal et al., 2015) using standard techniques focusing on statistical significance. Here, we build a multilevel model of the same data, which we fit and validate using Bayesian techniques. Our method is to apply cumulative prospect theory on top of the statistical model to quantitatively connect our statistical analysis output to a practically meaningful context. This is then the basis both for assessing and arguing for practical significance. Our study demonstrates that Bayesian analysis provides a technically rigorous yet practical framework for empirical software engineering. A substantial side effect is that any uncertainty in the underlying data will be propagated through the statistical model, and its effects on practical significance are made clear. Thus, in combination with cumulative prospect theory, Bayesian analysis supports seamlessly assessing practical significance in an empirical software engineering context, thus potentially clarifying and extending the relevance of research for practitioners.

Highlights

A main goal of research in empirical software engineering (ESE) is assessing practical significance: what is the impact of the research findings in realistic scenarios? To this end, statistical analysis has been used extensively in ESE for decades.
We demonstrate how to build a rigorous model of practical significance, and the advantages of reporting significance results in a way that is grounded in concrete decision-making scenarios, in contrast to the traditional approach that presents general statistics in a more abstract form.
In order to present the background of our research, we introduce the essential terminology and techniques of Bayesian statistical analysis and cumulative prospect theory (CPT).
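
As a rough illustration of what CPT computes, the sketch below implements the widely used parametric form from Tversky and Kahneman (1992): a value function that is concave for gains and steeper for losses, and an inverse-S probability weighting function applied to cumulative probabilities. This is an assumption made for illustration, not the model fitted in the paper. The parameter values are the 1992 median estimates, and the function names and the example gamble (hours saved or lost by adopting a technique) are hypothetical.

```python
# Illustrative CPT sketch; parameters are the Tversky and Kahneman (1992)
# median estimates, assumed here and not taken from the paper.
ALPHA = 0.88   # curvature of the value function for gains
BETA = 0.88    # curvature of the value function for losses
LAMBDA = 2.25  # loss aversion: losses weigh about 2.25 times as much as gains
GAMMA = 0.61   # probability weighting curvature for gains
DELTA = 0.69   # probability weighting curvature for losses


def value(x):
    """Subjective value of an outcome x relative to a reference point of 0."""
    if x >= 0:
        return x ** ALPHA
    return -LAMBDA * ((-x) ** BETA)


def weight(p, c):
    """Inverse-S probability weighting function with curvature c."""
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    return p ** c / (p ** c + (1.0 - p) ** c) ** (1.0 / c)


def cpt_value(outcomes):
    """CPT value of a prospect given as (outcome, probability) pairs.

    Gains are ranked from best to worst and losses from worst to best;
    each outcome receives the difference of weighted cumulative probabilities.
    """
    gains = sorted((o for o in outcomes if o[0] >= 0), key=lambda o: -o[0])
    losses = sorted((o for o in outcomes if o[0] < 0), key=lambda o: o[0])
    total = 0.0
    for group, c in ((gains, GAMMA), (losses, DELTA)):
        cumulative = 0.0
        for x, p in group:
            decision_weight = weight(cumulative + p, c) - weight(cumulative, c)
            total += decision_weight * value(x)
            cumulative += p
    return total


# Hypothetical prospect: 50% chance of saving 100 hours, 50% chance of losing 100.
print(cpt_value([(100, 0.5), (-100, 0.5)]))
```

With these parameters a symmetric gamble has a negative CPT value, because losses are weighted more heavily than equal gains. That asymmetry is what lets a CPT layer express practical preferences that a raw effect size does not capture.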

Summary

Introduction

A main goal of research in empirical software engineering (ESE) is assessing practical significance: what is the impact of the research findings in realistic scenarios? To this end, statistical analysis has been used extensively in ESE for decades. A common example is effect size measures (such as Cohen’s d, or the size of coefficients in a regression model): if the effect size of a technique A is markedly bigger than that of another technique B, this is taken as an indication that A performs better than B in practice. This common approach overlooks a problem: when practical significance is assessed only on statistical measures such as effect sizes, it is hard to ensure that the statistics accurately reflect expert knowledge. In Bayesian statistics, inference is expressed as estimating the probability P(θ | D) of the parameters θ given the data D, which obeys Bayes’ theorem:

$$P(\theta \mid D) = \frac{P(D \mid \theta)\, P(\theta)}{P(D)}$$

where P(D | θ) is the likelihood, P(θ) is the prior, and P(D) is the marginal probability of the data.
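
To make Bayes’ theorem concrete, the following minimal sketch approximates a posterior on a grid for a single proportion θ. It is an illustration only and much simpler than the multilevel model described in the abstract. The scenario (a test technique that detects a fault in 7 of 10 trials), the flat prior, and the function name `grid_posterior` are all hypothetical.

```python
import math


def grid_posterior(successes, trials, grid_size=101):
    """Approximate P(theta | D) for a binomial proportion on an even grid."""
    thetas = [i / (grid_size - 1) for i in range(grid_size)]
    # Flat prior P(theta): every grid value is equally plausible beforehand.
    prior = [1.0 for _ in thetas]
    # Binomial likelihood P(D | theta) of the observed successes.
    likelihood = [
        math.comb(trials, successes)
        * t ** successes
        * (1.0 - t) ** (trials - successes)
        for t in thetas
    ]
    unnormalized = [l * p for l, p in zip(likelihood, prior)]
    # P(D), approximated by summing over the grid, normalizes the posterior.
    evidence = sum(unnormalized)
    posterior = [u / evidence for u in unnormalized]
    return thetas, posterior


# Hypothetical data: 7 detections in 10 trials.
thetas, posterior = grid_posterior(successes=7, trials=10)
# Posterior probability that the detection rate exceeds 50%.
prob_above_half = sum(p for t, p in zip(thetas, posterior) if t > 0.5)
print(prob_above_half)
```

The output is a full distribution over θ, not a point estimate, so the uncertainty in the data carries over directly into statements such as "the detection rate is probably above 50%".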

Journal: IEEE Transactions on Software Engineering	Publication Date: Jan 20, 2021
Citations: 13	License type: CC BY 4.0
