Abstract

Software engineering researchers do not apply the term “case study”, or its definition, consistently. We previously developed a trivial “smell indicator” to help detect the misclassification of primary studies as case studies. To evaluate the indicator, we compare its performance against that of human classifiers on three datasets: two comprising classifications by the authors of both systematic literature studies and primary studies, and one comprising only the classifications of primary-study authors. The indicator outperforms the human classifiers on all three datasets. It succeeds because human classifiers “fail” to correctly classify their own, and others’, primary studies. Consequently, reviewers of primary studies and authors of systematic literature studies could use the indicator as a “sanity” check on primary studies. Moreover, authors could use the indicator to double-check their classification of a study, both during their analysis and before submitting their manuscript for publication. We challenge the research community both to beat the indicator and to improve its ability to identify true case studies.
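The abstract does not describe how the indicator works, only that it is “trivial”. Purely as an illustration, a minimal sketch of a keyword-based smell indicator might look like the following; the term lists, decision rule, and function name are hypothetical assumptions for this sketch, not the authors’ published indicator.

```python
# Hypothetical sketch of a keyword-based "smell indicator" for
# misclassified case studies. The term lists and decision rule are
# illustrative assumptions, not the published indicator.

# Terms associated with case study methodology (assumed).
CASE_STUDY_TERMS = [
    "unit of analysis",
    "case selection",
    "triangulation",
    "real-life context",
]

# Terms suggesting a different study type was mislabelled (assumed).
OTHER_METHOD_TERMS = [
    "controlled experiment",
    "randomly assigned",
    "treatment group",
    "simulation run",
]


def smells_misclassified(text: str) -> bool:
    """Return True if a paper calling itself a 'case study' shows
    signs of being a different kind of primary study."""
    lower = text.lower()
    claims_case_study = "case study" in lower
    has_methodology = any(t in lower for t in CASE_STUDY_TERMS)
    has_other_method = any(t in lower for t in OTHER_METHOD_TERMS)
    # Smell: self-identifies as a case study but lacks case-study
    # methodology terms, or uses the vocabulary of another method.
    return claims_case_study and (not has_methodology or has_other_method)


if __name__ == "__main__":
    sample = (
        "We report a case study in which participants were randomly "
        "assigned to a treatment group and a control group."
    )
    print(smells_misclassified(sample))  # True: experiment vocabulary
```

A check of this kind could serve as the “sanity” check the abstract describes: cheap to run over a manuscript or a set of candidate primary studies, with flagged papers then reviewed by a human.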
