Predicting Aggregated User Satisfaction in Software Projects

Łukasz Radliński

doi:10.1515/fcds-2018-0017

Abstract

Abstract User satisfaction is an important feature of software quality. However, it was rarely studied in software engineering literature. By enhancing earlier research this paper focuses on predicting user satisfaction with machine learning techniques using software development data from an extended ISBSG dataset. This study involved building, evaluating and comparing a total of 15,600 prediction schemes. Each scheme consists of a different combination of its components: manual feature preselection, handling missing values, outlier elimination, value normalization, automated feature selection, and a classifier. The research procedure involved a 10-fold cross-validation and separate testing, both repeated 10 times, to train and to evaluate each prediction scheme. Achieved level of accuracy for best performing schemes expressed by Matthews correlation coefficient was about 0.5 in the cross-validation and about 0.5–0.6 in the testing stage. The study identified the most accurate settings for components of prediction schemes.

Full Text