Abstract

Knowledge discovery from software engineering measurement data is essential for drawing the right conclusions from experiments. Different data analysis techniques can give analysts different and complementary insights into the studied phenomena. In this paper, two data analysis techniques, Rough Sets (RSs) and Logistic Regression (LR), are compared from both a theoretical and an experimental point of view. The empirical study was performed as part of the ESPRIT/ESSI project CEMP on a real-life maintenance project, the DATATRIEVE™ project carried out at Digital Engineering Italy. We applied both techniques to the same data set. The goal of the experimental study was to predict module fault-proneness and to determine the major factors affecting software reliability in the application context. The results obtained with each technique are discussed and compared. A hybrid approach is then built by integrating the different and complementary knowledge about module fault-proneness obtained from each technique. This knowledge can be reused in the organizational framework of a company-wide experience factory.
