ATTRIBUTE SELECTION USING ROUGH SETS IN SOFTWARE QUALITY CLASSIFICATION

Lofton A Bullard, Kehan Gao, Taghi M Khoshgoftaar

doi:10.1142/s0218539309003307

Abstract

Reducing software development effort while producing highly reliable software is a vital goal for software developers. One method that has proven useful is the application of software metrics-based classification models, which can be constructed to identify faulty components in a software system with high accuracy. Significant research has been dedicated to improving the quality of these models. Several studies have shown that their accuracy improves when irrelevant attributes are identified and eliminated from the training data set. This study presents a rough set theory approach, based on classical set theory, for identifying and eliminating irrelevant attributes from a training data set. Rough set theory is used to find small groups of attributes, determined by the relationships that exist between the objects in a data set, whose discernibility is comparable to that of larger sets of attributes. This allows for the development of simpler classification models that are easy for analysts to understand and explain to others. We built case-based reasoning models to evaluate classification performance on the smaller subsets of attributes selected using rough set theory. The empirical studies demonstrated that by applying a rough set approach to find small subsets of attributes, we can build case-based reasoning models with accuracy comparable to, and in some cases better than, that of a case-based reasoning model built with the complete set of attributes.
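
To make the idea of attribute subsets with comparable discernibility concrete, the following is a minimal sketch and not the procedure used in this study. It assumes the software metrics have already been discretized, and it uses an exhaustive search that is only practical for a handful of attributes. The function names, metric names (loc, cplx, churn), class labels (fp, nfp), and toy data are all hypothetical illustrations.

```python
from itertools import combinations

def partition(rows, attrs):
    """Group row indices into indiscernibility classes over the given attributes."""
    classes = {}
    for i, row in enumerate(rows):
        key = tuple(row[a] for a in attrs)
        classes.setdefault(key, []).append(i)
    return list(classes.values())

def dependency(rows, labels, attrs):
    """Fraction of objects lying in classes that carry a single label
    (the relative size of the rough set positive region)."""
    consistent = 0
    for block in partition(rows, attrs):
        if len({labels[i] for i in block}) == 1:
            consistent += len(block)
    return consistent / len(rows)

def smallest_subsets(rows, labels, attrs):
    """Return every smallest attribute subset whose dependency degree
    equals that of the full attribute set (brute-force search)."""
    target = dependency(rows, labels, attrs)
    for size in range(1, len(attrs) + 1):
        found = [subset for subset in combinations(attrs, size)
                 if dependency(rows, labels, subset) == target]
        if found:
            return found
    return []

# Hypothetical toy data: five modules described by three discretized metrics.
ROWS = [
    {"loc": "high", "cplx": "high", "churn": "low"},
    {"loc": "high", "cplx": "low",  "churn": "low"},
    {"loc": "low",  "cplx": "high", "churn": "high"},
    {"loc": "low",  "cplx": "low",  "churn": "high"},
    {"loc": "high", "cplx": "high", "churn": "high"},
]
# Hypothetical labels: fault-prone (fp) or not fault-prone (nfp).
LABELS = ["fp", "nfp", "fp", "nfp", "fp"]

if __name__ == "__main__":
    print(smallest_subsets(ROWS, LABELS, ["loc", "cplx", "churn"]))
```

In this toy data the single attribute cplx separates the two classes as well as all three attributes together, so a model built on it alone would lose no discernibility. Real metric sets would need a heuristic search rather than enumeration of all subsets.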

Full Text