An efficient classifier design integrating rough set and set oriented database operations

Asit Kumar Das, Jaya Sil

doi:10.1016/j.asoc.2010.08.008

Abstract

Feature subset selection and dimensionality reduction are fundamental and among the most explored research areas in machine learning and data mining. Rough set theory (RST) provides a sound basis for data mining and can be used at different phases of the knowledge discovery process. In this paper, by integrating the concepts of RST and relational algebra operations, a new attribute reduction algorithm is presented to select minimal attribute sets, called reducts, required for classification of data. First, the conditional attributes are partitioned into groups according to their scores, calculated using the projection (Π) and division (÷) operations of relational algebra. The groups are sorted in ascending order of score, and the first group, which contains the maximum information, is used alone to generate reducts. The non-reduct attributes are then combined with the elements of the next group, and the modified group is considered for computing reducts. The process continues until all groups are exhausted, yielding a final set of reducts. A decision tree algorithm is then applied to each reduct to generate decision rule sets, which are later pruned by removing extraneous components. Finally, using concepts from probability theory and graph theory, a minimum number of rules is obtained and used to build an efficient classifier.
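To make the notion of a reduct concrete, the following minimal sketch uses relational projection to test whether a subset of conditional attributes still determines the decision attribute. It is an illustration of the general rough-set idea only, not the algorithm of the paper: the abstract does not give the scoring formula, so the score-based grouping and the division (÷) operation are not modelled, and the search here is a plain exhaustive one. The names `project`, `is_consistent`, `minimal_reducts` and the toy table `TABLE` are hypothetical, and the table is assumed to be consistent over its full attribute set.

```python
from itertools import combinations

# Hypothetical toy decision table: conditional attributes a, b, c; decision d.
TABLE = [
    {"a": 0, "b": 0, "c": 0, "d": "no"},
    {"a": 0, "b": 1, "c": 0, "d": "yes"},
    {"a": 1, "b": 0, "c": 1, "d": "yes"},
    {"a": 1, "b": 1, "c": 1, "d": "yes"},
]


def project(rows, attrs):
    """Relational projection: the set of distinct tuples over attrs."""
    return {tuple(row[a] for a in attrs) for row in rows}


def is_consistent(rows, attrs, decision):
    """True if attrs functionally determine the decision attribute.

    Adding the decision column to the projection must not create new
    distinct tuples; otherwise two rows agree on attrs but differ on
    the decision.
    """
    with_decision = list(attrs) + [decision]
    return len(project(rows, attrs)) == len(project(rows, with_decision))


def minimal_reducts(rows, cond_attrs, decision):
    """Exhaustively find minimal attribute subsets that stay consistent.

    Assumption: brute-force search by increasing subset size, which is
    exponential in the number of attributes and only suitable for
    small examples.
    """
    found = []
    for size in range(1, len(cond_attrs) + 1):
        for subset in combinations(cond_attrs, size):
            # Skip supersets of an already found reduct (not minimal).
            if any(set(r) <= set(subset) for r in found):
                continue
            if is_consistent(rows, subset, decision):
                found.append(subset)
    return found


if __name__ == "__main__":
    print(minimal_reducts(TABLE, ["a", "b", "c"], "d"))
```

On this toy table no single attribute determines `d`, while the pairs `(a, b)` and `(b, c)` do, so both are reported as reducts and the full set `(a, b, c)` is skipped as non-minimal.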
