A Comparative Study on Decision Rule Induction for incomplete data using Rough Set and Random Tree Approaches

M Sandhya

doi:10.9790/0661-0930610

Abstract

Handling missing attribute values is one of the most challenging tasks in data analysis, and many approaches can be adopted to deal with them. In this paper, a comparative analysis is made on an incomplete dataset for future prediction, using the rough set approach and random tree generation in data mining. The result of a simple classification technique (the random tree classifier) is compared with the result of rough set attribute reduction based on rule induction and decision trees. WEKA (Waikato Environment for Knowledge Analysis), a data mining tool, and ROSE2 (Rough Set Data Explorer), a rough set tool, were used for the experiment. The experiment shows that the random tree classification algorithm gives promising results with high accuracy and produces the best decision rules, via a decision tree, when applied to the original incomplete data, that is, with the missing attribute values simply ignored. In the rough set approach, by contrast, the missing attribute values are filled with the most common value in that attribute's domain. The paper concludes that simply ignoring missing data yields better decisions than filling in substitute values for the missing attribute values.
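
The most-common-value filling mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the ROSE2 implementation; the function names, the toy records, and the use of "?" as the missing-value marker (the convention in WEKA's ARFF files) are assumptions made for illustration only.

```python
from collections import Counter

MISSING = "?"  # assumed missing-value marker, as used in WEKA's ARFF format


def most_common_value(rows, attr):
    """Return the most frequent non-missing value of `attr`.

    Falls back to MISSING when every value of the attribute is missing.
    """
    counts = Counter(row[attr] for row in rows if row[attr] != MISSING)
    if not counts:
        return MISSING
    return counts.most_common(1)[0][0]


def fill_most_common(rows, attrs):
    """Return copies of `rows` with missing values in `attrs` replaced
    by the most common value of the corresponding attribute."""
    modes = {attr: most_common_value(rows, attr) for attr in attrs}
    filled = []
    for row in rows:
        new_row = dict(row)  # copy so the original records stay untouched
        for attr in attrs:
            if new_row[attr] == MISSING:
                new_row[attr] = modes[attr]
        filled.append(new_row)
    return filled


# Hypothetical toy records, not data from the paper.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "?", "windy": "yes", "play": "no"},
    {"outlook": "sunny", "windy": "?", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
]

filled = fill_most_common(rows, ["outlook", "windy"])
```

The alternative compared in the paper, ignoring missing values, would correspond to passing the original `rows` to the classifier unchanged and letting the learner handle the "?" entries itself.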
