Abstract

Handling missing attribute values is one of the most challenging tasks in data analysis, and many approaches can be adopted to handle them. In this paper, a comparative analysis is made on an incomplete dataset for future prediction, using the rough set approach and random tree generation in data mining. The result of a simple classification technique (the random tree classifier) is compared with the result of rough set attribute reduction based on rule induction and decision trees. WEKA (Waikato Environment for Knowledge Analysis), a data mining tool, and ROSE2 (Rough Set Data Explorer), a rough set tool, were used for the experiment. The experiment shows that the random tree classification algorithm gives promising results, with the highest accuracy, and produces the best decision rules using a decision tree on the original incomplete data, that is, with the missing attribute values simply ignored. In the rough set approach, by contrast, the missing attribute values are filled with the most common value of that attribute's domain. The paper concludes that simply ignoring the missing data yields better decisions than filling in values in place of the missing attribute values.
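The "most common value" filling described for the rough set approach can be illustrated with a minimal sketch. This is not the ROSE2 implementation. The function name `fill_most_common`, the use of `None` as the missing-value marker, and the toy rows are all assumptions made for illustration.

```python
from collections import Counter

MISSING = None  # assumed marker for a missing attribute value


def fill_most_common(rows, missing=MISSING):
    """Return a copy of rows in which each missing cell is replaced by the
    most common observed value of its column (attribute)."""
    if not rows:
        return []
    n_cols = len(rows[0])
    modes = []
    for col in range(n_cols):
        observed = [row[col] for row in rows if row[col] is not missing]
        # A column with no observed values at all stays missing.
        modes.append(Counter(observed).most_common(1)[0][0] if observed else missing)
    return [
        [modes[col] if value is missing else value for col, value in enumerate(row)]
        for row in rows
    ]


# Toy data invented for this sketch: two attributes and a decision column.
toy_rows = [
    ["sunny", "high", "no"],
    ["rainy", None, "yes"],
    [None, "high", "yes"],
    ["sunny", "low", "yes"],
]
filled = fill_most_common(toy_rows)
```

The alternative compared in the paper leaves `toy_rows` as they are and lets the classifier deal with the missing cells, so no preprocessing step is needed in that case.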
