Abstract

Rough set data analysis (RSDA) has recently become a frequently studied symbolic method in data mining. Among other things, it is being used for the extraction of rules from databases; it is, however, not clear from within the methods of rough set analysis, whether the extracted rules are valid.In this paper, we suggest to enhance RSDA by two simple statistical procedures, both based on randomization techniques, to evaluate the validity of prediction based on the approximation quality of attributes of rough set dependency analysis. The first procedure tests the casualness of a prediction to ensure that the prediction is not based on only a few (casual) observations. The second procedure tests the conditional casualness of an attribute within a prediction rule.The procedures are applied to three data sets, originally published in the context of rough set analysis. We argue that several claims of these analyses need to be modified because of lacking validity, and that other possibly significant results were overlooked.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.