Rough set theory is an important approach to dealing with uncertainty in data mining. However, Pawlak’s classical rough set has low fault tolerance when approximating concepts from knowledge granules, which can reduce classification accuracy in practical applications. To address this problem, this paper proposes a novel sequential rough set model that is proved to be a conservative extension of Pawlak’s classical rough set. As a result, it improves on the classical model's fault tolerance, classification accuracy, and concept approximation accuracy without any additional assumptions. Based on the properties and theoretical analysis of the proposed model, an algorithm is presented that automatically determines the sequential thresholds and computes the three regions for a given concept. Experiments on real data verify the validity of the algorithm and show a stable improvement in both types of accuracy.
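For orientation, the three regions of a concept under Pawlak's classical model (the baseline that the abstract extends, not the proposed sequential model) can be sketched as follows. The function name `three_regions`, the toy table, and the dictionary-based data layout are illustrative assumptions and do not come from the paper.

```python
from collections import defaultdict


def three_regions(table, concept):
    """Classical Pawlak regions of `concept` (hypothetical helper).

    table:   dict mapping object id -> tuple of attribute values
    concept: set of object ids forming the target concept
    Returns (positive, boundary, negative) as sets of object ids.
    """
    # Objects with identical descriptions are indiscernible and
    # form one knowledge granule (equivalence class).
    granules = defaultdict(set)
    for obj, description in table.items():
        granules[description].add(obj)

    positive, boundary, negative = set(), set(), set()
    for block in granules.values():
        if block <= concept:        # granule lies entirely inside the concept
            positive |= block
        elif block & concept:       # granule only partly overlaps the concept
            boundary |= block
        else:                       # granule is disjoint from the concept
            negative |= block
    return positive, boundary, negative


# Toy data (assumed for illustration): one attribute, five objects.
table = {1: ("a",), 2: ("a",), 3: ("b",), 4: ("b",), 5: ("c",)}
concept = {1, 2, 3}
pos, bnd, neg = three_regions(table, concept)
print(pos, bnd, neg)
```

In this sketch a granule is assigned to the positive region only if every one of its objects belongs to the concept, so a single exception pushes the whole granule into the boundary. That strictness is the low fault tolerance the abstract refers to.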