Abstract

Data reduction, which shrinks the original data by selecting its most representative information, is an important data preprocessing technique. Large-scale data sets are now common, and the development of data reduction techniques for them has attracted much attention. As a powerful tool for handling uncertainty in real-valued data, fuzzy rough set theory has been widely applied to data reduction, yielding many feature selection methods and some instance selection approaches. Nevertheless, little work has been devoted to the simultaneous selection of features and instances based on fuzzy rough sets. In this paper, we investigate fuzzy rough set-based bi-selection for data reduction. Specifically, unified concepts of the importance degrees of fuzzy granules are presented and used to select the representative instances first and then the critical features. An instance selection algorithm with a noise elimination technique first removes the noise and then selects the representative instances according to the importance degrees of fuzzy granules. Next, importance-degree-preserved attribute reduction is proposed, together with a corresponding feature selection algorithm that uses a wrapper technique to search for a best-performing feature subset. Lastly, the bi-selection method based on fuzzy rough sets (BSFRS) is obtained by integrating the instance selection and feature selection methods. Numerical experiments are conducted to assess BSFRS, and the results show that it performs well in terms of effectiveness.
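
To make the instance-then-feature pipeline more concrete, the following is a minimal Python sketch of bi-selection on a fuzzy rough basis. It is not the BSFRS algorithm: the abstract does not define the importance degree of a fuzzy granule, so the sketch substitutes the standard fuzzy lower approximation membership, a plain threshold in place of the noise elimination step, and a greedy dependency-based search in place of the wrapper. All function names, the similarity relation, and the threshold value are assumptions made for illustration.

```python
# Illustrative stand-in, NOT the paper's BSFRS algorithm. Feature values are
# assumed to be scaled to [0, 1] and class labels are assumed to be crisp.

def similarity(x, z, features):
    """Fuzzy similarity of two instances: the weakest per-feature agreement."""
    return min(max(0.0, 1.0 - abs(x[a] - z[a])) for a in features)


def lower_membership(i, X, y, features):
    """Membership of instance i in the fuzzy lower approximation of its class.

    With crisp labels this is the minimum of 1 - similarity over all
    instances of a different class: high when i is far from other classes.
    """
    gaps = [1.0 - similarity(X[i], X[j], features)
            for j in range(len(X)) if y[j] != y[i]]
    return min(gaps) if gaps else 1.0


def select_instances(X, y, features, threshold):
    """Keep the indices of instances whose membership reaches the threshold."""
    return [i for i in range(len(X))
            if lower_membership(i, X, y, features) >= threshold]


def dependency(X, y, features):
    """Average lower approximation membership over all instances."""
    return sum(lower_membership(i, X, y, features)
               for i in range(len(X))) / len(X)


def select_features(X, y, tol=1e-9):
    """Greedy forward search: add the feature that raises dependency most."""
    if not X:
        return []  # nothing to select from
    remaining = list(range(len(X[0])))
    chosen, best = [], 0.0
    while remaining:
        score, feat = max(((dependency(X, y, chosen + [a]), a)
                           for a in remaining), key=lambda pair: pair[0])
        if score <= best + tol:
            break  # no remaining feature improves the dependency
        chosen.append(feat)
        remaining.remove(feat)
        best = score
    return chosen


def bi_select(X, y, threshold):
    """Select instances first, then features on the reduced data."""
    all_features = list(range(len(X[0])))
    kept = select_instances(X, y, all_features, threshold)
    X_kept = [X[i] for i in kept]
    y_kept = [y[i] for i in kept]
    return kept, select_features(X_kept, y_kept)


# Toy data: feature 0 separates the classes, feature 1 is constant.
X = [[0.0, 0.5], [0.1, 0.5], [0.9, 0.5], [1.0, 0.5]]
y = [0, 0, 1, 1]
kept, features = bi_select(X, y, threshold=0.85)
print(kept, features)
```

On this toy data the constant second feature is never selected, since adding it leaves the dependency unchanged. The threshold step also shows why a dedicated noise elimination technique matters: a mislabeled instance lowers the membership of its correctly labeled neighbours as well, so a plain threshold alone would discard them too.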
