Abstract

Feature selection is an important preprocessing step in data mining and pattern recognition. The neighborhood rough set (NRS) model is a widely used rough set model for feature selection on continuous data. All currently known NRS models are defined on a distance metric, mostly the Euclidean distance, which makes them unsuitable in scenarios where the Euclidean distance is ineffective, for example when attributes carry different weights. We first introduce the concept of granular-rectangular space division, and then construct the neighborhood radius in our method by describing the relationship between child and parent spaces. This avoids the use of a distance metric and reduces the search space for the neighborhood radius, which greatly improves both the accuracy and the efficiency of NRS. In addition, the upper and lower approximations of granular-rectangular rough sets (GRRSs) consist of equivalence classes, so GRRS performs better in knowledge representation than the traditional NRS. Experimental results on public benchmark datasets show that our method, GRRS, achieves higher accuracy than ten popular and state-of-the-art feature-selection methods, including two NRS algorithms. Moreover, GRRS is more efficient than the established NRS algorithms, including the state-of-the-art NRS algorithm, GBNRS. All code has been released as an open library called GRRS: https://github.com/syxiaa/GRRS.
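
For context, the distance-based construction that the abstract contrasts GRRS with can be sketched as follows. This is a minimal illustration of a classical Euclidean neighborhood rough set, not the GRRS method and not the code in the linked repository. The function names, the toy data, and the radius `delta` are assumptions made for illustration only.

```python
from math import dist  # Euclidean distance, Python 3.8+


def neighborhood(samples, i, delta):
    """Indices of all samples within Euclidean distance delta of sample i."""
    return {j for j, y in enumerate(samples) if dist(samples[i], y) <= delta}


def positive_region(samples, labels, delta):
    """Samples whose whole delta-neighborhood carries their own label.

    This is the union of the lower approximations of the decision classes.
    """
    pos = set()
    for i in range(len(samples)):
        if all(labels[j] == labels[i] for j in neighborhood(samples, i, delta)):
            pos.add(i)
    return pos


def dependency(samples, labels, delta):
    """Fraction of samples in the positive region (feature-subset quality)."""
    return len(positive_region(samples, labels, delta)) / len(samples)


# Hypothetical toy data: two features per sample, two decision classes.
samples = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.0), (1.0, 1.0), (0.9, 1.0)]
labels = ["a", "a", "b", "b", "b"]

print(sorted(positive_region(samples, labels, 0.15)))  # [0, 3, 4]
print(dependency(samples, labels, 0.15))               # 0.6
```

Samples 1 and 2 lie within `delta` of each other but carry different labels, so both fall outside the positive region. Every step depends on the choice of metric and radius, which is the dependence the abstract says GRRS avoids.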
