Abstract
Feature selection is an important preprocessing step in data mining and pattern recognition. The neighborhood rough set (NRS) model is a widely used rough set model for feature selection on continuous data. All currently known NRS models are defined on a distance metric, mostly the Euclidean distance, which makes them unsuitable in scenarios where the Euclidean distance is ineffective, for example, when attributes carry different weights. We first introduce the concept of granular-rectangular space division, and then construct the neighborhood radius in our method by describing the relationship between child and parent spaces. This avoids the use of a distance metric and reduces the search space for the neighborhood radius, which greatly improves both the accuracy and efficiency of NRS. In addition, the upper and lower approximations of the granular-rectangular rough set (GRRS) consist of equivalence classes, which gives GRRS better performance in knowledge representation than the traditional NRS. Experimental results on public benchmark datasets show that our method, GRRS, achieves higher accuracy than ten popular and state-of-the-art feature-selection methods, including two NRS algorithms. Moreover, GRRS is more efficient than the established NRS algorithms, including the state-of-the-art NRS algorithm, GBNRS. All code has been released as an open library called GRRS: https://github.com/syxiaa/GRRS.
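For readers unfamiliar with the distance-based construction that the abstract contrasts against, the following is a minimal sketch of the classical NRS positive region under the Euclidean metric. It is not the GRRS method and is not taken from the released library; the function names `neighborhood`, `positive_region`, and `dependency`, the radius parameter `delta`, and the toy data are all illustrative assumptions.

```python
import math

def neighborhood(X, i, delta):
    """Indices of samples within Euclidean distance delta of sample i."""
    return [j for j, x in enumerate(X) if math.dist(X[i], x) <= delta]

def positive_region(X, y, delta):
    """Samples whose entire delta-neighborhood shares their own label.

    This is the union of the lower approximations of the decision
    classes in the classical (distance-based) NRS formulation.
    """
    return [i for i in range(len(X))
            if all(y[j] == y[i] for j in neighborhood(X, i, delta))]

def dependency(X, y, delta):
    """Fraction of samples in the positive region; a common feature-quality score."""
    return len(positive_region(X, y, delta)) / len(X)

# Hypothetical toy data: two tight clusters and one sample between them.
X = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.1, 1.0), (0.5, 0.5)]
y = [0, 0, 1, 1, 0]

small = dependency(X, y, delta=0.2)  # every neighborhood is label-pure
large = dependency(X, y, delta=0.8)  # neighborhoods start to mix labels
```

In this sketch the score depends on both the choice of metric and the radius `delta`, which is the dependence the abstract says GRRS avoids by deriving the radius from the relationship between child and parent spaces.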
More From: IEEE Transactions on Knowledge and Data Engineering