Abstract

Feature selection aims to select the features most relevant to classification from raw datasets in order to improve the performance of a learning model, and it has attracted extensive attention. Fuzzy rough set theory is a powerful mathematical tool for feature selection. However, the classical fuzzy rough set model is very sensitive to noise, and noisy samples are common in classification data. In addition, fuzzy rough set theory does not fit well when the density distribution of the samples in a dataset varies greatly. It is therefore important to improve both the robustness of fuzzy rough set models and their adaptability to data for feature selection. Motivated by these issues, we focus on a robust fuzzy rough set approach to feature selection. We first propose a robust fuzzy rough set model based on data distribution, the Noise-aware Fuzzy Rough Sets (NFRS) model, to resist noise. The model introduces a novel search mechanism that weakens the sensitivity of the approximation operator to noise by weighting samples according to their distribution within the decision classes, and that further distinguishes three kinds of samples: intra-class samples, boundary samples, and outlier samples. Then, the degree of relevance of a feature to the class is defined by the dependency function of the NFRS model, which is used to evaluate the significance of a feature subset. On this basis, we construct an evaluation function for feature significance that simultaneously considers the relevance and the redundancy of a candidate feature with respect to the selected subset and the remaining feature subset. A novel forward greedy search algorithm is presented to select a feature sequence. The selected features are subsequently evaluated on downstream classification tasks. Experiments on real-world datasets demonstrate the effectiveness of the proposed model and its superiority over the baseline methods.
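
For orientation, the general pattern of dependency-driven forward greedy selection can be sketched as follows. This is a minimal illustration under stated assumptions, not the NFRS method itself: it uses the classical fuzzy rough lower approximation with a simple similarity relation, omits the distribution-based sample weighting, and scores a candidate only by its dependency gain rather than by the relevance and redundancy evaluation function described in the abstract. The function names (`fuzzy_similarity`, `dependency`, `forward_greedy`), the similarity formula, and the stopping threshold `eps` are all hypothetical choices made for this sketch, and features are assumed to be scaled to [0, 1].

```python
import numpy as np


def fuzzy_similarity(X, features):
    """Assumed relation: R_B(x, y) = min over a in B of max(0, 1 - |x_a - y_a|)."""
    sub = X[:, features]
    diff = np.abs(sub[:, None, :] - sub[None, :, :])  # shape (n, n, |B|)
    return np.clip(1.0 - diff, 0.0, 1.0).min(axis=2)


def dependency(X, y, features):
    """Classical fuzzy rough dependency: mean lower-approximation membership."""
    if len(features) == 0:
        return 0.0
    R = fuzzy_similarity(X, features)
    different = y[:, None] != y[None, :]
    # Lower approximation of each sample's own class:
    # minimum of 1 - R(x, z) over samples z from other classes.
    lower = np.where(different, 1.0 - R, 1.0).min(axis=1)
    return float(lower.mean())


def forward_greedy(X, y, eps=1e-6):
    """Add the feature with the largest dependency until the gain is <= eps."""
    selected, best = [], 0.0
    remaining = list(range(X.shape[1]))
    while remaining:
        scores = [dependency(X, y, selected + [f]) for f in remaining]
        k = int(np.argmax(scores))
        if scores[k] - best <= eps:
            break
        best = scores[k]
        selected.append(remaining.pop(k))
    return selected, best


if __name__ == "__main__":
    # Toy data: feature 0 separates the classes, feature 1 does not.
    X = np.array([[0.0, 0.2], [0.0, 0.8], [1.0, 0.2], [1.0, 0.8]])
    y = np.array([0, 0, 1, 1])
    print(forward_greedy(X, y))
```

Because the classical lower approximation takes a minimum over all samples from other classes, a single mislabeled or outlying sample can pull a membership value down to zero. That is the noise sensitivity the abstract refers to and the reason a robust variant weights samples instead.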
