Abstract

Fuzzy rough set theory can tackle feature redundancy in data and select more informative features for machine learning tasks. The Gaussian kernel is often coupled with fuzzy rough set theory to measure the fuzzy relation between data instances. However, the Gaussian kernel exhibits a serious long-tail phenomenon and therefore performs poorly when modeling the fuzzy relation for high-dimensional data. Moreover, designing a robust feature evaluation function for a fuzzy rough set model is nontrivial, because a naive model may select non-optimal feature subsets owing to perturbations from redundant features. To address these challenges, this paper investigates the Student-t kernel and fuzzy divergence for fuzzy rough feature selection and proposes a new Student-t Kernelized Fuzzy Rough Set (SKFRS) model. The new model uses fuzzy divergence to evaluate uncertain information in the data. It also introduces a newly defined feature evaluation function based on the dynamic relation between the relevance and indispensability of features during the feature selection process. A novel forward greedy search algorithm is then presented to solve the final objective function. The selected features are subsequently evaluated on downstream classification tasks. Experimental results on real-world datasets demonstrate the effectiveness of the proposed model and its superiority over the baseline methods.
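
To make the two kernels concrete, the following minimal sketch builds a kernel-induced fuzzy relation matrix with each of them. The abstract does not state the paper's exact parameterization, so the forms used here are assumptions: the Gaussian kernel as exp(-||x - y||^2 / (2 sigma^2)) and one commonly used Student-t form, 1 / (1 + ||x - y||^d). The function names and the parameters `sigma` and `d` are hypothetical and are not taken from the SKFRS model.

```python
import numpy as np

def pairwise_sq_dists(X):
    """Squared Euclidean distance between every pair of rows of X."""
    X = np.asarray(X, dtype=float)
    diff = X[:, None, :] - X[None, :, :]
    return np.sum(diff ** 2, axis=-1)

def gaussian_relation(X, sigma=1.0):
    """Fuzzy relation from a Gaussian kernel (assumed form)."""
    return np.exp(-pairwise_sq_dists(X) / (2.0 * sigma ** 2))

def student_t_relation(X, d=2.0):
    """Fuzzy relation from a Student-t kernel (assumed form 1 / (1 + dist^d))."""
    dist = np.sqrt(pairwise_sq_dists(X))
    return 1.0 / (1.0 + dist ** d)

# Three toy instances with two features each (illustrative data only).
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
R_gauss = gaussian_relation(X)
R_t = student_t_relation(X)
```

Under these assumptions both matrices are symmetric with ones on the diagonal and entries in (0, 1], so either can serve as a fuzzy similarity relation between instances. How the relation is then combined with fuzzy divergence and the feature evaluation function is specific to the paper and is not reproduced here.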
