Abstract

Feature selection removes noise and redundancy from data and reduces computational complexity, which makes it vital for machine learning. Feature selection for hybrid information systems is challenging because the difference between nominal attribute values is difficult to measure. In addition, many existing feature selection methods, such as Fisher, LASSO, random forest, mutual information, and rough-set-based methods, are susceptible to noise. This paper addresses both problems from the perspective of fuzzy evidence theory. First, a new distance that incorporates decision attributes is defined. Then, a relation between fuzzy evidence theory and fuzzy β covering with an anti-noise mechanism is established. Based on fuzzy belief and fuzzy plausibility, two robust feature selection algorithms for hybrid data are proposed within this framework. Experiments on 10 datasets of various types show that the proposed algorithms achieve the highest classification accuracy in 11 of 20 experiments, significantly outperforming 6 other state-of-the-art algorithms; achieve a dimension reduction of 84.13% on seven UCI datasets and 99.90% on three large-scale gene datasets; and have a noise tolerance at least 6% higher than that of the 6 other algorithms. The proposed algorithms therefore have excellent anti-noise ability while maintaining good feature selection ability.
