Abstract

To address the problem of effectively selecting relevant features from heterogeneous data that lack decision labels, this paper studies a novel feature selection approach based on fuzzy mutual information in fuzzy rough set theory. First, the fuzzy relevance of each feature is defined using fuzzy mutual information, and the fuzzy conditional relevance is then derived from it. Next, the fuzzy redundancy is defined as the difference between the fuzzy relevance and the fuzzy conditional relevance. An evaluation index of feature importance is then obtained by applying the idea of unsupervised minimum redundancy and maximum relevance. Finally, a fuzzy-mutual-information-based unsupervised feature selection algorithm is designed to produce a feature sequence. Extensive experiments are conducted on public datasets, with six unsupervised feature selection algorithms used for comparison. The selected features are evaluated with classification, clustering, and outlier detection methods. Experimental results show that the proposed algorithm can select fewer heterogeneous features while maintaining or improving the performance of learning algorithms.
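
The pipeline described above can be illustrated with a minimal sketch. The abstract does not give the formulas, so everything concrete here is an assumption rather than the authors' method: the fuzzy similarity relation (one minus the normalized absolute difference for numeric features, exact match for nominal ones), the fuzzy entropy based on fuzzy-class cardinality, the averaging of mutual information over all features as "relevance", the direction of conditioning in the redundancy term, and the greedy ranking loop. The function names (`fuzzy_relation`, `fuzzy_entropy`, `fuzzy_mi`, `fuzzy_cond_mi`, `rank_features`) are hypothetical.

```python
import numpy as np

def fuzzy_relation(col, numeric=True):
    """n x n fuzzy similarity matrix for one feature (assumed form)."""
    col = np.asarray(col)
    if not numeric:
        # Nominal feature: crisp equivalence (1 if equal, else 0).
        return (col[:, None] == col[None, :]).astype(float)
    x = col.astype(float)
    span = x.max() - x.min()
    if span == 0:
        return np.ones((len(x), len(x)))
    x = (x - x.min()) / span
    return 1.0 - np.abs(x[:, None] - x[None, :])

def joint(*rels):
    """Intersection of fuzzy relations via element-wise minimum."""
    return np.stack(rels).min(axis=0)

def fuzzy_entropy(rel):
    """H(R) = -(1/n) * sum_i log2(|[x_i]_R| / n), with row sums as cardinalities."""
    n = rel.shape[0]
    return -np.mean(np.log2(rel.sum(axis=1) / n))

def fuzzy_mi(ra, rb):
    """I(a; b) = H(a) + H(b) - H(a, b)."""
    return fuzzy_entropy(ra) + fuzzy_entropy(rb) - fuzzy_entropy(joint(ra, rb))

def fuzzy_cond_mi(ra, rb, rc):
    """I(a; b | c) = H(a, c) + H(b, c) - H(a, b, c) - H(c)."""
    return (fuzzy_entropy(joint(ra, rc)) + fuzzy_entropy(joint(rb, rc))
            - fuzzy_entropy(joint(ra, rb, rc)) - fuzzy_entropy(rc))

def rank_features(columns, numeric_flags):
    """Greedy unsupervised max-relevance, min-redundancy ranking (assumed scheme)."""
    rels = [fuzzy_relation(c, f) for c, f in zip(columns, numeric_flags)]
    m = len(rels)

    def relevance(f):
        # Assumed: average fuzzy MI between feature f and every feature.
        return np.mean([fuzzy_mi(rels[f], rels[c]) for c in range(m)])

    def cond_relevance(f, s):
        # Assumed: the same average, conditioned on an already selected feature s.
        return np.mean([fuzzy_cond_mi(rels[f], rels[c], rels[s]) for c in range(m)])

    rel = [relevance(f) for f in range(m)]
    selected = [int(np.argmax(rel))]
    remaining = [f for f in range(m) if f != selected[0]]
    while remaining:
        def score(f):
            # Redundancy = relevance minus conditional relevance, averaged over selected.
            redundancy = np.mean([rel[f] - cond_relevance(f, s) for s in selected])
            return rel[f] - redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy heterogeneous data: two numeric features and one nominal feature.
columns = [
    [0.1, 0.4, 0.5, 0.9, 0.3, 0.7],
    [1.0, 3.9, 5.2, 9.1, 2.8, 7.3],
    ["a", "b", "a", "c", "b", "c"],
]
flags = [True, True, False]
order = rank_features(columns, flags)
print(order)
```

Taking the element-wise minimum as the intersection of fuzzy relations lets numeric and nominal features share one entropy definition, which is what makes a single ranking over heterogeneous features possible in this sketch. The cost grows roughly with the cube of the number of features times the square of the number of samples, so it is suitable only for small illustrative inputs.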
