Abstract

Feature selection based on the fuzzy neighborhood rough set (FNRS) model is widely used in data mining. However, the dependency function of FNRS considers only the information in the lower approximation of the decision and ignores the information in the upper approximation, so some information may be lost. To solve this problem, this paper proposes a fuzzy neighborhood joint entropy model based on a fuzzy neighborhood self-information measure (FNSIJE) and applies it to feature selection. First, from the algebra view, the concept of self-information is introduced into the upper and lower approximations of FNRS to construct four fuzzy neighborhood self-information measures for the uncertainty of decision variables. The relationships between these measures and their properties are discussed in detail. The fourth measure, named tolerance fuzzy neighborhood self-information, is found to have better classification performance. Second, from the information view, an uncertainty measure based on fuzzy neighborhood joint entropy is proposed. Combining the algebra and information views yields the FNSIJE. Third, the K–S test is used to delete features with weak distinguishing performance, which reduces the dimensionality and hence the complexity of high-dimensional gene datasets, and a forward feature selection algorithm is then provided. Experimental results show that, compared with related methods, the presented model selects fewer important features and achieves higher classification accuracy.
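The two-stage procedure described in the abstract (K–S prefiltering followed by forward selection) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's algorithm: it assumes a binary decision, the names `ks_statistic`, `ks_prefilter`, and `forward_select` and the `threshold` and `min_gain` parameters are hypothetical, and the `score` callback is a placeholder for an FNSIJE-based significance measure whose definition is not given here.

```python
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap
    between the empirical CDFs of samples a and b."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    d = 0.0
    for x in a_sorted + b_sorted:
        cdf_a = bisect_right(a_sorted, x) / len(a_sorted)
        cdf_b = bisect_right(b_sorted, x) / len(b_sorted)
        d = max(d, abs(cdf_a - cdf_b))
    return d


def ks_prefilter(X, y, threshold=0.5):
    """Return indices of features whose class-conditional distributions
    differ by at least `threshold` (a hypothetical cutoff, not from the paper).
    Assumes exactly two decision classes."""
    classes = sorted(set(y))
    if len(classes) != 2:
        raise ValueError("this sketch assumes a binary decision")
    keep = []
    for j in range(len(X[0])):
        col0 = [row[j] for row, label in zip(X, y) if label == classes[0]]
        col1 = [row[j] for row, label in zip(X, y) if label == classes[1]]
        if ks_statistic(col0, col1) >= threshold:
            keep.append(j)
    return keep


def forward_select(candidates, score, min_gain=1e-9):
    """Greedy forward selection: repeatedly add the feature that raises
    score(subset) the most; stop when no feature improves it by more
    than min_gain. `score` stands in for an FNSIJE-based measure."""
    selected = []
    best = score(selected)
    remaining = list(candidates)
    while remaining:
        scored = [(score(selected + [f]), f) for f in remaining]
        top_score, top_f = max(scored, key=lambda t: t[0])
        if top_score - best <= min_gain:
            break
        selected.append(top_f)
        remaining.remove(top_f)
        best = top_score
    return selected
```

In use, `ks_prefilter` would first shrink the candidate set, and `forward_select` would then be called on the surviving indices with whatever subset-evaluation function is chosen.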

Highlights

  • Feature selection is an important data preprocessing step in the fields of granular computing and artificial intelligence [1,2,3,4,5,6]

  • To better discuss feature selection methods based on algebra and information views, this paper studies the uncertainty measure method based on fuzzy neighborhood joint entropy

  • The section “Insufficiency of neighborhood correlation functions and uncertainty measurement based on FNSIJE” points out the shortcoming of the neighborhood correlation functions; in view of this shortcoming, we propose four fuzzy neighborhood self-information measures, and study their related properties

