Abstract

A large number of irrelevant and redundant features may reduce the performance of data classification on massive data sets. To address this problem, a method of automatic feature selection based on mutual information and the fuzzy c-means clustering algorithm is proposed. The method proceeds in two steps. First, the feature correlation is computed from mutual information, and the data are grouped according to the feature with the maximum correlation. Second, within the data groups, the fuzzy c-means clustering algorithm is used to automatically determine the optimal number of features and to compress the feature dimension. Theoretical analysis and experiments indicate that the method achieves higher efficiency in data classification.
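The two building blocks named in the abstract, mutual information and fuzzy c-means clustering, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `mutual_information` and `fuzzy_c_means` are hypothetical, the features are assumed to be discrete, and the fuzzifier `m = 2`, the iteration limit, and the tolerance are assumed defaults. How the paper combines the two steps (grouping by the maximum-correlation feature, choosing the number of features) is not specified in the abstract and is not reproduced here.

```python
import numpy as np


def mutual_information(x, y):
    """Mutual information (in nats) between two discrete 1-D arrays."""
    # Map each distinct value to an integer index.
    _, x_idx = np.unique(np.asarray(x).ravel(), return_inverse=True)
    _, y_idx = np.unique(np.asarray(y).ravel(), return_inverse=True)

    # Empirical joint distribution from co-occurrence counts.
    joint = np.zeros((x_idx.max() + 1, y_idx.max() + 1))
    np.add.at(joint, (x_idx, y_idx), 1.0)
    joint /= joint.sum()

    # Marginals, kept 2-D so that px @ py is their outer product.
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)

    # Sum p(x, y) * log(p(x, y) / (p(x) p(y))) over non-empty cells only.
    nz = joint > 0
    return float(np.sum(joint[nz] * np.log(joint[nz] / (px @ py)[nz])))


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means; returns (centers, membership matrix)."""
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)

    # Random initial memberships; each row sums to 1.
    u = rng.random((len(points), n_clusters))
    u /= u.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        # Centers are means of the points weighted by membership ** m.
        w = u ** m
        centers = (w.T @ points) / w.sum(axis=0)[:, None]

        # Euclidean distance from every point to every center.
        dist = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        dist = np.maximum(dist, 1e-12)  # guard against division by zero

        # Membership update: u_ij proportional to d_ij ** (-2 / (m - 1)).
        inv = dist ** (-2.0 / (m - 1.0))
        new_u = inv / inv.sum(axis=1, keepdims=True)

        converged = np.abs(new_u - u).max() < tol
        u = new_u
        if converged:
            break

    return centers, u
```

Under these assumptions, one way to use the sketch is to call `mutual_information(feature_column, labels)` for every feature column to rank features by relevance, and to pass feature vectors to `fuzzy_c_means` so that the membership matrix indicates which features cluster together.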

