Feature subset selection based on Filter technique

K Fathima Bibi, M Nazreen Banu

doi:10.1109/iccct2.2015.7292710

Abstract

From a large amount of data, significant knowledge is discovered by applying techniques in the knowledge management process; these are known as data mining techniques. Data mining, a form of knowledge discovery, is necessary for solving problems in a specific domain. Classification is the technique that predicts the classes of unknown data. Neural networks, rule-based methods, decision trees, and Bayesian methods are some of the existing methods used for classification. Irrelevant attributes must be filtered out before any mining technique is applied. Embedded, wrapper, and filter techniques are the feature selection techniques used for this filtering. In this paper, we propose an improved method that uses the existing cosine similarity measure to select attributes from a large number of attributes. The J48 decision tree algorithm and the Naive Bayes classifier are used for classification. These techniques are evaluated on two different datasets taken from the UCI repository. In the implementation results, our proposed subset evaluation method gives the best result, with high accuracy and a low error rate.
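The abstract does not say how the cosine similarity measure is applied to attributes, so the following is only a minimal sketch of one plausible filter-style ranking, not the authors' method. It assumes numeric attributes and numeric class labels, scores each attribute column by the absolute cosine similarity between that column and the label vector, and keeps the top `k` attributes. The function names (`cosine_similarity`, `rank_attributes`, `select_top_k`) and the mean-centering step are assumptions introduced for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # a zero vector has no direction, so treat it as unrelated
    return dot / (norm_u * norm_v)


def center(values):
    """Subtract the mean from every value.

    Assumption: centering is our choice, not stated in the abstract. The
    cosine similarity of two centered vectors equals their Pearson correlation.
    """
    mean = sum(values) / len(values)
    return [x - mean for x in values]


def rank_attributes(rows, labels):
    """Return (score, attribute_index) pairs, highest score first."""
    centered_labels = center(labels)
    scores = []
    for j in range(len(rows[0])):
        column = center([row[j] for row in rows])
        score = abs(cosine_similarity(column, centered_labels))
        scores.append((score, j))
    # Sort by descending score; break ties by attribute index.
    return sorted(scores, key=lambda s: (-s[0], s[1]))


def select_top_k(rows, labels, k):
    """Indices of the k highest-scoring attributes, in column order."""
    ranked = rank_attributes(rows, labels)
    return sorted(j for _, j in ranked[:k])


if __name__ == "__main__":
    # Toy data (made up): attribute 0 tracks the label, attribute 1 is
    # constant, attribute 2 alternates independently of the label.
    rows = [[1, 5, 0], [2, 5, 1], [3, 5, 0], [4, 5, 1]]
    labels = [0, 0, 1, 1]
    print(select_top_k(rows, labels, 1))
```

Because a filter method scores attributes independently of any classifier, the reduced attribute set from a sketch like this could then be passed to a classifier such as J48 or Naive Bayes to compare accuracy with and without selection.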

Full Text