Abstract
Feature selection is widely studied as an important preprocessing step in machine learning and data mining. Designing the evaluation criterion is a central aspect of constructing feature selection algorithms. In this paper, a new feature evaluation criterion, called the neighborhood effective information ratio (NEIR), is proposed to compute the discernibility capability of categorical and numerical features. Based on this criterion, a general definition of the significance of hybrid features is presented. A greedy selection algorithm for hybrid feature subsets is then constructed on the proposed criterion for data classification. We compare the proposed algorithm with other feature selection algorithms. Both theoretical and experimental analyses verify the effectiveness and efficiency of the proposed algorithm.