Abstract

Feature selection is one of the most efficient procedures for reducing the dimensionality of high-dimensional data by choosing a practical subset of features. Because labeled samples are not always available and labeling data can be time-consuming or costly, semi-supervised learning, which works with data containing both labeled and unlabeled instances, becomes important. This article proposes SemiACO, a method based on Ant Colony Optimization (ACO) for the semi-supervised feature selection problem. SemiACO selects features by seeking minimum redundancy among the features and maximum relevancy between the features and the class label. It uses a nonlinear heuristic function instead of a linear one, and this ACO heuristic function is learned with a Temporal Difference (TD) reinforcement learning algorithm. We model the feature selection search space as a Markov Decision Process (MDP) in which the features are the states and each ant's choice among the unvisited features constitutes the set of actions. We evaluate the efficiency of SemiACO in experiments on 14 benchmark datasets, in comparison with eight semi-supervised feature selection methods.
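
The MDP view described above can be illustrated with a minimal sketch. It is not the SemiACO implementation: the reward (relevance minus mean redundancy), the softmax selection rule, the parameters `alpha` and `gamma`, and the function names `td0_update` and `ant_walk` are all assumptions made for illustration, and pheromone updating is omitted. The sketch only shows how a TD(0) update can learn a per-feature value while an ant moves between unvisited features.

```python
import math
import random


def td0_update(value, state, next_state, reward, alpha=0.1, gamma=0.9):
    """One TD(0) step: V(s) <- V(s) + alpha * (r + gamma * V(s') - V(s))."""
    td_error = reward + gamma * value[next_state] - value[state]
    value[state] += alpha * td_error
    return td_error


def ant_walk(relevance, redundancy, value, subset_size, rng,
             alpha=0.1, gamma=0.9):
    """Build one feature subset; states are features, actions are unvisited features.

    relevance[f]     : assumed precomputed relevance of feature f to the class label
    redundancy[f][g] : assumed precomputed redundancy between features f and g
    value[f]         : learned state value, updated in place by TD(0)
    """
    n = len(relevance)
    current = rng.randrange(n)
    visited = [current]
    while len(visited) < subset_size:
        candidates = [f for f in range(n) if f not in visited]
        # Assumed selection rule: softmax over the learned values (no pheromone).
        weights = [math.exp(value[f]) for f in candidates]
        nxt = rng.choices(candidates, weights=weights, k=1)[0]
        # Assumed reward: high relevance, low mean redundancy with chosen features.
        mean_red = sum(redundancy[nxt][s] for s in visited) / len(visited)
        reward = relevance[nxt] - mean_red
        td0_update(value, current, nxt, reward, alpha, gamma)
        visited.append(nxt)
        current = nxt
    return visited


if __name__ == "__main__":
    # Toy scores, made up for the demonstration only.
    relevance = [0.9, 0.2, 0.7, 0.4]
    redundancy = [
        [0.0, 0.1, 0.8, 0.3],
        [0.1, 0.0, 0.2, 0.5],
        [0.8, 0.2, 0.0, 0.1],
        [0.3, 0.5, 0.1, 0.0],
    ]
    value = [0.0] * 4
    rng = random.Random(0)
    for _ in range(50):  # several ants, sharing the learned values
        subset = ant_walk(relevance, redundancy, value, 2, rng)
    print(subset, value)
```

In this sketch the learned `value` list plays the role of a heuristic that is not a fixed linear combination of relevance and redundancy, since it accumulates discounted rewards along the paths the ants actually take.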
