Abstract
Feature selection is widely used as the first stage of a classification task to reduce the dimensionality of the problem, decrease noise, improve speed, and relieve memory constraints by eliminating irrelevant or redundant features. One approach to feature selection employs population-based optimization algorithms, such as particle swarm optimization (PSO)-based and ant colony optimization (ACO)-based methods. The ant colony optimization algorithm is inspired by observations of real ants searching for the shortest paths to food sources. Protein function prediction is an important problem in functional genomics. Typically, protein sequences are represented by feature vectors. A major problem with protein datasets, which increases the complexity of classification models, is their large number of features. This paper extends the ant colony optimization algorithm so that it selects features for a Bayesian classification method. The naive Bayesian classifier is a straightforward and frequently used method for supervised learning. It is based on probability theory and provides a flexible way of dealing with any number of features or classes. The paper then compares the performance of the proposed ACO algorithm with that of a standard binary particle swarm optimization algorithm on the task of selecting features on the Postsynaptic dataset. The criteria used for this comparison are maximizing predictive accuracy and finding the smallest subset of features. Simulation results on the Postsynaptic dataset show that the proposed method reduces the feature set effectively and obtains higher classification accuracy than other feature selection methods.