Abstract

Ant colony optimization (ACO) algorithms have been successfully applied to data classification problems in which the goal is to discover a list of classification rules. However, ACO suffers from long search times and from convergence to non-optimal solutions. Moreover, bottlenecks such as memory restrictions, time complexity, or data complexity make a problem difficult to solve once its scale becomes too large. One remedy is to design a highly parallelized learning algorithm. The MapReduce programming model has quickly emerged as the most common model for executing simple algorithmic tasks over huge volumes of data because it is simple, highly abstract, and efficient, and MapReduce-based ACO has therefore been researched extensively. However, because of MapReduce's unidirectional communication model and its inherent lack of support for iterative execution, ACO algorithms cannot easily be implemented on it. In this paper, we propose a novel classification rule discovery algorithm, MR-AntMiner, that capitalizes on the benefits of the MapReduce model. To construct quality rules with fewer iterations and with less communication between nodes for sharing the parameters used by each ant, the algorithm splits the training data into subsets that are randomly mapped to different mappers. The traditional ACO algorithm is then run on each mapper to obtain a local best rule set, and the global best rule list is produced in the reducer phase by a voting mechanism. The performance of the algorithm was studied experimentally on 14 publicly available data sets and compared with several state-of-the-art classification approaches in terms of accuracy. The experimental results show that the predictive accuracy of our algorithm is statistically significantly higher than that of the compared approaches. Furthermore, the experiments demonstrate the feasibility and good performance of the parallelized MR-AntMiner algorithm.
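
The split, map, and vote pipeline described in this abstract can be illustrated with a minimal single-process Python sketch. It is not the authors' implementation: the per-mapper ACO search is replaced by a trivial stand-in rule learner, the voting scheme (keep rules proposed by at least `min_votes` mappers) is an assumption because the abstract does not specify it, and all function names are hypothetical.

```python
import random
from collections import Counter, defaultdict


def split_randomly(records, n_mappers, seed=0):
    """Shuffle the training records and deal them into n_mappers subsets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::n_mappers] for i in range(n_mappers)]


def mapper(subset):
    """Stand-in for the per-mapper ACO run (NOT Ant-Miner itself).

    For every (attribute, value) pair seen in the subset, emit a one-term
    rule (attribute, value, predicted_class) that predicts the majority
    class among the matching records.
    """
    counts = defaultdict(Counter)
    for features, label in subset:
        for attr, value in features.items():
            counts[(attr, value)][label] += 1
    return {
        (attr, value, class_counts.most_common(1)[0][0])
        for (attr, value), class_counts in counts.items()
    }


def reducer(local_rule_sets, min_votes):
    """Assumed voting scheme: keep rules proposed by >= min_votes mappers.

    The surviving rules are ordered by descending vote count.
    """
    votes = Counter(rule for rules in local_rule_sets for rule in rules)
    kept = [(rule, n) for rule, n in votes.items() if n >= min_votes]
    kept.sort(key=lambda item: (-item[1], item[0]))
    return [rule for rule, _ in kept]


def mr_rule_discovery(records, n_mappers=3, min_votes=2, seed=0):
    """Run the whole split -> map -> reduce pipeline in one process."""
    subsets = split_randomly(records, n_mappers, seed)
    local_rule_sets = [mapper(subset) for subset in subsets]
    return reducer(local_rule_sets, min_votes)


if __name__ == "__main__":
    # Toy data: each record is (feature dict, class label).
    data = [({"outlook": "sunny"}, "no")] * 6 + [({"outlook": "rainy"}, "yes")] * 6
    for rule in mr_rule_discovery(data):
        print(rule)
```

In a real deployment each `mapper` call would run on a separate node and perform a full ACO rule search over its subset. The point of the sketch is only that the mappers exchange no pheromone or parameter state with each other, so the reducer's vote is the single communication step.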

