Abstract

Multi-label classification has recently attracted greater research interest as a data mining task. Many current data mining applications address problems in which instances belong to more than one class, which requires the development of new, efficient methods.

Instance-based classification models, such as the k-nearest neighbor rule, are among the best-performing methods on classification tasks in general and have also been applied successfully to multi-label problems. Despite their simplicity, they perform comparably to considerably more complex methods. One challenge associated with instance-based classification models is that they must store all training instances in memory. Instance selection methods have been proposed to ameliorate this problem. However, applying them to multi-label problems is difficult because most of their underlying concepts do not adapt readily to the multi-label setting.

In this paper, we propose a scalable evolutionary algorithm for instance selection in multi-label problems. Because our evolutionary algorithm relies solely on the classification performance of the selected subset of instances, it can handle multi-label datasets. On a set of 12 real-world problems, our approach performs comparably to methods that use all instances while achieving a large reduction in the size of the training set.
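To make the idea concrete, the following is a minimal sketch of evolutionary instance selection driven only by the performance of the selected subset. It is not the algorithm proposed in the paper: the abstract does not specify the fitness function, the genetic operators, or how scalability is achieved, so everything here is an illustrative assumption. In particular, the Hamming-loss fitness, the weight `alpha`, the per-label majority-vote k-NN, the held-out validation set, and the names `fitness` and `evolve` are all hypothetical.

```python
import random

def hamming_loss(pred, true):
    """Fraction of label positions where two binary label vectors differ."""
    return sum(p != t for p, t in zip(pred, true)) / len(true)

def knn_predict(x, X, Y, subset, k=3):
    """Multi-label k-NN: per-label majority vote over the k nearest selected instances."""
    def sq_dist(i):
        return sum((a - b) ** 2 for a, b in zip(x, X[i]))
    nearest = sorted(subset, key=sq_dist)[:k]
    n_labels = len(Y[0])
    return [int(2 * sum(Y[i][j] for i in nearest) > len(nearest))
            for j in range(n_labels)]

def fitness(mask, X, Y, X_val, Y_val, alpha=0.5, k=3):
    """Assumed fitness: weighted sum of (1 - Hamming loss) and training-set reduction."""
    subset = [i for i, keep in enumerate(mask) if keep]
    if not subset:
        return 0.0  # an empty selection cannot classify anything
    loss = sum(hamming_loss(knn_predict(x, X, Y, subset, k), y)
               for x, y in zip(X_val, Y_val)) / len(X_val)
    reduction = 1 - len(subset) / len(mask)
    return alpha * (1 - loss) + (1 - alpha) * reduction

def evolve(X, Y, X_val, Y_val, pop_size=20, generations=30, p_mut=0.05, seed=0):
    """Simple generational GA over binary masks (True = keep the instance)."""
    rng = random.Random(seed)
    n = len(X)
    pop = [[rng.random() < 0.5 for _ in range(n)] for _ in range(pop_size)]

    def score(mask):
        return fitness(mask, X, Y, X_val, Y_val)

    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        elite = pop[: pop_size // 2]  # the better half survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            child = [(not g) if rng.random() < p_mut else g for g in child]  # bit-flip mutation
            children.append(child)
        pop = elite + children
    return max(pop, key=score)

if __name__ == "__main__":
    # Toy data: two well-separated clusters, each with its own label vector.
    X = [[0.0, 0.1 * i] for i in range(10)] + [[5.0, 0.1 * i] for i in range(10)]
    Y = [[1, 0]] * 10 + [[0, 1]] * 10
    X_val, Y_val = [[0.2, 0.5], [4.8, 0.5]], [[1, 0], [0, 1]]
    best = evolve(X, Y, X_val, Y_val, pop_size=10, generations=10)
    print("kept", sum(best), "of", len(best), "instances")
```

Because the fitness only needs predictions from the selected subset, nothing in the search depends on single-label concepts such as class borders or enemy neighbors, which is the property the abstract highlights. The sketch evaluates every candidate on the full data and would not scale as written.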
