Abstract

Multi-label learning deals with the problem where an instance is associated with multiple labels simultaneously. Multi-label data is often of high dimensionality and has many noisy, irrelevant, and redundant features. As an important machine learning task, multi-label feature selection has received considerable attention in recent years due to its promising performance in dealing with high-dimensional multi-label data. Existing multi-label feature selection methods typically select the global features which are shared by all instances in a dataset. However, these multi-label feature selection methods may be suboptimal since they do not consider the specific characteristics of instances. In this paper, we propose a novel algorithm that integrates Global and Local Feature Selection (GLFS) to exploit both the global features and a subset of discriminative features shared only locally by a subgroup of instances in a multi-label dataset. Specifically, GLFS employs linear regression and ℓ 2,1 -norm on the regression parameters to achieve simultaneous global and local feature selection. Moreover, the proposed algorithm has an effective mechanism for utilizing label correlations to improve the feature selection. Experiments on real-world multi-label datasets show the superiority of GLFS over the state-of-the-art multi-label feature selection methods.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call