Abstract
In multi-label data, the importance of each label within the logical label vector varies for each sample, and there exist inherent correlations among the labels. However, the logical label vector fails to capture these nuances. Consequently, relying solely on this vector for feature selection in multi-label data results in underutilization of supervisory information. To address this issue, this paper introduces a novel label enhancement algorithm. This algorithm leverages neighborhood information derived from features to transform the logical label vector into a label distribution that effectively reflects label differences and correlations. Subsequently, we propose a feature selection algorithm tailored for multi-label data, which incorporates both the transformed label distribution and mutual information. This algorithm not only accounts for the mutual information between features and label distributions but also captures the mutual information among features themselves. Finally, we evaluate our proposed feature selection algorithm against five state-of-the-art multi-label feature selection algorithms on ten publicly available datasets. The experimental results reveal that our algorithm outperforms its competitors in six distinct evaluation metrics, achieving an average performance improvement of approximately 9%. This substantial enhancement underscores the efficacy of our algorithm in handling complex multi-label data.
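To make the two stages concrete, the following is a minimal sketch and not the algorithm evaluated in this paper. It assumes a simple k-nearest-neighbor averaging rule for label enhancement and an mRMR-style greedy criterion (relevance to the label distribution minus average redundancy with already selected features), with equal-width discretization used to estimate mutual information. The function names `enhance_labels`, `discretize`, `mutual_info`, and `select_features`, and the parameters `k`, `bins`, and `eps`, are all hypothetical choices made for illustration.

```python
import numpy as np

def enhance_labels(X, Y, k=3, eps=1e-6):
    """Assumed rule: average the logical labels of each sample's k nearest
    neighbors (plus the sample itself) and normalize each row to sum to 1."""
    # Pairwise squared Euclidean distances in feature space.
    d = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    # Indices of the k + 1 closest samples (the sample itself is at distance 0).
    nn = np.argsort(d, axis=1)[:, :k + 1]
    scores = Y[nn].mean(axis=1) + eps  # eps guards against all-zero rows
    return scores / scores.sum(axis=1, keepdims=True)

def discretize(v, bins=4):
    """Equal-width binning of a 1-D array into integer codes 0..bins-1."""
    edges = np.linspace(v.min(), v.max(), bins + 1)[1:-1]
    return np.digitize(v, edges)

def mutual_info(a, b):
    """Mutual information (in nats) between two discrete integer arrays."""
    joint = np.zeros((a.max() + 1, b.max() + 1))
    np.add.at(joint, (a, b), 1)          # contingency counts
    joint /= joint.sum()
    pa = joint.sum(axis=1, keepdims=True)
    pb = joint.sum(axis=0, keepdims=True)
    nz = joint > 0                        # skip empty cells to avoid log(0)
    return float((joint[nz] * np.log(joint[nz] / (pa @ pb)[nz])).sum())

def select_features(X, D, k, bins=4):
    """Greedy selection (assumed mRMR-style criterion): feature-to-distribution
    relevance minus mean feature-to-feature redundancy. Requires k <= X.shape[1]."""
    Xd = [discretize(X[:, j], bins) for j in range(X.shape[1])]
    Dd = [discretize(D[:, l], bins) for l in range(D.shape[1])]
    relevance = [sum(mutual_info(f, d) for d in Dd) for f in Xd]
    selected = []
    while len(selected) < k:
        best, best_score = None, -np.inf
        for j in range(len(Xd)):
            if j in selected:
                continue
            redundancy = (np.mean([mutual_info(Xd[j], Xd[s]) for s in selected])
                          if selected else 0.0)
            score = relevance[j] - redundancy
            if score > best_score:
                best, best_score = j, score
        selected.append(best)
    return selected
```

Given a feature matrix `X` and a 0/1 label matrix `Y`, calling `D = enhance_labels(X, Y)` followed by `select_features(X, D, k)` returns the indices of `k` features in the order they were chosen. The discretization step is a convenience of this sketch; other mutual information estimators could be substituted.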