Abstract

In recent years, there has been growing interest in the problem of multi-label streaming feature selection with no prior knowledge of the feature space. However, the algorithms proposed to handle this problem seldom consider the group structure of streaming features. Another shortcoming is that few studies have addressed atomic feature models, and in particular, few have measured the attraction and repulsion between features. To remedy these shortcomings, we develop a streaming feature selection algorithm with dynamic sliding windows and feature repulsion loss (SF-DSW-FRL). The algorithm proceeds in three consecutive steps. First, within dynamic sliding windows, candidate streaming features that are strongly related to the labels in different feature groups are selected and stored in a fixed sliding window. Then, the interaction between features is measured by a loss function inspired by the mutual repulsion and attraction between atoms in physics. Specifically, one feature attraction term and two feature repulsion terms are constructed and combined to form the feature repulsion loss function. Finally, the best feature subset is selected from the fixed sliding window according to this loss function. The effectiveness of the proposed algorithm is demonstrated through experiments on several multi-label datasets, statistical hypothesis testing, and stability analysis.
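The first step of the pipeline described above (buffering arriving features in a sliding window and keeping only those strongly related to the labels) can be sketched as follows. This is an illustrative simplification, not the paper's actual method: the window logic is reduced to fixed-size groups, the relevance measure is a plain mean absolute Pearson correlation between each feature and the label matrix (a stand-in for the paper's label-correlation criterion), and all function names and the threshold are our own.

```python
import numpy as np

def relevance(X, Y):
    # Mean absolute Pearson correlation between each feature (column of X)
    # and each label (column of Y) -- a simple stand-in for the paper's
    # feature-label relevance measure.
    n = X.shape[0]
    Xc = (X - X.mean(0)) / (X.std(0) + 1e-12)
    Yc = (Y - Y.mean(0)) / (Y.std(0) + 1e-12)
    return np.abs(Xc.T @ Yc).mean(axis=1) / n

def stream_select(feature_stream, Y, window_size=3, rel_threshold=0.3):
    """Buffer streaming features in a dynamic window; when the window is
    full, keep only features whose label relevance exceeds a threshold
    (the candidate-selection step). Features left in a partially filled
    final window are discarded here for simplicity."""
    window, fixed = [], []
    for idx, f in feature_stream:          # features arrive one at a time
        window.append((idx, f))
        if len(window) == window_size:     # window full: evaluate the group
            X = np.column_stack([v for _, v in window])
            rel = relevance(X, Y)
            for (i, v), r in zip(window, rel):
                if r >= rel_threshold:
                    fixed.append((i, v))   # move to the fixed window
            window = []                    # slide on to the next group
    return fixed
```

In the full algorithm, the features accumulated in the fixed window would then be re-ranked by the feature repulsion loss rather than kept unconditionally.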

Highlights

  • In recent years, multi-label learning has been extensively used in various practical applications, such as text classification [1] and gene function classification [2]

  • We compare the proposed SF-DSW-FRL algorithm with seven state-of-the-art feature selection algorithms, namely multi-label naive Bayes classification (MLNB) [41], multi-label dimensionality reduction via dependence maximization (MDDMspc) [42], MDDMproj [42], multivariate mutual information criterion for multi-label feature selection (PMU) [18], multi-label feature selection based on neighborhood mutual information (MFNMIpes) [20], MFNMIopt [20], and multi-label feature selection with label correlation (MUCO) [29]

  • In practical applications, data is generated in real time, and the number of generated data samples varies with time


Summary

Introduction

Multi-label learning has been extensively used in various practical applications, such as text classification [1] and gene function classification [2]. Lin et al. [20] proposed a multi-label feature selection algorithm based on neighborhood mutual information, in which an effective feature subset is selected according to the maximum-correlation and minimum-redundancy criteria; this method generalizes the neighborhood entropy function of single-label learning to multi-label learning. This paper proposes a multi-label streaming feature selection algorithm based on dynamic sliding windows and feature repulsion loss. We conclude our discussion and point out further research directions.
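The maximum-correlation, minimum-redundancy idea mentioned above can be illustrated with a greedy selection sketch. This is a simplified, single-label illustration using plain mutual information on discrete features; the method of Lin et al. [20] instead uses neighborhood mutual information on multi-label data, and all names here are ours.

```python
import numpy as np

def mutual_info(a, b):
    """Empirical mutual information (in nats) between two discrete,
    nonnegative-integer-coded variables."""
    joint = np.zeros((a.max() + 1, b.max() + 1))
    for x, y in zip(a, b):
        joint[x, y] += 1
    joint /= joint.sum()
    pa, pb = joint.sum(1), joint.sum(0)
    nz = joint > 0
    outer = pa[:, None] * pb[None, :]
    return float((joint[nz] * np.log(joint[nz] / outer[nz])).sum())

def mrmr(X, y, k):
    """Greedy selection: at each step pick the feature maximizing
    relevance to the label minus mean redundancy with the features
    already selected (a simplified max-correlation, min-redundancy
    criterion)."""
    n_feat = X.shape[1]
    rel = np.array([mutual_info(X[:, j], y) for j in range(n_feat)])
    selected = [int(rel.argmax())]          # start with the most relevant
    while len(selected) < k:
        best, best_score = None, -np.inf
        for j in range(n_feat):
            if j in selected:
                continue
            red = np.mean([mutual_info(X[:, j], X[:, s]) for s in selected])
            if rel[j] - red > best_score:
                best, best_score = j, rel[j] - red
        selected.append(best)
    return selected
```

A duplicated feature is relevant but fully redundant with its copy, so the criterion skips it in favor of a feature carrying new information.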

Related Work
Multi-Label Learning
Neighborhood Information Entropy
Sliding-Window Mechanisms
Evaluation of Feature
Feature Repulsion Loss
Streaming Feature Selection
The Experimental Datasets
Evaluation Criteria
Configuration of Related Parameters and Experimental Results
Methods
Effect and Fine-Tuning of Feature Selection Thresholds
Statistical Comparison
Conclusions
