Abstract

In recent years, there has been growing interest in the problem of multi-label streaming feature selection with no prior knowledge of the feature space. However, the algorithms proposed to handle this problem seldom consider the group structure of streaming features. Another shortcoming is that few studies have addressed atomic feature models, and in particular, few have measured the attraction and repulsion between features. To remedy these shortcomings, we develop a streaming feature selection algorithm with dynamic sliding windows and feature repulsion loss (SF-DSW-FRL). The algorithm proceeds in three consecutive steps. First, within dynamic sliding windows, candidate streaming features that are strongly related to the labels in different feature groups are selected and stored in a fixed sliding window. Then, the interaction between features is measured by a loss function inspired by the mutual repulsion and attraction between atoms in physics. Specifically, one feature attraction term and two feature repulsion terms are constructed and combined to form the feature repulsion loss function. Finally, the best feature subset is selected from the fixed sliding window according to this loss function. The effectiveness of the proposed algorithm is demonstrated through experiments on several multi-label datasets, statistical hypothesis testing, and stability analysis.
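The first step of the pipeline described above (buffering arriving features in a sliding window and keeping only those strongly related to the labels) can be sketched as follows. This is an illustrative simplification, not the paper's actual method: the window logic is reduced to fixed-size groups, the relevance measure is a plain mean absolute Pearson correlation between each feature and the label matrix (a stand-in for the paper's label-correlation criterion), and all function names and the threshold are our own.

```python
import numpy as np

def relevance(X, Y):
    # Mean absolute Pearson correlation between each feature (column of X)
    # and each label (column of Y) -- a simple stand-in for the paper's
    # feature-label relevance measure.
    n = X.shape[0]
    Xc = (X - X.mean(0)) / (X.std(0) + 1e-12)
    Yc = (Y - Y.mean(0)) / (Y.std(0) + 1e-12)
    return np.abs(Xc.T @ Yc).mean(axis=1) / n

def stream_select(feature_stream, Y, window_size=3, rel_threshold=0.3):
    """Buffer streaming features in a dynamic window; when the window is
    full, keep only features whose label relevance exceeds a threshold
    (the candidate-selection step). Features left in a partially filled
    final window are discarded here for simplicity."""
    window, fixed = [], []
    for idx, f in feature_stream:          # features arrive one at a time
        window.append((idx, f))
        if len(window) == window_size:     # window full: evaluate the group
            X = np.column_stack([v for _, v in window])
            rel = relevance(X, Y)
            for (i, v), r in zip(window, rel):
                if r >= rel_threshold:
                    fixed.append((i, v))   # move to the fixed window
            window = []                    # slide on to the next group
    return fixed
```

In the full algorithm, the features accumulated in the fixed window would then be re-ranked by the feature repulsion loss rather than kept unconditionally.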

Highlights

  • In recent years, multi-label learning has been extensively used in various practical applications, such as text classification [1] and gene function classification [2]

  • We compare the proposed SF-DSW-FRL algorithm with seven state-of-the-art feature selection algorithms, namely multi-label naive Bayes classification (MLNB) [41], multi-label dimensionality reduction via dependence maximization (MDDMspc) [42], MDDMproj [42], multivariate mutual information criterion for multi-label feature selection (PMU) [18], multi-label feature selection based on neighborhood mutual information (MFNMIpes) [20], MFNMIopt [20], and multi-label feature selection with label correlation (MUCO) [29]

  • In practical applications, data is generated in real time, and the number of generated data samples varies with time


Summary

Introduction

Multi-label learning has been extensively used in various practical applications, such as text classification [1] and gene function classification [2]. Lin et al. [20] proposed a multi-label feature selection algorithm based on neighborhood mutual information, in which an effective feature subset is selected according to the maximum-correlation and minimum-redundancy criteria; this method generalizes the neighborhood entropy function of single-label learning to multi-label learning. This paper proposes a multi-label streaming feature selection algorithm based on dynamic sliding windows and feature repulsion loss. We conclude our discussion and point out further research directions.
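The maximum-correlation, minimum-redundancy idea mentioned above can be illustrated with a greedy selection sketch. This is a simplified, single-label illustration using plain mutual information on discrete features; the method of Lin et al. [20] instead uses neighborhood mutual information on multi-label data, and all names here are ours.

```python
import numpy as np

def mutual_info(a, b):
    """Empirical mutual information (in nats) between two discrete,
    nonnegative-integer-coded variables."""
    joint = np.zeros((a.max() + 1, b.max() + 1))
    for x, y in zip(a, b):
        joint[x, y] += 1
    joint /= joint.sum()
    pa, pb = joint.sum(1), joint.sum(0)
    nz = joint > 0
    outer = pa[:, None] * pb[None, :]
    return float((joint[nz] * np.log(joint[nz] / outer[nz])).sum())

def mrmr(X, y, k):
    """Greedy selection: at each step pick the feature maximizing
    relevance to the label minus mean redundancy with the features
    already selected (a simplified max-correlation, min-redundancy
    criterion)."""
    n_feat = X.shape[1]
    rel = np.array([mutual_info(X[:, j], y) for j in range(n_feat)])
    selected = [int(rel.argmax())]          # start with the most relevant
    while len(selected) < k:
        best, best_score = None, -np.inf
        for j in range(n_feat):
            if j in selected:
                continue
            red = np.mean([mutual_info(X[:, j], X[:, s]) for s in selected])
            if rel[j] - red > best_score:
                best, best_score = j, rel[j] - red
        selected.append(best)
    return selected
```

A duplicated feature is relevant but fully redundant with its copy, so the criterion skips it in favor of a feature carrying new information.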

Related Work
Multi-Label Learning
Neighborhood Information Entropy
Sliding-Window Mechanisms
Evaluation of Feature
Feature Repulsion Loss
Streaming Feature Selection
The Experimental Datasets
Evaluation Criteria
Configuration of Related Parameters and Experimental Results
Methods
Effect and Fine-Tuning of Feature Selection Thresholds
Statistical Comparison
Conclusions
