Efficient Ensemble Classification for Multi-Label Data Streams with Concept Drift

Yange Sun, Han Shao, Shasha Wang

doi:10.3390/info10050158

Abstract

Most existing multi-label data stream classification methods extend single-label stream classification approaches to the multi-label case without considering the special characteristics of multi-label stream data, such as label dependency, concept drift, and recurrent concepts. Motivated by these challenges, we devise an efficient ensemble paradigm for multi-label data stream classification. The algorithm deploys a novel change detection method based on Jensen–Shannon divergence to identify different kinds of concept drift in data streams. Moreover, our method attempts to account for label dependency by pruning away infrequent label combinations to enhance classification performance. Empirical results on both synthetic and real-world datasets demonstrate its effectiveness.
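To make the change detection idea concrete, the following is a minimal sketch of a Jensen–Shannon divergence check between two windows of a multi-label stream. It is an illustration under stated assumptions, not the detector from the paper: comparing label-combination frequencies, the function names (`js_divergence`, `labelset_distribution`, `drift_detected`), and the fixed `threshold` are all hypothetical choices made here.

```python
import math
from collections import Counter


def js_divergence(p, q):
    """Jensen-Shannon divergence (log base 2) of two discrete distributions.

    With base-2 logarithms the value lies in [0, 1]: 0 for identical
    distributions, 1 for distributions with disjoint support.
    """
    def kl(a, b):
        # Terms with a-probability 0 contribute nothing to the KL sum.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    # Mixture distribution; it is positive wherever p or q is positive.
    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def labelset_distribution(window, support):
    """Relative frequency of each label combination in a window."""
    counts = Counter(frozenset(labels) for labels in window)
    return [counts[s] / len(window) for s in support]


def drift_detected(reference, recent, threshold=0.1):
    """Flag drift when the two windows' label-combination distributions
    diverge by more than `threshold` (a hypothetical, untuned value)."""
    support = list({frozenset(labels) for labels in reference + recent})
    p = labelset_distribution(reference, support)
    q = labelset_distribution(recent, support)
    return js_divergence(p, q) > threshold


# Usage: each instance is represented only by its set of labels.
reference_window = [{"a"}, {"a", "b"}, {"a"}, {"a", "b"}]
recent_window = [{"c"}, {"c"}, {"b", "c"}, {"c"}]
print(drift_detected(reference_window, recent_window))
```

Jensen–Shannon divergence is symmetric and bounded, which makes a fixed threshold easier to interpret than with the unbounded Kullback–Leibler divergence; how the threshold and window sizes should be set is left open here.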

Highlights

In recent years, applications such as sensor networks [1], spam filtering [2], intrusion detection [3], and credit card fraud detection [4] have produced continuously arriving data known as data streams [5].
To address the challenges of multi-label stream classification, we develop an efficient ensemble scheme for multi-label data streams that takes label dependencies into account and deals with different types of concept drift.
This paper introduces four popular performance metrics designed for multi-label data stream classification: Hamming loss, Subset accuracy, F1, and Log-Loss.
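The four metrics named above can be sketched as follows. This is a minimal illustration, not the paper's exact definitions, which are not given in this excerpt: F1 is computed here in its example-based form, and Log-Loss as the mean per-label binary cross-entropy, both of which are assumptions. Labels are 0/1 vectors and all function names are hypothetical.

```python
import math


def hamming_loss(y_true, y_pred):
    """Fraction of individual label assignments that are wrong."""
    wrong = sum(t != p
                for yt, yp in zip(y_true, y_pred)
                for t, p in zip(yt, yp))
    return wrong / (len(y_true) * len(y_true[0]))


def subset_accuracy(y_true, y_pred):
    """Fraction of instances whose whole label vector is predicted exactly."""
    exact = sum(list(yt) == list(yp) for yt, yp in zip(y_true, y_pred))
    return exact / len(y_true)


def example_f1(y_true, y_pred):
    """Example-based F1: per-instance F1 averaged over instances.

    Assumption: an instance with no true and no predicted labels scores 1.
    """
    total = 0.0
    for yt, yp in zip(y_true, y_pred):
        tp = sum(t and p for t, p in zip(yt, yp))
        denom = sum(yt) + sum(yp)
        total += 1.0 if denom == 0 else 2 * tp / denom
    return total / len(y_true)


def log_loss(y_true, p_pred, eps=1e-15):
    """Mean binary cross-entropy over all instance-label pairs.

    Probabilities are clipped to [eps, 1 - eps] to avoid log(0).
    """
    total = 0.0
    for yt, pp in zip(y_true, p_pred):
        for t, p in zip(yt, pp):
            p = min(max(p, eps), 1 - eps)
            total -= t * math.log(p) + (1 - t) * math.log(1 - p)
    return total / (len(y_true) * len(y_true[0]))


# Usage: two instances, three labels each.
y_true = [[1, 0, 1], [0, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 0]]
print(hamming_loss(y_true, y_pred), subset_accuracy(y_true, y_pred),
      example_f1(y_true, y_pred))
```

Hamming loss and Log-Loss are lower-is-better, while Subset accuracy and F1 are higher-is-better; Subset accuracy is the strictest of the four because a single wrong label makes the whole instance count as an error.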

Summary

Introduction

Applications such as sensor networks [1], spam filtering [2], intrusion detection [3], and credit card fraud detection [4] have produced continuously arriving data known as data streams [5]. In the data stream model, instances arrive at a high rate, and algorithms must process them under strict time and memory constraints [6]. Traditional methods focus on classifying data streams in single-label scenarios, where each instance belongs to a single label. However, many real-world applications involve multi-label data streams. Multi-label stream classification is a non-trivial task, because traditional multi-label classification approaches work in the batch setting. An important feature of multi-label data streams is concept drift, i.e., the underlying distribution of the data may change over time. Such changes might deteriorate the predictive accuracy of classifiers.



Journal: Information
Publication Date: Apr 28, 2019
Citations: 13
License type: CC BY 4.0
