Abstract
The ever-increasing rate of data generation confronts us with the problem of handling massive amounts of information online. One of the biggest challenges is how to extract valuable information from these massive, continuous data streams in a single scan. In a data stream context, data arrive continuously at high speed; therefore, the algorithms developed for this setting must be efficient in their use of memory and time and capable of detecting changes over time in the underlying distribution that generates the data. This work describes a novel method for the task of pattern classification over a continuous data stream based on an associative model. The proposed method is based on the Gamma classifier, a supervised pattern recognition model inspired by the Alpha-Beta associative memories. It is capable of handling the space and time constraints inherent to data stream scenarios. The Data Streaming Gamma classifier (DS-Gamma classifier) implements a sliding-window approach to provide concept drift detection and a forgetting mechanism. To test the classifier, several experiments were performed on different data stream scenarios with real and synthetic data streams. The experimental results show that the method exhibits competitive performance when compared to other state-of-the-art algorithms.
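As a point of reference, the minimal sketch below shows one way a sliding-window stream classifier with a forgetting mechanism and a simple drift signal can be organized. It is not the DS-Gamma classifier itself: the Gamma similarity operator is replaced by a plain 1-NN vote over the window purely for illustration, and the class name, default window size, and drift threshold are assumptions made for this sketch.

```python
from collections import deque

class SlidingWindowStreamClassifier:
    """Illustrative sliding-window stream classifier (NOT the DS-Gamma classifier).

    Forgetting: the window is a bounded deque, so the oldest examples are
    dropped automatically. Drift signal: the error rate over the window is
    compared against a fixed threshold.
    """

    def __init__(self, window_size=500, drift_threshold=0.3):
        self.window = deque(maxlen=window_size)   # (x, y) pairs; oldest fall out
        self.errors = deque(maxlen=window_size)   # recent 0/1 prediction errors
        self.drift_threshold = drift_threshold

    def predict(self, x):
        if not self.window:
            return None
        # 1-NN over the current window (placeholder for the Gamma operator)
        nearest = min(
            self.window,
            key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], x)),
        )
        return nearest[1]

    def partial_fit(self, x, y):
        # Prequential (test-then-train) bookkeeping: predict first, then learn.
        y_hat = self.predict(x)
        if y_hat is not None:
            self.errors.append(int(y_hat != y))
        self.window.append((x, y))

    def drift_detected(self):
        # Crude drift signal: windowed error rate exceeds the threshold.
        if len(self.errors) < self.errors.maxlen:
            return False
        return sum(self.errors) / len(self.errors) > self.drift_threshold
```

The bounded deque is what implements the forgetting mechanism; any incremental drift detector could replace the crude error-rate threshold used here.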
Highlights
In recent years, technological advances have promoted the generation of vast amounts of information from different areas of knowledge: sensor networks, financial data, fraud detection, and web data, among others.
In this paper, we describe a novel method for the task of pattern classification over a continuous data stream based on an associative model.
The performance of the DS-Gamma classifier improves as the window size increases, with the exception of the Electricity data stream, for which performance is better with smaller window sizes.
Summary
Technological advances have promoted the generation of a vast amount of information from different areas of knowledge: sensor networks, financial data, fraud detection, and web data, among others. According to a study by IDC (International Data Corporation) [1], the digital universe in 2013 was estimated at 4.4 trillion gigabytes. Of this digital data, only 22% would be a candidate for analysis, while the available storage capacity could hold just 33% of the generated information. Unlike traditional algorithms, those developed for this context must meet the constraints defined in [2]: work with a limited amount of time, use a limited amount of memory, and make one or only a few passes over the data. They should also be capable of reacting to concept drift, that is, changes in the distribution of the data over time. More details on the requirements for data stream algorithms can be found in [3].
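To make these constraints concrete, the toy loop below processes a synthetic stream in a single pass with constant memory, interleaving prediction and update in the prequential (test-then-train) fashion. The stream generator, the drift point at example 5,000, and the majority-class baseline are all illustrative assumptions and are not taken from the paper.

```python
import random

def stream(n=10_000, drift_at=5_000, seed=0):
    """Synthetic binary stream whose class balance shifts halfway (abrupt drift)."""
    rng = random.Random(seed)
    for i in range(n):
        p = 0.8 if i < drift_at else 0.2      # P(class = 1) changes at drift_at
        yield (rng.random(),), int(rng.random() < p)

# Majority-class baseline updated online: O(1) memory, a single pass over the data.
counts = {0: 0, 1: 0}
errors = 0
seen = 0
for x, y in stream():
    y_hat = max(counts, key=counts.get) if seen else 0   # test ...
    errors += int(y_hat != y)
    counts[y] += 1                                        # ... then train
    seen += 1

print(f"prequential error rate: {errors / seen:.3f}")
```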