Mining frequent itemsets from streaming transaction data using genetic algorithms

Sikha Bagui, Patrick Stanley

doi:10.1186/s40537-020-00330-9

Abstract

This paper presents a study of mining frequent itemsets from streaming data in the presence of concept drift. Streaming data, being volatile in nature, is particularly challenging to mine. An approach using genetic algorithms is presented, and various relationships between concept drift, sliding window size, and genetic algorithm constraints are explored. Concept drift is identified by changes in frequent itemsets. The novelty of this work lies in using frequent itemsets to detect concept drift when mining streaming data within a genetic algorithm framework. Formulas are presented for calculating minimum support counts in streaming data using sliding windows. Testing showed that the ratio of the window size to transactions per drift was key to good performance. Good results were hard to obtain when the sliding window was too small, since normal fluctuations in the data could appear to be concept drift. Window size must be managed in conjunction with support and confidence values in order to achieve reasonable results. This method of detecting concept drift performed well when larger window sizes were used.
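To make the sliding-window idea concrete, here is a minimal Python sketch. It is not the authors' method: the abstract does not reproduce the paper's formulas, so the sketch assumes the minimum support count is the support fraction multiplied by the current window size, rounded up. It also counts candidate itemsets exhaustively rather than searching with a genetic algorithm. The names `min_support_count` and `frequent_itemsets` are hypothetical.

```python
from collections import deque
from itertools import combinations
from math import ceil


def min_support_count(min_support, window_size):
    """Assumed rule: smallest count that meets the support fraction."""
    return ceil(min_support * window_size)


def frequent_itemsets(window, min_support, max_len=2):
    """Exhaustively count itemsets of up to max_len items in the window."""
    threshold = min_support_count(min_support, len(window))
    counts = {}
    for transaction in window:
        items = sorted(set(transaction))
        for k in range(1, max_len + 1):
            for combo in combinations(items, k):
                counts[combo] = counts.get(combo, 0) + 1
    return {frozenset(c) for c, n in counts.items() if n >= threshold}


# A sliding window that keeps only the 4 most recent transactions.
window = deque(maxlen=4)
for t in [["a", "b"], ["a", "b", "c"], ["a", "c"], ["b", "c"], ["a", "b"]]:
    window.append(t)

# With a support fraction of 0.75 and 4 transactions, the threshold is 3.
result = frequent_itemsets(window, min_support=0.75)
```

Because the threshold scales with the window length, a small window makes the count sensitive to a single transaction, which is consistent with the abstract's remark that small windows are hard to work with.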

Highlights

Today’s digital world is constantly generating data from traffic sensors, health sensors, customer transactions, and various other Internet of Things (IoT) devices.
Concept drift detection: in streaming data, a stable concept is identified by the set of frequent itemsets remaining constant in both number and content, despite data flowing through the window.
Testing highlighted that the ratio of the window size to transactions per drift was key to good performance.
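The stable-concept criterion in the highlights can be sketched as a comparison between the frequent itemsets of two successive windows. The sketch below is an illustration under stated assumptions, not the paper's procedure: the function names are hypothetical, and the `tolerance` parameter (a Jaccard-distance threshold) is an addition for illustration. With the default tolerance of 0.0, any change in the number or content of the frequent itemsets is flagged as drift.

```python
def itemset_change(previous, current):
    """Jaccard distance between two sets of frequent itemsets (0.0 = identical)."""
    union = previous | current
    if not union:
        return 0.0
    return 1.0 - len(previous & current) / len(union)


def drift_detected(previous, current, tolerance=0.0):
    """Flag drift when the frequent itemsets change by more than `tolerance`."""
    return itemset_change(previous, current) > tolerance


stable = {frozenset({"a"}), frozenset({"b"}), frozenset({"a", "b"})}
same = set(stable)
shifted = {frozenset({"a"}), frozenset({"c"}), frozenset({"a", "c"})}

print(drift_detected(stable, same))     # False: itemsets are unchanged
print(drift_detected(stable, shifted))  # True: only {a} survives
```

A nonzero tolerance is one possible way to keep normal fluctuations in a small window from being reported as drift; whether that suits a given stream is a tuning question this sketch does not answer.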

Summary

Introduction

Today’s digital world is constantly generating data from traffic sensors, health sensors, customer transactions, and various other Internet of Things (IoT) devices. Continuous, never-ending streams of Big Data are creating new challenges for data mining. Mining only static data in snapshots of time is no longer useful. Streaming data, being dynamic or volatile in nature, has patterns that change over time; this is more technically known as concept drift. Algorithms developed for mining streaming data have to be able to detect and work with concept drift, hence the need for new streaming data mining approaches. This work looks at an important data mining technique, frequent itemset mining, applied to streaming transaction data in the presence of concept drift.

Journal: Journal of Big Data
Publication Date: Jul 25, 2020
Citations: 9
License type: open-access

Similar Papers

One-class classifiers with incremental learning and forgetting for data streams with concept drift
Bartosz Krawczyk ... Michał Woźniak
Soft Computing | VOL. 19
21 Oct 2014

Using Diversity Ensembles with Time Limits to Handle Concept Drift
Robert Van Camp
20 Dec 2016

A novel technique for mining closed frequent itemsets using variable sliding window
Vikas Kumar ... Sangita Rani Satapathy
01 Feb 2014

Online Machine Learning from Non-stationary Data Streams in the Presence of Concept Drift and Class Imbalance: A Systematic Review
Abdul Sattar Palli ... Abdul Rehman Gilal
Journal of Information and Communication Technology | VOL. 23
30 Jan 2024
