Sliding Window-based Frequent Itemsets Mining over Data Streams using Tail Pointer Table

Bo Jin, Le Wang, Lin Feng

doi:10.1080/18756891.2013.859860

Abstract

Mining frequent itemsets over transaction data streams is critical for many applications, such as wireless sensor networks, analysis of retail market data, and stock market prediction. The sliding window method is an important way of mining frequent itemsets over data streams. The speed of a sliding-window method depends not only on the efficiency of the mining algorithm but also on the efficiency of updating the data. In this paper, we propose a new data structure with a Tail Pointer Table and a corresponding mining algorithm; we also propose an algorithm, COFI2, a revised version of the frequent itemsets mining algorithm COFI (Co-Occurrence Frequent-Item), to reduce time and memory requirements. Theoretical analysis and experiments are carried out to demonstrate their effectiveness.
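To make the sliding-window setting concrete, the following is a minimal sketch of window maintenance over batches of transactions. It is not the Tail Pointer Table structure or COFI2 from the paper: it simply enumerates small itemsets by brute force, and the class name `SlidingWindowMiner` and the parameters `window_size` and `max_len` are hypothetical names chosen for illustration. It only shows why the cost of updating the window (adding the newest batch, retiring the oldest) matters alongside the cost of mining.

```python
from collections import Counter, deque
from itertools import combinations


class SlidingWindowMiner:
    """Illustrative only: keeps per-batch itemset counts for the last
    `window_size` batches. Not the paper's TPT-tree or COFI2."""

    def __init__(self, window_size, max_len=2):
        self.window_size = window_size  # number of batches kept in the window
        self.max_len = max_len          # largest itemset size enumerated
        self.batches = deque()          # (counts, batch_size) per batch
        self.totals = Counter()         # itemset counts over the whole window
        self.n_transactions = 0

    def _count_batch(self, batch):
        # Brute-force enumeration of all itemsets up to max_len items.
        counts = Counter()
        for transaction in batch:
            items = sorted(set(transaction))
            for k in range(1, self.max_len + 1):
                for itemset in combinations(items, k):
                    counts[itemset] += 1
        return counts

    def add_batch(self, batch):
        # Add the newest batch to the window.
        counts = self._count_batch(batch)
        self.batches.append((counts, len(batch)))
        self.totals.update(counts)
        self.n_transactions += len(batch)
        # Retire the oldest batch once the window is over capacity.
        if len(self.batches) > self.window_size:
            old_counts, old_size = self.batches.popleft()
            self.totals.subtract(old_counts)
            self.n_transactions -= old_size

    def frequent_itemsets(self, min_support):
        # min_support is a fraction of the transactions currently in the window.
        threshold = min_support * self.n_transactions
        return {s: c for s, c in self.totals.items() if c > 0 and c >= threshold}


miner = SlidingWindowMiner(window_size=2)
miner.add_batch([["a", "b"], ["a", "c"]])
miner.add_batch([["a", "b"], ["b", "c"]])
print(miner.frequent_itemsets(0.5))
```

Storing one counter per batch makes retiring the oldest batch a simple subtraction, at the price of enumerating every small itemset; avoiding that kind of cost is what motivates compact tree structures for window data.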

Highlights

Since Agrawal [1] developed Apriori, the first algorithm for mining frequent itemsets from static sales datasets, in 1994, new algorithms have been proposed continually for various sub-domains of frequent itemsets mining, such as traditional frequent itemsets in certain datasets [2, 3, 4, 5, 6], high utility itemsets [7, 8, 9, 10, 11], and frequent itemsets in uncertain datasets [12, 13, 14].
We propose a new data structure, called the TPT-tree (Tail Pointer Table tree), to store the stream data of a window; it improves the efficiency of updating data and uses less memory than DST/DSP. We also propose a corresponding algorithm, called COFI2, for mining frequent itemsets over data streams.
The experiments show that our proposed algorithm TPT performs better than DST under varied minimum support thresholds and varied batch sizes, and that its advantage remains stable as the data stream accumulates.

Summary

Introduction

Since Agrawal [1] developed Apriori, the first algorithm for mining frequent itemsets from static sales datasets, in 1994, new algorithms have been proposed continually for various sub-domains of frequent itemsets mining, such as traditional frequent itemsets in certain datasets [2, 3, 4, 5, 6], high utility itemsets [7, 8, 9, 10, 11], and frequent itemsets in uncertain datasets [12, 13, 14]. These approaches can be classified into two categories: level-wise approaches and pattern-growth approaches. We propose a new data structure, called the TPT-tree (Tail Pointer Table tree), to store the stream data of a window; it improves the efficiency of updating data and uses less memory than DST/DSP. We also propose a corresponding algorithm, called COFI2, for mining frequent itemsets over data streams. The article is organized as follows: Section 2 discusses related work; Section 3 describes the problem and defines relevant terms; Section 4 introduces the TPT-tree structure and a corresponding algorithm; Section 5 presents the experimental results; and Section 6 gives conclusions.
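The level-wise category mentioned above can be illustrated with a small Apriori-style sketch on a static dataset. This is a simplified teaching version, not the paper's method and not an optimized implementation; the function name `apriori` and the absolute threshold `min_count` are assumptions made for this example. Each level builds candidate k-itemsets from the frequent (k-1)-itemsets, prunes candidates that have an infrequent subset, and then counts the survivors in one pass over the data.

```python
from itertools import combinations


def apriori(transactions, min_count):
    """Simplified level-wise mining; returns {frozenset: support count}."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_count}
    frequent = dict(current)

    k = 1
    while current:
        k += 1
        # Join step: unions of frequent (k-1)-itemsets that have exactly k items.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset must itself be frequent.
            if all(frozenset(sub) in current for sub in combinations(union, k - 1)):
                candidates.add(union)
        # Count step: one scan of the data per level.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_count}
        frequent.update(current)
    return frequent


data = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c"]]
for itemset, count in sorted(apriori(data, 3).items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

The repeated scan of the data at every level is the main cost of this style; pattern-growth approaches instead compress the transactions into a tree and mine that tree.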

Related work

Data structures

Algorithms of mining frequent itemsets

Description of the problem

Structure of TPT-tree

Modify tail support numbers of TPT

Experimental analyses

Findings

Conclusions


Journal: International Journal of Computational Intelligence Systems	Publication Date: Jan 1, 2014
Citations: 4	License type: cc-by


