Mining Approximate Frequent Itemsets Using Pattern Growth Approach

Shariq Bashir, Daphne Teck Ching Lai

doi:10.5755/j01.itc.50.4.29060


Open Access

https://doi.org/10.5755/j01.itc.50.4.29060


Abstract

Mining approximate frequent itemsets (AFIs) from noisy databases is computationally more expensive than traditional frequent itemset mining, because AFI mining algorithms generate a large number of candidate itemsets. This article proposes an algorithm that mines AFIs using a pattern growth approach. The major contribution of the proposed approach is that it mines core patterns and examines the approximate conditions of candidate AFIs directly, in a single phase and with two full scans of the database. Related algorithms apply an Apriori-based candidate generate-and-test approach and require multiple phases to obtain the complete set of AFIs: the first phase generates core patterns, and the second phase examines the approximate conditions of the core patterns. Specifically, the article proposes novel techniques for mapping transactions onto an approximate FP-tree and for mining AFIs from the conditional patterns of the approximate FP-tree. The approximate FP-tree maps transactions onto shared branches when the transactions share a similar set of items. This reduces the size of the database and helps to compute the approximate conditions of candidate itemsets efficiently. We compare the performance of our algorithm with state-of-the-art AFI mining algorithms on benchmark databases. The experiments compare the processing time of the algorithms and their scalability with varying database size and transaction length. The results show that the pattern growth approach mines AFIs in less processing time than related Apriori-based algorithms.
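To make the "two full scans" and "shared branches" ideas concrete, the sketch below builds a standard (exact) FP-tree in Python. It is not the authors' approximate FP-tree, which shares branches among transactions with similar rather than identical item prefixes; it only illustrates the underlying structure. The names `Node`, `build_fp_tree`, `count_nodes`, and the sample transactions are hypothetical.

```python
from collections import Counter


class Node:
    """One FP-tree node: an item, its count, and child nodes keyed by item."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(transactions, min_support):
    """Build a standard FP-tree with two scans (illustrative sketch only)."""
    # Scan 1: count in how many transactions each item occurs.
    freq = Counter(item for t in transactions for item in set(t))
    frequent = {item for item, c in freq.items() if c >= min_support}

    root = Node(None)
    # Scan 2: insert each transaction, most frequent items first, so that
    # transactions with a common prefix share one branch.
    for t in transactions:
        ordered = sorted(
            (item for item in set(t) if item in frequent),
            key=lambda item: (-freq[item], item),
        )
        node = root
        for item in ordered:
            if item not in node.children:
                node.children[item] = Node(item, node)
            node = node.children[item]
            node.count += 1
    return root


def count_nodes(node):
    """Number of item nodes below `node` (the root itself is not counted)."""
    return sum(1 + count_nodes(child) for child in node.children.values())


# Hypothetical toy database: 8 item occurrences in total.
sample = [["a", "b", "c"], ["a", "b", "d"], ["a", "c"]]
tree = build_fp_tree(sample, min_support=1)
```

In this toy database the eight item occurrences are stored in five tree nodes, because all three transactions share the branch starting at `a`. This is the kind of compression the abstract refers to.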

Highlights

Mining frequent itemsets from databases is an important data mining task
Our proposed algorithm mines approximate frequent itemsets (AFIs) using the concept of core patterns [10, 11] by exploring the complete search space of candidate itemsets
We analyze the performance of AFI mining algorithms with respect to the following three aspects. First, we compare all algorithms in terms of the processing time they consume for mining the complete set of AFIs. Second, we compare the performance of the algorithms on varying database sizes

Summary

Introduction

Mining frequent itemsets from databases is an important data mining task. It has many practical applications, including document clustering [15, 40], social network analysis [23, 34], market basket analysis [17], fraud detection [14], bioinformatics [13, 28, 33], and mining patterns from web logs [22, 38]. Transactions 10, 20, and 50 each contain three of the four items of efcb, and every single item of efcb appears in at least two of these transactions (10, 20, and 50). This approximate-match mining concept is appealing because it discovers long frequent itemsets. To examine the AFI conditions of core patterns, the Apriori-based algorithm scans the original database multiple times to calculate the supports of itemsets and items.
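The approximate-match condition described above can be sketched as a small Python check. This is a minimal illustration, not the authors' algorithm: the function name `is_approximate_match`, the tolerance parameters `row_tol` and `col_tol`, and the contents of transactions 10, 20, and 50 are all assumptions, since this summary does not list the transactions themselves. The contents were chosen only to satisfy the two stated properties.

```python
from fractions import Fraction
from math import ceil


def is_approximate_match(itemset, transactions, row_tol, col_tol):
    """Check the two noise-tolerance conditions for an itemset.

    row_tol: fraction of the itemset's items a transaction may miss.
    col_tol: fraction of the supporting transactions an item may miss.
    (Both parameter names are hypothetical.)
    """
    items = set(itemset)
    rows = list(transactions.values())
    min_items_per_row = ceil((1 - row_tol) * len(items))
    min_rows_per_item = ceil((1 - col_tol) * len(rows))

    # Row condition: every transaction contains enough of the itemset.
    rows_ok = all(len(items & row) >= min_items_per_row for row in rows)
    # Column condition: every item occurs in enough of the transactions.
    cols_ok = all(
        sum(item in row for row in rows) >= min_rows_per_item
        for item in items
    )
    return rows_ok and cols_ok


# Hypothetical contents for transactions 10, 20, and 50.
transactions = {
    10: {"e", "f", "c", "a"},  # misses b
    20: {"f", "c", "b", "d"},  # misses e
    50: {"e", "c", "b"},       # misses f
}

# Tolerate one missing item out of four per transaction, and one missing
# transaction out of three per item. Fractions avoid float rounding.
approx = is_approximate_match("efcb", transactions, Fraction(1, 4), Fraction(1, 3))
exact = is_approximate_match("efcb", transactions, 0, 0)
```

With these assumed transactions, `approx` is `True` while `exact` is `False`: no transaction contains all four items, so exact frequent itemset mining would not report efcb at all, whereas the approximate conditions accept it.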

Related Work

Design and Construction

Mining Approximate Frequent Itemsets from Apx-FP-tree

Experiments

Findings

Conclusion


Journal: Information Technology and Control	Publication Date: Dec 16, 2021
Citations: 4	License type: cc-by


