Abstract

High-utility itemset mining (HUIM) is an emerging approach for detecting high-utility patterns in databases. Most existing HUIM algorithms consider only the utility of an itemset, regardless of its length. As a result, utility tends to rise simply because the itemset grows. High average-utility itemset mining (HAUIM) takes the size of the itemset into account, thus providing a more balanced measure, the average utility, for decision-making. Several algorithms have been presented to efficiently mine the set of high average-utility itemsets (HAUIs), but most of them handle only static databases. A fast-updated (FUP)-based algorithm was previously developed to handle the incremental problem, but it still has to re-scan the database when an itemset is small in the original database but is a high average-utility upper-bound itemset (HAUUBI) in the newly inserted transactions. In this paper, an efficient framework called PRE-HAUIMI is developed for transaction insertion in dynamic databases; it relies on average-utility-list (AUL) structures. Moreover, we apply the pre-large concept to HAUIM to speed up mining: it ensures that, as long as the total utility of the newly inserted transactions is within the safety bound, itemsets that are small in the original database cannot become large after the database is updated. This, in turn, reduces repeated database scans while still obtaining the correct HAUIs. Experiments demonstrate that PRE-HAUIMI outperforms the state-of-the-art batch-mode HAUI-Miner and the state-of-the-art incremental IHAUPM and FUP-based algorithms in terms of runtime, memory, number of assessed patterns, and scalability.
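The safety bound mentioned in the abstract can be illustrated with a small worst-case calculation. The sketch below is not taken from the paper, whose exact formula is not given in this summary. It assumes that an upper threshold `upper` and a lower threshold `lower` are both expressed as ratios of the total database utility, and that an itemset's upper-bound value in the inserted transactions can never exceed their total utility. Under those assumptions, an itemset below the lower threshold in the original database cannot reach the upper threshold while the inserted utility stays at or below `(upper - lower) * TU / (1 - upper)`. The function and parameter names are hypothetical.

```python
def safety_bound(upper: float, lower: float, total_utility_original: float) -> float:
    """Worst-case amount of inserted utility that cannot turn a small itemset large.

    Assumption (not from the source): thresholds are ratios of total utility,
    with 0 <= lower <= upper < 1.
    """
    if not 0 <= lower <= upper < 1:
        raise ValueError("expected 0 <= lower <= upper < 1")
    return (upper - lower) * total_utility_original / (1 - upper)


def rescan_needed(bound: float, inserted_utility_so_far: float) -> bool:
    """True once the accumulated inserted utility exceeds the safety bound."""
    return inserted_utility_so_far > bound


# Hypothetical numbers: original database utility 100, thresholds 50% and 25%.
bound = safety_bound(upper=0.5, lower=0.25, total_utility_original=100.0)
```

With these hypothetical numbers the bound is 50.0, so batches of inserted transactions could accumulate up to 50 units of utility before the original database would have to be scanned again.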

Highlights

  • Association-rule mining (ARM) [1, 6, 7, 8, 15] is the most popular method for discovering relationships among itemsets in databases, revealing potential and implicit information

  • Each case has its own model for efficiently updating and maintaining the discovered frequent itemsets, according to whether an itemset exists in the original database or in the newly inserted transactions. This approach has been applied to ARM [5, 16], high-utility itemset mining (HUIM) [22, 23], and high average-utility itemset mining (HAUIM) [33, 36]

  • We focus on maintaining the discovered high average-utility itemsets (HAUIs) under transaction insertion, a realistic scenario in the retail industry

Summary

Introduction

Association-rule mining (ARM) [1, 6, 7, 8, 15] is the most popular method for discovering relationships among itemsets in databases, revealing potential and implicit information. Hong et al. designed the TPAU algorithm [17] to identify the set of high average-utility itemsets (HAUIs). It uses the average-utility upper-bound (auub) model to estimate an upper bound on the average utility of an itemset; the resulting high average-utility upper-bound itemsets (HAUUBIs) maintain the downward closure property, which reduces the number of candidates. A tree-based approach is capable of mining the set of HAUIs without creating new candidates. It is based on pattern growth, but each node in the tree structure keeps an additional array of further information, namely the quantity values of its prefix items in the path. An efficient average-utility (AU)-list framework [31] was developed to speed up the mining of HAUIs. Based on a simple join operation, the required information about HAUIs can be retrieved without candidate generation. The performance is measured in terms of runtime, memory, number of assessed patterns, and scalability.
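To make the average-utility and auub notions above concrete, here is a minimal sketch using the definitions commonly given in the HAUIM literature: the average utility of an itemset is its total utility over the transactions that contain it, divided by the number of items; its auub is the sum, over those same transactions, of the largest single-item utility in each transaction. The toy database, the profit table, and all function names are hypothetical and do not come from the paper.

```python
from typing import Dict, Iterable, List

# Hypothetical toy data: each transaction maps an item to its purchased quantity.
PROFIT: Dict[str, int] = {"a": 5, "b": 1, "c": 2}
DB: List[Dict[str, int]] = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 3},
    {"b": 4, "c": 1},
]


def contains(tx: Dict[str, int], items: List[str]) -> bool:
    """True if the transaction holds every item of the itemset."""
    return all(item in tx for item in items)


def average_utility(itemset: Iterable[str], db: List[Dict[str, int]],
                    profit: Dict[str, int]) -> float:
    """Total utility of the itemset in its supporting transactions,
    divided by the itemset length."""
    items = list(itemset)
    total = sum(tx[i] * profit[i]
                for tx in db if contains(tx, items)
                for i in items)
    return total / len(items)


def auub(itemset: Iterable[str], db: List[Dict[str, int]],
         profit: Dict[str, int]) -> int:
    """Sum of the maximum single-item utility of each supporting transaction."""
    items = list(itemset)
    return sum(max(qty * profit[i] for i, qty in tx.items())
               for tx in db if contains(tx, items))
```

On this toy data, `average_utility(["a", "b"], DB, PROFIT)` is 3.5 and `auub(["a", "b"], DB, PROFIT)` is 5. The upper bound of the superset {a, b} (5) does not exceed that of its subset {a} (15), which is the downward closure behaviour that allows candidates to be pruned.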

High average-utility itemset mining
Incremental mining
Preliminaries and problem statement
Proposed PRE-HAUIMI framework for transaction insertion
The utilized pre-large concept
Proposed PRE-HAUIMI algorithm
Complexity analysis
A running example
Experimentation and findings
Runtime
Memory usage
Number of assessed patterns
Scalability
Conclusion and future work