Abstract
Analyzing customer transactions to discover high-utility itemsets is a popular task, which consists of finding the sets of items that are purchased together and yield a high profit. However, many studies assume that transactional data is static, whereas in real life it changes over time. For example, the unit profits of items may vary from one week to another because sale prices and production costs may change. Many algorithms for mining high-utility itemsets (HUIs) ignore this important property and thus are inapplicable or generate inaccurate results on real data. To address this issue, this paper proposes a novel algorithm named Multi-Core HUI Miner (MCH-Miner). It adapts techniques introduced in the iMEFIM algorithm to run on a parallel multi-core architecture and efficiently mine HUIs in dynamic transaction databases. An empirical evaluation shows that in most cases, MCH-Miner is significantly faster than iMEFIM, and that the cost of database scans is reduced.
Highlights
A key problem in the field of data mining is frequent itemset mining (FIM), which was introduced in 1994 by Agrawal and Srikant [1].
FIM algorithms rely on the support framework to discover itemsets whose occurrence frequencies are no less than a minimum support threshold.
FIM treats all items of a transaction database as having the same importance.
Summary
A key problem in the field of data mining is frequent itemset mining (FIM), which was introduced in 1994 by Agrawal and Srikant [1]. A high-utility itemset mining (HUIM) algorithm operates on databases with utility information to discover itemsets having a utility that is no less than a user-specified threshold, named the minimum utility (minutil). Itemsets meeting this constraint are called high-utility itemsets (HUIs). To mine HUIs efficiently when unit profits change over time, Nguyen et al. recently proposed a framework [12] that drops the unrealistic assumption of static profits and makes HUI mining possible in real-world databases with dynamic profit values. The authors also proposed a compact format to store the transactions along with all their utility information [12]. Using this framework, all the currently available HUIM algorithms can be applied to dynamic profit databases and generate accurate results.
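To make the definition of a high-utility itemset concrete, the following minimal sketch computes itemset utilities on a toy database in which unit profits depend on the period of each transaction. It is a brute-force enumeration for illustration only, not MCH-Miner, iMEFIM, or the compact format of [12]. The data layout, the function names `utility` and `high_utility_itemsets`, and all numbers are assumptions made up for this example.

```python
from itertools import combinations

# Hypothetical toy data: each transaction records the period in which it
# occurred and the purchased quantity of each item.
transactions = [
    {"period": 1, "items": {"a": 2, "b": 1}},
    {"period": 1, "items": {"a": 1, "c": 3}},
    {"period": 2, "items": {"a": 1, "b": 2, "c": 1}},
]

# Hypothetical unit profits per period: the profit of "a" drops in period 2.
unit_profit = {
    1: {"a": 5, "b": 2, "c": 1},
    2: {"a": 3, "b": 2, "c": 1},
}


def utility(itemset, transactions, unit_profit):
    """Sum quantity * unit profit over all transactions containing the itemset,
    using the unit profits in effect in each transaction's period."""
    total = 0
    for t in transactions:
        if all(item in t["items"] for item in itemset):
            profits = unit_profit[t["period"]]
            total += sum(t["items"][item] * profits[item] for item in itemset)
    return total


def high_utility_itemsets(transactions, unit_profit, minutil):
    """Enumerate every itemset and keep those with utility >= minutil.
    Exponential in the number of items; meant only to illustrate the definition."""
    items = sorted({item for t in transactions for item in t["items"]})
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            u = utility(itemset, transactions, unit_profit)
            if u >= minutil:
                result[itemset] = u
    return result


huis = high_utility_itemsets(transactions, unit_profit, minutil=12)
print(huis)
```

With `minutil=12`, the itemsets {a}, {a, b}, and {a, c} qualify, with utilities 18, 19, and 12. If the period 1 profit of `a` were wrongly applied to every transaction, the utility of {a} would be 20 instead of 18, which illustrates how a static-profit assumption can distort results.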