The classic problems in itemset mining are finding frequent itemsets and finding high-utility itemsets. However, frequent itemset mining ignores the profit of products, while high-utility itemset mining ignores their cost price. Therefore, neither can identify products that are highly efficient relative to the investment they require. To overcome these limitations, the high-efficiency itemset mining (HEIM) problem was proposed. Despite its practicality, this problem has received little attention. Existing algorithms for mining high-efficiency itemsets (HEIs) still rely on loose upper bounds and on strategies that are ineffective on dense databases, and therefore require considerable time and memory. To address these issues, the paper proposes tight upper bounds for the early pruning of candidates. Several techniques, such as merging similar transactions and storing the locations of promising transactions, are also proposed to reduce the cost of database scanning. Finally, these techniques are combined into MHEI (an efficient strategy for Mining High-Efficiency Itemsets in quantitative databases), a novel algorithm that optimizes the HEIM process. Experiments show that the proposed algorithm outperforms the state-of-the-art HEIM algorithm.
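To make the notion of efficiency concrete, the following is a minimal brute-force sketch and not the MHEI algorithm itself. It assumes the definition commonly used in the HEIM literature, where the efficiency of an itemset is its utility divided by the total investment of its items; the abstract does not state this formula. The toy database, the `profit` and `investment` tables, and all function names are hypothetical.

```python
from itertools import combinations

# Toy quantitative database: each transaction maps item -> purchased quantity.
# All values below are made up for illustration.
transactions = [
    {"a": 2, "b": 1},
    {"a": 1, "c": 3},
    {"a": 1, "b": 2, "c": 1},
]
profit = {"a": 5, "b": 2, "c": 1}       # unit profit per item (hypothetical)
investment = {"a": 10, "b": 2, "c": 4}  # investment per item (hypothetical)


def utility(itemset, db):
    """Sum quantity * unit profit over transactions containing every item."""
    total = 0
    for t in db:
        if all(item in t for item in itemset):
            total += sum(t[item] * profit[item] for item in itemset)
    return total


def efficiency(itemset, db):
    """Assumed definition: utility divided by the summed item investments."""
    return utility(itemset, db) / sum(investment[item] for item in itemset)


def brute_force_heis(db, min_eff):
    """Enumerate every itemset and keep those with efficiency >= min_eff.

    This is exponential in the number of items; it uses no upper bounds,
    no transaction merging, and no other pruning.
    """
    items = sorted({item for t in db for item in t})
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            eff = efficiency(itemset, db)
            if eff >= min_eff:
                result[itemset] = eff
    return result


if __name__ == "__main__":
    for itemset, eff in brute_force_heis(transactions, 1.5).items():
        print(itemset, utility(itemset, transactions), eff)
```

In this toy data the itemset {a, b} has the highest utility (21), yet {b} alone has the highest efficiency (3.0) because it needs far less investment. Utility and efficiency can therefore rank itemsets differently. Exhaustive enumeration like this is only workable on tiny inputs, which is the cost that tighter upper bounds and cheaper database scans aim to reduce.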