MEMU: More Efficient Algorithm to Mine High Average-Utility Patterns With Multiple Minimum Average-Utility Thresholds

Jerry Chun-Wei Lin, Philippe Fournier-Viger, Shifeng Ren

doi:10.1109/access.2018.2801261

Abstract

High average-utility itemset mining (HAUIM) is an emerging topic in data mining. Compared to traditional high utility itemset mining, HAUIM measures the utility of itemsets more fairly by taking their lengths (number of items) into account. Many previous studies have presented algorithms to efficiently mine high average-utility itemsets (HAUIs). Most of these algorithms, however, mine HAUIs using only a single minimum high average-utility threshold, which limits their usefulness for analyzing real data, because different items are not equally important to the user. The importance of an item can be expressed, for example, in terms of weight, interestingness, or unit profit. A baseline algorithm called HAUIM-MMAU was previously presented to mine HAUIs using multiple minimum high average-utility thresholds. However, it mines HAUIs with a level-wise generate-and-test approach, which is time consuming. In this paper, we propose an efficient algorithm to discover HAUIs based on the average-utility list structure. Instead of the auub model used in traditional HAUIM, a tighter upper-bound model is used to reduce the search space. Three pruning strategies are also developed to improve the performance of mining HAUIs. Experiments show that the proposed algorithm outperforms the state-of-the-art HAUIM-MMAU algorithm in terms of runtime, memory usage, number of candidates, and scalability.
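
As a rough illustration of the average-utility measure with per-item thresholds, the following Python sketch checks itemsets by brute force. It is not the proposed algorithm: it uses no average-utility lists, upper bounds, or pruning. The toy database, the profit and threshold values, and all function names are hypothetical, and the rule that an itemset's threshold is the mean of its items' thresholds is an assumption made here for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical toy database: each transaction maps item -> purchased quantity.
TRANSACTIONS: List[Dict[str, int]] = [
    {"a": 1, "b": 2, "c": 1},
    {"a": 2, "c": 3},
    {"b": 1, "c": 2},
]
# Hypothetical unit profits and per-item minimum average-utility thresholds.
UNIT_PROFIT: Dict[str, int] = {"a": 5, "b": 2, "c": 1}
ITEM_THRESHOLD: Dict[str, float] = {"a": 8, "b": 4, "c": 3}


def average_utility(itemset: Tuple[str, ...]) -> float:
    """Sum, over transactions containing the itemset, of its utility divided by its length."""
    k = len(itemset)
    total = 0.0
    for t in TRANSACTIONS:
        if all(item in t for item in itemset):
            total += sum(t[item] * UNIT_PROFIT[item] for item in itemset) / k
    return total


def itemset_threshold(itemset: Tuple[str, ...]) -> float:
    """Assumed rule: the threshold of an itemset is the mean of its items' thresholds."""
    return sum(ITEM_THRESHOLD[item] for item in itemset) / len(itemset)


def is_haui(itemset: Tuple[str, ...]) -> bool:
    """An itemset qualifies when its average utility reaches its own threshold."""
    return average_utility(itemset) >= itemset_threshold(itemset)


if __name__ == "__main__":
    for candidate in [("a", "c"), ("a", "b")]:
        print(candidate, average_utility(candidate),
              itemset_threshold(candidate), is_haui(candidate))
```

In this toy data, ("a", "c") has an average utility of 9.5 against a threshold of 5.5 and qualifies, while ("a", "b") has 4.5 against 6.0 and does not. Dividing by the itemset length is what keeps longer itemsets from qualifying simply because they contain more items.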

Full Text