Abstract

High Utility Itemset Mining (HUIM) is the process of finding itemsets that are profitable and useful to users. A key flaw of HUIM is that utility tends to increase with the length of the itemset, so the measure does not reveal the itemset's true utility or profit. High Average Utility Itemset Mining (HAUIM) overcomes this limitation by taking the length of the itemset into account when estimating utility. Existing pruning methods for eliminating weak candidates overestimate the average utility of itemsets, which slows down the mining process. To avoid processing unpromising candidate itemsets and to reduce the search space and processing time, the proposed methodology employs Upper Bound using Remaining Items Utility, Maximum Itemset Utility, and Sum of Maximum Utility in a Transaction. It also uses a multithreaded parallel approach to reduce processing time. The H-Map-based data structure (H-Map) used for storing utility values reduces lookup time and the number of joins needed for itemset extension compared with existing state-of-the-art HAUIM algorithms. The performance of the proposed work is evaluated in terms of memory usage and processing time. The proposed work increases the overall efficiency of the system by combining effective pruning of poor candidate itemsets with an efficient data structure for storing utility values.
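The length bias described above can be illustrated with a minimal sketch. It uses the standard definitions (the utility of an itemset is the sum, over the transactions containing it, of quantity times unit profit for its items; the average utility divides that by the number of items). The profit table, the transactions, and the function names `utility` and `average_utility` are hypothetical and are not taken from the paper; the sketch does not implement the proposed pruning bounds or the H-Map structure.

```python
# Hypothetical toy data: unit profit per item, and transactions that map
# each purchased item to its quantity. Not taken from the paper.
PROFIT = {"a": 5, "b": 2, "c": 1}
TRANSACTIONS = [
    {"a": 1, "b": 2, "c": 3},
    {"a": 2, "c": 1},
    {"b": 4, "c": 2},
]


def utility(itemset, transactions=TRANSACTIONS, profit=PROFIT):
    """Sum the utility of `itemset` over every transaction that contains it."""
    total = 0
    for transaction in transactions:
        if all(item in transaction for item in itemset):
            total += sum(transaction[item] * profit[item] for item in itemset)
    return total


def average_utility(itemset, transactions=TRANSACTIONS, profit=PROFIT):
    """Divide the utility by the itemset length to remove the length bias."""
    return utility(itemset, transactions, profit) / len(itemset)


for itemset in [("a",), ("a", "c"), ("a", "b", "c")]:
    print(itemset, utility(itemset), average_utility(itemset))
```

On this toy data, {a} has utility 15 and {a, c} has utility 19, so plain utility favors the longer itemset, while the average utility of {a, c} is only 9.5. That gap is the effect the average-utility measure is meant to expose.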
