Abstract

High Average-Utility Itemset (HAUI) mining is an emerging pattern mining technique that extracts meaningful patterns from a transaction dataset. Several HAUI mining algorithms with efficient upper bounds and pruning strategies have been developed. However, all of these algorithms apply a single minimum average-utility threshold to every itemset, which limits their applicability to real-life datasets. To address this issue, several HAUI mining algorithms with multiple average-utility thresholds have been developed; they process the items in ascending order of their minimum average-utility threshold. However, this ordering makes the approach inapplicable to traditional HAUI mining algorithms. Moreover, changing the order in which items are processed may reduce the performance of the algorithms. This paper presents an HAUI mining algorithm named Generalized High Average-utility Itemset Miner (GHAIM) that processes the items in ascending order of their Average Utility Upper-Bound (AUUB), as the traditional HAUI mining algorithms do. A new approach, named suffix minimum average-utility, is proposed to retain the downward closure property of AUUB and several pruning methods. In addition, a compact list structure is proposed to mine the HAUIs in one phase. Several pruning methods are introduced to reduce the search space and improve efficiency. Extensive experiments on sparse and dense datasets compare the efficiency of GHAIM with that of two existing algorithms. The results show that GHAIM outperforms both existing algorithms in run time, memory consumption, number of candidate itemsets, and scalability.
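
For readers unfamiliar with the quantities named above, the following is a minimal sketch of average utility and AUUB as they are commonly defined in the HAUI mining literature. It is not GHAIM: the suffix minimum average-utility approach and the compact list structure are not reproduced here, because the abstract does not define them. The toy database `DB` and all function names are hypothetical and chosen only for illustration.

```python
from typing import Dict, Iterable, List

# Assumption: a transaction maps each item to its utility in that transaction.
Transaction = Dict[str, int]

# Hypothetical toy database, not taken from the paper.
DB: List[Transaction] = [
    {"a": 5, "b": 2, "c": 1},
    {"a": 3, "c": 6},
    {"b": 4, "c": 2},
]


def average_utility(itemset: Iterable[str], db: List[Transaction]) -> float:
    """Sum, over transactions containing the itemset, of the itemset's
    utility in that transaction divided by the itemset's length."""
    items = frozenset(itemset)
    total = 0.0
    for t in db:
        if items.issubset(t.keys()):
            total += sum(t[i] for i in items) / len(items)
    return total


def auub(itemset: Iterable[str], db: List[Transaction]) -> int:
    """Average Utility Upper-Bound: sum of the maximum item utility of
    every transaction that contains the itemset."""
    items = frozenset(itemset)
    return sum(max(t.values()) for t in db if items.issubset(t.keys()))


def items_by_ascending_auub(db: List[Transaction]) -> List[str]:
    """Order single items by ascending AUUB (ties broken alphabetically)."""
    all_items = sorted({i for t in db for i in t})
    return sorted(all_items, key=lambda i: auub([i], db))


if __name__ == "__main__":
    print(average_utility(["a", "c"], DB))  # 7.5
    print(auub(["a", "c"], DB))             # 11
    print(items_by_ascending_auub(DB))      # ['b', 'a', 'c']
```

In this toy data, the AUUB of `{a, c}` (11) is at least its average utility (7.5) and no larger than the AUUB of `{a}` (11), which is the downward closure behaviour that AUUB-based pruning relies on.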

