Abstract

High-utility itemset mining (HUIM) has recently attracted considerable attention in data mining because it takes into account many factors that are useful to retail managers for decision-making. Many HUIM algorithms have been proposed, but most of them share one limitation: they use a single minimum utility threshold to identify high-utility itemsets (HUIs). In real-life applications, a single threshold is inadequate and unfair because items differ from one another, so the diversity or importance of each item should be considered. This paper addresses the issue by defining the novel task of HUIM with multiple minimum utility thresholds (called HUIM-MMU). The task lets users specify a different minimum utility threshold for each item in order to identify more useful and specific HUIs, which can generate more profit than HUIs discovered with a single minimum utility threshold. The HUI-MMU algorithm is designed to mine HUIs in a level-wise manner. The sorted downward closure (SDC) property and the least minimum utility (LMU) concept are developed to avoid a combinatorial explosion when identifying HUIs and to ensure that HUI-MMU is complete and correct. In addition, two improved algorithms, HUI-MMUTID and HUI-MMUTE, are presented based on the TID-index and EUCP strategies, which speed up the discovery of HUIs. Extensive experiments on both real-life and synthetic datasets show that the designed algorithms can efficiently and effectively discover the complete set of HUIs in databases under multiple minimum utility thresholds.
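
To make the task concrete, the following is a minimal brute-force sketch in Python. It is not the HUI-MMU algorithm: it enumerates every itemset and uses none of the SDC, TID-index, or EUCP pruning. The abstract does not say how an itemset's threshold is derived from the per-item thresholds, so the sketch assumes it is the smallest threshold among the itemset's items, and it assumes the LMU is the smallest threshold over all items. The toy transactions, unit profits, thresholds, and all function names are hypothetical.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 1},
    {"b": 1, "c": 3},
]
unit_profit = {"a": 5, "b": 2, "c": 1}      # hypothetical profit per unit
item_threshold = {"a": 12, "b": 6, "c": 9}  # hypothetical per-item minimum utility


def utility(itemset, transactions, unit_profit):
    """Sum the itemset's utility over every transaction that contains all its items."""
    total = 0
    for t in transactions:
        if all(item in t for item in itemset):
            total += sum(t[item] * unit_profit[item] for item in itemset)
    return total


def itemset_threshold(itemset, item_threshold):
    """ASSUMPTION: an itemset's threshold is the minimum of its items' thresholds."""
    return min(item_threshold[item] for item in itemset)


def least_minimum_utility(item_threshold):
    """ASSUMPTION: the LMU is the smallest per-item threshold in the database."""
    return min(item_threshold.values())


def brute_force_huis(transactions, unit_profit, item_threshold):
    """Enumerate all itemsets and keep those whose utility meets their threshold."""
    items = sorted(unit_profit)
    result = {}
    for k in range(1, len(items) + 1):
        for itemset in combinations(items, k):
            u = utility(itemset, transactions, unit_profit)
            if u >= itemset_threshold(itemset, item_threshold):
                result[itemset] = u
    return result


huis = brute_force_huis(transactions, unit_profit, item_threshold)
print(huis)
# {('a',): 15, ('b',): 6, ('a', 'b'): 9, ('a', 'c'): 11}
```

In this toy run, the itemset {c} is not a HUI (utility 4, threshold 9), yet its superset {a, c} is (utility 11, threshold 9). Utility does not obey ordinary downward closure, which is the kind of situation the SDC property and the LMU concept are meant to handle during pruning.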
