Abstract

High utility itemsets (HUIs) mining is an emerging area of data mining which discovers sets of items generating a high profit from transactional datasets. In recent years, several algorithms have been proposed for this task. However, most of them do not consider the on-shelf time period of items and negative utility of items. High on-shelf utility itemset (HOUIs) mining is more difficult than traditional HUIs mining because it deals with on-shelf-based time period and negative utility of items. Moreover, most algorithms need minimum utility threshold ( min_util ) to find rules. However, specifying the appropriate min_util threshold is a difficult problem for users. A smaller min_util threshold may generate too many rules and a higher one may generate a few rules, which can degrade performance. To address these issues, a novel top-k HOUIs mining algorithm named TKOS ( T op- K high O n- S helf utility itemsets miner) is proposed which considers on-shelf time period and negative utility. TKOS presents a novel branch and bound-based strategy to raise the internal min_util threshold efficiently. It also presents two pruning strategies to speed up the mining process. In order to reduce the dataset scanning cost, we utilize transaction merging and dataset projection techniques. Extensive experiments have been conducted on real and synthetic datasets having various characteristics. Experimental results show that the proposed algorithm outperforms the state-of-the-art algorithms. The proposed algorithm is up to 42 times faster and uses up-to 19 times less memory compared to the state-of-the-art KOSHU. Moreover, the proposed algorithm has excellent scalability in terms of time periods and the number of transactions.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call