A Dynamic Itemset Counting Based Two-Phase Algorithm for Mining High Utility Itemsets

B Anup Bhat, M Geetha, S V Harish

doi:10.1109/indicon45594.2018.8987024

Abstract

High Utility Itemset Mining (HUIM) discovers itemsets from a transactional database based on the quantity and unit price of their items. Since its inception, HUIM has evolved as a generalized form of Frequent Itemset Mining (FIM). Unlike the support of an itemset, which is antimonotone and is exploited by algorithms for mining frequent itemsets, the utility measure is neither antimonotone nor monotone. This makes the problem of mining High Utility Itemsets (HUIs) interesting. In the current study, a novel method based on Dynamic Itemset Counting (DIC) is proposed to optimize the Apriori-like Two-Phase (TP) algorithm for mining HUIs. Although the TP algorithm uses the antimonotonicity of the Transaction Weighted Utility (TWU) of itemsets to prune the search space, it generates candidates in a level-wise manner, which requires multiple database scans to test them. The proposed method tests and generates higher-order candidates at different stops during the database scan and segregates the itemsets for further evaluation. Experiments performed on real-time datasets show a significant improvement in the execution time of the DIC method compared to the TP algorithm.
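The contrast between utility and TWU described in the abstract can be illustrated with a minimal Python sketch. The toy database, the unit profits, and the function names (`transaction_utility`, `utility`, `twu`) are hypothetical and chosen only for illustration; they follow the standard definitions of these measures and are not taken from the paper's implementation.

```python
# Hypothetical toy data: unit profit per item and purchased quantity per transaction.
unit_profit = {"a": 5, "b": 1, "c": 2}
transactions = [
    {"a": 1, "b": 2},
    {"b": 4, "c": 3},
    {"a": 2, "b": 1, "c": 1},
]


def transaction_utility(tx):
    """Total utility of all items in one transaction."""
    return sum(qty * unit_profit[item] for item, qty in tx.items())


def utility(itemset, db):
    """Utility of an itemset: summed over the transactions that contain it,
    counting only the items that belong to the itemset."""
    return sum(
        sum(tx[item] * unit_profit[item] for item in itemset)
        for tx in db
        if all(item in tx for item in itemset)
    )


def twu(itemset, db):
    """Transaction Weighted Utility: summed utility of every transaction
    that contains the itemset (an upper bound on its utility)."""
    return sum(
        transaction_utility(tx)
        for tx in db
        if all(item in tx for item in itemset)
    )


for itemset in [("b",), ("a", "b"), ("a", "b", "c")]:
    print(itemset, "utility =", utility(itemset, transactions),
          "TWU =", twu(itemset, transactions))
```

In this toy database the utility rises from `("b",)` to `("a", "b")` and then falls for `("a", "b", "c")`, so it is neither monotone nor antimonotone, whereas the TWU never increases as items are added. That property is what lets a TWU threshold safely prune supersets of a low-TWU itemset.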

Full Text