Discovering high-utility itemsets (HUIs) in transaction databases, an extension of Frequent Itemset Mining, is a commonly encountered mining task. Researchers have proposed algorithms that efficiently mine highly profitable itemsets in customer transaction databases in which the unit profits of items are fixed. However, this assumption does not reflect real-life transaction databases, where the unit profit of an item may vary over time. Because all current HUI mining algorithms ignore this characteristic, they are either not applicable to this type of database or they generate inaccurate results. In addition, a long-standing limitation of HUI mining algorithms is that they present users with a huge number of HUIs. In this paper, we define the problem of mining a lossless, concise and compact representation of HUIs, called closed HUIs (CHUIs), in dynamic unit profit databases. Based on a newly defined utility measure, we introduce a novel algorithm called iEFIM-Closed. It relies on this new measure and on a novel compact database format to reduce the cost of database scans and increase the efficiency of the mining process. Experimental evaluations show that iEFIM-Closed significantly outperforms state-of-the-art algorithms for mining CHUIs on sparse databases with dynamic profit, and is competitive on dense databases in terms of runtime, database scan cost, and scalability.
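To make the notions of dynamic unit profit and closed HUIs concrete, the following is a minimal brute-force sketch in Python. It is not iEFIM-Closed and does not use the paper's compact database format. It assumes, for illustration only, that each transaction stores the unit profit in effect at the time of purchase, that the utility of an itemset is the sum of quantity times unit profit over the transactions containing it, and that an itemset is closed when no proper superset has the same support. The toy database, the function names and the threshold are all hypothetical.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps
# item -> (quantity, unit profit in effect for that transaction).
# Storing the profit per transaction is an assumption used to model
# "dynamic unit profit"; the numbers are made up.
DB = [
    {"a": (2, 5), "b": (1, 3)},
    {"a": (1, 4), "b": (2, 3), "c": (1, 10)},
    {"a": (3, 6), "c": (2, 8)},
]


def support(itemset, db):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in db if all(i in t for i in itemset))


def utility(itemset, db):
    """Sum of quantity * unit profit over all transactions containing the itemset."""
    total = 0
    for t in db:
        if all(i in t for i in itemset):
            total += sum(t[i][0] * t[i][1] for i in itemset)
    return total


def closed_high_utility_itemsets(db, min_util):
    """Enumerate every itemset (exponential; for illustration only)."""
    items = sorted({i for t in db for i in t})
    candidates = [
        frozenset(c)
        for r in range(1, len(items) + 1)
        for c in combinations(items, r)
    ]
    sup = {x: support(x, db) for x in candidates}
    result = {}
    for x in candidates:
        if sup[x] == 0:
            continue
        # Closed: no proper superset occurs in exactly the same number of transactions.
        is_closed = not any(x < y and sup[y] == sup[x] for y in candidates)
        u = utility(x, db)
        if is_closed and u >= min_util:
            result[x] = u
    return result


chuis = closed_high_utility_itemsets(DB, min_util=30)
```

In this toy database the unit profit of item a is 5, 4 and 6 in the three transactions, so a single fixed profit value would give a different utility for every itemset containing a. A practical miner would avoid the exhaustive enumeration used here by pruning the search space.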