Existing algorithms for mining high utility patterns over a data stream are two-phase algorithms. They do not scale well because the first phase generates a large number of candidates, particularly when the minimum utility threshold is low. Moreover, in the second phase, the algorithm must scan the database again to compute the actual utility of each candidate. In this paper, we propose a one-phase algorithm, SOHUPDS+, to mine high utility itemsets in the current sliding window of a data stream with respect to an absolute or relative minimum utility threshold. To support SOHUPDS+, we propose a data structure, IUDataListSW+, which stores the utility and upper-bound values of the items in the current sliding window and maintains them as the window advances. In addition, we propose a transaction merging strategy, called BitmapTransactionMerging, which reduces the execution time spent computing utility and upper-bound values on denser datasets. We also propose update strategies that reuse the high utility patterns mined from the previous sliding window to update the high utility patterns in the current sliding window. Experimental results show that SOHUPDS+ is more efficient than the state-of-the-art algorithms in terms of both execution time and memory usage in most of the experiments on various datasets.
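To make the problem setting concrete, the following is a minimal brute-force sketch of mining high utility itemsets per sliding window. It is not SOHUPDS+ and does not use IUDataListSW+ or BitmapTransactionMerging; it only illustrates what "utility", "sliding window", and an absolute versus relative threshold mean. The function names, the toy stream, and the representation of a transaction as a mapping from item to its utility in that transaction are all assumptions made for illustration, as is the reading of a relative threshold as a fraction of the total utility in the window.

```python
from collections import deque
from itertools import combinations


def itemset_utility(itemset, window):
    """Sum the utility of `itemset` over every transaction containing all its items."""
    total = 0
    for tx in window:
        if all(item in tx for item in itemset):
            total += sum(tx[item] for item in itemset)
    return total


def mine_window(window, min_util, relative=False):
    """Enumerate every itemset in one window (exponential; illustration only)."""
    if relative:
        # Assumption: a relative threshold is a fraction of the window's total utility.
        total = sum(sum(tx.values()) for tx in window)
        min_util = min_util * total
    items = sorted({item for tx in window for item in tx})
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            u = itemset_utility(itemset, window)
            if u >= min_util:
                result[itemset] = u
    return result


def mine_stream(stream, window_size, min_util, relative=False):
    """Yield the high utility itemsets of each full sliding window."""
    window = deque(maxlen=window_size)  # the oldest transaction drops out automatically
    for tx in stream:
        window.append(tx)
        if len(window) == window_size:
            yield mine_window(window, min_util, relative)


# Hypothetical toy stream: each transaction maps item -> utility in that transaction.
stream = [
    {"a": 5, "b": 2},
    {"a": 5, "c": 1},
    {"b": 4, "c": 3},
    {"a": 10, "b": 2, "c": 1},
]
results = list(mine_stream(stream, window_size=3, min_util=9))
# results[0] == {("a",): 10}
# results[1] == {("a",): 15, ("a", "b"): 12, ("a", "c"): 17,
#                ("b", "c"): 10, ("a", "b", "c"): 13}
```

The sketch recomputes every window from scratch and enumerates all itemsets, which is the kind of cost that a one-phase algorithm with incrementally maintained utility and upper-bound values is intended to avoid.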