High-utility itemset mining (HUIM) is a major contemporary data mining problem. Unlike frequent itemset mining (FIM), which considers only how often items occur, HUIM takes both the quantity and the profit of items into account to reveal the most profitable products. Several approaches have been proposed to mine high-utility itemsets (HUIs), and most of them must handle an exponential search space when the number of distinct items and the size of the database are both very large. For this reason, two evolutionary computation (EC) techniques, the genetic algorithm (GA) and particle swarm optimization (PSO), were previously applied to mine HUIs. In those studies, the GA- and PSO-based approaches could discover a large number of high-utility itemsets within a limited time. In this paper, a novel algorithm based on another evolutionary computation technique, ant colony optimization (ACO), is proposed to address this problem. Unlike GAs and PSO, ACO produces a feasible solution in a constructive way, so it can avoid generating unreasonable solutions as much as possible. Thus, a well-defined ACO approach can always obtain suitable solutions efficiently. Specifically, an algorithm named HUIM-ACS (high-utility itemset mining by ant colony system), built on the ant colony system (ACS), an extension of ACO, is proposed to find HUIs efficiently. In general, an EC algorithm cannot ensure that the solution it provides is the global optimum. The designed HUIM-ACS algorithm, however, maps the complete solution space onto a routing graph and includes two pruning processes. It is therefore guaranteed to have obtained all of the HUIs once no candidate edge remains at the starting point. In addition, HUIM-ACS does not evaluate the same feasible solution more than once, which avoids wasting computational resources.
Substantial experiments on real-life datasets show that the proposed algorithm outperforms the other heuristic algorithms for mining HUIs in terms of both the number of discovered HUIs and convergence.
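To make the utility measure concrete, the following minimal sketch computes the utility of an itemset as the sum of quantity times unit profit over the transactions that contain every item of the itemset, and then enumerates all itemsets by brute force. It is not the HUIM-ACS algorithm; the toy transactions, the profit table, the threshold of 9, and the function names `itemset_utility` and `brute_force_huis` are all assumptions introduced here for illustration only.

```python
from itertools import combinations

# Hypothetical toy data: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 1},
    {"b": 1, "c": 3},
]
# Hypothetical unit profit of each item.
unit_profit = {"a": 5, "b": 2, "c": 1}


def itemset_utility(itemset, transactions, unit_profit):
    """Sum quantity * unit profit over transactions containing every item."""
    total = 0
    for tx in transactions:
        if all(item in tx for item in itemset):
            total += sum(tx[item] * unit_profit[item] for item in itemset)
    return total


def brute_force_huis(transactions, unit_profit, min_utility):
    """Enumerate every non-empty itemset; feasible only for tiny inputs."""
    items = sorted(unit_profit)
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            utility = itemset_utility(itemset, transactions, unit_profit)
            if utility >= min_utility:
                result[itemset] = utility
    return result


# Assumed minimum utility threshold of 9 for this toy example.
huis = brute_force_huis(transactions, unit_profit, min_utility=9)
print(huis)
```

In this toy data the itemset {b} has utility 6, below the threshold, while its superset {a, b} reaches 9. Utility is therefore not anti-monotone, so the frequency-based pruning used in FIM does not carry over directly, and the brute-force loop above visits all 2^n - 1 itemsets, which is the exponential search space that heuristic approaches aim to explore more economically.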