Abstract

High utility itemset mining (HUIM) is an important data mining task for analyzing customer transaction databases. It aims to discover sets of items that are purchased together and yield a high profit. In real-world applications, transaction databases often come with an item categorization stored in a taxonomy, so items can be grouped into categories at higher levels of abstraction. Extracting and analyzing itemsets at different levels of abstraction can provide more useful insights into customer behavior. However, taking the item taxonomy into account enlarges the search space and therefore prolongs the execution time needed to explore it. Parallelism has been employed to address this drawback, but previous approaches are inefficient because they adopt only simple scheduling strategies or do not utilize the full capabilities of a multi-core processor. This work introduces three new strategies that significantly boost the performance of multi-level high utility itemset mining on multi-core processors. Two new algorithms that adopt these strategies, called MCML+ and MCML++, are also proposed. Extensive experiments on several large databases show that the proposed algorithms outperform previous approaches in terms of running time and scalability: they are up to 4.0 times better than MCML-Miner, the previous parallel algorithm, and over 9.0 times faster than MLHUI-Miner, the original sequential algorithm.
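To make the task concrete, the following is a minimal brute-force sketch of multi-level high utility itemset mining on a toy database. It is not MCML+, MCML++, or any algorithm from this work, and it uses no pruning or parallelism. The data, the names `PROFIT`, `PARENT`, `TRANSACTIONS`, and both functions are hypothetical. It assumes the commonly used definitions: the utility of an item in a transaction is quantity times unit profit, the utility of a category is the sum of the utilities of its descendant items in that transaction, and an itemset is high utility when its total utility over all transactions containing it reaches a minimum threshold.

```python
from itertools import combinations

# Hypothetical toy data: unit profit per item and a one-level taxonomy.
PROFIT = {"cola": 1, "juice": 2, "chips": 3, "nuts": 5}
PARENT = {"cola": "drinks", "juice": "drinks", "chips": "snacks", "nuts": "snacks"}

# Each transaction maps an item to its purchased quantity.
TRANSACTIONS = [
    {"cola": 2, "chips": 1},
    {"juice": 1, "nuts": 2},
    {"cola": 1, "juice": 1, "chips": 2},
]


def item_utilities(transaction, level):
    """Return per-item utilities of one transaction.

    level 0 keeps the original items; level 1 replaces each item by its
    category and sums the utilities of items sharing a category.
    """
    utilities = {}
    for item, quantity in transaction.items():
        key = item if level == 0 else PARENT[item]
        utilities[key] = utilities.get(key, 0) + quantity * PROFIT[item]
    return utilities


def high_utility_itemsets(transactions, level, min_util):
    """Enumerate every itemset at the given level and keep those whose
    total utility is at least min_util (exhaustive, for illustration only)."""
    totals = {}
    for transaction in transactions:
        utilities = item_utilities(transaction, level)
        for size in range(1, len(utilities) + 1):
            for itemset in combinations(sorted(utilities), size):
                gain = sum(utilities[i] for i in itemset)
                totals[itemset] = totals.get(itemset, 0) + gain
    return {itemset: u for itemset, u in totals.items() if u >= min_util}


if __name__ == "__main__":
    for level in (0, 1):
        print(level, high_utility_itemsets(TRANSACTIONS, level, min_util=10))
```

In this toy database neither `cola` nor `juice` reaches the threshold on its own, while the category-level itemset `("drinks", "snacks")` does, which illustrates why itemsets from several abstraction levels can reveal patterns that item-level mining misses. The exhaustive enumeration also shows why the search space grows quickly once taxonomy levels are added.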
