Abstract

Frequent itemset mining (FIM) is an increasingly important fundamental data mining technique. However, the applicability of existing FIM methods is limited, mainly because of their performance. Although numerous efficient single-threaded FIM methods have been proposed, their achievable performance improvement is limited because they exploit only a single thread. To overcome this shortcoming, numerous parallel FIM methods have been devised using graphics processing units (GPUs) or multicore central processing units (CPUs). However, when extracting patterns from large amounts of data, these parallel FIM methods show a performance trend similar to that of single-threaded FIM methods because of their large memory footprints and heavy computation. Hence, we propose GMiner++, a memory-efficient GPU-based FIM method that can exploit several GPUs. We propose sub-databases of equal size, called bit array blocks, which contain pre-calculated bit arrays of F1∪P(IK). These bit arrays are repeatedly exploited during mining tasks using a probabilistic model. Using only the bit array blocks and an occurrence update scheme, GMiner++ can obtain frequent patterns and exploit several GPUs. The pre-calculated bit arrays in the bit array blocks reduce redundant computations. In addition, GMiner++ does not create intermediate data during mining tasks, which increases robustness and reduces its memory footprint. Simulation results demonstrate that GMiner++ outperforms existing FIM methods in terms of performance and scalability while improving robustness.
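To make the role of pre-calculated bit arrays more concrete, the following is a minimal single-threaded Python sketch of generic bit-array support counting. It is not the GMiner++ implementation: the function names (`build_bit_arrays`, `support`), the toy transactions, and the use of Python integers as bit arrays are all assumptions made for illustration, and the block partitioning, multi-GPU scheduling, and occurrence update scheme described in the abstract are not modelled here.

```python
from itertools import combinations

def build_bit_arrays(transactions):
    """Return {item: int}, where bit t of the int is set if transaction t contains the item.

    Hypothetical helper: a Python int stands in for a fixed-width bit array.
    """
    bit_arrays = {}
    for tid, transaction in enumerate(transactions):
        for item in transaction:
            bit_arrays[item] = bit_arrays.get(item, 0) | (1 << tid)
    return bit_arrays

def support(itemset, bit_arrays, n_transactions):
    """Count transactions containing every item: AND the bit arrays, then count set bits."""
    acc = (1 << n_transactions) - 1  # start with all transactions selected
    for item in itemset:
        acc &= bit_arrays.get(item, 0)
    return bin(acc).count("1")

# Toy database (assumed for illustration only).
transactions = [{"a", "b", "c"}, {"a", "c"}, {"a", "d"}, {"b", "c"}]
n = len(transactions)
min_support = 2

# Bit arrays are computed once and reused for every candidate itemset.
bit_arrays = build_bit_arrays(transactions)

frequent_items = sorted(
    item for item in bit_arrays if support((item,), bit_arrays, n) >= min_support
)

frequent_pairs = {}
for pair in combinations(frequent_items, 2):
    count = support(pair, bit_arrays, n)
    if count >= min_support:
        frequent_pairs[pair] = count

print(frequent_items)
print(frequent_pairs)
```

Because the per-item bit arrays are built once and only combined with bitwise AND, no intermediate candidate tables need to be stored in this sketch, which is the general property that bit-array approaches rely on to keep memory use low.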
