More Efficient Algorithm for Mining Frequent Patterns with Multiple Minimum Supports

Wensheng Gan,Philippe Fournier-Viger,Han-Chieh Chao,Jerry Chun-Wei Lin

doi:10.1007/978-3-319-39937-9_1

Abstract

Frequent pattern mining (FPM) is an important data mining task, having numerous applications. However, an important limitation of traditional FPM algorithms, is that they rely on a single minimum support threshold to identify frequent patterns (FPs). As a solution, several algorithms have been proposed to mine FPs using multiple minimum supports. Nevertheless, a crucial problem is that these algorithms generally consume a large amount of memory and have long execution times. In this paper, we address this issue by introducing a novel algorithm named efficient discovery of Frequent Patterns with Multiple minimum supports from the Enumeration-tree (FP-ME). The proposed algorithm discovers FPs using a novel Set-Enumeration-tree structure with Multiple minimum supports (ME-tree), and employs a novel sorted downward closure (SDC) property of FPs with multiple minimum supports. The proposed algorithm directly discovers FPs from the ME-tree without generating candidates. Furthermore, an improved algorithms, named \({\text {FP-ME}}_\mathrm{DiffSet}\), is also proposed based on the DiffSet concept, to further increase mining performance. Substantial experiments on real-life datasets show that the proposed approaches not only avoid the “rare item problem”, but also efficiently and effectively discover the complete set of FPs in transactional databases.

Full Text