Abstract

Frequent-regular pattern mining has attracted recently many works. Most of the approaches focus on discovering a complete set of patterns under the user-given support and regularity threshold constraints. This leads to several quantitative and qualitative drawbacks. First, it is often difficult to set appropriate support threshold. Second, algorithms produce a huge number of patterns, many of them being redundant. Third, most of the patterns are of very small size and it is arduous to extract interesting relationship among items. To reduce the number of patterns a common solution is to consider the desired number k of outputs and to mine the top-k patterns. In addition, this approach does not require to set a support threshold. To cope with redundancy and interestingness relationship among items, we suggest to focus on closed patterns and introduce a minimal length constraint. We thus propose to mine the top-k frequent-regular closed patterns with minimal length. An efficient single-pass algorithm, called TFRC-Mine, and a new compact bit-vector representation which allows to prune uninteresting candidate, are designed. Experiments show that the proposed algorithm is efficient to produce longer – non redundant – patterns, and that the new data representation is efficient for both computational time and memory usage.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call