Abstract

Pattern mining has been widely used to uncover interesting patterns from data. However, one of its main problems is that it produces too many patterns, many of which are redundant. To reduce the number of redundant patterns while retaining overlapping ones, delta-closed pattern pruning was introduced, yet it can only prune subpatterns that are covered by superpatterns. It cannot prune superpatterns that are statistically induced by strong subpatterns, and such superpatterns also need to be pruned. Furthermore, to improve the management and interpretation of patterns, pattern summarization has been proposed. It renders a small number of patterns that retain the most crucial information. The RuleCover algorithm is one such method. However, it tends to produce overly trivial patterns, whereas more interesting and revealing ones may be pruned. To overcome these problems, this paper presents a new algorithm that integrates the delta-closed and RuleCover methods with two new algorithms of our own: 1) statistically induced pattern pruning, which prunes superpatterns that are statistically induced by strong subpatterns, and 2) the AreaCover algorithm, which prunes overlapping patterns while retaining higher-order, high-quality patterns that cover a large “area” of the data. Experimental results show that the proposed algorithms produce very compact yet comprehensive knowledge from patterns discovered in relational data sets.
