NWP-Miner: Nonoverlapping weak-gap sequential pattern mining

Youxi Wu, Zhu Yuan, Yan Li, Lei Guo, Philippe Fournier-Viger, Xindong Wu

doi:10.1016/j.ins.2021.12.064

Abstract

Nonoverlapping sequential pattern mining (SPM) is a type of SPM with gap constraints that can mine valuable information from sequences. One disadvantage of nonoverlapping SPM is that any character can match a gap constraint. Hence, there can be a significant difference between the trend of a pattern and the trends of its occurrences. To tackle this issue, we propose nonoverlapping weak-gap sequential pattern (NWP) mining, in which characters are divided into two types: weak and strong. Limiting the gap constraints to match only weak characters allows frequent patterns to be discovered more accurately. To discover NWPs, we propose NWP-Miner, which involves two key steps: support calculation and candidate pattern generation. To calculate the support of candidate patterns efficiently, depth-first search and backtracking strategies based on a simplified Nettree structure are adopted, which reduce the time and space complexities of the algorithm. Moreover, a pattern join approach is applied to reduce the number of candidate patterns. The experimental results show that NWP-Miner is more efficient than other competitive algorithms. More importantly, a case study on time series shows that NWP-Miner can effectively filter out noise patterns and discover more meaningful patterns. Algorithms and datasets can be downloaded from https://github.com/wuc567/Pattern-Mining/tree/master/NWP-Miner.
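To make the weak-gap idea concrete, the following Python sketch counts occurrences of a pattern in which every character skipped by a gap must be weak. It is only an illustration under stated assumptions and not the Nettree-based procedure used by NWP-Miner: the function name, the parameters (min_gap, max_gap, weak), and the reading of "nonoverlapping" as "no sequence position is reused at the same pattern position" are all assumptions made for this example. The search is a plain greedy depth-first search with backtracking, and no claim is made that it always returns the maximum possible count.

```python
def count_nonoverlapping_weak_gap(seq, pattern, min_gap, max_gap, weak):
    """Greedily count occurrences of `pattern` in `seq` whose gaps skip only weak characters.

    Hypothetical helper for illustration; `weak` is the set of weak characters.
    """
    m = len(pattern)
    if m == 0:
        return 0
    # used[j] holds sequence positions already matched to pattern[j] (assumed
    # nonoverlapping condition: a position is not reused at the same pattern index).
    used = [set() for _ in range(m)]

    def extend(j, prev):
        # Depth-first search: match pattern[j:] after position `prev`.
        # Returns the list of matched positions, or None if no match exists.
        if j == m:
            return []
        for gap in range(min_gap, max_gap + 1):
            pos = prev + 1 + gap
            if pos >= len(seq):
                break
            # Every skipped character must be weak; a strong character in the
            # gap also blocks all larger gaps, so stop widening.
            if any(c not in weak for c in seq[prev + 1:pos]):
                break
            if seq[pos] == pattern[j] and pos not in used[j]:
                rest = extend(j + 1, pos)
                if rest is not None:
                    return [pos] + rest
        return None  # backtrack

    count = 0
    for start in range(len(seq)):
        if seq[start] == pattern[0] and start not in used[0]:
            rest = extend(1, start)
            if rest is not None:
                for j, p in enumerate([start] + rest):
                    used[j].add(p)
                count += 1
    return count


# Pattern "ab" with gap [0, 1] in the sequence "acbadbab".
print(count_nonoverlapping_weak_gap("acbadbab", "ab", 0, 1, {"c"}))       # 2
print(count_nonoverlapping_weak_gap("acbadbab", "ab", 0, 1, {"c", "d"}))  # 3
```

In this toy sequence, treating "d" as a strong character rejects the occurrence "a d b", so the count drops from 3 to 2. This mirrors the motivation stated above: occurrences whose gaps contain strong characters are not counted toward the support of the pattern.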

Full Text