Abstract

Mining frequent sequences in large databases is an important research topic. The main challenge is the high processing cost caused by the large amount of data. In this paper, we propose a novel strategy that finds all the frequent sequences without computing the support counts of non-frequent sequences. Previous work prunes candidate sequences based on frequent sequences of shorter length, whereas our strategy prunes candidate sequences based on non-frequent sequences of the same length. The two approaches are therefore complementary, and our strategy can be combined with previous work to achieve better performance. We identify three major strategies used in previous work and combine them with our strategy into an efficient algorithm. The novelty of our algorithm lies in its ability to switch dynamically from a previous strategy to our new strategy during the mining process for better performance. Experimental results show that our algorithm outperforms the previous ones under various parameter settings.
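As an illustration of the shorter-length pruning that the abstract attributes to previous work, the following minimal sketch keeps a length-k candidate only if all of its length-(k-1) subsequences are frequent, so the support of a pruned candidate is never counted. It is not the algorithm proposed in the paper, whose same-length pruning is not specified in the abstract. It assumes sequences of single items, and the function names, toy database, and support threshold are hypothetical.

```python
from itertools import combinations, product


def is_subsequence(pattern, sequence):
    """Return True if pattern occurs in sequence in order (gaps allowed)."""
    it = iter(sequence)
    # "item in it" advances the iterator, so matches must appear in order.
    return all(item in it for item in pattern)


def support(pattern, database):
    """Count the data sequences that contain pattern."""
    return sum(is_subsequence(pattern, seq) for seq in database)


def survives_shorter_length_pruning(candidate, frequent_shorter):
    """Keep a length-k candidate only if every length-(k-1)
    subsequence is already known to be frequent."""
    k = len(candidate)
    return all(sub in frequent_shorter for sub in combinations(candidate, k - 1))


# Hypothetical toy database and threshold, for illustration only.
database = [
    ("a", "b", "c"),
    ("a", "c"),
    ("b", "c"),
    ("a", "b", "c", "d"),
]
min_support = 2
items = ["a", "b", "c", "d"]

# Frequent sequences of length 2, found by direct support counting.
frequent_2 = {
    pair for pair in product(items, repeat=2)
    if support(pair, database) >= min_support
}

# Length-3 candidates: count support only for those that survive pruning.
for candidate in [("a", "b", "c"), ("a", "b", "d")]:
    if survives_shorter_length_pruning(candidate, frequent_2):
        print(candidate, "support =", support(candidate, database))
    else:
        print(candidate, "pruned without counting support")
```

In this toy example the candidate ("a", "b", "d") is discarded because ("a", "d") is not frequent, so the database is never scanned for it.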
