Abstract

Traditional sequential pattern mining methods were designed for symbolic sequences. A time series, as a collection of measurements in chronological order, must first be discretized into a symbolic sequence before such methods can be applied to discover interesting patterns. Discretization not only loses important information, partially destroying the continuity of the time series, but also ignores the order relations between time-series values. Inspired by order-preserving matching, this article explores a new method called order-preserving sequential pattern (OPP) mining, which does not need to discretize time series into symbolic sequences and represents patterns based on the order relations of time series. An inherent advantage of this representation is that the trend of a time series can be expressed by the relative order of its underlying values. We propose an algorithm, OPP-Miner, to mine frequent patterns with the same relative order in time series. OPP-Miner employs filtration and verification strategies to calculate the support and uses a pattern fusion strategy to generate candidate patterns. To compress the result set, we also study the mining of maximal OPPs. Experimental results validate that OPP-Miner is not only efficient but can also discover similar subsequences in time series. In addition, case studies show that our algorithms have high utility in analyzing the COVID-19 epidemic by identifying critical trends, and that they improve clustering performance. The algorithms and data can be downloaded from https://github.com/wuc567/Pattern-Mining/tree/master/OPP-Miner.
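
As a rough illustration of the order-based representation described above, the following Python sketch turns a window of raw values into its relative order (the rank of each value within the window) and counts how many windows of a series share a given relative order. It is a brute-force sketch, not OPP-Miner itself: it uses none of the filtration, verification, or pattern fusion strategies. The function names `relative_order` and `naive_support` are hypothetical, and breaking ties by position is an assumption made here for simplicity.

```python
from typing import Sequence, Tuple


def relative_order(values: Sequence[float]) -> Tuple[int, ...]:
    """Return the rank of each value within the window (1 = smallest).

    Assumption: equal values are ranked by position (earlier gets the lower rank).
    """
    # sorted() is stable, so ties keep their original left-to-right order.
    by_value = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, idx in enumerate(by_value, start=1):
        ranks[idx] = rank
    return tuple(ranks)


def naive_support(series: Sequence[float], pattern: Sequence[int]) -> int:
    """Count the windows of `series` whose relative order equals `pattern`.

    Brute force: every window of length len(pattern) is ranked and compared.
    """
    m = len(pattern)
    target = tuple(pattern)
    return sum(
        1
        for start in range(len(series) - m + 1)
        if relative_order(series[start:start + m]) == target
    )


if __name__ == "__main__":
    series = [3.1, 5.4, 4.0, 6.2, 5.0, 7.3, 2.2]
    # Rise-then-partial-fall trend: smallest, largest, middle.
    print(relative_order(series[0:3]))       # (1, 3, 2)
    print(naive_support(series, (1, 3, 2)))  # 2
```

In this toy series the windows (3.1, 5.4, 4.0) and (4.0, 6.2, 5.0) have different values but the same relative order (1, 3, 2), so they count as two occurrences of one trend without any discretization step.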

