Managing and analyzing rapidly growing collections of time series is becoming increasingly challenging. Subsequence matching, a core subroutine in time series analysis, has drawn significant research attention. Most previous work focuses only on matching subsequences whose length equals that of the query. However, many scenarios require efficient variable-length subsequence matching. In this paper, we propose a new representation, Uniform Piecewise Aggregate Approximation (UPAA), which aligns features across variable-length time series while preserving the lower-bounding property. Based on UPAA, we present a compact index structure built by grouping adjacent subsequences and by grouping similar subsequences. Moreover, we propose an index pruning algorithm and a data filtering strategy that efficiently support variable-length subsequence matching without false dismissals. Experiments on both real and synthetic datasets demonstrate that our approach achieves considerably better efficiency, scalability, and effectiveness than existing approaches.
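For readers unfamiliar with the lower-bounding property mentioned above, the following is a minimal sketch of it for classic Piecewise Aggregate Approximation (PAA) on equal-length series. It is not UPAA, whose definition is not given in this abstract; the function names `paa`, `euclidean`, and `paa_lower_bound` and the sample data are illustrative assumptions.

```python
import math


def paa(series, segments):
    """Classic PAA: mean of each of `segments` equal-width frames.

    Assumes len(series) is divisible by `segments` (illustrative restriction).
    """
    n = len(series)
    if segments <= 0 or n % segments != 0:
        raise ValueError("series length must be a positive multiple of segments")
    width = n // segments
    return [sum(series[i * width:(i + 1) * width]) / width for i in range(segments)]


def euclidean(a, b):
    """Euclidean distance between two equal-length series."""
    if len(a) != len(b):
        raise ValueError("series must have equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def paa_lower_bound(paa_a, paa_b, original_length):
    """Distance in PAA space, scaled by the frame width.

    By the Cauchy-Schwarz inequality this never exceeds the true
    Euclidean distance between the original series.
    """
    width = original_length // len(paa_a)
    return math.sqrt(width * sum((x - y) ** 2 for x, y in zip(paa_a, paa_b)))


# Hypothetical sample data, chosen only for illustration.
query = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
candidate = [2.0, 2.0, 5.0, 3.0, 4.0, 8.0, 6.0, 9.0]

lower = paa_lower_bound(paa(query, 4), paa(candidate, 4), len(query))
exact = euclidean(query, candidate)

# The bound never overestimates, so pruning on it causes no false dismissals.
assert lower <= exact
```

Because the reduced-space distance never exceeds the true distance, any candidate whose lower bound already exceeds the search threshold can be discarded safely. That is the sense in which an index built on such a representation avoids false dismissals.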