The inherent time complexity and an efficient algorithm for subsequence matching problem

Zemin Chao, Hong Gao, Yinan An, Jianzhong Li

doi:10.14778/3523210.3523222

Abstract

Subsequence matching is an important and fundamental problem on time series data. This paper studies the inherent time complexity of the subsequence matching problem and designs a more efficient algorithm for solving it. First, it is proved that, if the hypothesis SETH is true, the subsequence matching problem cannot be solved in time $O(n^{1-\delta})$ even when polynomial-time preprocessing is allowed, where $n$ is the size of the input time series and $0 \le \delta < 1$; i.e., the inherent complexity of the subsequence matching problem is $\omega(n^{1-\delta})$. Second, an efficient algorithm for the subsequence matching problem is proposed. To improve the efficiency of the algorithm, we design a new summarization method as well as a novel index for series data. The proposed algorithm supports both Euclidean distance and DTW distance, with or without z-normalization. Experimental results show that the proposed algorithm is up to about 3 to 10 times faster than the state-of-the-art algorithm on the constrained z-normalized Euclidean distance and DTW distance, and up to 7 to 12 times faster on Euclidean distance.
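To make the problem statement concrete, the following is a minimal brute-force sketch of subsequence matching under Euclidean distance, with optional z-normalization. It is not the algorithm proposed in the paper: it uses no summarization or index and simply scans every window, and the function names (`z_normalize`, `euclidean`, `best_match`) are hypothetical names chosen for illustration. DTW is omitted for brevity.

```python
import math


def z_normalize(xs):
    """Rescale xs to zero mean and unit (population) standard deviation."""
    mean = sum(xs) / len(xs)
    std = math.sqrt(sum((x - mean) ** 2 for x in xs) / len(xs))
    if std == 0.0:
        # Assumption: a constant window carries no shape, so map it to zeros.
        return [0.0] * len(xs)
    return [(x - mean) / std for x in xs]


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def best_match(series, query, normalize=True):
    """Return (offset, distance) of the window of `series` closest to `query`.

    Every length-m window is compared against the query, so the scan costs
    O(n * m) for a series of length n and a query of length m.
    """
    m = len(query)
    if m == 0 or m > len(series):
        raise ValueError("query must be non-empty and no longer than series")
    q = z_normalize(query) if normalize else list(query)
    best_offset, best_dist = -1, math.inf
    for i in range(len(series) - m + 1):
        window = list(series[i:i + m])
        w = z_normalize(window) if normalize else window
        d = euclidean(q, w)
        if d < best_dist:  # strict comparison keeps the earliest best window
            best_offset, best_dist = i, d
    return best_offset, best_dist
```

For example, `best_match([3, 1, 4, 1, 5, 9, 2, 6], [10, 50, 90])` locates the window `[1, 5, 9]` at offset 3, because z-normalization removes the difference in offset and scale between the query and that window.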

Full Text