Dynamic Time Warping (DTW) is a widely used distance measurement in time series clustering. DTW distance is invariant to time series phase perturbations but has a quadratic complexity. An effective acceleration method must reduce the DTW utilization ratio during time series clustering; for example, TADPole uses both upper and lower bounds to prune off a large ratio of expensive DTW calculations. To further reduce the DTW utilization ratio, we find that the linear-complexity L1-norm distance (Manhattan distance) is effective enough when the time series only comprise small phase perturbations. Therefore, we propose a novel time series clustering by Minimizing Dynamic Time Warping Utilization (MiniDTW) algorithm to accelerate time series clustering. In MiniDTW, the dataset is first greedily summarized into seed clusters, which comprise time series of small phase perturbations, by L1-norm distance. Then, we develop a new Sparse Symmetric Non-negative Matrix Factorization (SSNMF) algorithm, which factorizes the DTW distance matrix of seed cluster centers, to merge the seed clusters into the final clusters. The experiments on UCR time series datasets demonstrate that MiniDTW, pruning 98.52% of the DTW utilization, is better than the counterpart method, TADPole, which only prunes 75.56% of the DTW utilization; and thus MiniDTW is 10 times faster than TADPole.
Read full abstract