Efficient Time Series Clustering by Minimizing Dynamic Time Warping Utilization

Borui Cai, Guanghui Li, Najmeh Samadiani, Guangyan Huang, Chi-Hung Chi

doi:10.1109/access.2021.3067833

Open Access

https://doi.org/10.1109/access.2021.3067833

Abstract

Dynamic Time Warping (DTW) is a widely used distance measure in time series clustering. DTW distance is invariant to phase perturbations in time series but has quadratic complexity. An effective acceleration method must reduce the DTW utilization ratio during time series clustering; for example, TADPole uses both upper and lower bounds to prune a large share of the expensive DTW calculations. To further reduce the DTW utilization ratio, we find that the linear-complexity L1-norm distance (Manhattan distance) is effective enough when the time series contain only small phase perturbations. Therefore, we propose a novel time series clustering algorithm, Minimizing Dynamic Time Warping Utilization (MiniDTW), to accelerate time series clustering. In MiniDTW, the dataset is first greedily summarized by L1-norm distance into seed clusters, each comprising time series with small phase perturbations. Then, we develop a new Sparse Symmetric Non-negative Matrix Factorization (SSNMF) algorithm, which factorizes the DTW distance matrix of the seed cluster centers, to merge the seed clusters into the final clusters. Experiments on UCR time series datasets demonstrate that MiniDTW prunes 98.52% of the DTW utilization, compared with only 75.56% for the counterpart method, TADPole, and thus MiniDTW is 10 times faster than TADPole.
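
To make the contrast between the two distances concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes equal-length series, an absolute-difference local cost for DTW, and no warping window; the paper may use a different local cost or constraint. The function names l1_distance and dtw_distance are hypothetical.

```python
def l1_distance(a, b):
    """Manhattan (L1-norm) distance between two equal-length series; O(n)."""
    if len(a) != len(b):
        raise ValueError("series must have equal length")
    return sum(abs(x - y) for x, y in zip(a, b))


def dtw_distance(a, b):
    """Unconstrained DTW with absolute-difference local cost; O(n * m).

    Assumption: the local cost and the lack of a warping window are
    illustrative choices, not taken from the paper.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # prev[j] holds the accumulated cost of the previous row of the DP table.
    prev = [0.0] + [inf] * m
    for i in range(1, n + 1):
        curr = [inf] * (m + 1)
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible warping steps.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m]


# Two series with the same shape, shifted by one time step.
a = [0, 0, 1, 2, 1, 0, 0]
b = [0, 1, 2, 1, 0, 0, 0]

l1 = l1_distance(a, b)    # 4: L1 penalizes the phase shift
dtw = dtw_distance(a, b)  # 0.0: DTW warps the shift away
print(l1, dtw)
```

With this local cost, the diagonal warping path reproduces the L1-norm distance, so for equal-length series DTW never exceeds L1. When the phase shift is small, the two values are close, which is the situation in which the cheaper L1-norm distance can stand in for DTW.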

Highlights

Time series are among the most important data in the modern data-driven society and can be generated from nearly every aspect of daily life [1]
We propose a novel time series clustering algorithm, Minimizing Dynamic Time Warping Utilization (MiniDTW), to accelerate time series clustering
Since MiniDTW is proposed to accelerate time series clustering by reducing the DTW utilization ratio, TADPole [5] is the counterpart method most related to ours because it aims at accelerating time series clustering by pruning a fraction of the DTW distance calculations using faster upper/lower bounds of DTW (L1-norm/LB_Keogh [35])

Summary

INTRODUCTION

Time series are among the most important data in the modern data-driven society and can be generated from nearly every aspect of daily life [1]. Time series clustering is a basic technique for analyzing time series. It can discover the underlying structure of chaotic/raw datasets without ground truth labels. To accelerate time series clustering with DTW distance, some methods, such as TADPole [5], reduce the DTW utilization ratio by pruning unnecessary DTW calculations with quickly computed upper/lower bounds of DTW. To reduce the DTW utilization ratio significantly further, we apply the complex DTW calculation only to a summarized time series dataset (rather than the original dataset). In MiniDTW, the original dataset is first "greedily" summarized into a small number of natural-shaped seed clusters with the efficient, linear-complexity L1-norm distance, which minimizes the DTW utilization ratio.
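
The idea of summarizing first and computing DTW only between seed cluster centers can be sketched as follows. This is an illustration under stated assumptions, not the procedure from the paper: the greedy pass is a simple leader-style rule (join the first seed within an L1 radius, otherwise open a new seed), the radius eps is a hypothetical parameter, DTW uses an absolute-difference local cost, and the SSNMF merge step is not implemented. The count of DTW calls is compared with the number of all pairs only to show how the saving arises.

```python
from itertools import combinations


def l1_distance(a, b):
    """Manhattan (L1-norm) distance between two equal-length series."""
    return sum(abs(x - y) for x, y in zip(a, b))


def dtw_distance(a, b):
    """Unconstrained DTW with absolute-difference local cost (assumption)."""
    inf = float("inf")
    prev = [0.0] + [inf] * len(b)
    for x in a:
        curr = [inf] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            curr[j] = abs(x - y) + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[-1]


def greedy_seed_clusters(series, eps):
    """Hypothetical leader-style pass using only the L1-norm distance.

    Each series joins the first seed whose center is within eps;
    otherwise it becomes the center of a new seed cluster.
    """
    centers, members = [], []
    for idx, s in enumerate(series):
        for k, c in enumerate(centers):
            if l1_distance(s, c) <= eps:
                members[k].append(idx)
                break
        else:
            centers.append(s)
            members.append([idx])
    return centers, members


def center_dtw_matrix(centers):
    """Symmetric DTW distance matrix over the seed cluster centers only."""
    k = len(centers)
    d = [[0.0] * k for _ in range(k)]
    for i, j in combinations(range(k), 2):
        d[i][j] = d[j][i] = dtw_distance(centers[i], centers[j])
    return d


series = [
    [0, 0, 1, 2, 1, 0, 0, 0],
    [0, 0, 1, 2, 1, 0, 0, 0.1],  # near-duplicate of the first series
    [0, 0, 0, 0, 1, 2, 1, 0],    # same shape as the first, large phase shift
    [2, 2, 1, 0, 1, 2, 2, 2],
    [2, 2, 1, 0, 1, 2, 2, 1.9],  # near-duplicate of the fourth series
]
centers, members = greedy_seed_clusters(series, eps=0.5)
matrix = center_dtw_matrix(centers)

n, k = len(series), len(centers)
dtw_calls = k * (k - 1) // 2  # DTW is computed only between seed centers
all_pairs = n * (n - 1) // 2  # DTW calls a full pairwise matrix would need
print(members, dtw_calls, all_pairs)
```

In this toy data the L1 pass keeps the phase-shifted third series in its own seed, and the DTW distance between its center and the first seed's center is zero, which is the kind of signal a later merge step could use to join the two seeds.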

RELATED WORK

L1-NORM DISTANCE AND DTW DISTANCE

PROBLEM DEFINITION

THE PROPOSED METHOD

DATASET SUMMARIZATION WITH L1-NORM DISTANCE

MERGE THE TIME SERIES SEED CLUSTERS

9: Initialize clusters as K empty sets

EVALUATION

EXPERIMENT SETUP

ACCURACY ANALYSIS

EFFICIENCY ANALYSIS

CONCLUSION

Journal: IEEE Access: practical innovations, open solutions
Publication Date: Jan 1, 2021
Citations: 45
License type: CC BY 4.0

Similar Papers

An efficient implementation of anytime k-medoids clustering for time series under dynamic time warping
Van The Huy ... Duong Tuan Anh
08 Dec 2016

Fuzzy clustering of time series data using dynamic time warping distance
Hesam Izakian ... Iqbal Jamal
Engineering Applications of Artificial Intelligence | VOL. 39
17 Jan 2015

On combining Websensors and DTW distance for kNN Time Series Forecasting
Ricardo M Marcacini ... Julio C Carnevali
01 Dec 2016

A Hybrid DTW Based Method for Integration Analysis of Time Series Data
Veselka Boeva ... Elena Kostadinova
01 Sep 2009
