An efficient implementation of k-means clustering for time series data with DTW distance

Duong Tuan Anh,Le Huu Thanh

doi:10.1504/ijbidm.2015.071311

Duong Tuan Anh, Le Huu Thanh

https://doi.org/10.1504/ijbidm.2015.071311

Copy DOI

Export

Save

Cite

Abstract
Full-Text
Similar Papers

Abstract

Listen

Time series clustering is one of the crucial tasks in time series data mining. The most popular method in time series clustering is k-means algorithm due to its simplicity and flexibility. So far, k-means for time series clustering has been most used with Euclidean distance. Dynamic time warping DTW distance measure has increasingly been used as a similarity measurement for various data mining tasks in place of traditional Euclidean distance due to its superiority in sequence-alignment flexibility. However, there exist some difficulties in clustering with DTW distance, for example, the problem of shape averaging in DTW or the problem of speeding up DTW distance calculation. In this paper, we compare the performance of the three shape averaging methods in DTW: nonlinear alignment and averaging filter NLAAF, prioritised shape averaging PSA and DTW barycenter averaging DBA and propose an efficient method to implement k-means clustering for time series data with DTW distance. In our method, we choose to use DBA method for shape-based time series averaging, apply early abandoning method for speeding up DTW distance calculation and median-based method for determining initial centroids for k-means clustering. The experimental results on benchmark datasets validate our proposed implementation method for time series k-means clustering with DTW.

Full Text