Abstract

Time series clustering has applications across many fields. Vector quantization, widely used in signal processing to approximate a large number of signals, can be used to cluster signals and therefore time series data. Although a popular clustering algorithm such as K-Means can perform vector quantization, the averaging it uses to compute centroids is not well suited to time series data. This work therefore explores the ability of the Self Organizing Map algorithm to cluster time series data, with several modifications to the original steps of the algorithm. The proposed procedure initializes the prototype vectors with a farthest-neighbors approach instead of random initialization and uses the dynamic time warping distance to measure similarity between signals. The proposed algorithm is first tested on 119 data sets, and its performance is compared with that of Agglomerative Clustering and k-medoids clustering using 3 validation measures. Next, the scalability of the algorithms is compared in terms of their computation time on the data sets. The fluctuations in the proposed algorithm's performance due to initialization and to the algorithm's parameters are then studied using 3 more validation measures. The results show that the modified Self Organizing Map outperforms Agglomerative Clustering in clustering performance and is also more scalable: it computes clusters in less time than k-medoids while achieving similar cluster quality.
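
The two modifications named in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the exact procedure, so everything here is an assumption: the function names (`dtw_distance`, `farthest_first_init`, `best_matching_unit`) are hypothetical, the DTW variant is the textbook unconstrained one with absolute-difference local cost, and the initialization is a greedy farthest-first traversal that may differ from the paper's farthest-neighbors approach. The Self Organizing Map update step is omitted.

```python
import math


def dtw_distance(a, b):
    """Textbook dynamic time warping distance between two 1-D sequences."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # repeat b[j-1]
                cost[i][j - 1],      # repeat a[i-1]
                cost[i - 1][j - 1],  # advance both
            )
    return cost[n][m]


def farthest_first_init(series, k, first=0):
    """Assumed initialization: greedily pick k mutually distant series.

    Starts from index `first`, then repeatedly adds the series whose DTW
    distance to its nearest already-chosen prototype is largest.
    """
    chosen = [first]
    nearest = [dtw_distance(s, series[first]) for s in series]
    while len(chosen) < k:
        nxt = max(range(len(series)), key=lambda i: nearest[i])
        chosen.append(nxt)
        for i, s in enumerate(series):
            nearest[i] = min(nearest[i], dtw_distance(s, series[nxt]))
    return chosen


def best_matching_unit(x, prototypes):
    """Index of the prototype closest to x under DTW."""
    return min(range(len(prototypes)), key=lambda j: dtw_distance(x, prototypes[j]))


if __name__ == "__main__":
    data = [[0, 0, 0], [0, 0, 1], [5, 5, 5], [10, 10, 10]]
    idx = farthest_first_init(data, k=3)
    prototypes = [data[i] for i in idx]
    print(idx, best_matching_unit([9, 9, 9], prototypes))
```

Because DTW allows one sequence to stretch against the other, `dtw_distance([1, 2, 3], [1, 2, 2, 3])` is zero even though the sequences differ in length, which is the property that makes it a more natural similarity measure for time series than a pointwise distance.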
