Evaluation of Hierarchical Structures for Time Series Data

Ruizhe Ma, Zongmin Ma, Soukaina Filali Boubrahimi, Rafal A Angryk

doi:10.1109/icbda49040.2020.9101255

Abstract

Clustering is an effective unsupervised machine learning method that can be used as a stand-alone heuristic or as part of a data mining process. The goal of cluster analysis is to partition data into groups with high intra-cluster association and low inter-cluster association. Hierarchical clustering is a particular branch of clustering algorithms in which the results are given not as partitions but as a nested structure, which can represent the ordering among elements within a dataset. It requires few parameters, is flexible in the choice of similarity measure, and has strong visualization power, all of which make it well suited to exploratory analysis. Studies of the performance of hierarchical structures on time series data, however, remain limited. In this paper, we examine the hierarchical structure of time series datasets. The most popular hierarchical clustering heuristics include Hierarchical Agglomerative Clustering, which is distance based, and Ordering Points To Identify the Clustering Structure, which is density based. The two share many characteristics, and both are suitable for processing time series data. We examine the performance of different hierarchical clustering algorithms on time series data, both internally and externally.
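To illustrate what a nested structure means in practice, the following is a minimal sketch of distance-based agglomerative clustering on a few toy time series. It is not the procedure evaluated in this paper: the single-linkage criterion, the Euclidean distance, the function name `single_linkage`, and the toy data are all assumptions chosen for brevity.

```python
from math import dist  # Euclidean distance between two sequences (Python 3.8+)


def single_linkage(series):
    """Agglomerative clustering sketch (assumed: single linkage, Euclidean).

    Returns the merges in the order they happen, each as
    (left_members, right_members, distance). Read in order, the merges
    describe a nested hierarchy rather than a flat partition.
    """
    # Start with every series in its own cluster, identified by its index.
    clusters = [[i] for i in range(len(series))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(dist(series[i], series[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((clusters[a], clusters[b], d))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return merges


# Hypothetical toy data: two rising and two falling series of equal length.
toy = [
    [0.0, 1.0, 2.0, 3.0],
    [0.0, 1.0, 2.0, 3.5],
    [5.0, 4.0, 3.0, 2.0],
    [5.0, 4.0, 3.0, 1.0],
]
merges = single_linkage(toy)
for left, right, d in merges:
    print(left, right, round(d, 3))
```

In this toy case the two rising series merge first, then the two falling series, and the two groups join last at a much larger distance. Cutting the hierarchy at different distances yields different flat partitions, which is why the result is described as a nested structure.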

Full Text