Time series clustering is a powerful data mining method that allows large collections of time series to be analyzed without prior knowledge. It is widely used in fields such as financial and medical data analysis and sensor data processing. Because time series are high-dimensional, non-linear, and redundant, conventional clustering algorithms often give unsatisfactory results when applied to them directly, so suitable feature extraction and dimension reduction methods must be chosen carefully. This paper proposes a time series clustering algorithm that uses derivative features of a fitted polynomial as the basis for feature extraction. First, Hodrick-Prescott (HP) filtering is applied to the raw time series to remove noise and redundancy. Second, polynomial curve fitting (PCF) is applied to the filtered data to obtain a globally continuous function that fits the series. Third, multi-order derivative values are computed from this function, transforming the time series into a multi-order derivative feature sequence. Finally, we design a dynamic time warping algorithm based on polynomial function derivative features (PFD_DTW) to measure the distance between two time series of equal or unequal length, and we apply a hierarchical clustering method based on the PFD_DTW distances, with inter-cluster distances computed from them, to cluster the series. Experimental results on several real-world datasets confirm the effectiveness of the method.
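The four stages described above can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' implementation, and the abstract does not specify the details it fills in. The smoothing parameter `lam=100`, the polynomial degree 4, the use of first and second derivatives, the rescaling of time to [0, 1], the Euclidean local cost inside DTW, and average linkage are all assumptions made for illustration. The function names (`hp_trend`, `derivative_features`, `pfd_dtw`, `agglomerative`) are hypothetical, and the four toy series are synthetic.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))) / M[i][i]
    return x

def hp_trend(y, lam=100.0):
    """HP trend: solve (I + lam * D'D) tau = y, with D the second-difference operator."""
    n = len(y)
    A = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    w = (1.0, -2.0, 1.0)
    for r in range(n - 2):  # add lam * d_r d_r' for every row d_r of D
        for a in range(3):
            for c in range(3):
                A[r + a][r + c] += lam * w[a] * w[c]
    return solve(A, list(y))

def polyfit(t, y, degree):
    """Least-squares coefficients c[0] + c[1] t + ... via the normal equations."""
    m = degree + 1
    A = [[sum(ti ** (i + j) for ti in t) for j in range(m)] for i in range(m)]
    b = [sum(yi * ti ** i for ti, yi in zip(t, y)) for i in range(m)]
    return solve(A, b)

def polyval(c, t):
    return sum(ck * t ** k for k, ck in enumerate(c))

def derivative_features(y, degree=4, orders=(1, 2), lam=100.0):
    """Filter, fit a polynomial, and sample its derivatives at every time point."""
    n = len(y)
    t = [i / (n - 1) for i in range(n)]  # assumption: time rescaled to [0, 1]
    coeffs = polyfit(t, hp_trend(y, lam), degree)
    derivs, d = {}, coeffs
    for k in range(1, max(orders) + 1):
        d = [j * d[j] for j in range(1, len(d))]  # differentiate once more
        derivs[k] = d
    return [tuple(polyval(derivs[k], ti) for k in orders) for ti in t]

def pfd_dtw(fa, fb):
    """Classic DTW over derivative feature vectors; lengths may differ."""
    inf = float("inf")
    n, m = len(fa), len(fb)
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = math.dist(fa[i - 1], fb[j - 1])  # Euclidean local cost (assumption)
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]

def agglomerative(dist, k):
    """Average-linkage agglomerative clustering, merged down to k clusters."""
    clusters = [[i] for i in range(len(dist))]
    def link(a, b):
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))
    while len(clusters) > k:
        pairs = ((p, q) for p in range(len(clusters)) for q in range(p + 1, len(clusters)))
        p, q = min(pairs, key=lambda pq: link(clusters[pq[0]], clusters[pq[1]]))
        clusters[p] += clusters.pop(q)  # q > p, so index p stays valid
    return clusters

# Synthetic toy data: two sine-shaped and two linear series of unequal length.
series = [
    [math.sin(2 * math.pi * i / 40) for i in range(40)],
    [math.sin(2 * math.pi * i / 55) + 0.5 for i in range(55)],
    [i / 40 for i in range(40)],
    [0.2 + i / 60 for i in range(60)],
]
feats = [derivative_features(s) for s in series]
dist = [[pfd_dtw(a, b) for b in feats] for a in feats]
clusters = agglomerative(dist, 2)
print(clusters)
```

Because the distance is computed on derivative features rather than raw values, a constant offset between two series (the `+ 0.5` and `0.2 +` terms above) does not affect it, and DTW absorbs the difference in length. For real data, a numerical library would normally replace the hand-written linear solver and clustering loop.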