Abstract
A large amount of time series data is generated every day across a wide range of sensor application domains. Symbolic aggregate approximation (SAX) is a well-known time series representation method that discretizes continuous time series and whose distance measure lower-bounds the Euclidean distance. SAX has been widely used in various domains, such as mobile data management, financial investment, and shape discovery. However, the SAX representation has a limitation: symbols are mapped from the average values of segments, and SAX does not consider the boundary distance within the segments. Different segments with similar average values may be mapped to the same symbols, in which case the SAX distance between them is 0. In this paper, we propose a novel representation named SAX-BD (boundary distance), which integrates the SAX distance with a weighted boundary distance. The experimental results show that SAX-BD significantly outperforms the SAX, ESAX, and SAX-TD representations.
Highlights
Time series data are being generated every day in a wide range of application domains [1], such as bioinformatics, finance, and engineering [2]
There are many methods for feature extraction, for example: (1) spectral analysis such as discrete Fourier transform (DFT) [11], (2) discrete wavelet transform (DWT) [12], where features of the frequency domain are considered, and (3) singular value decomposition (SVD) [13], where eigenvalue analysis is carried out in order to find an optimal set of features
The value calculated using symbolic aggregate approximation (SAX)-TD is the same, but in our method, SAX-BD, the equation is not equal to 0, indicating that the time series can still be distinguished
Summary
Time series data are being generated every day in a wide range of application domains [1], such as bioinformatics, finance, and engineering [2]. SAX has been widely used in mobile data management [21], financial investment [22], and feature extraction [23]. Applying deep learning methods to multivariate time series classification has also received attention. A compromise is needed to reduce the dimension of time series while improving [...]. The ESAX representation can express the characteristics of time series in more detail [25]. The average value of a segment and its boundary distance together help measure different trends of time series more accurately. We proved that our improved distance measure keeps a lower bound to the Euclidean distance and achieves a tighter lower bound than the original SAX distance.
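The limitation described above can be illustrated with a minimal Python sketch. It assumes the input series are already z-normalized, uses the standard SAX breakpoints for an alphabet of size 4, and assumes the series length is divisible by the number of segments. The function `boundary_term` is a hypothetical, unweighted illustration that compares the per-segment minimum and maximum values; it is not the SAX-BD distance defined in the paper, and no lower-bounding property is claimed for it.

```python
import math
from bisect import bisect_right

# Standard SAX breakpoints for an alphabet of size 4 (equiprobable under N(0, 1)).
BREAKPOINTS = [-0.6745, 0.0, 0.6745]
ALPHABET = "abcd"


def split_segments(series, w):
    """Split a series into w equal-length segments (assumes len(series) % w == 0)."""
    size = len(series) // w
    return [series[i * size:(i + 1) * size] for i in range(w)]


def sax_word(series, w):
    """Map each segment's average value to a symbol."""
    word = ""
    for seg in split_segments(series, w):
        mean = sum(seg) / len(seg)
        word += ALPHABET[bisect_right(BREAKPOINTS, mean)]
    return word


def symbol_dist(a, b):
    """SAX symbol distance: 0 for identical or adjacent symbols."""
    i, j = ALPHABET.index(a), ALPHABET.index(b)
    if abs(i - j) <= 1:
        return 0.0
    return BREAKPOINTS[max(i, j) - 1] - BREAKPOINTS[min(i, j)]


def sax_distance(q, c, w):
    """SAX MINDIST between two equal-length series."""
    n = len(q)
    total = sum(symbol_dist(a, b) ** 2
                for a, b in zip(sax_word(q, w), sax_word(c, w)))
    return math.sqrt(n / w) * math.sqrt(total)


def boundary_term(q, c, w):
    """Hypothetical illustration only: compare per-segment min and max values."""
    total = 0.0
    for sq, sc in zip(split_segments(q, w), split_segments(c, w)):
        total += (max(sq) - max(sc)) ** 2 + (min(sq) - min(sc)) ** 2
    return math.sqrt(total)


# Two series with identical segment averages but different shapes.
q = [0.1, 0.3, 0.3, 0.1, -0.2, -0.4, -0.4, -0.2]
c = [0.2, 0.2, 0.2, 0.2, -0.3, -0.3, -0.3, -0.3]

print(sax_word(q, 2), sax_word(c, 2))  # both series map to the same word
print(sax_distance(q, c, 2))           # the SAX distance is zero
print(boundary_term(q, c, 2))          # the boundary comparison is non-zero
```

Both example series produce the same SAX word, so the SAX distance cannot separate them, while the boundary comparison yields a non-zero value. This is the intuition behind combining the segment average with boundary information.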