Abstract

A large amount of time series data is being generated every day in a wide range of sensor application domains. The symbolic aggregate approximation (SAX) is a well-known time series representation method that discretizes continuous time series into symbols and whose distance measure lower-bounds the Euclidean distance. SAX has been widely used for applications in various domains, such as mobile data management, financial investment, and shape discovery. However, the SAX representation has a limitation: symbols are mapped from the average values of segments, and SAX does not consider the boundary distance within the segments. Different segments with similar average values may therefore be mapped to the same symbols, in which case the SAX distance between them is 0. In this paper, we propose a novel representation named SAX-BD (boundary distance) that integrates the SAX distance with a weighted boundary distance. The experimental results show that SAX-BD significantly outperforms the SAX, ESAX, and SAX-TD representations.
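
To make this limitation concrete, the following is a minimal Python sketch of the standard SAX pipeline (z-normalization, piecewise aggregate approximation, Gaussian breakpoints) together with the usual SAX lower-bounding distance. The function names, the example series, and the parameter choices (2 segments, alphabet size 4) are illustrative assumptions and are not taken from the paper.

```python
from statistics import NormalDist, mean, pstdev


def znormalize(series):
    """Rescale a series to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(series), pstdev(series)
    return [(x - mu) / sigma for x in series]


def paa(series, n_segments):
    """Piecewise aggregate approximation: the mean of each equal-length segment.
    Assumes len(series) is divisible by n_segments."""
    w = len(series) // n_segments
    return [mean(series[i * w:(i + 1) * w]) for i in range(n_segments)]


def breakpoints(alphabet_size):
    """Cut points that split the standard normal into equiprobable regions."""
    nd = NormalDist()
    return [nd.inv_cdf(k / alphabet_size) for k in range(1, alphabet_size)]


def sax_word(series, n_segments, alphabet_size):
    """Map each segment average to a symbol 'a', 'b', ..."""
    cuts = breakpoints(alphabet_size)
    word = []
    for avg in paa(znormalize(series), n_segments):
        index = sum(avg > c for c in cuts)  # number of cut points below avg
        word.append(chr(ord("a") + index))
    return "".join(word)


def sax_distance(word_a, word_b, series_length, alphabet_size):
    """SAX lower-bounding distance between two words of equal length."""
    cuts = breakpoints(alphabet_size)
    total = 0.0
    for a, b in zip(word_a, word_b):
        i, j = sorted((ord(a) - ord("a"), ord(b) - ord("a")))
        if j - i > 1:  # identical or adjacent symbols contribute 0
            total += (cuts[j - 1] - cuts[i]) ** 2
    return (series_length / len(word_a)) ** 0.5 * total ** 0.5


# Two series with opposite trends inside each segment but equal segment means.
x = [0, 1, 2, 3, 7, 6, 5, 4]
y = [3, 2, 1, 0, 4, 5, 6, 7]
word_x = sax_word(x, n_segments=2, alphabet_size=4)
word_y = sax_word(y, n_segments=2, alphabet_size=4)
print(word_x, word_y, sax_distance(word_x, word_y, len(x), 4))
```

In this example both series receive the same SAX word, so the SAX distance between them is 0 even though the raw series differ at every point.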

Highlights

  • Time series data are being generated every day in a wide range of application domains [1], such as bioinformatics, finance, engineering, etc. [2]

  • There are many methods for feature extraction, for example: (1) spectral analysis such as discrete Fourier transform (DFT) [11], (2) discrete wavelet transform (DWT) [12], where features of the frequency domain are considered, and (3) singular value decomposition (SVD) [13], where eigenvalue analysis is carried out in order to find an optimal set of features

  • The distance calculated using symbolic aggregate approximation (SAX)-TD is the same, but in our method, SAX-BD, the value of the equation is not equal to 0, indicating that the time series can be distinguished


Summary

Introduction

Time series data are being generated every day in a wide range of application domains [1], such as bioinformatics, finance, engineering, etc. [2]. Applying deep learning methods to multivariate time series classification has received attention. SAX has been widely used in mobile data management [21], financial investment [22], and feature extraction [23]. A compromise is needed to reduce the dimension of time series while improving […]. The ESAX representation can express the characteristics of time series in more detail [25]. The average value of the segment and its boundary distance help measure different trends of time series more accurately. We proved that our improved distance measure still lower-bounds the Euclidean distance and achieves a tighter lower bound than the original SAX distance.
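
The intuition that boundary information separates segments with equal averages can be illustrated with a small Python sketch. This is not the SAX-BD distance defined in the paper, whose formula is not reproduced in this summary. The descriptor below (the deviations of a segment's minimum and maximum points from its mean, taken in order of occurrence) and both function names are hypothetical, chosen only to show the idea.

```python
from statistics import mean


def boundary_descriptor(segment):
    """Hypothetical descriptor (an assumption, not the paper's definition):
    deviations of the segment's two extreme points from the segment mean,
    listed in the order in which the extremes occur."""
    avg = mean(segment)
    i_max = max(range(len(segment)), key=segment.__getitem__)
    i_min = min(range(len(segment)), key=segment.__getitem__)
    first, second = sorted((i_min, i_max))
    return (segment[first] - avg, segment[second] - avg)


def descriptor_distance(seg_a, seg_b):
    """Euclidean distance between two hypothetical boundary descriptors."""
    da, db = boundary_descriptor(seg_a), boundary_descriptor(seg_b)
    return sum((p - q) ** 2 for p, q in zip(da, db)) ** 0.5


rising = [0.0, 1.0, 2.0, 3.0]
falling = [3.0, 2.0, 1.0, 0.0]

# Same average, so an average-only representation cannot tell them apart.
print(mean(rising) == mean(falling))
# The boundary-based descriptor differs, so this distance is non-zero.
print(descriptor_distance(rising, falling))
```

Both segments have the same mean, yet the descriptor distance is positive, which mirrors the claim that combining the segment average with boundary information can distinguish trends that averages alone cannot.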

Related Work
The Distance Calculation by SAX
An Improvement of SAX Distance Measure for Time Series
An Improvement
Method
Our Method SAX-BD
Difference
Lower Bound
Experimental Validation
Data Sets
Comparison Methods and Parameter Settings
Result Analysis
Conclusions
