Abstract

Various representations have been proposed for time series to facilitate similarity search and the discovery of interesting patterns. Although the Euclidean distance and its variants are the most frequently used similarity measures, they are relatively sensitive to noise and in many cases fail to provide meaningful information. Moreover, for time series of high dimensionality, computing similarity can be extremely inefficient. To address these problems, we introduce a new method that produces a symbolic representation of a time series and can dramatically reduce its dimensionality. The method uses Vector Quantization to encode time series as symbol sequences before similarity analysis is performed. With the symbolic representation, string matching algorithms can be applied to compute similarities more efficiently and accurately. We propose a similarity measure based on the Longest Common Subsequence (LCSS) model. Experimental results on real and simulated data demonstrate the utility and efficiency of the proposed technique.
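
The pipeline outlined in the abstract (quantize, encode as symbols, compare by common subsequence) can be illustrated with a minimal sketch. This is not the authors' implementation: the fixed-length non-overlapping windows, the plain Lloyd-style codebook training seeded with the first k vectors, the normalization by the shorter sequence length, and all function names (segments, train_codebook, encode, lcss_similarity) are assumptions made only for illustration.

```python
from typing import List, Sequence

Vector = Sequence[float]


def segments(series: Sequence[float], seg_len: int) -> List[List[float]]:
    """Split a series into non-overlapping windows, dropping any short tail."""
    return [list(series[i:i + seg_len])
            for i in range(0, len(series) - seg_len + 1, seg_len)]


def sq_dist(u: Vector, v: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def nearest(vec: Vector, codebook: Sequence[Vector]) -> int:
    """Index of the codeword closest to vec (ties go to the lower index)."""
    return min(range(len(codebook)), key=lambda k: sq_dist(vec, codebook[k]))


def train_codebook(vectors: Sequence[Vector], k: int, iters: int = 20) -> List[List[float]]:
    """Plain Lloyd iterations, seeded with the first k vectors (an assumption)."""
    codebook = [list(v) for v in vectors[:k]]
    for _ in range(iters):
        groups: List[List[Vector]] = [[] for _ in codebook]
        for v in vectors:
            groups[nearest(v, codebook)].append(v)
        for idx, group in enumerate(groups):
            if group:  # keep the old codeword if its cell is empty
                codebook[idx] = [sum(col) / len(group) for col in zip(*group)]
    return codebook


def encode(series: Sequence[float], codebook: Sequence[Vector], seg_len: int) -> List[int]:
    """Replace each window by the index (symbol) of its nearest codeword."""
    return [nearest(s, codebook) for s in segments(series, seg_len)]


def lcss_length(a: Sequence, b: Sequence) -> int:
    """Length of the longest common subsequence, by row-wise dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcss_similarity(a: Sequence, b: Sequence) -> float:
    """LCSS length divided by the shorter length (hypothetical normalization)."""
    if not a or not b:
        return 0.0
    return lcss_length(a, b) / min(len(a), len(b))
```

A series of n samples becomes roughly n / seg_len symbols, which is where the dimensionality reduction comes from, and two encoded series can then be compared with lcss_similarity. The measure actually proposed in the paper may weight or normalize matches differently.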
