Abstract

Similarity search in streaming time series under Dynamic Time Warping (DTW) is a major challenge in data mining today. In such a search, data normalization is required to obtain accurate results. However, normalizing data on the fly and computing DTW are costly in both computational time and memory. In this paper, we present two methods, SUCR-DTW and ESUCR-DTW, which conduct similarity search for numerous prespecified patterns over multiple time-series streams under DTW with support for data normalization. Both methods combine several techniques to mitigate these costs. They inherit the cascading lower bounds introduced in UCR-DTW, a state-of-the-art method of similarity search in static time series, to admissibly prune unpromising subsequences. To adapt to the streaming setting, SUCR-DTW incrementally updates the envelopes of newly arriving time-series subsequences and incrementally normalizes the time-series data. However, like UCR-DTW, SUCR-DTW retrieves only similar subsequences that have the same length as the patterns. ESUCR-DTW, an extension of SUCR-DTW, can find similar subsequences whose lengths differ from those of the patterns. Furthermore, the proposed methods exploit multi-threading to respond quickly to high-speed time-series streams. The experimental results show that SUCR-DTW obtains the same precision as UCR-DTW with lower wall-clock time. A comparison of SUCR-DTW and ESUCR-DTW reveals that the extended method has higher accuracy at the cost of longer wall-clock time. The paper also evaluates the influence of incremental z-score normalization and incremental min–max normalization on the obtained results.

Highlights

  • A time-series stream is a sequence of data collected in a continuous manner as time progresses

  • For a single time-series stream, when the similarity search is conducted over one newly arriving time-series subsequence of the same length as the pattern, we propose SUCR-DTW, which stands for Streaming UCR-DTW; otherwise, we propose ESUCR-DTW, which stands for Extended SUCR-DTW, to carry out the similarity search over many newly arriving time-series subsequences

  • The first method, SUCR-DTW [15], is a modification of UCR-DTW, a state-of-the-art method of similarity search for prespecified patterns in static time series, so that SUCR-DTW can cope with the difficulties and complexities of similarity search in streaming time series


Summary

Introduction

A time-series stream is a sequence of data collected in a continuous manner as time progresses. Rakthanmanon et al. [11] introduced UCR-DTW, a method of similarity search for patterns, which are prespecified time-series sequences, in static time series under DTW. The method works only on static time series and requires two sequences of the same length when computing the DTW distance, so it leaves many questions open for similarity search over time-series streams. Motivated by this observation, in this paper we present two methods, SUCR-DTW and ESUCR-DTW, of similarity search for prespecified patterns in streaming time series under DTW, both of which support data normalization. ESUCR-DTW is compared with SPRING, a well-known method of similarity search in streaming time series, combined with incremental min–max normalization, in terms of wall-clock time and the quality of the similar subsequences found. Since Berndt and Clifford introduced the distance metric in 1994 [6], there has been continuous research on speeding up DTW.
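To illustrate what normalizing data on the fly can involve, the following is a minimal sketch of z-score normalization over a sliding window of a stream, maintained with a running sum and a running sum of squares so that each new value costs constant time to absorb. It is not taken from the paper: the class name `SlidingZNorm`, its methods, and the window length `m` are hypothetical, and the incremental formulas actually used by SUCR-DTW and ESUCR-DTW may differ.

```python
import math
from collections import deque


class SlidingZNorm:
    """Hypothetical helper: z-score statistics over the last m stream values."""

    def __init__(self, m):
        self.m = m                # window length (assumed equal to the pattern length)
        self.window = deque()     # the most recent values, oldest first
        self.total = 0.0          # running sum of the values in the window
        self.total_sq = 0.0       # running sum of their squares

    def push(self, x):
        """Absorb one new stream value and evict the oldest if the window is full."""
        self.window.append(x)
        self.total += x
        self.total_sq += x * x
        if len(self.window) > self.m:
            old = self.window.popleft()
            self.total -= old
            self.total_sq -= old * old

    def ready(self):
        """True once the window holds exactly m values."""
        return len(self.window) == self.m

    def normalized(self):
        """Return the z-normalized copy of the current window."""
        mean = self.total / self.m
        # Clamp at zero: rounding can make the computed variance slightly negative.
        var = max(self.total_sq / self.m - mean * mean, 0.0)
        std = math.sqrt(var)
        if std < 1e-12:
            # Constant window: every value equals the mean.
            return [0.0] * self.m
        return [(v - mean) / std for v in self.window]


if __name__ == "__main__":
    znorm = SlidingZNorm(3)
    for value in [1.0, 2.0, 3.0, 4.0]:
        znorm.push(value)
        if znorm.ready():
            print(znorm.normalized())
```

Updating the mean and standard deviation is constant-time per arriving value, while producing the normalized window itself remains linear in `m`. Over very long streams the running sums can accumulate floating-point error, so a practical implementation would likely recompute them periodically.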

Techniques to speed up Dynamic Time Warping
Data normalization
Typical tasks of similarity search in streaming time series
Related work
Get each s of T
Incremental data normalization
Problem definition
SUCR-DTW
ESUCR-DTW
Experimental evaluation
Evaluation of SUCR-DTW
Evaluation of ESUCR-DTW
Conclusions and future work