A symbolic representation of time series, with implications for streaming algorithms

Jessica Lin, Stefano Lonardi, Eamonn Keogh, Bill Chiu

doi:10.1145/882082.882086

Abstract

The parallel explosions of interest in streaming data and in data mining of time series have had surprisingly little intersection. This is in spite of the fact that time series data are typically streaming data. The main reason for this apparent paradox is that the vast majority of work on streaming data explicitly assumes that the data is discrete, whereas the vast majority of time series data is real valued.

Many researchers have also considered transforming real valued time series into symbolic representations, noting that such representations would potentially allow researchers to avail of the wealth of data structures and algorithms from the text processing and bioinformatics communities, in addition to allowing formerly batch-only problems to be tackled by the streaming community. While many symbolic representations of time series have been introduced over the past decades, they all suffer from three fatal flaws. Firstly, the dimensionality of the symbolic representation is the same as that of the original data, and virtually all data mining algorithms scale poorly with dimensionality. Secondly, although distance measures can be defined on the symbolic approaches, these distance measures have little correlation with distance measures defined on the original time series. Finally, most of these symbolic approaches require one to have access to all the data before creating the symbolic representation. This last feature explicitly thwarts efforts to use the representations with streaming algorithms.

In this work we introduce a new symbolic representation of time series. Our representation is unique in that it allows dimensionality/numerosity reduction, and it also allows distance measures to be defined on the symbolic approach that lower bound corresponding distance measures defined on the original series. As we shall demonstrate, this latter feature is particularly exciting because it allows one to run certain data mining algorithms on the efficiently manipulated symbolic representation, while producing identical results to the algorithms that operate on the original data. Finally, our representation allows the real valued data to be converted in a streaming fashion, with only an infinitesimal time and space overhead.

We will demonstrate the utility of our representation on the classic data mining tasks of clustering, classification, query by content and anomaly detection.
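The abstract does not spell out how the representation is built, so the following is only a minimal sketch of the idea under stated assumptions. It assumes the construction commonly described for SAX: z-normalize the series, average it over equal-width segments (piecewise aggregate approximation), and map each average to a symbol using breakpoints that split a standard normal distribution into equiprobable regions. The function names (`symbolize`, `mindist`, and the helpers) are illustrative, not taken from the paper, and symbols are represented as integers starting at 0.

```python
import math
from bisect import bisect_right
from statistics import NormalDist, fmean, pstdev


def breakpoints(alphabet_size):
    """Cut points splitting a standard normal into equiprobable regions (assumed scheme)."""
    nd = NormalDist()
    return [nd.inv_cdf(i / alphabet_size) for i in range(1, alphabet_size)]


def znorm(series):
    """Rescale to zero mean and unit (population) standard deviation."""
    mu, sigma = fmean(series), pstdev(series)
    if sigma == 0:
        return [0.0 for _ in series]
    return [(x - mu) / sigma for x in series]


def paa(series, segments):
    """Mean of each equal-width segment; assumes len(series) is divisible by segments."""
    width = len(series) // segments
    return [fmean(series[i * width:(i + 1) * width]) for i in range(segments)]


def symbolize(series, segments, alphabet_size):
    """Reduce a real valued series to `segments` integer symbols in [0, alphabet_size)."""
    cuts = breakpoints(alphabet_size)
    return [bisect_right(cuts, v) for v in paa(znorm(series), segments)]


def mindist(a, b, n, alphabet_size):
    """Symbolic distance that lower bounds the Euclidean distance between the
    z-normalized originals of length n (assumed MINDIST-style definition)."""
    cuts = breakpoints(alphabet_size)

    def cell(r, c):
        # Identical or adjacent symbols contribute nothing to the bound.
        if abs(r - c) <= 1:
            return 0.0
        return cuts[max(r, c) - 1] - cuts[min(r, c)]

    w = len(a)
    return math.sqrt(n / w) * math.sqrt(sum(cell(r, c) ** 2 for r, c in zip(a, b)))


def euclidean(x, y):
    """Plain Euclidean distance, used here only to check the lower bound."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(x, y)))


rising = [0, 1, 2, 3, 4, 5, 6, 7]
falling = list(reversed(rising))
word_r = symbolize(rising, segments=4, alphabet_size=4)
word_f = symbolize(falling, segments=4, alphabet_size=4)
bound = mindist(word_r, word_f, n=len(rising), alphabet_size=4)
exact = euclidean(znorm(rising), znorm(falling))
```

In this sketch eight real values become four symbols, which is the dimensionality reduction the abstract refers to, and `bound` never exceeds `exact`, which is what lets an algorithm prune candidates on the symbolic form without changing its final answer. Because each symbol depends only on one segment's mean, the conversion can in principle be applied window by window as data arrive, although normalizing a true stream needs a per-window or running estimate of mean and deviation that this sketch does not cover.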

Similar Papers

Experiencing SAX: a novel symbolic representation of time series
Jessica Lin ... Li Wei
Data Mining and Knowledge Discovery | VOL. 15
03 Apr 2007

Adaptive Segmentation-Based Symbolic Representations of Time Series for Better Modeling and Lower Bounding Distance Measures
Bernard Hugueney
01 Jan 2006

Interpretable time series classification using linear models and multi-resolution multi-domain symbolic representations
Thach Le Nguyen ... Severin Gsponer
Data Mining and Knowledge Discovery | VOL. 33
21 May 2019

Repeating patterns as symbols for long time series representation
Jakub Sevcech ... Maria Bielikova
Journal of Systems and Software | VOL. 127
06 Jun 2016
