Abstract

Early time series classification (eTSC) is the problem of classifying a time series after as few measurements as possible with the highest possible accuracy. The most critical issue of any eTSC method is to decide when enough data of a time series has been seen to make a decision: waiting for more data points usually makes the classification problem easier but delays the time at which a classification is made; in contrast, earlier classification has to cope with less input data, often leading to inferior accuracy. The state-of-the-art eTSC methods compute a fixed optimal decision time, assuming that every time series has the same defined start time (like turning on a machine). However, in many real-life applications measurements start at arbitrary times (like measuring heartbeats of a patient), implying that the best time for making a decision varies widely between time series. We present TEASER, a novel algorithm that models eTSC as a two-tier classification problem: in the first tier, a classifier periodically assesses the incoming time series to compute class probabilities. However, these class probabilities are used as the output label only if a second-tier classifier decides that the predicted label is reliable enough, which can happen after a different number of measurements for each time series. In an evaluation using 45 benchmark datasets, TEASER is two to three times earlier at predictions than its competitors while reaching the same or an even higher classification accuracy. We further show TEASER’s superior performance using real-life use cases, namely energy monitoring and gait detection.

Highlights

  • A time series (TS) is a collection of values sequentially ordered in time

  • The need for early classification arises when the decision is time-critical, for instance to prevent damage (the earlier a warning system can predict an earthquake from seismic data, the more time there is for preparation; Perol et al 2018), to speed up diagnosis (the earlier an abnormal heartbeat is detected, the more time there is for prevention of fatal attacks; Griffin and Moorman 2001), or to protect markets and systems (the earlier a crisis of a particular stock is detected, the faster it can be banned from trading; Ghalwash et al 2014)

  • We model the decision of whether a prediction is reliable enough as a classification problem of its own, where the i-th master classifier uses the results of the i-th slave classifier as features for learning its model
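The last highlight can be made concrete with a minimal sketch of how one slave output might become the input of its master. Everything here is an illustrative assumption rather than the paper's implementation: the function names (`master_features`, `fit_threshold_master`, `master_accepts`) are hypothetical, appending the top-two margin to the class probabilities is one plausible feature choice, and the learned threshold is a stand-in for a real master classifier, chosen for brevity.

```python
from typing import List, Sequence


def master_features(class_probs: Sequence[float]) -> List[float]:
    """Build a feature vector for the master from one slave output.

    Assumption: the features are the class probabilities plus the margin
    between the two most likely classes.
    """
    ranked = sorted(class_probs, reverse=True)
    margin = ranked[0] - ranked[1] if len(ranked) > 1 else ranked[0]
    return list(class_probs) + [margin]


def fit_threshold_master(train_features: Sequence[Sequence[float]],
                         was_correct: Sequence[bool]) -> float:
    """Toy master: remember the smallest margin among correct slave predictions.

    Only training samples the slave classified correctly are used, so the
    master learns what a trustworthy prediction looks like.
    """
    margins = [f[-1] for f, ok in zip(train_features, was_correct) if ok]
    return min(margins) if margins else float("inf")


def master_accepts(features: Sequence[float], threshold: float) -> bool:
    """Accept the slave's label if its margin reaches the learned threshold."""
    return features[-1] >= threshold


# Hypothetical slave outputs for three training series at one snapshot,
# with a flag saying whether the slave's top label was correct.
train = [master_features(p) for p in ([0.75, 0.25], [0.5, 0.5], [0.875, 0.125])]
threshold = fit_threshold_master(train, [True, False, True])
```

With this toy data the learned threshold is the margin of the least confident correct prediction, so a new slave output of `[0.625, 0.375]` would be rejected and the classifier would wait for more data.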


Summary

Introduction

A time series (TS) is a collection of values sequentially ordered in time. One strong force behind their rising importance is the increasing use of sensors for automatic and high-resolution monitoring in domains like smart homes (Jerzak and Ziekow 2014), starlight observations (Protopapas et al 2006), machine surveillance (Mutschler et al 2013), or smart grids (Hobbs et al 1999; Lew and Milligan 2016). Many state-of-the-art methods in eTSC (Xing et al 2012, 2011; Mori et al 2017b) assume that all time series being classified have a defined start time. These methods assume that characteristic patterns appear at roughly the same offset in all TS, and try to learn the fixed fraction of the TS that is needed to make high-accuracy predictions, i.e., the point at which the accuracy of classification is most likely close to the accuracy on the full TS. In this paper we present TEASER, a Two-tier Early and Accurate Series classifiER, that is robust regarding the start time of a TS’s recording. It models eTSC as a two-tier classification problem (see Fig. 3).
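The two-tier idea can be sketched as a loop over growing prefixes of a series: at each snapshot a slave produces class probabilities, and a master decides whether to emit the label or wait. The sketch below is a simplification under stated assumptions, not the authors' code. The names `classify_early`, `toy_slave`, and `toy_master` are hypothetical, the slave and master are hand-written rules instead of trained models, and the same rule is reused at every snapshot, whereas the text describes a separate i-th slave and i-th master.

```python
from typing import Callable, List, Sequence, Tuple

Slave = Callable[[Sequence[float]], List[float]]  # prefix -> class probabilities
Master = Callable[[List[float]], bool]            # probabilities -> accept label?


def classify_early(series: Sequence[float], slaves: Sequence[Slave],
                   masters: Sequence[Master], step: int) -> Tuple[int, int]:
    """Return (label, number of measurements used) for one series."""
    for i, (slave, master) in enumerate(zip(slaves, masters), start=1):
        prefix_len = min(i * step, len(series))
        probs = slave(series[:prefix_len])
        label = max(range(len(probs)), key=probs.__getitem__)
        # At the last snapshot a label must be emitted in any case.
        is_last = prefix_len == len(series) or i == len(slaves)
        if master(probs) or is_last:
            return label, prefix_len
    raise ValueError("at least one slave/master pair is required")


def toy_slave(prefix: Sequence[float]) -> List[float]:
    """Hypothetical slave: class 1 becomes likely once a peak of height 4 is seen."""
    peak = max(abs(x) for x in prefix)
    p1 = 0.5 + 0.5 * min(peak / 4.0, 1.0)
    return [1.0 - p1, p1]


def toy_master(probs: List[float]) -> bool:
    """Hypothetical master: trust the slave only if it is confident enough."""
    return max(probs) >= 0.8


slaves, masters = [toy_slave] * 4, [toy_master] * 4
early_pattern = [0, 4, 4, 0, 0, 0, 0, 0]
late_pattern = [0, 0, 0, 0, 1, 4, 4, 0]

early_result = classify_early(early_pattern, slaves, masters, step=2)
late_result = classify_early(late_pattern, slaves, masters, step=2)
```

In this toy setup both series receive the same label, but the first is classified after 2 measurements and the second after 6, which illustrates why a single fixed decision time is a poor fit when the characteristic pattern can start at an arbitrary offset.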

Background: time series and eTSC
Related work
Early and accurate TS classification
Slave classifier
Master classifier
Training slave and master classifiers
Computational complexity
Experimental evaluation
Choice of slave and master classifiers
Being early and accurate
Impact of the number of masters and slaves
Ablation study
Three real-life datasets
Findings
Conclusion
