Abstract

Over recent years the popularity of time series has soared. Given the widespread use of modern information technology, a large number of time series may be collected during business, medical or biological operations, for example. As a consequence, interest in querying and mining such data has grown dramatically, which in turn has produced a large number of works introducing new methods for indexing, classification, clustering and approximation of time series. In particular, many new distance measures between time series have been introduced. In this paper, we propose a new distance function based on a derivative. In contrast to well-known measures from the literature, our approach considers the general shape of a time series rather than point-to-point function comparison. The new distance is used in classification with the nearest neighbor rule. In order to provide a comprehensive comparison, we conducted a set of experiments, testing effectiveness on 20 time series datasets from a wide variety of application domains. Our experiments show that our method provides a higher quality of classification on most of the examined datasets.

Highlights

  • Time-series classification has been studied extensively by machine learning and data mining communities, resulting in a variety of different approaches, ranging from neural (Petridis and Kehagias 1997) and Bayesian networks (Pavlovic et al 1999) through HMM-AR models (Penny and Roberts 1999) to genetic algorithms and support vector machines (Eads et al 2002)

  • Dynamic Time Warping (DTW) is a classical distance measure well suited to the task of comparing time series (Berndt and Clifford 1994). It differs from Euclidean distance (ED) by allowing the vector components being compared to “drift” from exactly corresponding positions; it measures similarity between two sequences which may vary in time or speed (a minimal sketch is given after this list)

  • A time series is a sequence of observations which are ordered in time or space (Box et al 2008)
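The dynamic-programming formulation behind DTW, as referenced in the second highlight, can be summarised with a minimal Python sketch (a textbook formulation shown for illustration; the function and variable names are ours, not taken from the paper):

```python
import numpy as np

def dtw_distance(x, y):
    """Classical Dynamic Time Warping distance between two 1-D sequences.

    Unlike Euclidean distance, DTW lets compared points "drift" from exactly
    corresponding positions by finding a minimal-cost alignment.
    """
    n, m = len(x), len(y)
    # cost[i, j] = minimal cumulative cost of aligning x[:i] with y[:j]
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(x[i - 1] - y[j - 1])              # local point-to-point cost
            cost[i, j] = d + min(cost[i - 1, j],      # expand x
                                 cost[i, j - 1],      # expand y
                                 cost[i - 1, j - 1])  # match both
    return cost[n, m]
```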


Summary

Introduction

Time-series classification has been studied extensively by the machine learning and data mining communities, resulting in a variety of different approaches, ranging from neural (Petridis and Kehagias 1997) and Bayesian networks (Pavlovic et al 1999) through HMM-AR models (Penny and Roberts 1999) to genetic algorithms and support vector machines (Eads et al 2002). The ideal case is when the compared time series are similar as functions, i.e. when their point values are close. In the classification domain, however, there may be objects for which function value comparison alone is not sufficient. As shown in the illustrative example (Sect. 3.1), in most cases the classification result nevertheless depends, to a greater or lesser degree, on function value comparison. It therefore seems that the best approach is to create a method which considers both the function values of a time series (point-to-point comparison) and the values of the derivative of the function (general shape comparison). In this paper we construct a distance measure that combines these two approaches to time series classification. Thanks to this, we are able to deal with situations where the examined sequences do not differ enough in their function values.
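As a rough illustration of blending point-to-point comparison with shape (derivative) comparison, here is a minimal Python sketch, assuming a simple first-difference derivative estimate and a hypothetical mixing weight `alpha`; the paper's actual definition, weighting scheme and derivative estimator may differ:

```python
import numpy as np

def euclidean(x, y):
    """Point-to-point (Euclidean) distance between equal-length series."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    return np.sqrt(np.sum((x - y) ** 2))

def derivative(x):
    """Discrete first-difference estimate of the derivative (an assumed
    estimator; the paper may define the derivative differently)."""
    return np.diff(np.asarray(x, float))

def combined_distance(x, y, alpha=0.5, base=euclidean):
    """Weighted blend of a distance on raw values (point-to-point comparison)
    and the same distance on derivatives (shape comparison).
    `alpha` is a hypothetical mixing weight; DTW could replace `base`."""
    return (1 - alpha) * base(x, y) + alpha * base(derivative(x), derivative(y))

def nn_classify(train_X, train_y, query, dist=combined_distance):
    """1-nearest-neighbour rule: return the label of the closest training series."""
    idx = int(np.argmin([dist(s, query) for s in train_X]))
    return train_y[idx]
```

In practice the base distance could be DTW rather than Euclidean, and the mixing weight would be chosen on training data, which is presumably what the "Parameter tuning" section below addresses.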

Related works and new contribution
Methods
An illustrative example
Dynamic Time Warping distance
Distance based on derivative
Parameter dimension
Parameter tuning
Lower bound and triangular inequality
Experimental setup
Main results
Noise and smoothing
Comparison with related works
Conclusions and future work
Lower bound and triangular inequality proof
Derivative influence comparison