Abstract

Time series data are common in finance, health, environmental, social, mobile and other fields. The widespread use of smartphones, sensors, RFID and other internet devices has produced a large amount of time series data. How a time series is represented is key to the efficient and effective storage and management of time series data, and is also very important for time series classification. This paper proposes two new time series representation methods, Hexadecimal Aggregate approXimation (HAX) and Point Aggregate approXimation (PAX). Both methods represent each segment of a time series as a transformable interval object (TIO). Each TIO is then mapped to a spatial point on a two-dimensional plane. Finally, HAX maps each point to a hexadecimal digit, so that a time series is converted into a hex string. The experimental results show that HAX has higher classification accuracy than Symbolic Aggregate approXimation (SAX) but lower accuracy than some SAX variants (SAX-TD, SAX-BD). HAX has the same space cost as SAX, which is lower than that of these variants. PAX has higher classification accuracy than HAX, extremely close to that of the Euclidean distance (ED) measurement; however, the space cost of PAX is generally much lower than that of ED. HAX and PAX are general representation methods that, in addition to classification, can also support clustering, indexing and querying of geoscience time series.
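The pipeline described above (segment, interval object, point on a plane, hex digit) can be illustrated with a minimal Python sketch. It is not the paper's definition of HAX: the abstract does not specify how a TIO is built or how the plane is partitioned. Here it is assumed, purely for illustration, that each segment's interval is (first value, last value), and that each axis of the plane is cut into four bins using the equiprobable Gaussian breakpoints familiar from SAX, giving 4 × 4 = 16 cells. The names `hax_like_encode`, `split_segments` and `quantize` are hypothetical.

```python
from bisect import bisect_right
from statistics import NormalDist, mean, pstdev

# Assumption: 4 equiprobable bins per axis under N(0, 1), as in SAX.
_Q = NormalDist().inv_cdf(0.75)
BREAKPOINTS = [-_Q, 0.0, _Q]


def z_normalize(series):
    """Rescale to zero mean and unit variance (all zeros if constant)."""
    mu = mean(series)
    sigma = pstdev(series)
    if sigma == 0:
        return [0.0 for _ in series]
    return [(x - mu) / sigma for x in series]


def split_segments(series, w):
    """Split into w contiguous, nearly equal segments (PAA-style division)."""
    n = len(series)
    bounds = [(i * n) // w for i in range(w + 1)]
    return [series[bounds[i]:bounds[i + 1]] for i in range(w)]


def quantize(value):
    """Return the bin index (0..3) of a normalized value."""
    return bisect_right(BREAKPOINTS, value)


def hax_like_encode(series, w):
    """Encode a series as a hex string with one digit per segment.

    Assumption (not from the paper): the interval object of a segment is
    (first value, last value), read as a point (x, y) on the plane.
    """
    if not 0 < w <= len(series):
        raise ValueError("w must satisfy 0 < w <= len(series)")
    digits = []
    for segment in split_segments(z_normalize(series), w):
        x, y = segment[0], segment[-1]
        cell = 4 * quantize(x) + quantize(y)  # one of 16 grid cells
        digits.append(format(cell, "X"))
    return "".join(digits)


if __name__ == "__main__":
    print(hax_like_encode(list(range(16)), 4))  # prints 01BF
```

Under these assumptions the output has one hex digit per segment, so its length equals that of a SAX string built on the same segmentation, which is consistent with the equal space cost reported above.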

Highlights

  • The Symbolic Aggregate approXimation (SAX) method was selected because the SAX and the Hexadecimal Aggregate approXimation (HAX) methods are both symbol-based representation methods based on the Piecewise Aggregate Approximation (PAA) division and have the same string length for a time series object

  • In terms of space cost, HAX reduces the dimensionality of high-dimensional time series by representing a time series as a set of hex strings, which reduces the amount of information required for time series storage and makes it more convenient to use in various fields

Summary

Introduction

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Human society faces many challenges, such as environmental pollution, population growth, urban expansion, the transmission of infectious diseases, and the monitoring and prevention of various natural disasters. These are all closely related to time and produce massive amounts of data containing temporal information. Many applications in scientific research, industry and business produce large amounts of time series data that need effective analysis, which requires rational representation and efficient similarity computation and search. These applications cover the domains of images, audio, finance, environmental monitoring and other scientific disciplines [7,8,9].

Related Work
Hexadecimal Aggregate Approximation Representation
Basic Principle of HAX
Summary of of time series
ADistance
Experimental Evaluation
Experimental Data
Experimental Parameter Setting
Experimental Results and Analysis
Accuracy
Conclusions