Abstract
Performing time series mining tasks directly on raw data is inefficient; these data therefore require representation methods that transform them into low-dimensional spaces where they can be managed more efficiently. Owing to its simplicity, the piecewise aggregate approximation is a popular time series representation method. However, this method uses a uniform word-size for all the segments in the time series, which reduces the quality of the representation. Although some alternatives use representations with different word-sizes in a way that reflects the various information contents of different segments, such methods apply a complicated representation scheme, as they use a different representation for each time series in the dataset. In this paper we present two modifications of the original piecewise aggregate approximation. The novelty of these modifications is that they use different word-sizes, which allows for a flexible representation that reflects the level of activity in each segment, yet these new modifications address this problem at the dataset level, which simplifies establishing a lower bounding distance. The word-sizes are determined through an optimization process. The experiments we conducted on a variety of time series datasets validate the two new modifications.
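To make the idea concrete, the following is a minimal sketch of the standard piecewise aggregate approximation together with a segment-weighted lower bounding distance. It is not the implementation of the two modifications presented in the paper. It assumes that segment lengths stand in for the per-segment word-sizes and that one set of segment boundaries is shared by every series in the dataset; the function names `paa`, `paa_lower_bound`, and `uniform_boundaries` are hypothetical, and the optimization step that chooses the boundaries is not shown.

```python
import numpy as np

def uniform_boundaries(n, w):
    """Exclusive end indices of w equal-length segments (assumes n divisible by w)."""
    step = n // w
    return [step * (i + 1) for i in range(w)]

def paa(series, boundaries):
    """Replace each segment by its mean.

    `boundaries` holds the exclusive end index of every segment; the last
    entry must equal len(series). Segments may have different lengths.
    """
    series = np.asarray(series, dtype=float)
    starts = [0] + list(boundaries[:-1])
    return np.array([series[s:e].mean() for s, e in zip(starts, boundaries)])

def paa_lower_bound(rep_a, rep_b, boundaries):
    """Distance between two representations built on the SAME boundaries.

    Each squared difference of segment means is weighted by the segment
    length, so the result never exceeds the Euclidean distance between the
    raw series. With equal-length segments this reduces to the usual
    sqrt(n / w) * ||rep_a - rep_b|| form.
    """
    lengths = np.diff([0] + list(boundaries))
    return float(np.sqrt(np.sum(lengths * (rep_a - rep_b) ** 2)))
```

Because the boundaries are fixed once for the whole dataset in this sketch, any two representations can be compared segment by segment, which is what keeps the lower bound simple. If every series had its own segmentation, the segments of two series would not line up and this formula would not apply directly.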