Abstract
Symbolic approximation representation is a key problem in time series which can significantly affect the accuracy and efficiency of data mining. However, since currently used methods divide the original sequence into segments with equal size, they ignore one of the most important features of time series: the trend. To overcome the defect of equal-sized segmenting, we present a trend segmentation representation based on Iterative End Point Fitting algorithm (IEPF-TSR). Particularly, we use iterative end point fitting (IEPF) algorithm to search the break point of each segment and get the trend segmentation. Then a triplet based symbolic representation is proposed for each segment which includes the start point, mean and trend. Moreover, we define a new distance measure method based on trend segmentation representation (TSR-DIST) which can suit for two representations with different lengths, and prove it to be the lower bound of Euclidean distance. The experimental results on UCR datasets show that the proposed representation and distance measure achieve better performance than the state-of-the-art methods in the classification accuracy and the dimensionality reduction ratio.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.