Abstract
Distance and dissimilarity functions are central to Time Series Data Mining. Hundreds of methods proposed in the literature rely on a dissimilarity measure as the main way to compare objects; one notable example is the 1-Nearest Neighbor classification algorithm. These methods frequently outperform more complex methods in tasks such as classification, clustering, prediction, and anomaly detection. All of them leave the choice of distance or dissimilarity function open, and Euclidean distance (ED) and Dynamic Time Warping (DTW) are the two most widely used measures in the literature. This paper empirically compares 48 measures on 42 time series data sets. Our objective is to draw the research community's attention to dissimilarity measures other than ED and DTW, some of which significantly outperform these two in classification. Our results show that Complex Invariant Distance DTW (CIDDTW) significantly outperforms DTW. CIDDTW, DTW, CID, Minkowski L-p (p-norm difference with the parameter p tuned for each data set), Lorentzian L-infinity, Manhattan L-1, Average L-1/L-infinity (arithmetic average), Dice distance, and Jaccard distance all outperform ED, but only CIDDTW, DTW, and CID do so with statistical significance.
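To make the setting concrete, the following is a minimal Python sketch of 1-Nearest Neighbor classification with a pluggable dissimilarity function. It is an illustration under stated assumptions, not the implementation evaluated in this paper. The function names (`euclidean`, `dtw`, `complexity_factor`, `cid`, `cid_dtw`, `one_nn`) are hypothetical. DTW is shown without a warping window and with squared point-wise costs, and the complexity correction follows the commonly cited form in which the base distance is multiplied by the ratio of the larger to the smaller complexity estimate. The handling of constant series (zero complexity) is an assumption of this sketch.

```python
import math

def euclidean(a, b):
    """Euclidean distance (ED) between two equal-length series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dtw(a, b):
    """Unconstrained DTW with squared point-wise cost (no warping window)."""
    n, m = len(a), len(b)
    inf = float("inf")
    prev = [inf] * (m + 1)
    prev[0] = 0.0  # empty prefix aligned with empty prefix costs nothing
    for i in range(1, n + 1):
        curr = [inf] * (m + 1)
        for j in range(1, m + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            # Extend the cheapest of the three admissible predecessor cells.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return math.sqrt(prev[m])

def complexity_estimate(x):
    """Length of the series 'stretched out': root of summed squared differences."""
    return math.sqrt(sum((x[i + 1] - x[i]) ** 2 for i in range(len(x) - 1)))

def complexity_factor(a, b):
    """Ratio of the larger to the smaller complexity estimate (always >= 1)."""
    ca, cb = complexity_estimate(a), complexity_estimate(b)
    lo, hi = min(ca, cb), max(ca, cb)
    if lo == 0.0:
        # Assumption of this sketch: two constant series are equally complex;
        # a constant series versus a varying one is maximally penalized.
        return 1.0 if hi == 0.0 else float("inf")
    return hi / lo

def cid(a, b):
    """Complexity-corrected Euclidean distance."""
    return euclidean(a, b) * complexity_factor(a, b)

def cid_dtw(a, b):
    """Complexity-corrected DTW."""
    return dtw(a, b) * complexity_factor(a, b)

def one_nn(train, query, dist):
    """Return the label of the training series closest to `query` under `dist`.

    `train` is a list of (series, label) pairs.
    """
    _, label = min(train, key=lambda pair: dist(pair[0], query))
    return label

# Toy data, invented purely for illustration.
train = [([0, 0, 1, 2, 1, 0, 0], "bump"), ([2, 1, 0, 0, 0, 1, 2], "valley")]
query = [0, 1, 2, 1, 0, 0, 0]
for dist in (euclidean, dtw, cid, cid_dtw):
    print(dist.__name__, one_nn(train, query, dist))
```

Because the classifier takes the dissimilarity as an argument, swapping ED for DTW or a complexity-corrected variant changes only the `dist` argument, which is the sense in which such methods leave the measure open.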