We argue that time series analysis is fundamentally different in nature from both vision and natural language processing with respect to the forms of meaningful self-supervised learning tasks that can be defined. Motivated by this insight, we introduce a novel approach called Series2Vec for self-supervised representation learning. Unlike state-of-the-art time series methods, which rely on hand-crafted data augmentation, Series2Vec is trained through a self-supervised task: predicting the similarity between two series in both the temporal and spectral domains. Because this similarity prediction task has inherent meaning for a wide range of time series analysis tasks, Series2Vec eliminates the need for hand-crafted data augmentation. To further encourage the network to learn similar representations for similar time series, we propose a novel approach that applies order-invariant attention to each representation within the batch during training. Our evaluation of Series2Vec on nine large real-world datasets, along with the UCR/UEA archive, shows enhanced performance compared to current state-of-the-art self-supervised techniques for time series. Additionally, our extensive experiments show that Series2Vec performs comparably with fully supervised training and is highly efficient on datasets with limited labeled data. Finally, we show that fusing Series2Vec with other representation learning models leads to enhanced performance for time series classification. Code and models are open source at https://github.com/Navidfoumani/Series2Vec
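As a rough illustration of the similarity prediction task described above, the following sketch computes pairwise similarity targets for a small batch in both the temporal and spectral domains. It is a minimal sketch under stated assumptions, not the Series2Vec implementation: the abstract does not name the similarity measures, so plain Euclidean distance on the raw series and on the DFT magnitude spectrum is assumed here, and the function names `dft_magnitude`, `euclidean`, and `similarity_targets` are hypothetical.

```python
import cmath
import math


def dft_magnitude(series):
    """Magnitude spectrum of a real-valued series via a naive O(n^2) DFT."""
    n = len(series)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(series)))
        for k in range(n // 2 + 1)  # non-negative frequencies only
    ]


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def similarity_targets(batch):
    """Pairwise distance matrices in the temporal and spectral domains.

    Assumption: Euclidean distance stands in for whatever similarity
    measure the actual method uses in each domain.
    """
    spectra = [dft_magnitude(s) for s in batch]
    n = len(batch)
    temporal = [[euclidean(batch[i], batch[j]) for j in range(n)]
                for i in range(n)]
    spectral = [[euclidean(spectra[i], spectra[j]) for j in range(n)]
                for i in range(n)]
    return temporal, spectral


# Toy batch: two identical oscillating series and one constant series.
batch = [
    [0.0, 1.0, 0.0, -1.0],
    [0.0, 1.0, 0.0, -1.0],
    [1.0, 1.0, 1.0, 1.0],
]
temporal, spectral = similarity_targets(batch)
```

In a setup like this, an encoder would be trained so that distances between its learned representations match these targets, which requires no data augmentation because the targets come from the unlabeled series themselves.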