Abstract

Data augmentation is an effective strategy for improving model performance and generalization: it expands both the size and the diversity of a dataset by applying subtle modifications to the original data. In mainstream contrastive learning frameworks, data augmentation is used by default to compose positive and negative pairs. However, selecting augmentations that consistently preserve the semantics of the original samples is challenging, because augmentation transformations vary in quality. To address this challenge, we provide a novel perspective and formulate the data augmentation optimization problem as an outlier-driven augmentation blending problem. Specifically, we design a metric named Clustering-Aware Outlier Factor (CAOF) to quantify the semantic inconsistency between augmented samples and original samples. CAOF identifies semantically inconsistent outliers among the augmented samples generated by various transformations, and it offers better context-aware density assessments by considering the intra-cluster neighbors of individual samples. Each augmentation transformation is evaluated using the proposed CAOF, which reflects the likelihood that the transformation produces semantically inconsistent augmented samples. This likelihood determines the importance of the transformation in generating a refined blended augmentation, which is subsequently fed into the contrastive learning framework. We conduct empirical evaluations on six time series datasets, comparing our method with three state-of-the-art augmentation selection methods on five contrastive learning models. The results demonstrate the superiority of the proposed approach in discovering outliers and refining augmentations in contrastive learning for time series classification tasks.
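The pipeline described above (score each transformation by how often it produces outliers, then turn the scores into blending weights) can be illustrated with a minimal sketch. This is not the CAOF definition, which the abstract does not give. As a stand-in, the score below is the ratio of an augmented sample's mean k-nearest-neighbor distance within its cluster to the cluster's typical value for original samples. The function names, the softmax weighting, the weighted-sum blend, and the assumption that cluster labels are already available (with at least two samples per cluster) are all hypothetical choices made for illustration.

```python
import numpy as np

def intra_cluster_knn_dist(queries, refs, k, exclude_self=False):
    """Mean Euclidean distance from each query row to its k nearest reference rows."""
    d = np.linalg.norm(queries[:, None, :] - refs[None, :, :], axis=-1)
    d = np.sort(d, axis=1)
    if exclude_self:
        d = d[:, 1:]  # queries are the references: drop the zero self-distance
    return d[:, :k].mean(axis=1)

def transformation_score(orig, aug, labels, k=3):
    """Stand-in outlier score for one transformation (NOT the paper's CAOF).

    Values above 1 mean augmented samples sit in sparser regions of their
    cluster than the original samples typically do.
    """
    scores = np.empty(len(orig))
    for c in np.unique(labels):
        idx = labels == c
        base = intra_cluster_knn_dist(orig[idx], orig[idx], k, exclude_self=True).mean()
        scores[idx] = intra_cluster_knn_dist(aug[idx], orig[idx], k) / base
    return scores.mean()

def blend_weights(scores, temperature=1.0):
    """Assumed weighting: softmax over negated scores, so lower scores weigh more."""
    z = -np.asarray(scores, dtype=float) / temperature
    z -= z.max()  # numerical stability
    w = np.exp(z)
    return w / w.sum()

def blend(views, weights):
    """Assumed blending rule: weighted sum of same-shaped augmented views."""
    return np.tensordot(weights, np.stack(views), axes=1)

# Toy data: two clusters of 20 series, each of length 16.
rng = np.random.default_rng(0)
orig = np.concatenate([rng.normal(0.0, 1.0, (20, 16)), rng.normal(5.0, 1.0, (20, 16))])
labels = np.repeat([0, 1], 20)

jitter = orig + rng.normal(0.0, 0.05, orig.shape)  # mild transformation
heavy = orig + rng.normal(0.0, 3.0, orig.shape)    # semantics-breaking transformation

scores = [transformation_score(orig, v, labels) for v in (jitter, heavy)]
weights = blend_weights(scores)
blended = blend([jitter, heavy], weights)
print("scores:", scores, "weights:", weights)
```

On this toy data the heavily distorted view receives the higher outlier score and therefore the smaller blending weight, which is the qualitative behavior the abstract attributes to CAOF-driven blending.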
