Abstract

Data augmentation has emerged as a widely adopted technique for improving the generalization capabilities of deep neural networks. However, evaluating the effectiveness of data augmentation methods based solely on model training is computationally demanding and lacks interpretability. Moreover, the absence of quantitative standards hinders our understanding of the underlying mechanisms of data augmentation approaches and the development of novel techniques. To this end, we propose interpretable quantitative measures that decompose the effectiveness of data augmentation methods into two key dimensions: similarity and diversity. The proposed similarity measure describes the overall similarity between the original and augmented datasets, while the diversity measure quantifies the divergence in inherent complexity between the original and augmented datasets at the category level. Importantly, our proposed measures are model-training-agnostic, so they can be computed efficiently. Through experiments on several benchmark datasets, including MNIST, CIFAR-10, CIFAR-100, and ImageNet, we demonstrate the efficacy of our measures in evaluating the effectiveness of various data augmentation methods. Furthermore, although the proposed measures are straightforward, they have the potential to guide the design and parameter tuning of data augmentation techniques and enable the validation of data augmentation methods’ efficacy before embarking on large-scale model training.
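The abstract does not give the concrete definitions of the two measures, so the following is only a minimal illustrative sketch of what training-agnostic, category-level similarity and diversity scores could look like in the spirit described above. The centroid-based similarity, the dispersion-based complexity proxy, and the assumption of precomputed features (e.g., from a fixed encoder) are all hypothetical choices for illustration, not the paper's actual formulation.

```python
import numpy as np

def class_centroids(features, labels):
    """Per-class mean feature vectors, keyed by class label."""
    return {c: features[labels == c].mean(axis=0) for c in np.unique(labels)}

def class_dispersion(features, labels):
    """Per-class mean feature variance, used here as a crude complexity proxy."""
    return {c: features[labels == c].var(axis=0).mean() for c in np.unique(labels)}

def similarity_score(orig_feats, orig_labels, aug_feats, aug_labels):
    """Average cosine similarity between original and augmented class centroids.

    Higher values mean the augmented data stays close to the original distribution.
    """
    co = class_centroids(orig_feats, orig_labels)
    ca = class_centroids(aug_feats, aug_labels)
    sims = []
    for c in co:
        u, v = co[c], ca[c]
        sims.append(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))
    return float(np.mean(sims))

def diversity_score(orig_feats, orig_labels, aug_feats, aug_labels):
    """Average per-class gap in feature dispersion between augmented and original data.

    A positive value indicates the augmentation adds spread (complexity) within classes.
    """
    do = class_dispersion(orig_feats, orig_labels)
    da = class_dispersion(aug_feats, aug_labels)
    return float(np.mean([da[c] - do[c] for c in do]))
```

Because both scores operate on fixed feature arrays rather than on a trained model, they can be evaluated before committing to large-scale training, which is the property the abstract emphasizes.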
