Data augmentation, which increases the diversity of datasets by applying image transformations, has become one of the most effective techniques in visual representation learning. Typically, the design of augmentation policies faces a diversity-difficulty trade-off. On the one hand, a simple augmentation yields low training-set diversity and thus cannot improve model performance significantly. On the other hand, an excessively hard augmentation imposes an overly strong regularization effect that harms model performance. Recently, automatic augmentation methods have been proposed to address this issue by searching for the optimal data augmentation policy in a predefined search space. However, these methods still suffer from heavy search overhead or complex optimization objectives. In this paper, instead of searching for the optimal augmentation policy, we propose to break the diversity-difficulty trade-off from a multi-task learning perspective. We formulate model learning on the augmented images and the original images as the auxiliary task and the primary task of multi-task learning, respectively; since hard augmentations never directly influence the training of the primary branch, their negative effect is alleviated. Hence, neural networks can learn valuable semantic information even under a completely random augmentation policy. Experimental results on ten datasets across four tasks demonstrate the superiority of our method over twelve other methods. Code has been released at https://github.com/ArchipLab-LinfengZhang/data-augmentation-multi-task.
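The following is a minimal PyTorch sketch of the multi-task formulation described above, under the assumption of a shared backbone with two classification heads; the names (MultiTaskAugNet, primary_head, aux_head, aux_weight) are illustrative and not taken from the released code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiTaskAugNet(nn.Module):
    """Sketch: shared backbone, separate heads for original/augmented images."""
    def __init__(self, backbone: nn.Module, feat_dim: int, num_classes: int):
        super().__init__()
        self.backbone = backbone  # shared feature extractor
        self.primary_head = nn.Linear(feat_dim, num_classes)  # original images
        self.aux_head = nn.Linear(feat_dim, num_classes)      # augmented images

    def forward(self, x_orig: torch.Tensor, x_aug: torch.Tensor):
        return (self.primary_head(self.backbone(x_orig)),
                self.aux_head(self.backbone(x_aug)))

def multitask_loss(logits_orig, logits_aug, labels, aux_weight: float = 1.0):
    # The primary loss is computed on original images only, so a hard (even
    # random) augmentation never perturbs the primary branch directly; its
    # learning signal reaches the shared backbone through the auxiliary loss.
    return (F.cross_entropy(logits_orig, labels)
            + aux_weight * F.cross_entropy(logits_aug, labels))
```

In this sketch, decoupling the two losses is what breaks the trade-off: the auxiliary branch absorbs the distortion of hard augmentations while the shared backbone still benefits from their diversity.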