Abstract

Self-supervised learning can mine deep semantic information from visual data without large amounts of human-annotated labels by pretraining a model on a pretext task. In this study, we propose a novel self-supervised learning paradigm, multi-task self-supervised (MTSS) representation learning. Unlike existing self-supervised learning methods, which pretrain a neural network on a pretext task and then fine-tune its parameters on a downstream task, our scheme treats the downstream and pretext tasks as primary and auxiliary tasks, respectively, and trains them simultaneously. Our method maximizes the similarity between two augmented views of an image as the auxiliary task and uses a multi-task network to train the primary task alongside it. We evaluated the proposed method on standard datasets and backbones through a rigorous experimental procedure. Experimental results reveal that the proposed MTSS achieves better performance and robustness than other self-supervised learning methods on multiple image classification datasets without using negative sample pairs or large batches. This simple yet effective method may inspire a rethinking of self-supervised learning.
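The abstract describes a shared network trained jointly on a supervised primary task and a view-similarity auxiliary task. The following is a minimal PyTorch sketch of that idea, not the authors' implementation: it assumes a ResNet-18 backbone, a SimSiam-style projector/predictor with stop-gradient for the "no negative pairs, no large batches" similarity objective, and an illustrative weighting factor `lambda_aux`; all class and function names are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import models

class MTSSModel(nn.Module):
    """Multi-task network: a shared backbone with a classification head
    (primary task) and a projection head (auxiliary similarity task)."""
    def __init__(self, num_classes=10, proj_dim=256):
        super().__init__()
        backbone = models.resnet18(weights=None)
        feat_dim = backbone.fc.in_features
        backbone.fc = nn.Identity()                 # shared feature extractor
        self.backbone = backbone
        self.classifier = nn.Linear(feat_dim, num_classes)   # primary head
        self.projector = nn.Sequential(                       # auxiliary head
            nn.Linear(feat_dim, proj_dim), nn.ReLU(inplace=True),
            nn.Linear(proj_dim, proj_dim))
        self.predictor = nn.Sequential(
            nn.Linear(proj_dim, proj_dim), nn.ReLU(inplace=True),
            nn.Linear(proj_dim, proj_dim))

    def forward(self, x):
        h = self.backbone(x)
        return self.classifier(h), self.projector(h)

def similarity_loss(p, z):
    """Negative cosine similarity with a detached (stop-gradient) target,
    so no negative sample pairs or large batches are required."""
    return -F.cosine_similarity(p, z.detach(), dim=-1).mean()

def training_step(model, view1, view2, labels, lambda_aux=1.0):
    # Primary task: supervised classification on one augmented view.
    logits, z1 = model(view1)
    primary = F.cross_entropy(logits, labels)
    # Auxiliary task: maximize agreement between the two augmented views.
    _, z2 = model(view2)
    p1, p2 = model.predictor(z1), model.predictor(z2)
    auxiliary = 0.5 * (similarity_loss(p1, z2) + similarity_loss(p2, z1))
    # Both tasks are optimized simultaneously through the shared backbone.
    return primary + lambda_aux * auxiliary
```

In this sketch the two losses are summed and backpropagated in a single step, so the pretext objective acts as a regularizer on the shared features rather than a separate pretraining stage; the specific auxiliary loss and weighting are assumptions for illustration.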
