Abstract
The performance of self-supervised learning (SSL) models is hindered by the scale of the network. Existing SSL methods suffer a precipitous performance drop on lightweight models, which are important for many mobile devices. To address this problem, we propose a method that improves a lightweight network (the student) by distilling metric knowledge from a larger SSL model (the teacher). We exploit the relation between teacher and student to mine positive and negative supervision from unlabeled data, which yields more accurate supervision signals. To adaptively handle the uncertainty in positive and negative sample pairs, we apply a dynamic weighting strategy to the metric relations between embeddings. Unlike previous self-supervised distillers, our solution directly optimizes the network from a metric-transfer perspective by utilizing the relationships between samples and networks, without additional SSL constraints. Our method significantly boosts the performance of lightweight networks and outperforms existing distillers with fewer training epochs on large-scale ImageNet. Interestingly, the student's SSL performance even surpasses that of the teacher network in several settings.