Abstract
Semi-supervised learning aims to boost model performance by exploiting large amounts of unlabelled data, thereby reducing the overhead of labelling. For joint pedestrian and face detection in real-world scenarios, existing semi-supervised object detection methods rarely consider the category relevance between samples, resulting in unsatisfactory classification performance. Moreover, existing semi-supervised methods cannot effectively integrate the categories from two datasets into a single ensemble network. In this work, a novel approach that aims to fully utilise category-relevant information is proposed. First, multi-teacher distillation that decouples the pedestrian and face categories is introduced to eliminate category unfairness in the distillation process. Second, a coupled attention module embedded in the classification head of the student network is proposed to better capture the relevance of the different categories provided by the teachers and to guide distillation. Finally, a constraint loss is designed to stabilise the training process and improve convergence, so as to tailor a versatile student. Experimental results on the CrowdHuman and WiderFace benchmarks demonstrate the superiority of the approach over state-of-the-art methods.