Abstract

In this paper, a novel knowledge distillation (KD)-based pedestrian attribute recognition (PAR) model is developed, in which a multi-label mixed feature learning network (MMFL-Net) is designed and adopted as the student model. In particular, by applying grouped depth-wise separable convolution, re-parameterization, and a coordinate attention mechanism, multi-scale receptive field information is sufficiently fused and robust, spatially dependent features are extracted, while model complexity is kept acceptable. To alleviate the imbalance among category samples, an attribute weight parameter is proposed and incorporated into the multi-label loss. Moreover, a Jensen–Shannon (JS) divergence-based KD scheme facilitates the learning of MMFL-Net from the teacher model, which strengthens its ability to fit deep feature correlations and thereby yields a highly generalized model. The proposed KD-PAR is comprehensively evaluated through extensive experiments, and the results show its effectiveness and superiority over other advanced multi-label learning (MLL)-based methods and state-of-the-art PAR models, achieving a balance between accuracy and complexity. Even in complex scenes involving blurry backgrounds, similar-object interference, and target occlusion, the proposed KD-PAR presents satisfactory recognition results with strong robustness, thereby providing a feasible and practical solution to PAR tasks.
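
As a rough illustration of how an attribute-weighted multi-label loss and a JS divergence-based distillation term might be combined, the following minimal sketch treats each attribute as an independent Bernoulli variable. It is not the formulation used in the paper: the function names, the per-attribute weights passed in as plain inputs, and the mixing coefficient `alpha` are all assumptions made for illustration only.

```python
import math

EPS = 1e-12  # numerical guard for log(0); an illustrative choice


def _clip(p):
    """Keep a probability strictly inside (0, 1)."""
    return min(max(p, EPS), 1.0 - EPS)


def weighted_bce(probs, labels, weights):
    """Mean binary cross-entropy over attributes, scaled per attribute.

    `weights` is a hypothetical per-attribute weight list (for example,
    larger for rare attributes); the paper's exact weighting is not given here.
    """
    total = 0.0
    for p, y, w in zip(probs, labels, weights):
        p = _clip(p)
        total += -w * (y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probs)


def bernoulli_kl(p, q):
    """KL divergence between Bernoulli(p) and Bernoulli(q)."""
    p, q = _clip(p), _clip(q)
    return p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))


def js_divergence(student_probs, teacher_probs):
    """Mean per-attribute Jensen-Shannon divergence (symmetric, at most ln 2)."""
    total = 0.0
    for s, t in zip(student_probs, teacher_probs):
        m = 0.5 * (s + t)  # mixture of the two Bernoulli distributions
        total += 0.5 * bernoulli_kl(s, m) + 0.5 * bernoulli_kl(t, m)
    return total / len(student_probs)


def kd_loss(student_probs, teacher_probs, labels, weights, alpha=0.5):
    """Blend the supervised term with the distillation term (alpha is assumed)."""
    supervised = weighted_bce(student_probs, labels, weights)
    distill = js_divergence(student_probs, teacher_probs)
    return (1.0 - alpha) * supervised + alpha * distill
```

Unlike the KL divergence, the JS divergence is symmetric in the student and teacher outputs and is bounded, which keeps the distillation term finite even when the two models disagree strongly on an attribute.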
