Abstract
Human parsing and pose estimation both depend on human body structure, so learning them jointly lets each task assist the other. However, each task has its own characteristics, which makes it difficult for a single multi-task learning (MTL) network to perform well on both tasks efficiently and simultaneously. This paper proposes a new MTL network, KDNet, that uses knowledge distillation to strengthen the communication between human parsing and pose estimation and thereby improve their simultaneous learning. In KDNet, a consistent representation module enhances the sharing of related information between the tasks, and a loss weighting method based on single-task models balances the loss levels. KDNet is evaluated on three benchmark datasets; compared with other state-of-the-art methods, it achieves substantial performance gains without increasing the number of model parameters at inference. The code is available at https://github.com/Yhdian/KDNet.git.