Abstract
Skeleton-based action recognition is vital for comprehending human-centric videos and has applications in diverse domains. One challenge in skeleton-based action recognition is dealing with low-quality data, such as skeletons with missing or inaccurate joints. This paper improves action recognition on low-quality skeletons through a general knowledge distillation framework. The proposed framework employs a teacher-student setup, in which a teacher model trained on high-quality skeletons guides the learning of a student model that handles low-quality skeletons. To bridge the gap between heterogeneous high-quality and low-quality skeletons, we present a novel part-based skeleton matching strategy, which exploits shared body parts to facilitate local action pattern learning. An action-specific part matrix is developed to emphasize the critical parts for different actions, enabling the student model to distill discriminative part-level knowledge. A novel part-level multi-sample contrastive loss transfers knowledge from multiple high-quality skeletons to low-quality ones, which allows the framework to also train on low-quality skeletons that lack corresponding high-quality matches. Comprehensive experiments conducted on the NTU-RGB+D, Penn Action, and SYSU 3D HOI datasets demonstrate the effectiveness of the proposed knowledge distillation framework.
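To make the idea of a part-level multi-sample contrastive loss more concrete, the following is a minimal NumPy sketch, not the formulation used in the paper. It rests on several assumptions that the abstract does not state: positives are taken to be all teacher (high-quality) samples sharing the student's action label, similarity is cosine similarity scaled by a temperature, and the action-specific part matrix is modeled as a hypothetical `part_weights` array of shape (classes, parts). The function name and all parameters are illustrative.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    # Scale vectors to unit length so dot products become cosine similarities.
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def part_multi_sample_contrastive_loss(student, teacher, student_labels,
                                       teacher_labels, part_weights,
                                       temperature=0.1):
    """Hypothetical part-level, multi-positive InfoNCE-style loss.

    student:        (Ns, P, D) part features of low-quality skeletons
    teacher:        (Nt, P, D) part features of high-quality skeletons
    student_labels: (Ns,) integer action labels
    teacher_labels: (Nt,) integer action labels
    part_weights:   (C, P) assumed action-specific part weights
    """
    student_labels = np.asarray(student_labels)
    teacher_labels = np.asarray(teacher_labels)
    s = l2_normalize(np.asarray(student, dtype=float))
    t = l2_normalize(np.asarray(teacher, dtype=float))

    # sim[p, i, j]: similarity of student i and teacher j on body part p.
    sim = np.einsum("ipd,jpd->pij", s, t) / temperature

    # Numerically stable log-softmax over the teacher samples.
    shifted = sim - sim.max(axis=2, keepdims=True)
    log_prob = shifted - np.log(np.exp(shifted).sum(axis=2, keepdims=True))

    # Assumption: every teacher sample of the same action is a positive,
    # so no one-to-one high-quality match is required.
    pos = (student_labels[:, None] == teacher_labels[None, :]).astype(float)
    n_pos = pos.sum(axis=1)
    valid = n_pos > 0
    if not valid.any():
        return 0.0

    # Average log-probability over positives, per part and student sample.
    mean_pos = (log_prob * pos[None]).sum(axis=2) / np.maximum(n_pos, 1.0)[None]

    # Weight each part by its assumed importance for the sample's action.
    w = np.asarray(part_weights, dtype=float)[student_labels].T  # (P, Ns)
    per_sample = -(w * mean_pos).sum(axis=0)
    return float(per_sample[valid].mean())
```

In this sketch, a student sample contributes to the loss whenever at least one teacher sample of the same action is present in the batch, which is one plausible way to use low-quality skeletons that have no paired high-quality counterpart.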