Abstract
Action quality assessment aims to evaluate how well an action is performed. Existing methods have achieved remarkable progress on fully supervised action assessment. However, in real-world applications, labeling requires expert experience, so it is not always feasible to manually annotate all samples. It is therefore important to study semi-supervised action assessment, in which only a small number of samples are annotated. A major challenge for semi-supervised action assessment is how to exploit the temporal patterns in unlabeled videos. Inspired by the temporal dependencies of action execution, we propose a self-supervised learning task on unlabeled videos that recovers the feature of a masked segment of an unlabeled video. Furthermore, we leverage adversarial learning to align the representation distributions of the labeled and unlabeled samples, closing their gap in the sample space, since unlabeled samples always come from unseen actions. Together, these components form an adversarial self-supervised framework for semi-supervised action quality assessment. Extensive experimental results on the MTL-AQA and Rhythmic Gymnastics datasets demonstrate the effectiveness of our framework, which achieves state-of-the-art performance in semi-supervised action quality assessment.
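The masked-segment recovery idea described in the abstract can be illustrated with a toy sketch. This is not the framework's actual model: the function names (`mask_segment`, `recover_from_context`, `recovery_loss`), the neighbor-averaging predictor, and the mean-squared-error loss are all assumptions chosen for illustration, and the adversarial alignment step is omitted entirely. The sketch only shows the shape of the pretext task: hide one segment's feature, predict it from its temporal context, and score the prediction against the original.

```python
def mask_segment(features, index):
    """Return a copy of the per-segment features with one segment zeroed out."""
    masked = [list(f) for f in features]
    masked[index] = [0.0] * len(features[index])
    return masked


def recover_from_context(masked, index):
    """Hypothetical stand-in predictor: average the neighboring segments.

    A learned network would go here; averaging is only a placeholder.
    Requires a video with at least two segments.
    """
    neighbors = [masked[i] for i in (index - 1, index + 1) if 0 <= i < len(masked)]
    dim = len(masked[index])
    return [sum(n[d] for n in neighbors) / len(neighbors) for d in range(dim)]


def recovery_loss(predicted, target):
    """Mean squared error between the recovered and the original feature (assumed loss)."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


# Toy unlabeled video: 5 segments, 3-dimensional features that change smoothly over time.
features = [[float(t), 2.0 * t, 0.5 * t] for t in range(5)]
index = 2  # segment to hide

masked = mask_segment(features, index)
recovered = recover_from_context(masked, index)
loss = recovery_loss(recovered, features[index])
```

Because the toy features vary linearly with time, the neighbor average recovers the hidden segment exactly; on real video features the loss would be nonzero and would serve as the self-supervised training signal, requiring no quality-score labels.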
More From: IEEE Transactions on Circuits and Systems for Video Technology