Abstract

In this paper, we propose an integrated action classification and regression learning framework for the fine-grained human action quality assessment of RGB videos. On the basis of 2D skeleton data obtained per frame of RGB video sequences, we present an effective representation of joint trajectories to train action classifiers and a class-specific regression model for a fine-grained assessment of the quality of human actions. To manage the challenge of view changes due to camera motion, we develop a self-similarity feature descriptor extracted from joint trajectories and a joint displacement sequence to represent dynamic patterns of the movement and posture of the human body. To weigh the impact of joints for different action categories, a class-specific regression model is developed to obtain effective fine-grained assessment functions. In the testing stage, with the supervision of the action classifier’s output, the regression model of a specific action category is selected to assess the quality of skeleton motion extracted from the action video. We take advantage of the discrimination of the action classifier and the viewpoint invariance of the self-similarity feature to boost the performance of the learning-based quality assessment method in a realistic scene. We evaluate our proposed method using diving and figure skating videos of the publicly available MIT Olympic Scoring dataset, and gymnastic vaulting videos of the recent benchmark University of Nevada Las Vegas (UNLV) Olympic Scoring dataset. The experimental results show that the proposed method achieved an improved performance, which is measured by the mean rank correlation coefficient between the predicted regression scores and the ground truths.
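The evaluation metric mentioned above can be illustrated with a small sketch. The abstract does not say which rank correlation is used or what the mean is taken over, so this example assumes Spearman's rank correlation averaged over several train/test splits. The function names (`ranks`, `spearman`, `mean_rank_correlation`) are hypothetical and are not taken from the paper.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(predicted, truth):
    """Spearman rank correlation: Pearson correlation of the two rank vectors."""
    return pearson(ranks(predicted), ranks(truth))


def mean_rank_correlation(splits):
    """Average Spearman correlation over (predicted, truth) pairs, one per split (assumed protocol)."""
    return mean(spearman(p, t) for p, t in splits)
```

A value near 1 means the predicted scores order the videos almost exactly as the judges' scores do, regardless of the absolute scale of the predictions.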

Highlights

  • Human action evaluation (HAE) aims to tackle the challenging problem of making computers automatically quantify how well people perform actions

  • To accurately predict the quality score of an action video, we developed effective self-similarity feature descriptors extracted from the self-similarity matrices (SSMs) of joint trajectories and a joint displacement sequence that has been proven to alleviate the impact of camera motion in diving, figure skating, and vaulting videos

  • Some of the reviewed published works regarded the task of human action evaluation as a video sequence recognition problem; in this paper, we develop a self-similarity feature representation extracted from joint trajectories and a joint displacement sequence
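The self-similarity idea in the highlights can be sketched for a single joint as follows. This is a minimal illustration, assuming Euclidean distance between per-frame 2D joint positions; the source does not specify the distance measure or how descriptors are extracted from the matrix, and the function names are hypothetical.

```python
import math


def self_similarity_matrix(trajectory):
    """Pairwise Euclidean distances between all frames of one joint's 2D trajectory.

    trajectory: list of (x, y) positions, one per frame.
    Entry [i][j] is the distance between the joint at frame i and at frame j.
    """
    return [[math.dist(p, q) for q in trajectory] for p in trajectory]


def joint_displacements(trajectory):
    """Frame-to-frame displacement vectors of one joint (assumed definition)."""
    return [
        (x1 - x0, y1 - y0)
        for (x0, y0), (x1, y1) in zip(trajectory, trajectory[1:])
    ]


# A joint moving along a straight line, and the same motion shifted in the image.
original = [(0, 0), (3, 4), (6, 8)]
shifted = [(x + 10, y + 10) for x, y in original]

ssm_original = self_similarity_matrix(original)
ssm_shifted = self_similarity_matrix(shifted)
```

Because the matrix depends only on distances between frames, `ssm_original` and `ssm_shifted` are identical here: shifting the whole trajectory within the image plane leaves the matrix unchanged. Stability under general camera motion is a claim of the paper that this toy example does not demonstrate.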


Summary

Introduction

Human action evaluation (HAE) aims to tackle the challenging problem of making computers automatically quantify how well people perform actions. Traditional manual assessment of human motion quality demands a great deal of expertise from specialized fields, and long learning and training processes are required to summarize the experience and evaluation rules for automatically scoring sports activities in a specialized field. This requires a great amount of time and high labor cost. Unlike traditional action recognition research, human action evaluation aims to design computational models that automatically assess the quality score of human actions or activities and further give interpretable feedback to improve human body movement. It relies on accurate human motion detection and segmentation, action feature extraction and representation, and effective evaluation methods for measuring the quality of action performance. Compulsory routines are required in the performances of the Olympic games, including a spin, axel, spiral, and transition.
