Using BlazePose on Spatial Temporal Graph Convolutional Networks for Action Recognition

Motasem S Alsawadi, El-Sayed M El-Kenawy, Miguel Rio

doi:10.32604/cmc.2023.032499

Open Access

https://doi.org/10.32604/cmc.2023.032499

Journal: Computers, Materials & Continua
Publication Date: Jan 1, 2023
Citations: 3
License type: cc-by

Affiliation: University College London

Abstract

The ever-growing volume of available visual data (i.e., videos and pictures uploaded by internet users) has attracted the attention of the computer vision research community. Finding efficient ways to extract knowledge from these sources is therefore imperative. Recently, the BlazePose system was released for skeleton extraction from images, oriented to mobile devices. With this skeleton graph representation in place, a Spatial-Temporal Graph Convolutional Network can be used to predict the action. We hypothesize that simply replacing the skeleton input data with a different set of joints that offers more information about the action of interest can increase the performance of the Spatial-Temporal Graph Convolutional Network on human action recognition (HAR) tasks. Hence, in this study, we present the first implementation of the BlazePose skeleton topology on this architecture for action recognition. Moreover, we propose the Enhanced-BlazePose topology, which can achieve better results than its predecessor. Additionally, we propose different skeleton detection thresholds that can improve accuracy even further. We reached a top-1 accuracy of 40.1% on the Kinetics dataset. On the NTU-RGB+D dataset, we achieved 87.59% and 92.1% accuracy under the Cross-Subject and Cross-View evaluation criteria, respectively.
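To make the idea of feeding a skeleton graph to a graph convolution more concrete, the following is a minimal sketch, not the authors' implementation. It assumes a hypothetical 5-joint toy skeleton (not the BlazePose or Enhanced-BlazePose topology), assumes that a "detection threshold" means discarding joints whose per-joint confidence score falls below a cutoff, and uses the symmetric normalization D^-1/2 (A + I) D^-1/2 that is common for graph convolutions. All function names are illustrative, and learned weights and the temporal dimension are omitted.

```python
from math import sqrt

# Hypothetical 5-joint toy skeleton (NOT the BlazePose topology):
# 0 = head, 1 = neck, 2 = left hand, 3 = right hand, 4 = hip
TOY_EDGES = [(0, 1), (1, 2), (1, 3), (1, 4)]


def normalized_adjacency(num_joints, edges):
    """Return D^-1/2 (A + I) D^-1/2 as a list of lists."""
    a = [[0.0] * num_joints for _ in range(num_joints)]
    for i in range(num_joints):
        a[i][i] = 1.0  # self-loop so each joint keeps its own features
    for i, j in edges:
        a[i][j] = 1.0  # bones are undirected, so fill both directions
        a[j][i] = 1.0
    degree = [sum(row) for row in a]
    return [
        [a[i][j] / sqrt(degree[i] * degree[j]) for j in range(num_joints)]
        for i in range(num_joints)
    ]


def apply_detection_threshold(frame, threshold):
    """Zero out joints whose confidence is below the threshold.

    frame is a list of (x, y, confidence) tuples, one per joint.
    Assumption: thresholding acts on a per-joint confidence score.
    """
    return [
        (x, y, c) if c >= threshold else (0.0, 0.0, 0.0)
        for (x, y, c) in frame
    ]


def spatial_aggregate(frame, adj):
    """Mix each joint's features with its neighbours' (no learned weights)."""
    n = len(frame)
    channels = len(frame[0])
    return [
        tuple(
            sum(adj[i][j] * frame[j][ch] for j in range(n))
            for ch in range(channels)
        )
        for i in range(n)
    ]


if __name__ == "__main__":
    adj = normalized_adjacency(5, TOY_EDGES)
    # One frame of made-up (x, y, confidence) values for the toy skeleton.
    frame = [(0.5, 0.1, 0.9), (0.5, 0.3, 0.2), (0.2, 0.5, 0.8),
             (0.8, 0.5, 0.7), (0.5, 0.7, 0.95)]
    kept = apply_detection_threshold(frame, threshold=0.5)
    for joint_features in spatial_aggregate(kept, adj):
        print(joint_features)
```

Changing the skeleton topology in this sketch amounts to changing the joint count and the edge list, which is the kind of input-side change the hypothesis above refers to; the rest of the pipeline stays the same.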

Full Text