Abstract

A novel posture motion-based spatiotemporal fused graph convolutional network (PM-STGCN) is presented for skeleton-based action recognition. Existing skeleton-based methods independently compute joint information within single frames and joint motion between adjacent frames from the human body skeleton structure, and then combine the classification results. Because this does not account for the complex temporal and spatial relationships within a human action sequence, such methods are not very effective at distinguishing similar actions. In this work, we improve the discrimination of similar actions through spatiotemporal fusion and adaptive extraction of highly discriminative features. First, a local posture motion-based temporal attention module (LPM-TAM) is proposed to suppress skeleton sequence data with little motion in the temporal domain, concentrating the representation on motion posture features. In addition, a local posture motion-based channel attention module (LPM-CAM) is introduced to exploit strongly discriminative representations between similar action classes. Finally, a posture motion-based spatiotemporal fusion (PM-STF) module is constructed, which fuses the spatiotemporal skeleton data by filtering out low-information sequences and adaptively enhancing highly discriminative posture motion features. Extensive experiments demonstrate that the proposed model outperforms commonly used action recognition methods, and a human-robot interaction system built on the proposed recognizer achieves performance competitive with a speech-based interaction system.
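The paper's implementation is not reproduced on this page, so the following is only a minimal PyTorch sketch of the LPM-TAM idea as the abstract describes it: frame-to-frame differences serve as a posture-motion signal, and a learned per-frame gate attenuates frames with little motion. The class name LPMTemporalAttention, the (N, C, T, V) tensor layout, and the Conv1d gate are illustrative assumptions, not the authors' exact design.

```python
# Minimal sketch (assumption, not the authors' code): a temporal attention
# module that suppresses low-motion frames, as LPM-TAM is described above.
# Layout: x has shape (N, C, T, V) = (batch, channels, frames, joints).
import torch
import torch.nn as nn

class LPMTemporalAttention(nn.Module):
    def __init__(self, channels, kernel_size=9):
        super().__init__()
        pad = (kernel_size - 1) // 2
        # A 1-D convolution over time turns the per-frame motion
        # descriptor into one attention score per frame.
        self.conv = nn.Conv1d(channels, 1, kernel_size, padding=pad)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):                     # x: (N, C, T, V)
        # Posture motion: difference between adjacent frames; the last
        # frame's motion is repeated so the sequence length T is kept.
        motion = x[:, :, 1:] - x[:, :, :-1]   # (N, C, T-1, V)
        motion = torch.cat([motion, motion[:, :, -1:]], dim=2)
        desc = motion.abs().mean(dim=3)       # per-frame descriptor: (N, C, T)
        att = self.sigmoid(self.conv(desc))   # frame gates in (0, 1): (N, 1, T)
        return x * att.unsqueeze(-1)          # broadcast the gate over joints
```

For example, `LPMTemporalAttention(64)(torch.randn(2, 64, 100, 25))` returns a tensor of the same shape in which low-motion frames are attenuated.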

Highlights

  • With the development of artificial intelligence, human-robot interaction technology has become a research hotspot

  • Therefore, action recognition plays an important role in the field of human-robot interaction [3]. The two main approaches to human action recognition are RGB-based and skeleton-based. The RGB-based method makes full use of image data and can achieve a higher recognition rate

  • The main contributions of our method are as follows: (1) a novel local posture motion-based temporal attention module (LPM-TAM) filters out low-motion information in the temporal domain, which helps improve the extraction of relevant motion features

Introduction

With the development of artificial intelligence, human-robot interaction technology has become a research hotspot. The main contributions of this work are as follows: (1) a novel local posture motion-based temporal attention module (LPM-TAM) filters out low-motion information in the temporal domain, which helps improve the extraction of relevant motion features; (2) a local posture motion-based channel attention module (LPM-CAM) enhances the ability to distinguish similar actions by adaptively learning strongly discriminative representations between different action classes; (3) a posture motion-based spatiotemporal fusion (PM-STF) module integrates LPM-TAM and LPM-CAM to effectively fuse spatiotemporal feature information and extract highly discriminative features, further improving the ability to distinguish similar actions; (4) the effectiveness of the proposed method is verified through extensive experiments against other common methods, and the method is applied successfully in a humanoid robot, verifying that action interaction outperforms speech interaction.
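The exact layer configuration behind contributions (2) and (3) is not given on this page; as a rough illustration only, the sketch below implements a posture motion-driven channel gate in squeeze-and-excitation style and a residual fusion wrapper that chains it with the LPMTemporalAttention sketch shown after the abstract. The class names LPMChannelAttention and PMSTFusion, the reduction ratio, and the residual connection are assumptions, not the published PM-STF design.

```python
# Minimal sketch (assumption) of the channel-attention and fusion ideas:
# LPMChannelAttention squeezes the posture-motion magnitude to one value
# per channel and learns a channel gate; PMSTFusion chains the temporal
# and channel gates with a residual connection.
import torch
import torch.nn as nn

class LPMChannelAttention(nn.Module):
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):                    # x: (N, C, T, V)
        motion = x[:, :, 1:] - x[:, :, :-1]  # posture motion between frames
        s = motion.abs().mean(dim=(2, 3))    # squeeze to one value per channel
        w = self.gate(s)                     # channel weights in (0, 1): (N, C)
        return x * w[:, :, None, None]       # re-weight channels

class PMSTFusion(nn.Module):
    """Chains the temporal and channel gates with a residual connection."""
    def __init__(self, channels):
        super().__init__()
        self.tam = LPMTemporalAttention(channels)  # from the earlier sketch
        self.cam = LPMChannelAttention(channels)

    def forward(self, x):
        # The residual path keeps the raw features; the attended branch
        # adds the emphasized high-discrimination posture-motion features.
        return x + self.cam(self.tam(x))
```

The residual connection and reduction ratio of 4 are illustrative defaults; the actual fusion rule of PM-STF is described here only at the level of the abstract.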

Related Work
Posture Motion-Based Spatiotemporal Fusion Graph Convolution
Datasets
Methods
Findings
Conclusion