MLP-JCG: Multi-Layer Perceptron With Joint-Coordinate Gating for Efficient 3D Human Pose Estimation

Zhenhua Tang, Richang Hong, Yanbin Hao, Jia Li

doi:10.1109/tmm.2023.3240455

Abstract

Various structural relations and dependencies exist among human body joints, which makes it possible to estimate 3D poses from 2D sources. Current research on 3D human pose estimation (3D-HPE for short) mainly focuses on structural information from a specific perspective. However, such single-perspective information alone cannot fully facilitate 2D-to-3D pose lifting. This paper presents a novel and efficient multi-layer perceptron with joint-coordinate gating (MLP-JCG) model, which explores and utilizes both local and global structural information to perform 3D pose estimation. Specifically, MLP-JCG contains two independent MLP blocks, i.e., a joint-mixing MLP and a coordinate-mixing MLP, which act solely on the joint and coordinate axes, respectively, to model their local structural information. For the global structural information, we first explore two kinds of global statistics from the pose matrix embeddings, which are referred to as the dynamics aggregated along the joint/coordinate axis. Then, we propose two kinds of gating units that contextualize, element-wise, the features learned by the MLP blocks. All model components are designed based on MLPs, making MLP-JCG easy to implement and train. We conduct experiments on three 3D-HPE benchmarks, and the results demonstrate the superior effectiveness and efficiency of the proposed approach.
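To make the idea of axis-wise mixing with gating more concrete, the following is a minimal NumPy sketch of one block that mixes a pose embedding along the joint axis and then along the coordinate axis, scaling each result element-wise by a gate computed from a statistic aggregated along one axis. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the layer sizes, the form of the aggregated statistics (a plain mean is assumed here), the gate nonlinearity (a sigmoid is assumed), the use of residual connections, or which gate is paired with which MLP. All names (`gated_mixing_block`, `init_mlp`, `gate_c`, `gate_j`) and the sizes `J`, `C`, `H` are hypothetical.

```python
import numpy as np


def mlp(x, w1, b1, w2, b2):
    """Two-layer perceptron over the last axis of x, with a ReLU in between."""
    return np.maximum(x @ w1 + b1, 0.0) @ w2 + b2


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def init_mlp(rng, dim, hidden):
    """Small random weights for an MLP mapping dim -> hidden -> dim."""
    return (rng.normal(0.0, 0.1, (dim, hidden)), np.zeros(hidden),
            rng.normal(0.0, 0.1, (hidden, dim)), np.zeros(dim))


def gated_mixing_block(x, params):
    """x: (J, C) pose embedding (joints x channels). Returns the same shape."""
    # Assumed global statistic: mean aggregated along the joint axis -> (C,).
    gate_c = sigmoid(mlp(x.mean(axis=0), *params["gate_c"]))
    # Joint-mixing MLP: transpose so the joint axis is last, mix, transpose back.
    joint_feat = mlp(x.T, *params["joint_mlp"]).T
    # Element-wise gating, broadcast over joints; residual connection is assumed.
    x = x + joint_feat * gate_c[None, :]

    # Assumed global statistic: mean aggregated along the coordinate axis -> (J,).
    gate_j = sigmoid(mlp(x.mean(axis=1), *params["gate_j"]))
    # Coordinate-mixing MLP acts directly on the last (channel) axis.
    coord_feat = mlp(x, *params["coord_mlp"])
    # Element-wise gating, broadcast over channels.
    x = x + coord_feat * gate_j[:, None]
    return x


# Hypothetical sizes: J joints, C embedding channels, H hidden units.
rng = np.random.default_rng(0)
J, C, H = 17, 32, 64
params = {
    "joint_mlp": init_mlp(rng, J, H),  # mixes information across joints
    "coord_mlp": init_mlp(rng, C, H),  # mixes information across channels
    "gate_c": init_mlp(rng, C, H),     # gate from joint-aggregated statistic
    "gate_j": init_mlp(rng, J, H),     # gate from channel-aggregated statistic
}
x = rng.normal(size=(J, C))
out = gated_mixing_block(x, params)
print(out.shape)  # (17, 32)
```

Because every operation is a matrix product, a mean, or an element-wise product, a block of this kind needs no attention or graph-convolution machinery, which is consistent with the abstract's point that an all-MLP design is easy to implement and train.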

Full Text