Abstract
Most existing whole-body human pose estimation methods segment out the hands and feet and process them separately, which not only fragments the overall semantics of the body but also increases the computational cost and complexity of the model. To address these drawbacks, we design a novel semantic–structural graph convolutional network (SSGCN) for whole-body human pose estimation, which leverages the whole-body graph structure to analyze the semantics of the whole-body keypoints through a graph convolutional network and thereby improves the accuracy of pose estimation. First, we introduce a novel heat-map-based keypoint embedding, which encodes the position and feature information of the human body keypoints. Second, we propose a novel semantic–structural graph convolutional network consisting of several sets of cascaded structure-based graph layers and data-dependent whole-body non-local layers. Specifically, the proposed method extracts groups of keypoints and constructs a high-level abstract body graph to process the high-level semantic information of the whole-body keypoints. Experimental results show that our method achieves promising results on the challenging COCO whole-body dataset.
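To make the idea of a heat-map-based keypoint embedding concrete, the following is a minimal sketch and not the implementation used in this work. It assumes that the position of each keypoint is read from its heat map with a soft-argmax and that the feature part is obtained by heat-map-weighted pooling of a backbone feature map; the function name `keypoint_embedding` and both of these choices are illustrative assumptions.

```python
import numpy as np

def keypoint_embedding(heatmaps, features):
    """Hypothetical embedding: heatmaps (K, H, W), features (C, H, W) -> (K, 2 + C)."""
    K, H, W = heatmaps.shape
    # Spatial softmax turns each heat map into a probability map (assumption).
    flat = heatmaps.reshape(K, -1)
    flat = flat - flat.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(flat)
    weights /= weights.sum(axis=1, keepdims=True)
    # Pixel coordinates (x, y), normalized to [0, 1].
    ys, xs = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    coords = np.stack([xs.ravel(), ys.ravel()], axis=1).astype(float)
    coords /= np.array([max(W - 1, 1), max(H - 1, 1)], dtype=float)
    # Position part: expected coordinate under the probability map (soft-argmax).
    pos = weights @ coords                                      # (K, 2)
    # Feature part: feature map averaged with the same weights.
    feat = weights @ features.reshape(features.shape[0], -1).T  # (K, C)
    return np.concatenate([pos, feat], axis=1)
```

Each row of the result is one node feature vector that a graph convolutional network could consume, with the first two entries carrying position and the remainder carrying appearance.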
Highlights
Human pose estimation is a challenging computer vision task, which aims to locate the human body keypoints in images and videos
This work presents a novel graph convolutional network framework for whole-body human pose estimation tasks, which leverages the whole-body graph structure to analyze the semantics of each part of the body through the graph convolutional network
We propose a novel heat-map-based keypoint embedding module, which encodes the position information and feature information of the keypoints of the human body
The proposed semantic–structural graph convolutional network consists of a structure-based graph layer to capture skeleton structure information and a data-dependent non-local layer to analyze the long-range grouped joint features
We represent groups of keypoints and construct a high-level abstract body graph to process the high-level semantic information of the whole-body keypoints
We performed the semantic fusion of whole-body poses based on the whole-body skeleton and leveraged the heat-map-based graph convolutional network to calibrate whole-body human pose estimation
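The two layer types named in the highlights can be sketched as follows. This is an illustrative sketch under stated assumptions, not the layers of the proposed network: the structure-based layer is assumed to be a standard graph convolution over a symmetrically normalized skeleton adjacency matrix, and the data-dependent non-local layer is assumed to be a dot-product attention over all keypoints with a residual connection. All function names and weight shapes are hypothetical.

```python
import numpy as np

def normalized_adjacency(edges, num_nodes):
    """Skeleton adjacency with self-loops, normalized as D^-1/2 A D^-1/2."""
    A = np.eye(num_nodes)
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    return A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def structure_graph_layer(X, A_hat, W):
    """Fixed-structure graph convolution: each keypoint mixes with its skeleton neighbors."""
    return np.maximum(A_hat @ X @ W, 0.0)  # ReLU

def non_local_layer(X, W_theta, W_phi, W_g):
    """Data-dependent layer: affinities are computed from the features themselves."""
    theta, phi, g = X @ W_theta, X @ W_phi, X @ W_g
    scores = theta @ phi.T / np.sqrt(theta.shape[1])
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)      # row-wise softmax
    return X + attn @ g                          # residual; W_g must preserve the width of X

# Toy usage: a 4-keypoint chain with 6-dimensional node features.
rng = np.random.default_rng(0)
A_hat = normalized_adjacency([(0, 1), (1, 2), (2, 3)], num_nodes=4)
X = rng.normal(size=(4, 6))
H = structure_graph_layer(X, A_hat, rng.normal(size=(6, 6)))
H = non_local_layer(H, rng.normal(size=(6, 3)), rng.normal(size=(6, 3)), rng.normal(size=(6, 6)))
```

The contrast is the point of the sketch: the first layer can only pass information along skeleton edges, whereas the second can relate any pair of keypoints, such as a fingertip and a foot, in a single step.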
Summary
Human pose estimation is a challenging computer vision task, which aims to locate the human body keypoints in images and videos. Our main contributions are summarized as follows: (1) this work presents a novel graph convolutional network framework for whole-body human pose estimation tasks, which leverages the whole-body graph structure to analyze the semantics of each part of the body through the graph convolutional network; (2) we propose a novel heat-map-based keypoint embedding module, which encodes the position information and feature information of the keypoints of the human body; (3) the proposed semantic–structural graph convolutional network consists of a structure-based graph layer to capture skeleton structure information and a data-dependent non-local layer to analyze the long-range grouped joint features; (4) we represent groups of keypoints and construct a high-level abstract body graph to process the high-level semantic information of the whole-body keypoints.
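The grouping of keypoints into a high-level abstract body graph can be illustrated with a simple pooling scheme. The sketch below is only one possible reading and is not taken from the method itself: it assumes the 133-keypoint COCO whole-body layout (17 body, 6 foot, 68 face, and 21 keypoints per hand), uses those five parts as the groups, and pools by averaging. The `GROUPS` partition and the helper names are hypothetical.

```python
import numpy as np

# Assumed partition of the 133 COCO whole-body keypoints into coarse groups.
GROUPS = {
    "body": range(0, 17),
    "feet": range(17, 23),
    "face": range(23, 91),
    "left_hand": range(91, 112),
    "right_hand": range(112, 133),
}

def assignment_matrix(groups, num_keypoints):
    """Binary matrix S of shape (G, K) with S[g, k] = 1 if keypoint k is in group g."""
    S = np.zeros((len(groups), num_keypoints))
    for g, idx in enumerate(groups.values()):
        S[g, list(idx)] = 1.0
    return S

def pool_groups(X, S):
    """Average keypoint features (K, C) within each group to get group nodes (G, C)."""
    return (S @ X) / S.sum(axis=1, keepdims=True)

def unpool_groups(Z, S):
    """Broadcast group-level features (G, C) back to every member keypoint (K, C)."""
    return S.T @ Z
```

Under this reading, the group nodes would form a small graph on which high-level semantics are processed before being broadcast back to the individual keypoints.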