Abstract

Human pose estimation in images is challenging and important for many computer vision applications. Convolutional neural networks have brought large improvements in human pose estimation. Even so, in difficult cases even state-of-the-art models may fail to predict all the body joints correctly. Some recent works try to refine the pose estimator, and GANs (Generative Adversarial Networks) have proved effective at improving human pose estimation. However, a GAN can only learn local structural constraints among body joints. In this paper, we propose to apply Self-Attention GAN to further improve the performance of human pose estimation. With an attention mechanism in the GAN framework, we can learn long-range dependencies among body joints and thereby enforce structural constraints over the entire body, so that all body joints are consistent. Our method outperforms other state-of-the-art methods on two standard benchmark datasets for human pose estimation, MPII and LSP. Our code is available at: https://github.com/idotc/Hg-SAGAN.

Highlights

  • Human pose estimation (HPE) aims to predict the locations of body joints from input images

  • Motivated by Self-Attention GAN (SAGAN), we propose to apply self-attention Generative Adversarial Networks (GANs) to further improve the performance of human pose estimation

  • Although human pose estimation has been significantly advanced by deep learning, difficulties remain in cases of occlusion, overlap with other people, or cluttered backgrounds

Summary

INTRODUCTION

Human pose estimation (HPE) aims to predict the locations of body joints from input images. Although human pose estimation has been significantly advanced by deep learning, difficulties remain in cases of occlusion, overlap with other people, or cluttered backgrounds. In such cases, the model may pick up similar features that belong to the background or to another person. Recent works try to improve the performance of human pose estimation through refinement [10], [12], [14]; these refinement processes are shown to be effective because they learn structural constraints of human body joints. Self-attention can be complementary to convolutions and is better suited to capturing long-range, multi-level dependencies among widely separated body joints. Armed with self-attention, the discriminator can more accurately enforce global geometric structural constraints on the generated human pose.
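
To make the idea of long-range dependencies concrete, the following is a minimal NumPy sketch of a SAGAN-style self-attention layer applied to a feature map. It is an illustration under stated assumptions, not the authors' implementation: the function name `self_attention_2d`, the weight matrices `w_f`, `w_g`, `w_h` (standing in for 1x1 convolutions), and the scalar `gamma` are all hypothetical names chosen here. Every spatial position attends to every other position, so a response at one joint location can depend on features at a distant joint.

```python
import numpy as np


def self_attention_2d(x, w_f, w_g, w_h, gamma=0.0):
    """SAGAN-style self-attention over a feature map (illustrative sketch).

    x:        (C, H, W) feature map
    w_f, w_g: (C_k, C) projections, standing in for 1x1 convolutions
    w_h:      (C, C) value projection
    gamma:    scalar weight on the attention branch (assumed name)
    """
    c, h, w = x.shape
    flat = x.reshape(c, h * w)              # (C, N) with N = H * W positions
    f = w_f @ flat                          # (C_k, N) key features
    g = w_g @ flat                          # (C_k, N) query features
    v = w_h @ flat                          # (C, N) value features

    # scores[j, i] measures how much position j attends to position i.
    scores = g.T @ f                        # (N, N)
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)       # softmax over i; rows sum to 1

    # Each output position is a weighted sum of values from all positions.
    out = v @ attn.T                        # (C, N)
    return gamma * out.reshape(c, h, w) + x, attn


# Hypothetical usage on a random 8-channel, 4x4 feature map.
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 4, 4))
w_f = rng.standard_normal((2, 8))
w_g = rng.standard_normal((2, 8))
w_h = rng.standard_normal((8, 8))
y, attn = self_attention_2d(x, w_f, w_g, w_h, gamma=1.0)
```

The attention map has one row per spatial position and spans the whole feature map, which is what lets the layer relate widely separated joints. With `gamma` set to zero the layer reduces to the identity, so a network using it can start from purely local (convolutional) cues and weight the non-local branch as needed.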

TRAINING THE SELF-ATTENTION GAN
INFERENCE
EVALUATION METRICS
Findings
CONCLUSION