Abstract

Although pose estimation has improved significantly, most top-performing methods adopt complex architectures and incur large computational costs to reach higher accuracy. Because edge devices have limited resources, these methods struggle to maintain fast inference in practice. To address this issue, we propose a fast and lightweight human pose estimation method that maintains high performance at a lower computational cost. Specifically, the proposed method consists of two parts: the fast and lightweight pose network (FLPN) for pose estimation, and a novel lightweight bottleneck block for reducing computational cost. Integrating the simple network with the lightweight bottleneck yields an efficient method for accurate pose estimation. In the lightweight bottleneck block, we introduce the structural similarity measurement (SSIM) to refine the ratio of intrinsic feature maps and reduce the model size. An attention mechanism is also adopted in the block to model contextual information. We demonstrate the performance of the proposed method with extensive experiments on two standard benchmark datasets, comparing it with state-of-the-art methods. On the COCO keypoint detection dataset, our method attains accuracy similar to these state-of-the-art methods, whereas their computational cost is more than 7 times that of ours.
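The abstract's use of SSIM to judge how similar feature maps are can be illustrated with a minimal sketch. It is not the authors' procedure: `global_ssim` computes a simplified single-window SSIM from global statistics (standard SSIM averages over local windows), and `intrinsic_ratio`, its greedy selection rule, and the `threshold` value are hypothetical names and choices introduced here only for illustration.

```python
import numpy as np


def global_ssim(x, y, data_range=1.0, k1=0.01, k2=0.03):
    """Single-window SSIM between two equally shaped 2-D feature maps.

    Simplification (assumption): statistics are taken over the whole map
    instead of being averaged over local sliding windows.
    """
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    if x.shape != y.shape:
        raise ValueError("feature maps must have the same shape")
    c1 = (k1 * data_range) ** 2  # stabilizes the luminance term
    c2 = (k2 * data_range) ** 2  # stabilizes the contrast/structure term
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    num = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    den = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return float(num / den)


def _min_max(channel):
    """Rescale one channel to [0, 1]; constant channels become all zeros."""
    lo, hi = channel.min(), channel.max()
    if hi == lo:
        return np.zeros_like(channel, dtype=np.float64)
    return (channel - lo) / (hi - lo)


def intrinsic_ratio(feature_maps, threshold=0.9):
    """Hypothetical heuristic: greedily keep channels that are not
    SSIM-similar to any channel already kept, then report kept / total.
    """
    maps = [_min_max(np.asarray(c, dtype=np.float64)) for c in feature_maps]
    kept = []
    for i, m in enumerate(maps):
        if all(global_ssim(m, maps[j]) < threshold for j in kept):
            kept.append(i)
    return len(kept) / len(maps), kept
```

Under these assumptions, a layer whose channels are mostly near-duplicates would report a small ratio, which suggests that fewer intrinsic maps are needed and that the rest could be produced by cheaper operations.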

Highlights

  • Estimating human pose from input images can be reduced to precisely localizing human anatomical keypoints

  • As in many vision tasks, great advances in human pose estimation have been achieved by deep convolutional neural networks (DCNNs) [10], [12], [13], [15], [18], [19], [24], [25], [29]–[33]

  • We propose a novel lightweight human pose estimation method by redesigning a simple network (FLPN) with several groups of lightweight bottleneck (Smart bottleneck) blocks

Summary

Introduction

Estimating human pose from input images can be reduced to precisely localizing human anatomical keypoints (elbows, wrists, knees, etc.). Human pose estimation, a fundamental task in computer vision, is widely used for action recognition [24], [25], pose tracking [26], and human-computer interaction [27]. Multiple tasks related to human pose estimation have been extensively studied in various fields [28]–[30], [33]. As in many vision tasks, great advances in human pose estimation have been achieved by deep convolutional neural networks (DCNNs) [10], [12], [13], [15], [18], [19], [24], [25], [29]–[33]. Through the pioneering work in [20], [31], the performance on the two baseline
