Abstract

Human pose estimation (HPE) is an active research topic in computer vision. Most existing approaches first reduce a high-resolution representation to a low-resolution one through a series of downsampling steps, and then reconstruct high-resolution results from the low-resolution features through a series of upsampling steps. This process discards much useful feature information and slows model inference. This paper proposes the Fast Accuracy Network (FANet), a framework for fast and accurate HPE. It makes two contributions. First, it adopts a grid structure that can be regarded as a set of deep and shallow paths. The structure uses multiple pairs of high-resolution and low-resolution branches, linked by skip-level connections at different scale-space levels, so that information is exchanged repeatedly between representations of different resolutions. Fusing features across scales in this way yields richer feature information. Second, it introduces an improved bottleneck block that extracts effective features with fewer parameters, reducing the computational burden without sacrificing accuracy. Experimental results show that, compared with other current models, FANet achieves faster inference together with a slight improvement in accuracy.
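
To give a rough sense of why a bottleneck design can cut parameters, the sketch below counts the convolution weights of a standard ResNet-style bottleneck (1x1 reduce, 3x3, 1x1 expand) against a block of two plain 3x3 convolutions. This is not FANet's improved bottleneck block, whose layout the abstract does not specify; the channel width of 256, the reduction factor of 4, and all function names are assumptions chosen only for illustration.

```python
def conv_params(in_ch, out_ch, kernel, bias=False):
    """Learnable parameters in a 2D convolution with a square kernel."""
    weights = in_ch * out_ch * kernel * kernel
    return weights + (out_ch if bias else 0)


def basic_block_params(channels):
    """Two stacked 3x3 convolutions at constant width (hypothetical baseline)."""
    return 2 * conv_params(channels, channels, 3)


def bottleneck_params(channels, reduction=4):
    """Standard ResNet-style bottleneck: 1x1 reduce, 3x3, 1x1 expand.

    Illustrative only; the paper's improved block may differ.
    """
    mid = channels // reduction
    return (
        conv_params(channels, mid, 1)    # 1x1 conv shrinks the channel count
        + conv_params(mid, mid, 3)       # 3x3 conv runs at the reduced width
        + conv_params(mid, channels, 1)  # 1x1 conv restores the channel count
    )


channels = 256  # assumed width, not taken from the paper
basic = basic_block_params(channels)
bottle = bottleneck_params(channels)
print(f"basic block: {basic} parameters")
print(f"bottleneck:  {bottle} parameters")
print(f"ratio:       {basic / bottle:.1f}x")
```

Under these assumptions the expensive 3x3 convolution runs on a quarter of the channels, which is where most of the saving comes from; normalization layers and biases are ignored for simplicity.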
