Abstract

We propose the Context-aware Feature Transformer Network (CaFTNet), a novel network for human pose estimation. To address the limited ability of convolutional neural networks to model global dependencies, we design the Transformerneck to strengthen the expressive power of features. The Transformerneck directly replaces the 3×3 convolution in the bottleneck of HRNet with a Contextual Transformer (CoT) block, reducing the complexity of the network. Specifically, the CoT block first produces keys with static contextual information through a 3×3 convolution. Then, relying on the queries and the contextualized keys, dynamic contexts are generated through two consecutive 1×1 convolutions. The static and dynamic contexts are finally fused as the output. Additionally, to further refine the fused features of the multi-scale network, we propose an Attention Feature Aggregation Module (AFAM). Technically, given an intermediate input, the AFAM successively infers attention maps along the channel and spatial dimensions. An adaptive refinement module (ARM) then activates the obtained attention maps. Finally, the input is adaptively refined through multiplication with the activated attention maps. Through these procedures, our lightweight network provides powerful cues for keypoint detection. Experiments are performed on the COCO and MPII datasets. The model achieves 76.2 AP on the COCO val2017 dataset. Compared to other methods that use a CNN as the backbone, CaFTNet reduces the number of parameters by 72.9%. On the MPII dataset, our method uses only 60.7% of the parameters of CNN-backbone methods while obtaining comparable results.
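
For illustration, the following is a minimal PyTorch-style sketch of the CoT computation summarized above: static context from a 3×3 convolution on the keys, an attention map from two consecutive 1×1 convolutions over the concatenated queries and contextualized keys, and a fusion of static and dynamic contexts. This is an assumption-laden simplification, not the authors' implementation; in particular, the sigmoid-gated elementwise aggregation below stands in for the local (k×k) attention of the original CoT block, and the name CoTSketch is hypothetical.

```python
import torch
import torch.nn as nn


class CoTSketch(nn.Module):
    """Simplified CoT-style block (illustrative sketch only)."""

    def __init__(self, channels: int):
        super().__init__()
        # 3x3 convolution producing keys with static contextual information
        self.key_embed = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )
        # value embedding via a 1x1 convolution
        self.value_embed = nn.Sequential(
            nn.Conv2d(channels, channels, 1, bias=False),
            nn.BatchNorm2d(channels),
        )
        # two consecutive 1x1 convolutions applied to [query, static keys]
        self.attention = nn.Sequential(
            nn.Conv2d(2 * channels, channels, 1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        query = x                          # queries are the input features
        k_static = self.key_embed(x)       # static context (3x3 conv on keys)
        value = self.value_embed(x)
        # attention weights from the concatenated query and contextualized keys
        attn = torch.sigmoid(self.attention(torch.cat([query, k_static], dim=1)))
        k_dynamic = attn * value           # simplified dynamic context
        return k_static + k_dynamic        # fuse static and dynamic contexts
```

Similarly, a hedged sketch of the AFAM refinement step is given below: channel and spatial attention maps are inferred in sequence (a CBAM-style design is assumed here), activated by a simple learnable-gate stand-in for the ARM, and multiplied with the input features. The actual AFAM and ARM architectures may differ; AFAMSketch and its gate parameters are hypothetical.

```python
import torch
import torch.nn as nn


class AFAMSketch(nn.Module):
    """Channel-then-spatial attention refinement (illustrative sketch only)."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        # channel attention: global average pooling followed by a bottleneck MLP
        self.channel_mlp = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
        )
        # spatial attention: 7x7 convolution over pooled channel statistics
        self.spatial_conv = nn.Conv2d(2, 1, kernel_size=7, padding=3)
        # ARM stand-in: learnable gates that adaptively activate the attention maps
        self.arm_channel = nn.Parameter(torch.ones(1))
        self.arm_spatial = nn.Parameter(torch.ones(1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # channel attention map, activated and multiplied with the input
        ca = self.channel_mlp(x.mean(dim=(2, 3), keepdim=True))
        x = x * torch.sigmoid(self.arm_channel * ca)
        # spatial attention map from mean- and max-pooled channel descriptors
        pooled = torch.cat([x.mean(dim=1, keepdim=True),
                            x.max(dim=1, keepdim=True).values], dim=1)
        sa = self.spatial_conv(pooled)
        return x * torch.sigmoid(self.arm_spatial * sa)
```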
