Deep Layer and Spatial Aggregation neural network for human pose estimation

Deyuan Zhang, Haoguang Wang, Xiangbin Shi, Chao Weng

doi:10.1088/1742-6596/2010/1/012166

Abstract

The SimpleBaseline model achieves high performance in human pose estimation with a simple network structure, but it lacks fusion of layer and spatial information. In this paper, we propose DLSAnet, which fuses layer and spatial information effectively. DLSAnet uses DLA (Deep Layer Aggregation) as its backbone, which has excellent feature extraction capabilities in the field of object detection. In addition, a modified spatial pyramid pooling (SPP) module is introduced to pool and concatenate multi-scale local-region features, allowing the network to learn object features more comprehensively. The modified module uses four branches instead of a single branch connected by a single skip connection. This design is effective in alleviating the slow decrease of the loss late in training. Experiments show that DLSAnet can achieve better accuracy.
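The multi-branch pooling idea can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the abstract does not give the branch layout or kernel sizes, so the arrangement below (one identity branch plus three stride-1 max-pooling branches with kernels 5, 9 and 13, concatenated along the channel axis) is an assumption borrowed from common SPP variants, and the names `max_pool_same` and `spp_branches` are hypothetical.

```python
from typing import List, Sequence

FeatureMap = List[List[float]]  # a single channel, H x W


def max_pool_same(fmap: FeatureMap, kernel: int) -> FeatureMap:
    """Stride-1 max pooling that preserves H x W.

    The window is clipped at the borders, which is equivalent to
    padding the input with negative infinity.
    """
    if kernel % 2 == 0:
        raise ValueError("kernel must be odd so the window is centred")
    h, w = len(fmap), len(fmap[0])
    r = kernel // 2
    out: FeatureMap = []
    for i in range(h):
        row = []
        for j in range(w):
            window = [
                fmap[y][x]
                for y in range(max(0, i - r), min(h, i + r + 1))
                for x in range(max(0, j - r), min(w, j + r + 1))
            ]
            row.append(max(window))
        out.append(row)
    return out


def spp_branches(
    channels: List[FeatureMap],
    kernels: Sequence[int] = (5, 9, 13),  # assumed sizes, not from the paper
) -> List[FeatureMap]:
    """Concatenate the input channels with one pooled copy per kernel size.

    With three kernels this gives four branches in total:
    the identity branch plus three pooling branches.
    """
    out = list(channels)  # identity branch
    for k in kernels:
        out.extend(max_pool_same(c, k) for c in channels)
    return out
```

With the default kernels, an input of C channels yields 4C output channels of unchanged spatial size, so each location carries features pooled over several neighbourhood scales. In a real network a convolution would typically follow to mix and reduce these channels.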

Full Text