Abstract

This paper presents a compact algorithm, based on a convolutional neural network (CNN), that estimates head pose from a single image. The method builds on soft stagewise regression. To reduce model complexity, the three head pose angles (pitch, yaw, and roll) are estimated by multi-level classification, in which each level handles only a small number of classes and therefore needs fewer neurons. To strengthen feature representation, an attention model is embedded in the network. It consists of a channel attention structure and a spatial attention structure, which refine the intermediate feature map along its channel and spatial dimensions respectively. The attention model can be integrated seamlessly into the CNN architecture with low overhead. Experiments show that, compared with the model proposed by Yang, the improved algorithm has a smaller model complexity of 4.36M and a mean absolute error of 0.7%~0.9%.
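The coarse-to-fine idea behind soft stagewise regression can be illustrated with a minimal Python sketch. It is not the paper's implementation: it assumes fixed, uniform bins at every stage and an angle range of -90 to 90 degrees, and the function name `soft_stagewise_estimate` and its parameters are hypothetical. Each stage outputs a probability distribution over a few bins, and the expected bin index at each stage refines the estimate within the bin width left by the previous stage.

```python
def soft_stagewise_estimate(stage_probs, value_range=(-90.0, 90.0)):
    """Combine per-stage bin probabilities into one angle estimate.

    stage_probs: list of probability lists, coarsest stage first
                 (hypothetical network outputs, each summing to 1).
    value_range: assumed (low, high) range of the angle in degrees.
    """
    low, high = value_range
    width = high - low      # width of the interval still to be resolved
    estimate = low
    for probs in stage_probs:
        width /= len(probs)  # each stage splits the interval into its bins
        # "Soft" step: use the expected bin index, not the argmax.
        expected_index = sum(i * p for i, p in enumerate(probs))
        estimate += expected_index * width
    return estimate


# Three stages of three bins each: 9 outputs in total, versus the
# 27 outputs a single flat classifier needs for the same resolution.
yaw = soft_stagewise_estimate([[0.0, 1.0, 0.0],
                               [0.0, 0.0, 1.0],
                               [1.0, 0.0, 0.0]])
```

With these one-hot inputs the first stage selects the middle 60-degree bin, the second the top 20-degree bin inside it, and the third adds no further offset. The same function would be applied separately to pitch, yaw, and roll.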
