Abstract

Real-time face tracking has become a significant research problem in the analysis of facial expressions for human-computer interaction. Traditional face tracking methods achieve good results in constrained environments (e.g., good illumination and no background interference). However, these methods rely on manually designed facial features that depend on the researcher's experience, and their limited generalization ability remains an open problem. Robust face tracking in complex scenes is also challenging due to fast motion, multi-scale changes, rotation, occlusion, and illumination changes. In view of these considerations, this paper proposes an improved Siamese-network-based method optimized for face tracking. Our work comprises four main aspects. First, the first two convolutional layers of the deeper VGG-16 network are used for feature extraction, so we call our method Siamese-VGG. Second, we initialize the network with a pre-trained VGG-Face model, trained on 2.6M images for face recognition, and fine-tune it to accelerate convergence. Third, crops of the same size are fed to the two branches of the framework, and the smaller inner template feature maps are extracted during training, which reduces offset losses. Finally, L2 regularization is added to the loss function to improve the generalization ability of the model. Experimental results show better robustness and generalization than the original algorithm: in complex scenes, the improved method improves average overlap by almost 11% while still running at 18.5 fps on an Nvidia GTX 1070 Ti GPU. The proposed method is therefore more practical in terms of both speed and accuracy.
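To make the described architecture concrete, the sketch below builds a Siamese tracker in PyTorch whose shared backbone is a truncated VGG-16 (interpreting "the first two convolutional layers" as the first two convolutional blocks, an assumption), crops the inner template feature map from the center, correlates it against the search features SiamFC-style, and applies L2 regularization as weight decay. All names, crop sizes, and coefficients are illustrative assumptions, not the authors' implementation; in the paper, the backbone would be initialized from the pre-trained VGG-Face model rather than stock torchvision weights.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import vgg16


class SiameseVGG(nn.Module):
    """Siamese tracker with a shared backbone taken from the start of
    VGG-16 (here: conv1_1..conv2_2 plus pooling, i.e. features[:10])."""

    def __init__(self, template_size=6):
        super().__init__()
        # The paper initializes from VGG-Face (2.6M face images) and
        # fine-tunes; the stock torchvision VGG-16 is a stand-in here.
        self.backbone = vgg16().features[:10]
        self.template_size = template_size  # assumed inner-template size

    def forward(self, template, search):
        # Both branches receive crops of the same size; the smaller
        # template feature map is then cropped from the centre, one way
        # to realize the offset-reducing step described in the abstract.
        z = self.backbone(template)                   # (B, C, Hz, Wz)
        x = self.backbone(search)                     # (B, C, Hx, Wx)
        z = self._center_crop(z, self.template_size)  # (B, C, s, s)

        # SiamFC-style cross-correlation: each sample's template features
        # act as a convolution kernel over its own search features.
        b, c, h, w = x.shape
        score = F.conv2d(x.reshape(1, b * c, h, w), z, groups=b)
        return score.reshape(b, 1, score.size(-2), score.size(-1))

    @staticmethod
    def _center_crop(feat, size):
        _, _, h, w = feat.shape
        top, left = (h - size) // 2, (w - size) // 2
        return feat[:, :, top:top + size, left:left + size]


model = SiameseVGG()
# L2 regularization on the weights, implemented here as weight decay
# (the coefficient is an assumption).
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, weight_decay=5e-4)

# Same-size crops for both branches, as in the abstract (sizes assumed).
template = torch.randn(4, 3, 128, 128)
search = torch.randn(4, 3, 128, 128)
scores = model(template, search)  # per-sample response maps
```

Implementing the L2 term as the optimizer's weight decay is mathematically equivalent to adding a penalty proportional to the squared weight norm to the loss for plain SGD; the grouped convolution trick simply batches the per-sample template/search correlations into a single call.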
