Abstract
Generic object detectors have recently made tremendous strides, but challenges remain when they are applied to face detection. This paper proposes a novel method, the multilevel single stage network for face detection (MSNFD). It makes three contributions. First, a multilevel network is introduced into face detection to improve the efficiency of anchoring faces. Second, an enhanced feature module is adopted so that more feature information can be collected. Finally, a two-stage weight loss function is employed to balance the networks of different levels. Experimental results on the WIDER FACE and FDDB datasets confirm that MSNFD achieves accuracy competitive with mainstream methods while maintaining real-time performance.
Highlights
Face detection, the basis of face alignment [1, 2], face recognition [3, 4], facial expression analysis [5, 6], and other related facial problems, has long been an active research topic and is widely applied in computer vision
With the development of deep learning and convolutional neural networks (CNNs), great advances have been achieved and put into practice in image classification [9] and semantic segmentation [10], inspiring research on face detectors, which can be classified into two modes
When only the enhanced feature module is added to YOLO v3, average precision (AP) increases from 58.5% to 65.5% on the small set of Multiattribute Labelled Faces (MALF) and from 47.3% to 52.3% on the hard set of WIDER FACE, because the enhanced feature module combines the contextual information of small faces to improve the representational ability of the features
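The reported gains can be restated as absolute and relative improvements. The short calculation below uses only the AP figures quoted above; `ap_gain` is an illustrative helper name, not something taken from the paper.

```python
def ap_gain(baseline, improved):
    """Return (absolute gain in AP points, relative gain in percent)."""
    absolute = improved - baseline
    relative = 100.0 * absolute / baseline
    return absolute, relative


# AP values (in percent) quoted in the text for YOLO v3 without and with
# the enhanced feature module.
malf_abs, malf_rel = ap_gain(58.5, 65.5)    # MALF, small set
wider_abs, wider_rel = ap_gain(47.3, 52.3)  # WIDER FACE, hard set

print(f"MALF small set:      +{malf_abs:.1f} points ({malf_rel:.1f}% relative)")
print(f"WIDER FACE hard set: +{wider_abs:.1f} points ({wider_rel:.1f}% relative)")
```

In other words, the module alone accounts for roughly 7 AP points on the MALF small set and 5 AP points on the WIDER FACE hard set, a relative improvement of a little over 10% in both cases.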
Summary
Face detection, the basis of face alignment [1, 2], face recognition [3, 4], facial expression analysis [5, 6], and other related facial problems, has long been an active research topic and is widely applied in computer vision. SSD is an excellent one-stage method that uses multiscale feature maps, but it performs worse on small objects because the semantic value of its bottom layers is not high, or at least not higher than that of YOLO v3. A multilevel single stage network for face detection based on YOLO v3 is designed, and it achieves excellent results on the public datasets, especially for small faces. The contributions are as follows:
(1) Using a multilevel network structure with more anchor scales to detect smaller faces
(2) Adopting an enhanced feature module that adds contextual and multiscale information to improve the detection of small faces
(3) Balancing the outputs of the networks at each level with a two-stage weight loss function to optimize network training
(4) Achieving excellent results on the FDDB and WIDER FACE datasets at real-time detection speed
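To illustrate why contribution (1) can help with small faces, the sketch below compares how well a small face overlaps with its best-matching anchor under two anchor sets. It is a simplified illustration, not the paper's implementation: boxes are assumed to be squares sharing the same center, and the anchor sizes in `coarse_anchors` and `dense_anchors` are hypothetical values chosen for the example.

```python
def iou_centered(face_size, anchor_size):
    """IoU of two square boxes that share the same center (sizes in pixels)."""
    inter = min(face_size, anchor_size) ** 2
    union = face_size ** 2 + anchor_size ** 2 - inter
    return inter / union


def best_anchor_iou(face_size, anchor_sizes):
    """Highest IoU between the face and any anchor in the set."""
    return max(iou_centered(face_size, a) for a in anchor_sizes)


# Hypothetical anchor side lengths; NOT the values used in the paper.
coarse_anchors = [32, 64, 128, 256]
dense_anchors = [8, 16, 32, 64, 128, 256]  # adds smaller scales

face = 12  # side length of a small face, in pixels
coarse_iou = best_anchor_iou(face, coarse_anchors)
dense_iou = best_anchor_iou(face, dense_anchors)

print(f"best IoU with coarse anchors: {coarse_iou:.3f}")
print(f"best IoU with dense anchors:  {dense_iou:.3f}")
```

Under these assumptions, the 12-pixel face reaches an IoU of only about 0.14 with the smallest coarse anchor, which is below the positive-match thresholds commonly used in anchor-based detectors, while the set with extra small scales reaches about 0.56.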