Abstract

Rotation-invariant face detection (RIFD) aims to detect faces with arbitrary rotation-in-plane (RIP) in both images and video sequences. The task is challenging because rotated faces vary widely in appearance under unconstrained scenarios, and it becomes harder still when the speed and memory efficiency of the detector are taken into account. To address this problem, we propose Multi-Task Progressive Calibration Networks (MTPCN), which not only enjoy the natural advantage of an explicit geometric structure representation but also retain important cues to guide feature learning for precise calibration. More concretely, our framework uses a cascaded architecture with three stages of carefully designed convolutional neural networks to predict face and landmark locations over gradually decreasing RIP ranges. In addition, we propose a novel loss that integrates geometric information into the penalization, which is more reasonable than simply measuring the differences of all training samples equally. MTPCN achieves significant performance improvements on the multi-oriented FDDB dataset and the rotation WIDER face dataset. Extensive experiments demonstrate the effectiveness of our method on each of the three tasks.
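The idea of predicting over gradually decreasing RIP ranges can be illustrated with a minimal sketch. The abstract does not give the per-stage ranges, so the split used here (a 180° flip decision, then a choice among -90°, 0°, and 90°, then a fine regression) is an assumption borrowed from the general progressive-calibration scheme, not necessarily MTPCN's configuration. The three `stage*` functions are hypothetical stand-ins that read the true angle directly; in a real detector each would be a CNN prediction.

```python
def normalize_angle(deg: float) -> float:
    """Wrap an angle in degrees to the half-open interval [-180, 180)."""
    return (deg + 180.0) % 360.0 - 180.0


# Hypothetical stand-ins for the three networks (assumption: each stage
# is an oracle on the remaining angle; a real stage would be a CNN).
def stage1_flip(remaining: float) -> float:
    """Coarse up/down decision: rotate by 180 degrees if the face is upside down."""
    return 180.0 if abs(remaining) > 90.0 else 0.0


def stage2_quarter(remaining: float) -> float:
    """Three-way decision among -90, 0, and 90 degrees."""
    if remaining > 45.0:
        return 90.0
    if remaining < -45.0:
        return -90.0
    return 0.0


def stage3_regress(remaining: float) -> float:
    """Fine regression of the residual angle."""
    return remaining


def calibrate(true_rip: float):
    """Run the cascade; return the accumulated angle and the residual after each stage."""
    remaining = normalize_angle(true_rip)
    total = 0.0
    residuals = []
    for stage in (stage1_flip, stage2_quarter, stage3_regress):
        step = stage(remaining)
        total += step
        remaining = normalize_angle(remaining - step)
        residuals.append(remaining)
    return normalize_angle(total), residuals


if __name__ == "__main__":
    estimate, residuals = calibrate(130.0)
    print(estimate, residuals)
```

Under these assumptions the residual angle lies within [-90°, 90°] after the first stage and within [-45°, 45°] after the second, so each later stage solves a narrower and easier problem than the one before it.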
