Abstract

ConvNets have significantly improved performance on face recognition tasks. However, less attention has been given to face verification from videos. This paper makes two contributions along these lines. First, we propose a method, called stream loss, for learning ConvNets from unlabeled videos in the wild. Second, we present an approach for generating a face verification dataset from videos in which the labeled streams are created automatically, without human annotation. Using this approach, we have assembled a widely scalable dataset, FaceSequence, which includes 1.5M streams capturing ∼500K individuals. Using this dataset, we trained our network to minimize the stream loss. The network achieves accuracy comparable to the state of the art on the LFW and YTF datasets with much smaller model complexity. We also fine-tuned the network on the IJB-A dataset. The validation results show accuracy competitive with the best previous video face verification results.

