Abstract

Automatically linking faces in Web videos with the names scattered in the surrounding text (e.g., the user-generated title and tags) is an important task for many applications. Traditionally, this task is accomplished either by jointly exploring visual-textual consistency under constraints, or by leveraging external resources, e.g., public facial images. This paper follows the second paradigm and implements name-face association by matching faces appearing in Web videos against carefully collected Web facial images. Specifically, given a Web video, we first identify the relevant and discriminative tags in its surrounding text. These tags are called Contextual Tags (CTags) because they roughly convey the semantic context of the video (e.g., who is doing what, when, and where). Facial images are then retrieved by submitting assembled text queries to a commercial search engine, where each query contains a detected name and one of the top CTags. In this way, we crawl facial images that are highly relevant to the person in the video context, so that name-face association reduces to face matching. Compared with traditional methods, our novelty lies in exploiting both the visual content of the video and the crowdsourced text of its context to find more specific facial images from the Web to facilitate the association. Experimental results on real-world Web videos containing faces and celebrity names show that the proposed method outperforms several existing methods.
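The two steps described in the abstract, assembling one query per (name, CTag) pair and then associating names by face matching, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the CTag scores, the use of cosine similarity over feature vectors, and the threshold value are all assumptions made for illustration, and the crawling step via a search engine is left out.

```python
import math


def assemble_queries(names, ctag_scores, top_k=3):
    """Pair each detected name with each of the top-k CTags.

    ctag_scores maps a tag to a hypothetical relevance score;
    how the paper actually ranks CTags is not stated in the abstract.
    """
    ranked = sorted(ctag_scores.items(), key=lambda kv: kv[1], reverse=True)
    top_tags = [tag for tag, _ in ranked[:top_k]]
    return {name: [f"{name} {tag}" for tag in top_tags] for name in names}


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def associate(video_faces, web_faces_by_name, threshold=0.5):
    """Give each video face the name whose crawled images match it best.

    A face stays unnamed (None) when no gallery image scores above the
    threshold. The threshold of 0.5 is an arbitrary illustrative value.
    """
    result = {}
    for face_id, feature in video_faces.items():
        best_name, best_score = None, threshold
        for name, gallery in web_faces_by_name.items():
            score = max((cosine(feature, g) for g in gallery), default=0.0)
            if score > best_score:
                best_name, best_score = name, score
        result[face_id] = best_name
    return result
```

In this sketch, the queries returned by `assemble_queries` would be sent to an image search engine, and features extracted from the returned images would populate `web_faces_by_name` before `associate` is called.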
