Abstract

Content-based video retrieval has been an active research area for many decades. Unlike tag-based search engines, which rely on user-assigned annotations to retrieve the desired content, content-based retrieval systems match the actual content of a video against the provided query to fetch the required set of videos. Thanks to recent advances in deep learning, the traditional pipeline of content-based systems (pre-processing, segmentation, object classification, action recognition, etc.) is being replaced by end-to-end trainable systems, which are not only effective and robust but also avoid the complex processing of conventional image-based techniques. The present study builds on these developments to develop a semantic video retrieval system that accepts natural language queries and retrieves the relevant videos. In the current study, we focus on queries about key individuals appearing in certain scenarios. Persons appearing in a video are recognized by tuning FaceNet to our set of images, while caption generation is used to make sense of the scenario within a given video frame. The outputs of the two modules are combined to generate a description of the frame. During the retrieval phase, natural language queries are provided to the system, and word embeddings are employed to find words similar to those appearing in the query text. For a given query, the system returns all videos in which the queried individuals and scenarios appear. A preliminary experimental study on a collection of 50 videos yielded promising retrieval results.
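The retrieval phase described above can be illustrated with a minimal sketch. It is not the system evaluated in the study: the frame index, the person labels, the captions and the three-dimensional word vectors below are all hypothetical stand-ins for the outputs of the face recognition and caption generation modules and for pretrained word embeddings, and the names `expand` and `retrieve` as well as the similarity threshold are assumptions made for illustration only.

```python
import math

# Toy word vectors (hypothetical values; a real system would load
# pretrained embeddings instead of this hand-written table).
EMBEDDINGS = {
    "speech": (0.9, 0.1, 0.0),
    "talk": (0.85, 0.2, 0.05),
    "address": (0.8, 0.15, 0.1),
    "cricket": (0.0, 0.9, 0.2),
    "match": (0.1, 0.85, 0.3),
}

# Hypothetical frame descriptions: recognized persons plus a generated
# caption, standing in for the combined output of the two modules.
FRAME_INDEX = [
    {"video": "v1.mp4", "persons": {"alice"},
     "caption": "a woman giving a talk at a podium"},
    {"video": "v2.mp4", "persons": {"alice"},
     "caption": "a woman watching a cricket match"},
    {"video": "v3.mp4", "persons": {"bob"},
     "caption": "a man giving a speech"},
]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand(word, threshold=0.95):
    """Return the word plus all vocabulary words at least `threshold` similar."""
    if word not in EMBEDDINGS:
        return {word}  # unknown words are matched literally
    target = EMBEDDINGS[word]
    similar = {w for w, vec in EMBEDDINGS.items()
               if cosine(target, vec) >= threshold}
    return similar | {word}


def retrieve(person, scenario_word):
    """List videos where the person appears and the caption mentions
    the scenario word or one of its embedding neighbours."""
    terms = expand(scenario_word)
    hits = []
    for frame in FRAME_INDEX:
        caption_words = set(frame["caption"].lower().split())
        if person in frame["persons"] and terms & caption_words:
            if frame["video"] not in hits:
                hits.append(frame["video"])
    return hits


if __name__ == "__main__":
    print(retrieve("alice", "speech"))
```

In this toy setup the query word "speech" is expanded to include "talk", so the frame captioned with "talk" is matched even though the query word itself never occurs in the caption. Matching on the recognized person and the expanded scenario terms together mirrors the combination of the two module outputs described above.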

