Abstract

Large-scale content-based semantic search in video is a fundamental problem in multimedia analysis and retrieval. Existing methods index a video by its raw concept detection scores, which are dense and inconsistent, and therefore cannot scale to the big data readily available on the Internet. This paper proposes a scalable solution. The key is a novel step called concept adjustment, which represents a video by a few salient and consistent concepts that can be efficiently indexed by a modified inverted index. The proposed adjustment model relies on a concise optimization framework with intuitive interpretations. The proposed index builds on the text-based inverted index for video retrieval. Experimental results validate the efficacy and efficiency of the proposed method, showing that it scales up semantic search while maintaining state-of-the-art search performance. Specifically, the proposed method (with reranking) achieves the best result on the challenging TRECVID Multimedia Event Detection (MED) zero-example task, and it takes only 0.2 seconds on a single CPU core to search a collection of 100 million Internet videos.
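
To make the indexing idea concrete, below is a minimal sketch (not the authors' implementation) of how a sparse, adjusted concept representation can be stored in and queried through a text-style inverted index. The concept names, the top-k truncation used as a stand-in for concept adjustment, and the additive scoring rule are illustrative assumptions rather than details from the paper.

    # Illustrative sketch: after "adjustment", each video keeps only a few salient
    # concepts, which are indexed and searched like terms in a text inverted index.
    from collections import defaultdict

    def adjust_concepts(raw_scores, k=3):
        """Keep only the k most salient concepts as a sparse representation (assumed stand-in for concept adjustment)."""
        top = sorted(raw_scores.items(), key=lambda kv: kv[1], reverse=True)[:k]
        return {concept: score for concept, score in top if score > 0}

    def build_inverted_index(videos):
        """Map each concept to the (video_id, score) postings of videos that contain it."""
        index = defaultdict(list)
        for video_id, raw_scores in videos.items():
            for concept, score in adjust_concepts(raw_scores).items():
                index[concept].append((video_id, score))
        return index

    def search(index, query_concepts):
        """Rank videos by the sum of their indexed scores over the queried concepts."""
        scores = defaultdict(float)
        for concept in query_concepts:
            for video_id, score in index.get(concept, []):
                scores[video_id] += score
        return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

    # Toy usage: only the few retained concepts per video are ever touched at query time.
    videos = {
        "v1": {"dog": 0.9, "beach": 0.7, "car": 0.1},
        "v2": {"birthday": 0.8, "cake": 0.6, "dog": 0.05},
    }
    index = build_inverted_index(videos)
    print(search(index, ["dog", "beach"]))  # v1 ranks first: it matches both query concepts

Because each video contributes only a handful of postings, query cost depends on the length of the relevant posting lists rather than on the total number of videos, which is what allows this style of index to scale.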
