Abstract

This chapter reviews and discusses recent research progress in multimodal analysis, representation, summarization, browsing, and retrieval. It introduces the video table of contents (ToC), the highlights, and the index, and presents techniques for constructing them. It further proposes a unified framework for video summarization, browsing, and retrieval that lets a user go back and forth between browsing and retrieval. Weighted links are an essential part of this framework. For scripted content, the links can be established between index entities and the scenes, groups, shots, and key frames in the ToC structure; for unscripted content, they can be established between index entities and finer-resolution highlights, highlight candidates, audio-visual markers, and plays/breaks. For scripted content, the focus is on the links between index entities and shots, which are the building blocks of the ToC. For unscripted content, an example of going from the visual index to the highlights is shown. The chapter also recapitulates the key components of video highlights extraction and video retrieval, where video retrieval is concerned with returning similar video clips to a user given a video query.

