Effective video hyperlinking by means of enriched feature sets and monomodal query combinations

Mohammad Reza Kavoosifar,Paolo Garza,Benoit Huet,Elena Baralis,Daniele Apiletti

doi:10.1007/s13735-019-00173-y

Abstract

Video content has been increasing at an unprecedented rate in recent years, bringing the need for improved tools providing efficient access to specific contents of interest. Within the management of video content, hyperlinking aims at determining related video segments from a collection with respect to an input video anchor. This paper describes the system we designed to address feature selection for the video hyperlinking challenge, as defined by TRECVID, one of the top worldwide venues for multimedia benchmarking. The proposed solution is based on different combinations of textual and visual features, enriched to capture the various facets of the videos: automatically generated transcripts, visual concepts, video metadata, named-entity recognition, and concept-mapping techniques. The different combinations of monomodal queries are experimentally evaluated, and the impact of both parameters and single features are discussed to identify their contributions. The best performing approach at the TRECVID 2017 video hyperlinking challenge was the ensemble feature selection, which includes three different monomodal queries based on enriched feature sets.

Full Text