Abstract

With the explosive growth of video data on the Internet, effectively retrieving and detecting similar video content has become a challenging problem. Although hashing is a mature technique for this problem, especially in image retrieval, applying hashing techniques to videos consistently yields a large performance gap relative to image hashing because of the complicated structure of videos. Existing video hashing methods generally apply image hashing approaches directly to video frames without considering temporal structure, leading to low retrieval performance. In this study, we propose a video hashing method, called classification-enhancement deep hashing (CEDH), for large-scale video search. The proposed CEDH first fuses the spatial–temporal information of videos in a deep end-to-end hashing network, and then leverages both the semantic neighborhood structure and triplet similarity information to learn video hash codes. To further improve the precision of the hash codes during learning, a classification module is added after the fully connected layer of the deep network. We also impose an additional code constraint to ensure the hash codes carry sufficient information. Extensive experiments on three real-world large-scale video datasets show that our proposed method significantly outperforms state-of-the-art algorithms.
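To make the triplet-similarity idea concrete, the sketch below shows a Hamming-distance triplet hinge loss over sign-binarized codes. This is a generic illustration of triplet-based hash learning, not the paper's exact formulation: the function names, the margin value, and the hard-sign quantization are all assumptions for demonstration (deep hashing methods typically optimize a continuous relaxation during training).

```python
# Illustrative sketch of a triplet objective for hash learning (assumed
# formulation; CEDH's actual loss and relaxation may differ).

def binarize(real_code):
    """Quantize a real-valued network output to a +/-1 hash code."""
    return [1 if v >= 0 else -1 for v in real_code]

def hamming(a, b):
    """Hamming distance between two +/-1 codes of equal length."""
    return sum(1 for x, y in zip(a, b) if x != y)

def triplet_loss(anchor, positive, negative, margin=2):
    """Hinge loss: the similar pair should be closer in Hamming space
    than the dissimilar pair by at least `margin` bits."""
    d_pos = hamming(anchor, positive)
    d_neg = hamming(anchor, negative)
    return max(0, d_pos - d_neg + margin)

# Example with hypothetical 8-bit network outputs:
anchor   = binarize([0.9, -0.2, 0.4, -0.7, 0.1, 0.3, -0.5, 0.8])
positive = binarize([0.7, -0.1, 0.5, -0.6, 0.2, 0.1, -0.4, 0.9])   # same class
negative = binarize([-0.6, 0.8, -0.3, 0.5, -0.9, -0.2, 0.7, -0.1])  # different class
print(triplet_loss(anchor, positive, negative))  # → 0 (triplet already satisfied)
```

In practice the network's classification branch and the code constraint described in the abstract would be added as extra loss terms alongside this triplet term, and the binarization would be relaxed (e.g. with tanh) to keep the objective differentiable.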
