Abstract
With the explosive growth of video data on the Internet, effectively retrieving and detecting similar video content has become a challenging problem. Although hashing is a mature technique for this problem, especially in image retrieval, applying hashing techniques to videos typically yields a large performance gap relative to image hashing because of the complicated structure of videos. Existing video hashing methods generally apply image hashing approaches directly to individual video frames without considering temporal structure, leading to low retrieval performance. In this study, we propose a video hashing method, called classification-enhancement deep hashing (CEDH), for large-scale video search. The proposed CEDH first fuses the spatial–temporal information of videos in a deep end-to-end hashing network, and then leverages both the neighborhood structure of semantics and triplet similarity information to learn video hash codes. To further improve the precision of the hash codes during hash learning, a classification module is added after the fully connected layer of the deep network. We also impose an additional code constraint so that the hash codes carry sufficient information. Extensive experiments on three real-world large-scale video datasets show that our proposed method significantly outperforms state-of-the-art algorithms.

Highlights
• Triplet-wise loss is applied to video hashing for similarity preservation.
• Constraints are added to make hash codes more informative.
• A classification term is utilized to enhance hash learning.
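The abstract names three ingredients: a triplet-wise similarity loss, a classification module on top of the fully connected hash layer, and a constraint that keeps the codes informative. The PyTorch sketch below illustrates how such a combined objective could look; it is not the authors' implementation, and the tanh relaxation, the layer sizes, and the loss weights alpha and beta are assumptions for illustration.

```python
# Illustrative sketch (not the paper's code) of a CEDH-style objective:
# triplet similarity loss + classification term + binarization constraint.
# Video features are assumed to come from a spatial-temporal backbone.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CEDHHead(nn.Module):
    """Hashing head: FC hash layer -> relaxed codes -> classifier."""
    def __init__(self, feat_dim: int, code_len: int, num_classes: int):
        super().__init__()
        self.fc_hash = nn.Linear(feat_dim, code_len)        # hash layer
        self.classifier = nn.Linear(code_len, num_classes)  # classification module

    def forward(self, feats):
        codes = torch.tanh(self.fc_hash(feats))  # relaxed codes in (-1, 1)
        logits = self.classifier(codes)          # class predictions from codes
        return codes, logits

def cedh_loss(anchor, positive, negative, logits, labels,
              margin=1.0, alpha=0.5, beta=0.1):
    # Triplet term: pull similar videos together, push dissimilar apart.
    triplet = F.triplet_margin_loss(anchor, positive, negative, margin=margin)
    # Classification term: make the codes semantically discriminative.
    cls = F.cross_entropy(logits, labels)
    # Code constraint: penalize distance of relaxed codes from {-1, +1}.
    quant = (anchor.abs() - 1.0).pow(2).mean()
    return triplet + alpha * cls + beta * quant
```

At retrieval time, the relaxed real-valued codes would be binarized, e.g. with sign(·), before Hamming-distance search.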