Abstract

Video hashing has attracted increasing attention in the field of large-scale video retrieval. However, most existing video hashing algorithms generate the video hash from low-level features or their combinations alone; these kinds of features are referred to as appearance features. In this paper, a visual attention model is used to extract visual attention features, and the video hash is generated via a deep belief network (DBN) from a fusion of visual-appearance and visual-attention features, yielding more representative video features. In addition, the hash distance is treated as a vector to measure the similarity between hashes: the bit error rate (BER) serves as the amplitude of the hash distance, and the vector cosine similarity serves as its angle. Experimental results demonstrate that fusing visual appearance and attention features improves the recall and precision rates of the video hash, and that the angle of the hash distance is useful for improving the accuracy of hash matching.
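To make the vector-valued hash distance concrete, the sketch below computes its two components for a pair of binary hashes: the BER (amplitude) as the fraction of differing bit positions, and the cosine similarity (angle) between the hashes viewed as real-valued vectors. This is a minimal illustration of the distance described in the abstract, not the paper's implementation; the function and variable names are our own.

```python
import numpy as np

def hash_distance(h1, h2):
    """Return (amplitude, angle) of the vector-valued hash distance.

    Amplitude is the bit error rate (BER); the angle component is the
    cosine similarity between the two hashes treated as vectors.
    Illustrative sketch only; names are not taken from the paper.
    """
    h1 = np.asarray(h1, dtype=float)
    h2 = np.asarray(h2, dtype=float)

    # Amplitude: BER = fraction of bit positions where the hashes differ.
    ber = float(np.mean(h1 != h2))

    # Angle: cosine similarity between the hash vectors.
    denom = np.linalg.norm(h1) * np.linalg.norm(h2)
    cos_sim = float(h1 @ h2) / denom if denom > 0 else 0.0

    return ber, cos_sim

# Example: two 8-bit hashes differing in 2 positions.
h_a = np.array([1, 0, 1, 1, 0, 1, 0, 0])
h_b = np.array([1, 0, 0, 1, 0, 1, 1, 0])
print(hash_distance(h_a, h_b))  # (0.25, ...)
```

Using both components lets a matcher distinguish hash pairs that have the same BER but differ in directional similarity, which is the rationale the abstract gives for adding the angle term.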
