Abstract

Micro-videos are one of the most popular multimedia forms in the mobile internet domain, and scene recognition is important for micro-video semantic analysis and understanding. Unlike in traditional videos, scene recognition in micro-videos suffers from inconsistent content within the same scene, owing to the subjectivity of photographers. Moreover, frames within the same micro-video differ in their importance for semantic representation. These phenomena limit micro-video scene recognition. To address these issues, this paper proposes attention-based consistent semantic learning (ACSL) for micro-video scene recognition. ACSL combines a two-branch framework with an attention mechanism to maintain semantic consistency within classes. Experiments on multiple datasets show that the proposed ACSL achieves better performance than other video scene recognition methods.
