Abstract

Video compact representation aims to obtain a representation that reflects the kernel modes of video content and concisely describes the video. Because most information in complex videos is either noisy or redundant, some researchers have instead focused on long-term video semantics. Recent video compact representation methods rely heavily on the segmentation accuracy of video semantics. In this paper, we propose a novel framework to address these challenges. Specifically, we design a continuous video semantic embedding model to learn the actual distribution of video words. First, an embedding model based on the continuous bag-of-words method is proposed to learn the video embeddings, integrated with a well-designed discriminative negative sampling approach that emphasizes convincing clips in the embedding while weakening the influence of confusing ones. Second, an aggregated distribution pooling method is proposed to capture the semantic distribution of kernel modes in videos. Finally, the trained model generates compact video representations by direct inference, which gives it better generalization ability than previous methods. We performed extensive experiments on event detection and the mining of representative event parts. Experiments on the TRECVID MED11 and CCV datasets demonstrate the effectiveness of our method: it captures the semantic distribution of kernel modes in videos and shows strong potential to discover and better describe complex video patterns.
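To make the pipeline described above concrete, the following is a minimal, hedged sketch of a CBOW-style embedding over quantized clip-level "video words", trained with a negative-sampling objective. The class name, the assumption of a fixed video-word vocabulary, and the uniform negative sampler in the usage example are illustrative assumptions, not the authors' implementation; in particular, the paper's discriminative negative sampling would replace the uniform sampler shown here.

```python
# Illustrative sketch only: CBOW over quantized video words with negative sampling.
import torch
import torch.nn as nn
import torch.nn.functional as F

class VideoWordCBOW(nn.Module):
    def __init__(self, vocab_size, dim=128):
        super().__init__()
        self.in_embed = nn.Embedding(vocab_size, dim)   # context ("input") vectors
        self.out_embed = nn.Embedding(vocab_size, dim)  # target ("output") vectors

    def forward(self, context, target, negatives):
        # context:   (B, C) indices of surrounding video words
        # target:    (B,)   index of the center video word
        # negatives: (B, K) sampled negative video words
        ctx = self.in_embed(context).mean(dim=1)        # (B, D) averaged context, as in CBOW
        pos = self.out_embed(target)                    # (B, D)
        neg = self.out_embed(negatives)                 # (B, K, D)

        pos_score = (ctx * pos).sum(-1)                              # (B,)
        neg_score = torch.bmm(neg, ctx.unsqueeze(-1)).squeeze(-1)    # (B, K)

        # Pull the context toward the true word, push it away from the negatives.
        return -F.logsigmoid(pos_score).mean() - F.logsigmoid(-neg_score).mean()

# Usage with hypothetical sizes: 8 samples, context window 4, 5 uniform negatives each.
model = VideoWordCBOW(vocab_size=1000)
loss = model(torch.randint(0, 1000, (8, 4)),
             torch.randint(0, 1000, (8,)),
             torch.randint(0, 1000, (8, 5)))
```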

Highlights

  • Video representation is a classical topic in computer vision

  • We propose a novel discriminative negative sampling approach for training the continuous video embedding model, ensuring that the learned semantic embeddings of video words encode both the context coherence and the discriminative degree of video semantics

  • We propose a continuous video semantic embedding model to learn the actual distribution of video words


Summary

Introduction

Video representation is a classical topic in computer vision. Generally, to obtain video representations, the primary task is to extract useful features from the videos. Recurrent neural networks (RNNs), long short-term memory (LSTM) networks [7,11,12,13], and the modified hierarchical recurrent neural encoder (HRNE) [14] are used to model temporal information and represent videos. These frame-level or segment-level features are learned from well-trained deep models and embed the visual semantic information of video content together with its temporal structure.
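As a simple illustration of the recurrent modeling mentioned above, the sketch below folds pre-extracted frame-level features into a single video-level vector with an LSTM. The feature dimension, hidden size, and class name are assumptions for illustration; this is not the HRNE architecture or the authors' model.

```python
# Illustrative sketch only: LSTM over frame-level features as a video descriptor.
import torch
import torch.nn as nn

class FrameLSTMEncoder(nn.Module):
    def __init__(self, feat_dim=2048, hidden=512):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)

    def forward(self, frames):
        # frames: (B, T, feat_dim) sequence of frame-level CNN features
        _, (h_n, _) = self.lstm(frames)
        return h_n[-1]   # (B, hidden) final hidden state summarizes the sequence

# Usage: a batch of 4 videos with 30 frame features each -> (4, 512) descriptors.
enc = FrameLSTMEncoder()
video_repr = enc(torch.randn(4, 30, 2048))
```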
