Video representation learning through prediction for online object detection

Masato Fujitake, Akihiro Sugimoto

doi:10.1109/wacvw54805.2022.00059

Abstract

We present a video representation learning framework for real-time video object detection. Our approach builds on the observation that strong prior knowledge of video context improves object recognition, and that this prior can be acquired by learning video representations through stochastic video prediction. The proposed framework incorporates stochastic video prediction into object detection in two stages: we first learn video representations that capture prior knowledge of videos, and then adapt them to object detection to improve accuracy. We validate the method on the ImageNet VID and VisDrone-VID2019 datasets to demonstrate the effectiveness of video representation learning via future video prediction. In particular, our extensive experiments on ImageNet VID show that our approach achieves 73.1% mAP at 54 fps with ResNet-50 on commercial GPUs.
