Learnable spatiotemporal feature pyramid for prediction of future optical flow in videos

Laisha Wadhwa, Snehasis Mukherjee

doi:10.1007/s00138-020-01145-7

Abstract

The success of deep learning in solving various computer vision problems has motivated researchers to apply it to predicting the optical flow of a video in the next frame. However, predicting the motion of an object over the next few frames remains an unsolved and less explored problem. Given a sequence of frames, predicting the motion in the next few frames becomes difficult when the displacement of the optical flow vectors across frames is large. Traditional CNNs often fail to learn the dynamics of objects that undergo large displacements between consecutive frames. In this paper, we present an efficient CNN based on the concept of a feature pyramid for extracting spatial features from a few consecutive frames. The spatial features extracted from consecutive frames by a modified PWC-Net architecture are fed into a bidirectional LSTM to obtain temporal features. The proposed spatiotemporal feature pyramid is able to capture the abrupt motion of moving objects in video, especially when the displacement of an object across consecutive frames is large. Further, the proposed spatiotemporal pyramidal feature can effectively predict the optical flow in the next few frames, instead of only the next frame. The proposed method outperforms the state of the art on challenging datasets such as “MPI Sintel Final Pass,” “Monkaa” and “Flying Chairs,” where abrupt and large displacement of moving objects in consecutive frames is the main challenge.
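As a rough illustration of why a pyramid helps with large displacements, the Python sketch below builds a plain image pyramid with 2x2 average pooling. It is not the authors' network (no PWC-Net or LSTM is involved), and the names `downsample`, `build_pyramid` and `displacement_per_level` are hypothetical. Each level halves the resolution, so a motion of d pixels at full resolution spans only d / 2^l pixels at level l, which keeps it within reach of a small search window.

```python
def downsample(frame):
    """Halve resolution with 2x2 average pooling.

    `frame` is a list of rows of numbers; height and width are assumed even.
    """
    h, w = len(frame), len(frame[0])
    return [
        [
            (frame[y][x] + frame[y][x + 1]
             + frame[y + 1][x] + frame[y + 1][x + 1]) / 4.0
            for x in range(0, w - 1, 2)
        ]
        for y in range(0, h - 1, 2)
    ]


def build_pyramid(frame, levels):
    """Return [full-res, half-res, quarter-res, ...] with `levels` entries."""
    pyramid = [frame]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid


def displacement_per_level(displacement_px, levels):
    """Displacement in pixels as seen at each pyramid level (level 0 = full res)."""
    return [displacement_px / (2 ** level) for level in range(levels)]


if __name__ == "__main__":
    # Toy 8x8 frame with a single bright pixel.
    frame = [[0.0] * 8 for _ in range(8)]
    frame[2][3] = 1.0
    for level, img in enumerate(build_pyramid(frame, 3)):
        print(level, len(img), "x", len(img[0]))
    print(displacement_per_level(32, 4))
```

In a learned feature pyramid the averaging step would be replaced by strided convolutions, but the scaling of displacement with level is the same.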

Journal: Machine Vision and Applications
Publication Date: Nov 17, 2020
Citations: 1
