Abstract

Moving object detection (MOD) in videos is a challenging task, and accurate background estimation is the key to extracting the foreground from video frames. In this paper, we propose a novel, compact, end-to-end convolutional neural network architecture, the motion saliency foreground network (MSFgNet), to estimate the background and extract the foreground from video frames. First, the long streaming video is divided into a number of small video streams (SVS); the proposed network takes each SVS as input and estimates a background frame for it. Second, a saliency map is extracted from the current video frame and the estimated background. Finally, a compact encoder-decoder network extracts the foreground from the estimated saliency maps. The proposed MSFgNet is evaluated on three benchmark datasets for MOD (CDnet-2014, LASIESTA, and PTIS). Its computational complexity (number of parameters and execution time) and its detection performance (precision, recall, and F-measure) are compared with those of existing state-of-the-art MOD methods. The analysis shows that the proposed network is very compact and outperforms the existing state-of-the-art methods for MOD in videos.
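
The data flow described above (SVS splitting, background estimation, saliency extraction, foreground extraction, and scoring by precision, recall, and F-measure) can be illustrated with a minimal NumPy sketch. This is not the MSFgNet architecture: the learned background estimator is replaced here by a temporal median, the saliency map by an absolute difference, and the encoder-decoder by a fixed threshold. All function names, the `svs_len` parameter, and the `threshold` value are hypothetical and chosen only for illustration.

```python
import numpy as np


def split_into_svs(frames, svs_len):
    """Split a (T, H, W) video into consecutive small video streams (SVS)."""
    return [frames[i:i + svs_len] for i in range(0, len(frames), svs_len)]


def estimate_background(svs):
    """Stand-in background estimator: per-pixel temporal median of one SVS.

    Assumption: the paper learns this step with a CNN; the median is only
    a classical substitute used to show the data flow.
    """
    return np.median(svs, axis=0)


def saliency_map(frame, background):
    """Stand-in saliency: absolute difference between frame and background."""
    return np.abs(frame.astype(np.float64) - background)


def extract_foreground(saliency, threshold=0.5):
    """Stand-in for the encoder-decoder: fixed threshold on the saliency map."""
    return saliency > threshold


def precision_recall_f(pred, truth):
    """Pixel-wise precision, recall, and F-measure for boolean masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = np.logical_and(pred, truth).sum()
    fp = np.logical_and(pred, ~truth).sum()
    fn = np.logical_and(~pred, truth).sum()
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return float(precision), float(recall), float(f_measure)
```

In this sketch, each SVS returned by `split_into_svs` would be passed to `estimate_background`, and every frame in that SVS would then go through `saliency_map` and `extract_foreground` before being scored against a ground-truth mask with `precision_recall_f`.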
