Abstract

Background subtraction is an important task in computer vision. Traditional approaches usually rely on low-level visual features such as color, texture, or edges to build background models. Lacking deep features, they often perform poorly on complex video scenes involving illumination changes, background or camera motion, camouflage effects, and shadows. Recently, deep learning has been shown to perform well at extracting deep features. To improve the robustness of background subtraction, in this paper we propose an end-to-end multi-scale spatio-temporal (MS-ST) method that extracts deep features from video sequences. First, a video clip is fed into a convolutional neural network to extract multi-scale spatial features. Then, to exploit temporal information, we combine temporal sampling operations with ConvLSTM modules to extract multi-scale temporal contextual information. Finally, the segmentation result is generated by fusing the multi-scale spatio-temporal features. Experimental results on the CDnet-2014 and LASIESTA datasets demonstrate the effectiveness and superiority of the proposed method.
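
As a rough illustration of the pipeline described above, the sketch below assumes PyTorch; the layer widths, the two-scale design, and the fusion head are our own illustrative choices, and a simple temporal mean stands in for the ConvLSTM modules, so this is not the authors' implementation.

```python
# Illustrative MS-ST-style pipeline (NOT the authors' implementation).
# Assumptions: PyTorch, two spatial scales, temporal aggregation by
# averaging over the clip as a stand-in for the ConvLSTM modules.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MSSTSketch(nn.Module):
    def __init__(self, in_ch=3, ch=32):
        super().__init__()
        # Per-frame 2D CNN encoder producing features at two scales.
        self.scale1 = nn.Sequential(nn.Conv2d(in_ch, ch, 3, padding=1), nn.ReLU())
        self.scale2 = nn.Sequential(nn.Conv2d(ch, ch, 3, stride=2, padding=1), nn.ReLU())
        # Fusion head turning concatenated multi-scale features into a mask.
        self.head = nn.Conv2d(2 * ch, 1, 1)

    def forward(self, clip):              # clip: (B, T, C, H, W)
        b, t, c, h, w = clip.shape
        frames = clip.reshape(b * t, c, h, w)
        f1 = self.scale1(frames)          # (B*T, ch, H, W)
        f2 = self.scale2(f1)              # (B*T, ch, H/2, W/2)
        # Temporal aggregation: mean over T frames (ConvLSTM stand-in).
        f1 = f1.reshape(b, t, -1, h, w).mean(dim=1)
        f2 = f2.reshape(b, t, -1, h // 2, w // 2).mean(dim=1)
        f2 = F.interpolate(f2, size=(h, w), mode="bilinear", align_corners=False)
        # Fuse multi-scale spatio-temporal features into a foreground mask.
        return torch.sigmoid(self.head(torch.cat([f1, f2], dim=1)))

mask = MSSTSketch()(torch.rand(2, 5, 3, 64, 64))
print(mask.shape)  # torch.Size([2, 1, 64, 64])
```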

Highlights

  • Background subtraction is an important task in the computer vision domain and plays a fundamental role in many applications such as autonomous driving [1], object tracking [2], crowd analysis [3], traffic analytics [4], and automated anomaly detection [5] in video surveillance

  • Standard background subtraction evaluation metrics are used for comparison, including Recall, Precision, Specificity, False Positive Rate (FPR), False Negative Rate (FNR), Percentage of Wrong Classifications (PWC), and F-Measure; their definitions are spelled out in the sketch after this list

  • In this paper, we propose a novel background subtraction method that automatically labels the foreground in video sequences
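
Since the highlights only name these metrics, the snippet below spells out their standard per-pixel definitions in terms of true/false positive and negative counts; the helper's name and interface are illustrative assumptions, not taken from the paper.

```python
# Standard definitions of the listed metrics in terms of per-pixel counts.
# The helper name and interface are illustrative, not from the paper.
def bgs_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    recall = tp / (tp + fn)                        # true positive rate
    precision = tp / (tp + fp)
    specificity = tn / (tn + fp)
    fpr = fp / (fp + tn)                           # False Positive Rate
    fnr = fn / (tp + fn)                           # False Negative Rate
    pwc = 100.0 * (fn + fp) / (tp + fn + fp + tn)  # Percentage of Wrong Classifications
    f_measure = 2 * precision * recall / (precision + recall)
    return {"Recall": recall, "Precision": precision, "Specificity": specificity,
            "FPR": fpr, "FNR": fnr, "PWC": pwc, "F-Measure": f_measure}

print(bgs_metrics(tp=900, fp=100, tn=8900, fn=100)["F-Measure"])  # 0.9
```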

Summary

INTRODUCTION

Background subtraction is an important task in the computer vision domain and plays a fundamental role in many applications such as autonomous driving [1], object tracking [2], crowd analysis [3], traffic analytics [4], and automated anomaly detection [5] in video surveillance. Traditional background subtraction algorithms work well only on specific or simple videos, but yield poor performance under sudden illumination changes, hard shadows, camouflage, and so on. Yang et al. [35] proposed a background modeling method that extracts spatio-temporal features with a 2D fully convolutional network. Although the 3D-convolution approach of [37] effectively extracts multi-scale features in both the spatial and temporal domains, it performs poorly when processing intermittent motion. We propose a novel end-to-end multi-scale spatio-temporal (MS-ST) method that subtracts the background without a complex background model or conventional hand-crafted features. A 2D CNN and ConvLSTM modules are used to extract deep multi-scale spatial and temporal features from the input video clip.
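
Because ConvLSTM modules carry the temporal context here, the cell below is a minimal, self-contained sketch of a standard ConvLSTM formulation (one convolution computing all four gates, no peephole connections); it is not the authors' exact module, and the channel sizes and names are our own assumptions.

```python
# Minimal ConvLSTM cell (standard formulation; not the paper's exact module).
# All gates are computed by one convolution over [input, hidden] channels.
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    def __init__(self, in_ch: int, hid_ch: int, k: int = 3):
        super().__init__()
        self.hid_ch = hid_ch
        self.gates = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch, k, padding=k // 2)

    def forward(self, x, state=None):      # x: (B, C, H, W)
        b, _, h, w = x.shape
        if state is None:
            zeros = x.new_zeros(b, self.hid_ch, h, w)
            state = (zeros, zeros)         # (hidden, cell)
        h_prev, c_prev = state
        # Input, forget, output, and candidate gates from one convolution.
        i, f, o, g = self.gates(torch.cat([x, h_prev], dim=1)).chunk(4, dim=1)
        c = torch.sigmoid(f) * c_prev + torch.sigmoid(i) * torch.tanh(g)
        h_new = torch.sigmoid(o) * torch.tanh(c)
        return h_new, (h_new, c)

# Run a 5-frame clip of feature maps through the cell, frame by frame.
cell = ConvLSTMCell(in_ch=32, hid_ch=32)
state = None
for frame_feat in torch.rand(5, 2, 32, 64, 64):  # (T, B, C, H, W)
    out, state = cell(frame_feat, state)
print(out.shape)  # torch.Size([2, 32, 64, 64])
```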

RELATED WORK
TEMPORAL FEATURE EXTRACTOR
EXPERIMENTAL ANALYSIS
INTRODUCTION TO DATASETS
INTRODUCTION TO EVALUATION METRICS
RESULTS ON CDnet-2014 DATASET
RESULTS ON LASIESTA DATASET
Findings
CONCLUSION
