Abstract

Background subtraction is the task of classifying the pixels of a frame into moving objects and background. In this paper, we propose a fully convolutional encoder–decoder spatial–temporal network (FCESNet) for real-time background subtraction. In the proposed many-to-many architecture, the encoded features of consecutive frames are fed into a spatial–temporal information transmission (STIT) module that captures the spatial–temporal correlations in the frame sequence, and a decoder then outputs the subtraction results for all frames. A patch-based training method is designed to increase the practicability and flexibility of the proposed method. Experiments on CDNet2014 show that the proposed method achieves state-of-the-art performance while running in real time.
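To make the pipeline concrete, below is a minimal PyTorch sketch of the many-to-many encoder–STIT–decoder structure described above. It is a sketch under assumptions, not the authors' implementation: the STIT block is rendered here as a ConvLSTM-style recurrence over per-frame features, and the layer widths, depths, and the `FCESNetSketch` name are illustrative choices, since the abstract does not specify these details.

```python
import torch
import torch.nn as nn

class STIT(nn.Module):
    """Hypothetical spatial-temporal information transmission block,
    sketched as a ConvLSTM-style recurrence over per-frame features.
    The paper's actual STIT design may differ."""
    def __init__(self, channels):
        super().__init__()
        # Gates are computed from the current feature map and the previous hidden state.
        self.gates = nn.Conv2d(2 * channels, 4 * channels, kernel_size=3, padding=1)

    def forward(self, feats):  # feats: (B, T, C, H, W)
        b, t, c, h, w = feats.shape
        hx = feats.new_zeros(b, c, h, w)
        cx = feats.new_zeros(b, c, h, w)
        outs = []
        for i in range(t):
            g = self.gates(torch.cat([feats[:, i], hx], dim=1))
            in_g, forget_g, out_g, cand = g.chunk(4, dim=1)
            cx = torch.sigmoid(forget_g) * cx + torch.sigmoid(in_g) * torch.tanh(cand)
            hx = torch.sigmoid(out_g) * torch.tanh(cx)
            outs.append(hx)
        return torch.stack(outs, dim=1)  # one enriched feature map per frame

class FCESNetSketch(nn.Module):
    """Many-to-many encoder-STIT-decoder: T frames in, T foreground masks out."""
    def __init__(self, channels=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, channels, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.stit = STIT(channels)
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(channels, channels, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(channels, 1, 4, stride=2, padding=1),
        )

    def forward(self, clip):  # clip: (B, T, 3, H, W)
        b, t, c, h, w = clip.shape
        # Encode every frame independently, then share information across time.
        feats = self.encoder(clip.flatten(0, 1)).unflatten(0, (b, t))
        feats = self.stit(feats)
        logits = self.decoder(feats.flatten(0, 1)).unflatten(0, (b, t))
        return torch.sigmoid(logits)  # per-pixel foreground probability per frame

masks = FCESNetSketch()(torch.rand(2, 5, 3, 64, 64))  # e.g. 64x64 patches
print(masks.shape)  # torch.Size([2, 5, 1, 64, 64])
```

In this sketch every frame of the input clip receives its own output mask, which is what makes the architecture many-to-many, and the 64x64 input hints at how patch-based training could keep memory usage low during training.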
