Abstract

Moving object segmentation (MOS) is an important and well-studied computer vision task used in a variety of applications, such as video surveillance systems, human tracking, self-driving cars, and video compression. While traditional approaches to MOS rely on hand-crafted features or background modeling, deep learning methods using Convolutional Neural Networks (CNNs) have been shown to be more effective at extracting features and achieving better accuracy. However, most deep learning-based methods for MOS offer scene-dependent solutions, leading to reduced performance when tested on previously unseen video content. Because spatial features alone are insufficient to represent motion information, spatial and temporal features should be combined to succeed on unseen videos. To address this issue, we propose the MOS-Net deep framework, an encoder-decoder network that, in its different variants, combines spatial and temporal features using the flux tensor algorithm, 3D CNNs, and ConvLSTM. MOS-Net 2.0 is an enhanced version of the base MOS-Net structure, in which additional ConvLSTM modules are attached to the 3D CNNs to extract long-term spatiotemporal features. In the final stage of the framework, the output of the encoder-decoder network, a foreground probability map, is thresholded to produce a binary mask in which moving objects form the foreground and everything else forms the background. In addition, an ablation study has been conducted to evaluate different input combinations for the proposed network using the ChangeDetection 2014 (CDnet2014) dataset, which includes challenging videos such as those with dynamic backgrounds, bad weather, and illumination changes. In most approaches, the training and testing strategy is not reported, making it difficult to compare results across algorithms. Moreover, a method can be evaluated in two settings: video-optimized or video-agnostic. In the video-optimized setting, the training and test sets are drawn randomly from the overall dataset as disjoint subsets. The results of the proposed method are compared with competitive methods from the literature using the same evaluation strategy. The introduced MOS networks are observed to give highly competitive results on the CDnet2014 dataset. The source code for the simulations presented in this work is available online.
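Two of the steps named in the abstract can be sketched compactly: the flux tensor, which supplies a per-pixel temporal (motion) cue, and the final thresholding of the foreground probability map into a binary mask. The snippet below is a minimal illustration, not the authors' implementation; the function names, the spatial window size, and the 0.5 threshold are illustrative assumptions.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def flux_tensor_trace(frames, window=5):
    """Trace of the flux tensor for a grayscale frame stack of shape (T, H, W).

    The trace sums the squared temporal derivatives of the spatial gradient
    (I_xt, I_yt) and the squared second temporal derivative (I_tt), averaged
    over a local spatial window; large values indicate moving pixels.
    """
    frames = frames.astype(np.float64)
    It  = np.gradient(frames, axis=0)   # dI/dt
    Ixt = np.gradient(It, axis=2)       # d^2 I / (dx dt)
    Iyt = np.gradient(It, axis=1)       # d^2 I / (dy dt)
    Itt = np.gradient(It, axis=0)       # d^2 I / dt^2
    trace = Ixt ** 2 + Iyt ** 2 + Itt ** 2
    # Average over a spatial window, applied per frame, as the integration step.
    return uniform_filter(trace, size=(1, window, window))

def binarize(prob_map, threshold=0.5):
    """Threshold a foreground probability map into a binary mask
    (1 = moving object, 0 = background)."""
    return (prob_map > threshold).astype(np.uint8)

# Example with random data standing in for real frames and network output.
frames = np.random.rand(8, 240, 320)
motion_cue = flux_tensor_trace(frames)   # temporal cue of the kind fed to the network
prob_map = np.random.rand(240, 320)      # stand-in for the decoder's probability map
mask = binarize(prob_map, threshold=0.5)
```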
