DeepFTSG: Multi-stream Asymmetric USE-Net Trellis Encoders with Shared Decoder Feature Fusion Architecture for Video Motion Segmentation

Gani Rahmon, Kannappan Palaniappan, Imad Eddine Toubal, Filiz Bunyak, Raghuveer Rao, Guna Seetharaman

doi:10.1007/s11263-023-01910-x

Abstract

Discriminating salient moving objects against complex, cluttered backgrounds, with occlusions and challenging environmental conditions like weather and illumination, is essential for stateful scene perception in autonomous systems. We propose a novel deep architecture, named DeepFTSG, for robust moving object detection that incorporates single and multi-stream multi-channel USE-Net trellis asymmetric encoders, which extend U-Net with squeeze and excitation (SE) blocks, and a single shared decoder network for fusing multiple motion and appearance cues. DeepFTSG is a deep learning based approach that builds upon our previous hand-engineered flux tensor split Gaussian (FTSG) change detection video analysis algorithm, which won the CDnet CVPR Change Detection Workshop challenge competition. DeepFTSG generalizes much better than top-performing motion detection deep networks, such as the scene-dependent ensemble-based FgSegNet_v2, while using an order of magnitude fewer weights. Short-term motion and longer-term change cues are estimated using general-purpose unsupervised methods: flux tensor and multi-modal background subtraction, respectively. DeepFTSG was evaluated using the CDnet-2014 change detection challenge dataset, the largest change detection video sequence benchmark with 12.3 billion labeled pixels, and had an overall F-measure of 97%. We also evaluated the cross-dataset generalization capability of DeepFTSG trained solely on CDnet-2014 short video segments and then evaluated on unseen SBI-2015, LASIESTA and LaSOT benchmark videos. On the unseen SBI-2015 dataset, DeepFTSG had an F-measure of 87%, more than 30% higher than the top-performing deep network FgSegNet_v2, and outperformed the recently proposed KimHa method by 17%. On the unseen LASIESTA, DeepFTSG had an F-measure of 88% and outperformed the best recent deep learning method BSUV-Net2.0 by 3%. On the unseen LaSOT, which provides axis-aligned bounding box ground truth, network segmentation masks were converted to bounding boxes for evaluation; DeepFTSG had an F-measure of 55%, outperforming the KimHa method by 14% and FgSegNet_v2 by almost 1.5%. When a customized single DeepFTSG model is trained in a scene-dependent manner for comparison with state-of-the-art approaches, DeepFTSG performs significantly better, reaching an F-measure of 97% on SBI-2015 (+10%) and 99% on LASIESTA (+11%). The source code, pre-trained weights, and video demo for DeepFTSG are available at https://github.com/CIVA-Lab/DeepFTSG.
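
As a rough illustration of the evaluation quantities mentioned in the abstract, the following sketch computes a pixel-wise F-measure between two binary masks and converts a mask to an axis-aligned bounding box. It is not the authors' evaluation code: the function names `f_measure` and `mask_to_bbox`, the list-of-lists mask representation, the single-box-per-mask conversion, and the convention for two empty masks are all assumptions made for this example.

```python
from typing import List, Optional, Tuple

Mask = List[List[int]]  # binary mask: 1 = moving (foreground), 0 = background


def f_measure(pred: Mask, truth: Mask) -> float:
    """Pixel-wise F-measure, the harmonic mean of precision and recall."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1          # foreground predicted and present
            elif p:
                fp += 1          # foreground predicted but absent
            elif t:
                fn += 1          # foreground present but missed
    denom = 2 * tp + fp + fn
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if denom == 0 else 2 * tp / denom


def mask_to_bbox(mask: Mask) -> Optional[Tuple[int, int, int, int]]:
    """Tightest box (x_min, y_min, x_max, y_max) around all foreground pixels.

    Assumption: one box per mask; the paper's conversion may differ
    (for example, one box per connected component).
    """
    xs = [x for row in mask for x, v in enumerate(row) if v]
    ys = [y for y, row in enumerate(mask) if any(row)]
    if not xs:
        return None  # no foreground pixels, so no box
    return min(xs), min(ys), max(xs), max(ys)


# Toy 3x4 masks, made up for illustration only.
pred = [[0, 1, 1, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0]]
truth = [[0, 1, 1, 1],
         [0, 1, 0, 0],
         [0, 0, 0, 0]]

score = f_measure(pred, truth)   # 3 true positives, 1 false positive, 1 false negative
box = mask_to_bbox(pred)
```

The F-measure here uses the identity 2PR/(P+R) = 2TP/(2TP+FP+FN), which avoids computing precision and recall separately.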

Journal: International Journal of Computer Vision
Publication Date: Oct 17, 2023
Citations: 3
License type: CC BY 4.0

Similar Papers

Hyperspectral Image Classification Model Using Squeeze and Excitation Network with Deep Learning.
Rajendran T ... Rinesh S
Computational Intelligence and Neuroscience | VOL. 2022
04 Aug 2022

A novel attention-based deep learning method for post-disaster building damage classification
Chang Liu ... Linlin Ge
Expert Systems with Applications | VOL. 202
20 Apr 2022

NucleiSegNet: Robust deep learning architecture for the nuclei segmentation of liver cancer histopathology images
Shyam Lal ... Jyoti Kini
Computers in Biology and Medicine | VOL. 128
03 Nov 2020

Unified building change detection pre-training method with masked semantic annotations
Yujun Quan ... Xuanbei Lu
International Journal of Applied Earth Observation and Geoinformation | VOL. 120
01 Jun 2023
