Interpretable Neural Networks for Video Separation: Deep Unfolding RPCA with Foreground Masking.

Boris Joukovsky, Nikos Deligiannis, Yonina C Eldar

doi:10.1109/tip.2023.3336176

Abstract

We present two deep unfolding neural networks for the simultaneous tasks of background subtraction and foreground detection in video. Unlike conventional neural networks based on deep feature extraction, we incorporate domain-knowledge models by considering a masked variation of the robust principal component analysis (RPCA) problem. With this approach, we separate video clips into low-rank and sparse components, respectively corresponding to the backgrounds and foreground masks indicating the presence of moving objects. Our models, coined ROMAN-S and ROMAN-R, map the iterations of two alternating direction method of multipliers (ADMM) algorithms to trainable convolutional layers, and the proximal operators are mapped to non-linear activation functions with trainable thresholds. This approach leads to lightweight networks with enhanced interpretability that can be trained on limited data. In ROMAN-S, the correlation in time of successive binary masks is controlled with side-information based on ℓ1-ℓ1 minimization. ROMAN-R enhances the foreground detection by learning a dictionary of atoms to represent the moving foreground in a high-dimensional feature space and by using reweighted-ℓ1-ℓ1 minimization. Experiments are conducted on both synthetic and real video datasets, for which we also include an analysis of the generalization to unseen clips. Comparisons are made with existing deep unfolding RPCA neural networks, which do not use a mask formulation for the foreground, and with a 3D U-Net baseline. Results show that our proposed models outperform other deep unfolding networks, as well as the untrained optimization algorithms. ROMAN-R, in particular, is competitive with the U-Net baseline for foreground detection, with the additional advantage of providing video backgrounds and requiring substantially fewer training parameters and smaller training sets.
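As an illustration of the building blocks mentioned in the abstract, the following is a minimal sketch of classical, untrained RPCA solved with ADMM, not the ROMAN-S or ROMAN-R networks themselves. It assumes the standard principal component pursuit formulation (nuclear norm plus ℓ1 norm) without the foreground mask, and all function names, the default values of `lam` and `mu`, and the synthetic data are assumptions made for this example. The two proximal operators shown here (soft thresholding and singular value thresholding) are the kind of operators that a deep unfolding network would replace with activation functions whose thresholds are learned.

```python
import numpy as np


def soft_threshold(x, tau):
    """Proximal operator of tau * ||x||_1 (element-wise shrinkage)."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


def singular_value_threshold(m, tau):
    """Proximal operator of tau * nuclear norm: shrink the singular values."""
    u, s, vt = np.linalg.svd(m, full_matrices=False)
    return (u * soft_threshold(s, tau)) @ vt


def rpca_admm(d, lam=None, mu=None, n_iter=100):
    """Split d into low-rank + sparse parts with a fixed number of ADMM steps.

    d is a (pixels x frames) matrix; lam and mu use common heuristic
    defaults (an assumption, not values taken from the paper).
    """
    n1, n2 = d.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(n1, n2))
    if mu is None:
        mu = n1 * n2 / (4.0 * np.abs(d).sum())
    low = np.zeros_like(d)
    sparse = np.zeros_like(d)
    dual = np.zeros_like(d)
    for _ in range(n_iter):
        # Background update: low-rank proximal step.
        low = singular_value_threshold(d - sparse + dual / mu, 1.0 / mu)
        # Foreground update: sparse proximal step.
        sparse = soft_threshold(d - low + dual / mu, lam / mu)
        # Dual ascent on the constraint d = low + sparse.
        dual = dual + mu * (d - low - sparse)
    return low, sparse


# Synthetic "video": rank-1 background plus a few bright foreground pixels.
rng = np.random.default_rng(0)
background = np.outer(rng.random(64), np.ones(20))
video = background.copy()
video[rng.integers(0, 64, 30), rng.integers(0, 20, 30)] += 5.0
low_rank, foreground = rpca_admm(video)
```

In an unfolded network, each pass through the loop body would become one layer, with the thresholds `1.0 / mu` and `lam / mu` turned into trainable parameters; the masked formulation and the ℓ1-ℓ1 side-information terms described in the abstract are not reproduced in this sketch.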


Journal: IEEE Transactions on Image Processing
Publication Date: Jan 1, 2024
License type: cc-by-nc-sa

Similar Papers

Refining the Efficiency of R-CNN in Pedestrian Detection
Katleho L Masita ... Thokozani Shongwe
10 Sep 2021

A convergence analysis of Nesterov’s accelerated gradient method in training deep linear neural networks
Xin Liu ... Zhisong Pan
Information Sciences | VOL. 612
05 Sep 2022

On random matrices arising in deep neural networks: General I.I.D. case
Leonid Pastur ... Victor Slavin
Random Matrices: Theory and Applications | VOL. 12
14 Jul 2022

Meta Learning-Based MIMO Detectors: Design, Simulation, and Experimental Test
Jing Zhang ... Yu-Wen Li
IEEE Transactions on Wireless Communications | VOL. 20
21 Oct 2020