Abstract

Foreground-background separation of surveillance video, which models the static background and extracts the moving foreground simultaneously, attracts increasing attention in building smart cities. Conventional techniques typically treat the background as the primary target and adopt a low-rank constraint as its estimator, which provides a finite number (equal to the rank) of alternatives when constructing the background. In practice, however, although the general sketch of the background is stable, some details change constantly. To address this, we propose to represent the general background by a linear combination of atoms and to record the detailed background by spatiotemporally clustered patches. The moving foreground is then modeled as a mixture of active contours and continuous contents. Finally, joint optimization is conducted under a unified framework, the alternating direction method of multipliers (ADMM), producing our tensor model for hierarchical background and hierarchical foreground separation (THHS). The employed tensor space, which agrees with the intrinsic structure of video data, benefits all the spatiotemporal designs in both the background module and the foreground part. Experimental results show that THHS adapts better to dynamic backgrounds and produces a more accurate foreground than current state-of-the-art techniques.

Highlights

  • Background modeling and foreground extraction is a fundamental task in the area of artificial intelligence

  • A straightforward way is accomplished by employing various statistical models, e.g., Mixture of Gaussians (MOG) [6], the clustering codebook [7], the Support Vector Machine (SVM) [8], etc

  • Denoting by φ the foreground area obtained by the active contour model, a constraint for the foreground is given by min − φ
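
The statistical models listed above can be illustrated with a minimal per-pixel running-Gaussian background model, a simplified single-Gaussian stand-in for the MOG family; the class name and parameters below are illustrative, not taken from the paper:

```python
import numpy as np

class RunningGaussianBackground:
    """Simplified per-pixel Gaussian background model (a single-Gaussian
    stand-in for the MOG family of statistical models)."""

    def __init__(self, shape, alpha=0.05, k=2.5):
        self.mean = np.zeros(shape)
        self.var = np.ones(shape)
        self.alpha = alpha   # learning rate for the running statistics
        self.k = k           # foreground threshold, in standard deviations
        self._initialized = False

    def apply(self, frame):
        """Return a boolean foreground mask and update the background model."""
        frame = frame.astype(float)
        if not self._initialized:
            self.mean = frame.copy()
            self._initialized = True
            return np.zeros(frame.shape, dtype=bool)
        d = frame - self.mean
        fg = d ** 2 > (self.k ** 2) * self.var
        # update the background statistics only where the pixel looks static
        a = np.where(fg, 0.0, self.alpha)
        self.mean += a * d
        self.var = (1 - a) * self.var + a * d ** 2
        return fg
```

Feeding a static scene for a few frames and then a frame with a bright patch yields a mask flagging only the patch; a full MOG keeps several Gaussians per pixel to also absorb multimodal backgrounds such as swaying trees.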


Summary

INTRODUCTION

Background modeling and foreground extraction is a fundamental task in the area of artificial intelligence. Supervised approaches construct deep-learned features and produce more effective classifiers for background/foreground separation [17], [18]. Another way treats video data globally, i.e., extracting the background by projecting the high-dimensional video data onto a certain sparsity-based low-dimensional space. Since a low-rank space is spanned by a few basis vectors, careful selection and dynamic training of the basis vectors (or dictionary atoms) are considered able to produce a more effective representation of the background [23]–[25]. Attempts along this line have not yet produced satisfying background modeling performance.
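
The low-rank projection idea is commonly instantiated as robust PCA, which ADMM solves by alternating two proximal steps; the numpy sketch below (function names and parameter choices are our own, not the THHS model itself) decomposes a data matrix of vectorized frames into a low-rank background plus a sparse foreground:

```python
import numpy as np

def shrink(X, tau):
    """Elementwise soft-thresholding: proximal operator of the l1 norm."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def svt(X, tau):
    """Singular value thresholding: proximal operator of the nuclear norm."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * shrink(s, tau)) @ Vt

def rpca_admm(D, lam=None, n_iter=100, rho=1.5):
    """Split D into low-rank L (background) plus sparse S (foreground) by
    minimizing ||L||_* + lam * ||S||_1 subject to L + S = D, via ADMM."""
    m, n = D.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))             # standard RPCA weight
    mu = 0.25 * m * n / (np.abs(D).sum() + 1e-12)  # initial penalty
    mu_max = mu * 1e7
    L, S, Y = (np.zeros_like(D) for _ in range(3))
    for _ in range(n_iter):
        L = svt(D - S + Y / mu, 1.0 / mu)          # low-rank update
        S = shrink(D - L + Y / mu, lam / mu)       # sparse update
        Y = Y + mu * (D - L - S)                   # dual ascent
        mu = min(mu * rho, mu_max)                 # gradually tighten penalty
    return L, S
```

The rank of L fixes the number of basis vectors available to the background, which is exactly the rigidity this paper's hierarchical model aims to relax.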

RELATED WORK
FORMULATION OF THHS
HIERARCHICAL FOREGROUND MODEL
SOLVING BACKGROUND PHASE
SOLVING FOREGROUND MODELS
COMPARATIVE TEST
CONCLUSION