Abstract

Low-cost systems that can obtain a high-quality foreground segmentation almost independently of the existing illumination conditions for indoor environments are very desirable, especially for security and surveillance applications. In this paper, a novel foreground segmentation algorithm that uses only a Kinect depth sensor is proposed to satisfy the aforementioned system characteristics. This is achieved by combining a mixture of Gaussians-based background subtraction algorithm with a new Bayesian network that robustly predicts the foreground/background regions between consecutive time steps. The Bayesian network explicitly exploits the intrinsic characteristics of the depth data by means of two dynamic models that estimate the spatial and depth evolution of the foreground/background regions. The most remarkable contribution is the depth-based dynamic model that predicts the changes in the foreground depth distribution between consecutive time steps. This is a key difference with regard to visible imagery, where the color/gray distribution of the foreground is typically assumed to be constant. Experiments carried out on two different depth-based databases demonstrate that the proposed combination of algorithms is able to obtain a more accurate segmentation of the foreground/background than other state-of-the-art approaches.
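The depth-based dynamic model described above can be illustrated with a minimal sketch: unlike a color distribution, which is typically assumed constant, the foreground depth distribution shifts as an object moves toward or away from the sensor, so its parameters at the next time step can be extrapolated from the shift observed between the last two frames. The helper below is purely illustrative; the function name and the constant-velocity assumption are not from the paper.

```python
import numpy as np

def predict_depth_distribution(depths_t, depths_t_minus_1):
    """Illustrative (hypothetical) predictor: propagate the foreground depth
    distribution to the next frame assuming constant depth velocity.

    depths_t, depths_t_minus_1: 1-D arrays of foreground depth samples (mm)
    Returns the predicted (mean, std) of the foreground depth at t+1.
    """
    mu_t = depths_t.mean()
    velocity = mu_t - depths_t_minus_1.mean()  # mm/frame toward or away from sensor
    sigma_t = depths_t.std()                   # spread assumed unchanged over one frame
    return mu_t + velocity, sigma_t
```

For example, a person observed at a mean depth of 3.0 m in one frame and 2.9 m in the next would be predicted at about 2.8 m in the frame after, whereas a color-based model would simply reuse the previous distribution.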

Highlights

  • Video surveillance in indoor environments is an active focus of research because of its high interest for the security industry [1,2]

  • The proposed FG/BG segmentation system consists of three modules: (1) mixture of Gaussians algorithm (MoG)-based background subtraction (MoG-BS); (2) Bayesian network-based FG/BG prediction (BN-FBP); and

  • The proposed technique, BayesNet, achieves the best score, which is mainly attributed to the spatial and depth dynamic models that explicitly exploit the inherent properties of the depth imagery

Summary

Introduction

Video surveillance in indoor environments is an active focus of research because of its high interest for the security industry [1,2]. We propose a combination of two algorithms to obtain a high-quality foreground segmentation using only depth data acquired by a first-generation Kinect sensor, which makes it ideal for indoor security/surveillance applications that must cope with low, unpredictable, or absent lighting. An explicit depth-based dynamic model is proposed in this paper, allowing the depth evolution of moving foreground objects to be predicted under an arbitrary and more realistic camera setting. Another key advantage of the proposed Bayesian network is that it obtains satisfactory results at distances beyond the recommended operating range of the Kinect sensor, extending it from approximately 1.2–3.5 m [25] up to 10–12 m thanks to adaptive processing of the Kinect sensor noise.
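The MoG-based background-subtraction stage can be sketched along the lines of the classic per-pixel mixture-of-Gaussians update (Stauffer–Grimson style), applied to depth values instead of color. This is an illustrative simplification, not the paper's implementation; the function name, parameter values, and foreground criterion are assumptions.

```python
import numpy as np

def mog_depth_update(depth, means, variances, weights,
                     alpha=0.02, match_sigma=2.5, bg_weight=0.25):
    """One illustrative MoG update on a single depth frame (not the paper's code).

    depth:     (H, W) depth values in mm
    means, variances, weights: (H, W, K) per-pixel mixture parameters
    Returns (fg_mask, means, variances, weights).
    """
    d = depth[..., None]
    dist = np.abs(d - means) / np.sqrt(variances)      # per-component Mahalanobis distance
    matched = dist < match_sigma
    any_match = matched.any(axis=-1)
    best = np.argmin(np.where(matched, dist, np.inf), axis=-1)

    # One-hot selector for the best-matching component, where a match exists.
    sel = np.zeros(means.shape, dtype=bool)
    np.put_along_axis(sel, best[..., None], any_match[..., None], axis=-1)

    # Reinforce matched components and adapt their mean/variance.
    weights = (1 - alpha) * weights + alpha * sel
    rho = alpha * sel
    means = (1 - rho) * means + rho * d
    variances = (1 - rho) * variances + rho * (d - means) ** 2

    # Where nothing matched, replace the weakest component with the observation.
    weakest = np.argmin(weights, axis=-1)
    repl = np.zeros(means.shape, dtype=bool)
    np.put_along_axis(repl, weakest[..., None], (~any_match)[..., None], axis=-1)
    means = np.where(repl, d, means)
    variances = np.where(repl, 150.0 ** 2, variances)  # wide initial variance (mm^2)
    weights = np.where(repl, 0.05, weights)
    weights /= weights.sum(axis=-1, keepdims=True)

    # Foreground: no component matched, or the matched one is too weak to be background.
    matched_w = np.take_along_axis(weights, best[..., None], axis=-1)[..., 0]
    fg_mask = ~any_match | (matched_w < bg_weight)
    return fg_mask, means, variances, weights
```

A pixel whose depth keeps agreeing with a high-weight component is classified as background; a sudden depth change (e.g. a person walking in front of a wall) matches no strong component and is flagged as foreground. The paper's Bayesian network then refines this raw mask between time steps.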

System Description
BN-FBP Module
Description of the Bayesian Network
Derivation of the Posterior Joint pdf
Spatial Dynamic Models
Depth-Based Appearance Dynamic Models
Depth-Based Observation Model
Inference
Results
Operational and Practical Issues
Conclusions
Background
