Abstract

Foreground extraction from video streams is an important component in many multimedia applications. By exploiting commodity RGBD cameras, we can further extract dynamic foreground objects with 3D information in real time, thereby enabling new forms of multimedia applications such as 3D telepresence. However, one critical problem with existing real-time foreground extraction methods is temporal coherence: they can produce severely flickering foreground objects, e.g., during human motion, which degrades both the visual quality and the image-object analysis in multimedia applications. This paper presents a new GPU-based real-time foreground extraction method with several novel techniques. First, we detect shadows and fill in missing depth data accordingly in the RGBD video, and then adaptively combine the color and depth masks to form a trimap. Next, we formulate a novel closed-form matting model that improves temporal coherence in foreground extraction while maintaining real-time performance. In particular, we propagate RGBD data across the temporal domain to improve the visual coherence of the extracted foreground objects, and we exploit various CUDA strategies and spatial data structures to improve speed. Experiments with a number of users across different scenarios show that, compared with state-of-the-art methods, our method extracts more stable foreground objects with higher visual quality and better temporal coherence, while still achieving real-time performance (30.3 frames per second on average in our experiments).
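
For readers unfamiliar with closed-form matting, the per-frame objective such pipelines typically build on is the quadratic energy of Levin et al.; the sketch below assumes that background, and the temporal term with weight mu is a hypothetical illustration of propagating the previous frame's matte, not the paper's exact formulation (which the abstract does not give):

    E(\alpha_t) = \alpha_t^{\top} L \, \alpha_t
                + \lambda \, (\alpha_t - \hat{\alpha}_t)^{\top} D \, (\alpha_t - \hat{\alpha}_t)
                + \mu \, \lVert \alpha_t - \alpha_{t-1} \rVert^2

Here L is the matting Laplacian built from local pixel affinities, \hat{\alpha}_t encodes the trimap constraints (0 for background, 1 for foreground), and D is a diagonal matrix selecting the constrained pixels; minimizing E(\alpha_t) reduces to solving a sparse linear system per frame.

The trimap-fusion step can likewise be sketched on the GPU. The kernel below is an illustrative sketch under assumed conventions, not the paper's implementation; the mask layout, the confidence map depthConf, and the threshold tau are hypothetical names:

    #include <cuda_runtime.h>

    // Illustrative sketch: fuse per-pixel binary color/depth masks into a
    // trimap (0 = background, 1 = foreground, 0.5 = unknown band handed to
    // the matting solver). depthConf and tau are hypothetical, standing in
    // for whatever per-pixel depth-reliability measure a pipeline computes
    // after shadow detection and hole filling.
    __global__ void fuseTrimap(const unsigned char* colorMask,
                               const unsigned char* depthMask,
                               const float* depthConf,
                               float* trimap, int n, float tau)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i >= n) return;
        bool c = colorMask[i] != 0, d = depthMask[i] != 0;
        if (c && d)                  trimap[i] = 1.0f;   // masks agree: foreground
        else if (!c && !d)           trimap[i] = 0.0f;   // masks agree: background
        else if (depthConf[i] < tau)                     // depth unreliable (shadow/hole):
            trimap[i] = c ? 1.0f : 0.0f;                 //   fall back on the color mask
        else                         trimap[i] = 0.5f;   // genuine disagreement: unknown band
    }

    // Launch over n pixels, e.g.:
    //   fuseTrimap<<<(n + 255) / 256, 256>>>(dColor, dDepth, dConf, dTrimap, n, 0.5f);

One thread per pixel with no inter-thread communication makes this fusion step embarrassingly parallel, which is why the expensive part of such pipelines is the matting solve rather than the mask combination.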
