Abstract

The identification of human activities from videos is important for many applications. For such a task, three-dimensional (3D) depth images or image sequences (videos) can be used, which represent the positioning information of the objects in a 3D scene obtained from depth sensors. This paper presents a framework to create foreground–background masks from depth images for human body segmentation. The framework can be used to speed up the manual depth image annotation process with no semantics known beforehand and can apply segmentation using a performant algorithm while the user only adjusts the parameters, or corrects the automatic segmentation results, or gives it hints by drawing a boundary of the desired object. The approach has been tested using two different datasets with a human in a real-world closed environment. The solution has provided promising results in terms of reducing the manual segmentation time from the perspective of the processing time as well as the human input time.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call