Abstract

In this paper, we present an unsupervised framework for discovering, detecting, tracking, and reconstructing dense objects from a video sequence. The system simultaneously localizes a moving camera, and discovers a set of shape and appearance models for multiple objects, including the scene background. Each object model is represented by both a 2D and 3D level-set. This representation is used to improve detection, 2D-tracking, 3D-registration and importantly subsequent updates to the level-set itself. This single framework performs dense simultaneous localization and mapping as well as unsupervised object discovery. At each iteration portions of the scene that fail to track, such as bulk outliers on moving rigid bodies, are used to either seed models for new objects or to update models of known objects. For the latter, once an object is successfully tracked in 2D with aid from a 2D level-set segmentation, the level-set is updated and then used to aid registration and evolution of a 3D level-set that captures shape information. For a known object either learned by our system or introduced from a third-party library, our framework can detect similar appearances and geometries in the scene. The system is tested using single and multiple object data sets. Results demonstrate an improved method for discovering and reconstructing 2D and 3D object models, which aid tracking even under significant occlusion or rapid motion.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call