Abstract
For autonomous robots to collaborate on joint tasks with humans, they require a shared understanding of an observed scene. We present a method for unsupervised learning of common human movements and activities on an autonomous mobile robot, which generalises and improves on recent results. Our framework encodes multiple qualitative abstractions of RGBD video from human observations and does not require external temporal segmentation. Analogously to information retrieval in text corpora, each human detection is modelled as a random mixture of latent topics. A generative probabilistic technique is used to recover topic distributions over an auto-generated vocabulary of discrete, qualitative spatio-temporal code words. We show that the emergent categories align well with human activities as interpreted by a human. This is a particularly challenging task on a mobile robot due to the varying camera viewpoints, which lead to incomplete, partial and occluded human detections.
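To make the topic-modelling analogy concrete, the following is a minimal sketch, assuming an LDA-style generative model (the abstract only says "generative probabilistic technique") and invented codeword strings for illustration; it uses scikit-learn rather than the authors' implementation. Each human detection is treated as a "document" over a vocabulary of discrete qualitative spatio-temporal code words, and the model recovers a per-detection mixture of latent topics (candidate activities).

```python
# Hypothetical sketch: each detection is a bag of qualitative code words;
# an LDA model recovers per-detection topic (activity) mixtures.
# The codeword strings below are invented for illustration only.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# One string per human detection: whitespace-separated qualitative code words
detections = [
    "near_approach near_approach touch_object move_towards",
    "far_static far_static move_away move_away",
    "near_approach touch_object touch_object move_towards",
]

# Auto-generate the discrete vocabulary and per-detection codeword counts
vectorizer = CountVectorizer(token_pattern=r"\S+")
X = vectorizer.fit_transform(detections)

# Fit a generative topic model; each detection is a random mixture of topics
lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topic = lda.fit_transform(X)   # per-detection topic distributions
topic_word = lda.components_       # per-topic codeword weights

print(doc_topic)  # each row sums to ~1: the activity mixture for a detection
```

The emergent topics would then be inspected against human-interpretable activity labels, as described in the abstract.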