Abstract

The success of intelligent mobile robots operating and collaborating with humans in daily living environments depends on their ability to generalise and learn human movements, and to obtain a shared understanding of an observed scene. In this paper we aim to understand human activities being performed in real-world environments from long-term observation by an autonomous mobile robot. For our purposes, a human activity is defined as a changing spatial configuration of a person's body interacting with key objects that provide some functionality within an environment. To alleviate the perceptual limitations of a mobile robot, restricted by its obscured and incomplete sensory modalities, potentially noisy visual observations are mapped into an abstract qualitative space in order to generalise patterns invariant to exact quantitative positions within the real world. A number of qualitative spatio-temporal representations are used to capture different aspects of the relations between the human subject and their environment. Analogously to information retrieval on text corpora, a generative probabilistic technique is used to recover latent, semantically meaningful concepts from the encoded observations in an unsupervised manner. The small number of discovered concepts are treated as human activity classes, granting the robot a low-dimensional understanding of visually observed complex scenes. Finally, variational inference is used to facilitate incremental and continuous updating of these concepts, allowing the mobile robot to efficiently learn and update its models of human activity over time and thereby support life-long learning.
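
To make the modelling step concrete, below is a minimal sketch (an assumption for illustration, not the authors' implementation) of how encoded observations could be clustered into activity concepts: each observed clip is treated as a "document" of qualitative spatio-temporal descriptors, latent concepts are recovered with Latent Dirichlet Allocation, and the model is updated incrementally with online variational Bayes. The descriptor vocabulary and the use of scikit-learn are illustrative assumptions.

```python
# Minimal sketch (assumptions: illustrative descriptor names, scikit-learn LDA;
# this is not the paper's exact model or data).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Each observed clip encoded as a bag of qualitative relations between the
# person's body parts and key objects (hypothetical descriptor names).
clips = [
    "hand_near_kettle hand_touch_kettle torso_facing_worktop",
    "hand_near_mug hand_touch_mug torso_facing_worktop",
    "torso_near_fridge hand_touch_fridge torso_facing_fridge",
]

vectoriser = CountVectorizer()
X = vectoriser.fit_transform(clips)   # bag-of-qualitative-words counts

lda = LatentDirichletAllocation(
    n_components=2,             # small number of latent activity classes
    learning_method="online",   # online variational Bayes
    random_state=0,
)
lda.partial_fit(X)   # incremental update as new observations arrive

# Low-dimensional activity representation (topic mixture) per observed clip.
print(lda.transform(X))
```

Because partial_fit performs online variational updates, new batches of encoded observations can be folded in as the robot continues to observe, without retraining from scratch.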

Highlights

  • Understanding data from visual input is an increasingly important domain of scientific research

  • We investigate the problem of how an intelligent mobile robot can learn and understand human motion behaviours and simple activities in dynamic, real-world, human-populated environments from partial and noisy observations of the inhabitants

  • We show how the encoded observations are mapped into an abstract qualitative space in order to generalise patterns invariant to exact quantitative positions within the real environment (a minimal sketch of such a mapping follows this list)
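
As a concrete illustration of the qualitative mapping mentioned above, the sketch below (an assumed encoding, not the paper's specific qualitative calculus) discretises the metric relation between a tracked person and an object into coarse distance and direction symbols, so that metrically different observations collapse onto the same qualitative state.

```python
# Minimal sketch (assumed encoding; thresholds and symbols are illustrative).
import math

def qualitative_relation(person_xy, object_xy, touch=0.5, near=1.5):
    """Map metric positions to a (distance, direction) pair of qualitative symbols."""
    dx = object_xy[0] - person_xy[0]
    dy = object_xy[1] - person_xy[1]
    dist = math.hypot(dx, dy)

    if dist < touch:
        dist_sym = "touch"
    elif dist < near:
        dist_sym = "near"
    else:
        dist_sym = "far"

    angle = math.degrees(math.atan2(dy, dx)) % 360
    dir_sym = ("east", "north", "west", "south")[int((angle + 45) // 90) % 4]
    return dist_sym, dir_sym

# Two metrically different observations yield the same qualitative state,
# which is what makes the learned patterns position-invariant.
print(qualitative_relation((0.0, 0.0), (0.3, 0.2)))   # ('touch', 'east')
print(qualitative_relation((2.0, 1.0), (2.3, 1.2)))   # ('touch', 'east')
```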


Introduction

1.1 Motivation

Understanding data from visual input is an increasingly important domain of scientific research. Video cameras capture data about a static scene or a dynamic environment and save this information as images. These images are processed and represented in such a way that a system can extract particular properties or patterns and learn about the world being observed. Unsupervised learning frameworks operating over such long durations of time have the potential to make mobile robots more helpful, especially when they cohabit human-populated environments. By removing humans from the learning process, i.e. with no time-consuming data annotation, such robots can cheaply learn from greater quantities of available data (abstracted sensor observations), allowing them to adapt to their surroundings, the particular time of day, or a specific individual being observed, and saving considerable time and effort that would otherwise be spent hard-coding specific information.
