Abstract

The upper body orientation of persons in the vicinity of a mobile robot is an essential cue for socially aware navigation. In a supermarket, for example, orientation indicates whether a person is looking at goods on the shelves or where that person is likely to go. However, given the limited computing and battery capacity of a mobile robot, it is not possible to rely on high-performance graphics cards to run large, computationally expensive deep neural networks for orientation estimation in real time. Nevertheless, deep learning performs well on regression problems. In this paper, we therefore tackle upper body orientation estimation on a mobile robot with small, efficient deep neural networks. As a preprocessing step, a fast person detection approach outputs fixed-size person images before the orientation is estimated. Combined with lightweight networks, this allows us to estimate a continuous angle in real time, even on a CPU only. We experimentally evaluate our system on a new, self-recorded data set of more than 100,000 RGB-D samples from 37 persons, which is made publicly available. We also extensively compare different network architectures and output encodings with respect to their suitability for orientation estimation. Furthermore, we show that depth images are better suited to orientation estimation than RGB images or the combination of both.
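
To illustrate why the choice of output encoding matters for a continuous angle, the following is a minimal sketch of one common option: regressing a (cos, sin) pair instead of the raw angle, so that 359° and 1° are close in the target space. The abstract does not state which encodings were compared, so this particular encoding and the helper names `encode_angle`, `decode_angle`, and `angular_error` are assumptions made purely for illustration.

```python
import math


def encode_angle(deg):
    """Map an angle in degrees to a (cos, sin) pair on the unit circle."""
    rad = math.radians(deg)
    return math.cos(rad), math.sin(rad)


def decode_angle(cos_val, sin_val):
    """Recover an angle in degrees, wrapped into the range 0 to 360.

    The input does not need to be normalized, because atan2 only
    depends on the direction of the (cos, sin) vector.
    """
    return math.degrees(math.atan2(sin_val, cos_val)) % 360.0


def angular_error(pred_deg, true_deg):
    """Smallest absolute difference between two angles, between 0 and 180."""
    diff = abs(pred_deg - true_deg) % 360.0
    return min(diff, 360.0 - diff)
```

With such an encoding, a hypothetical network would output two values per person image, and `decode_angle` would turn them back into an orientation. `angular_error` accounts for the wrap-around at 360° when comparing a prediction against the ground truth.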
