Abstract

Enabling collaborative robots to predict the human pose is a challenging but important problem to address. Most human pose estimation (HPE) approaches adopt RGB images as input and estimate anatomical keypoints with Deep Convolutional Neural Networks (CNNs). However, those approaches neglect the challenge of detecting features reliably at night-time or in difficult lighting conditions, leading to safety issues. In response to this limitation, we present in this paper an RGB/Infra-Red camera fusion approach, based on the open-source library OpenPose, and we show how the fusion of keypoints extracted from different images can improve human pose estimation performance in poorly lit environments. Specifically, OpenPose is used to extract body joints from RGB and Infra-Red images, and the contributions of the two frames are combined in a fusion step. We investigate the potential of a fusion framework based on Deep Neural Networks and compare it to a linear weighted average method. The proposed approach shows promising performance, with the best result outperforming conventional methods by a factor of 1.8 on a custom data set of Infra-Red and RGB images captured in poor lighting conditions, where it is hard to recognize people even by human inspection.
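The linear weighted average baseline mentioned above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each detector (RGB and Infra-Red) returns per-joint 2D positions with confidence scores, as OpenPose does, and weights each joint by those confidences; all function and variable names are hypothetical.

```python
import numpy as np

def fuse_keypoints(kp_rgb, kp_ir, conf_rgb, conf_ir):
    """Confidence-weighted linear fusion of 2D body keypoints.

    kp_rgb, kp_ir:     (J, 2) arrays of (x, y) joint positions detected
                       in the RGB and Infra-Red frames.
    conf_rgb, conf_ir: (J,) per-joint confidence scores in [0, 1]
                       (e.g. OpenPose keypoint confidences).
    Returns the fused (J, 2) positions and a (J,) fused confidence.
    Illustrative sketch only; not the paper's actual fusion method.
    """
    w = conf_rgb + conf_ir
    # Guard against division by zero when neither frame detected the joint.
    safe_w = np.where(w > 0, w, 1.0)
    fused = (kp_rgb * conf_rgb[:, None] + kp_ir * conf_ir[:, None]) / safe_w[:, None]
    fused_conf = np.maximum(conf_rgb, conf_ir)
    return fused, fused_conf

# Example: one joint seen clearly in IR but weakly in RGB is pulled
# toward the IR estimate.
kp_rgb = np.array([[100.0, 200.0]])
kp_ir = np.array([[110.0, 210.0]])
fused, conf = fuse_keypoints(kp_rgb, kp_ir,
                             np.array([0.2]), np.array([0.8]))
```

In low-light frames the RGB confidences drop, so the weighted average naturally shifts toward the Infra-Red keypoints, which is the intuition the abstract describes.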
