Recurrent Neural Networks for Hierarchically Mapping Human-Robot Poses

Zainab Al-Qurashi, Brian D Ziebart

doi:10.35708/rc1870-126267

Abstract

To perform many critical manipulation tasks successfully, human-robot mimicking systems should accurately copy not only the position of a human hand but also its orientation. Deep learning methods trained on pairs of corresponding human and robot poses offer one promising approach for constructing a human-robot mapping that accomplishes this. However, ignoring the spatial and temporal structure of this mapping makes learning it less effective. We propose two hierarchical architectures that leverage the structural and temporal properties of the human-robot mapping. We partially separate the robotic manipulator's end-effector position and orientation while accounting for the mutual coupling effects between them. This divides the main problem (making the robot match the human's hand position and mimic its orientation accurately along an unknown trajectory) into several sub-problems. We address these using different recurrent neural networks (RNNs) with Long Short-Term Memory (LSTM), which we combine and train hierarchically based on the coupling between the aspects of the robot that each controls. We evaluate our proposed architectures using a virtual reality system to track human table tennis motions and compare them with single artificial neural network (ANN) and RNN models. Comparing deep neural networks with and without our architectures, we find that our proposed architectures yield smaller errors in position and orientation, along with increased flexibility in wrist movement. We also propose a hybrid approach to collecting the training dataset, which combines two collection modes: the robot mimicking human motions (standard learn from demonstrator, LfD) and the human mimicking robot motions (LfDr). We evaluate the hybrid training dataset and show that a machine learning system trained on it achieves lower error and faster training than one trained on a dataset collected using the standard LfD approach.

Full Text