Abstract

The development and deployment of robots on construction sites are integral to the industrialization of construction, known as Construction 4.0. Tele-operated and pre-programmed robots have enhanced construction efficiency and safety. However, their on-site utilization remains limited due to the need for expert remote control and their lack of adaptability in dynamic environments. Reinforcement learning (RL) has emerged as a promising solution, as RL-controlled robots possess inherent self-learning abilities to adapt to diverse situations. Nevertheless, manually designing RL reward functions for complex tasks is challenging. To address this issue, inverse reinforcement learning (IRL) methods, such as Generative Adversarial Imitation Learning (GAIL), have been proposed to learn optimal actions through expert demonstration and self-exploration, without explicitly defined reward functions. In this study, we propose an approach that integrates GAIL with virtual reality (VR)-based robot control for long-horizon collaborative construction tasks, i.e., long sequences of sub-tasks performed by multiple robots. We employ VR expert demonstrations as input for GAIL training, enabling a team of robots, including an Unmanned Ground Vehicle (UGV) and two robot arms, to interact with the designed RL environment and perform sub-tasks such as transporting, picking, and installing window panels. For evaluation, we compare the performance of our VR-GAIL model with a prevalent and robust RL baseline, Proximal Policy Optimization (PPO). The results demonstrate that our reward-free VR-GAIL model achieves, on average, a 4.5% higher success rate than the PPO counterpart equipped with carefully designed reward functions across all three sub-tasks and their randomized variations. Furthermore, the performance gap between GAIL and PPO widens as task difficulty increases. These findings indicate that our approach effectively enhances RL agent performance on complex construction tasks while expediting development by eliminating the need for reward function design.
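To make the reward-free idea described above concrete, the sketch below illustrates the core GAIL mechanism in generic PyTorch: a discriminator learns to separate expert (e.g., VR demonstration) state-action pairs from policy rollouts, and its output is converted into a learned reward that replaces a hand-designed reward function. All names, dimensions, and the stand-in data are illustrative assumptions, not the authors' implementation.

```python
# Minimal GAIL sketch (illustrative only): discriminator update + learned reward.
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM = 16, 4  # assumed sizes for illustration


class Discriminator(nn.Module):
    """Scores (state, action) pairs: ~1 for expert-like, ~0 for policy-like."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM + ACTION_DIM, 64), nn.Tanh(),
            nn.Linear(64, 1),
        )

    def forward(self, states, actions):
        return torch.sigmoid(self.net(torch.cat([states, actions], dim=-1)))


disc = Discriminator()
opt = torch.optim.Adam(disc.parameters(), lr=3e-4)
bce = nn.BCELoss()

# Stand-in batches; in the paper these would come from VR demonstrations and
# from rollouts of the learner policy in the simulated construction scene.
expert_s, expert_a = torch.randn(32, STATE_DIM), torch.randn(32, ACTION_DIM)
policy_s, policy_a = torch.randn(32, STATE_DIM), torch.randn(32, ACTION_DIM)

# One discriminator step: expert pairs labeled 1, policy pairs labeled 0.
d_expert = disc(expert_s, expert_a)
d_policy = disc(policy_s, policy_a)
loss = bce(d_expert, torch.ones_like(d_expert)) + bce(d_policy, torch.zeros_like(d_policy))
opt.zero_grad()
loss.backward()
opt.step()

# Learned reward for the policy optimizer (e.g., a PPO-style generator):
# high when the discriminator mistakes policy behavior for expert behavior.
with torch.no_grad():
    learned_reward = -torch.log(1.0 - disc(policy_s, policy_a) + 1e-8)
```

In practice the discriminator and policy updates alternate, so the policy is trained purely from this adversarial signal rather than from manually specified rewards, which is the property the abstract contrasts against the reward-engineered PPO baseline.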
