Multi-Agent Reinforcement Learning for Real-Time Dynamic Production Scheduling in a Robot Assembly Cell

Dazzle Johnson,Yuqian Lu,Gang Chen

doi:10.1109/lra.2022.3184795

Abstract

As industry rapidly shifts towards mass personalisation, the need for a decentralised multi-agent system capable of dynamic flexible job shop scheduling (FJSP) is evident. Traditional heuristic and meta-heuristic scheduling methods cannot achieve satisfactory results and have limited application to static environments. Recent Reinforcement Learning (RL) approaches that consider dynamic FJSP, lack flexibility and autonomy as they use a single-agent centralised model, assuming global observability. As such, we propose a Multi-Agent Reinforcement Learning (MARL) system for scheduling dynamically arriving assembly jobs in a robot assembly cell. We applied a Double DQN-based algorithm and proposed a generalised observation, action and reward design for the dynamic FJSP setting. Using a centralised training phase, each agent (i.e., robot) in the assembly cell executes decentralised scheduling decisions based on local observations. Our solution demonstrated improved performance against rule-based heuristic methods, for optimising makespan. We also reported the impact of different observation sizes of each agent on optimisation performance.

Full Text