Abstract

Self-organizing systems offer flexibility and robustness for tasks whose conditions may change over time. Various methods, e.g., task fields and social fields, have been proposed to capture the complexity of task environments so that individual agents can remain simple. To extend to complex task domains, the multiagent reinforcement learning (MARL) approach has been taken to train agent teams to be more capable and intelligent, permitting reduced complexity in task descriptions. MARL depends on the design of reward functions, which has been a challenging endeavour thus far. This paper investigates the impact of reward shaping in the context of an “L-shape” assembly task that involves collision avoidance. After introducing a general form of reward shaping function, various types of reward shaping fields are studied empirically with agent teams of different sizes. The experimental results show that reward shaping can be highly effective, and that the singularities, the proper forms of the fields, and suitable shaping-field gradients are essential for successful agent team training. Furthermore, the effect of reward shaping functions depends strongly on the size of the agent team.
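To make the idea concrete, the sketch below illustrates one common family of reward shaping functions, potential-based shaping (Ng et al., 1999), with a potential that combines an attractive task field toward an assembly target and a repulsive social field around nearby agents. This is a minimal illustration only; the field names, weights, and functional forms here are assumptions for exposition and are not the paper's specific formulation.

```python
import math

# Illustrative potential-based reward shaping:
#   F(s, s') = gamma * phi(s') - phi(s)
# phi mixes an attractive "task field" (distance to target) with a
# repulsive "social field" (proximity to other agents). All constants
# and field forms below are hypothetical choices for this sketch.

GAMMA = 0.99          # discount factor
W_TASK = 1.0          # weight of the attractive task field
W_SOCIAL = 0.5        # weight of the repulsive social field
SAFE_RADIUS = 1.0     # distance below which proximity is penalized


def potential(agent_pos, target_pos, other_positions):
    """Potential of a state: higher when the agent is closer to its
    target and farther from other agents."""
    task = -W_TASK * math.dist(agent_pos, target_pos)
    social = 0.0
    for other in other_positions:
        d = math.dist(agent_pos, other)
        if d < SAFE_RADIUS:
            social -= W_SOCIAL * (SAFE_RADIUS - d)
    return task + social


def shaping_reward(state, next_state):
    """Shaping term added to the environment reward.
    Each state is a tuple (agent_pos, target_pos, other_positions)."""
    return GAMMA * potential(*next_state) - potential(*state)


# Example: an agent moves one step toward its target while staying
# clear of a teammate; the shaping reward is positive.
s = ((0.0, 0.0), (5.0, 0.0), [(0.0, 3.0)])
s_next = ((1.0, 0.0), (5.0, 0.0), [(0.0, 3.0)])
print(shaping_reward(s, s_next))
```

Potential-based shaping of this kind is often chosen because it provably preserves the optimal policy of the underlying task, which makes it a natural baseline when studying shaping-field forms and gradients.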
