Multi-agent deep reinforcement learning (MADRL) has shown remarkable advancements in the past decade. However, most current MADRL models focus on task-specific, short-horizon problems involving a small number of agents, limiting their applicability to long-horizon planning in complex environments. Hierarchical multi-agent models offer a promising solution by organizing agents into different levels, effectively addressing tasks with varying planning horizons. However, these models are often constrained in the number of agents or hierarchy levels they can support. This paper introduces HiSOMA, a novel hierarchical multi-agent model designed to handle long-horizon, multi-agent, multi-task decision-making problems. The top-level controller, FALCON, is built on a class of self-organizing neural networks (SONNs) and learns high-level decision rules as internal cognitive codes to modulate middle-level controllers in a fast, incremental manner. The middle-level controllers, MADRL models, in turn receive modulatory signals from the top level and regulate bottom-level controllers, which learn individual action policies that generate primitive actions and interact directly with the environment. Extensive experiments across different levels of the hierarchical model demonstrate HiSOMA's efficiency in tackling challenging long-horizon problems, surpassing a number of non-hierarchical MADRL approaches. Moreover, its modular design allows for extension into deeper hierarchies and application to more complex tasks with heterogeneous controllers. Demonstration videos and code can be found on our project web page: https://smu-ncc.github.io.
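To make the three-level control flow concrete, the sketch below illustrates how a top-level code could modulate middle-level controllers, which in turn issue sub-goals to bottom-level policies emitting primitive actions. This is a minimal illustration under stated assumptions, not the authors' implementation: all class names, method names, and decision rules here are hypothetical placeholders, and the real FALCON and MADRL components learn these mappings rather than hard-coding them.

```python
# Hypothetical sketch of HiSOMA's three-level control flow as described in the
# abstract. All names and decision rules are illustrative placeholders, not
# the authors' actual architecture.

import random
from typing import List


class TopLevelController:
    """Stand-in for the FALCON-style controller: maps a high-level task
    observation to a discrete internal code that modulates the middle level."""

    def __init__(self, num_codes: int):
        self.num_codes = num_codes

    def select_code(self, task_obs: List[float]) -> int:
        # Placeholder decision rule; a real FALCON network would learn such
        # code selections incrementally from experience.
        return hash(tuple(round(x, 1) for x in task_obs)) % self.num_codes


class MiddleLevelController:
    """Stand-in for a MADRL controller: conditions on the top-level code and
    produces one sub-goal per bottom-level agent."""

    def sub_goals(self, code: int, joint_obs: List[List[float]]) -> List[int]:
        # Placeholder: derive a sub-goal per agent from the modulatory code.
        return [(code + i) % 4 for i, _ in enumerate(joint_obs)]


class BottomLevelPolicy:
    """Stand-in for an individual action policy emitting primitive actions."""

    def act(self, obs: List[float], sub_goal: int) -> int:
        # Placeholder: a primitive action biased by the assigned sub-goal.
        return (sub_goal + random.randrange(2)) % 5


def step_hierarchy(task_obs: List[float],
                   joint_obs: List[List[float]]) -> List[int]:
    """One top-down pass: code -> sub-goals -> primitive joint action."""
    top = TopLevelController(num_codes=8)
    middle = MiddleLevelController()
    agents = [BottomLevelPolicy() for _ in joint_obs]

    code = top.select_code(task_obs)           # high-level internal code
    goals = middle.sub_goals(code, joint_obs)  # modulated sub-goals
    return [pi.act(o, g) for pi, o, g in zip(agents, joint_obs, goals)]


if __name__ == "__main__":
    actions = step_hierarchy(task_obs=[0.3, 0.7], joint_obs=[[0.1], [0.9]])
    print("primitive joint action:", actions)
```

One design point this sketch makes visible is the modularity claimed in the abstract: because each level only exchanges a narrow signal (a code downward, sub-goals downward), any level could in principle be swapped for a heterogeneous controller or stacked into a deeper hierarchy without changing the others.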