Abstract

The ability to learn compositional strategies in multi-task learning and to apply them appropriately is crucial to the development of artificial intelligence. However, several challenges remain: (i) how to maintain the independence of modules while each learns its own sub-task; (ii) how to avoid performance degradation in situations where the modules' reward scales are incompatible; (iii) how to find the optimal composite policy for the entire set of tasks. In this paper, we introduce a Modular Reinforcement Learning (MRL) framework that coordinates both the competition and the cooperation between separate modules. Furthermore, a selective update mechanism enables the learning system to align the incomparable reward scales of different modules. Moreover, the learning system follows a “joint policy” that combines the modules' action preferences with their responsibility for the current task. We evaluate the effectiveness of our approach on a classic food-gathering and predator-avoidance task. Results show that our approach outperforms previous MRL methods in learning separate strategies for sub-tasks, is robust to modules with incomparable reward scales, and maintains the independence of learning within each module.

Keywords: Multi-task learning; Modular reinforcement learning; Incomparable reward scale; Compositionality policy; Model-based reinforcement learning
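To make the “joint policy” idea concrete, the following is a minimal illustrative sketch of how per-module action preferences might be combined with responsibility weights in a modular RL agent. The module names, the softmax normalization (used here so that modules with incomparable reward scales contribute on a common footing), and the fixed responsibility weights are assumptions for illustration only, not the paper's actual algorithm.

    # Minimal sketch of a modular-RL "joint policy": each module scores actions
    # for its own sub-task, and a responsibility weight per module determines
    # how strongly its preferences count toward the final decision.
    # All names and the combination rule below are illustrative assumptions.
    import numpy as np

    def joint_policy(module_q_values, responsibilities, temperature=1.0):
        # module_q_values: dict of module name -> array of Q-values per action
        # responsibilities: dict of module name -> scalar weight in [0, 1]
        n_actions = len(next(iter(module_q_values.values())))
        combined = np.zeros(n_actions)
        for name, q_values in module_q_values.items():
            # Convert each module's Q-values to a preference distribution so
            # that differing reward scales do not dominate the combination.
            prefs = np.exp(np.asarray(q_values, dtype=float) / temperature)
            prefs /= prefs.sum()
            combined += responsibilities[name] * prefs
        return int(np.argmax(combined))

    # Example: two modules (food gathering, predator avoidance) over 4 actions.
    q = {"food": np.array([1.0, 0.2, 0.5, 0.1]),
         "avoid": np.array([-5.0, 2.0, 0.0, 1.0])}
    w = {"food": 0.3, "avoid": 0.7}   # predator nearby: avoidance is more responsible
    print(joint_policy(q, w))         # index of the jointly preferred action

In this sketch, raising a module's responsibility weight shifts the joint decision toward that module's preferred action, which mirrors the abstract's description of combining action preferences with per-module responsibility for the current task.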
