Abstract
Learning Classifier Systems (LCSs) have been widely used to tackle Reinforcement Learning (RL) problems because they generalize well and produce simple, understandable rule-based solutions. XCS, the accuracy-based LCS, is the variant most widely used for single-objective RL problems. As many real-world problems exhibit multiple conflicting objectives, recent work has sought to adapt XCS to Multi-Objective Reinforcement Learning (MORL) tasks. However, many of these algorithms require large amounts of storage or cannot discover the Pareto-optimal solutions, owing to the complexity of finding multi-step policies that lead to multiple possible objectives. This paper employs a decomposition strategy based on MOEA/D within XCS to approximate complex Pareto fronts, resulting in a new MORL algorithm built on XCS and MOEA/D. The experimental results show that, on complex bi-objective maze problems, our MORL algorithm is able to learn a set of Pareto-optimal solutions without requiring large storage. Analysis of the learned policies shows successful trade-offs between the distance to a reward and the amount of that reward.
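To illustrate what decomposition means in a bi-objective setting, the following is a minimal sketch and not the paper's algorithm. It assumes a weighted-sum scalarization, which is one of several options commonly used with MOEA/D; the abstract does not say which one is used. The function names (`uniform_weights`, `weighted_sum`, `pareto_front`) and the candidate payoff values are hypothetical and chosen only for illustration. Each weight vector defines one scalar subproblem, and solving the subproblems separately recovers different points on the Pareto front.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def uniform_weights(n: int) -> List[Point]:
    """Return n evenly spaced bi-objective weight vectors (w1, w2), w1 + w2 = 1."""
    if n < 2:
        raise ValueError("need at least two weight vectors")
    return [(i / (n - 1), 1 - i / (n - 1)) for i in range(n)]


def weighted_sum(payoff: Sequence[float], weights: Sequence[float]) -> float:
    """Scalarize a payoff vector (assumed scalarization, both objectives maximized)."""
    return sum(w * p for w, p in zip(weights, payoff))


def pareto_front(points: Sequence[Point]) -> List[Point]:
    """Keep only the non-dominated points, assuming maximization."""
    def dominates(a: Point, b: Point) -> bool:
        return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

    return [p for p in points if not any(dominates(q, p) for q in points if q != p)]


# Hypothetical payoffs of four candidate policies:
# (closeness of the reward, amount of the reward), both to be maximized.
candidates: List[Point] = [(0.9, 0.1), (0.6, 0.6), (0.1, 0.9), (0.4, 0.4)]

# One scalar subproblem per weight vector; each picks its own best candidate.
best_per_weight = {
    w: max(candidates, key=lambda c: weighted_sum(c, w))
    for w in uniform_weights(3)
}
front = pareto_front(candidates)
```

In this toy example the three subproblems select three different candidates, and together they match the non-dominated set, while the dominated candidate `(0.4, 0.4)` is never selected. A weighted sum can only reach points on the convex part of a front, which is one reason other scalarizations are often preferred on complex fronts.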