Abstract
When applying Reinforcement Learning (RL) to the real-world visual tasks, two major challenges necessitate consideration: sample inefficiency and limited generalization. To address the above two challenges, previous works focus primarily on learning semantic information from the visual state for improving sample efficiency, but they do not explicitly learn other valuable aspects, such as spatial information. Moreover, they improve generalization by learning representations that are invariant to alterations of task-irrelevant variables, without considering task-relevant variables. To enhance sample efficiency and generalization of the base RL algorithm in visual tasks, we propose an auxiliary task called Recovering Permuted Sequential Features (RPSF). Our method enhances generalization by learning the spatial structure information of the agent, which can mitigate the effects of changes in both task-relevant and task-irrelevant variables. Moreover, it explicitly learns both semantic and spatial information from the visual state by disordering and subsequently recovering a sequence of features to generate more holistic representations, thereby improving sample efficiency. Extensive experiments demonstrate that our method significantly improves the sample efficiency and generalization of the base RL algorithm and outperforms various state-of-the-art baselines across diverse tasks in unseen environments. Furthermore, our method exhibits compatibility with both CNN and Transformer architectures.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
More From: Neural networks : the official journal of the International Neural Network Society
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.