Abstract

Deep reinforcement learning (deep RL) has been successfully applied to a variety of decision-making problems in recent years. However, the observations in many real-world tasks are high-dimensional and contain a large amount of task-irrelevant information, which limits the applicability of RL algorithms. To tackle this problem, we propose LCER, a representation learning method that provides RL algorithms with compact and sufficient descriptions of the original observations. Specifically, LCER trains representations to retain the controllable elements of the environment, which reflect the action-related environment dynamics and are therefore likely to be task-relevant. We demonstrate the strength of LCER on the DMControl Suite, where it achieves state-of-the-art performance. LCER enables pixel-based SAC to outperform state-based SAC on the DMControl 100K benchmark, indicating that the learned representations can match the oracle descriptions (i.e., the physical states) of the environment. We also carry out experiments showing that LCER can efficiently filter out various distractions, especially when those distractions are not controllable.
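To make the idea of "retaining controllable elements" concrete, the sketch below shows one generic way such an auxiliary objective can be set up: an inverse-dynamics head that predicts the action connecting two consecutive latent observations, so the encoder is pushed to keep action-related information. The abstract does not specify LCER's actual objective, so the network shapes, names, and the inverse-dynamics formulation here are illustrative assumptions, not the paper's implementation.

```python
# Hedged sketch (PyTorch): an inverse-dynamics auxiliary loss that biases a
# pixel encoder toward action-related ("controllable") features. This is a
# generic illustration, not LCER's published training objective.
import torch
import torch.nn as nn


class PixelEncoder(nn.Module):
    """Maps stacked image observations to a compact latent vector."""
    def __init__(self, in_channels=9, latent_dim=50):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_channels, 32, 3, stride=2), nn.ReLU(),
            nn.Conv2d(32, 32, 3, stride=2), nn.ReLU(),
            nn.Conv2d(32, 32, 3, stride=2), nn.ReLU(),
            nn.Flatten(),
        )
        self.fc = nn.LazyLinear(latent_dim)  # infers the conv output size on first call

    def forward(self, obs):
        return self.fc(self.conv(obs))


class InverseDynamicsHead(nn.Module):
    """Predicts the action that connects two consecutive latent states."""
    def __init__(self, latent_dim=50, action_dim=6):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * latent_dim, 256), nn.ReLU(),
            nn.Linear(256, action_dim),
        )

    def forward(self, z_t, z_tp1):
        return self.net(torch.cat([z_t, z_tp1], dim=-1))


def controllable_representation_loss(encoder, head, obs_t, action_t, obs_tp1):
    """Auxiliary loss: latents must retain enough information to recover the action."""
    z_t, z_tp1 = encoder(obs_t), encoder(obs_tp1)
    pred_action = head(z_t, z_tp1)
    return nn.functional.mse_loss(pred_action, action_t)


# Usage on a dummy batch (84x84 frame stacks, 6-dim continuous actions);
# in practice this loss would be optimized jointly with the RL (e.g. SAC) losses.
encoder, head = PixelEncoder(), InverseDynamicsHead()
obs_t = torch.randn(8, 9, 84, 84)
obs_tp1 = torch.randn(8, 9, 84, 84)
action_t = torch.randn(8, 6)
loss = controllable_representation_loss(encoder, head, obs_t, action_t, obs_tp1)
loss.backward()
```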
