Abstract

The discrepancy between training and testing environments poses a major challenge to the generalization of reinforcement learning (RL) algorithms. We propose Soft Contrastive learning with a coarser approximate Q-irrelevance abstraction for Reinforcement Learning (SCQRL) to improve RL generalization. Specifically, we adopt the coarser approximate Q-irrelevance abstraction as the state representation and provide a theoretical analysis of its benefit for generalization. We construct a Q-value-based positive and negative sample selection mechanism for contrastive learning to achieve efficient representation learning. To account for errors in selecting positive and negative samples, we design a soft contrastive loss and combine it with reinforcement learning as an auxiliary task, yielding SCQRL. Generalization experiments on several Procgen environments demonstrate that SCQRL outperforms strong generalization-oriented RL baselines.
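The abstract does not give the exact form of the soft contrastive objective. The sketch below is one plausible instantiation, not the authors' implementation: it treats state pairs with similar Q values as soft positives and uses soft (non-binary) targets so that mislabeled pairs contribute less to the loss. The function name, the exponential soft-label form, and the temperature and q_scale hyperparameters are all assumptions for illustration.

import torch
import torch.nn.functional as F

def soft_contrastive_loss(features, q_values, temperature=0.1, q_scale=1.0):
    """Hypothetical soft contrastive loss over a batch of state features.

    Pairs of states with close Q values act as (soft) positives; pairs with
    distant Q values act as negatives. Soft targets derived from the Q-value
    gap replace hard 0/1 labels, dampening the effect of selection errors.
    """
    n = features.size(0)
    z = F.normalize(features, dim=1)          # unit-norm embeddings
    logits = z @ z.t() / temperature          # pairwise cosine similarities

    # Soft targets: small Q gap -> target near 1, large gap -> near 0.
    # The exponential form is an assumption, not from the paper.
    q_gap = (q_values.unsqueeze(0) - q_values.unsqueeze(1)).abs()
    targets = torch.exp(-q_scale * q_gap)

    # Exclude self-pairs and normalize targets into a distribution per row.
    mask = ~torch.eye(n, dtype=torch.bool, device=features.device)
    targets = targets * mask
    targets = targets / targets.sum(dim=1, keepdim=True).clamp_min(1e-8)

    # Cross-entropy between soft targets and the similarity distribution.
    log_prob = F.log_softmax(logits.masked_fill(~mask, float('-inf')), dim=1)
    return -(targets * log_prob).sum(dim=1).mean()

In the auxiliary-task setup the abstract describes, a loss of this kind would be added to the RL objective (e.g., total_loss = rl_loss + alpha * soft_contrastive_loss(...)), with the weighting coefficient alpha again an assumed hyperparameter.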
