Abstract

The discrepancy between training and testing environments poses a major challenge to the generalization of reinforcement learning (RL) algorithms. We propose Soft Contrastive learning with a coarser approximate Q-irrelevance abstraction for Reinforcement Learning (SCQRL) to improve RL generalization. Specifically, we adopt the coarser approximate Q-irrelevance abstraction as the state representation and provide a theoretical analysis of its benefit for generalization. We construct a Q-value-based positive and negative sample selection mechanism for contrastive learning to achieve efficient representation learning. To account for errors in selecting positive and negative samples, we design a soft contrastive loss and combine it with reinforcement learning as an auxiliary task, yielding SCQRL. Generalization experiments on several Procgen environments demonstrate that SCQRL outperforms strong generalization-oriented RL baselines.
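The abstract does not give the exact form of the soft contrastive objective. The sketch below is one plausible instantiation, not the authors' implementation: it treats state pairs with similar Q values as soft positives and uses soft (non-binary) targets so that mislabeled pairs contribute less to the loss. The function name, the exponential soft-label form, and the temperature and q_scale hyperparameters are all assumptions for illustration.

import torch
import torch.nn.functional as F

def soft_contrastive_loss(features, q_values, temperature=0.1, q_scale=1.0):
    """Hypothetical soft contrastive loss over a batch of state features.

    Pairs of states with close Q values act as (soft) positives; pairs with
    distant Q values act as negatives. Soft targets derived from the Q-value
    gap replace hard 0/1 labels, dampening the effect of selection errors.
    """
    n = features.size(0)
    z = F.normalize(features, dim=1)          # unit-norm embeddings
    logits = z @ z.t() / temperature          # pairwise cosine similarities

    # Soft targets: small Q gap -> target near 1, large gap -> near 0.
    # The exponential form is an assumption, not from the paper.
    q_gap = (q_values.unsqueeze(0) - q_values.unsqueeze(1)).abs()
    targets = torch.exp(-q_scale * q_gap)

    # Exclude self-pairs and normalize targets into a distribution per row.
    mask = ~torch.eye(n, dtype=torch.bool, device=features.device)
    targets = targets * mask
    targets = targets / targets.sum(dim=1, keepdim=True).clamp_min(1e-8)

    # Cross-entropy between soft targets and the similarity distribution.
    log_prob = F.log_softmax(logits.masked_fill(~mask, float('-inf')), dim=1)
    return -(targets * log_prob).sum(dim=1).mean()

In the auxiliary-task setup the abstract describes, a loss of this kind would be added to the RL objective (e.g., total_loss = rl_loss + alpha * soft_contrastive_loss(...)), with the weighting coefficient alpha again an assumed hyperparameter.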
