Structural Parameter Space Exploration for Reinforcement Learning via a Matrix Variate Distribution

Rui Yang, Zhen Kan, Bin Li, Shaochen Wang

doi:10.1109/tetci.2022.3140380

Abstract

The trade-off between exploration and exploitation is essential for reinforcement learning, where an agent needs to know when to explore for high-reward policies and when to exploit the best policy found so far. Parameter space exploration provides an elegant solution. As one of the principal methods, injecting noise into the model parameters greatly improves exploration. However, flattening the parameters of the neural network into a vector and generating noise for this vector ignores the structural information of the model. In this paper, we aim to incorporate spatial information into weight matrices and propose matrix-variate noise exploration, which exploits the structural weight uncertainty brought by matrix-variate noise to enhance the stochasticity of the agent. We also construct a bridge between matrix-variate noise exploration and probabilistic neural networks, which theoretically explains the improved performance of parameter space exploration. Extensive experiments show that matrix-variate noise exploration outperforms fully factorized noisy exploration on most Atari tasks and Super Mario Bros tasks and is competitive with state-of-the-art methods.
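To make the contrast between fully factorized and matrix-variate noise concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes the noise follows a matrix-variate Gaussian with a row covariance and a column covariance, which is one common choice of matrix-variate distribution; the abstract does not specify the exact form. The function names (`sample_factorized`, `sample_matrix_normal`) and the covariance values are hypothetical and chosen only for illustration.

```python
import numpy as np


def sample_factorized(mean, sigma, rng):
    """Fully factorized noise: every weight is perturbed independently."""
    return mean + sigma * rng.standard_normal(mean.shape)


def sample_matrix_normal(mean, row_cov, col_cov, rng):
    """Matrix-variate Gaussian noise (assumed form, for illustration).

    With row_cov = A A^T and col_cov = B B^T, the sample M + A E B^T,
    where E has i.i.d. standard normal entries, satisfies
    Cov(W[i, j], W[k, l]) = row_cov[i, k] * col_cov[j, l].
    """
    a = np.linalg.cholesky(row_cov)  # lower-triangular factor of the row covariance
    b = np.linalg.cholesky(col_cov)  # lower-triangular factor of the column covariance
    e = rng.standard_normal(mean.shape)
    return mean + a @ e @ b.T


rng = np.random.default_rng(0)
n_out, n_in = 3, 2
mean = np.zeros((n_out, n_in))

# Hypothetical covariances: output units 0 and 1 are strongly correlated.
row_cov = np.array([[1.0, 0.8, 0.0],
                    [0.8, 1.0, 0.0],
                    [0.0, 0.0, 1.0]])
col_cov = 0.25 * np.eye(n_in)

structured = np.stack(
    [sample_matrix_normal(mean, row_cov, col_cov, rng) for _ in range(20000)]
)
factorized = np.stack(
    [sample_factorized(mean, 0.5, rng) for _ in range(20000)]
)

# Empirical correlation between weights W[0, 0] and W[1, 0] under each scheme.
corr_structured = np.corrcoef(structured[:, 0, 0], structured[:, 1, 0])[0, 1]
corr_factorized = np.corrcoef(factorized[:, 0, 0], factorized[:, 1, 0])[0, 1]
```

Under these assumptions, the structured samples carry the row correlation into the weights, while the factorized samples leave the same pair of weights approximately uncorrelated. This is the structural information that is lost when the weight matrix is flattened into a vector and perturbed elementwise.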

Full Text