Abstract

Reinforcement learning faces the challenge of sparse rewards. Existing research uses reward shaping based on graph convolutional networks (GCNs) to address this challenge; however, automatically constructing an optimal graph has been a long-standing issue. Here we propose Graph Convolution with Topology Refinement for Automatic Reinforcement Learning (GTR), which constructs a new latent graph to replace the original input graph for more effective reward shaping. We find that the most suitable state nodes can be extracted through graph entropy. We then adaptively map the original graph onto this subset of nodes to form a new, more compact latent graph. Because GTR uses trainable projection vectors to project all node features into a one-dimensional representation, the inter-connections between the nodes of the newly constructed latent graph remain consistent with those of the original graph. The proposed GTR is mathematically grounded, and preliminary experiments show that it achieves considerable improvement on the Atari and MuJoCo benchmarks. Further experiments and ablation analyses lend additional support to this work.
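The projection step described above resembles top-k graph pooling: a trainable vector scores every node in one dimension, the highest-scoring nodes are kept, and the adjacency matrix is restricted to the kept subset so the latent graph inherits the original connectivity. Below is a minimal PyTorch sketch of such a mechanism under those assumptions; the class name, the sigmoid gating, and the fixed budget `k` are illustrative choices, not details taken from the paper.

```python
import torch
import torch.nn as nn

class LatentGraphProjection(nn.Module):
    """Sketch: score nodes with a trainable projection vector, keep the
    top-k, and restrict the adjacency to the kept nodes so the latent
    graph's edges stay consistent with the original graph's.
    (Hypothetical illustration, not the paper's exact GTR module.)"""

    def __init__(self, in_dim: int, k: int):
        super().__init__()
        self.p = nn.Parameter(torch.randn(in_dim))  # trainable projection vector
        self.k = k  # size of the latent graph (assumed fixed budget)

    def forward(self, x: torch.Tensor, adj: torch.Tensor):
        # x: (N, in_dim) node features; adj: (N, N) adjacency matrix.
        scores = x @ self.p / self.p.norm()          # 1-D representation per node
        idx = torch.topk(scores, min(self.k, x.size(0))).indices
        # Gate kept features by their squashed scores so `p` receives
        # gradients through the otherwise non-differentiable top-k selection.
        x_new = x[idx] * torch.sigmoid(scores[idx]).unsqueeze(-1)
        adj_new = adj[idx][:, idx]                   # connectivity inherited from the input graph
        return x_new, adj_new, idx

# Usage: compress a 10-node graph to a 4-node latent graph.
pool = LatentGraphProjection(in_dim=16, k=4)
x, adj = torch.randn(10, 16), (torch.rand(10, 10) > 0.7).float()
x_lat, adj_lat, kept = pool(x, adj)
print(x_lat.shape, adj_lat.shape)  # torch.Size([4, 16]) torch.Size([4, 4])
```

The sigmoid gating is one common way to keep the projection vector trainable despite the hard top-k cut; the node-scoring criterion itself (here a plain inner product) stands in for whatever entropy-based selection the paper actually uses.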
