Abstract

Reinforcement learning faces the challenge of sparse rewards. Existing research uses reward shaping based on graph convolutional networks (GCNs) to address this challenge, but the automatic construction of an optimal graph has been a long-standing issue. Here we propose Graph Convolution with Topology Refinement for Automatic Reinforcement Learning (GTR), which constructs a new latent graph to replace the original input graph for more effective reward shaping. This work finds that the most suitable state nodes can be extracted through graph entropy. The original graph is then mapped adaptively onto this subset of nodes to form a new, more compact latent graph. Because GTR uses trainable projection vectors to project all node features into a one-dimensional representation, the inter-connections between the nodes of the newly constructed latent graph remain consistent with the original ones. The proposed GTR rests on mathematical grounds, and preliminary experiments show that it yields considerable improvements on the Atari and MuJoCo benchmarks. Further experiments and ablation analyses lend additional support to this work.
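The abstract's mechanism of projecting node features onto a trainable vector to obtain one-dimensional scores, keeping a compact subset of nodes, and inheriting their edges from the original graph resembles top-k graph pooling. Below is a minimal sketch of that step, assuming a PyTorch-style implementation; the class name `TopKNodeSelection`, the `ratio` parameter, and the sigmoid gating are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn

class TopKNodeSelection(nn.Module):
    """Select a compact node subset via a trainable projection vector.

    Hypothetical illustration of the latent-graph construction described
    in the abstract, not the authors' actual implementation.
    """
    def __init__(self, in_features, ratio=0.5):
        super().__init__()
        self.p = nn.Parameter(torch.randn(in_features))  # trainable projection vector
        self.ratio = ratio

    def forward(self, x, adj):
        # x: (N, F) node feature matrix; adj: (N, N) adjacency matrix
        scores = (x @ self.p) / self.p.norm()            # one-dimensional score per node
        k = max(1, int(self.ratio * x.size(0)))
        _, idx = scores.topk(k)                          # keep the most suitable nodes
        x_latent = x[idx] * torch.sigmoid(scores[idx]).unsqueeze(-1)  # gate kept features
        adj_latent = adj[idx][:, idx]                    # induced edges of the latent graph
        return x_latent, adj_latent, idx

# Usage example on random data:
sel = TopKNodeSelection(in_features=16, ratio=0.5)
x, adj = torch.randn(10, 16), (torch.rand(10, 10) > 0.7).float()
x_latent, adj_latent, idx = sel(x, adj)
```

Because the latent adjacency is the subgraph induced by the selected nodes, the inter-connections of the new graph are by construction consistent with the original ones, which matches the consistency claim in the abstract.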
