Abstract

Reward shaping using GCNs is a popular research area in reinforcement learning. However, it is difficult to shape potential functions for complicated tasks. In this paper, we develop Reward Shaping with Hierarchical Graph Topology (HGT). HGT propagates reward information through a message-passing mechanism, and the propagated values serve as potential functions for reward shaping. We describe reinforcement learning with a probabilistic graphical model, then generate an underlying graph in which each state is a node and edges represent transition probabilities between states. To shape potential functions effectively for complex environments, HGT divides the underlying graph constructed from states into multiple subgraphs. Because these subgraphs represent multiple logical relationships between states in the Markov decision process, the aggregation process captures rich correlation information between nodes, which makes the propagated messages more powerful. Compared with state-of-the-art RL methods, HGT achieves faster learning rates in experiments on Atari and MuJoCo tasks.
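The two ingredients named above are standard: potential-based reward shaping, F(s, s') = γΦ(s') − Φ(s), and a message-passing step over a state graph whose edges are transition probabilities. A minimal sketch of how these could combine (not the paper's implementation; the adjacency matrix `P`, the per-state feature vector `x`, and the single-step aggregation are illustrative assumptions):

```python
import numpy as np

def message_passing_potentials(P, x):
    """One aggregation step: each state's potential is the
    transition-weighted average of its neighbours' features.
    P[i, j] = transition probability from state i to state j."""
    return P @ x

def shaped_reward(r, phi, s, s_next, gamma=0.99):
    """Potential-based shaping: r + gamma * Phi(s') - Phi(s).
    This additive form is known to preserve optimal policies."""
    return r + gamma * phi[s_next] - phi[s]

# Tiny 3-state example with a row-stochastic transition matrix
# and a raw per-state reward signal as the node feature.
P = np.array([[0.0, 1.0, 0.0],
              [0.5, 0.0, 0.5],
              [0.0, 0.0, 1.0]])
x = np.array([0.0, 0.0, 1.0])

phi = message_passing_potentials(P, x)   # [0.0, 0.5, 1.0]
print(shaped_reward(1.0, phi, s=0, s_next=1, gamma=0.9))
```

In HGT as described, the potentials would instead be aggregated hierarchically over multiple subgraphs rather than over the full graph in one step.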
