Abstract

The Value Iteration Network (VIN) is a neural network widely used for path-finding problems in reinforcement learning (RL). The planning module in VIN enables the network to capture the structure of a problem, which gives the network an impressive generalization ability. However, RL with VIN cannot guarantee efficient training because of the network's depth and its max-pooling operation. Greater depth makes it harder for the network to learn from samples with gradient descent algorithms, and the max-pooling operation may make negative rewards harder to learn because of overestimation. This paper proposes a new neural network, the Value Iteration Residual Network (VIRN) with Self-Attention, which uses a unique spatial self-attention module and aggressive iteration to address these problems. A preliminary evaluation using Mr. Pac-Man demonstrated that VIRN improved training efficiency compared with VIN.
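
For readers unfamiliar with the planning module mentioned above, the following is a minimal sketch of the recurrence it is built around: a per-action value map is computed from the reward and the current value map, then reduced with a max over actions. It is an illustration under simplifying assumptions, not the VIN or VIRN implementation: the transitions are fixed and deterministic (four grid moves) rather than learned convolution kernels, and the function name `value_iteration_module` and its parameters are hypothetical.

```python
import numpy as np


def value_iteration_module(reward, gamma=0.9, iterations=10):
    """Run a fixed number of value-iteration sweeps on a 2D grid.

    Assumption: the agent moves up, down, left, or right and receives
    the reward of the cell it enters. A real VIN learns these
    transitions as convolution weights instead of hard-coding them.
    """
    h, w = reward.shape
    value = np.zeros((h, w), dtype=float)
    for _ in range(iterations):
        # Return obtained by entering each cell: reward plus discounted value.
        target = reward + gamma * value
        # Pad with -inf so that moves leaving the grid are never selected.
        padded = np.pad(target, 1, constant_values=-np.inf)
        # One "channel" per action, each holding the neighbour's target.
        q = np.stack([
            padded[0:h, 1:w + 1],      # move up
            padded[2:h + 2, 1:w + 1],  # move down
            padded[1:h + 1, 0:w],      # move left
            padded[1:h + 1, 2:w + 2],  # move right
        ])
        # Max over the action channel, the step the abstract calls max-pooling.
        value = q.max(axis=0)
    return value
```

Each sweep corresponds to one layer of the planning module, so the number of iterations sets the effective depth. Because the reduction is a max over actions, a negative reward in one neighbouring cell is ignored whenever another move looks better, which is one way to picture the overestimation concern raised in the abstract.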
