Abstract

The traffic assignment problem (TAP) consists of assigning routes to road users so as to minimize traffic congestion. Traditional methods for solving the TAP assume the existence of a central authority that computes routes and dictates them to road users. Multi-agent reinforcement learning (MARL) approaches are more realistic for this kind of problem because they consider that road users (agents) have complete autonomy in choosing their routes. However, MARL approaches usually require a long training period to compute the optimal routes, which can be a major limitation in more realistic traffic scenarios. In this paper, we tackle this problem by evaluating the performance of three conceptually different reward functions: expert-designed rewards, difference rewards, and intrinsically motivated rewards. In particular, our focus lies on providing a deeper understanding of the impact of these reward functions on the agents’ performance, thus contributing towards reducing congestion levels. To this end, we perform an extensive experimental evaluation on different road networks, with up to 360,600 concurrently learning agents. Our results show that, although the adopted reward functions were not able to speed up the learning process, the choice of reward function plays an important role in the quality of the learned solution.
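For intuition, the difference reward mentioned above is commonly defined as D_i = G(z) - G(z_{-i}): the global objective evaluated with agent i's action, minus the objective had agent i been absent. The sketch below is a minimal illustration under assumed conditions, not the paper's implementation: it uses a hypothetical two-route network with a made-up linear latency function, stateless independent Q-learners, and total travel time as the global objective G; all names and parameter values are illustrative.

```python
import random

# Toy two-route bottleneck (hypothetical; not one of the paper's networks):
# travel time on a route grows linearly with the number of agents using it.
def travel_time(load):
    return 1.0 + 0.01 * load

def system_cost(loads):
    # Global objective G: total travel time experienced by all agents.
    return sum(load * travel_time(load) for load in loads)

def difference_reward(loads, route):
    # Standard difference reward D_i = G(z) - G(z_{-i}): system cost with
    # the agent present minus the cost had the agent been absent,
    # negated so that reducing congestion yields a higher reward.
    without = loads.copy()
    without[route] -= 1
    return -(system_cost(loads) - system_cost(without))

N_AGENTS, N_ROUTES = 100, 2            # illustrative sizes, not the paper's
EPISODES, ALPHA, EPSILON = 500, 0.1, 0.05

# Stateless independent Q-learning: one Q-value per route per agent.
q = [[0.0] * N_ROUTES for _ in range(N_AGENTS)]

for _ in range(EPISODES):
    # Each agent picks a route epsilon-greedily from its own Q-values.
    choices = [
        random.randrange(N_ROUTES) if random.random() < EPSILON
        else max(range(N_ROUTES), key=lambda r, i=i: q[i][r])
        for i in range(N_AGENTS)
    ]
    loads = [choices.count(r) for r in range(N_ROUTES)]
    for i, route in enumerate(choices):
        q[i][route] += ALPHA * (difference_reward(loads, route) - q[i][route])

print("final route loads:", loads)
```

With a difference reward, each agent's signal isolates its own marginal contribution to congestion, which tends to be less noisy than rewarding every agent with the raw global cost.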
