Abstract
In reinforcement learning (RL), dealing with non-stationarity is a challenging issue. However, some domains such as traffic optimization are inherently non-stationary. Causes for and effects of this are manifold. In particular, when dealing with traffic signal control, addressing non-stationarity is key, since traffic conditions change over time and as a function of traffic control decisions taken in other parts of a network. In this paper, we analyze the effects that different sources of non-stationarity have on a network of traffic signals, in which each signal is modeled as a learning agent. More precisely, we study both the effects of changing the context in which an agent learns (e.g., a change in the flow rates it experiences) and the effects of reducing the agent's observability of the true environment state. Partial observability may cause distinct states (in which distinct actions are optimal) to be seen as the same by the traffic signal agents. This, in turn, may lead to sub-optimal performance. We show that the lack of suitable sensors to provide a representative observation of the real state seems to affect performance more drastically than changes to the underlying traffic patterns.
Highlights
Controlling traffic signals is one way of dealing with the increasing volume of vehicles that use the existing urban network infrastructure
Our main goal with the following experiments is to quantify the impact of different causes of non-stationarity on the learning process of a reinforcement learning (RL) agent in traffic signal control
We first conduct an experiment where traffic signals use a fixed control policy, a common strategy when the infrastructure lacks sensors and/or actuators. The results of this experiment are discussed in “Traffic Signal Control under Fixed Policies” and are used to emphasize the problem of lacking a policy that can adapt to different contexts; this experiment serves as a baseline for later comparisons (a minimal sketch of such a fixed-time controller follows this list)
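As a rough illustration of this fixed-policy baseline, the sketch below cycles a signal through a predetermined phase plan on a timer, ignoring observed traffic entirely. The phase names, durations, and horizon are hypothetical choices for illustration, not values taken from the paper.

```python
import itertools

# Hypothetical fixed-time plan: (phase name, green duration in seconds).
# A fixed policy does not depend on the observed traffic state at all.
FIXED_PLAN = [("NS_green", 30), ("EW_green", 30)]

def fixed_policy(simulation_horizon=120):
    """Yield (time, phase) pairs for a signal following a fixed cycle."""
    t = 0
    for phase, duration in itertools.cycle(FIXED_PLAN):
        if t >= simulation_horizon:
            break
        yield t, phase
        t += duration

if __name__ == "__main__":
    for t, phase in fixed_policy():
        print(f"t={t:3d}s -> {phase}")
```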
Summary
Controlling traffic signals is one way of dealing with the increasing volume of vehicles that use the existing urban network infrastructure. RL allows an agent to compute a policy mapping states to actions without requiring an explicit model of the environment. In reinforcement learning (Sutton & Barto, 1998), an agent learns how to behave by interacting with an environment, from which it receives a reward signal after each action. The agent uses this feedback to iteratively learn an optimal control policy π, a function that specifies the most appropriate action to take in each state. We can model RL problems as Markov decision processes (MDPs), described by a set of states S, a set of actions A, a reward function R(s, a, s') → ℝ, and a transition function giving the probability of reaching state s' after taking action a in state s. In an infinite-horizon MDP, the cumulative reward received in the future is typically discounted by a factor γ ∈ [0, 1), so that the expected return remains bounded.
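To make the learning loop described above concrete, the following is a minimal tabular Q-learning sketch for a single traffic signal agent. It is an illustration under assumed details rather than the paper's implementation: the state encoding (discretized queue lengths per approach), the two phase actions, the reward (e.g., negative total queue), the hyperparameters, and the `env` interface are all hypothetical.

```python
import random
from collections import defaultdict

# Hypothetical setup: the agent observes binned queue lengths on the
# north-south and east-west approaches. Under partial observability,
# distinct true traffic conditions may map to the same observed state.
ACTIONS = [0, 1]                        # 0 = green for NS, 1 = green for EW (assumed)
ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1  # assumed learning rate, discount, exploration

Q = defaultdict(lambda: [0.0 for _ in ACTIONS])

def choose_action(state):
    """Epsilon-greedy action selection over the tabular Q-values."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    values = Q[state]
    return max(ACTIONS, key=lambda a: values[a])

def q_update(state, action, reward, next_state):
    """One-step Q-learning update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    td_target = reward + GAMMA * max(Q[next_state])
    Q[state][action] += ALPHA * (td_target - Q[state][action])

def train(env, episodes=100):
    """Interaction loop; `env` is a placeholder traffic simulator exposing
    reset() -> state and step(action) -> (next_state, reward, done)."""
    for _ in range(episodes):
        state = env.reset()
        done = False
        while not done:
            action = choose_action(state)
            next_state, reward, done = env.step(action)  # reward: e.g., negative queue length
            q_update(state, action, reward, next_state)
            state = next_state
```

In this sketch, non-stationarity would show up as changes in the flow rates behind `env.step`, while reduced observability would correspond to a coarser state encoding that merges distinct traffic conditions into one table entry.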