In advanced transportation-management systems, variable speed limits are a crucial application. Deep reinforcement learning methods have been shown to have superior performance in many applications, as they are an effective approach to learning environment dynamics for decision-making and control. However, they face two significant difficulties in traffic-control applications: reward engineering with delayed reward and brittle convergence properties with gradient descent. To address these challenges, evolutionary strategies are well suited as a class of black-box optimization techniques inspired by natural evolution. Additionally, the traditional deep reinforcement learning framework struggles to handle the delayed reward setting. This paper proposes a novel approach using covariance matrix adaptation evolution strategy (CMA-ES), a gradient-free global optimization method, to handle the task of multi-lane differential variable speed limit control. The proposed method uses a deep-learning-based method to dynamically learn optimal and distinct speed limits among lanes. The parameters of the neural network are sampled using a multivariate normal distribution, and the dependencies between the variables are represented by a covariance matrix that is optimized dynamically by CMA-ES based on the freeway's throughput. The proposed approach is tested on a freeway with simulated recurrent bottlenecks, and the experimental results show that it outperforms deep reinforcement learning-based approaches, traditional evolutionary search methods, and the no-control scenario. Our proposed method demonstrates a 23% improvement in average travel time and an average of a 4% improvement in CO, HC, and NOx emission.Furthermore, the proposed method produces explainable speed limits and has desirable generalization power.
Read full abstract