In the last decade, substantial progress has been achieved in intelligent traffic control technologies to overcome consistent difficulties of traffic congestion and its adverse effect on smart cities. Edge computing is one such advanced progress facilitating real-time data transmission among vehicles and roadside units to mitigate congestion. An edge computing-based deep reinforcement learning system is demonstrated in this study that appropriately designs a multiobjective reward function for optimizing different objectives. The system seeks to overcome the challenge of evaluating actions with a simple numerical reward. The selection of reward functions has a significant impact on agents' ability to acquire the ideal behavior for managing multiple traffic signals in a large-scale road network. To ascertain effective reward functions, the agent is trained withusing the proximal policy optimization method in several deep neural network models, including the state-of-the-art transformer network. The system is verified using both hypothetical scenarios and real-world traffic maps. The comprehensive simulation outcomes demonstrate the potency of the suggested reward functions.
Read full abstract