Abstract

Traffic engineering (TE) is one of the most important methods of optimizing network performance: it designs forwarding and routing rules to meet the quality-of-service (QoS) requirements of a large volume of traffic flows. End-to-end (E2E) delay is one of the key TE metrics. Optimizing E2E delay, however, is very challenging in large-scale multi-hop networks due to profound network uncertainties and dynamics. This paper proposes a model-free TE framework that adopts multi-agent reinforcement learning for distributed control to minimize the E2E delay. In particular, distributed TE is formulated as a multi-agent extension of the Markov decision process (MA-MDP). To solve this problem, a modular and composable learning framework is proposed, consisting of three interleaving modules: policy evaluation, policy improvement, and policy execution. Each module can be implemented using different algorithms along with their extensions. Simulation results show that the combination of several extensions, such as double learning, expected policy evaluation, and on-policy learning, can provide superior E2E delay performance under high traffic loads.

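The abstract itself gives no code; as a rough illustration of the kind of modular, composable value-learning agent it describes (pluggable policy evaluation, policy improvement, and policy execution, with optional double learning and expected backups), a minimal single-agent tabular sketch is given below. All class, method, and parameter names are hypothetical, the tabular setting is an assumption made for brevity, and the paper's distributed multi-agent formulation is not reproduced here.

```python
# Illustrative sketch only: a modular agent whose evaluation, improvement, and
# execution steps can be swapped, loosely mirroring the composable framework
# described in the abstract. Names and interfaces are hypothetical.
import random
from collections import defaultdict

class ModularAgent:
    def __init__(self, n_actions, alpha=0.1, gamma=0.95, epsilon=0.1,
                 evaluation="expected", double=True):
        self.n_actions = n_actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.evaluation = evaluation    # "max" (Q-learning) or "expected" (Expected SARSA)
        self.double = double            # double learning keeps two value tables
        self.q1 = defaultdict(lambda: [0.0] * n_actions)
        self.q2 = defaultdict(lambda: [0.0] * n_actions)

    # --- policy execution: choose an action from the epsilon-greedy behavior policy ---
    def act(self, state):
        if random.random() < self.epsilon:
            return random.randrange(self.n_actions)
        q = [a + b for a, b in zip(self.q1[state], self.q2[state])]
        return q.index(max(q))

    # --- policy evaluation + improvement: compute a bootstrapped target and
    # --- move the updated table's estimate toward it ---
    def update(self, state, action, reward, next_state, done):
        # With double learning, randomly pick which table to update; the other
        # table evaluates the selected next action to reduce maximization bias.
        if self.double and random.random() < 0.5:
            q_upd, q_eval = self.q2, self.q1
        else:
            q_upd, q_eval = self.q1, (self.q2 if self.double else self.q1)

        if done:
            target = reward
        else:
            q_sel = q_upd[next_state]
            if self.evaluation == "max":
                # Q-learning-style backup: select with one table, evaluate with the other.
                a_star = q_sel.index(max(q_sel))
                backup = q_eval[next_state][a_star]
            else:
                # Expected backup over the epsilon-greedy policy (Expected SARSA style).
                greedy = q_sel.index(max(q_sel))
                probs = [self.epsilon / self.n_actions] * self.n_actions
                probs[greedy] += 1.0 - self.epsilon
                backup = sum(p * q for p, q in zip(probs, q_eval[next_state]))
            target = reward + self.gamma * backup

        q_upd[state][action] += self.alpha * (target - q_upd[state][action])
```

In this sketch, choosing `evaluation="expected"` together with `double=True` corresponds, loosely, to combining the expected-evaluation and double-learning extensions mentioned in the abstract; either piece can be swapped independently, which is the point of the modular design.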