Abstract

Network Function Virtualization (NFV) decouples network functions from the underlying specialized devices, enabling network processing with higher flexibility and resource efficiency. This promotes the use of virtual network functions (VNFs), which can be grouped to form a service function chain (SFC). A critical challenge in NFV is SFC partitioning (SFCP), which is mathematically expressed as a graph-to-graph mapping problem. Given its NP-hardness, SFCP is commonly solved by approximation methods. Yet, the relevant literature exhibits a gradual shift towards data-driven SFCP frameworks, such as (deep) reinforcement learning (RL). In this article, we first identify crucial limitations of existing RL-based SFCP approaches. In particular, we argue that most of them stem from the centralized implementation of RL schemes. Therefore, we devise a cooperative deep multi-agent reinforcement learning (DMARL) scheme for decentralized SFCP, which fosters efficient communication between neighboring agents. Our simulation results (i) demonstrate that DMARL outperforms a state-of-the-art centralized double deep Q-learning algorithm, (ii) unfold the fundamental behaviors learned by the team of agents, (iii) highlight the importance of information exchange between agents, and (iv) showcase the impact of different network topologies on DMARL efficiency.
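
To make the decentralized setting concrete, the toy sketch below illustrates the general idea of cooperative multi-agent SFC partitioning: each substrate node runs its own lightweight agent, neighboring agents exchange a small message (here, simply their residual capacity) before acting, and each agent locally decides whether to host the next VNF of the chain. This is an illustrative sketch only, not the paper's DMARL algorithm: the paper uses deep (double Q-learning) value networks and a learned communication scheme, whereas this toy uses a tabular, bandit-style update, and all names (NodeAgent, residual_cpu, place_chain, etc.) are hypothetical.

```python
# Toy sketch of decentralized SFC partitioning with communicating per-node agents.
# Illustrative only; names and reward values are hypothetical, not from the paper.
import random
from dataclasses import dataclass, field


@dataclass
class NodeAgent:
    node_id: int
    cpu_capacity: float
    neighbors: list                 # ids of adjacent substrate nodes
    q: dict = field(default_factory=dict)   # (state, action) -> estimated value
    residual_cpu: float = 0.0

    def __post_init__(self):
        self.residual_cpu = self.cpu_capacity

    def observe(self, vnf_demand, messages):
        # Local observation: own residual capacity, the VNF demand, and the
        # best residual capacity advertised by a neighboring agent.
        best_neighbor = max(messages.values(), default=0.0)
        return (round(self.residual_cpu, 1), round(vnf_demand, 1),
                round(best_neighbor, 1))

    def act(self, state, epsilon=0.1):
        # Action 1 = "host the VNF locally", 0 = "defer to a neighbor".
        if random.random() < epsilon:
            return random.choice([0, 1])
        return max((0, 1), key=lambda a: self.q.get((state, a), 0.0))

    def learn(self, state, action, reward, lr=0.1):
        # One-step (bandit-style) update; a full RL agent would also bootstrap
        # over the placement of the remaining VNFs in the chain.
        old = self.q.get((state, action), 0.0)
        self.q[(state, action)] = old + lr * (reward - old)


def place_chain(agents, vnf_demands):
    """Walk an SFC through the substrate, letting the local agents decide."""
    placement = []
    current = random.choice(list(agents))
    for demand in vnf_demands:
        agent = agents[current]
        # Message exchange with neighbors before acting.
        messages = {n: agents[n].residual_cpu for n in agent.neighbors}
        state = agent.observe(demand, messages)
        action = agent.act(state)
        if action == 1 and agent.residual_cpu >= demand:
            agent.residual_cpu -= demand
            agent.learn(state, action, reward=+1.0)
            placement.append((current, demand))
        else:
            # Defer: small penalty, then hand the VNF over to the neighbor
            # that advertised the most spare capacity.
            agent.learn(state, action, reward=-0.1)
            if messages:
                current = max(messages, key=messages.get)
    return placement


if __name__ == "__main__":
    # Tiny 3-node line topology: 0 - 1 - 2
    agents = {
        0: NodeAgent(0, cpu_capacity=4.0, neighbors=[1]),
        1: NodeAgent(1, cpu_capacity=6.0, neighbors=[0, 2]),
        2: NodeAgent(2, cpu_capacity=4.0, neighbors=[1]),
    }
    print(place_chain(agents, vnf_demands=[2.0, 3.0, 1.0]))
```

The key design point the sketch tries to convey is that no single entity sees the whole substrate: each agent acts on its own state plus whatever its neighbors communicate, which is the property the article contrasts with centralized RL-based SFCP.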
