The Software-Defined Network (SDN) paradigm decouples the control plane from the data plane and provides logically centralized control over the whole underlying network. This lets the controller configure the network centrally, and it has demonstrated great potential for fully utilizing network capacity. However, traditional SDN routing configuration does not consider the accumulative long-term revenue of the network service provider (SP). For example, it tends to choose paths that pass through important nodes, which may increase the likelihood of future congestion and hamper the SP’s long-term revenue. Deep reinforcement learning (DRL) has brought transformative advances in decision-making under uncertainty. As a well-known DRL method, the Deep Q-network (DQN) can explore optimal routing schemes from a high-dimensional network state. However, most existing DRL-based methods fail to provide a satisfactory configuration when confronted with a topology unseen during training. Meanwhile, many existing network optimization schemes ignore the impact of node and global network attributes on the routing configuration. Consequently, they may perform poorly in real scenarios. In this paper, we propose GN-DQN, a graph reinforcement learning (GRL) based SDN routing selection scheme for optimizing the SP’s accumulative long-term revenue. First, besides the link embeddings, GN-DQN explicitly learns representations of the routers and of the global network topology with a full graph neural network block. Then, using the resulting network representation as the state, DQN learns to sequentially choose the optimal routing path from the candidate path set so as to maximize the accumulative long-term revenue. Thorough experiments on multiple real network topologies show that GN-DQN outperforms other GRL-based and traditional routing selection schemes, and can generalize over arbitrary topologies. Moreover, GN-DQN is also highly robust to link failures.