Abstract
We consider the shortest path routing (SPR) problem in a network with time-varying link metrics in unknown environments. Because of potential denial-of-service attacks, the distributions of link states may be stochastic (benign or i.i.d.), contaminated, or adversarial (non-i.i.d.) at different temporal and spatial locations. Without any a priori knowledge of the environment, designing an adaptive SPR protocol that copes optimally with all situations that can arise in practice is very challenging. In this paper, we present the first solution by formulating SPR as a multi-armed bandit problem. By introducing novel control parameters to explore link conditions, our proposed algorithms automatically detect the features of the environment within a unified framework and find the optimal SPR strategies with near-optimal learning performance in all possible cases over time. Moreover, we study important issues related to practical implementation, such as decoupling route selection from multi-path route probing, cooperative learning among multiple sources, the cold-start issue, and delayed feedback. In addition, the proposed SPR algorithms can be implemented with low complexity and are proved to scale very well with the network size. The efficacy of the proposed solutions is verified by simulations driven by real-world trace datasets. Compared with existing approaches in a typical network scenario, our algorithm achieves a 65.3 percent improvement in network delay for a given learning period and an 81.5 percent improvement in learning duration for a specified network delay.
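To make the bandit formulation concrete, the sketch below treats each candidate path as an arm and applies the standard EXP3 exponential-weights rule to observed path delays. It is a minimal illustration under stated assumptions, not the algorithm proposed in the paper: the function name `exp3_route`, the `delay_fn` callback, the learning rate `eta`, and the assumption that delays are normalized to [0, 1] are all hypothetical, and the sketch enumerates paths explicitly rather than learning at the link level.

```python
import math
import random


def _softmax(log_w):
    """Turn log-weights into selection probabilities (numerically stable)."""
    m = max(log_w)
    w = [math.exp(v - m) for v in log_w]
    total = sum(w)
    return [x / total for x in w]


def exp3_route(paths, delay_fn, rounds, eta=0.05, seed=0):
    """Plain EXP3 over an explicit list of candidate paths (illustrative only).

    paths    -- list of candidate paths (any hashable objects)
    delay_fn -- hypothetical callback (path, round) -> delay, assumed in [0, 1]
    Returns the final probability of selecting each path.
    """
    rng = random.Random(seed)
    log_w = [0.0] * len(paths)
    for t in range(rounds):
        probs = _softmax(log_w)
        # Sample one path according to the current probabilities.
        k = rng.choices(range(len(paths)), weights=probs)[0]
        # Only the chosen path's delay is observed (bandit feedback).
        loss = min(max(delay_fn(paths[k], t), 0.0), 1.0)
        # Importance-weighted loss estimate keeps the update unbiased.
        log_w[k] -= eta * loss / probs[k]
    return _softmax(log_w)


# Hypothetical three-path example with fixed normalized delays.
delays = {("s", "a", "d"): 0.2, ("s", "b", "d"): 0.5, ("s", "c", "d"): 0.8}
paths = list(delays)
probs = exp3_route(paths, lambda p, t: delays[p], rounds=2000)
print(dict(zip(paths, probs)))
```

Because only the delay of the selected path is observed in each round, the update divides the loss by the selection probability. Enumerating paths in this way does not scale to large networks, which is one reason a link-level formulation is of interest.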