Abstract

This work studies distributed online <i>bandit learning</i> over a multi-agent network, in which a group of agents cooperatively seeks the minimizer of a time-varying global loss function. At each epoch, the global loss function is the sum of local loss functions, each known privately to an individual agent in the network. The local functions are revealed to the agents sequentially, and no agent has knowledge of future loss functions, so the agents must exchange messages to form an online estimate of the global loss function. We consider a bandit setup, in which only the values of the local loss functions at sampling points are disclosed to the agents. We also consider a more general network modeled by unbalanced digraphs, for which the corresponding weight matrix is only required to be row stochastic. By extending the celebrated mirror descent algorithm, we first design a distributed bandit online learning method for the online distributed convex problem. We then establish that the algorithm attains sub-linear expected dynamic regret for convex and strongly convex loss functions, respectively, when the accumulative deviation of the minimizer sequence increases sub-linearly. Moreover, the expected dynamic regret bound is analysed for strongly convex loss functions. In addition, an expected static regret bound of order <inline-formula><tex-math notation="LaTeX">$O(\sqrt{T})$</tex-math></inline-formula> is obtained in the bandit setting, while a corresponding static regret bound of order <inline-formula><tex-math notation="LaTeX">$O(\ln {T})$</tex-math></inline-formula> is provided for the strongly convex case. Finally, numerical examples are provided to illustrate the efficiency of the method and to verify the theoretical findings.
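
To make the bandit feedback model concrete, the following is a minimal sketch of a gradient estimate built from a single loss value at a randomly perturbed sampling point. It assumes the standard one-point spherical-smoothing estimator, which is a common choice in bandit convex optimization but is not confirmed by the abstract to be the estimator used in this work. The names `one_point_gradient_estimate`, `average_estimate`, `loss`, and `delta` are hypothetical and chosen for illustration only. The consensus step over the row-stochastic weight matrix and the mirror descent update are not shown.

```python
import numpy as np


def one_point_gradient_estimate(loss, x, delta, rng):
    """Estimate the gradient of `loss` at `x` from one function value.

    Assumption: one-point spherical smoothing, g = (d / delta) * loss(x + delta * u) * u,
    with u drawn uniformly from the unit sphere. This may differ from the paper's estimator.
    """
    d = x.shape[0]
    u = rng.standard_normal(d)
    u /= np.linalg.norm(u)        # normalised Gaussian is uniform on the unit sphere
    value = loss(x + delta * u)   # the only feedback the agent observes
    return (d / delta) * value * u


def average_estimate(loss, x, delta, num_samples, seed=0):
    """Average many one-point estimates to illustrate what they estimate in expectation."""
    rng = np.random.default_rng(seed)
    total = np.zeros_like(x, dtype=float)
    for _ in range(num_samples):
        total += one_point_gradient_estimate(loss, x, delta, rng)
    return total / num_samples
```

For a linear loss such as `lambda z: float(a @ z)`, the expectation of this estimate equals `a`, so the average returned by `average_estimate` should approach `a` as `num_samples` grows. A single estimate is noisy, and its variance grows as `delta` shrinks, which is why the exploration radius is typically treated as a tuning parameter.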
