Abstract

Multi-armed bandit (MAB) algorithms are designed to identify the best arm among several arms in an unknown environment. They guarantee an optimal balance between exploration (selecting every arm a sufficient number of times) and exploitation (selecting the best arm as often as possible). They are widely used in applications such as website advertising, robotics, healthcare, finance, and wireless radios. Robotics and radio applications require MAB algorithms to be integrated with the physical layer (PHY) in hardware to meet stringent area, power, and latency constraints. Moreover, a single MAB algorithm may not suit every scenario, so the application needs to switch between MAB algorithms on-the-fly. In this paper, we efficiently map MAB algorithms onto the Zynq System on Chip (ZSoC) and make the architecture reconfigurable, so that both the number of arms and the type of algorithm can be changed on-the-fly. We validate the functional correctness and usefulness of the proposed architectures via a realistic wireless application, and a detailed complexity analysis demonstrates the feasibility of the proposed solution for realizing intelligent radios and robots.
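To make the exploration/exploitation trade-off concrete, the following is a minimal software sketch of one well-known MAB algorithm, UCB1. The abstract does not name the algorithms that are mapped to hardware, so the choice of UCB1, the Bernoulli reward model, and the names `ucb1_select` and `run_bandit` are assumptions made purely for illustration, not the proposed architecture.

```python
import math
import random


def ucb1_select(counts, sums, t):
    """Return the index of the arm to play at round t (1-based)."""
    # Exploration: play every arm once before the index is defined.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    # Empirical mean (exploitation) plus a confidence bonus (exploration).
    scores = [
        sums[arm] / counts[arm] + math.sqrt(2.0 * math.log(t) / counts[arm])
        for arm in range(len(counts))
    ]
    return max(range(len(counts)), key=scores.__getitem__)


def run_bandit(success_probs, horizon, seed=0):
    """Simulate Bernoulli arms; return how often each arm was selected."""
    rng = random.Random(seed)
    counts = [0] * len(success_probs)   # number of plays per arm
    sums = [0.0] * len(success_probs)   # accumulated reward per arm
    for t in range(1, horizon + 1):
        arm = ucb1_select(counts, sums, t)
        reward = 1.0 if rng.random() < success_probs[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
    return counts


if __name__ == "__main__":
    # Hypothetical channel-availability probabilities for a 4-channel radio.
    print(run_bandit([0.2, 0.5, 0.8, 0.4], horizon=5000))
```

In this sketch the number of arms is simply the length of `success_probs`, and swapping `ucb1_select` for another selection rule changes the algorithm; these are the two parameters the abstract describes as reconfigurable on-the-fly in the hardware design.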
