Online Learning of Rested and Restless Bandits

Cem Tekin,Mingyan Liu

doi:10.1109/tit.2012.2198613

Open Access

https://doi.org/10.1109/tit.2012.2198613

Journal: IEEE Transactions on Information Theory
Publication Date: Aug 1, 2012
Citations: 151

Affiliation: University of Michigan–Ann Arbor

#Restless Bandits #Finite-state Discrete-time Markov Chains

Abstract

In this paper, we study the online learning problem involving rested and restless multiarmed bandits with multiple plays. The system consists of a single player/user and a set of K finite-state discrete-time Markov chains (arms) with unknown state spaces and statistics. At each time step, the player can play M arms. The user's objective is to decide, at each step, which M of the K arms to play over a sequence of trials so as to maximize its long-term reward. The restless multiarmed bandit is particularly relevant to the application of opportunistic spectrum access (OSA), where a (secondary) user has access to a set of K channels, each with time-varying conditions as a result of random fading and/or certain primary users' activities.
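To make the setting concrete, the following is a minimal simulation sketch, not the algorithm analyzed in the paper. It assumes two-state arms (0 = bad, 1 = good, as in a simple channel model), a reward equal to the observed state, and a generic sample-mean index with an exploration bonus as a stand-in policy. The function names, transition probabilities, and index formula are all illustrative assumptions. The `restless` flag captures the distinction in the title: a rested arm changes state only when played, while a restless arm evolves at every step.

```python
import math
import random


def step_chain(state, p01, p10, rng):
    """Advance a two-state Markov chain (0 = bad, 1 = good) by one step."""
    if state == 0:
        return 1 if rng.random() < p01 else 0
    return 0 if rng.random() < p10 else 1


def simulate(K=4, M=2, horizon=2000, restless=True, seed=0):
    """Play M of K Markov arms per step; return (total reward, play counts)."""
    rng = random.Random(seed)
    # Hypothetical (p01, p10) pairs per arm; the player never reads these.
    params = [(0.1 + 0.2 * k, 0.5) for k in range(K)]
    states = [rng.randint(0, 1) for _ in range(K)]
    plays = [0] * K
    reward_sum = [0.0] * K
    total = 0.0

    for t in range(1, horizon + 1):
        def index(k):
            # Illustrative index: sample mean plus an exploration bonus.
            if plays[k] == 0:
                return float("inf")  # force one initial play of every arm
            bonus = math.sqrt(2 * math.log(t) / plays[k])
            return reward_sum[k] / plays[k] + bonus

        chosen = sorted(range(K), key=index, reverse=True)[:M]
        for k in chosen:
            reward = float(states[k])  # assumed reward: the observed state
            plays[k] += 1
            reward_sum[k] += reward
            total += reward

        for k in range(K):
            # Restless arms always evolve; rested arms evolve only when played.
            if restless or k in chosen:
                states[k] = step_chain(states[k], *params[k], rng)

    return total, plays
```

Calling `simulate(restless=True)` and `simulate(restless=False)` with the same seed gives a quick way to compare the two regimes under this toy policy. In the OSA reading, each arm is a channel and M is the number of channels the secondary user can sense per step.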
