Abstract

The combinatorial multi-armed bandit (MAB) problem can be used to formulate sequential decision problems with an exploration-exploitation tradeoff. Dynamic spectrum access (DSA) in cognitive radio (CR) networks is one of its important applications. In this work, we briefly overview combinatorial MAB problems and their possible applications to CR networks. We first investigate standard MAB problems, where at each round a single player either explores an arm to gather information that improves its decision strategy or exploits an arm based on the information it has collected so far. Then, we study the taxonomy of combinatorial MAB problems, in particular for multi-player scenarios with independent and identically distributed (i.i.d.) rewards. Finally, we discuss limitations of existing works and interesting open problems.
