Abstract
The combinatorial multi-armed bandit (MAB) problem can be used to formulate sequential decision problems with an exploration-exploitation tradeoff. Dynamic spectrum access (DSA) in cognitive radio (CR) networks is one of its important applications. In this work, we give a brief overview of combinatorial MAB problems and their possible applications to CR networks. We first investigate the standard MAB problem, in which a single player at each round either explores an arm to gather information that improves its decision strategy, or exploits an arm based on the information it has collected so far. Then, we study the taxonomy of combinatorial MAB problems, in particular for multi-player scenarios with independent and identically distributed (i.i.d.) rewards. Finally, we discuss the limitations of existing works and interesting open problems.
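As an illustration of the single-player explore/exploit choice described above, the following is a minimal sketch, not a method taken from this work. It assumes an epsilon-greedy rule and models each arm as a channel that is idle (reward 1) with a fixed, unknown probability, drawn i.i.d. at every round. The function name `epsilon_greedy`, the parameter `epsilon`, and the example idle probabilities are hypothetical choices made only for this example.

```python
import random


def epsilon_greedy(idle_probs, rounds=5000, epsilon=0.1, seed=0):
    """Single-player bandit sketch (assumed epsilon-greedy policy).

    idle_probs[a] is the assumed probability that channel a is idle;
    the player does not see these values, only the sampled rewards.
    """
    rng = random.Random(seed)
    n_arms = len(idle_probs)
    pulls = [0] * n_arms    # number of times each arm was selected
    means = [0.0] * n_arms  # empirical mean reward of each arm
    total_reward = 0

    for _ in range(rounds):
        if rng.random() < epsilon:
            # Explore: sample an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest empirical mean so far.
            arm = max(range(n_arms), key=lambda a: means[a])

        # Bernoulli reward: 1 if the chosen channel is idle this round.
        reward = 1 if rng.random() < idle_probs[arm] else 0

        # Incremental update of the empirical mean for the chosen arm.
        pulls[arm] += 1
        means[arm] += (reward - means[arm]) / pulls[arm]
        total_reward += reward

    return pulls, means, total_reward


pulls, means, total_reward = epsilon_greedy([0.2, 0.5, 0.8])
print("pulls per arm:", pulls)
print("empirical means:", [round(m, 3) for m in means])
print("total reward:", total_reward)
```

A fixed `epsilon` keeps exploring at a constant rate for the whole run; index-based policies such as UCB are a common alternative, and the multi-player settings surveyed here additionally have to handle several players selecting arms at once.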