Abstract
This paper provides a comprehensive analysis of Multi-Armed Bandit (MAB) algorithms, which are central to decision-making under uncertainty. It begins with a detailed explanation of the fundamental scenarios in which MAB algorithms apply, focusing on their features and key strategies. The paper then introduces and explains the core algorithms: Explore-Then-Commit (ETC), Upper Confidence Bound (UCB), and Thompson Sampling (TS). Using a range of plots, it analyzes these classical algorithms and compares them with several of their advanced variants. It also highlights two practical applications of MAB algorithms, in recommendation systems and in wireless digital twin networks, to illustrate their real-world relevance and potential. Finally, the paper acknowledges the challenges posed by the complexity of different bandit settings, which affect the efficiency and scalability of MAB algorithms and indicate the need for ongoing research. This review offers a thorough understanding of both the theoretical underpinnings and the practical implications of MAB algorithms.
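To give a concrete flavor of the algorithms the paper surveys, the following is a minimal sketch of UCB1 on Bernoulli-reward arms. It is an illustrative example, not code from the paper: the function name `ucb1`, the arm means, and the horizon are all assumptions chosen for demonstration.

```python
import math
import random

def ucb1(arm_means, horizon, seed=0):
    """Illustrative UCB1 on Bernoulli arms (assumed setup, not from the paper)."""
    rng = random.Random(seed)
    k = len(arm_means)
    counts = [0] * k      # pulls per arm
    sums = [0.0] * k      # cumulative reward per arm
    total_reward = 0.0
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1   # initialization: play each arm once
        else:
            # pick the arm maximizing empirical mean + confidence bonus
            arm = max(range(k), key=lambda a: sums[a] / counts[a]
                      + math.sqrt(2 * math.log(t) / counts[a]))
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        total_reward += reward
    return counts, total_reward

counts, reward = ucb1([0.2, 0.5, 0.8], horizon=2000)
```

Over a long enough horizon, the confidence bonus shrinks for well-explored arms, so pulls concentrate on the arm with the highest empirical mean (index 2 here).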