Investigation of selection and application of Multi-Armed Bandit algorithms in recommendation system

Panyangjie Chen

doi:10.54254/2755-2721/34/20230323

Abstract

The Multi-Armed Bandit (MAB) algorithm is a prominent recommendation system technique that presents content matched to user preferences based on the analysis of collected data. However, applying the basic MAB algorithm in real-world recommendation systems raises challenges, including issues related to data volume and data processing accuracy. For this reason, optimized algorithms built on MAB are more widely used in recommendation systems. This paper briefly introduces the multi-armed bandit algorithm, its use in recommendation systems, and the problems of the basic MAB algorithm. To address those problems, it introduces two MAB-based optimization algorithms used in recommendation systems: dynamic clustering based contextual combinatorial multi-armed bandit (DC3MAB) and Binary Upper Confidence Bound (BiUCB). It also analyzes and summarizes these algorithms and describes their application in recommendation systems. Finally, it summarizes the algorithms introduced and proposes future prospects for MAB optimization algorithms.

Full Text