Learning Automaton (LA) theory, a branch of reinforcement learning, began with the Fixed Structure Learning Automaton (FSLA) family and was later expanded to include the Variable Structure Learning Automaton (VSLA) family. Tuning the depth of an FSLA in complex environments has long been a challenging task, and a poorly chosen depth significantly limits the automaton's ability to navigate the exploration-exploitation dilemma. This problem remains open. This study addresses it by introducing a novel hybrid learning automaton model called the Asymmetric Variable Depth Hybrid LA (AVDHLA). The AVDHLA model learns the depth of a fixed structure LA autonomously by combining L_{KN,K} from the FSLA class with the Variable Action Set LA (VASLA) from the VSLA class. Computer simulations are conducted to validate the proposed model in diverse environments, both stationary and non-stationary (Markovian switching and state-dependent). Performance is evaluated using predefined metrics: the total number of rewards (TNR) and the total number of action switches (TNAS). Statistical tests indicate that, in both stationary and non-stationary environments, AVDHLA outperforms L_{KN,K} in terms of TNR and TNAS in the majority of experiments. Moreover, the AVDHLA model is applied in two key applications. First, it is used to defend against the selfish mining attack in Bitcoin and is compared with the well-known tie-breaking mechanism. Simulation results demonstrate that the proposed method raises the threshold for successful selfish mining attacks from 25% to 40%. Second, the AVDHLA model is used to develop a novel learning automaton-based recommendation system. The results demonstrate the superiority of the proposed method over previous approaches in terms of Click-Through Rate (CTR) and Precision.
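To make the role of depth concrete, the following is a minimal sketch of a fixed-depth, K-action automaton in the spirit of L_{KN,K}, run in a stationary environment while counting TNR and TNAS. It is not the AVDHLA model and does not learn the depth. The names `FixedDepthLA` and `simulate`, the cyclic choice of the next action on a penalty in the boundary state, and the Bernoulli reward environment are illustrative assumptions rather than details taken from the paper.

```python
import random


class FixedDepthLA:
    """Hypothetical sketch: K actions, each with `depth` memory states."""

    def __init__(self, num_actions, depth):
        self.num_actions = num_actions
        self.depth = depth
        self.action = 0
        self.level = 1  # 1 = boundary state, `depth` = most committed state

    def update(self, rewarded):
        """Apply one reward/penalty signal; return True if the action switched."""
        if rewarded:
            # Reward: move one state deeper, saturating at the maximum depth.
            self.level = min(self.depth, self.level + 1)
            return False
        if self.level > 1:
            # Penalty away from the boundary: move one state toward it.
            self.level -= 1
            return False
        # Penalty in the boundary state: switch to the next action
        # (cyclic order is an assumption of this sketch).
        self.action = (self.action + 1) % self.num_actions
        return True


def simulate(reward_probs, depth, steps, seed=0):
    """Run the automaton in a stationary Bernoulli environment.

    Returns (TNR, TNAS): total rewards and total action switches.
    """
    rng = random.Random(seed)
    la = FixedDepthLA(len(reward_probs), depth)
    tnr = tnas = 0
    for _ in range(steps):
        rewarded = rng.random() < reward_probs[la.action]
        tnr += rewarded
        tnas += la.update(rewarded)
    return tnr, tnas
```

Calling, for example, `simulate([0.2, 0.8], depth, 10_000)` for several values of `depth` gives a rough view of the trade-off: a shallow automaton abandons an action after few penalties and so switches often, while a deep one commits longer to its current action and is slower to leave a poor one.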