Reinforcement Learning-Based Near-Optimal Load Balancing for Heterogeneous LiFi WiFi Network

Rizwana Ahmad, Majid Safari, Mohammad Dehghani Soltani, Anand Srivastava

doi:10.1109/jsyst.2021.3088302

Abstract

Owing to their nonoverlapping spectra, light fidelity (LiFi) and WiFi technologies can coexist and form a heterogeneous LiFi WiFi network (HLWN). The performance of an HLWN depends significantly on the load balancing strategy. Since load balancing in an HLWN is a nonconvex mixed-integer nonlinear programming problem, it is mathematically intractable, and conventional optimization methods therefore fail to provide a globally optimal solution. Although an optimal solution can be obtained by exhaustive search, doing so is computationally expensive. Therefore, in this article, a reinforcement learning (RL)-based algorithm is explored for solving the load balancing problem for the downlink HLWN at reasonably low complexity and with near-optimal performance. We propose three different reward functions for RL; the first and second reward functions work toward maximizing average network throughput and user satisfaction, respectively. The third reward function is designed to maximize the long-term system throughput while ensuring a satisfaction level of at least 50% for every user. To study the effects of link aggregation on system performance, this article considers two receiver schemes, namely, single access point (SAP) and link aggregation (LA). While the SAP scheme allows the user to receive data from only a single access point (AP), the LA scheme allows the user to receive data simultaneously from both a LiFi AP and a WiFi AP. This article also accounts for the effects of random receiver orientation and handover overhead. Furthermore, domain knowledge is incorporated to reduce the computational complexity of the algorithm. The performance of the proposed system is compared with two benchmarks, received signal strength (RSS) and exhaustive search, in terms of computational complexity, average system throughput, and user satisfaction. It is shown that the proposed RL scheme outperforms the RSS scheme in average system throughput and user satisfaction. The RL scheme with an appropriate reward function matches the performance of exhaustive search at reasonably low complexity.
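
As a rough illustration of the three reward designs summarized above, the following minimal Python sketch shows one way such rewards could be computed from per-user throughputs. It is not the article's formulation. The function names, the definition of satisfaction as achieved throughput divided by demanded rate (capped at 1), and the fixed penalty used when the 50% threshold is violated are all assumptions made for illustration.

```python
from typing import Sequence


def satisfaction(throughput: float, demand: float) -> float:
    """Assumed definition: fraction of the demanded rate that is served, capped at 1."""
    return min(throughput / demand, 1.0)


def reward_throughput(throughputs: Sequence[float]) -> float:
    """First reward (sketch): average network throughput over all users."""
    return sum(throughputs) / len(throughputs)


def reward_satisfaction(throughputs: Sequence[float],
                        demands: Sequence[float]) -> float:
    """Second reward (sketch): average user satisfaction over all users."""
    sats = [satisfaction(t, d) for t, d in zip(throughputs, demands)]
    return sum(sats) / len(sats)


def reward_constrained(throughputs: Sequence[float],
                       demands: Sequence[float],
                       min_satisfaction: float = 0.5,
                       penalty: float = -1.0) -> float:
    """Third reward (sketch): system throughput, provided every user reaches
    the satisfaction threshold; otherwise a fixed penalty (an assumption)."""
    sats = [satisfaction(t, d) for t, d in zip(throughputs, demands)]
    if min(sats) < min_satisfaction:
        return penalty
    return sum(throughputs)


if __name__ == "__main__":
    # Hypothetical two-user example, rates in arbitrary units.
    demands = [40.0, 40.0]
    print(reward_throughput([10.0, 40.0]))
    print(reward_satisfaction([10.0, 40.0], demands))
    print(reward_constrained([10.0, 40.0], demands))
    print(reward_constrained([20.0, 40.0], demands))
```

In this sketch the third reward separates an allocation that starves one user (penalized) from one that keeps every user at or above the threshold (rewarded with the total throughput), which is the trade-off the third design is described as targeting.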

Full Text