Abstract

The recent surge in applications such as social networking and online learning platforms has driven a substantial rise in on-demand video streaming; ensuring a seamless user experience is therefore paramount, particularly given the dynamic nature of network conditions. Moreover, many service providers are adopting smaller buffer sizes to reduce the bandwidth wasted when users end video sessions prematurely. This shift poses a significant challenge for conventional adaptive bitrate (ABR) algorithms, which struggle to balance low stalling time and high playback bitrate under the constraint of a small buffer. In this study, we introduce a novel ABR approach, L2-ABR, which leverages self-play reinforcement learning to address these challenges. Unlike conventional reward-engineered learning-based ABR strategies, which update gradients to maximize linear reward functions, L2-ABR treats video streaming as the fundamental objective and trains neural networks (NNs) with explicit requirements tailored to streaming with small playback buffers. Through an extensive series of trace-driven experiments, we show that L2-ABR outperforms existing methods, effectively balancing buffer management across network scenarios without inducing excessive buffer under-runs or overflows. Compared to existing ABR schemes, our method reduces undesirable buffer events, including rebuffering and buffer-full events, while improving average QoE by up to 71.88% and 75.25% on the HSDPA and FCC datasets, respectively.
