Abstract

Adaptive bitrate (ABR) streaming algorithms play an important role in ensuring a high Quality of Experience (QoE) for the consumer. However, many ABR algorithms are ad hoc heuristics. In response, methods based on a Markov Decision Process (MDP) offer more principled models; in particular, Reinforcement Learning (RL) methods optimize QoE metrics directly. However, RL methods suffer from high complexity and long convergence times due to their model-free nature. This paper proposes qMDP, an RL method whose MDP is partially modeled by an M/D/1/K queue. Our study shows that qMDP achieves higher QoE and faster convergence than a QoE-only, model-free variant.
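The abstract does not detail how the M/D/1/K queue models the playback buffer, so as a rough illustration only: an M/D/1/K queue has Poisson arrivals (rate lam), a deterministic service time, a single server, and at most `capacity` customers in the system. The sketch below is a minimal event-driven simulation of such a queue (the function name and parameters are hypothetical, not from the paper), estimating the blocking probability, i.e. the fraction of arrivals lost because the buffer is full:

```python
import random

def simulate_md1k(lam, service, capacity, n_arrivals, seed=0):
    """Estimate the blocking probability of an M/D/1/K queue
    (Poisson arrivals at rate `lam`, fixed service time, single
    server, at most `capacity` customers in the system)."""
    rng = random.Random(seed)
    t = 0.0
    departures = []  # departure times of customers currently in system
    blocked = 0
    for _ in range(n_arrivals):
        t += rng.expovariate(lam)  # exponential inter-arrival time
        # remove customers that have departed by time t
        while departures and departures[0] <= t:
            departures.pop(0)
        if len(departures) >= capacity:
            blocked += 1  # system full: arrival is lost
        else:
            # service starts when the server frees up (or now, if idle)
            start = departures[-1] if departures else t
            departures.append(start + service)
    return blocked / n_arrivals
```

Under heavy load (arrival rate well above the service rate) the estimated blocking probability grows toward the overload fraction, while under light load it stays near zero, which is the qualitative behavior one would expect a queue-based buffer model to capture.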
