Abstract

Virtual reality (VR) video streaming and 360° panoramic video have received extensive attention in recent years because they offer users an immersive experience. However, the ultra-high bandwidth and ultra-low latency that VR and 360° panoramic video require put tremendous pressure on the carrying capacity of current networks. Because a user's field of view (i.e., the viewport) is limited when watching a panoramic video, the user sees only about 20% to 30% of the video content at any time, so it is unnecessary to transmit all of the content at high resolution. Predicting the user's future viewport is therefore crucial for selective streaming and for subsequent bitrate decisions. Combined with a tile-based adaptive bitrate (ABR) algorithm for panoramic video, content inside the user's viewport can be transmitted at a higher resolution, while areas outside the viewport can be transmitted at a lower resolution. This paper proposes a viewport-driven adaptive 360° live streaming optimization framework that combines viewport prediction with an ABR algorithm to optimize the transmission of live 360° panoramic video. Existing viewport prediction methods, however, typically suffer from low prediction accuracy and do not run in real time. Exploiting the strengths of convolutional neural networks (CNNs) in image processing and of long short-term memory (LSTM) networks in time-series processing, we propose an online-updated viewport prediction model called LiveCL, which uses a CNN to extract the spatial characteristics of video frames and an LSTM to learn the temporal characteristics of the user's viewport trajectories. With viewport prediction and the ABR algorithm working together, unnecessary bandwidth consumption can be effectively reduced.
The main contributions of this work are: (1) a framework for 360° video transmission; and (2) an online, real-time viewport prediction model called LiveCL, which is combined with a novel ABR algorithm to optimize 360° video transmission. On a public 360° video dataset, LiveCL achieves better tile accuracy, recall, precision, and frame accuracy than the latest existing model. Combined with related adaptive bitrate algorithms, the proposed viewport prediction model can reduce the transmission bandwidth by about 50%.
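
To make the tile-based idea concrete, the following is a minimal sketch of viewport-driven bitrate allocation: tiles inside the predicted viewport receive a high bitrate and all other tiles a low one. It is an illustration only and not the ABR algorithm proposed in this paper. The function names, the 4 x 6 tile grid, the tile indices, and the bitrate values are all assumptions chosen for the example.

```python
from typing import Iterable, List


def allocate_tile_bitrates(
    n_tiles: int,
    viewport_tiles: Iterable[int],
    high_kbps: float,
    low_kbps: float,
) -> List[float]:
    """Give predicted-viewport tiles the high bitrate and all others the low one.

    Hypothetical helper for illustration; not the paper's ABR algorithm.
    """
    in_view = set(viewport_tiles)
    if any(t < 0 or t >= n_tiles for t in in_view):
        raise ValueError("viewport tile index out of range")
    return [high_kbps if t in in_view else low_kbps for t in range(n_tiles)]


def bandwidth_saving(bitrates: List[float], high_kbps: float) -> float:
    """Fraction of bandwidth saved versus sending every tile at high_kbps."""
    full = high_kbps * len(bitrates)
    return 1.0 - sum(bitrates) / full


# Assumed setup: a 4 x 6 grid (24 tiles) in which 6 tiles (25%) are predicted
# to fall inside the viewport. All numbers are illustrative, not measured.
rates = allocate_tile_bitrates(24, [8, 9, 10, 14, 15, 16], 1000.0, 250.0)
saving = bandwidth_saving(rates, 1000.0)
```

Under these assumed numbers the saving depends only on the share of tiles in the viewport and on the ratio between the two bitrates. A real system would also have to account for prediction errors, for example by widening the high-quality region around the predicted viewport.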
