Abstract
The popularity of video on-demand streaming services has increased tremendously in recent years. Most services use HTTP-based adaptive video streaming methods. Today's movies and TV shows are typically recorded in UHD-1/4K and streamed using settings attuned to the end device and current network conditions. Video quality prediction models can be used to perform an extensive analysis of video codec settings to ensure high quality. Hence, we present a framework for the development of pixel-based video quality models. We instantiate four different model variants (hyfr, hyfu, fume and nofu) for short-term video quality estimation targeting various use cases. Our models range from a no-reference video quality model to a full-reference model, including hybrid model extensions that incorporate client-accessible metadata. All models share a similar architecture and the same core features, depending on their mode of operation. Besides traditional mean opinion score (MOS) prediction, we tackle quality estimation as a classification and multi-output regression problem. Our performance evaluation is based on the publicly available AVT-VQDB-UHD-1 dataset. We further evaluate the introduced center-cropping approach to speed up calculations. Our analysis shows that our hybrid full-reference model (hyfr) performs best, e.g., a Pearson correlation coefficient (PCC) of 0.92 for MOS prediction, followed by the hybrid no-reference model (hyfu), the full-reference model (fume) and the no-reference model (nofu). We further show that our models outperform popular state-of-the-art models. The introduced features and machine-learning pipeline are publicly available for use by the community for further research and extension.
Highlights
Considering the enormous increase of uploaded, watched and shared videos, it is not a surprise that approximately 70% of the overall internet bandwidth is spent for video streaming [14], and this is projected to increase to about 80% to 90% by 2022 [13].
The paper describes the models in detail, as well as a number of evaluation experiments, where we show that our models are able to outperform other state-of-the-art video quality models.
We introduce a model called fume that is based on all img, mov, img-fr and mov-fr features described in Table 1. fume is a combination of pure no-reference pixel-based features with full-reference features, similar for example to the combination of full-reference features with motion features in the case of Netflix's VMAF.
Summary
Considering the enormous increase of uploaded, watched and shared videos, it is not a surprise that approximately 70% of the overall internet bandwidth is spent for video streaming [14], and this is projected to increase to about 80% to 90% by 2022 [13]. The core idea of HTTP-based adaptive streaming (HAS) is to automatically adapt the played video quality to the end device in use and, in particular, to the available network bandwidth. This avoids stalling of video playout due to buffer depletion and keeps the video playing at the highest possible quality even in low-bandwidth situations. To enable such adaptation, several representations of each video must be stored on the server. Different adaptation strategies or algorithms are investigated to improve quality of …
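The adaptation idea described above can be illustrated with a minimal sketch. It is not an algorithm from the paper: the function name `select_representation`, the bitrate ladder, and the safety factor of 0.8 are all hypothetical choices made for illustration. The sketch picks the highest-bitrate representation that fits within a fraction of the measured bandwidth and falls back to the lowest one when nothing fits.

```python
from typing import Sequence


def select_representation(
    bitrates_kbps: Sequence[int],
    bandwidth_kbps: float,
    safety: float = 0.8,  # assumed headroom factor, not taken from the paper
) -> int:
    """Return the bitrate (kbit/s) of the representation to request next."""
    # Only use part of the measured bandwidth to reduce the risk of stalling.
    budget = bandwidth_kbps * safety
    candidates = [b for b in bitrates_kbps if b <= budget]
    # If even the lowest representation exceeds the budget, request it anyway.
    return max(candidates) if candidates else min(bitrates_kbps)


if __name__ == "__main__":
    ladder = [750, 2000, 4500, 15000]  # hypothetical bitrate ladder
    for bw in (500, 3000, 25000):
        print(bw, select_representation(ladder, bw))
```

Real players typically also take the buffer fill level into account; this throughput-only rule is just the simplest instance of the idea.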