Abstract
MPEG Dynamic Adaptive Streaming over HTTP (MPEG DASH) is a standard that ensures interoperability between different streaming services and the highest possible video quality under changing network conditions. The solutions described in the available literature that focus on video segmentation are mostly proprietary, require considerable computational power, lack the methodology, model notation, or information needed for reproduction, or do not consider the spatial and temporal activity of video sequences. This paper presents a new model for selecting the optimal parameters and number of representations for video encoding and segmentation, based on a measure of the spatial and temporal activity of the video content. The model was developed for the H.264 encoder, using the Structural Similarity Index Measure (SSIM) as the objective quality metric and Spatial Information (SI) and Temporal Information (TI) as measures of video spatial and temporal activity. The methodology used to develop the mathematical model is also presented in detail so that it can be applied to adapt the model to another type of encoder or another set of encoding parameters. The efficiency of the segmentation produced by the proposed model was tested using the Basic Adaptation Algorithm (BAA) and the Segment Aware Rate Adaptation (SARA) algorithm in two different network scenarios. In comparison to the segmentation available in the relevant literature, the segmentation based on the proposed model obtains better SSIM values in 92% of cases, and subjective testing showed that it achieves better results in 83.3% of cases.
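As an illustration of the activity measures named above, the following is a minimal sketch of how SI and TI are commonly computed, assuming the usual definitions (SI as the maximum over time of the spatial standard deviation of the Sobel-filtered luma frame, TI as the maximum over time of the spatial standard deviation of successive frame differences). The function names and the NumPy-only Sobel implementation are assumptions made for this example, not the authors' code.

```python
import numpy as np


def sobel_magnitude(frame):
    """Sobel gradient magnitude of a 2-D luma frame (border pixels dropped)."""
    f = np.asarray(frame, dtype=np.float64)
    # Horizontal gradient: right neighbours minus left neighbours, weighted 1-2-1.
    gx = (f[:-2, 2:] + 2 * f[1:-1, 2:] + f[2:, 2:]) - (
        f[:-2, :-2] + 2 * f[1:-1, :-2] + f[2:, :-2]
    )
    # Vertical gradient: lower neighbours minus upper neighbours, weighted 1-2-1.
    gy = (f[2:, :-2] + 2 * f[2:, 1:-1] + f[2:, 2:]) - (
        f[:-2, :-2] + 2 * f[:-2, 1:-1] + f[:-2, 2:]
    )
    return np.hypot(gx, gy)


def spatial_information(frames):
    """SI: maximum over frames of the spatial std of the Sobel-filtered frame."""
    return max(float(np.std(sobel_magnitude(f))) for f in frames)


def temporal_information(frames):
    """TI: maximum over frame pairs of the spatial std of the frame difference."""
    fs = [np.asarray(f, dtype=np.float64) for f in frames]
    return max(float(np.std(b - a)) for a, b in zip(fs, fs[1:]))


# Hypothetical two-frame sequence: a flat frame followed by a frame with a vertical edge.
flat = np.zeros((8, 8))
edge = np.zeros((8, 8))
edge[:, 4:] = 1.0
si = spatial_information([flat, edge])
ti = temporal_information([flat, edge])
```

Here `frames` is assumed to be a sequence of 2-D luma arrays of equal size; decoding a video file into such arrays is outside the scope of the sketch.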
Highlights
The development of the Internet, as well as hardware that supports novel Internet technologies, created the increased need for network and bandwidth resources
In comparison to the client side of MPEG DASH systems, very little research focuses on video coding parameters, selection of optimal parameters for representation sets, and the development of databases with video sequences prepared for MPEG
MPEG DASH is a standard for adaptive streaming that defines the format of video segments and the manifest file, ensuring interoperability between different streaming services and client protocols
Summary
The development of the Internet, as well as hardware that supports novel Internet technologies, created the increased need for network and bandwidth resources. DASH provides smooth playback by means of an adaptation algorithm that selects appropriate video segments, encoded with different spatial resolutions and target coding bitrates, based on the characteristics of the client's device and the state of a heterogeneous network [2]. There are usually multiple representation sets (10–20), which consist of video sequences encoded at different target bitrates, typically with lower spatial resolutions for lower bitrates. In stable network conditions, longer segments perform better than shorter ones because they contain a larger number of P and B frames and therefore need a lower encoding bitrate to achieve a given video quality. Considering that most of the available solutions are proprietary, demand high computational power, or do not consider the spatial and temporal activity of video sequences, there is room for improvement regarding the selection of optimal representation sets for MPEG DASH.
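The selection step described above can be sketched as a simple throughput-based rule. This is a hypothetical illustration only: it is neither BAA nor SARA, and the `Representation` type, the `safety` margin, and the example bitrate ladder are assumptions introduced for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Representation:
    bitrate_kbps: int  # target coding bitrate of this representation
    height: int        # spatial resolution (frame height in pixels)


def select_representation(reps, throughput_kbps, max_height, safety=0.8):
    """Pick the highest-bitrate representation the device and network can sustain."""
    budget = throughput_kbps * safety
    # Keep only resolutions the device can display; fall back to all if none fit.
    fitting = [r for r in reps if r.height <= max_height] or list(reps)
    affordable = [r for r in fitting if r.bitrate_kbps <= budget]
    if affordable:
        return max(affordable, key=lambda r: r.bitrate_kbps)
    # Nothing fits the budget: take the cheapest option to avoid stalling.
    return min(fitting, key=lambda r: r.bitrate_kbps)


# Hypothetical three-step ladder (real sets typically have 10-20 representations).
ladder = [
    Representation(300, 360),
    Representation(1000, 720),
    Representation(3000, 1080),
]
choice = select_representation(ladder, throughput_kbps=2000, max_height=1080)
```

With a measured throughput of 2000 kbps and a safety margin of 0.8, the budget is 1600 kbps, so the sketch selects the 1000 kbps representation. A real client would also account for buffer occupancy and segment sizes.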