The scalable video coding (SVC) extension of the H.264/AVC video coding standard provides two mechanisms, namely coarse grain scalability (CGS) and medium grain scalability (MGS), for quality scalable video encoding, which varies the fidelity (signal-to-noise ratio) of the encoded video stream. As H.264/AVC and its SVC extension are expected to become widely adopted for the network transport of video, it is important to thoroughly study their network traffic characteristics, including the bit rate variability. In this paper, we report on a large-scale study of the rate-distortion (RD) and rate variability-distortion (VD) characteristics of CGS and MGS. We found that CGS achieves low bit rate overheads in the 10–30% range compared to H.264 SVC single-layer encodings only for encodings with a total of up to three quality levels; more quality levels result in substantially higher overheads. The traffic variabilities of CGS are generally lower than for single-layer streams. We found that in the low to mid range of the MGS quality scalability, MGS can achieve the same or even slightly higher RD efficiency than corresponding single-layer encoding; toward the upper end of the MGS quality scalability range the RD efficiency drops off significantly. MGS layer extraction following the hierarchical B frame structure gives nearly as high RD performance as RD-optimized extraction. In the range of high RD efficiency, MGS streams have significantly higher traffic variabilities than single-layer streams at the frame time scale. At the group-of-pictures (GoP) time scale, MGS has similar or lower levels of traffic variability compared to single-layer streams. Generally, MGS layer extraction over the time horizon of individual GoPs gives significantly lower traffic variability than extraction over the time horizon of the full video sequence.
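To make the distinction between the frame and GoP time scales concrete, the following minimal sketch computes a traffic variability measure at both scales. It assumes that variability is quantified by the coefficient of variation (standard deviation divided by mean) of the encoded sizes, which is an assumption of this sketch and not stated in the abstract. The frame sizes, the GoP length of four frames, and the function names are hypothetical illustrations, not data or code from the study.

```python
from statistics import mean, pstdev


def coefficient_of_variation(sizes):
    """Return the population standard deviation divided by the mean.

    Assumed here as the variability measure; the abstract does not
    name the exact metric.
    """
    return pstdev(sizes) / mean(sizes)


def aggregate_by_gop(frame_sizes, gop_length):
    """Sum frame sizes over consecutive GoPs, dropping a trailing partial GoP."""
    full_gops = len(frame_sizes) // gop_length
    return [
        sum(frame_sizes[i * gop_length:(i + 1) * gop_length])
        for i in range(full_gops)
    ]


# Hypothetical frame sizes in bytes for three GoPs of four frames each.
# The first frame of each GoP is much larger than the others, mimicking
# the large key pictures of a hierarchical B frame structure.
frame_sizes = [
    8000, 1000, 2000, 1000,
    8800, 1100, 2200, 1100,
    7200, 900, 1800, 900,
]
gop_length = 4  # hypothetical GoP length

gop_sizes = aggregate_by_gop(frame_sizes, gop_length)
frame_cov = coefficient_of_variation(frame_sizes)
gop_cov = coefficient_of_variation(gop_sizes)

print(f"frame time scale CoV: {frame_cov:.3f}")
print(f"GoP time scale CoV:   {gop_cov:.3f}")
```

In this toy trace, the size differences between frame types within a GoP dominate the frame time scale, while summing over each GoP averages them out. This illustrates why a stream can be highly variable at the frame level yet comparatively smooth at the GoP level.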