Learning-Based Video Compression with Continuously Variable Bitrate Coding
In this paper, we propose a learning-based video compression method that supports continuously variable bitrate coding. The proposed method generates feature transformation parameters through a conditional network according to the input spatial quality map. These parameters are then used to adaptively transform the intermediate features of the encoder, decoder, and spatiotemporal entropy model in the codec, thus enabling variable bitrate coding. Additionally, to improve the compression efficiency of the codec, we propose incorporating the quality map of the preceding frame into the hyperprior encoder and leveraging the temporal prior encoder. A multi-stage training strategy is employed to jointly train the codec with a multi-frame rate-distortion loss function. The experimental results demonstrate that the proposed method achieves continuously variable bitrate adaptation while maintaining rate-distortion performance comparable to that of the fixed-bitrate model. Furthermore, the proposed method also supports ROI-based compression.
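The conditioning idea in this abstract can be illustrated with a minimal sketch. The abstract does not state the form of the feature transformation, so the per-position affine modulation (scale and shift) used here is an assumption, and `condition_params`, `transform_features`, `w_scale`, and `w_shift` are hypothetical names standing in for the learned conditional network.

```python
def condition_params(quality_map, w_scale=0.5, w_shift=0.1):
    """Hypothetical stand-in for the conditional network.

    Maps each quality value q (assumed to lie in [0, 1]) to a (scale, shift)
    pair. A real codec would learn this mapping; the linear form is assumed.
    """
    return [[(1.0 + w_scale * q, w_shift * q) for q in row] for row in quality_map]


def transform_features(features, params):
    """Apply the per-position affine transform: out = scale * feature + shift."""
    return [
        [scale * f + shift for f, (scale, shift) in zip(feature_row, param_row)]
        for feature_row, param_row in zip(features, params)
    ]


# A 1x2 quality map: low quality on the left, high quality on the right.
quality_map = [[0.0, 1.0]]
features = [[2.0, 2.0]]
modulated = transform_features(features, condition_params(quality_map))
```

Because the quality map varies spatially, the same mechanism could in principle give a region of interest a higher quality value than the background, which is consistent with the ROI-based compression the abstract mentions.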
- Conference Article
20
- 10.1109/icc.1993.397318
- May 23, 1993
The authors present the bit-rate characteristics of variable bit-rate (VBR) MPEG-1 compatible video intended for asynchronous transfer mode (ATM) network applications. The VBR coding mode for MPEG video is of special interest in teleconferencing and workstation multimedia applications requiring constant image quality, low delay, and/or integrated multimedia transport. Simulation data are provided for a 5-10 Mbps CCIR601 VBR MPEG encoder appropriate for standard-quality TV broadcasting or multimedia applications. The results presented include bit-rate traces and signal-to-noise ratio for typical test sequences, along with summary bit-rate statistics. The performance of frame-based peak rate control as a traffic shaping method is studied. Signal-to-noise ratio obtained with VBR and constant bit-rate coding modes operating at the same average bit-rate is also given for purposes of comparison.
- Book Chapter
- 10.1007/978-3-540-71789-8_4
- Dec 5, 2006
In this paper, we propose an embedded variable bit-rate (VBR) audio coder to provide the most suitable quality of service (QoS) and better service connectivity for ubiquitous speech communications. It has scalable bandwidth, from narrowband to wideband speech signals, and an embedded 8-32 kbit/s VBR that adapts to the network condition and terminal capacity. For the design of the embedded VBR coder, the narrowband signals are compressed by an existing standard speech coding method for compatibility with the G.729 coder, and the other signals are then compressed hierarchically on the basis of CELP enhancement and transform coding with the temporal noise shaping (TNS) method. Objective and subjective quality tests show that the proposed embedded VBR audio coder provides reasonable quality compared with existing audio coders such as G.722 and G.722.2 in terms of mean opinion score (MOS) and wideband perceptual evaluation of speech quality (PESQ-WB).
- Research Article
134
- 10.1109/49.32339
- Jun 1, 1989
- IEEE Journal on Selected Areas in Communications
The bandwidth flexibility offered by the asynchronous transfer mode (ATM) technique makes it possible to select picture quality and bandwidth over a wide range in a simple and straightforward manner. A prototype model of a video codec was developed that demonstrates the feasibility of both variable bit rate (VBR) coding and user-selectable picture quality. The VBR coding algorithm is discussed and it is shown how a stabilized quality is achieved and how this quality and associated bandwidth can be selected by the user. How error propagation is limited to reduce the visibility of cell losses is also discussed. Interfaces with the ATM network are analyzed, with emphasis on decoder synchronization and absorption of cell delay jitter. The VBR codec offers very good picture quality for videophony applications at an equivalent load of 5.9 Mb/s. Picture quality remains relatively constant, even for heavy motion.
- Research Article
11
- 10.1006/jvci.1994.1006
- Mar 1, 1994
- Journal of Visual Communication and Image Representation
Layered Coding Schemes for Video Transmission on ATM Networks
- Conference Article
1
- 10.1109/ssp.2005.1628718
- Jan 1, 2005
- IEEE/SP 13th Workshop on Statistical Signal Processing, 2005
This paper describes the extension of the embedded zero wavelet tree coding technique (J.M. Shapiro, 1993) to the M-band wavelet transform. Through this scheme, we improve the efficiency of embedded zero wavelet tree (EZW) coding, which finds extensive application as a variable bit-rate coder. To demonstrate its efficacy, we compare the compression ratio and the PSNR obtained by a traditional 2-band wavelet-based coder with those obtained with the M-band wavelet-based EZW coder. One additional point discussed in this paper is the performance of a traditional 2-level quantization in EZW against an M-adaptive coder. We examine the advantages and disadvantages of adopting an M-adaptive coder and draw conclusions based on the PSNR and compression ratio obtained at various bit-rates. The variable bit-rate is emulated as multiple scans; hence the graphs plot the compression ratio or the PSNR against the number of scans. An important inference drawn from this paper is that the M-band wavelet transform with odd M exhibits special properties, as a result of which the EZW scheme performs much more efficiently with it.
- Book Chapter
3
- 10.1007/978-1-4615-3266-8_5
- Jan 1, 1991
Two versions [1],[2] of Low-Delay Code-Excited Linear Predictive (LD-CELP) coders have been recently suggested as candidates for the CCITT 16 kbit/s speech coding standard. The goal of this standard is to cover a long list of possible applications such as mobile radio, video-phone, Digital Circuit Multiplication Equipment (DCME), etc. Many of these applications have different requirements, so it has been difficult to define the performance requirements and objectives for a single algorithm that will be suitable for a large variety of these applications. Thus it was suggested [3] and accepted as an objective of the standard that the adopted algorithm will have a nominal rate of 16 kbit/s but can operate at bit rates higher and lower than the nominal rate. This may enable a more optimal implementation for each specific application and provide a better basis for acceptance of the standard by various user groups. For example, Variable-Bit-Rate (VBR) coding can be used to add error correction/detection information for noisy channel applications such as mobile radio. Another example is in DCME systems, where VBR coding can avoid speech clipping during overload traffic periods and can improve speech quality during underload periods.
- Single Report
- 10.21236/ada327255
- May 31, 1997
During the past year, we have continued our efforts in the area of signal, image and video representation, compression, storage, transmission and enhancement. In the area of video transmission, we have focused on optimal joint source/channel coding for noisy wireless channels. In the area of video compression, we submitted our very low bit rate algorithm to the MPEG-4 standardization body in November of 1995. Our matching pursuit algorithm performed among the top 3 of all the submissions for very low bit rate compression. In the area of video compression, we also focused on low complexity, real time software codecs for scalable video. Specifically, we have shown that by trading off compression efficiency with complexity, we can achieve real time encode and decode capability on today's workstations. In the area of video storage and retrieval, we continued our efforts on placement of Variable Bit Rate (VBR) and scalable video on parallel disk arrays and developed new admission control strategies. These schemes were tested experimentally on a real disk system in our lab. In the area of resolution enhancement, we developed a novel motion estimation scheme for multiframe video resolution enhancement. Finally, in the area of circuits and systems for signal processing, we continued our efforts on oversampled data conversion systems, such as sigma delta modulators. Specifically, we developed analytical and simulation techniques for locating dominant tones in double loop Sigma Delta A/D converters.
- Conference Article
1
- 10.1109/ictmicc.2007.4448650
- Jan 1, 2007
Video and audio communications such as video conferencing, news, and chat applications are candidates to become the largest portion of internet traffic. Thus, using compressed video and audio traffic seems necessary to reduce the costs of transmission and storage. Variable Bit Rate (VBR) compressed video and audio traffic has been receiving considerable attention because its compression algorithms transmit at a higher rate during high activity and at a lower rate during low activity. This way of transmission creates bursty traffic. Transporting bursty traffic across the network therefore needs a suitable environment to allocate network resources efficiently. Optical Burst Switching (OBS) is an optical environment that is suitable for bursty traffic. In this paper, we examine the conformity between OBS networks and VBR traffic by developing a simulation model to study the impact of OBS burst aggregation on VBR performance.
- Conference Article
1
- 10.1109/iccsit.2009.5234638
- Jan 1, 2009
Variable bit rate (VBR) compressed video targeted at constant video quality is known to exhibit significant rate variability over multiple time scales. The burstiness of such compressed VBR video complicates the management and provisioning of network resources for ever-increasing multimedia services. In a heterogeneous internetworking environment, a single service provider typically does not control the entire path from the multimedia streaming server to the client buffer. In this paper we analyze the bit rate variability exhibited by a scalable video coding (SVC) encoded VBR stream and present the optical burst switching (OBS) network as a mechanism for VBR transport across the core network. In our experimental evaluation we use the burstification feature inherent to OBS at the edge node and evaluate its effectiveness for smoothing and transporting the VBR video stream. SVC-encoded VBR video is transported over an OBS test bed, and OBS burst assembly parameters such as time threshold and offset time are tuned for smooth transport of the SVC-encoded VBR video stream. Experimental results reveal that, for a proper burst assembly time, the OBS-transported VBR video stream has low inter-frame time intervals as well as a high peak signal-to-noise ratio (PSNR).
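The time-threshold parameter mentioned in this abstract can be illustrated with a generic timer-based burst assembly sketch. This is not the authors' test bed configuration: the rule used here (a timer starts at the first packet of a burst, and the burst closes once the threshold elapses) is one common form of timer-based assembly, and `assemble_bursts` and `burst_sizes` are hypothetical helper names.

```python
def assemble_bursts(packets, time_threshold):
    """Group (arrival_time, size_bytes) packets into bursts with a timer rule.

    A timer starts when the first packet of a burst arrives. Any packet that
    arrives at or after start + time_threshold opens a new burst.
    """
    bursts = []
    current = []
    start = None
    for arrival, size in sorted(packets):
        if start is None or arrival - start >= time_threshold:
            if current:
                bursts.append(current)
            current = []
            start = arrival
        current.append((arrival, size))
    if current:
        bursts.append(current)
    return bursts


def burst_sizes(bursts):
    """Total payload of each burst in bytes."""
    return [sum(size for _, size in burst) for burst in bursts]


# Illustrative arrivals in seconds and sizes in bytes (made-up values).
packets = [(0.0, 100), (0.4, 200), (1.0, 50), (1.2, 50), (2.5, 10)]
sizes = burst_sizes(assemble_bursts(packets, time_threshold=1.0))
```

A larger threshold aggregates more packets per burst, which smooths the traffic at the cost of added assembly delay; that trade-off is presumably what tuning the burst assembly time addresses.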
- Research Article
- 10.53332/kuej.v10i1.914
- Oct 6, 2022
- University of Khartoum Engineering Journal
This paper shows the difference in video quality between two compressed videos produced with H.264 AVC (Advanced Video Coding) and H.265 HEVC (High Efficiency Video Coding) encoders. To evaluate video thoroughly, video files covering a variety of bit rates and content should be prepared. There are many video quality assessment methods, which can be divided into subjective and objective methods. Subjective methods rely on human perception, while objective methods are carried out by computer software that calculates the video quality. All of these methods have their advantages and disadvantages. The FFmpeg (Fast Forward Moving Picture Experts Group) converter was used to generate compressed videos from the original video. MSU-VQMT (Moscow State University’s Video Quality Measurement Tool) was used to perform a comparative objective analysis of video quality. Delta, MSE (Mean Square Error), MSAD (Mean Sum of Absolute Difference), PSNR (Peak Signal-to-Noise Ratio), and SSIM (Structural Similarity Index Measure) metrics were measured. The FFmpeg results show that the size of the video compressed with the H.265 codec decreased by 50% compared to the video compressed with the H.264 codec. The comparison of metrics shows that the delta, MSAD, PSNR, and SSIM values of the H.265-encoded video decreased, while the Delta and MSE values increased, compared to the H.264-encoded video. This means that the overall video quality decreased but the video size improved.
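Two of the objective metrics named in this abstract, MSE and PSNR, have standard definitions that can be sketched in a few lines. This is a plain-Python illustration on flat lists of pixel values, not the MSU-VQMT implementation; the function names and the default peak of 255 (8-bit samples) are assumptions.

```python
import math


def mse(ref, test):
    """Mean squared error between two equal-length sequences of pixel values."""
    if len(ref) != len(test) or not ref:
        raise ValueError("inputs must be non-empty and the same length")
    return sum((a - b) ** 2 for a, b in zip(ref, test)) / len(ref)


def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite when the inputs are identical."""
    err = mse(ref, test)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / err)


# Made-up 4-pixel "frames" for illustration only.
reference = [10, 20, 30, 40]
decoded = [12, 18, 30, 44]
error = mse(reference, decoded)
quality_db = psnr(reference, decoded)
```

Since PSNR is a decreasing function of MSE for a fixed peak value, a rise in MSE and a drop in PSNR describe the same change in quality.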
- Book Chapter
- 10.4018/978-1-930708-14-3.ch009
- Jan 1, 2002
This chapter discusses various issues related to the shaping of Motion Picture Experts Group (MPEG) video for generating constrained or controlled variable bit rate (VBR) data streams. MPEG-2 defines a set of standards for coding and compression of digital video. VBR video can offer constant picture quality without incorporating too much processing overhead in the network or transmission system’s architecture. In addition, they can offer substantial (20% to 50%) savings in both storage and transmission bandwidth requirements compared to constant bit rate (CBR) video. Either source coding or encoder’s output shaping or a combination of both can be used for adapting the MPEG-2 video streams for transmission over real-time VBR (rt-VBR)-type asynchronous transfer mode (ATM) channel.
- Conference Article
22
- 10.1109/glocom.1988.25838
- Nov 28, 1988
The advent of the asynchronous time division (ATM) concept has created the opportunity to use variable bit rate (VBR) coding techniques for the coding of video services in broadband networks. The principles of VBR coding are explained, and the benefits (stabilized picture quality combined with a high bandwidth efficiency) are indicated. The results obtained with a hardware model of a VBR video codec are presented, and statistical multiplexing gain figures are given. A bandwidth allocation scheme is proposed that is based on the statistical characterization and policing of the VBR sources. It is concluded that this scheme offers high performance with a limited impact on the network and encoder complexity.
- Research Article
3
- 10.1049/ip-vis:19981735
- Jan 1, 1998
- IEE Proceedings - Vision, Image, and Signal Processing
Fuzzy logic control has been employed to improve the rate control mechanism for an MPEG2 video encoder. The data rate of compressed video is controlled by video encoders for either variable bit rate (VBR) or constant bit rate (CBR) applications. In VBR video transmission, it is considered to be more efficient to regulate the video rate by the video coder than by network management in order to avoid network congestion and maintain stable video quality. This rationale can also be applied to CBR transmission. Two fuzzy-logic-based rate control techniques are proposed which maintain the buffer occupancy within a specified range. In the proposed technique for VBR applications, a video quality measure is taken as the crucial control parameter. In CBR rate control, the video data rate or the buffer occupancy is also considered as a fuzzy logic variable. The proposed techniques are designed to control either data rate or video quality, depending on the mode of transmission, i.e. CBR or VBR for the MPEG2 encoder. The performance is compared to a typical VBR MPEG video coder with fixed quantiser step sizes for VBR and also to the CBR video coder with MPEG2 TM5 at typical channel rates. Simulation results are presented with peak signal-to-noise ratio, data rate variation and buffer occupancy as the performance measures.
- Conference Article
3
- 10.1109/icc.1997.595006
- Jun 8, 1997
In this paper we introduce a video smoothing algorithm for MPEG compressed live video. This algorithm, called pattern smoothing, transmits compressed video via both constant bit rate (CBR) and variable bit rate (VBR) channels. In order to take advantage of the gains achieved through statistical multiplexing of multiple sources over a single link, this algorithm utilizes a CBR channel to reduce the peak rate and variance of the VBR transmission. In addition to presenting this new algorithm, we compare it against three smoothing techniques presented in the literature. Key attributes used for comparison include receiver buffer size, live video support, startup delay, losslessness versus lossiness, and smoothing scale. Because network utilization is the most important performance metric for any smoothing algorithm, we provide a performance analysis of the pattern smoothing algorithm via simulation and compare these results to the best of the three presented smoothing algorithms.
- Conference Article
2
- 10.1109/icon.2002.1033293
- Nov 7, 2002
For video sources, the Moving Picture Experts Group (MPEG) compression scheme has become the de facto standard for video compression. However, even with the huge reduction of bits that MPEG compression provides, it does not smooth the video traffic. Indeed, the variable bit rate (VBR) MPEG compression algorithm guarantees that the MPEG stream will be bursty. A service in which an asynchronous transfer mode (ATM) client requests and receives VBR MPEG coded video sequences from an ATM server is considered. An algorithm for streaming VBR MPEG coded video delivery over ATM networks, which dynamically allocates the transmission parameters, is proposed. A scheme for optimal choice of the prediction window's size is also presented. The results obtained show that the proposed dynamic allocation algorithm can provide an efficient solution for VBR MPEG coded video transport with guaranteed quality of service (QoS) over ATM networks.