Video compression is indispensable to most video analysis systems. While it saves transmission bandwidth, it also degrades downstream video understanding performance, especially at low-bitrate settings. To systematically investigate this problem, we first thoroughly review previous methods and identify three principles, namely task decoupling, label-free optimization, and a data-emergent semantic prior, that are critical to a machine-friendly coding framework but have not yet been fully satisfied. In this paper, we propose a traditional-neural mixed coding framework that simultaneously fulfills all of these principles by taking advantage of both traditional codecs and neural networks (NNs). On the one hand, traditional codecs efficiently encode the pixel signal of videos but may distort semantic information. On the other hand, highly non-linear NNs excel at condensing video semantics into a compact representation. The framework is optimized to preserve a transmission-efficient semantic representation of the video throughout the coding procedure, and this representation is learned spontaneously from unlabeled data in a self-supervised manner. Videos collaboratively decoded from the two streams (codec and NN) are semantically rich and visually photo-realistic, empirically boosting performance on several mainstream downstream video analysis tasks without any post-adaptation procedure. Furthermore, an attention mechanism and an adaptive modeling scheme further enhance the video semantic modeling ability of our approach. Finally, we build a low-bitrate video understanding benchmark with three downstream tasks on eight datasets, demonstrating the notable superiority of our approach. All code, data, and models will be open-sourced to facilitate future research.
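To make the two-stream idea concrete, below is a minimal, hedged sketch (not the paper's actual implementation) of a mixed coding setup: a codec-reconstructed frame carries the pixel signal, a small NN condenses a compact semantic stream, and a fusion decoder combines both; the self-supervised objective asks the semantic representation to be preserved through coding. All module names, layer sizes, and the stand-in for codec distortion are illustrative assumptions.

```python
# Illustrative sketch of a traditional-neural mixed coding framework (assumed design,
# not the authors' released code). PyTorch is used only as a convenient example.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SemanticEncoder(nn.Module):
    """Condense a frame into a compact semantic representation (the NN stream)."""
    def __init__(self, dim=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, dim, 3, stride=2, padding=1),
        )

    def forward(self, x):
        return self.net(x)


class FusionDecoder(nn.Module):
    """Fuse the codec-decoded frame (pixel stream) with the semantic stream."""
    def __init__(self, dim=16):
        super().__init__()
        self.up = nn.Sequential(
            nn.ConvTranspose2d(dim, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1),
        )
        self.refine = nn.Conv2d(6, 3, 3, padding=1)

    def forward(self, codec_frame, semantics):
        detail = self.up(semantics)
        return self.refine(torch.cat([codec_frame, detail], dim=1))


enc, dec = SemanticEncoder(), FusionDecoder()
original = torch.randn(2, 3, 64, 64)                       # pristine frames
codec_out = original + 0.1 * torch.randn_like(original)    # stand-in for codec distortion

sem = enc(original)                  # compact semantic stream from the source video
recon = dec(codec_out, sem)          # collaborative decoding from both streams

# Self-supervised objective: keep the semantic representation consistent across
# the coding procedure, plus a pixel-level reconstruction term.
loss = F.mse_loss(enc(recon), sem) + F.mse_loss(recon, original)
loss.backward()
```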