ROI-DVC: A Region-of-Interest Based Deep Video Coding Framework
Deep video compression has recently made significant advances in rate-distortion performance. However, prevailing state-of-the-art networks operate uniformly across all spatial locations and lack the capability for region-of-interest (ROI) based processing. In this study, we propose an ROI-based video coding network, ROI-DVC, which integrates an ROI perception strategy into both training and inference. First, we introduce an ROI-aware training loss that incorporates a binary ROI mask generated by an object detection network. Second, we devise a resolution-adaptive ROI-based coding strategy that optimizes bit rate allocation during inference; this strategy is realized through frame segmentation and differential rate coding. Comprehensive experimental results demonstrate that the proposed network achieves a bit rate saving of over 32% compared to H.265 at the same coding quality, particularly in regions of interest.
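To make the idea of an ROI-aware training loss concrete, the following is a minimal sketch and not the formulation used in ROI-DVC, which the abstract does not specify. It assumes that the binary ROI mask re-weights a per-pixel squared error before it enters a conventional rate-distortion objective of the form lambda * D + R. The function names and the parameters `roi_weight`, `bg_weight`, and `lam` are hypothetical and chosen only for illustration.

```python
def roi_weighted_mse(original, reconstructed, roi_mask,
                     roi_weight=4.0, bg_weight=1.0):
    """Weighted mean squared error over one frame given as 2D lists.

    roi_mask holds 1 for ROI pixels and 0 for background pixels.
    roi_weight and bg_weight are hypothetical hyperparameters
    (assumptions, not values taken from the paper).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for row_x, row_y, row_m in zip(original, reconstructed, roi_mask):
        for x, y, m in zip(row_x, row_y, row_m):
            w = roi_weight if m else bg_weight
            weighted_sum += w * (x - y) ** 2
            weight_total += w
    # Normalize by the total weight so the result stays on the MSE scale.
    return weighted_sum / weight_total


def rate_distortion_loss(rate_bpp, distortion, lam=0.01):
    """Generic rate-distortion objective: lam * D + R (assumed form)."""
    return lam * distortion + rate_bpp


# Toy 2x2 frame: the largest reconstruction error falls inside the ROI.
frame = [[10, 10], [10, 10]]
recon = [[12, 10], [10, 11]]
mask = [[1, 0], [0, 0]]

roi_distortion = roi_weighted_mse(frame, recon, mask)
uniform_distortion = roi_weighted_mse(frame, recon, mask, roi_weight=1.0)
loss = rate_distortion_loss(rate_bpp=0.05, distortion=roi_distortion)
```

In this toy frame the ROI-weighted distortion exceeds the uniformly weighted one because the error is concentrated inside the mask, so a loss of this kind would push a network to spend more bits on the ROI.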