VNVC: A Versatile Neural Video Coding Framework for Efficient Human-Machine Vision.

Xihua Sheng, Houqiang Li, Li Li, Dong Liu

doi:10.1109/tpami.2024.3356548

Abstract

Almost all digital videos are coded into compact representations before being transmitted. Such compact representations need to be decoded back to pixels before being displayed to humans and, as usual, before being enhanced/analyzed by machine vision algorithms. Intuitively, it is more efficient to enhance/analyze the coded representations directly without decoding them into pixels. Therefore, we propose a versatile neural video coding (VNVC) framework, which targets learning compact representations to support both reconstruction and direct enhancement/analysis, thereby being versatile for both human and machine vision. Our VNVC framework has a feature-based compression loop. In the loop, one frame is encoded into compact representations and decoded to an intermediate feature that is obtained before performing reconstruction. The intermediate feature can be used as a reference in motion compensation and motion estimation through feature-based temporal context mining and a cross-domain motion encoder-decoder to compress the following frames. The intermediate feature is directly fed into video reconstruction, video enhancement, and video analysis networks to evaluate its effectiveness. The evaluation shows that our framework with the intermediate feature achieves high compression efficiency for video reconstruction and satisfactory task performance with lower complexity.
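
The feature-based compression loop described in the abstract can be illustrated with a deliberately simplified sketch. Everything in it is an assumption made for illustration and is not the authors' implementation: the function names (`encode`, `decode_to_feature`, `reconstruct`, `analyze`, `code_sequence`) are hypothetical, uniform residual quantization stands in for the learned encoder and decoder, and motion estimation, temporal context mining, and entropy coding are omitted entirely. The only point it tries to show is the data flow: each frame is decoded to an intermediate feature, that feature serves as the reference for the next frame, and both the reconstruction and the analysis heads consume the feature directly.

```python
from typing import List, Optional, Tuple

Feature = List[float]


def encode(frame: Feature, ref: Optional[Feature], step: float = 0.5) -> List[int]:
    """Toy encoder (assumption): quantize the residual against the reference feature."""
    pred = ref if ref is not None else [0.0] * len(frame)
    return [round((x - p) / step) for x, p in zip(frame, pred)]


def decode_to_feature(symbols: List[int], ref: Optional[Feature], step: float = 0.5) -> Feature:
    """Toy decoder (assumption): rebuild the intermediate feature from the symbols."""
    pred = ref if ref is not None else [0.0] * len(symbols)
    return [p + s * step for s, p in zip(symbols, pred)]


def reconstruct(feature: Feature) -> Feature:
    """Hypothetical reconstruction head: clip the feature to a pixel range."""
    return [min(255.0, max(0.0, v)) for v in feature]


def analyze(feature: Feature) -> float:
    """Hypothetical analysis head: works on the feature, not on decoded pixels."""
    return sum(feature) / len(feature)


def code_sequence(frames: List[Feature], step: float = 0.5) -> List[Tuple[List[int], Feature]]:
    """Run the loop: the decoded feature of frame t is the reference for frame t+1."""
    ref: Optional[Feature] = None
    results = []
    for frame in frames:
        symbols = encode(frame, ref, step)
        feature = decode_to_feature(symbols, ref, step)
        results.append((symbols, feature))
        ref = feature  # the reference stays in the feature domain
    return results


if __name__ == "__main__":
    frames = [[10.0, 20.2, 30.7], [10.4, 20.1, 31.9]]
    for symbols, feature in code_sequence(frames):
        print(symbols, reconstruct(feature), analyze(feature))
```

In this toy setting the intermediate feature has the same shape as the frame, which is not true of a learned codec; the sketch only mirrors the ordering of steps in the loop.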

Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence
Publication Date: Jul 1, 2024
Citations: 2

Similar Papers

Video Coding for Machines: Compact Visual Representation Compression for Intelligent Collaborative Analytics.
Wenhan Yang ... Ling-Yu Duan
IEEE Transactions on Pattern Analysis and Machine Intelligence | VOL. 46
01 Jul 2024

Multigrid block matching motion estimation for generic video coding
Frédéric Dufaux
Signal Processing | VOL. 37
01 May 1994

Intermediate deep feature coding for human–machine vision collaboration
Weiqian Wang ... Chao Yang
Journal of Visual Communication and Image Representation | VOL. 95
01 Jun 2023

Unsupervised motion-based object segmentation refined by color
Matthijs C Piek ... Ralph Braspenning
16 Jun 2003
