Abstract

Multi-modal services, typically integrating signals such as audio, video, and haptics, will become an inevitable application trend in 5G and beyond. However, due to the essential differences between haptic and audio/video signals, existing coding schemes usually fail to satisfy the critical requirements in terms of rate-distortion performance. Inspired by the observation that hearing, sight, and touch are highly correlated, we propose the framework of cross-modal coding, which compresses multi-modal signals aided by their semantic correlation. In particular, the highlights of this work lie in addressing three fundamental technical problems: i) how to exploit the semantic correlation among different modalities, ii) how much benefit cross-modal coding can provide, and iii) how to design a general cross-modal codec. On the theoretical end, we determine the minimum number of bits required to compress haptic signals under given rate conditions of the video stream by investigating their semantic correlation. On the technical end, we design a general cross-modal codec that approaches the optimal compression limit by using AI-enabled cross-modal prediction and channel coding. Numerical results demonstrate that the proposed cross-modal coding achieves significant gains over existing schemes, especially when the multi-modal signals have strong semantic correlation.
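To make the core idea concrete, the following is a minimal, illustrative sketch (not the authors' codec): a toy cross-modal scheme in which a predictor estimates the haptic signal from a correlated video feature and only the quantized prediction residual is entropy-coded. The synthetic signal model, the linear predictor, and all function names are assumptions made for this example; it is intended only to show why the required bit rate drops as the cross-modal correlation grows.

```python
# Toy cross-modal residual coding: the stronger the correlation between the
# video side information and the haptic signal, the fewer bits the residual
# needs. Purely illustrative; not the codec described in the paper.
import numpy as np

def empirical_entropy_bits(symbols):
    """Shannon entropy (bits/symbol) of an integer symbol stream."""
    _, counts = np.unique(symbols, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def simulate(correlation, n=100_000, step=0.05, seed=0):
    rng = np.random.default_rng(seed)
    video_feature = rng.standard_normal(n)          # side information at the decoder
    noise = rng.standard_normal(n)
    # Haptic samples correlated with the video feature (assumed signal model).
    haptic = correlation * video_feature + np.sqrt(1 - correlation**2) * noise

    # Intra-modal baseline: quantize and entropy-code the haptic signal directly.
    direct_bits = empirical_entropy_bits(np.round(haptic / step).astype(int))

    # Cross-modal coding: predict haptic from video, code only the residual.
    residual = haptic - correlation * video_feature  # optimal linear predictor here
    cross_bits = empirical_entropy_bits(np.round(residual / step).astype(int))
    return direct_bits, cross_bits

if __name__ == "__main__":
    for rho in (0.0, 0.5, 0.9, 0.99):
        direct_bits, cross_bits = simulate(rho)
        print(f"rho={rho:.2f}  direct={direct_bits:.2f} bits/sample  "
              f"cross-modal={cross_bits:.2f} bits/sample")
```

Running the sketch shows the rate of the cross-modal branch shrinking toward zero as the correlation approaches one, while the direct-coding rate stays fixed, which mirrors the intuition behind compressing haptic signals with video side information.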
