Abstract

Multi-modal services, which combine video, audio, and haptic signals, aim to provide immersive experiences with low latency and high reliability. Although multi-modal signals differ in structure, transmission delay, and jitter, there are potential cross-modal relations among them that can be exploited for multi-modal signal processing and streaming. As a revolutionary communication paradigm, semantic communications can not only provide human-oriented services by delivering intended meanings, but can also break modality barriers through the compact semantic information extracted from multi-modal signals, thereby enabling cross-modal applications. However, owing to polysemy and ambiguity, guaranteeing high reliability in semantic communications remains a major challenge. To address this challenge, this article designs a cross-modal semantic communication paradigm with three modules: a cross-modal knowledge graph (CKG) that provides essential background knowledge for encoding and signal patches for decoding; a cross-modal semantic encoder that infers potential implicit semantics to reduce encoding polysemy; and a cross-modal semantic decoder that enforces consistency between the source and recovered signals at both the bit level and the semantic level to reduce decoding ambiguity. Together, these modules improve the reliability of the communication system, as verified by simulation results.
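As a rough illustration of how the three modules might interact, the Python sketch below models the pipeline at a toy level: the CKG supplies background knowledge to the encoder and signal patches to the decoder, and the decoder accepts a reconstruction only when it passes a consistency check. All names here (CrossModalKG, encode, decode) and the dictionary-based knowledge graph are assumptions made for illustration; the abstract does not specify the actual module implementations.

```python
# Toy sketch of the cross-modal semantic communication pipeline.
# All classes and functions are hypothetical stand-ins, not the
# article's actual architecture.

from dataclasses import dataclass, field


@dataclass
class CrossModalKG:
    """Cross-modal knowledge graph shared by sender and receiver."""
    triples: dict = field(default_factory=dict)  # concept -> related concepts
    patches: dict = field(default_factory=dict)  # concept -> cached signal patch

    def related(self, concept: str) -> list:
        """Background knowledge used by the encoder to resolve polysemy."""
        return self.triples.get(concept, [])

    def patch(self, concept: str):
        """Signal patch used by the decoder to reconstruct the source."""
        return self.patches.get(concept)


def encode(signal_concepts: list, ckg: CrossModalKG) -> list:
    """Cross-modal semantic encoder: attach implicit semantics inferred
    from the CKG to each explicit concept, reducing encoding polysemy."""
    return [(c, ckg.related(c)) for c in signal_concepts]


def decode(semantics: list, ckg: CrossModalKG) -> list:
    """Cross-modal semantic decoder: recover signal patches from the CKG
    and keep only reconstructions consistent with the transmitted
    semantics, reducing decoding ambiguity."""
    recovered = []
    for concept, context in semantics:
        patch = ckg.patch(concept)
        # A patch must exist (bit-level stand-in) and the transmitted
        # context must agree with the receiver's background knowledge
        # (semantic-level stand-in).
        if patch is not None and set(context) == set(ckg.related(concept)):
            recovered.append(patch)
    return recovered


if __name__ == "__main__":
    ckg = CrossModalKG(
        triples={"dog": ["bark", "fur"]},
        patches={"dog": b"\x00\x01"},  # stand-in for an audio/video patch
    )
    sem = encode(["dog"], ckg)
    print(decode(sem, ckg))  # [b'\x00\x01']
```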
