Abstract

Semantic communication in 6G wireless systems focuses on extracting and transmitting only the inherent meaning of the information, that is, the sender's intention rather than the raw signal. Existing techniques still face challenges in extracting emotional cues from the information, achieving high compression rates, and avoiding privacy leakage caused by knowledge sharing during communication. Large-scale generative models, meanwhile, can rapidly produce multimodal content according to user requirements. This paper proposes an approach that leverages large-scale generative models to create animated short films that are semantically and emotionally similar to real scenes and characters. The visual content of the data source is first converted into a text description through semantic understanding; emotional cues inferred from the source media are then added to this text through reinforcement learning; finally, a large-scale generative model produces visual media that is consistent with the semantics of the data source. The paper builds this semantic communication pipeline from distinct modules and evaluates the improvement gained by adding the emotion-enhancement module. The approach enables the rapid generation of diverse media forms and volumes according to the user's intention, supporting generated multimodal media in applications such as the metaverse and intelligent driving systems.
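
The following is a minimal sketch of the three-stage pipeline summarized above. All function names, the placeholder outputs, and the stand-in models are hypothetical illustrations, not the paper's actual implementation; a real system would plug in a captioning model, an RL-trained emotion module, and a text-to-video generator at the marked points.

    # Hypothetical sketch of the proposed semantic communication pipeline.
    # Each function is a placeholder for one module described in the abstract.

    def extract_semantics(frames):
        """Stage 1: convert the visual content of the source into text."""
        # Placeholder: a real system would run an image/video
        # captioning (semantic understanding) model here.
        return "a person walks along a rainy street at night"

    def enhance_with_emotion(description, source_media):
        """Stage 2: add emotional cues inferred from the source media."""
        # Placeholder: the paper trains this step with reinforcement
        # learning, rewarding text whose emotional tone matches the source.
        detected_emotion = "melancholy"  # e.g., from an affect classifier
        return f"{description}, conveying a {detected_emotion} mood"

    def generate_video(prompt):
        """Stage 3: synthesize media consistent with the source semantics."""
        # Placeholder: a real system would call a large-scale
        # text-to-video generative model here.
        return f"<animated short film rendered from: '{prompt}'>"

    if __name__ == "__main__":
        source_frames = ["frame_0.png", "frame_1.png"]  # source-scene frames
        text = extract_semantics(source_frames)
        text = enhance_with_emotion(text, source_frames)
        print(generate_video(text))

Only the emotion-enhanced text description needs to cross the channel; the receiver regenerates the visual media from it, which is the source of the compression gain the abstract refers to.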
