Abstract

Cross-modal communications, which aim to collaboratively deliver and process audio, visual, and haptic signals, have gradually become the supporting technology for emerging multi-modal services. However, the inevitable resource competition among different modality signals, together with unexpected packet loss and latency during transmission, seriously degrades the quality of the received signals and the end user's immersive experience (especially the visual experience). To overcome these problems, this paper proposes a cross-modal signal reconstruction strategy grounded in human perceptual characteristics. It seeks to guarantee visual signal quality by exploiting potential correlations among modalities when processing audio and haptic signals. On the one hand, a time-frequency masking-based audio-haptic redundancy elimination mechanism is designed, drawing on the similarity of audio and haptic signal characteristics and on human masking effects. On the other hand, based on the fact that non-visual perception can help form and enhance visual perception, an audio-haptic fused visual signal restoration (AHFVR) approach is proposed to handle impaired and delayed visual signals. Experiments on a standard multi-modal database and a purpose-built practical platform evaluate the performance of the proposed perception-aware cross-modal signal reconstruction strategy.
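
The redundancy elimination step builds on the psychoacoustic observation that a weak haptic cue occurring simultaneously with a much stronger audio component tends to be perceptually masked and thus need not be transmitted. Below is a minimal illustrative sketch of that idea, not the authors' actual algorithm: the function name, frame length, and masking margin (`mask_margin_db`) are hypothetical, and a simple per-frame spectral-energy comparison stands in for whatever time-frequency masking model the paper actually uses.

```python
import numpy as np

def masked_haptic_frames(audio, haptic, frame_len=256, mask_margin_db=12.0):
    """Illustrative time-frequency masking check (hypothetical parameters).

    Splits both signals into equal-length frames, then flags haptic frames
    whose spectral energy falls more than `mask_margin_db` below the energy
    of the simultaneous audio frame. Such frames are treated as perceptually
    masked, i.e. redundant, and could be dropped before transmission.
    """
    # Truncate both signals to a whole number of frames and split.
    n = min(len(audio), len(haptic)) // frame_len * frame_len
    a = np.asarray(audio[:n], dtype=float).reshape(-1, frame_len)
    h = np.asarray(haptic[:n], dtype=float).reshape(-1, frame_len)

    # Per-frame spectral energies in dB (epsilon avoids log of zero).
    a_db = 10 * np.log10(np.sum(np.abs(np.fft.rfft(a, axis=1)) ** 2, axis=1) + 1e-12)
    h_db = 10 * np.log10(np.sum(np.abs(np.fft.rfft(h, axis=1)) ** 2, axis=1) + 1e-12)

    # A haptic frame is considered masked when it sits more than
    # mask_margin_db below the concurrent audio frame.
    return h_db < (a_db - mask_margin_db)

# Example use: keep only the unmasked haptic frames before encoding.
# keep = ~masked_haptic_frames(audio_sig, haptic_sig)
```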
