Abstract

Existing stereoscopic image discomfort prediction methods may fail because it is difficult to extract discomfort features from a stereoscopic image's statistical information, since the mechanism of human binocular vision is highly complex. In this work, we propose a dual-stream multi-level interactive network for stereoscopic image discomfort prediction that is fully end-to-end trainable. The method first extracts multi-level fusion and difference features from stereoscopic images through a multi-level interaction network. Then, the low-, mid- and high-level feature maps are concatenated to simulate the complex visual interaction mechanism of the human visual system (HVS). Finally, two fully connected layers serve as a non-linear regression function that maps the feature vectors to stereoscopic image discomfort scores. Extensive experiments demonstrate that our approach performs favorably against existing prediction models on the IEEE-SA and NBU-S3D datasets.
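The pipeline described in the abstract can be sketched at a high level in plain NumPy. This is an illustrative stand-in only, not the paper's implementation: the actual network uses learned convolutional streams, and all dimensions, weight names, and the choice of sum/absolute-difference as the fusion and difference interactions here are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv_block(x, w):
    # Stand-in for one convolutional stage: linear map + ReLU.
    return np.maximum(x @ w, 0.0)

# Hypothetical feature dimensions for the three levels.
d_in, d1, d2, d3 = 64, 32, 16, 8
w1 = rng.standard_normal((d_in, d1)) * 0.1
w2 = rng.standard_normal((d1, d2)) * 0.1
w3 = rng.standard_normal((d2, d3)) * 0.1
w_fc1 = rng.standard_normal((2 * (d1 + d2 + d3), 16)) * 0.1
w_fc2 = rng.standard_normal(16) * 0.1

def stream_levels(x):
    # One view's stream, producing low-, mid- and high-level features.
    f1 = conv_block(x, w1)
    f2 = conv_block(f1, w2)
    f3 = conv_block(f2, w3)
    return f1, f2, f3

def predict_discomfort(left, right):
    lf = stream_levels(left)
    rf = stream_levels(right)
    # At each level, interact the two streams: fusion (sum) and
    # difference (absolute difference) features, then concatenate
    # all levels into one vector (the multi-level concatenation).
    feats = [np.concatenate([l + r, np.abs(l - r)])
             for l, r in zip(lf, rf)]
    v = np.concatenate(feats)
    # Two fully connected layers as the non-linear regression head.
    h = np.maximum(v @ w_fc1, 0.0)
    return float(h @ w_fc2)

# Usage with random stand-ins for the left and right views.
left = rng.standard_normal(d_in)
right = rng.standard_normal(d_in)
score = predict_discomfort(left, right)
```

In a trained model, the weights would be learned end-to-end from annotated discomfort scores rather than sampled at random.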
