Abstract

In video applications it is necessary to continuously measure the video quality perceived by the end user. It is therefore desirable to know which parts of a video frame, i.e., which contents, attract viewers’ attention; with this information, perceived video quality can be estimated in a meaningful way. However, automatic detection of viewers’ fixation points is a time-consuming process and is often omitted in objective video quality assessment (VQA) metrics. Building on our previous work, in which we proposed the Foveation-based content Adaptive Root Mean Squared Error (FARMSE) VQA metric, in this work we propose two new full-reference (FR) VQA metrics, called Multi-Point FARMSE (MP-FARMSE) and Simple-FARMSE (S-FARMSE). Both newly proposed metrics are based on the foveated-vision features of the human visual system and the spatio-temporal features of the video signal. In MP-FARMSE, using an engineering approach, we account for the fact that a viewer’s attention can be directed away from the center of the frame, thus covering use cases in which the objects of interest are not located in the center of the frame. The main idea behind the S-FARMSE metric was to reduce the computational complexity of the final algorithm and to make S-FARMSE capable of processing high-resolution video signals in real time. The performance of the newly proposed metrics is compared with that of seven existing VQA metrics on two different video quality databases. The results show that the performance achieved by MP-FARMSE and S-FARMSE is quite close to that of state-of-the-art VQA metrics, while at the same time their computational complexity is significantly lower.
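The core idea the abstract describes, pooling a frame-difference error with foveation weights centered on one or more assumed fixation points, can be sketched roughly as follows. This is a minimal illustrative sketch, not the published FARMSE/MP-FARMSE algorithm: the Gaussian acuity falloff, the `sigma` parameter, and the max-pooling over fixation points are assumptions made here for illustration only.

```python
import numpy as np

def foveation_weights(h, w, fx, fy, sigma):
    # Assumed Gaussian falloff of visual acuity around fixation point (fx, fy);
    # the actual FARMSE foveation model may differ.
    ys, xs = np.mgrid[0:h, 0:w]
    d2 = (xs - fx) ** 2 + (ys - fy) ** 2
    return np.exp(-d2 / (2.0 * sigma ** 2))

def foveated_rmse(ref, dist, fixations, sigma=64.0):
    # Weighted RMSE between a reference and a distorted frame; each pixel's
    # squared error is weighted by its maximum foveation weight over all
    # fixation points (the multi-point aspect MP-FARMSE addresses).
    h, w = ref.shape
    wmap = np.zeros((h, w))
    for fx, fy in fixations:
        wmap = np.maximum(wmap, foveation_weights(h, w, fx, fy, sigma))
    se = (ref.astype(np.float64) - dist.astype(np.float64)) ** 2
    return float(np.sqrt((wmap * se).sum() / wmap.sum()))
```

Under this sketch, an identical frame pair scores 0, and the same pixel error is penalized more when it falls near a fixation point than when it falls in the visual periphery.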
