Abstract

Recently, driven by the demand for immersive communication, region-of-interest (ROI) based high efficiency video coding (HEVC) approaches for conferencing scenarios have become increasingly important. However, no objective metric has been developed specifically for efficiently evaluating the perceived visual quality of coded conferencing video. This paper therefore proposes a novel objective quality assessment method for perceptual video conferencing coding, namely the Gaussian mixture model based peak signal-to-noise ratio (GMM-PSNR). First, eye tracking experiments are introduced, together with a real-time technique for extracting the face and facial features. In these experiments, the importance of the background, face, and facial feature regions is identified and then quantified from the eye fixation points recorded over the test videos. Next, assuming that the distribution of the eye fixation points follows a Gaussian mixture model, we use the expectation-maximization (EM) algorithm to generate an importance weight map for each frame of the coded conferencing video, based on a newly defined quantity, eye fixation points per pixel (efp/p). Using the generated weight map, GMM-PSNR assesses quality by assigning a different weight to the distortion of each pixel in the video frame. Finally, we conduct experiments to examine how well the proposed GMM-PSNR and conventional objective metrics correlate with subjective quality scores. The experimental results show the effectiveness of GMM-PSNR.
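The idea of weighting per-pixel distortion by an importance map can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not give the exact weighting formula, so the sketch assumes a weight map that is normalized to sum to one, and it takes the mixture parameters as given rather than estimating them with EM from eye fixation data. The function names (`gaussian_pdf_2d`, `gmm_weight_map`, `weighted_psnr`), the diagonal covariances, and the example component values are all hypothetical.

```python
import math

def gaussian_pdf_2d(x, y, mean, var):
    """Density of an axis-aligned 2-D Gaussian (diagonal covariance, an assumption)."""
    (mx, my), (vx, vy) = mean, var
    norm = 1.0 / (2.0 * math.pi * math.sqrt(vx * vy))
    return norm * math.exp(-0.5 * ((x - mx) ** 2 / vx + (y - my) ** 2 / vy))

def gmm_weight_map(width, height, components):
    """Build a height x width weight map from mixture components.

    components: list of (mixing_weight, (mean_x, mean_y), (var_x, var_y)).
    The parameters are assumed to be known here; in the paper they would
    come from EM applied to eye fixation points.
    The map is normalized so that all weights sum to 1.
    """
    raw = [[sum(pi * gaussian_pdf_2d(x, y, mean, var)
                for pi, mean, var in components)
            for x in range(width)]
           for y in range(height)]
    total = sum(sum(row) for row in raw)
    return [[v / total for v in row] for row in raw]

def weighted_psnr(reference, distorted, weights, peak=255.0):
    """PSNR computed from a weighted mean squared error.

    With weights that sum to 1, the weighted MSE is sum(w * (ref - dist)^2).
    Returns infinity when the two frames are identical.
    """
    wmse = sum(w * (r - d) ** 2
               for r_row, d_row, w_row in zip(reference, distorted, weights)
               for r, d, w in zip(r_row, d_row, w_row))
    if wmse == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / wmse)

# Hypothetical 8x8 luma frame with one "face" component centred at (4, 4).
W, H = 8, 8
weights = gmm_weight_map(W, H, [(1.0, (4.0, 4.0), (2.0, 2.0))])
reference = [[100] * W for _ in range(H)]

# Same single-pixel error, placed once on the face centre and once in a corner.
face_error = [row[:] for row in reference]
face_error[4][4] += 40
corner_error = [row[:] for row in reference]
corner_error[0][0] += 40

print(weighted_psnr(reference, face_error, weights))
print(weighted_psnr(reference, corner_error, weights))
```

In this sketch an error at the face centre lowers the score more than an identical error in the background corner, which is the behaviour a fixation-weighted metric is meant to capture; plain PSNR would rate the two frames equally.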
