Existing quality metrics for view synthesis usually operate on synthesized images, which are produced by a computationally expensive depth-image-based rendering (DIBR) process. Moreover, current metrics quantify quality by extracting hand-crafted features, which may fail to fully capture the complex distortion characteristics of synthesized views. With the success of deep learning on numerous computer vision tasks, it has become possible to use convolutional neural networks to predict the quality of DIBR-synthesized images. In this letter, we propose a deep model that predicts the quality of view synthesis based on Curriculum-style Structure Generation, without performing the DIBR process. Specifically, considering that the distortion introduced by view synthesis mainly manifests as the destruction of image structure, a structure generation network is first built to learn the structure of the new view from the original one via curriculum-style training. Then, we transfer the prior knowledge learned in this first phase into a quality prediction network for measuring structure distortion, on top of which a regressor is introduced to produce the quality score. Experimental results demonstrate the advantages of the proposed model.
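As a rough illustration only (not the authors' implementation), the two-stage design described above could be organized as in the following PyTorch sketch; all module names (e.g., StructureGenNet, QualityNet), layer counts, and shapes are hypothetical assumptions.

```python
# Hypothetical sketch of the two-stage pipeline outlined in the abstract.
# Architecture details and training procedure are assumptions, not the
# authors' actual design.
import torch
import torch.nn as nn


class StructureGenNet(nn.Module):
    """Stage 1 (assumed): generates a structure map of the novel view from
    the original view, trained in a curriculum-style fashion."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.Conv2d(64, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1),  # predicted structure map
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))


class QualityNet(nn.Module):
    """Stage 2 (assumed): reuses the pretrained encoder to capture structure
    distortion and regresses a scalar quality score."""
    def __init__(self, pretrained_encoder):
        super().__init__()
        self.encoder = pretrained_encoder  # transferred prior knowledge
        self.regressor = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, 32), nn.ReLU(),
            nn.Linear(32, 1),  # predicted quality score
        )

    def forward(self, x):
        return self.regressor(self.encoder(x))


# Usage sketch: pretrain stage 1, then transfer its encoder to stage 2.
gen = StructureGenNet()
# ... curriculum-style pretraining of `gen` would happen here ...
quality_model = QualityNet(gen.encoder)
score = quality_model(torch.rand(1, 3, 256, 256))  # dummy input image
```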