We present a simple yet effective optical flow-based full-reference video quality assessment (FR-VQA) algorithm for assessing the perceptual quality of natural videos. Our algorithm is based on the premise that local optical flow statistics are affected by distortions and the deviation from pristine flow statistics is proportional to the amount of distortion. We characterize the local flow statistics using the mean, the standard deviation, the coefficient of variation (CV), and the minimum eigenvalue ( λ min ) of the local flow patches. Temporal distortion is estimated as the change in the CV of the distorted flow with respect to the reference flow, and the correlation between λ min of the reference and of the distorted patches. We rely on the robust multi-scale structural similarity index for spatial quality estimation. The computed temporal and spatial distortions, thus, are then pooled using a perceptually motivated heuristic to generate a spatio-temporal quality score. The proposed method is shown to be competitive with the state-of-the-art when evaluated on the LIVE SD database, the EPFL Polimi SD database, and the LIVE Mobile HD database. The distortions considered in these databases include those due to compression, packet-loss, wireless channel errors, and rate-adaptation. Our algorithm is flexible enough to allow for any robust FR spatial distortion metric for spatial distortion estimation. In addition, the proposed method is not only parameter-free but also independent of the choice of the optical flow algorithm. Finally, we show that the replacement of the optical flow vectors in our proposed method with the much coarser block motion vectors also results in an acceptable FR-VQA algorithm. Our algorithm is called the flow similarity index.
Read full abstract