THQA: A Perceptual Quality Assessment Database for Talking Heads

Abstract

In the realm of media technology, digital humans have gained prominence due to rapid advancements in computer technology. However, the manual modeling and control required for the majority of digital humans pose significant obstacles to efficient development. Speech-driven methods offer a novel avenue for manipulating the mouth shape and expressions of digital humans. Despite the proliferation of driving methods, the quality of many generated talking head (TH) videos remains a concern, impacting user visual experiences. To tackle this issue, this paper introduces the Talking Head Quality Assessment (THQA) database, featuring 800 TH videos generated through 8 diverse speech-driven methods. Extensive experiments affirm the THQA database's richness in character and speech features. Subsequent subjective quality assessment experiments analyze correlations between scoring results and speech-driven methods, ages, and genders. In addition, experimental results show that mainstream image and video quality assessment methods have limitations for the THQA database, underscoring the imperative for further research to enhance TH video quality assessment. The THQA database is publicly accessible at https://github.com/zyj-2000/THQA.

Similar Papers
  • Research Article
  • Citations: 190
  • 10.1109/tip.2013.2240003
Perceptual Full-Reference Quality Assessment of Stereoscopic Images by Considering Binocular Visual Characteristics
  • Jan 14, 2013
  • IEEE Transactions on Image Processing
  • Feng Shao + 4 more

Perceptual quality assessment is a challenging issue in 3D signal processing research. It is important to study 3D signal directly instead of studying simple extension of the 2D metrics directly to the 3D case as in some previous studies. In this paper, we propose a new perceptual full-reference quality assessment metric of stereoscopic images by considering the binocular visual characteristics. The major technical contribution of this paper is that the binocular perception and combination properties are considered in quality assessment. To be more specific, we first perform left-right consistency checks and compare matching error between the corresponding pixels in binocular disparity calculation, and classify the stereoscopic images into non-corresponding, binocular fusion, and binocular suppression regions. Also, local phase and local amplitude maps are extracted from the original and distorted stereoscopic images as features in quality assessment. Then, each region is evaluated independently by considering its binocular perception property, and all evaluation results are integrated into an overall score. Besides, a binocular just noticeable difference model is used to reflect the visual sensitivity for the binocular fusion and suppression regions. Experimental results show that compared with the relevant existing metrics, the proposed metric can achieve higher consistency with subjective assessment of stereoscopic images.

  • Conference Article
  • Citations: 19
  • 10.1109/icassp.2010.5495313
Spatial and temporal pooling of image quality metrics for perceptual video quality assessment on packet loss streams
  • Jan 1, 2010
  • Junyong You + 2 more

Video streaming through bandwidth-limited channels often suffers from packet losses. Therefore, perceptual quality assessment on video sequences with packet losses is a critical issue in digital video communications. This paper analyzes several image quality metrics and evaluates their applications using spatial and temporal pooling schemes in perceptual video quality assessment for video streams with packet losses. Several approaches using Minkowski summation and averages over different distorted spatial regions and temporal frames to pool the spatial and temporal qualities are evaluated. The experimental results with respect to the subjective video quality measurements demonstrate that the subjects are more sensitive to the most annoying spatial regions and temporal segments when assessing the video quality of the lossy streams.
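
The pooling idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `minkowski_pool` and `pool_video`, the exponent values, and the convention that scores are distortions (higher means worse) are all assumptions made for illustration. A Minkowski mean with a larger exponent gives more weight to the worst regions or frames, which matches the observation that viewers are most sensitive to the most annoying regions and segments.

```python
def minkowski_pool(values, p=2.0):
    """Minkowski mean of non-negative distortion scores (assumed: higher = worse)."""
    values = list(values)
    if not values:
        raise ValueError("values must be non-empty")
    if p <= 0:
        raise ValueError("p must be positive")
    return (sum(abs(v) ** p for v in values) / len(values)) ** (1.0 / p)


def pool_video(frame_region_scores, p_spatial=2.0, p_temporal=2.0):
    """Pool region scores within each frame, then pool the frame scores over time."""
    frame_scores = [minkowski_pool(regions, p_spatial) for regions in frame_region_scores]
    return minkowski_pool(frame_scores, p_temporal)


if __name__ == "__main__":
    # One badly distorted region among clean ones (hypothetical numbers).
    regions = [0.0, 0.0, 0.0, 4.0]
    print(minkowski_pool(regions, p=1.0))  # plain average
    print(minkowski_pool(regions, p=8.0))  # dominated by the worst region
```

With `p=1.0` the pooling reduces to a plain average; raising `p` moves the pooled value toward the maximum distortion.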

  • Research Article
  • 10.1016/j.dcan.2024.07.001
Perceptual point cloud quality assessment for immersive metaverse experience
  • Jun 1, 2025
  • Digital Communications and Networks
  • Baoping Cheng + 4 more

  • Research Article
  • Citations: 167
  • 10.1007/s11432-024-4133-3
Perceptual video quality assessment: a survey
  • Oct 17, 2024
  • Science China Information Sciences
  • Xiongkuo Min + 4 more

Perceptual video quality assessment plays a vital role in the field of video processing due to the existence of quality degradations introduced in various stages of video signal acquisition, compression, transmission and display. With the advancement of Internet communication and cloud service technology, video content and traffic are growing exponentially, which further emphasizes the requirement for accurate and rapid assessment of video quality. Therefore, numerous subjective and objective video quality assessment studies have been conducted over the past two decades for both generic videos and specific videos such as streaming, user-generated content, 3D, virtual and augmented reality, high dynamic range, high frame rate, audio-visual, etc. This survey provides an up-to-date and comprehensive review of these video quality assessment studies. Specifically, we first review the subjective video quality assessment methodologies and databases, which are necessary for validating the performance of video quality metrics. Second, the objective video quality assessment measures for general purposes are categorized and surveyed according to the methodologies utilized in the quality measures. Third, we overview the objective video quality assessment measures for specific applications and emerging topics. Finally, the performance of the state-of-the-art video quality assessment measures is compared and analyzed. This survey provides a systematic overview of both classical works and recent progress in the realm of video quality assessment, which can help other researchers quickly access the field and conduct relevant research.

  • Conference Article
  • Citations: 3
  • 10.1109/apsipaasc47483.2019.9023009
A Study of Perceptual Quality Assessment for Stereoscopic Image Retargeting
  • Nov 1, 2019
  • Zhenqi Fu + 3 more

Subjective and objective perceptual quality assessment for stereoscopic retargeted images is a fundamentally important issue in stereoscopic image retargeting (SIR) which has not been deeply investigated. Here, a stereoscopic image retargeting quality assessment (SIRQA) database is proposed to study the perceptual quality of different stereoscopic retargeted images. To construct the database, we collect 720 stereoscopic retargeted images generated by eight representative SIR methods. The perceptual quality (mean opinion scores, MOS) of each stereoscopic retargeted image is subjectively rated by 30 viewers. For objective assessment, several publicly available quality evaluation metrics are tested on the database. Experimental results show that there is a large room for improving the accuracy of objective quality assessment in SIRQA by comprehensively considering geometric distortion, content loss and stereoscopic perceptual quality.
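
As a rough illustration of how a mean opinion score could be derived from a panel of raw ratings, the sketch below averages the ratings for one stimulus and reports a 95% confidence half-width under a normal approximation. The abstract does not describe the authors' processing, so the function name `mean_opinion_score`, the rating values, and the absence of any outlier or subject screening are assumptions.

```python
import math
from statistics import mean, stdev


def mean_opinion_score(ratings):
    """Return (MOS, 95% confidence half-width) for one stimulus.

    Assumes independent ratings and a normal approximation (z = 1.96).
    """
    n = len(ratings)
    if n < 2:
        raise ValueError("need at least two ratings")
    mos = mean(ratings)
    half_width = 1.96 * stdev(ratings) / math.sqrt(n)
    return mos, half_width


if __name__ == "__main__":
    # Hypothetical ratings from a small panel on a 1-5 scale.
    mos, ci = mean_opinion_score([4, 5, 3, 4, 4, 5, 3, 4])
    print(f"MOS = {mos:.2f} +/- {ci:.2f}")
```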

  • Research Article
  • Citations: 617
  • 10.1007/s11432-019-2757-1
Perceptual image quality assessment: a survey
  • Apr 26, 2020
  • Science China Information Sciences
  • Guangtao Zhai + 1 more

Perceptual quality assessment plays a vital role in the visual communication systems owing to the existence of quality degradations introduced in various stages of visual signal acquisition, compression, transmission and display. Quality assessment for visual signals can be performed subjectively and objectively, and objective quality assessment is usually preferred owing to its high efficiency and easy deployment. A large number of subjective and objective visual quality assessment studies have been conducted during recent years. In this survey, we give an up-to-date and comprehensive review of these studies. Specifically, the frequently used subjective image quality assessment databases are first reviewed, as they serve as the validation set for the objective measures. Second, the objective image quality assessment measures are classified and reviewed according to the applications and the methodologies utilized in the quality measures. Third, the performances of the state-of-the-art quality measures for visual signals are compared with an introduction of the evaluation protocols. This survey provides a general overview of classical algorithms and recent progresses in the field of perceptual image quality assessment.

  • Research Article
  • Citations: 241
  • 10.1109/tip.2015.2465145
Perceptual Quality Assessment of Screen Content Images.
  • Aug 5, 2015
  • IEEE Transactions on Image Processing
  • Huan Yang + 2 more

Research on screen content images (SCIs) becomes important as they are increasingly used in multi-device communication applications. In this paper, we present a study on perceptual quality assessment of distorted SCIs subjectively and objectively. We construct a large-scale screen image quality assessment database (SIQAD) consisting of 20 source and 980 distorted SCIs. In order to get the subjective quality scores and investigate, which part (text or picture) contributes more to the overall visual quality, the single stimulus methodology with 11 point numerical scale is employed to obtain three kinds of subjective scores corresponding to the entire, textual, and pictorial regions, respectively. According to the analysis of subjective data, we propose a weighting strategy to account for the correlation among these three kinds of subjective scores. Furthermore, we design an objective metric to measure the visual quality of distorted SCIs by considering the visual difference of textual and pictorial regions. The experimental results demonstrate that the proposed SCI perceptual quality assessment scheme, consisting of the objective metric and the weighting strategy, can achieve better performance than 11 state-of-the-art IQA methods. To the best of our knowledge, the SIQAD is the first large-scale database published for quality evaluation of SCIs, and this research is the first attempt to explore the perceptual quality assessment of distorted SCIs.

  • Research Article
  • Citations: 27
  • 10.1109/42.974930
A multistage perceptual quality assessment for compressed digital angiogram images.
  • Jan 1, 2001
  • IEEE Transactions on Medical Imaging
  • J Oh + 3 more

This paper describes a multistage perceptual quality assessment (MPQA) model for compressed images. The motivation for the development of a perceptual quality assessment is to measure (in)visible differences between original and processed images. The MPQA produces visible distortion maps and quantitative error measures informed by considerations of the human visual system (HVS). Original and decompressed images are decomposed into different spatial frequency bands and orientations modeling the human cortex. Contrast errors are calculated for each frequency and orientation, and masked as a function of contrast sensitivity and background uncertainty. Spatially masked contrast error measurements are then made across frequency bands and orientations to produce a single perceptual distortion visibility map (PDVM). A perceptual quality rating (PQR) is calculated from the PDVM and transformed into a one to five scale, PQR(1-5), for direct comparison with the mean opinion score, generally used in subjective ratings. The proposed MPQA model is based on existing perceptual quality assessment models, while it is differentiated by the inclusion of contrast masking as a function of background uncertainty. A pilot study of clinical experiments on wavelet-compressed digital angiogram has been performed on a sample set of angiogram images to identify diagnostically acceptable reconstruction. Our results show that the PQR(1-5) of diagnostically acceptable lossy image reconstructions have better agreement with cardiologists' responses than objective error measurement methods, such as peak signal-to-noise ratio. A Perceptual thresholding and CSF-based Uniform quantization (PCU) method is also proposed using the vision models presented in this paper. The vision models are implemented in the thresholding and quantization stages of a compression algorithm and shown to produce improved compression ratio performance with less visible distortion than that of the embedded zerotrees wavelet (EZWs).

  • Conference Article
  • 10.1109/icce-berlin.2014.7034281
Perceptual audio quality assessment for coder evaluation
  • Sep 1, 2014
  • Julio C Garcia-Alvarez + 2 more

Sound is a human stimulus whose information is carried by pressure fluctuations reaching the ear. Perceived sound quality has recently been used to evaluate the performance of audio coders. Newer evaluation methodologies involve psychological and physiological responses of human listeners. However, there is no available methodology providing any relationship between the perceptual assessment and the audio coder performance evaluation. The present work illustrates the Perceptual Audio Quality Assessment (PQA), which analyzes the sound quality perceived by a listener, using a database composed of distorted audio recordings. The distorted records also pass through channels commonly used for audio signal transmission. This experiment allows us to determine a methodology that gives the first step to select an adequate audio compression format and the appropriate modulation/coding scheme for transmitting an audio signal format without a perceived loss.

  • Research Article
  • Citations: 3
  • 10.1016/j.jvcir.2022.103617
A no-reference perceptual image quality assessment database for learned image codecs
  • Aug 21, 2022
  • Journal of Visual Communication and Image Representation
  • Jiaqi Zhang + 2 more

  • Conference Article
  • Citations: 53
  • 10.1109/icip.2015.7351309
A video texture database for perceptual compression and quality assessment
  • Sep 1, 2015
  • Miltiadis Alexios Papadopoulos + 3 more

This paper presents a new publicly available video texture database (BVI Texture) that contains test sequences and subjective opinion scores. The database exhibits a wide range of static and dynamic textures together with some mixed content. Each sequence is indexed using various video feature descriptors that characterize its spatial activity, temporal activity, static texture content and dynamic texture content. Moreover, rate/distortion results for the new dataset are presented after compression using HEVC, alongside subjective quality evaluation data. The BVI texture database will provide utility in testing quality assessment metrics and emerging video compression methods, particularly those based on texture analysis and synthesis.

  • Research Article
  • 10.5594/jmi.2019.2941361
Visual Perception Entropy Measure
  • Nov 1, 2019
  • SMPTE Motion Imaging Journal
  • Francois Helt

This paper is about a theoretical framework on perceptual reproduction quality assessment. Visual perception is the main theme for two approaches on image metrics: visual performance and image quality. Visual performance looks at the perception of standard elementary images, such as gratings, by different observers or by a single observer in different visual conditions; image quality looks at the perception of variations of the same complex scene by a standard observer. In the audiovisual field, all the configurations approved by the creative team must be reproduced in the final viewing. This cannot be measured by visual performance or image quality indexes. A third approach is needed, which may be called perceptual reproduction quality assessment. Visual perception assessment is not measuring defects, per se; it is built to evaluate the perceptual effectiveness of a given scene coding through a given reproduction technology. It is only dependent on the number of configurations being perceived through a given theater system relative to the various configurations reproduced and approved in the review room. As such, it is a statistical measure. It is, in fact, similar to the definition of relative entropy. One important fact is that this evaluation must include a model of visual perception. This paper illustrates how relative entropy and entropy loss can be estimated, while taking into account the coding transform, the projector performances, maximum light, and the characteristics of the human visual system. This scheme works easily for monochromatic content. Color, trichromacy, is quite challenging. The size of configurations goes from 2^12 to 2^36, around 6 billion configurations. Calculation and memory requirements are huge. Nonlinearities are adding complexity, leading to calculation of an upper limit only of entropy loss. The presentation of results is also a challenge. One way to solve this is to separately process lightness and chromatic information.
In summary, the main arguments in this paper are: 1) besides image quality and visual performance, there is a need for perceptual reproduction quality assessment; 2) entropy is the proper statistical estimator for evaluating the perceptual reproduction quality; and 3) color reproduction quality estimation requires processing lightness and chromatic attributes separately.
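
For readers unfamiliar with the term, the standard definition of relative entropy (Kullback-Leibler divergence) between two discrete distributions can be sketched as follows. This is only the textbook quantity, D(P||Q) = sum of p_i * log2(p_i / q_i), and not the paper's perceptual measure. How the paper maps perceived configurations onto the two distributions is not stated in the abstract, so the function name `relative_entropy` and the example distributions are assumptions.

```python
import math


def relative_entropy(p, q):
    """Relative entropy D(P||Q) in bits for two discrete distributions.

    Terms with p_i == 0 contribute nothing; if q_i == 0 where p_i > 0,
    the divergence is infinite.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue
        if qi == 0:
            return math.inf
        total += pi * math.log2(pi / qi)
    return total


if __name__ == "__main__":
    # Hypothetical distributions over four configurations.
    reference = [0.25, 0.25, 0.25, 0.25]
    reproduced = [0.40, 0.40, 0.10, 0.10]
    print(relative_entropy(reference, reproduced))
```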

  • Conference Article
  • 10.5594/m001837
Visual perceptual entropy measure
  • Oct 1, 2018
  • Francois Helt

This paper is about a theoretical framework on perceptual reproduction quality assessment. Visual perception is the main theme for two approaches on image metrics, visual performance and image quality. Visual performance looks at the perception of standard elementary images, such as gratings, by different observers or by a single observer in different visual conditions; image quality looks at the perception of variations of the same complex scene by a standard observer. Visual performance, for which Barten proposed a model in 1999, is studying threshold responses in visual perception. For image quality, on the other hand, indexes are built which must evaluate responses to a variety of suprathreshold visual characteristics, attempting to give results as close as possible to human visual perception. In the audiovisual field, all the configurations approved by the creative team must be reproduced in the final viewing. This cannot be measured by visual performance or image quality indexes. A third approach is needed. This may be called perceptual reproduction quality assessment. Visual perception assessment is not measuring defects, per se; it is built to evaluate the perceptual effectiveness of a given scene coding through a given reproduction technology. It is only dependent on the number of configurations being reproduced by a given theater system relative to the various configurations reproduced and approved in the review room. As such, it is a statistical measure. It is in fact similar to the definition of relative entropy. One important fact is that this evaluation must include a model of visual perception. This paper shows how relative entropy, and entropy loss, can be estimated while taking into account the coding transform, the projector performances, maximum light and the characteristics of the human visual system. This scheme is carried out easily for monochromatic content. Color, trichromacy, is quite challenging. The size of configurations goes from 2^12 to 2^36, around 6 billion configurations. Calculation and memory requirements are huge. Non-linearities are adding complexity, leading to calculation of an upper limit only of entropy loss. Presentation of results is also a challenge. One way to solve this is to process lightness and chromatic information separately.
In summary, the main arguments in this paper are: 1) besides image quality and visual performance, there is a need for perceptual reproduction quality assessment; 2) entropy is the proper statistical estimator to evaluate the perceptual reproduction quality; and 3) color reproduction quality estimation requires processing lightness and chromatic attributes separately.

  • Conference Article
  • Citations: 1
  • 10.1109/globalsip.2014.7032286
No-reference perceptual quality assessment of streamed videos using optical flow features
  • Dec 1, 2014
  • Mohammed A Aabed + 1 more

This paper proposes a novel perceptual video quality assessment metric for streamed videos using optical flow statistical features. We analyze the impact of network losses on the decoded videos and the resulting error propagation. We show that the statistical features of the optical flow of the corrupted frames can be used to measure the distortion in the received video. We show that this approach is suitable for videos with complex motion patterns. Our technique does not make any assumptions on the coding conditions, network loss patterns or error concealment techniques. The proposed approach is pixel-based and relies only on the inconsistency of the optical flow of the corrupted frames. We validate our proposed quality metric by testing it on a variety of coded sequences subject to network losses from the recently proposed LIVE mobile database. Our results show that the proposed metric can estimate perceptual quality of channel-induced distortions at the frame and sequence levels. For the test videos, we report Pearson's and Spearman's correlation coefficients with the temporal mean opinion scores (MOSs) reported in the database. The results show average correlations of 0.91 and 0.92 for the test sequences, respectively.
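
The two correlation coefficients named in this abstract are the usual way to compare an objective metric against MOS values. The sketch below is a plain-Python illustration, not the authors' evaluation code: the function names, the tie handling by average ranks, and the sample numbers are assumptions. Pearson's coefficient measures linear agreement, while Spearman's coefficient is Pearson's coefficient computed on ranks and so measures monotonic agreement.

```python
import math


def pearson(x, y):
    """Pearson linear correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(vx * vy)


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Hypothetical metric outputs and MOS values for five sequences.
    predicted = [1.0, 2.0, 3.0, 4.0, 5.0]
    mos = [1.0, 4.0, 9.0, 16.0, 25.0]
    print(pearson(predicted, mos), spearman(predicted, mos))
```

In this example the relationship is monotonic but not linear, so the Spearman coefficient reaches 1 while the Pearson coefficient stays slightly below it.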

  • Conference Article
  • Citations: 2
  • 10.1109/icme.2015.7177436
Structure-preserving Image Quality Assessment
  • Jun 1, 2015
  • Yilin Wang + 2 more

Perceptual Image Quality Assessment (IQA) has many applications. Existing IQA approaches typically work only for one of three scenarios: full-reference, no-reference, or reduced-reference. Techniques that attempt to incorporate image structure information often rely on hand-crafted features, making them difficult to extend to different scenarios. On the other hand, objective metrics like Mean Square Error (MSE), while being easy to compute, are often deemed ineffective for measuring perceptual quality. This paper presents a novel approach to perceptual quality assessment by developing an MSE-like metric, which enjoys the benefit of MSE in terms of inexpensive computation and universal applicability while allowing structural information of an image to be taken into consideration. The latter was achieved through introducing structure-preserving kernelization into an MSE-like formulation. We show that the method can lead to competitive FR-IQA results. Further, by developing a feature coding scheme based on this formulation, we extend the model to improve the performance of NR-IQA methods. We report extensive experiments illustrating the results from both our FR-IQA and NR-IQA algorithms with comparison to existing state-of-the-art methods.
