Abstract

The perceptual quality of VoIP conversations depends strongly on the pattern of packet losses, i.e., the distribution and duration of packet loss runs: the wider the inter-loss gaps and the shorter the loss runs, the lower the quality degradation. Moreover, speech sequences impaired with an identical packet loss pattern exhibit different degrees of perceptual degradation, because dropped voice packets have unequal impact on perceived quality. We therefore consider the voicing feature of the speech wave carried in lost packets, in addition to the packet loss pattern, when estimating speech quality scores, distinguishing between voiced, unvoiced, and silence packets. This yields better correlation and accuracy between human-based subjective scores and machine-calculated objective scores. This paper proposes novel no-reference parametric speech quality estimation models that account for the voicing feature of the signal carried in missing packets. Specifically, we develop separate quality estimation models that capture the perceptual effect of removed voiced or unvoiced packets, using simple and multiple regression analyses. A further model, which mixes the voiced and unvoiced quality scores to compute the overall speech quality score at the end of an assessment interval, is derived through a rigorous multiple linear regression analysis. The input parameters of the proposed voicing-aware models, namely the Packet Loss Ratio (PLR) and the Effective Burstiness Probability (EBP), are extracted from a novel voicing-aware Markov packet loss model that properly captures both the packet loss process and the voicing property of the speech carried in lost packets. This voicing-aware packet loss model is calibrated at run time by an efficient, packet-loss-event-driven algorithm. Our performance evaluation shows that the voicing-aware models outperform voicing-unaware models, especially in terms of accuracy over a wide range of conditions, and validates the accuracy of the developed parametric no-reference models: predicted scores achieve excellent correlation with measured scores (>0.95) and a small mean absolute deviation (<0.25) for the ITU-T G.729 and G.711 speech codecs.
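To make the parameter-extraction step concrete, the sketch below shows one way an event-driven, voicing-aware loss tracker could maintain per-class counters and expose a per-class PLR together with a burstiness statistic, plus a linear mixing of voiced and unvoiced scores. This is a minimal illustration, not the paper's method: the abstract does not give the structure of the voicing-aware Markov model, the exact definition of EBP, or the fitted regression coefficients, so the Gilbert-style two-state assumption, the stand-in burstiness estimate P(loss | previous packet lost), and the mixing weights used here are hypothetical.

```python
# Illustrative sketch only. Assumes a Gilbert-style two-state loss process per
# voicing class; the conditional loss probability P(loss | previous packet lost)
# is used as a stand-in for the paper's Effective Burstiness Probability (EBP).

from dataclasses import dataclass, field

VOICED, UNVOICED, SILENCE = "voiced", "unvoiced", "silence"


def _zero_counts():
    return {VOICED: 0, UNVOICED: 0, SILENCE: 0}


@dataclass
class VoicingAwareLossTracker:
    """Counters updated once per packet event (received or lost)."""
    sent: dict = field(default_factory=_zero_counts)            # packets sent, per class
    lost: dict = field(default_factory=_zero_counts)            # packets lost, per class
    after_loss: dict = field(default_factory=_zero_counts)      # packets following a loss
    lost_after_loss: dict = field(default_factory=_zero_counts) # ...that were also lost
    prev_lost: bool = False                                      # was the previous packet lost?

    def on_packet(self, voicing_class: str, is_lost: bool) -> None:
        """Update per-class counters for one packet of the given voicing class."""
        self.sent[voicing_class] += 1
        if self.prev_lost:
            self.after_loss[voicing_class] += 1
            if is_lost:
                self.lost_after_loss[voicing_class] += 1
        if is_lost:
            self.lost[voicing_class] += 1
        self.prev_lost = is_lost

    def plr(self, voicing_class: str) -> float:
        """Per-class Packet Loss Ratio."""
        n = self.sent[voicing_class]
        return self.lost[voicing_class] / n if n else 0.0

    def burstiness(self, voicing_class: str) -> float:
        """Stand-in burstiness: estimate of P(loss | previous packet lost)."""
        n = self.after_loss[voicing_class]
        return self.lost_after_loss[voicing_class] / n if n else 0.0


def mixed_quality_score(q_voiced: float, q_unvoiced: float,
                        weights=(0.7, 0.3), bias=0.0) -> float:
    """Hypothetical linear mixing of per-class scores; the paper fits the
    actual coefficients by multiple linear regression."""
    return bias + weights[0] * q_voiced + weights[1] * q_unvoiced


if __name__ == "__main__":
    tracker = VoicingAwareLossTracker()
    # Toy trace: (voicing class, lost?) for a handful of packets.
    trace = [(VOICED, False), (VOICED, True), (VOICED, True),
             (UNVOICED, False), (SILENCE, True), (VOICED, False)]
    for cls, lost in trace:
        tracker.on_packet(cls, lost)
    print("voiced PLR:", tracker.plr(VOICED))
    print("voiced burstiness:", tracker.burstiness(VOICED))
```

In this sketch the voicing class of each packet is assumed to be known (e.g., signaled by the sender or inferred at the receiver); how that classification is obtained is outside the scope of the abstract.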
