A study of artificial speech quality assessors of VoIP calls subject to limited bursty packet losses

Sofiene Jelassi, Gerardo Rubino

doi:10.1186/1687-5281-2011-9

Abstract

A revolutionary feature of emerging media services over the Internet is their ability to account for human perception during service delivery, which increases their popularity and revenue. In such a situation, it is necessary to understand the users' perception, which should be done using standardized subjective experiments. However, it is also important to develop artificial quality assessors that can automatically quantify the perceived quality. This helps to perform optimal network and service management at the core and edges of the delivery systems. In this article, we explore the rating behavior of emerging artificial speech quality assessors of VoIP calls subject to moderately bursty packet loss processes. The examined Speech Quality Assessment (SQA) algorithms are able to estimate the speech quality of live VoIP calls at run-time using control information extracted from the headers of received packets. They are specifically designed to be sensitive to packet loss burstiness. The performance evaluation is carried out using a dedicated software-based SQA framework. It offers a specialized packet killer and includes the implementation of four SQA algorithms. A speech quality database, which covers a wide range of bursty packet loss conditions, has been created and then thoroughly analyzed. Our main findings are the following: (1) all examined bursty-loss-aware automatic speech quality assessors achieve a satisfactory correlation in the upper (> 20%) and lower (< 10%) ranges of packet loss; (2) they exhibit a clear weakness when assessing speech quality under moderate packet loss; (3) the sequence-by-sequence accuracy of the examined SQA algorithms should be investigated in more detail.

Highlights

Telecommunication networks were engineered to offer a steady perceived quality of delivered services throughout a media session.
Within the scope of this work, we explore the accurate estimation of the perceived listening quality of PC-to-PC and PC-to-PSTN phone calls, often denoted as VoIP (Voice over IP), which are currently in a period of rapid growth.
Our study investigates the perceived effect of Comfort Noise (CN) and of the frequency bandwidth changeover required for speech material preparation.

Summary

Introduction

Telecommunication networks were engineered to offer a steady perceived quality of delivered services throughout a media session. For the VQmon and Q-Model assessment tools, we use the quality model given in (5) to estimate distortions due to independent packet losses. This model, which is dedicated to the ITU-T G.729 speech CODEC, was obtained through a logarithmic regression analysis of PESQ scores under a wide range of PLR conditions [19]. All existing SQA algorithms are designed using quality models that are monotonic functions of the PLR, which explains the good correlation coefficients observed. This effect is more pronounced for the cluster-by-cluster measurement methodology, since it eliminates unusual deviations caused by a specific bursty packet loss pattern and speech content.
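As an illustration of what a logarithmic regression of quality scores against PLR involves, the following minimal sketch fits a model of the assumed form score = a + b ln(1 + PLR) by least squares. Equation (5) is not reproduced in this summary, so the functional form, the function names, and the coefficient values below are all assumptions chosen for illustration. The "reference scores" are synthetic placeholders generated from those assumed coefficients; they are not PESQ measurements and not results from the paper.

```python
import numpy as np


def fit_log_quality_model(plr_percent, scores):
    """Least-squares fit of score = a + b * ln(1 + PLR).

    The functional form is an assumption for illustration; it is not
    claimed to be the model referred to as (5) in the text.
    """
    x = np.log1p(np.asarray(plr_percent, dtype=float))
    # polyfit returns the highest-degree coefficient first: [slope, intercept]
    b, a = np.polyfit(x, np.asarray(scores, dtype=float), 1)
    return a, b


def predict_score(plr_percent, a, b):
    """Evaluate the fitted logarithmic model at the given PLR values (in %)."""
    return a + b * np.log1p(np.asarray(plr_percent, dtype=float))


# Hypothetical PLR grid (in percent) and placeholder coefficients.
plr = np.array([0.0, 1.0, 2.0, 5.0, 10.0, 20.0, 30.0])
a_true, b_true = 4.0, -0.7  # illustrative values only, not from the paper

# Synthetic, noise-free stand-in for reference quality scores.
reference_scores = a_true + b_true * np.log1p(plr)

a_hat, b_hat = fit_log_quality_model(plr, reference_scores)
predicted = predict_score(plr, a_hat, b_hat)

# A negative slope b makes the model strictly decreasing in PLR.
is_monotonic = bool(np.all(np.diff(predicted) < 0))
print(f"a = {a_hat:.3f}, b = {b_hat:.3f}, monotonic: {is_monotonic}")
```

Because such a model is strictly decreasing in PLR whenever b is negative, any two assessors built on models of this kind rank loss conditions in the same order, which is consistent with the good correlation coefficients discussed above.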


Findings
