Novel parameter-based models estimating quality of synthesized speech transmitted over IP network based on Genetic Programming approach

M Mrvova, P Pocta

doi:10.1109/radioelek.2013.6530946

Abstract

In this paper, Genetic Programming (GP) based on a symbolic regression approach [1] was used to design parameter-based speech quality estimation models. In particular, the models have been designed to estimate the quality of synthesized speech transmitted over an IP channel. The idea is to feed an appropriate set of quality-affecting parameters (e.g. parameters characterizing the packet loss process, the speech codec type and the type of synthesized speech) to the designed estimation models as input. These quality-affecting parameters, together with the corresponding speech quality values predicted by PESQ (Perceptual Evaluation of Speech Quality) [2], are used in the training process of the designed models to define the relationship between the parameters and the speech quality values. Regarding the use of PESQ as the source of speech quality values, the experiments presented in [3] have shown that PESQ provides accurate predictions of the quality of synthesized speech subjected to the impairments used in this study. This study has shown that all designed models provide accurate estimates of the quality of synthesized speech transmitted over an IP network. The accuracy of the estimates was quantified in terms of the Pearson correlation coefficient R, the root mean square error (rmse) and the epsilon-insensitive root mean square error (rmse*). The developed models can be useful to network operators and service providers in the planning phase or early development stage of telecommunication services based on synthesized speech.
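
The three accuracy measures named in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the `dof` parameter and the per-sample margin `epsilon` are assumptions introduced here, and the abstract does not state how the margin or the normalisation were chosen. The sketch follows one common convention in which the absolute error is reduced by the margin (and clipped at zero) before squaring, and the sum is divided by N minus the number of degrees of freedom.

```python
import math

def pearson_r(reference, estimate):
    """Pearson correlation coefficient R between two equal-length sequences."""
    n = len(reference)
    mean_ref = sum(reference) / n
    mean_est = sum(estimate) / n
    cov = sum((r - mean_ref) * (e - mean_est) for r, e in zip(reference, estimate))
    spread_ref = math.sqrt(sum((r - mean_ref) ** 2 for r in reference))
    spread_est = math.sqrt(sum((e - mean_est) ** 2 for e in estimate))
    return cov / (spread_ref * spread_est)

def rmse(reference, estimate, dof=1):
    """Root mean square error, normalised by N - dof (assumed convention)."""
    n = len(reference)
    squared = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    return math.sqrt(squared / (n - dof))

def rmse_star(reference, estimate, epsilon, dof=1):
    """Epsilon-insensitive rmse: errors inside the margin count as zero.

    `epsilon` is a hypothetical per-sample uncertainty margin, given as a
    sequence with the same length as `reference`.
    """
    n = len(reference)
    clipped = [max(0.0, abs(r - e) - eps)
               for r, e, eps in zip(reference, estimate, epsilon)]
    return math.sqrt(sum(c * c for c in clipped) / (n - dof))

if __name__ == "__main__":
    # Made-up values for illustration only, not data from the paper.
    pesq_scores = [1.0, 2.0, 3.0, 4.0]
    model_scores = [1.5, 2.5, 3.5, 4.5]
    margins = [0.25] * 4
    print(pearson_r(pesq_scores, model_scores))
    print(rmse(pesq_scores, model_scores))
    print(rmse_star(pesq_scores, model_scores, margins))
```

Because rmse* ignores the part of each error that falls inside the margin, it is never larger than rmse computed with the same normalisation.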
