Speech Quality Assessment Over Lossy Transmission Channels Using Deep Belief Networks

Emmanuel T Affonso, Demostenes Z Rodriguez, Renata L Rosa

doi:10.1109/lsp.2017.2773536

Abstract

Many telephone services now run over IP networks. These networks are subject to several impairments, of which packet loss, quantified by the packet loss rate (PLR), is among the most damaging. Impaired speech communication degrades the users’ quality of experience, so telephone operators need to assess speech quality, and a methodology that predicts it more accurately is valuable. In this context, this letter introduces a novel nonintrusive speech quality classifier (SQC) model based on deep belief networks (DBN), in which a support vector machine with a radial basis function kernel serves as the classifier applied in the DBN to identify four speech quality classes. A speech database was built from unimpaired speech files taken from public databases, to which different PLR models and values were applied; a standardized intrusive method was then used to calculate the quality index of each file. Results show that the SQC substantially outperforms ITU-T Recommendation P.563. Subjective tests were also performed to validate the SQC performance, and it reached an accuracy of 95% in speech quality classification. Furthermore, a solution architecture is introduced, demonstrating the usefulness and flexibility of the proposed SQC.
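To make the database-construction step more concrete, the following is a minimal sketch of how packet loss could be applied to a speech file split into packet-sized frames. The abstract does not name the PLR models used, so the two shown here (independent Bernoulli loss and a simple two-state Gilbert model) are assumptions chosen for illustration. All function names and the zero-fill treatment of lost frames are hypothetical as well.

```python
import random


def bernoulli_loss(n_packets, plr, rng):
    """Independent losses: each packet is lost with probability `plr`.

    Returns a list of booleans where True marks a lost packet.
    """
    return [rng.random() < plr for _ in range(n_packets)]


def gilbert_loss(n_packets, p_good_to_bad, p_bad_to_good, rng):
    """Simple two-state Gilbert model (an assumed bursty-loss model).

    Every packet sent while the channel is in the bad state is lost.
    The long-run loss rate is p_good_to_bad / (p_good_to_bad + p_bad_to_good).
    """
    lost = []
    bad = False  # start in the good state
    for _ in range(n_packets):
        if bad:
            if rng.random() < p_bad_to_good:
                bad = False
        else:
            if rng.random() < p_good_to_bad:
                bad = True
        lost.append(bad)
    return lost


def apply_loss(frames, lost, fill=0):
    """Replace each lost frame with a constant-valued frame (silence by default)."""
    return [[fill] * len(frame) if is_lost else list(frame)
            for frame, is_lost in zip(frames, lost)]


def measured_plr(lost):
    """Fraction of packets marked as lost."""
    return sum(lost) / len(lost) if lost else 0.0
```

In such a setup, each impaired file would then be scored against its unimpaired original by the intrusive method to obtain its quality index. Replacing lost frames with silence is only one option; a real decoder may apply packet loss concealment instead.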

Full Text