Abstract

Modern packet-switched networks are increasingly capable of offering high-quality voice services, such as Voice over LTE (VoLTE), which have the potential to surpass the Public Switched Telephone Network (PSTN) in quality. To sustain this development, suitable quality evaluation methods are needed to measure and identify the effect of network impairments on voice quality. This paper proposes a single-ended, objective voice quality evaluation model that uses a Convolutional Neural Network with regression-style output (CQCNN) to predict mean opinion scores (MOS) of speech samples impaired by a VoLTE network emulation. The results of this experiment suggest that a deep-learning approach using CNNs is highly successful at predicting MOS values for both narrowband (NB) and super-wideband (SWB) samples, with accuracies of 91.91% and 82.50%, respectively.
