Abstract

In this paper, we present a novel neural-network-based predictor of the subjective quality of speech signals. The predictor outputs an estimate of the subjective quality, expressed as a mean opinion score (MOS). The internal representation of each signal is computed using a model of the human auditory system. The perceptual distance between the reference speech and the speech sample under test serves as the input to the neural network, which is trained to model the underlying relationship between this perceptual distance and the subjective quality (MOS) of the test sample. Accurate MOS predictions have been demonstrated for speech coders used in common wireless applications, including AMPS, TDMA, GSM, and CDMA. MOS values predicted by the neural network MOS machine (NN-MM) were validated for clean and corrupted channels as well as for background noise conditions. Prediction accuracy is an order of magnitude better than previously reported results, with worst-case errors on the order of 0.05 MOS points.
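To make the distance-to-MOS mapping concrete, the following is a minimal sketch of a small network trained to regress MOS from a single perceptual-distance value. It is not the NN-MM described above. The network size, the learning rate, the normalized distance scale, and the synthetic distance-to-MOS curve are all assumptions chosen for illustration, and the names `make_synthetic_data`, `TinyMosNet`, and `mean_abs_error` are hypothetical. The auditory model that would produce the perceptual distance is not shown.

```python
import math
import random


def make_synthetic_data(n=200, seed=0):
    """Build hypothetical (distance, MOS) pairs; NOT data from the paper."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        d = rng.uniform(0.0, 1.0)             # assumed normalized perceptual distance
        mos = 1.0 + 3.5 * math.exp(-3.0 * d)  # made-up curve: MOS falls as distance grows
        data.append((d, mos))
    return data


class TinyMosNet:
    """One input, one tanh hidden layer, one linear output (illustrative only)."""

    def __init__(self, hidden=8, seed=1):
        rng = random.Random(seed)
        self.w1 = [rng.uniform(-1.0, 1.0) for _ in range(hidden)]
        self.b1 = [0.0] * hidden
        self.w2 = [rng.uniform(-1.0, 1.0) for _ in range(hidden)]
        self.b2 = 0.0

    def _hidden(self, d):
        return [math.tanh(w * d + b) for w, b in zip(self.w1, self.b1)]

    def predict(self, d):
        """Return the estimated MOS for perceptual distance d."""
        h = self._hidden(d)
        return sum(w * x for w, x in zip(self.w2, h)) + self.b2

    def train(self, data, epochs=300, lr=0.01):
        """Plain stochastic gradient descent on squared error."""
        for _ in range(epochs):
            for d, target in data:
                h = self._hidden(d)
                err = sum(w * x for w, x in zip(self.w2, h)) + self.b2 - target
                for j in range(len(h)):
                    # Backpropagate through tanh before updating w2[j].
                    grad_h = err * self.w2[j] * (1.0 - h[j] ** 2)
                    self.w2[j] -= lr * err * h[j]
                    self.w1[j] -= lr * grad_h * d
                    self.b1[j] -= lr * grad_h
                self.b2 -= lr * err


def mean_abs_error(net, data):
    """Average absolute MOS prediction error over a data set."""
    return sum(abs(net.predict(d) - mos) for d, mos in data) / len(data)


if __name__ == "__main__":
    samples = make_synthetic_data()
    net = TinyMosNet()
    net.train(samples)
    print("mean absolute error (MOS points):", mean_abs_error(net, samples))
```

In a real system the scalar input would be replaced by the perceptual distance computed from the auditory-model representations of the reference and test signals, and the targets would come from subjective listening tests rather than a synthetic curve.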
