Abstract

Output-based speech quality assessment has been widely used and has received increasing attention because it does not require an undistorted signal as a reference. To obtain a high correlation between predicted scores and subjective results, this paper presents a new method that estimates the quality of degraded speech without the reference speech. Bottleneck features are extracted with an autoencoder, and support vector regression is chosen as the model that maps this objective representation to subjective scores. The method exploits the autoencoder's ability to form a good representation of its input, one that can be mapped more accurately to the Mean Opinion Score (MOS). Experiments are conducted on a dataset containing various degraded speech signals and their subjective listening scores. The results show that, compared with the ITU-T P.563 standard and another deep learning-based assessment method, the proposed method achieves a higher correlation coefficient between predicted and subjective scores.
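The evaluation criterion named above, the correlation coefficient between predicted and subjective scores, can be illustrated with a minimal sketch. It assumes the Pearson correlation coefficient, which the abstract does not state explicitly; the function name `pearson_correlation` and the score values are hypothetical and are not taken from the paper.

```python
import math


def pearson_correlation(predicted, subjective):
    """Pearson correlation between predicted scores and subjective scores.

    Assumption: the abstract says only "correlation coefficient"; Pearson
    is used here as one common choice, not as the paper's confirmed metric.
    """
    if len(predicted) != len(subjective) or len(predicted) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")

    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_s = sum(subjective) / n

    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((p - mean_p) * (s - mean_s) for p, s in zip(predicted, subjective))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_s = sum((s - mean_s) ** 2 for s in subjective)

    if var_p == 0 or var_s == 0:
        raise ValueError("correlation is undefined for constant scores")

    return cov / math.sqrt(var_p * var_s)


# Illustrative values only (not data from the paper).
predicted_scores = [1.8, 2.6, 3.1, 3.9, 4.4]
subjective_mos = [2.0, 2.5, 3.3, 3.8, 4.5]
print(round(pearson_correlation(predicted_scores, subjective_mos), 3))
```

A value closer to 1 means the predicted scores track the listeners' ratings more closely, which is the sense in which one method is said to outperform another here.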
