Abstract

This work introduces multiple time-scale estimates of auditory features to capture the effect of short-time transient additive noise that occurs over specific active regions of a speech utterance, and uses these features for non-intrusive speech quality measurement. Single time-scale auditory features do not accurately capture the time-localized information of short-time transient distortions, nor do they distinguish such distortions from the plosive sounds of speech. Estimating auditory features at multiple time scales is therefore important for objective non-intrusive speech quality estimation. The active speech segments of an utterance, obtained from a voice activity detection (VAD) algorithm, are combined across segments, with the number of active speech segments in each combination increasing until all segments of the complete utterance are accounted for. Lyon's auditory features are computed frame by frame for each combination of active speech segments. The mean, variance, skewness, and kurtosis of the auditory features over the frames are computed and concatenated to obtain the multiple time-scale estimates for the different combinations of active speech segments. These multiple time-scale auditory features are modeled probabilistically using a Gaussian mixture model (GMM), which maps them to a mean opinion score (MOS) value for each combination of active speech segments. The overall objective MOS of the degraded speech is obtained by averaging the MOS values over the combinations of active speech segments. The results are compared in detail with ITU-T Recommendation P.563 for telephone-band speech.
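The feature-extraction steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: the combinations are taken to be cumulative (segment 1, then segments 1 and 2, and so on), which is one reading of the abstract; Lyon's auditory model is replaced by a hypothetical two-dimensional stand-in (log frame energy and zero-crossing rate); frames are non-overlapping; and the GMM-based mapping to MOS is left out, so `overall_mos` only averages scores supplied by some external model. All function names are illustrative.

```python
import math

def cumulative_combinations(segments):
    """Assumed scheme: segment 1, segments 1-2, ..., all segments joined."""
    combos, combined = [], []
    for seg in segments:
        combined = combined + list(seg)  # build a new list at each step
        combos.append(combined)
    return combos

def frame_features(signal, frame_len):
    """Hypothetical stand-in for Lyon's auditory features.

    Returns [log energy, zero-crossing rate] per non-overlapping frame;
    an incomplete trailing frame is dropped. Requires frame_len >= 2.
    """
    feats = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if a * b < 0)
        feats.append([math.log(energy + 1e-12), crossings / (frame_len - 1)])
    return feats

def moments(values):
    """Mean, population variance, skewness, and (non-excess) kurtosis."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    var = sum(d ** 2 for d in dev) / n
    if var == 0.0:
        # Degenerate case: higher moments are undefined, report zeros.
        return [mean, 0.0, 0.0, 0.0]
    skew = sum(d ** 3 for d in dev) / n / var ** 1.5
    kurt = sum(d ** 4 for d in dev) / n / var ** 2
    return [mean, var, skew, kurt]

def multi_scale_features(segments, frame_len=4):
    """One concatenated moment vector per combination of segments."""
    vectors = []
    for combo in cumulative_combinations(segments):
        feats = frame_features(combo, frame_len)
        vec = []
        for dim in zip(*feats):  # one feature dimension across all frames
            vec.extend(moments(list(dim)))
        vectors.append(vec)
    return vectors

def overall_mos(per_combination_mos):
    """Average the per-combination MOS values into one objective score."""
    return sum(per_combination_mos) / len(per_combination_mos)
```

Each vector returned by `multi_scale_features` would be passed to the trained GMM-based mapping to obtain one MOS value per combination, and `overall_mos` would then average those values. With a real auditory model, each vector would have four entries per auditory feature dimension rather than the eight entries produced by this two-dimensional stand-in.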
