Abstract

With the introduction of fullband speech coding, the question arises what role frequency components above 14 kHz play in speech quality assessment. On the one hand, our results show that bandwidth limitation from 24 kHz down to 14 kHz is not audible even to the most critical subject. On the other hand, audible levels of noise band-limited to 14-24 kHz clearly decrease the perceived quality, especially for young subjects with healthy ears. Furthermore, modern high-quality voice links using the latest speech codecs often apply advanced buffering schemes that introduce a new type of audible degradation: micropauses. We investigated the impact of i) bandwidth limitation, ii) coding schemes, iii) micropauses, and iv) noise on the perceived quality. Subjective results and objective predictions based on ITU-T Recommendation P.863 (POLQA) are compared. For accurate prediction of the impact of micropauses and noise degradations, small model adaptations are suggested. In contrast, codec degradations and bandwidth limitation are already predicted with very high accuracy by POLQA: r = 0.98, RMSE* = 0.05 Mean Opinion Score (MOS).
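
The two agreement figures quoted above, the correlation r and the RMSE in MOS units, can be illustrated with a minimal sketch. It assumes per-condition subjective MOS values and objective predictions are available as plain lists. The abstract does not define the asterisk on RMSE; the `rmse_star` function below assumes it denotes an epsilon-insensitive RMSE, in which errors inside the 95% confidence interval of the subjective score count as zero. The function names, the parameter `d`, and the sample numbers are hypothetical and are not taken from the paper.

```python
import math


def pearson_r(subjective, objective):
    """Pearson correlation between subjective MOS and objective predictions."""
    n = len(subjective)
    mean_s = sum(subjective) / n
    mean_o = sum(objective) / n
    cov = sum((s - mean_s) * (o - mean_o) for s, o in zip(subjective, objective))
    norm_s = math.sqrt(sum((s - mean_s) ** 2 for s in subjective))
    norm_o = math.sqrt(sum((o - mean_o) ** 2 for o in objective))
    return cov / (norm_s * norm_o)


def rmse(subjective, objective):
    """Plain root-mean-square error, in MOS units."""
    n = len(subjective)
    return math.sqrt(sum((s - o) ** 2 for s, o in zip(subjective, objective)) / n)


def rmse_star(subjective, objective, ci95, d=1):
    """Epsilon-insensitive RMSE (assumed meaning of RMSE*).

    Errors smaller than the 95% confidence interval of the subjective
    score are treated as zero. `d` is the assumed number of degrees of
    freedom removed by any score mapping applied beforehand.
    """
    errors = [max(0.0, abs(s - o) - c) for s, o, c in zip(subjective, objective, ci95)]
    return math.sqrt(sum(e * e for e in errors) / (len(subjective) - d))


if __name__ == "__main__":
    # Hypothetical per-condition scores, for illustration only.
    mos_subjective = [4.5, 4.1, 3.2, 2.4, 1.8]
    mos_objective = [4.4, 4.2, 3.0, 2.5, 1.9]
    ci95 = [0.10, 0.12, 0.15, 0.14, 0.11]
    print("r     =", round(pearson_r(mos_subjective, mos_objective), 3))
    print("RMSE  =", round(rmse(mos_subjective, mos_objective), 3))
    print("RMSE* =", round(rmse_star(mos_subjective, mos_objective, ci95), 3))
```

Because the epsilon-insensitive variant ignores deviations that fall within the uncertainty of the listening test, it is never larger than the plain error computed over the same denominator.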

Highlights

  • In the last decade there has been a trend to extend the bandwidth used in speech coding standards from the classical 3.5 kHz narrowband (NB) via 7 kHz wideband (WB) to 14 kHz super wideband (SWB) and >20 kHz fullband (FB) [1]–[5]

  • Comparison between subjective results and objective POLQA predictions shows that codec and bandwidth limitation conditions are predicted with high accuracy

  • A major conclusion from the ITU-T P.800 absolute category rating experiment is that there is no significant difference between a clean fullband speech reference file with frequency components up to 24 kHz and a super wideband representation that is bandlimited to 14 kHz

Summary

INTRODUCTION

In the last decade there has been a trend to extend the bandwidth used in speech coding standards from the classical 3.5 kHz narrowband (NB) via 7 kHz wideband (WB) to 14 kHz super wideband (SWB) and >20 kHz fullband (FB) [1]–[5]. These new coding trends are reflected in the updating of the ITU-T recommendations for the objective assessment of speech quality, P.862. In 2011, Ekman et al. showed that the impact of degradations in super wideband speech signals cannot be accurately predicted by the extension of PESQ towards wideband [18].

BEERENDS et al.: SUBJECTIVE AND OBJECTIVE ASSESSMENT OF FULL BANDWIDTH SPEECH QUALITY

... an impact on the perceptual modelling as used in POLQA and other objective perceptual measurement methods. Another trend in modern, high-quality speech coding is the use of advanced buffering schemes. A discussion of both the subjective and objective results is given in Section V, and Section VI provides the final conclusions.

SOURCE MATERIAL
Experimental Procedure
Training of the Subjects
Subjective Test Results
Objective Quality Assessment Approach
Findings
DISCUSSION
CONCLUSION