Abstract

This paper compares the results of subjective and objective assessments of the quality of speech and music signals distorted by clipping, in which large instantaneous signal values are replaced by a certain threshold constant or by values close to it. Recent works proposed using kurtosis and some of its simple functional transforms, such as the reciprocal of kurtosis and the square root of the reciprocal of kurtosis, as objective (instrumental) measures of the clipping value. This paper clarifies the results of a subjective assessment of the quality of speech and music signals distorted by clipping. A comparison of the obtained estimates allows one to conclude that the human auditory system is slightly more sensitive to the clipping of musical signals than to the clipping of speech signals, but this difference is small. Similarly, objective quality measures of clipped signals are almost equally sensitive to the clipping value of speech and music signals. An analysis of how the kurtosis estimates vary with the estimation time showed that their relative standard deviation is close to 10% for analysis time intervals of 1–40 s.

Highlights

  • Full use of the dynamic range when speech or music signals are transmitted or recorded is highly desirable, since it allows minimizing effects of background noise

  • Degradation Mean Opinion Score (DMOS) estimates, averaged over both listeners and speech samples, are represented by solid lines, and 95% confidence intervals are indicated by vertical dashed segments

  • Summarizing the results presented above, we can conclude that the human auditory system is slightly more sensitive to the clipping of musical signals than to the clipping of speech signals, but this difference is small


Summary

INTRODUCTION

Full use of the dynamic range when speech or music signals are transmitted or recorded is highly desirable, since it allows minimizing effects of background noise. Such a mode involves the risk of nonlinear signal distortion due to clipping, when large instantaneous signal values x(n) are replaced by a certain threshold constant A:

$$y(n) = \begin{cases} x(n), & |x(n)| \le A, \\ A \cdot \operatorname{sign}[x(n)], & |x(n)| > A. \end{cases}$$
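To make the clipping operation and the kurtosis-based measures named in the abstract concrete, the following is a minimal sketch rather than the authors' procedure. It assumes ideal hard clipping at a symmetric threshold, uses the Pearson (non-excess) definition of kurtosis, and substitutes a synthetic Laplacian noise signal for real speech or music. The function names `hard_clip`, `kurtosis`, and `clipping_measures` are hypothetical and chosen only for illustration.

```python
import numpy as np


def hard_clip(x, threshold):
    """Replace samples whose magnitude exceeds `threshold` by +/- threshold."""
    return np.clip(x, -threshold, threshold)


def kurtosis(x):
    """Pearson (non-excess) kurtosis: fourth central moment over squared variance.

    Assumption: the source does not state which kurtosis definition is used.
    """
    centered = x - np.mean(x)
    m2 = np.mean(centered ** 2)
    m4 = np.mean(centered ** 4)
    return m4 / m2 ** 2


def clipping_measures(x):
    """Kurtosis and the two simple transforms mentioned in the abstract."""
    k = kurtosis(x)
    return {
        "kurtosis": k,
        "reciprocal": 1.0 / k,
        "sqrt_reciprocal": np.sqrt(1.0 / k),
    }


# Synthetic stand-in for a speech-like signal (assumption: Laplacian samples).
rng = np.random.default_rng(0)
x = rng.laplace(size=100_000)

# Lower thresholds clip more of the signal and lower the kurtosis estimate.
for threshold in (np.inf, 3.0, 1.0, 0.3):
    y = hard_clip(x, threshold)
    m = clipping_measures(y)
    print(f"threshold={threshold}: kurtosis={m['kurtosis']:.3f}, "
          f"1/k={m['reciprocal']:.3f}, sqrt(1/k)={m['sqrt_reciprocal']:.3f}")
```

In this sketch, heavier clipping pushes the signal toward a two-level waveform, for which the Pearson kurtosis approaches 1, so the reciprocal and its square root move toward 1 as the threshold decreases.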

SOME FEATURES OF STUDIED PARAMETERS
EXPERIMENTAL SETUP
SUBJECTIVE ASSESSMENT
OBJECTIVE ASSESSMENT
ESTIMATES VARIABILITY
CONCLUSION