Abstract

Perceived speech quality is most directly measured by subjective listening tests. These tests are often slow and expensive, and numerous attempts have been made to supplement them with objective estimators of perceived speech quality. These attempts have found limited success, primarily in analog and higher-rate, error-free digital environments where speech waveforms are preserved or nearly preserved. The objective estimation of the perceived quality of highly compressed digital speech, possibly with bit errors or frame erasures, has remained an open question. We report our findings regarding two essential components of objective estimators of perceived speech quality: perceptual transformations and distance measures. A perceptual transformation modifies a representation of an audio signal in a way that is approximately equivalent to the human hearing process. A distance measure reflects the magnitude of a perceived distance between two perceptually transformed signals. We then describe a new objective estimation approach that uses a simple but effective perceptual transformation and a distance measure that consists of a hierarchy of measuring normalizing blocks. Each measuring normalizing block integrates two perceptually transformed signals over some time or frequency interval to determine the average difference across that interval. This difference is then normalized out of one signal and is further processed to generate one or more measurements.
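
As a rough illustration of the measuring normalizing block described above, the following minimal sketch integrates over one frequency band in each time frame. The abstract does not specify the details, so everything here is an assumption made for illustration: the function name, the layout of the signals as frame-by-band lists, the choice to normalize the test signal rather than the reference, and the use of mean positive and mean negative differences as the two measurements.

```python
def measuring_normalizing_block(ref, test, band):
    """Hypothetical sketch of one measuring normalizing block.

    ref, test: perceptually transformed signals as lists of frames,
               each frame a list of per-band values (assumed layout).
    band:      (lo, hi) half-open range of band indices to integrate over.
    Returns the normalized test signal and a (positive, negative)
    pair of measurements.
    """
    lo, hi = band
    width = hi - lo
    normalized = [frame[:] for frame in test]  # copy; inputs stay untouched
    diffs = []
    for t, (r_frame, t_frame) in enumerate(zip(ref, test)):
        # Integrate: average difference across the band in this frame.
        e = sum(t_frame[f] - r_frame[f] for f in range(lo, hi)) / width
        diffs.append(e)
        # Normalize: remove that average difference from the test signal.
        for f in range(lo, hi):
            normalized[t][f] -= e
    # Measure: summarize the removed differences (assumed choice of
    # separate positive and negative averages over time).
    n = len(diffs)
    positive = sum(max(e, 0.0) for e in diffs) / n
    negative = sum(max(-e, 0.0) for e in diffs) / n
    return normalized, (positive, negative)


ref = [[1.0, 2.0], [3.0, 4.0]]
test = [[2.0, 3.0], [2.0, 3.0]]
normalized, measurements = measuring_normalizing_block(ref, test, (0, 2))
```

In a hierarchy, the normalized output of one block would feed the next block, which might integrate over a different time or frequency interval, so that each block measures only what the earlier blocks have not already removed.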
