Abstract
In this thesis, a parametric audio coding system for very low bit rates is presented. It is based on a generalized framework that combines different source models into a hybrid model and thereby permits flexible utilization of a broad range of source and perceptual models. The developed parametric audio coding system allows efficient coding of arbitrary audio signals at bit rates in the range of approximately 6 to 16 kbit/s. The use of a hybrid source model requires that the audio signal be decomposed into a set of components, each of which can be adequately modeled by one of the available source models. Each component is described by a set of model parameters of its source model. The parameters of all components are quantized and coded and then conveyed as a bit stream from the encoder to the decoder. In the decoder, the component signals are resynthesized according to the transmitted parameters. By combining these signals, the output signal of the parametric audio coding system is obtained. The hybrid source model developed here combines sinusoidal trajectories, harmonic tones, and noise components and includes an extension to support fast signal transients. The encoder employs robust algorithms for the automatic decomposition of the input signal into components and for the estimation of the model parameters of these components. A perceptual model in the encoder guides signal decomposition and selects the perceptually most relevant components for transmission. Advanced coding schemes exploit the statistical dependencies and properties of the quantized parameters for efficient transmission. The parametric approach facilitates extensions of the coding system that provide additional functionalities. Independent time-scaling and pitch-shifting are supported by the signal synthesis in the decoder.
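Why a parametric decoder can offer independent time-scaling and pitch-shifting can be illustrated with a minimal sketch. It is not the HILN synthesis itself: it assumes each component is a single sinusoid with constant frequency, amplitude, and phase, and it omits harmonic tones, noise components, transients, and any frame-wise parameter interpolation. The function name `synthesize_sinusoids` and the parameters `time_scale` and `pitch_scale` are hypothetical names chosen for this illustration.

```python
import math


def synthesize_sinusoids(components, duration_s, sample_rate=8000,
                         time_scale=1.0, pitch_scale=1.0):
    """Sum constant-parameter sinusoids into one output signal.

    components: list of (frequency_hz, amplitude, phase_rad) tuples
    (a simplifying assumption; real trajectories vary over time).
    time_scale changes only the output duration, pitch_scale changes
    only the frequencies, so the two controls do not interact.
    """
    n_samples = int(round(duration_s * time_scale * sample_rate))
    out = [0.0] * n_samples
    for freq_hz, amp, phase in components:
        # Phase increment per sample, after applying the pitch factor.
        omega = 2.0 * math.pi * freq_hz * pitch_scale / sample_rate
        for n in range(n_samples):
            out[n] += amp * math.sin(omega * n + phase)
    return out


# Hypothetical parameters for one decoded component.
components = [(440.0, 0.5, 0.0)]
normal = synthesize_sinusoids(components, 0.01)
slower = synthesize_sinusoids(components, 0.01, time_scale=2.0)
higher = synthesize_sinusoids(components, 0.01, pitch_scale=2.0)
```

Because the output is generated from parameters rather than from stored waveform samples, changing the duration leaves the frequencies untouched, and changing the frequencies leaves the duration untouched.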
Bit rate scalability is achieved by transmitting the perceptually most important components in a base layer bit stream and further components in one or more enhancement layers. Error robustness for operation over error-prone transmission channels is achieved by unequal error protection and by techniques to minimize error propagation and to provide error concealment. The resulting coding system was standardized as the Harmonic and Individual Lines plus Noise (HILN) parametric audio coder in the international MPEG-4 Audio standard. Listening tests show that HILN achieves an audio quality comparable to that of established transform-based audio coders at 6 and 16 kbit/s.
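The layered transmission used for bit rate scalability can be sketched as a greedy split of components by perceptual relevance. This is only an illustration under stated assumptions: the dictionary keys `relevance` and `bits`, the per-layer bit budgets, and the function `split_into_layers` are hypothetical, and the actual HILN bit stream syntax and component selection are not reproduced here.

```python
def split_into_layers(components, base_budget_bits, enhancement_budgets_bits):
    """Assign components to a base layer and enhancement layers.

    components: list of dicts with hypothetical keys "name",
    "relevance" (higher means perceptually more important) and
    "bits" (cost of the coded parameters).
    Returns one list per layer, base layer first.
    """
    # Most relevant components first, so they land in the base layer.
    ranked = sorted(components, key=lambda c: c["relevance"], reverse=True)
    budgets = [base_budget_bits] + list(enhancement_budgets_bits)
    layers = [[] for _ in budgets]
    layer = 0
    remaining = budgets[0]
    for comp in ranked:
        # Move on to the next layer when the current one is full.
        # Unused bits of a closed layer are not reclaimed, which keeps
        # the layers strictly ordered by relevance.
        while layer < len(budgets) and comp["bits"] > remaining:
            layer += 1
            if layer < len(budgets):
                remaining = budgets[layer]
        if layer == len(budgets):
            break  # no budget left; remaining components are dropped
        layers[layer].append(comp)
        remaining -= comp["bits"]
    return layers


# Hypothetical components with made-up relevance scores and bit costs.
example = [
    {"name": "tone", "relevance": 0.9, "bits": 40},
    {"name": "line_1", "relevance": 0.7, "bits": 30},
    {"name": "noise", "relevance": 0.5, "bits": 30},
    {"name": "line_2", "relevance": 0.2, "bits": 50},
]
layers = split_into_layers(example, base_budget_bits=80,
                           enhancement_budgets_bits=[60])
layer_names = [[c["name"] for c in layer] for layer in layers]
```

A decoder that receives only the base layer can still resynthesize the most relevant components, and each additional enhancement layer adds further components to the output.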