Formal Subjective Tests Research Articles

This paper addresses the problem of single-microphone speech enhancement in noisy environments. State-of-the-art short-time noise reduction techniques are most often expressed as a spectral gain that depends on the signal-to-noise ratio (SNR). The well-known decision-directed (DD) approach drastically limits the level of musical noise, but the estimated a priori SNR is biased because it depends on the speech spectrum estimated in the previous frame. The gain function therefore matches the previous frame rather than the current one, which degrades noise reduction performance. The consequence of this bias is an annoying reverberation effect. We propose a two-step noise reduction (TSNR) technique that solves this problem while maintaining the benefits of the decision-directed approach. A second step refines the a priori SNR estimate to remove the bias of the DD approach, and with it the reverberation effect. However, classic short-time noise reduction techniques, including TSNR, introduce harmonic distortion in the enhanced speech because the estimators are unreliable at low signal-to-noise ratios. This is mainly due to the difficulty of estimating the noise power spectral density (PSD) in single-microphone schemes. To overcome this problem, we propose a method called harmonic regeneration noise reduction (HRNR). A nonlinearity is used to regenerate the degraded harmonics of the distorted signal efficiently. The resulting artificial signal is used to refine the a priori SNR, from which a spectral gain able to preserve the speech harmonics is computed. These methods are analyzed, and objective and formal subjective test results comparing HRNR and TSNR are provided. HRNR brings a significant improvement over TSNR thanks to the preservation of harmonics.
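The abstract gives no formulas, so the following is only a minimal sketch of the two steps it describes, under stated assumptions: a Wiener-type gain `snr / (1 + snr)`, the commonly used decision-directed recursion with a smoothing factor `beta`, and a noise PSD that is already known. The function and parameter names (`dd_tsnr_gains`, `noisy_power`, `noise_psd`, `prev_speech_power`, `beta`) are hypothetical and chosen for illustration.

```python
def wiener_gain(snr_prio):
    """Wiener-type spectral gain for a given a priori SNR (assumed gain rule)."""
    return snr_prio / (1.0 + snr_prio)


def dd_tsnr_gains(noisy_power, noise_psd, prev_speech_power, beta=0.98):
    """Return per-bin (DD gains, TSNR gains) for one frame.

    noisy_power       -- |X(p, k)|^2 of the current noisy frame
    noise_psd         -- estimated noise PSD per bin (assumed given)
    prev_speech_power -- |S_hat(p-1, k)|^2 from the previous enhanced frame
    beta              -- DD smoothing factor (assumed value, not from the abstract)
    """
    g_dd, g_tsnr = [], []
    for x2, n, s_prev in zip(noisy_power, noise_psd, prev_speech_power):
        snr_post = x2 / n
        # Step 1: decision-directed estimate, dominated by the previous frame.
        snr_dd = beta * s_prev / n + (1.0 - beta) * max(snr_post - 1.0, 0.0)
        g1 = wiener_gain(snr_dd)
        # Step 2: re-estimate the a priori SNR from the current frame
        # after applying the first-step gain, then recompute the gain.
        snr_tsnr = (g1 * g1 * x2) / n
        g_dd.append(g1)
        g_tsnr.append(wiener_gain(snr_tsnr))
    return g_dd, g_tsnr


# Speech onset in one bin: the previous frame held no speech,
# the current frame is about 20 dB above the noise.
g_dd, g_tsnr = dd_tsnr_gains([101.0], [1.0], [0.0])
print(g_dd[0], g_tsnr[0])
```

In this onset case the first-step gain is still held down by the empty previous frame, while the second-step gain is computed from the current frame and is closer to one. That is the frame-delay bias the abstract attributes to the DD approach.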

Traditional pitch-excited linear predictive coding (LPC) vocoders use a fully parametric model to efficiently encode the important information in human speech. These vocoders can produce intelligible speech at low data rates (800-2400 b/s), but they often sound synthetic and generate annoying artifacts such as buzzes, thumps, and tonal noises. These problems increase dramatically if acoustic background noise is present at the speech input. This paper presents a new mixed excitation LPC vocoder model that preserves the low bit rate of a fully parametric model but adds more free parameters to the excitation signal so that the synthesizer can mimic more characteristics of natural human speech. The new model also eliminates the traditional requirement for a binary voicing decision so that the vocoder performs well even in the presence of acoustic background noise. A 2400-b/s LPC vocoder based on this model has been developed and implemented in simulations and in a real-time system. Formal subjective testing of this coder confirms that it produces natural sounding speech even in a difficult noise environment. In fact, diagnostic acceptability measure (DAM) test scores show that the performance of the 2400-b/s mixed excitation LPC vocoder is close to that of the government standard 4800-b/s CELP coder.
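As a toy illustration of replacing a binary voicing decision with a continuous one, the sketch below blends a pulse train and white noise using a single voicing strength between 0 and 1. This is not the paper's model, whose excitation parameters the abstract does not specify. The function `mixed_excitation` and its parameters `pitch_period`, `voicing`, and `seed` are hypothetical.

```python
import random


def mixed_excitation(n_samples, pitch_period, voicing, seed=0):
    """Blend a periodic pulse train with white noise.

    voicing = 1.0 gives a pure pulse train (classic "voiced" excitation),
    voicing = 0.0 gives pure noise (classic "unvoiced" excitation),
    and values in between give a mixture instead of a hard switch.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    excitation = []
    for i in range(n_samples):
        pulse = 1.0 if i % pitch_period == 0 else 0.0
        noise = rng.uniform(-1.0, 1.0)
        excitation.append(voicing * pulse + (1.0 - voicing) * noise)
    return excitation


# A partially voiced frame: mostly periodic, with some noise mixed in.
frame = mixed_excitation(n_samples=160, pitch_period=40, voicing=0.7)
print(len(frame))
```

A classic pitch-excited vocoder would only ever use `voicing=0.0` or `voicing=1.0`. Allowing intermediate values is one simple way to picture the extra freedom in the excitation signal that the abstract refers to.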

Articles published on Formal Subjective Tests

Improved Signal-to-Noise Ratio Estimation for Speech Enhancement

Subjective evaluation of MPEG-4 video codec proposals: Methodological approach and test procedures

A mixed excitation LPC vocoder model for low bit rate speech coding

Design and performance of an analysis-by-synthesis class of predictive speech coders

Transform coding of audio signals using perceptual noise criteria

Subjective Effects of Variable Delay and Speech Clipping in Dynamically Managed Voice Systems

A note on complexity reduction for linear predictive speech synthesis
