Abstract

The harmonics-to-noise ratio (HNR) of a speech signal gives an indication of the aperiodicity of the speech waveform. This aperiodicity may be due to jitter, shimmer, additive noise, waveshape change, or some unknown combination of these factors. To estimate the HNR as a measure of the additive noise component only, the contaminating effects of the other contributing components must first be removed. A pitch-synchronous harmonic analysis is proposed to overcome this problem. The procedure takes advantage of the time-scale compression/frequency expansion property of the Fourier series to eliminate jitter and shimmer. Successive spectra are added by harmonic number rather than by frequency location, and perturbation is removed because the relative heights of the harmonic components remain the same for scaled signals. The technique is evaluated on synthetically generated voice signals. The results are discussed in terms of human voice signals, characterization of jitter, vocal tract filtering effects, perturbation mechanisms, nonlinear dynamics, and the possibility of developing the method for use with inverse filtering strategies.
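The idea of adding spectra by harmonic number can be illustrated with a minimal sketch. It is not the paper's implementation, and it rests on several assumptions that are not stated in the abstract: the pitch periods are already segmented, each period is analyzed with a one-period DFT scaled by its length (so harmonic h always falls in bin h, whatever the period length), shimmer is removed by normalizing each period to the magnitude of its fundamental, and the noise is estimated as the spread of the normalized coefficients around their mean. The function names `fourier_coeffs`, `hnr_by_harmonic_number`, and `synth_periods` are hypothetical.

```python
import cmath
import math
import random


def fourier_coeffs(period, n_harmonics):
    """Fourier-series coefficients c_1..c_H of one pitch period.

    A direct DFT over exactly one period, scaled by 1/N, so that
    harmonic h lands in bin h regardless of the period length N.
    """
    n = len(period)
    return [
        sum(x * cmath.exp(-2j * math.pi * h * i / n) for i, x in enumerate(period)) / n
        for h in range(1, n_harmonics + 1)
    ]


def hnr_by_harmonic_number(periods, n_harmonics=10):
    """HNR estimate in dB from a list of pitch periods (lists of samples).

    Assumption: dividing by the fundamental's magnitude removes shimmer;
    the mean coefficient per harmonic number is taken as the harmonic
    part and the variance around that mean as the noise part.
    """
    rows = []
    for p in periods:
        c = fourier_coeffs(p, n_harmonics)
        scale = abs(c[0])  # magnitude of the fundamental
        rows.append([ch / scale for ch in c])
    k = len(rows)
    harm = noise = 0.0
    for h in range(n_harmonics):
        mean = sum(r[h] for r in rows) / k
        harm += abs(mean) ** 2
        noise += sum(abs(r[h] - mean) ** 2 for r in rows) / k
    return math.inf if noise == 0 else 10 * math.log10(harm / noise)


def synth_periods(n_periods, noise_std, jitter=0.0, shimmer=0.0, seed=0):
    """Synthetic test signal: a fixed five-harmonic waveshape per period,
    with random period length (jitter), random gain (shimmer), and
    additive Gaussian noise. Nominal period is 100 samples."""
    rng = random.Random(seed)
    periods = []
    for _ in range(n_periods):
        n = round(100 * (1 + jitter * rng.uniform(-1, 1)))
        amp = 1 + shimmer * rng.uniform(-1, 1)
        periods.append([
            amp * sum(math.sin(2 * math.pi * h * i / n) / h for h in range(1, 6))
            + rng.gauss(0, noise_std)
            for i in range(n)
        ])
    return periods


if __name__ == "__main__":
    clean = synth_periods(200, noise_std=0.1)
    perturbed = synth_periods(200, noise_std=0.1, jitter=0.03, shimmer=0.1, seed=1)
    print("HNR without perturbation:", hnr_by_harmonic_number(clean))
    print("HNR with jitter and shimmer:", hnr_by_harmonic_number(perturbed))
```

Under these assumptions the two printed values should be close to each other, since both signals carry the same additive noise level and differ only in jitter and shimmer. Normalizing to the fundamental is only one possible choice; normalizing to the period's RMS value would be an equally plausible way to remove shimmer in a sketch like this.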

