Abstract

Harmonic grouping is a frequently applied technique in computational auditory scene analysis and automatic speech recognition systems. However, grouping is easily disrupted by noise and reverberation. For instance, a noise-induced signal component positioned roughly midway between two harmonics might be wrongly assigned to the harmonic complex (HC) as well. This results in an octave error: the harmonics in the HC are assigned harmonic numbers twice as high as the correct values. We propose a cost-function-based method to correct these octave errors. The function is designed, on the one hand, to improve the balance between odd and even harmonic numbers and, on the other hand, to minimize the number of signal components that must be rejected. As a preprocessing step, we applied short-time Fourier analysis to derive an instantaneous frequency representation, from which we obtained the signal components. We used these components as input to our harmonic grouping algorithm to obtain the HCs. We then selected the optimal solution according to the cost function and modified the composition of the HCs accordingly. As long as enough harmonics are sufficiently above the local noise level, this octave error correction mechanism works well for various sorts of harmonic sounds, including speech.
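The trade-off between odd/even balance and rejected components can be illustrated with a minimal sketch. This is not the cost function of the paper, which the abstract does not specify. The function names, the weights `w_balance` and `w_reject`, and the simple additive cost are assumptions made purely for illustration. The sketch compares two hypotheses for one HC: keep the assigned harmonic numbers, or halve them and reject the components that would then fall between harmonics.

```python
def imbalance(numbers):
    """Absolute difference between the counts of odd and even harmonic numbers."""
    odd = sum(1 for n in numbers if n % 2 == 1)
    return abs(odd - (len(numbers) - odd))


def correct_octave_error(numbers, w_balance=1.0, w_reject=1.0):
    """Pick the cheaper of two hypotheses for one harmonic complex.

    Hypothetical cost (an assumption, not the published one):
        cost = w_balance * odd/even imbalance + w_reject * rejected components

    Returns (harmonic_numbers, rejected_numbers).
    """
    keep = list(numbers)
    # Halving hypothesis: even numbers map to n / 2, odd numbers have no
    # integer counterpart and would have to be rejected.
    halved = [n // 2 for n in numbers if n % 2 == 0]
    rejected = [n for n in numbers if n % 2 == 1]

    cost_keep = w_balance * imbalance(keep)
    cost_halve = w_balance * imbalance(halved) + w_reject * len(rejected)

    if halved and cost_halve < cost_keep:
        return halved, rejected
    return keep, []


# Six true harmonics mislabelled as 2, 4, ..., 12 because one noise
# component was grouped in as "harmonic 7".
corrected, dropped = correct_octave_error([2, 4, 6, 8, 10, 12, 7])
print(corrected, dropped)
```

In this toy case, halving restores the numbers 1 to 6 at the price of rejecting one component, whereas a correctly labelled complex such as 1 to 6 is left unchanged, because halving it would reject half of its harmonics.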
