Abstract

A representation of the speech signal as a sum of elementary waveforms, the Elementary Waveform Speech Model (EWSM), is introduced, and some of its features for modifying localized time-frequency events are demonstrated. The elementary waveforms model the local spectro-temporal energy maxima of the speech signal using simple mathematical functions. An automatic analysis-synthesis system estimates the waveform parameters frame by frame in four steps: spectral modelling and segmentation using the short-time Fourier transform and the LPC spectrum; Fourier filtering according to this segmentation; waveform spotting in each channel; and waveform modelling using simple functions. The classical theory of speech production proves the validity of the EWSM parameters; modifying them yields well-localized time-frequency transformations, including frequency compression/expansion and pitch, formant and noise modification.
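
As a rough illustration of the sum-of-waveforms idea, the sketch below builds a signal from a few parametric waveforms and then applies a frequency expansion by editing their parameters. The abstract does not say which simple functions are used, so the Gaussian-windowed cosine is an assumption made only for this example, and the names Waveform, synthesize and scale_frequency are hypothetical.

```python
import math
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Waveform:
    """One elementary waveform (hypothetical parameterization)."""
    t0: float     # centre time in seconds
    freq: float   # centre frequency in Hz
    amp: float    # peak amplitude
    width: float  # standard deviation of the Gaussian envelope in seconds


def synthesize(waveforms, sample_rate, duration):
    """Sum all waveforms into one sampled signal.

    Assumption: each waveform is a Gaussian-windowed cosine; the
    abstract only says "simple mathematical functions".
    """
    n = int(round(sample_rate * duration))
    signal = [0.0] * n
    for w in waveforms:
        for i in range(n):
            t = i / sample_rate
            envelope = w.amp * math.exp(-0.5 * ((t - w.t0) / w.width) ** 2)
            signal[i] += envelope * math.cos(2 * math.pi * w.freq * (t - w.t0))
    return signal


def scale_frequency(waveforms, factor):
    """Frequency compression/expansion: rescale centre frequencies only,
    leaving time positions, amplitudes and durations untouched."""
    return [replace(w, freq=w.freq * factor) for w in waveforms]


# Two illustrative waveforms (made-up values, not estimated from speech).
atoms = [
    Waveform(t0=0.010, freq=500.0, amp=1.0, width=0.002),
    Waveform(t0=0.020, freq=1500.0, amp=0.5, width=0.001),
]
signal = synthesize(atoms, sample_rate=8000, duration=0.03)
expanded = scale_frequency(atoms, 1.2)
expanded_signal = synthesize(expanded, sample_rate=8000, duration=0.03)
```

Because each waveform carries its own time and frequency parameters, a modification such as the frequency scaling here touches only the selected events, which is the kind of localized transformation the abstract describes.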
