Abstract

This paper proposes a speech signal analysis approach based on an exponential autoregressive (ExpAR) model. In real speech signals, the amplitude and frequency fluctuate randomly. These fluctuations are non-Gaussian and have nonlinear dynamics, so they cannot be modeled adequately with linear AR models or with compositions of sine and cosine waves; both kinds of analysis are known to be degraded by such fluctuations. The ExpAR model can accommodate them: it is autoregressive in form, with coefficients that depend exponentially on the signal amplitude. Experiments fitting the ExpAR model to real speech data show that it achieves better (lower) AIC (Akaike's information criterion) values than a linear AR model, and that it represents speech fluctuations well as movements of its pole positions. Because the coefficients change over time with the amplitude of the speech signal, the model can also provide fine instantaneous spectral estimates. Modeling such speech fluctuations could improve automatic speech recognition performance in clean or noisy environments, as well as the naturalness of synthesized speech.
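
The abstract does not state the model equation, so the following is only an illustrative sketch. It assumes the ExpAR form commonly given in the literature, x_t = sum_i (phi_i + pi_i * exp(-gamma * x_{t-1}^2)) * x_{t-i} + e_t, restricted to order 2. The parameter values PHI, PI and GAMMA and all function names are hypothetical and are not taken from the paper. The sketch shows how the effective AR coefficients, and hence the instantaneous poles, move as the previous sample's amplitude changes.

```python
import cmath
import math


def expar_coefficients(phi, pi, gamma, x_prev):
    """Instantaneous AR coefficients phi_i + pi_i * exp(-gamma * x_prev**2).

    Assumed ExpAR form; x_prev is the previous sample x_{t-1}.
    """
    weight = math.exp(-gamma * x_prev ** 2)
    return [p + q * weight for p, q in zip(phi, pi)]


def expar_step(phi, pi, gamma, history, noise=0.0):
    """One-step ExpAR output; history[0] is x_{t-1}, history[1] is x_{t-2}, ..."""
    coeffs = expar_coefficients(phi, pi, gamma, history[0])
    return sum(c * x for c, x in zip(coeffs, history)) + noise


def instantaneous_poles(a1, a2):
    """Roots of z**2 - a1*z - a2 = 0, the poles of a second-order AR filter."""
    disc = cmath.sqrt(a1 * a1 + 4.0 * a2)
    return (a1 + disc) / 2.0, (a1 - disc) / 2.0


# Hypothetical parameters chosen only for illustration (not from the paper).
PHI = [1.2, -0.8]   # coefficients approached at large amplitude
PI = [0.3, 0.1]     # amplitude-dependent part, strongest near zero amplitude
GAMMA = 2.0         # controls how quickly the exponential term decays

if __name__ == "__main__":
    for amplitude in (0.0, 0.5, 2.0):
        a1, a2 = expar_coefficients(PHI, PI, GAMMA, amplitude)
        pole = instantaneous_poles(a1, a2)[0]
        radius = abs(pole)
        angle = math.degrees(cmath.phase(pole))
        print(f"x_prev={amplitude:.1f}  a1={a1:.3f}  a2={a2:.3f}  "
              f"|pole|={radius:.3f}  angle={angle:.1f} deg")
```

Under these assumed values the coefficients equal PHI + PI at zero amplitude and approach PHI as the amplitude grows, so the pole radius and angle shift with the signal level. This is the kind of amplitude-dependent pole movement the abstract refers to; a linear AR model would keep the poles fixed.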
