Abstract

This paper proposes a speech signal analysis approach based on an exponential autoregressive (ExpAR) model. In real speech signals, the amplitude and frequency fluctuate randomly. These fluctuations are non-Gaussian and have nonlinear dynamics, so they cannot be modeled adequately with linear AR models or with compositions of sine and cosine waves, because those analysis methods are known to be sensitive to such fluctuations. The proposed ExpAR model can handle these fluctuations: it is autoregressive in form, with exponential coefficients that depend on the signal amplitude. Experiments fitting the ExpAR model to real speech data show that the Akaike information criterion (AIC) values achieved by the ExpAR model are lower (better) than those obtained with a linear AR model, and that the ExpAR model represents speech fluctuations well as movements of its pole positions. Because the coefficients change over time with the amplitude of the speech signal, the model can also provide fine instantaneous spectral estimates. Modeling such speech fluctuations has the potential to improve automatic speech recognition performance in clean or noisy environments, as well as the naturalness of synthesized speech.
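To make the idea of amplitude-dependent coefficients concrete, the sketch below uses the ExpAR form commonly given in the literature, x[t] = sum_i (phi_i + pi_i * exp(-gamma * x[t-1]^2)) * x[t-i] + e[t], and compares its AIC with that of a linear AR model on synthetic data. The abstract does not state the estimation procedure, so the grid search over gamma with least squares, the helper names, and the simulation parameters are all assumptions made for illustration; this is not the authors' implementation, and synthetic data stands in for real speech.

```python
import numpy as np

def lagged(x, p):
    """Design matrix with rows [x[t-1], ..., x[t-p]] and targets x[t]."""
    n = len(x)
    X = np.column_stack([x[p - i:n - i] for i in range(1, p + 1)])
    return X, x[p:]

def aic(residuals, n_params):
    """Gaussian AIC up to an additive constant: n*log(sigma^2) + 2k."""
    n = len(residuals)
    return n * np.log(np.mean(residuals ** 2)) + 2 * n_params

def fit_ar(x, p):
    """Least-squares fit of a linear AR(p) model."""
    X, y = lagged(x, p)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef, aic(y - X @ coef, p)

def fit_expar(x, p, gammas):
    """For each candidate gamma the model is linear in (phi, pi),
    so solve by least squares and keep the gamma with the lowest AIC."""
    X, y = lagged(x, p)
    best = None
    for g in gammas:
        w = np.exp(-g * X[:, :1] ** 2)      # weight depends on x[t-1]
        Z = np.hstack([X, w * X])
        coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
        score = aic(y - Z @ coef, 2 * p + 1)
        if best is None or score < best[0]:
            best = (score, g, coef[:p], coef[p:])
    return best  # (aic, gamma, phi, pi)

def instantaneous_coeffs(phi, pi, gamma, amplitude):
    """AR coefficients in effect when the previous sample equals `amplitude`."""
    return phi + pi * np.exp(-gamma * amplitude ** 2)

def poles(coeffs):
    """Roots of z^p - a_1 z^(p-1) - ... - a_p for AR coefficients a_i."""
    return np.roots(np.concatenate(([1.0], -np.asarray(coeffs))))

def simulate_expar(phi, pi, gamma, n, rng, burn=200):
    """Generate a synthetic ExpAR series (hypothetical parameters)."""
    p = len(phi)
    x = np.zeros(n + burn)
    e = rng.standard_normal(n + burn)
    for t in range(p, n + burn):
        past = x[t - p:t][::-1]             # x[t-1], ..., x[t-p]
        coef = phi + pi * np.exp(-gamma * past[0] ** 2)
        x[t] = coef @ past + e[t]
    return x[burn:]

# Illustrative parameters only; they are not taken from the paper.
rng = np.random.default_rng(0)
x = simulate_expar(np.array([0.5, -0.4]), np.array([0.8, -0.3]), 1.0, 3000, rng)

ar_coef, ar_aic = fit_ar(x, 2)
expar_aic, gamma_hat, phi_hat, pi_hat = fit_expar(x, 2, np.linspace(0.25, 3.0, 12))

print("AIC linear AR:", ar_aic, " AIC ExpAR:", expar_aic)
for a in (0.0, 1.0, 3.0):
    print("amplitude", a, "poles", poles(instantaneous_coeffs(phi_hat, pi_hat, gamma_hat, a)))
```

Evaluating `poles` at different amplitudes shows how the pole positions of the fitted model shift with the signal level, which is the sense in which a model of this kind can give an amplitude-dependent, instantaneous spectral estimate. A more careful fit would optimize gamma continuously rather than over a coarse grid.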
