Abstract
A nonlinear Hammerstein model is proposed for coding speech signals. Using Tsay's nonlinearity test, we first show that the great majority of 20-millisecond speech frames contain nonlinearities (over 80% in our test data). Frame length correlates with the level of nonlinearity: the longer the frames, the higher the percentage of nonlinear frames. Motivated by this result, we present a nonlinear structure for speech coding that identifies the Hammerstein model parameters adaptively, frame by frame. Finally, the proposed structure is compared with the LPC coding scheme for three phonemes, /a/, /s/, and /k/, by calculating the Akaike information criterion of the corresponding residual signals. The tests show clearly that the residual of the nonlinear model presented in this paper contains significantly less information than that of the LPC scheme. The presented method is a potential tool for shaping the residual signal into a form that can be encoded efficiently in speech coding.
Highlights
Because of the solid theory underlying linear systems, the most widely used speech coding methods to date have been linear.
Based on the ideas presented above, a parametric model that combines weighted linear and nonlinear features, and whose parameters can be identified from the speech data, could be useful in speech coding.
We present a noniterative Hammerstein model parameter identification applied to speech modeling for coding purposes.
Summary
Because of the solid theory underlying linear systems, the most widely used speech coding methods to date have been linear. In [10], a nonlinear artificial excitation is modulated with a linear filter in an analysis-synthesis system, while in [11, 12] the Teager energy operator has been found to give good results in different speech processing contexts. Another approach to dealing with nonlinearities in speech is to use systems that can be trained on training data. Based on the ideas presented above, a parametric model that combines weighted linear and nonlinear features, and whose parameters can be identified from the speech data, could be useful in speech coding. One such model is the Hammerstein model, which has been used in various contexts, for example, in biomedical signal processing and noise reduction in radio transmission, but not for speech modeling in the context of coding. We present a noniterative Hammerstein model parameter identification applied to speech modeling for coding purposes.
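To make the model structure concrete, the following is a minimal sketch of a Hammerstein system (a static polynomial nonlinearity followed by a linear FIR filter) together with one common noniterative way of identifying it: an over-parameterized least-squares fit followed by a rank-one SVD split. This summary does not specify the identification algorithm, the model orders, or the input signal used in the paper, so the function names, the polynomial and filter orders, and the synthetic Gaussian input below are all assumptions chosen for illustration only.

```python
import numpy as np


def hammerstein_output(x, poly_coeffs, fir_coeffs):
    """Static polynomial nonlinearity followed by a linear FIR filter.

    v(n) = sum_p c_p * x(n)**p   (p = 1..P, no constant term)
    y(n) = sum_k b_k * v(n - k)  (zero initial conditions)
    """
    v = sum(c * x ** (p + 1) for p, c in enumerate(poly_coeffs))
    return np.convolve(v, fir_coeffs)[: len(x)]


def identify_hammerstein(x, y, poly_order, fir_len):
    """Noniterative identification (illustrative, not the paper's exact method).

    Step 1: least squares on the over-parameterized model whose unknowns
            are the products b_k * c_p.
    Step 2: rank-one SVD approximation separates b and c.
    The scale ambiguity is resolved by forcing c_1 = 1 (assumes c_1 != 0).
    """
    n = len(x)
    cols = []
    for k in range(fir_len):
        # x delayed by k samples, padded with zeros at the start
        xk = np.concatenate([np.zeros(k), x[: n - k]])
        for p in range(1, poly_order + 1):
            cols.append(xk ** p)
    phi = np.column_stack(cols)

    theta, *_ = np.linalg.lstsq(phi, y, rcond=None)
    theta_mat = theta.reshape(fir_len, poly_order)  # entry [k, p-1] = b_k * c_p

    u, s, vt = np.linalg.svd(theta_mat)
    b = u[:, 0] * s[0]
    c = vt[0]
    scale = c[0]
    return b * scale, c / scale


# Synthetic, noise-free demonstration with hypothetical parameter values.
rng = np.random.default_rng(0)
x = rng.standard_normal(400)
c_true = [1.0, 0.5, -0.2]   # polynomial coefficients for x, x^2, x^3
b_true = [1.0, 0.6, 0.3]    # FIR taps
y = hammerstein_output(x, c_true, b_true)

b_est, c_est = identify_hammerstein(x, y, poly_order=3, fir_len=3)
print("FIR taps:        ", np.round(b_est, 4))
print("poly coefficients:", np.round(c_est, 4))
```

Because the data here are noise-free and generated by a model of exactly the assumed orders, the estimates should match the generating parameters up to numerical precision. With real speech frames, the fit would only be approximate, and the quality of the residual would have to be judged with a criterion such as the Akaike information criterion mentioned in the abstract.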