Abstract

Speech is a highly complex and dynamic acoustic wave produced by the vocal tract as a result of the excitation in the form of air expelled from lungs. The vocal tract characteristics vary in different manner during production of various speech categories. This time variant acoustic filter has been represented by a Linear Prediction (LP) filter in Speech Production Model based on which Code Excited Linear Prediction (CELP) and many other speech coders are built. The periodic nature of voiced speech due to vocal chord vibration causes slow variation for vocal tract characteristics and thus, similarity exists among nearby portions of voiced speech. This similarity property is explored to reduce the count of transmitted Linear Predictive Coding (LPC) coefficients and excitation that are bit consuming and also significant parameters of LP filter. This has been implemented in 7.3 kbps CELP by determining appropriate threshold for similarity values of both parameters. The proposed method has brought down the bit rate of CELP to 6.4 kbps or reduced the bit requirement by 12% without compromising on the perceptual quality of reconstructed speech.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call