Abstract

Linear prediction is a core technique in speech processing. It has been widely applied in speech recognition, synthesis, and coding, and can represent the speech frequency spectrum efficiently and accurately with only a few parameters. Line Spectrum Pair (LSP) frequencies, an alternative representation of Linear Predictive Coding (LPC) coefficients, offer good quantization accuracy and low spectral sensitivity. However, computing LSP frequencies is time-consuming. To address this issue, this paper proposes a fast algorithm, based on the Bairstow method, for computing LSP frequencies from linear prediction coefficients. The algorithm first transforms the symmetric and antisymmetric polynomials into general-form polynomials, then extracts the polynomial roots. Exploiting the short-term stationarity of the speech signal, an adaptive initialization method reduces the average number of iterations by 26% compared with the static initialization method, with a Perceptual Evaluation of Speech Quality (PESQ) score reaching 3.46. Experimental results show that the proposed method extracts the polynomial roots efficiently and accurately with significantly reduced computational complexity. Compared with previous work, the proposed method is 17 times faster than the Tschirnhaus transform, and achieves a 22% PESQ improvement over the Birge-Vieta method with a comparable computation time.
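As a rough illustration of the symmetric and antisymmetric polynomials mentioned above, the following minimal sketch builds them from a set of LPC coefficients using the textbook LSP construction. It assumes the sign convention A(z) = 1 + a1 z^-1 + ... + ap z^-p, which may differ from the convention used in the paper, and the function name `lsp_polynomials` is hypothetical.

```python
def lsp_polynomials(lpc):
    """Build the LSP polynomial pair from LPC coefficients.

    Assumed convention: A(z) = 1 + a1*z^-1 + ... + ap*z^-p, with
    lpc = [a1, ..., ap]. Returns the coefficient lists (index k is the
    coefficient of z^-k) of
        P(z) = A(z) + z^-(p+1) * A(1/z)   (symmetric)
        Q(z) = A(z) - z^-(p+1) * A(1/z)   (antisymmetric)
    """
    a = [1.0] + [float(x) for x in lpc] + [0.0]  # A(z) padded to degree p+1
    rev = a[::-1]                                # z^-(p+1) * A(1/z)
    p_poly = [x + y for x, y in zip(a, rev)]
    q_poly = [x - y for x, y in zip(a, rev)]
    return p_poly, q_poly


if __name__ == "__main__":
    p_poly, q_poly = lsp_polynomials([-0.5, 0.25])
    print(p_poly)  # coefficients read the same in both directions
    print(q_poly)  # coefficients change sign when reversed
```

For an even order p, P(z) has a trivial root at z = -1 and Q(z) at z = +1; the remaining roots lie on the unit circle in conjugate pairs, and their angles are the LSP frequencies.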

Highlights

  • Linear predictive analysis is one of the most powerful speech analysis techniques. It extracts the short-time spectral envelope of speech signals efficiently, and is widely used in speech representation for low-bit-rate transmission or storage and in automatic speech and speaker recognition [1,2,3,4,5].

  • Considering the above requirements, we propose a fast Linear Prediction Coding (LPC) computation method, based on the Bairstow method, which can extract the roots of LPC polynomials without using complex operations, a fine grid, or trigonometric functions.

  • To evaluate the performance of the proposed method, including its computing time, convergence speed, and impact on speech quality, we choose ITU-T G.729, a data compression algorithm using conjugate-structure algebraic-code-excited linear prediction, as the verification platform.


Summary

Introduction

Linear predictive analysis is one of the most powerful speech analysis techniques. It extracts the short-time spectral envelope of speech signals efficiently, and is widely used in speech representation for low-bit-rate transmission or storage and in automatic speech and speaker recognition [1,2,3,4,5]. Wu and Chen [11] first proposed transforming the LPC polynomial into a pair of general-form polynomials, and used closed-form formulas and a modified Newton-Raphson method to extract the polynomial roots. This avoids computing trigonometric functions over a fine grid. Experiments show that, compared with other methods, including those of Soong and Juang [8], Kabal and Ramachandran [10], Wu and Chen [11], the modified Ferrari's formula [12], the Tschirnhaus transform [13], the Birge-Vieta method [14], and the previous Bairstow-based LPC analysis methods [15,16], the proposed method estimates the LSP frequencies accurately with the lowest computational complexity, together with high speech quality.
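To make the root-extraction step more concrete, here is a minimal sketch of the textbook Bairstow iteration, which pulls real quadratic factors x^2 - r x - s out of a real polynomial using only real arithmetic. It is not the paper's optimized algorithm: the function names, tolerance, and the way the initial guess (r, s) is supplied are all assumptions made for illustration. `cmath` is used only to report the roots of each quadratic factor at the end.

```python
import cmath


def bairstow_factor(a, r, s, tol=1e-12, max_iter=100):
    """Find a real quadratic factor x^2 - r*x - s of a polynomial.

    a[i] is the coefficient of x**i (ascending order), degree >= 3.
    (r, s) is the initial guess. Returns (r, s, quotient, iterations),
    where quotient is the deflated polynomial in ascending order.
    """
    n = len(a) - 1
    for iteration in range(1, max_iter + 1):
        b = [0.0] * (n + 3)  # synthetic division by x^2 - r*x - s
        c = [0.0] * (n + 3)  # partial derivatives of the remainder
        for i in range(n, -1, -1):
            b[i] = a[i] + r * b[i + 1] + s * b[i + 2]
            c[i] = b[i] + r * c[i + 1] + s * c[i + 2]
        # The remainder is b[1]*(x - r) + b[0]; stop once it vanishes.
        if abs(b[0]) + abs(b[1]) < tol:
            return r, s, b[2:n + 1], iteration
        det = c[2] * c[2] - c[3] * c[1]
        if det == 0.0:
            r += 1e-3  # nudge away from a singular Newton step
            s += 1e-3
            continue
        # Newton update: solve the 2x2 linear system for (dr, ds).
        dr = (-b[1] * c[2] + b[0] * c[3]) / det
        ds = (-b[0] * c[2] + b[1] * c[1]) / det
        r += dr
        s += ds
    raise RuntimeError("Bairstow iteration did not converge")


def quadratic_roots(r, s):
    """Roots of x^2 - r*x - s."""
    disc = cmath.sqrt(r * r + 4.0 * s)
    return (r + disc) / 2.0, (r - disc) / 2.0


def bairstow_roots(a, r0, s0):
    """All roots of a real polynomial (ascending coefficients)."""
    a = [float(x) for x in a]
    roots = []
    while len(a) - 1 >= 3:
        r, s, a, _ = bairstow_factor(a, r0, s0)
        roots.extend(quadratic_roots(r, s))
    if len(a) - 1 == 2:
        roots.extend(quadratic_roots(-a[1] / a[2], -a[0] / a[2]))
    elif len(a) - 1 == 1:
        roots.append(complex(-a[0] / a[1]))
    return roots


if __name__ == "__main__":
    # (x^2 + 1)(x^2 - 3x + 2), started near the factor x^2 + 1.
    print(bairstow_roots([2, -3, 3, -3, 1], 0.1, -0.9))
```

The iteration count returned by `bairstow_factor` depends on how close the initial (r, s) is to a true factor. In a frame-based codec, one plausible way to exploit short-term stationarity would be to seed each frame with the factors found in the previous frame, which is the intuition behind an adaptive initial value.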

LPC Polynomial
Polynomial Solution Based on Bairstow Method
Experimental Environment
Selection and Update of Initial Values
Initial Method
Solve the 10-Order LSP Using the Proposed Method
Histograms
Performance of the Proposed Method
Methods
Conclusions