Abstract

A dynamic programming-based optimization strategy for a temporal decomposition (TD) model of speech, and its application to low-rate speech coding for storage and broadcasting, is presented. In previous work with the spectral stability-based event localizing (SBEL) TD algorithm, event localization was performed using a spectral stability criterion. Although this approach gave reasonably good results, there was no assurance that the event locations were optimal. In the present work, we have optimized the event localizing task using a dynamic programming-based optimization strategy. Simulation results show that improved TD model accuracy can be achieved. A methodology for incorporating the optimized TD algorithm within the standard MELP speech coder for the efficient compression of speech spectral information is also presented. The performance evaluation results show that the proposed speech coding scheme achieves 50%-60% compression of speech spectral information with negligible degradation in the decoded speech quality.

Highlights

  • While practical issues such as delay, complexity, and fixed rate of encoding are important for speech coding applications in telecommunications, they can be significantly relaxed for speech storage applications such as store-forward messaging and broadcasting systems

  • We have proposed a dynamic programming-based optimization strategy for a modified temporal decomposition (TD) model of speech

  • The main features of the proposed method are model accuracy control through TD resolution and an overlapping speech parameter buffering technique for continuous speech analysis


Summary

INTRODUCTION

While practical issues such as delay, complexity, and fixed rate of encoding are important for speech coding applications in telecommunications, they can be significantly relaxed for speech storage applications such as store-forward messaging and broadcasting systems. In the matrix form of the TD model, the kth column of matrix A contains the kth event target vector, a_k, and the nth column of matrix Ŷ (the approximation of Y) contains the nth speech parameter frame, ŷ(n), produced by the TD model. The results of the spectral stability-based event localizing (SBEL) TD [9, 10] and Atal’s original algorithm [6] for TD analysis show that event function overlapping beyond two adjacent event functions occurs very rarely, whereas in the generalized TD model overlapping is allowed to any extent.
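The relationship between event targets and reconstructed frames can be illustrated with a minimal sketch. It assumes the usual TD synthesis rule, in which each frame is a weighted sum of event targets, ŷ(n) = Σ_k a_k φ_k(n), where φ_k(n) is the kth event function. The function names and the toy data below are hypothetical and are not taken from the paper. The example also checks how many event functions overlap at any frame, which in this toy case is at most two adjacent events.

```python
def td_reconstruct(targets, event_functions):
    """Rebuild frames as y_hat(n) = sum_k a_k * phi_k(n).

    targets[k]            -- k-th event target vector a_k (a column of A)
    event_functions[k][n] -- value of the k-th event function at frame n
    Both layouts are assumptions made for this illustration.
    """
    dim = len(targets[0])
    num_frames = len(event_functions[0])
    frames = []
    for n in range(num_frames):
        frame = [0.0] * dim
        for target, phi in zip(targets, event_functions):
            weight = phi[n]
            if weight != 0.0:
                for i in range(dim):
                    frame[i] += weight * target[i]
        frames.append(frame)
    return frames


def max_overlap(event_functions):
    """Largest number of event functions that are non-zero at one frame."""
    num_frames = len(event_functions[0])
    return max(
        sum(1 for phi in event_functions if phi[n] != 0.0)
        for n in range(num_frames)
    )


# Hypothetical toy data: three 2-dimensional event targets, five frames.
targets = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
event_functions = [
    [1.0, 0.5, 0.0, 0.0, 0.0],
    [0.0, 0.5, 1.0, 0.5, 0.0],
    [0.0, 0.0, 0.0, 0.5, 1.0],
]

frames = td_reconstruct(targets, event_functions)
overlap = max_overlap(event_functions)
```

Here frames between two event locations are interpolations of the two neighbouring targets, and `overlap` reports the overlap order of the toy event functions. A generalized model would place no limit on that value.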

MODIFIED TD MODEL OF SPEECH
Speech parameter buffering
Event function evaluation
Optimization of event localization task
Dynamic programming formulation
Refinement of event targets
Overlapping buffering technique
Speech data and performance measure
Performance evaluation
Performance comparison with SBEL-TD
Coder schematics
Event function quantization
Event target quantization
Objective quality evaluation
Results of evaluation
Performance comparison
Subjective quality evaluation
Experimental design
Results and analysis
CONCLUSIONS
