Abstract

Speech synthesis and recognition are the basic techniques used for man-machine communication. This type of communication is valuable when our hands and eyes are busy with some other task such as driving a vehicle, performing surgery, or firing weapons at the enemy. Dynamic time warping (DTW) is widely used for aligning two given multidimensional sequences: it finds an optimal match between them. The distance between aligned sequences should be smaller than the distance between unaligned sequences, so the improvement in alignment may be estimated from the corresponding distances. This technique has applications in speech recognition, speech synthesis, and speaker transformation. The objective of this research is to investigate the amount of improvement in alignment for phrases manually aligned at the sentence level and at the phoneme level. Speech signals in the form of twenty-five phrases were recorded from each of six speakers (3 males and 3 females). The recorded material was segmented manually and aligned at sentence and phoneme level. The aligned sentences of different speaker pairs were analyzed using the harmonic plus noise model (HNM), and the HNM parameters were further aligned at frame level using DTW. Mahalanobis distances were computed for each pair of sentences. The investigations have shown more than 20% reduction in the average Mahalanobis distances.
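The frame-level alignment and distance comparison described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a Euclidean local cost, the basic three-step DTW recursion, and a covariance pooled over both sequences for the Mahalanobis distance. The function names `dtw_align` and `mean_mahalanobis` are hypothetical, and the feature matrices are synthetic stand-ins for HNM-derived parameters.

```python
import numpy as np


def dtw_align(x, y):
    """Align two feature sequences of shape (frames, dims) with basic DTW.

    Assumption: Euclidean local cost and steps (diagonal, up, left).
    Returns the accumulated cost and the list of aligned (i, j) frame pairs.
    """
    n, m = len(x), len(y)
    # Local cost between every frame of x and every frame of y.
    cost = np.linalg.norm(x[:, None, :] - y[None, :, :], axis=2)
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best_prev = min(acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1])
            acc[i, j] = cost[i - 1, j - 1] + best_prev
    # Backtrack from the end of both sequences to recover the warping path.
    i, j = n, m
    pairs = []
    while i > 0 and j > 0:
        pairs.append((i - 1, j - 1))
        step = int(np.argmin([acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1]]))
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return float(acc[n, m]), pairs


def mean_mahalanobis(x, y, pairs, inv_cov):
    """Average Mahalanobis distance over the given (i, j) frame pairs."""
    diffs = np.array([x[i] - y[j] for i, j in pairs])
    squared = np.einsum("nd,de,ne->n", diffs, inv_cov, diffs)
    # Clip tiny negative values caused by floating-point rounding.
    return float(np.mean(np.sqrt(np.maximum(squared, 0.0))))


# Synthetic example: y is a time-stretched copy of x (each frame repeated).
rng = np.random.default_rng(0)
x = np.cumsum(rng.normal(size=(20, 3)), axis=0)
y = np.repeat(x, 2, axis=0)

# Assumption: covariance pooled over all frames of both sequences.
inv_cov = np.linalg.pinv(np.cov(np.vstack([x, y]), rowvar=False))

_, warped_pairs = dtw_align(x, y)
linear_pairs = [(i, i) for i in range(len(x))]  # naive frame-by-frame pairing

aligned = mean_mahalanobis(x, y, warped_pairs, inv_cov)
unaligned = mean_mahalanobis(x, y, linear_pairs, inv_cov)
print(f"unaligned: {unaligned:.3f}, aligned: {aligned:.3f}")
```

In this toy case the warped pairing recovers the time stretch, so the aligned distance is lower than the naive frame-by-frame distance. The relative reduction between the two values is the kind of quantity the abstract reports for real speaker pairs.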

Highlights

  • Speech signal is generated as a consequence of exciting a dynamic vocal tract system with time varying excitation

  • Speech recognition, also known as automatic speech recognition (ASR), converts spoken language into text

  • The HNM parameters are converted to line spectral frequencies (LSF) and passed to the dynamic time warping (DTW) block [19]

Summary

Introduction

A speech signal is generated as a consequence of exciting a dynamic vocal tract system with time-varying excitation. Speech recognition is more difficult than speech generation, even though computers can store and recall enormous amounts of data, perform mathematical computations at very high speed, and do repetitive tasks without any loss of efficiency. This difficulty may be attributed to computers' lack of general knowledge.
