Abstract

Accurate pronunciation transcriptions in the training data are important for improving recognition performance. In continuous speech, pronunciation varies across speakers, and over-emphasized or under-emphasized words can cause the waveform to be misaligned at sub-word unit boundaries. Deviations in pronunciation lead to misalignment when the speech is compared against the pronunciation dictionary, so sub-word units must be deleted or inserted. This happens because the transcription of each utterance is not precise. This paper presents corrections to the transcription at the sub-word level using acoustic cues present in the waveform. The transcription of a word is corrected using sentence-level transcriptions, with reference to the phonemes that make up the word. Specifically, the paper shows that vowels are either deleted or inserted. To support this claim, errors in continuous speech are validated using machine learning and signal processing tools. An automatic data-driven annotator that exploits the inferences drawn from this analysis is used to correct transcription errors. The results show that the corrected pronunciations lead to a higher likelihood for training utterances in the TIMIT corpus.

Highlights

  • The speech sounds are generally divided into two main types: vowels and consonants

  • This paper presents corrections to the transcription at the sub-word level using acoustic cues present in the waveform

  • When producing the master label file (MLF), the consonants of under-emphasized syllables are inserted in the proper position


Summary

Introduction

Speech sounds are generally divided into two main types: vowels and consonants. Vowels are usually associated with high energy and long durations. Accurate pronunciation transcriptions in the training data are important for improving recognition performance. Pronunciation varies across speakers, and over-emphasized or under-emphasized words can cause the waveform to be misaligned at sub-word unit boundaries.
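
The link between vowels and high energy can be illustrated with a short-time energy computation. The following is a minimal sketch and not the authors' method: the function names, the 25 ms frame and 10 ms hop (400 and 160 samples at the 16 kHz sampling rate used by TIMIT), and the threshold of half the peak energy are all assumptions chosen for illustration, and the input is a synthetic signal rather than real speech.

```python
import math

def short_time_energy(samples, frame_len=400, hop=160):
    """Mean squared amplitude of each frame (frame_len and hop are assumed values)."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies

def high_energy_frames(energies, ratio=0.5):
    """Flag frames whose energy exceeds ratio * peak energy (hypothetical threshold)."""
    if not energies:
        return []
    peak = max(energies)
    return [e > ratio * peak for e in energies]

# Synthetic stand-in for speech: a quiet segment, a loud vowel-like tone, a quiet segment.
sample_rate = 16000
n = sample_rate // 10  # 100 ms per segment
quiet = [0.01 * math.sin(2 * math.pi * 3000 * i / sample_rate) for i in range(n)]
loud = [0.5 * math.sin(2 * math.pi * 200 * i / sample_rate) for i in range(n)]
signal = quiet + loud + quiet

energy = short_time_energy(signal)
mask = high_energy_frames(energy)
```

In this toy signal only the frames that lie inside the loud middle segment are flagged. On real speech, energy alone would not separate vowels from other loud sounds, so a detector of this kind could only serve as one cue among several.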

Results
Conclusion
