Abstract
In automatic speech recognition, the phone has arguably been the dominant sub-word unit for more than a decade. Context-dependent phone (triphone) modeling accounts for contextual variations between adjacent phones, and state tying addresses the modeling of triphones that are not seen during training. Recently, the syllable has been gaining momentum as a new sub-word unit. Being a larger unit than a phone, the syllable captures the severe contextual variations between the phones within it. It is therefore more stable than a phone and models pronunciation variability in a systematic way. The Tamil language has challenging features such as agglutination and morpho-phonology. In this paper, we attempt to address these issues by using the syllable as a sub-word unit in the acoustic model. Initially, small-vocabulary context-independent word models and medium-vocabulary context-dependent phone models are developed. Subsequently, an algorithm based on the prosodic syllable is proposed and two experiments are conducted. In the first experiment, syllable-based context-independent models are trained and tested. Despite the large number of syllables, this system performs reasonably well compared to the context-independent word models in terms of word error rate and out-of-vocabulary words. In the second experiment, syllable information is integrated into conventional triphone modeling: cross-syllable triphones are replaced with monophones, which reduces the number of context-dependent phone models by 22.76% in untied units. In spite of this reduction in the number of models, the accuracy of the proposed system is comparable to that of the baseline triphone system.
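To make the second experiment more concrete, the following is a minimal sketch of one way cross-syllable triphones could be replaced with monophones. It is not taken from the paper: the function name `syllable_aware_labels`, the HTK-style `left-phone+right` label format, the romanized phone symbols, and the rule that a phone keeps its triphone label only when both neighbours lie in the same syllable are all assumptions made for illustration.

```python
from typing import List


def syllable_aware_labels(syllables: List[List[str]]) -> List[str]:
    """Label the phones of one word, given as a list of syllables.

    Assumed rule (illustrative only): a phone is labelled as a triphone
    "left-phone+right" when both of its neighbours belong to the same
    syllable. A phone at a syllable edge would need context from another
    syllable (or from outside the word), so it falls back to a monophone.
    """
    labels: List[str] = []
    for syllable in syllables:
        last = len(syllable) - 1
        for i, phone in enumerate(syllable):
            if 0 < i < last:
                # Both neighbours are inside this syllable: keep the triphone.
                labels.append(f"{syllable[i - 1]}-{phone}+{syllable[i + 1]}")
            else:
                # Context would cross a syllable boundary: use the monophone.
                labels.append(phone)
    return labels


if __name__ == "__main__":
    # Hypothetical two-syllable word with romanized phone symbols.
    print(syllable_aware_labels([["k", "a", "l"], ["v", "i"]]))
```

Under this assumed rule, only syllable-internal phones contribute distinct context-dependent models, which is the kind of mechanism that would shrink the triphone inventory. The actual boundary handling used in the paper may differ.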