Abstract
For large-vocabulary continuous speech recognition of highly inflected languages, the first step is to choose a recognition unit that reduces the high out-of-vocabulary rate. We investigate two approaches to selecting recognition units. In the morpheme-based approach, we use the morpheme as the basic recognition unit and merge frequent morpheme pairs into phrases using either a rule-based method or a statistical unit-merging method. For statistical unit merging, we investigate the effect of part-of-speech constraints on the selection of merging candidates. In the syllable-based approach, assuming that only text data and pronunciations are available, we obtain merged syllables with the same statistical merging method, taking pronunciation variation into account. Experimental results show that the statistical merging method with appropriate linguistic constraints yields the best recognition accuracy. Although the syllable-based approach did not achieve comparable performance, it has the advantage of not requiring a part-of-speech tagging system.
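As a rough illustration of what one step of statistical unit merging with a part-of-speech constraint might look like, the sketch below counts adjacent unit pairs in a tagged corpus and joins the most frequent pair whose tags are permitted. The abstract does not state the actual merging criterion, so raw pair frequency is an assumption here, and the function name `merge_most_frequent_pair`, the parameters `allowed_pos_pairs` and `min_count`, and the `+` joining convention are all hypothetical.

```python
from collections import Counter


def merge_most_frequent_pair(sentences, allowed_pos_pairs=None, min_count=2):
    """Perform one merging step on a POS-tagged corpus.

    sentences: list of sentences, each a list of (unit, pos) tuples.
    allowed_pos_pairs: optional set of (left_pos, right_pos) tuples; when
        given, only adjacent pairs with these tags are merging candidates.
    min_count: a pair seen fewer times than this is not merged.
    Returns (new_sentences, merged_unit); merged_unit is None if no merge.

    Assumption: candidates are ranked by raw frequency. The criterion used
    in the paper is not given in the abstract.
    """
    counts = Counter()
    for sent in sentences:
        for left, right in zip(sent, sent[1:]):
            if allowed_pos_pairs is None or (left[1], right[1]) in allowed_pos_pairs:
                counts[(left, right)] += 1

    if not counts:
        return sentences, None
    (left, right), freq = counts.most_common(1)[0]
    if freq < min_count:
        return sentences, None

    # The merged unit keeps both surface forms and both tags.
    merged_unit = (left[0] + "+" + right[0], left[1] + "+" + right[1])

    new_sentences = []
    for sent in sentences:
        new_sent, i = [], 0
        while i < len(sent):
            if i + 1 < len(sent) and sent[i] == left and sent[i + 1] == right:
                new_sent.append(merged_unit)
                i += 2  # skip both halves of the merged pair
            else:
                new_sent.append(sent[i])
                i += 1
        new_sentences.append(new_sent)
    return new_sentences, merged_unit
```

Calling the function repeatedly until it returns `None` for the merged unit would grow the unit inventory one pair at a time. Passing `allowed_pos_pairs` restricts which pairs are eligible, which is one simple way to model a linguistic constraint on merging candidates.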