Abstract
This paper describes the relation between vocabulary sizes and detection errors of unknown words in large vocabulary speech recognition through recognition and detection experiments. Although the relation between vocabulary sizes and recognition performances has been reported, the relation between vocabulary sizes and detection performances has not yet been studied. Especially, it has not for the cases of vocabulary sizes of over 1, 000 words. Experiments were conducted using the speech material of speaker MAU's ATR word speech database. The entries of the dictionary used is 40, 000 words from the Shinmeikai Japanese Language Dictionary. It is shown that when the vocabulary size increases from 1, 000 words to 40, 000 words, the relation between vocabulary sizes and detection errors has a similar tendency with the relation between vocabulary sizes and recognition errors. And increases of detection errors caused by increases of vocabulary sizes are shown to be small for the case of within vocabulary, compared with increases of detection errors for the case of out of vocabulary. These results should be taken into accounts in designing large vocabulary speech recognition systems including unknown word processing.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.