Abstract

In spontaneous speech recognition, the performance of typical speech recognizers tends to be degraded by filled and silent pauses, hesitation phenomena that occur frequently in such speech. In this paper, we present a method for improving the performance of a speech recognizer by detecting and handling both filled pauses (lengthened vowels) and silent (unfilled) pauses. Our method automatically detects these pauses using a bottom-up acoustic analysis that runs in parallel with a typical speech decoding process, and then incorporates the detection results into the decoding process. Experiments conducted using the CIAIR spontaneous speech corpus confirmed the effectiveness of the proposed method.
