Abstract
An N-best keyword search algorithm was developed for a continuous speech recognizer that models vocabulary words as well as extraneous sounds and noise. The recognizer was developed for telecommunication-based applications, which typically demand high sentence accuracy. Possible approaches for achieving high sentence accuracy include applying complicated speech modeling techniques or employing more knowledge sources when conducting the recognition search. An alternative solution is to first apply an N-best decoding search to obtain N sentence hypotheses using pre-selected knowledge source(s) and then re-score those hypotheses using other knowledge source(s) or models. The proposed N-best keyword search algorithm derives all keyword sentence hypotheses and the corresponding likelihood scores time-synchronously. We show that the algorithm is guaranteed to find all sentence hypotheses. To reduce the exponentially growing number of hypotheses, in the practical implementation we applied empirically derived thresholds to prune the search. Recognition experiments were conducted on two speech corpora, the TI Connected Digit Corpus and the Road Rally Corpus, to show the effectiveness of the proposed method.
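The two-pass idea described above (obtain N hypotheses with one knowledge source, prune by a threshold, then re-score with another) can be illustrated with a minimal Python sketch. This is not the time-synchronous keyword search of the paper; it only shows the general prune, select and re-score pattern. All names (`prune_by_threshold`, `n_best`, `rescore`, `toy_second_score`), the beam width, the weight and the scores are hypothetical values chosen for illustration.

```python
import heapq
from typing import Callable, List, Tuple

# A hypothesis is (keyword sequence, first-pass log-likelihood score).
Hypothesis = Tuple[List[str], float]


def prune_by_threshold(hyps: List[Hypothesis], beam: float) -> List[Hypothesis]:
    """Keep hypotheses whose score is within `beam` of the best score."""
    if not hyps:
        return []
    best = max(score for _, score in hyps)
    return [h for h in hyps if h[1] >= best - beam]


def n_best(hyps: List[Hypothesis], n: int) -> List[Hypothesis]:
    """Return the n highest-scoring hypotheses, best first."""
    return heapq.nlargest(n, hyps, key=lambda h: h[1])


def rescore(
    hyps: List[Hypothesis],
    second_score: Callable[[List[str]], float],
    weight: float = 1.0,
) -> List[Hypothesis]:
    """Add a weighted second-pass score and re-rank, best first."""
    combined = [(words, score + weight * second_score(words)) for words, score in hyps]
    return sorted(combined, key=lambda h: h[1], reverse=True)


# Toy second knowledge source (an assumption): penalize out-of-vocabulary words.
DIGIT_VOCAB = {"one", "two", "nine"}


def toy_second_score(words: List[str]) -> float:
    return 0.0 if all(w in DIGIT_VOCAB for w in words) else -5.0


# Made-up first-pass hypotheses and scores.
first_pass: List[Hypothesis] = [
    (["one", "two"], -10.0),
    (["one", "too"], -9.5),
    (["won", "two"], -12.0),
    (["nine", "two"], -30.0),
]

pruned = prune_by_threshold(first_pass, beam=5.0)
top2 = n_best(pruned, n=2)
reranked = rescore(top2, toy_second_score)
```

In this toy run, the hypothesis that ranks first after the first pass is overtaken after re-scoring, which is the effect the second knowledge source is meant to provide.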