Abstract

This paper describes an automatic caption-superimposing system, built around a new continuous speech recognizer, for efficient production of TV programs. The system recognizes the continuous speech of announcements made in a Japanese 'sumo' wrestling hall and automatically superimposes the recognized wrestlers' names and winning tricks as captions on the TV display. Each announcement states which wrestler has won a match and with which winning trick. The announcements draw on a small vocabulary, follow a specific utterance style, are spoken roughly one Japanese 'bunsetsu' (a phrase-like unit) at a time, and are delivered only by a few specific speakers. We designed the system with the following features: (a) recognition of continuous speech in a specific utterance style; (b) easy change of the vocabulary to be recognized; (c) no need to pre-register any particular utterances; (d) implementation on multiple microprocessors for high computing speed. The proposed recognizer uses a general intra-'bunsetsu' grammar that is applicable to various recognition tasks, whereas conventional Japanese continuous speech recognizers use intra-'bunsetsu' grammars that depend on the particular recognition task. In a recognition experiment on 40 sentences of 'sumo' announcements by two speakers, the system attained a 'bunsetsu' accuracy of 91.0% with quasi-real-time processing.
