Abstract

This paper proposes new interfaces for mobile environments that combine semi-synchronous speech and pen input. A user speaks while writing, and the pen input complements the speech so that recognition performance is higher than with speech alone. Because the two modes, speaking and writing, differ in input speed and in the information they convey, a time lag always exists between them. Conventional multi-modal recognition algorithms therefore cannot be applied directly to this interface. To tackle this problem, we developed a multi-modal recognition algorithm that handles this asynchronicity (time lag) by using a segment-based unification scheme and a method of adapting to the time-lag characteristics of individual users. Five different pen-input interfaces, each assumed to supply pen input for a phrase unit of speech, were evaluated in speech recognition experiments using noisy speech data. The recognition accuracy of the proposed method was higher than that of speech alone with all five interfaces. We also carried out a subjective test to examine the usability of each interface. We found a trade-off between usability and improvement in recognition performance.
