Abstract

This paper presents the implementation of real-time automatic speech recognition (ASR) for portable devices. The speech recognition is performed offline using PocketSphinx which is the implementation of Carnegie Mellon University's Sphinx speech recognition engine for portable devices. In this work, machine Learning approach is used which converts graphemes into phonemes using the TensorFlow's Sequence-to-Sequence model to produce the pronunciations of words. This paper also explains the implementation of statistical language model for ASR. The novelty of ASR is its offline speech recognition and thus requires no Internet connection compared to other related works. A speech recognition service currently provides the cloud based processing of speech and therefore has access to the speech data of users. However, the speech is processed on the handheld device in offline ASR and therefore enhances the privacy of users.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call