Automatic Speech Recognition Using Deep Neural Network

S Sharmadha, S Kavitha, K Shruthi, K Shivani, B Bharathi

doi:10.1007/978-981-15-2475-2_33

Abstract

Automatic speech recognition identifies spoken words and converts them into machine-readable text. By converting spoken audio into text, this technology lets users control digital devices by speaking instead of using conventional tools such as keystrokes and buttons. The challenges in speech recognition include improving accuracy, coping with varying user responsiveness, and ensuring performance, reliability and fault tolerance. The quality of the audio signal affects the recognition accuracy rate. Delayed speech recognition is used to address the issues caused by varying user responsiveness, because the pronunciation of a word differs when it is used in different contexts. As the world moves rapidly towards digitisation, new technologies are being developed to make life easier. The Interactive Voice Response System, which allows a computer to interact with humans through their voices, is one example. We propose an Interactive Voice Response System for a railway reservation system. The proposed approach uses LSTM with CTC to recognise the spoken word. In testing, the methods used to build this model achieve better accuracy than the other models evaluated.
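
To illustrate how a CTC-trained network's per-frame outputs can be turned into a word, the sketch below shows greedy (best-path) CTC decoding: take the highest-scoring label at each frame, merge consecutive repeats, and drop the blank symbol. This is only an assumed, minimal illustration and not the authors' implementation; the abstract does not say which decoder is used. The function name `ctc_greedy_decode`, the label set, and the frame scores are all hypothetical.

```python
from typing import List, Optional, Sequence


def ctc_greedy_decode(
    frame_scores: Sequence[Sequence[float]],
    labels: Sequence[str],
    blank: int = 0,
) -> str:
    """Greedy CTC decoding (hypothetical helper, not from the paper).

    frame_scores[t][k] is the score of label k at time step t, e.g. the
    softmax output of an LSTM. labels[k] is the symbol for index k;
    the entry at the blank index is never emitted.
    """
    # Step 1: pick the best-scoring label index at every frame.
    best_path: List[int] = [
        max(range(len(scores)), key=lambda k: scores[k])
        for scores in frame_scores
    ]

    # Step 2: collapse consecutive repeats, then remove blanks.
    decoded: List[str] = []
    previous: Optional[int] = None
    for index in best_path:
        if index != previous and index != blank:
            decoded.append(labels[index])
        previous = index
    return "".join(decoded)


# Illustrative data only: index 0 is the blank, followed by two letters.
example_labels = ["-", "g", "o"]
example_scores = [
    [0.1, 0.8, 0.1],  # g
    [0.2, 0.7, 0.1],  # g (repeat, merged with the previous frame)
    [0.9, 0.05, 0.05],  # blank
    [0.1, 0.1, 0.8],  # o
    [0.1, 0.2, 0.7],  # o (repeat, merged)
    [0.8, 0.1, 0.1],  # blank separates the two o's
    [0.1, 0.1, 0.8],  # o
]
word = ctc_greedy_decode(example_scores, example_labels)
```

The blank between the last two `o` frames is what lets CTC emit a doubled letter; without it the repeats would be merged into one. A beam search decoder, possibly with a language model, is a common alternative when higher accuracy is needed.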
