Abstract
Since the early twentieth century, researchers have shown interest in areas such as Automatic Speech Recognition, Image Processing and Natural Language Processing. Automatic Speech Recognition (ASR) has received attention over the past five decades because of its applications in both commercial and military settings. In recent times this attention can be attributed to advances in Artificial Intelligence and advanced algorithms. ASR takes speech as input and converts it into text. It is employed in electronic dictionaries, customer call centers, voice dictation and query-based information systems, speech transcription, avionics, smart houses, access systems and many other areas. ASR can also be used to interact with people with disabilities. It enables human beings to interact with computers using speech rather than a keyboard and mouse (Vimala and Radha V., 2012). ASR aims to provide a natural machine interface in which speech acts as the input to the machine. Generally, ASR is based on two tasks, viz. identification of phonemes and whole-word decoding. The relationship between the speech signal and phones, the speech segments that have distinct physical or perceptual features, is established in two steps. The first step deals with dimensionality reduction and the second step deals with estimating the likelihood of each phoneme. In the dimensionality reduction phase, the size of the speech signal is reduced by extracting the relevant information using task-specific knowledge. In the next phase, the system recognizes the word sequence using a discriminative program. Traditionally, ASR systems have preferred Mel-frequency cepstral coefficients (MFCC) for the first phase and discriminative techniques for the second phase. Over the years, ASR systems have evolved from an integration of multiple separately trained components to “end-to-end” deep neural architectures that link speech to text directly. The proposed work implements a multilayer perceptron (MLP) with an AdaBoost classifier. The MLP is used to extract discriminative features from the speech data, and the AdaBoost classifier then maps these features to the relevant set of words.
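The two-stage idea described above (hidden-layer features followed by a boosted classifier) can be illustrated with a minimal sketch. It is not the implementation used in this work: the three-dimensional input frames, the fixed hidden-layer weights, the two-word vocabulary and all function names are hypothetical, the hidden layer is hand-set rather than trained, and the boosting step is plain binary AdaBoost over decision stumps.

```python
import math

# Hypothetical fixed weights standing in for a trained MLP hidden layer.
HIDDEN_W = [[1.0, -1.0, 0.5], [0.5, 0.5, -1.0]]
HIDDEN_B = [0.0, 0.1]
WORDS = {1: "yes", -1: "no"}  # assumed two-word vocabulary

def hidden_features(x):
    """Forward pass through one tanh hidden layer (the feature extractor)."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(HIDDEN_W, HIDDEN_B)]

def stump_predict(row, j, t, pol):
    """Decision stump: predict pol if feature j is at least t, else -pol."""
    return pol if row[j] >= t else -pol

def best_stump(X, y, w):
    """Search for the stump with the lowest weighted training error."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for pol in (1, -1):
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if stump_predict(row, j, t, pol) != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, pol)
    return best

def adaboost_fit(X, y, rounds=5):
    """Binary AdaBoost with labels in {-1, +1}."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        err, j, t, pol = best_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, j, t, pol))
        # Raise the weight of misclassified samples, lower the rest.
        w = [wi * math.exp(-alpha * yi * stump_predict(row, j, t, pol))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def adaboost_predict(model, row):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(a * stump_predict(row, j, t, pol) for a, j, t, pol in model)
    return 1 if score >= 0 else -1

def recognize(model, frame):
    """Map one input frame to a word via hidden features and AdaBoost."""
    return WORDS[adaboost_predict(model, hidden_features(frame))]

# Toy frames (made-up values, not real speech features).
frames = [[2.0, 0.5, 1.0], [1.5, 0.2, 0.8], [1.8, 0.1, 1.2], [2.2, 0.4, 0.9],
          [0.3, 1.5, 0.2], [0.1, 1.8, 0.4], [0.4, 1.2, 0.1], [0.2, 1.6, 0.3]]
labels = [1, 1, 1, 1, -1, -1, -1, -1]

features = [hidden_features(f) for f in frames]
model = adaboost_fit(features, labels)
predictions = [adaboost_predict(model, f) for f in features]
```

In a real system the hidden layer would be learned from labelled speech, the inputs would be acoustic features such as MFCCs, and a multi-class boosting variant would be needed for a vocabulary of more than two words.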