Abstract

Off-line pattern recognition in speech signals is a complex task. Yet, this task becomes harder when the recognition result is required online or in real-time. The present work proposes an online identification of the Portuguese language phonemes using a non-linear autoregressive model with exogenous inputs, commonly called NARX. The process first conditions the input speech signal, and extracts its frequency characteristics. Then it pre-classifies the extracted features into one of the ten possible groups of phonemes, as available in the Portuguese language. This pre-classification is done using a multilayer perceptron network (MLP) with a supervised learning. Subsequently, the MLP output vector, together with the vector that carries the input frequencies, feeds a NARX neural network by means of a temporal delay of four times and feed-backward recurrent links that encompass the results of all hidden layers of the network. As a result of this process, the proposed phoneme recognition process improves the accuracy of an online identification of the Portuguese spoken phonemes during a natural conversation. When the phoneme input signal is well conditioned and continuous over time, the proposed recognition process can provide the correct classification in real-time, with an acceptable accuracy rate.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call