Abstract

Automatic speech recognition is one of the most active research areas, as it offers a dynamic platform for human-machine interaction. The robustness of speech recognition systems is often degraded in real-time applications, which are usually accompanied by environmental noise. In this work, we investigate the efficiency of combining the wave atoms transform (WAT) with Mel-Frequency Cepstral Coefficients (MFCC), using a Support Vector Machine (SVM) as classifier, under different noisy conditions. A full experimental evaluation of the proposed model was conducted on an Arabic speech database (ARADIGIT) corrupted with noises from the NOIZEUS database at SNR levels ranging from -5 to 15 dB. The simulation results indicate that the proposed algorithm improves the recognition rate, reaching 99.9% at an SNR of 15 dB. A comparative study was also conducted by applying the proposed WAT-MFCC features to a multilayer perceptron (MLP) and a hidden Markov model (HMM) in order to demonstrate the efficiency and robustness of the proposed system.
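To make the noisy-condition setup concrete, the following is a minimal sketch of how a clean utterance can be mixed with a noise recording at a chosen SNR. It is an illustration under stated assumptions, not the authors' procedure: the function names `mix_at_snr` and `measured_snr_db` are hypothetical, the sine wave and Gaussian noise are synthetic stand-ins for ARADIGIT utterances and the recorded noises, and the 5 dB step between -5 and 15 dB is assumed because the abstract gives only the range.

```python
import math
import random


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech, scaled so the mixture has the requested SNR (dB)."""
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    p_speech = sum(s * s for s in speech) / len(speech)
    p_noise = sum(n * n for n in noise) / len(noise)
    if p_speech == 0 or p_noise == 0:
        raise ValueError("speech and noise must both have non-zero power")
    # SNR_dB = 10 * log10(p_speech / (gain**2 * p_noise)), solved for gain.
    gain = math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]


def measured_snr_db(speech, noisy):
    """Recompute the SNR of a mixture from the clean signal and the residual."""
    p_speech = sum(s * s for s in speech)
    p_residual = sum((y - s) ** 2 for s, y in zip(speech, noisy))
    return 10 * math.log10(p_speech / p_residual)


# Synthetic stand-ins (assumption): a 440 Hz tone and white Gaussian noise.
rng = random.Random(0)
speech = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(8000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(8000)]

# Assumed 5 dB step over the -5 to 15 dB range stated in the abstract.
for snr in (-5, 0, 5, 10, 15):
    noisy = mix_at_snr(speech, noise, snr)
    print(snr, round(measured_snr_db(speech, noisy), 2))
```

The gain is derived from the average powers of the two signals, so the same routine works for any noise type. Real recordings would first need to be resampled to a common rate and trimmed or looped to the utterance length.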

Highlights

  • Automatic speech recognition allows the machine to understand and process information provided orally by a human user

  • A new speech recognition system based on wave atoms transform (WAT)-Mel-Frequency Cepstral Coefficients (MFCC) and Support Vector Machine (SVM) was developed in this paper to improve the accuracy of recognition

  • Although the worst performance was obtained with the multilayer perceptron (MLP) based MFCC soft, at a rate of 84.2%, the use of WAT-MFCC registered an acceptable accuracy of 92.4%

Summary

INTRODUCTION

Automatic speech recognition allows the machine to understand and process information provided orally by a human user. By simplifying the human-machine dialogue protocol, automatic speech processing aims to increase productivity, since it is the machine that adapts to humans in order to communicate, not the other way around. It also frees the eyes or hands for another task. Good speech recognition rates have mostly been reached using small vocabularies. This result is considered sufficient for the implementation of most voice-control devices. The adopted approach has been tested on an Arabic language database in both clean and noisy conditions. This manuscript is structured as follows: in Section II, a brief literature review of ASR systems is presented.

RELATED WORKS
THE PROPOSED SPEECH RECOGNITION SYSTEM
Feature Extraction Stage
Speech Database
Experimental Results
CONCLUSION