Incorporating Noise Robustness in Speech Command Recognition by Noise Augmentation of Training Data.

Ayesha Pervaiz, Farruh Ishmanov, Muhammad Ali Tahir, Naveed Khan Baloch, Huma Israr, Fawad Hussain, Fawad Riasat Raja, Yousaf Bin Zikria

doi:10.3390/s20082326

Abstract

The advent of new devices, technologies, and machine learning techniques, together with the availability of large free speech corpora, has made speech recognition fast and accurate. Over the last two decades, researchers and organizations have carried out extensive research on new techniques and their applications in speech processing systems. Speech command based applications are found in robotics, IoT, ubiquitous computing, and various human-computer interfaces. Several researchers have worked on improving the efficiency of speech command based systems using the speech command dataset. However, none of them addressed noise in that dataset. Noise is one of the major challenges in any speech recognition system: noise encountered in real time is highly variable and unavoidable, and it degrades the performance of speech recognition systems, particularly those that have not learned to handle noise effectively. We thoroughly analyse the latest trends in speech recognition and evaluate the speech command dataset with different machine learning based and deep learning based techniques. We propose a novel technique that improves noise robustness by augmenting the training data with noise. The proposed technique is tested on clean and noisy data along with locally generated data and achieves much better results than existing state-of-the-art techniques, thus setting a new benchmark.
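The abstract does not spell out the augmentation procedure, so the following is only a generic sketch of additive noise augmentation, not the authors' pipeline. It assumes that a clean utterance and a noise recording are available as one-dimensional NumPy arrays at the same sample rate, and that noise is mixed in at a chosen signal-to-noise ratio (SNR); the function name `mix_at_snr` and its parameters are hypothetical.

```python
import numpy as np


def mix_at_snr(clean, noise, snr_db, rng=None):
    """Return `clean` with `noise` added at roughly `snr_db` dB SNR.

    Illustrative helper only (hypothetical name and signature); the
    paper's actual augmentation recipe may differ.
    """
    rng = np.random.default_rng() if rng is None else rng
    clean = np.asarray(clean, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)

    # Repeat the noise if it is shorter than the utterance.
    if len(noise) < len(clean):
        reps = int(np.ceil(len(clean) / len(noise)))
        noise = np.tile(noise, reps)

    # Pick a random noise segment with the same length as the utterance.
    start = rng.integers(0, len(noise) - len(clean) + 1)
    segment = noise[start:start + len(clean)]

    clean_power = np.mean(clean ** 2)
    noise_power = np.mean(segment ** 2)
    if clean_power == 0 or noise_power == 0:
        # Nothing meaningful to scale; return the utterance unchanged.
        return clean.copy()

    # Choose scale so that clean_power / (scale**2 * noise_power)
    # equals 10 ** (snr_db / 10).
    scale = np.sqrt(clean_power / (noise_power * 10 ** (snr_db / 10)))
    return clean + scale * segment
```

In a training setup, each clean utterance could be passed through such a function with several noise types and SNR values, and the noisy copies added to the training set alongside the clean originals.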

Highlights

Automatic speech recognition (ASR) is the recognition and translation of spoken language into text. An ASR system is used to estimate the most likely sequence of words for a given speech input.
We trained models in two categories, (i) Gaussian mixture model (GMM) based techniques and (ii) deep learning based techniques, and tabulated the results in the same manner.
Our results showed that the tri3b model performed better than the mono, tri1, and tri2 models on all three test sets.

Summary

Introduction

Automatic speech recognition (ASR) is the recognition and translation of spoken language into text. An ASR system is used to estimate the most likely sequence of words for a given speech input. As the technology matures and becomes easier to integrate into smart devices, the use of ASR is increasing across applications. To give users the best experience when interacting with more advanced smart devices, more robust and efficient interfaces for human-machine interaction are needed. This will only be possible with standardized models for speech recognition; such systems will let all kinds of users, regardless of their background, education, and lifestyle, interact naturally with devices.

Journal: Sensors
Publication Date: Apr 19, 2020
Citations: 34
License type: CC BY 4.0
