A Hybrid Speech Enhancement Algorithm for Voice Assistance Application.

Jenifa Gnanamanickam, Yuvaraj Natarajan, Sri Preethaa K R

doi:10.3390/s21217025

Open Access

https://doi.org/10.3390/s21217025

Journal: Sensors	Publication Date: Oct 23, 2021
Citations: 29	License type: CC BY 4.0

Affiliation: KPR Institute of Engineering and Technology

Abstract

In recent years, speech recognition technology has become increasingly common. Speech quality and intelligibility are critical for the convenience and accuracy of information transmission in speech recognition. Speech processing systems used to transmit or store speech are usually designed for an environment without background noise. In real-world conditions, however, interference in the form of background noise and channel noise drastically reduces the performance of speech recognition systems, resulting in imprecise information transfer and listener fatigue. When the input or output signals of a communication system are affected by noise, speech enhancement techniques try to improve its performance. To ensure the correctness of the text produced from speech, it is necessary to reduce the external noise in the speech audio. This is difficult because the speech may consist of single, continuous, or spontaneous words. Various typical speech enhancement algorithms for automatic speech recognition have gained considerable attention, but they work well only on simple and continuous audio signals. Thus, this study proposes a hybridized speech recognition algorithm to enhance speech recognition accuracy. Non-linear spectral subtraction, a well-known speech enhancement algorithm, is optimized with the Hidden Markov Model and tested with 6660 medical speech transcription audio files and 1440 Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) audio files. The performance of the proposed model is compared with that of various typical speech enhancement algorithms, such as the iterative signal enhancement algorithm, subspace-based speech enhancement, and non-linear spectral subtraction. The proposed cascaded hybrid algorithm was found to achieve a minimum word error rate of 9.5% and 7.6% for medical speech and RAVDESS speech, respectively. Cascading the speech enhancement and speech-to-text conversion architectures results in higher accuracy for enhanced speech recognition. The evaluation results support incorporating the proposed method into real-time automatic speech recognition for medical applications, where the complexity of the terms involved is high.
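
The abstract reports results as word error rate (WER). As a rough illustration of how that metric is usually computed, the sketch below counts word-level substitutions, deletions, and insertions with an edit-distance table and divides by the reference length. The function name and the whitespace tokenization are assumptions made for this example; the paper's own scoring procedure is not described in this text.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative only: tokenizes on whitespace and applies no text
    normalization (casing, punctuation), which real evaluations often do.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between ref[:i-1] and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of an extra hypothesis word
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One of five reference words is dropped by the (made-up) transcript.
    print(word_error_rate("the patient has a fever", "the patient has fever"))
```

Under this definition, a WER of 9.5% corresponds to roughly one word-level error for every ten to eleven reference words.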

Highlights

This paper proposes hybridization of the Nonlinear Spectral Subtraction (NSS) and Iterative Signal Enhancement Algorithm (ISE) methods for further enhancing speech signals.
Speech enhancement is an essential factor in speech recognition, as it can be used as a pre-processor to enhance speech.
Results and Discussion: reducing the noise in the surroundings.
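
For readers unfamiliar with spectral subtraction, the sketch below shows one simplified, commonly described formulation: a noise power estimate is subtracted from each frequency bin of the noisy power spectrum, with an over-subtraction factor that grows as the per-bin SNR falls and a spectral floor that prevents negative power. It is not the paper's implementation. The function names, the linear SNR-to-factor mapping, and all default parameter values are assumptions chosen for illustration.

```python
import math


def oversubtraction_factor(snr_db, alpha_min=1.0, alpha_max=5.0,
                           snr_low=-5.0, snr_high=20.0):
    """Map a per-bin SNR (dB) to an over-subtraction factor.

    Hypothetical linear mapping: low-SNR bins are subtracted more
    aggressively (alpha_max), high-SNR bins less so (alpha_min).
    """
    if snr_db <= snr_low:
        return alpha_max
    if snr_db >= snr_high:
        return alpha_min
    frac = (snr_db - snr_low) / (snr_high - snr_low)
    return alpha_max - frac * (alpha_max - alpha_min)


def nonlinear_spectral_subtract(noisy_power, noise_power, beta=0.01):
    """Subtract a scaled noise estimate from one frame's power spectrum.

    noisy_power and noise_power are equal-length sequences of per-bin power.
    beta sets the spectral floor as a fraction of the noise power.
    """
    cleaned = []
    for y, n in zip(noisy_power, noise_power):
        snr_db = 10.0 * math.log10(max(y, 1e-12) / max(n, 1e-12))
        alpha = oversubtraction_factor(snr_db)
        # Keep at least the floor so the result never goes negative.
        cleaned.append(max(y - alpha * n, beta * n))
    return cleaned


if __name__ == "__main__":
    # Two bins: one dominated by speech, one at the noise level (toy values).
    print(nonlinear_spectral_subtract([100.0, 1.0], [1.0, 1.0]))
```

In a full pipeline, the frame spectra would come from a short-time Fourier transform of the noisy audio, and the noise estimate would typically be taken from non-speech segments; both steps are omitted here.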

Summary

Introduction

Speech-to-text transcription has gained importance in many applications, including research, the military, the medical sector, smart homes, transportation systems, and the automatic transcription of lectures, conversations, and records [1]. Speech recognition technology (SRT) identifies patterns in audio waves and matches them with the phonetics of speech to convert them into text. The accuracy of SRT depends heavily on the quality of the audio. Background noise, multiple speakers, or the speaker’s accent can lead to erroneous transcription. Speech enhancement is a significant problem in communications at airports, medical centers, and other familiar public places.

Similar Papers

Combined speech enhancement and auditory modelling for robust distributed speech recognition
Ronan Flynn ... Edward Jones
Speech Communication | VOL. 50
20 May 2008

Real time and embedded implementation of hybrid algorithm for speech enhancement
J.H Shah ... S.K Shah
01 Dec 2011

Comparative study of speech enhancement algorithms and their effect on speech intelligibility
Tusar Kanti Dash ... Sandeep Singh Solanki
01 Oct 2017

Automatic segmentation of speech recorded in unknown noisy channel characteristics
Bryan L Pellom ... John H.L Hansen
Speech Communication | VOL. 25
01 Aug 1998