SSQA: Speech Signal Quality Assessment Method using Spectrogram and 2-D Convolutional Neural Networks for Improving Efficiency of ASR Devices

Pooja Kumawat, M Sabarimalai Manikandan

doi:10.1109/icdipc.2019.8723681

Abstract

Most on-device and cloud-based automatic speech recognition (ASR) systems perform poorly on speech corrupted by background noise such as vehicle, train, aircraft, fan, wind, rain, air-conditioner, and machinery noise, all of which are unavoidable in realistic scenarios. In this paper, we propose a novel speech signal quality assessment (SSQA) method that automatically assesses the quality of a recorded speech signal before it is processed on-device or sent to a cloud server. The proposed method is based on the spectrogram feature and two-dimensional convolutional neural networks (2D-CNNs). It is evaluated on a large set of noise-free speech signals and noisy speech signals corrupted by various kinds of noise at different noise levels. Results show that the 2D-CNN based method achieves an average sensitivity (Se) of 90.92%, specificity (Sp) of 98.44%, and overall accuracy (OA) of 96.44%, and that it performs better at detecting noisy speech segments. Results also show that manually labelling speech segments as noise-free or noisy is ambiguous. Therefore, the noise-free and noisy speech signals are passed to a publicly available ASR system to obtain the corresponding text, and the word error rate (WER) and character error rate (CER) are used to find the noise level at which the ASR system fails to recognize the text correctly. In this way, a noise level is determined for each noise type and used to label recorded speech segments as acceptable or unacceptable. The proposed quality-aware ASR system has great potential for extending the battery life of portable ASR devices and for reducing bandwidth and speech recognition software costs in cloud-based ASR systems.
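The WER/CER-based labelling step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the use of a plain Levenshtein distance, and the `max_wer` threshold of 0.2 are all assumptions made for illustration, since the abstract does not state how the error rates are thresholded.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance: minimum substitutions, insertions, and deletions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance divided by reference length."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: character-level edit distance over reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


def label_segment(reference: str, hypothesis: str, max_wer: float = 0.2) -> str:
    """Label a segment from its ASR transcript.

    max_wer is a hypothetical threshold, not a value taken from the paper.
    """
    return "acceptable" if wer(reference, hypothesis) <= max_wer else "unacceptable"
```

Here `reference` would be the known transcript of a clean recording and `hypothesis` the ASR output for the same recording with noise added at a given level. Sweeping the noise level and recording where the label changes gives one possible way to pick a per-noise threshold.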

