Direct Recovery of Clean Speech Using a Hybrid Noise Suppression Algorithm for Robust Speech Recognition System

Peng Dai, Ing Yann Soon, Rui Tao

doi:10.5402/2012/306305

Abstract

A new log-power domain feature enhancement algorithm, named NLPS, is developed. It consists of two parts: direct solution of a nonlinear system model and log-power subtraction. In contrast to other methods, the proposed algorithm does not need a prior statistical model of speech or noise. Instead, it works by directly solving the nonlinear function derived from the speech recognition system. Log-power subtraction, the second part of the algorithm, is then applied as a separate step to refine the accuracy of the estimated cepstrum. The proposed algorithm solves the speech probability distribution function (PDF) discontinuity problem caused by traditional spectral-subtraction algorithms. Its effectiveness is evaluated extensively on the standard AURORA2 database. The results show that significant improvement can be achieved by incorporating the proposed algorithm: it reaches a recognition rate of over 86% for noisy speech (averaged over SNRs from 0 dB to 20 dB), which corresponds to a 48% error reduction over the baseline Mel-frequency cepstral coefficient (MFCC) system.
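The abstract does not spell out the nonlinear function that NLPS solves. As a rough illustration only, the sketch below assumes the commonly used additive-noise relation in the log-power domain, y = log(exp(x) + exp(n)), where x, n, and y are the clean-speech, noise, and noisy log-power values. It recovers x from y and a noise estimate with Newton's method. The function names, the `margin` floor, and the iteration count are hypothetical choices rather than details from the paper, and the log-power subtraction stage is not covered.

```python
import math


def noisy_log_power(x, n):
    """Assumed mixing model (not from the paper): y = log(exp(x) + exp(n))."""
    m = max(x, n)  # factor out the larger term for numerical stability
    return m + math.log(math.exp(x - m) + math.exp(n - m))


def recover_clean_log_power(y, n, iters=30, margin=1e-2):
    """Solve log(exp(x) + exp(n)) = y for x with Newton's method.

    `margin` is a hypothetical floor: the equation has a solution only
    when y > n, so the noise estimate is capped at y - margin.
    """
    n_eff = min(n, y - margin)
    x = y  # start to the right of the root, where f(x) > 0
    for _ in range(iters):
        f = noisy_log_power(x, n_eff) - y
        df = 1.0 / (1.0 + math.exp(n_eff - x))  # derivative of f w.r.t. x
        x -= f / df
    return x


# Round trip: mix a known clean value with noise, then recover it.
y_obs = noisy_log_power(2.0, 0.0)
x_est = recover_clean_log_power(y_obs, 0.0)
```

Under this assumed model the equation also has a closed form, x = y + log(1 - exp(n - y)), so the iteration is shown only to illustrate the idea of solving the system function directly without a statistical prior.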

Highlights

The main objective of speech recognition is to get a higher recognition rate
Comparison is made against Mel-frequency Cepstral Coefficient (MFCC), minimum mean square error (MMSE)-short-time spectral amplitude (STSA) [3], Spectral Subtraction (SS) [8], Cepstral Mean Variance Normalization (CMVN), AFE [9], and Mean Variance Normalization and ARMA filtering (MVA) [10]
A novel algorithm for a robust speech recognition system is presented, with its detailed derivation, implementation, and evaluation

Summary

Introduction

The main objective of speech recognition is to achieve a higher recognition rate. Many factors tend to degrade the performance of an automatic speech recognition (ASR) system, such as environmental noise, channel distortion, and speaker variability [1, 2]. Noise reduction, or clean speech estimation, is a straightforward “feature” approach to improving the performance of ASR systems. Ephraim derived the short-time spectral amplitude (STSA) estimator using minimum mean square error (MMSE) in 1984 [3], and it has become a standard approach for clean speech estimation in speech processing. The success of MMSE in previous implementations shows that it is one means of improving the performance of an ASR system, but not necessarily the only one.

System Model

Iterative Root-Finding

Experiment Setup

Results and Discussion

Conclusion


Journal: ISRN Signal Processing	Publication Date: Dec 26, 2012
Citations: 9	License type: CC BY 3.0
