Convolutional neural network acoustic model for robust Indonesian speech recognition in noisy environment

M J Budiman, D P Lestari

doi:10.1088/1757-899x/803/1/012027

Abstract

Noise reduces the accuracy of automatic speech recognition (ASR). Several techniques have been proposed to overcome this problem. One is to use an artificial neural network (ANN) as the acoustic model; the convolutional neural network (CNN) is an ANN variant that has been used for acoustic modeling. Another approach is to pre-process the speech signal, or the acoustic features extracted from it, for example with cepstral mean and variance normalization (CMVN). In this work, CNN acoustic models were trained on CMVN pre-processed acoustic features to build a noise-robust speech recognition system. Two groups of models were built to handle two kinds of noise (babble noise and street noise). The acoustic models were tested on noisy speech at different signal-to-noise ratio (SNR) values, and the results were compared with those of Gaussian Mixture Model-Hidden Markov Model (GMM-HMM) acoustic models. Accuracy increased when the models were trained on more varied training data. CNN acoustic models trained on FBANK features achieved higher accuracy than GMM-HMM models built on the same features.
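The two signal-level operations mentioned in the abstract, CMVN and testing at a chosen SNR, can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract does not say whether normalization is per utterance or per speaker, nor how the noisy test data was produced, so the sketch assumes per-utterance CMVN and simple additive mixing. The function names `cmvn` and `mix_at_snr` are hypothetical, and the code uses only the Python standard library.

```python
import math


def cmvn(frames, eps=1e-10):
    """Per-utterance cepstral mean and variance normalization (assumed variant).

    frames: list of feature vectors, one list of floats per frame.
    Returns new frames in which every feature dimension has zero mean
    and (approximately) unit variance over the utterance.
    """
    n = len(frames)
    dim = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dim)]
    stds = [
        math.sqrt(sum((f[d] - means[d]) ** 2 for f in frames) / n)
        for d in range(dim)
    ]
    # eps guards against division by zero for constant dimensions.
    return [
        [(f[d] - means[d]) / (stds[d] + eps) for d in range(dim)]
        for f in frames
    ]


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech so the mixture has the requested SNR in dB.

    SNR is defined here as 10 * log10(speech power / noise power).
    """
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[: len(speech)]
    p_speech = sum(s * s for s in speech) / len(speech)
    p_noise = sum(x * x for x in noise) / len(noise)
    if p_noise == 0.0:
        raise ValueError("noise signal has zero power")
    # Noise power needed to reach the target SNR, and the matching gain.
    target_p_noise = p_speech / (10 ** (snr_db / 10))
    gain = math.sqrt(target_p_noise / p_noise)
    return [s + gain * x for s, x in zip(speech, noise)]
```

In a setup like the one described, `mix_at_snr` would be applied to waveforms (for example clean speech plus babble or street noise) before feature extraction, and `cmvn` to the resulting FBANK or cepstral frames before they are passed to the acoustic model.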

Journal: IOP Conference Series: Materials Science and Engineering
Publication Date: Apr 1, 2020
Citations: 1
License type: cc-by

Similar Papers

Unsupervised Equalization of Lombard Effect for Speech Recognition in Noisy Adverse Environments
Hynek Boril ... John H L Hansen
IEEE Transactions on Audio, Speech, and Language Processing | VOL. 18
01 Aug 2010

Acoustic feature conversion using a polynomial based feature transferring algorithm
Syu-Siang Wang ... Hsin-Te Hwang
01 Sep 2014

Improved cepstral mean and variance normalization using Bayesian framework
N Vishnu Prasad ... S Umesh
01 Dec 2013

Optimization of Temporal Filters in the Modulation Frequency Domain via Constrained Linear Discriminant Analysis (C-LDA) for Constructing Robust Features in Speech Recognition
Jeih-Weih Hung
01 Apr 2007