Training augmentation with TANDEM acoustic modelling in Punjabi adult speech recognition system

Virender Kadyan, Puneet Bawa, Shashi Bala

doi:10.1007/s10772-021-09797-0

Abstract

Pre- and post-processing of low-resource acoustic signals has always faced the challenge of data scarcity in the training module. It is difficult to obtain high system accuracy with a limited training corpus, which results in the extraction of large discriminative feature vectors. The information in these vectors is distorted by acoustic mismatch arising from real-environment conditions and inter-speaker variation. In this paper, context-independent information of an input speech signal is pre-processed using bottleneck features, and in the modelling phase a Tandem-NN model is employed to enhance system accuracy. To address the shortage of training data, in-domain training augmentation is performed by fusing the original clean training data with artificially created noisy versions of it. To enlarge the training data further, the tempo of the input speech signal is modified while preserving its spectral envelope and pitch. Experimental results show that the Tandem-NN system achieves a relative improvement of 13.53% in clean conditions and 32.43% in noisy conditions over the baseline system.
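The fusion of clean and artificially noised training data described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not state the noise type or the signal-to-noise ratio, so the Gaussian noise, the 10 dB SNR, the toy sine-wave "utterances", and the helper names `power`, `mix_at_snr` and `augment` are all assumptions made for illustration. Tempo modification is not covered here.

```python
import math
import random


def power(signal):
    """Mean squared amplitude of a sequence of samples."""
    return sum(s * s for s in signal) / len(signal)


def mix_at_snr(clean, noise, snr_db):
    """Add `noise` to `clean`, scaled so the mixture has the requested SNR in dB."""
    # Loop the noise so that it covers the whole utterance.
    noise = [noise[i % len(noise)] for i in range(len(clean))]
    # Choose the gain so that power(clean) / power(gain * noise) == 10 ** (snr_db / 10).
    gain = math.sqrt(power(clean) / (power(noise) * 10 ** (snr_db / 10)))
    return [c + gain * n for c, n in zip(clean, noise)]


def augment(train_set, noise, snr_db):
    """Return the original clean utterances followed by a noisy copy of each."""
    return train_set + [mix_at_snr(u, noise, snr_db) for u in train_set]


# Toy data standing in for real recordings (hypothetical values).
rng = random.Random(0)
clean_utts = [[math.sin(0.05 * t) for t in range(800)] for _ in range(3)]
noise = [rng.gauss(0.0, 1.0) for _ in range(500)]

augmented = augment(clean_utts, noise, snr_db=10)
print(len(clean_utts), "->", len(augmented))  # 3 -> 6
```

Keeping the clean utterances alongside their noisy copies doubles the training set, so a model trained on it would see both matched and mismatched acoustic conditions.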

Journal: International Journal of Speech Technology
Publication Date: Feb 2, 2021
Citations: 7

