Abstract

Deep neural networks (DNNs) have been successfully applied to many pattern classification problems, including acoustic modelling for automatic speech recognition (ASR). However, DNN adaptation remains a challenging task. Many approaches have been proposed in recent years to improve the adaptability of DNNs to achieve robust ASR. This chapter reviews recent adaptation methods for DNNs, broadly categorising them into constrained adaptation, feature normalisation, feature augmentation and structured DNN parameterisation. Specifically, we describe various methods of estimating reliable representations for feature augmentation, focusing primarily on comparing i-vectors with bottleneck features. Moreover, we also present an adaptable DNN layer parameterisation scheme based on a linear interpolation structure. The interpolation weights can be reliably adjusted to adapt the DNN to different conditions. This generic scheme subsumes many existing DNN adaptation methods, including adaptation with speaker codes, learning hidden unit contributions (LHUC), factorised hidden layers, and cluster adaptive training for DNNs.
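To make the linear-interpolation idea concrete, the following minimal sketch shows a layer whose weight matrix is a weighted combination of a small set of basis matrices, so that only the interpolation weights need to be adjusted during adaptation. All names, shapes and the choice of activation here are illustrative assumptions, not the chapter's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)
C, d_in, d_out = 3, 4, 2   # number of bases, input dim, output dim (assumed)

# Condition-dependent basis weight matrices, fixed after training.
bases = rng.standard_normal((C, d_in, d_out))

# Interpolation weights: the only parameters updated during adaptation.
lam = np.full(C, 1.0 / C)

def interpolated_layer(x, lam, bases):
    """Hidden layer h = tanh(x @ W(lam)), where W(lam) = sum_c lam_c * W_c."""
    W = np.tensordot(lam, bases, axes=1)   # combine bases -> (d_in, d_out)
    return np.tanh(x @ W)

x = rng.standard_normal((5, d_in))         # a mini-batch of 5 input vectors
h = interpolated_layer(x, lam, bases)
print(h.shape)                             # (5, 2)
```

Adapting to a new speaker or acoustic condition then amounts to re-estimating the small vector `lam` on adaptation data while keeping the basis matrices frozen, which is far fewer parameters than updating the full network.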
