Abstract

Much research has been done over the past decades to develop speech recognition systems. However, their robustness to speaker variability still lags behind that of the human recognition system. To address this problem, speaker adaptation methods have been proposed. These methods adapt either the acoustic model parameters or the input features of the speech recognition system in order to improve its performance. In this article, two speaker adaptation methods for deep neural network (DNN) based speech recognition systems are proposed. In the first method, the feature vectors of each speaker are adapted nonlinearly after a number of forward-backward iterations. In the second, the speech recognition system is modified so that it can adapt dynamically to speaker variability. Unlike other model adaptation methods, this approach requires no adaptation data and adapts the model online. Experiments on the FARSDAT dataset demonstrate that these methods improve the phone recognition accuracy rate by 2% and 6%, respectively.
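The abstract does not give the details of the first method, but the general idea of feature-space speaker adaptation can be illustrated as follows: a per-speaker transform is applied to the input features and tuned by gradient (forward-backward) iterations while the acoustic model itself stays frozen. The sketch below is a minimal illustration, not the paper's actual algorithm: it uses a single softmax layer in place of a deep network, an affine transform in place of the paper's nonlinear adaptation, and synthetic data; all names and dimensions are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical frozen "acoustic model": a single softmax layer over phone
# classes (the paper uses a deep network; one layer keeps the sketch short).
D, C, T = 10, 4, 50                       # feature dim, phone classes, frames
W = rng.normal(size=(D, C))               # frozen model weights
x = rng.normal(size=(T, D))               # one speaker's feature vectors
y = rng.integers(0, C, size=T)            # frame-level phone labels (synthetic)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def loss(feats):
    """Mean cross-entropy of the frozen model on (possibly adapted) features."""
    p = softmax(feats @ W)
    return -np.log(p[np.arange(T), y] + 1e-12).mean()

# Per-speaker affine feature transform, adapted while the model stays fixed.
A, b = np.eye(D), np.zeros(D)
lr = 0.01
losses = [loss(x @ A + b)]
for _ in range(30):                       # a few forward-backward iterations
    f = x @ A + b                         # forward pass through the transform
    p = softmax(f @ W)
    p[np.arange(T), y] -= 1.0             # dL/dlogits for softmax cross-entropy
    g = (p / T) @ W.T                     # gradient w.r.t. the adapted features
    A -= lr * (x.T @ g)                   # backprop into the transform only,
    b -= lr * g.sum(axis=0)               # leaving the model weights W frozen
    losses.append(loss(x @ A + b))
```

After the iterations, the frozen model fits the adapted features of this speaker better than the raw ones (the loss decreases), which is the basic mechanism feature-space adaptation relies on.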
