Addressing data sparsity in DNN acoustic modeling

Seeram Tejaswi, S Umesh

doi:10.1109/ncc.2017.8077041

Abstract

This paper presents our work on developing acoustic models using deep neural networks (DNNs) for low resource languages. This is considered one of the challenging problems in automatic speech recognition (ASR), as DNNs need large amounts of data to build efficient models. The techniques explored share a common idea: transferring knowledge from models of a high resource language to a low resource language. These methods include: (i) a cross-lingual approach that builds a common DNN model with data from a low resource language pooled with data from a linguistically close high resource language, (ii) a transfer learning approach that adapts the top one or more layers of a pooled DNN model with low resource data, and (iii) a multi-task learning approach that treats each language as a separate task for a common model trained over multiple related tasks simultaneously, which alleviates over-fitting problems. The multi-task method was found to be the most effective of the three. Experiments were done on Kannada (low resource) and Telugu data, and a 46% relative improvement was observed with the multi-task framework over the corresponding mono-lingual DNN.
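The multi-task idea in (iii) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes the common arrangement in which hidden layers are shared across languages and each language has its own softmax output layer; the abstract does not specify this. All layer sizes, target counts, and function names below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes; the abstract gives no dimensions.
FEAT_DIM, HIDDEN_DIM = 40, 64
NUM_TARGETS = {"kannada": 30, "telugu": 50}  # assumed output classes per language


def init_layer(n_in, n_out):
    """Return (weights, bias) for one fully connected layer."""
    return rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out)


def softmax(z):
    """Row-wise softmax with max subtraction for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


# Hidden layers shared by every language (assumption).
shared = [init_layer(FEAT_DIM, HIDDEN_DIM), init_layer(HIDDEN_DIM, HIDDEN_DIM)]
# One output layer per language, i.e. one "task" each (assumption).
heads = {lang: init_layer(HIDDEN_DIM, n) for lang, n in NUM_TARGETS.items()}


def forward(frames, lang):
    """Pass acoustic frames through the shared layers, then the head for `lang`."""
    h = frames
    for w, b in shared:
        h = np.tanh(h @ w + b)
    w, b = heads[lang]
    return softmax(h @ w + b)


def multitask_loss(batches):
    """Sum of per-language mean cross-entropies.

    `batches` maps a language name to a (frames, labels) pair.
    """
    total = 0.0
    for lang, (frames, labels) in batches.items():
        probs = forward(frames, lang)
        total += -np.mean(np.log(probs[np.arange(len(labels)), labels]))
    return total
```

In this arrangement, gradients from both languages would update the shared layers, while each output layer sees only its own language's data. The transfer learning variant in (ii) could be sketched along the same lines by freezing the lower entries of `shared` and updating only the top layers on low resource data.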

Full Text