Boosting universal speech attributes classification with deep neural network for foreign accent characterization

Ville Hautamäki, Valerio Mario Salerno, Hamid Behravan, Sabato Marco Siniscalchi, Ivan Kukanov

doi:10.21437/interspeech.2015-165

Abstract

We recently proposed a universal acoustic characterisation for foreign accent recognition, in which any spoken foreign accent is described in terms of a common set of fundamental speech attributes. Although experimental evidence demonstrated the feasibility of our approach, we believe that speech attributes, namely manner and place of articulation, can be better modelled by a deep neural network. In this work, we propose the use of a deep neural network trained on telephone-bandwidth material from different languages to improve the proposed universal acoustic characterisation. We demonstrate that deeper neural architectures enhance attribute classification accuracy. Furthermore, we show that improvements in attribute classification carry over to foreign accent recognition, yielding a 21% relative improvement over the previous baseline on spoken Finnish and a 5.8% relative improvement on spoken English.

Index Terms: deep neural networks, data-driven speech attributes, manner of articulation, place of articulation, i-vector, foreign accent recognition

Full Text