Deep maxout neural networks for speech recognition

Meng Cai, Jia Liu, Yongzhe Shi

doi:10.1109/asru.2013.6707745

Publication Date: Dec 1, 2013

Citations: 67

Affiliation: Tsinghua University

Abstract

Maxout, a recently introduced type of neural network, has worked well in many domains. In this paper, we propose applying maxout networks to acoustic modeling in speech recognition. A maxout neuron takes the maximum value within a group of linear pieces as its activation. This nonlinearity generalizes the rectified linear nonlinearity and can approximate any form of activation function. We apply maxout networks to the Switchboard phone-call transcription task and evaluate performance under both a 24-hour low-resource condition and a 300-hour core condition. Experimental results demonstrate that maxout networks converge faster, generalize better, and are easier to optimize than rectified linear networks and sigmoid networks. Furthermore, experiments show that maxout networks reduce underfitting and achieve good results without dropout training. Under both conditions, maxout networks yield relative improvements of 1.1-5.1% over rectified linear networks and 2.6-14.5% over sigmoid networks on benchmark test sets.
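
The maxout activation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`maxout_unit`, `maxout_layer`, `relu_as_maxout`), the pure-Python list representation, and the example weights are all assumptions made for illustration. Each unit computes k linear pieces z_j = w_j . x + b_j and outputs the largest one. A rectified linear unit is the special case with two pieces where one piece is fixed at zero.

```python
from typing import List, Sequence


def maxout_unit(
    x: Sequence[float],
    weights: Sequence[Sequence[float]],
    biases: Sequence[float],
) -> float:
    """Return the maximum over k linear pieces z_j = w_j . x + b_j.

    `weights` holds one weight vector per linear piece and `biases`
    holds one bias per piece (illustrative layout, not from the paper).
    """
    pieces = [
        sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
        for w, b in zip(weights, biases)
    ]
    return max(pieces)


def maxout_layer(
    x: Sequence[float],
    layer_weights: Sequence[Sequence[Sequence[float]]],
    layer_biases: Sequence[Sequence[float]],
) -> List[float]:
    """Apply one maxout unit per group of linear pieces."""
    return [
        maxout_unit(x, w, b) for w, b in zip(layer_weights, layer_biases)
    ]


def relu_as_maxout(x: Sequence[float], w: Sequence[float], b: float) -> float:
    """Express max(0, w . x + b) as a maxout unit with two pieces.

    The second piece has all-zero weights and zero bias, so it is
    always 0 regardless of the input.
    """
    zero_piece = [0.0] * len(w)
    return maxout_unit(x, [list(w), zero_piece], [b, 0.0])


if __name__ == "__main__":
    x = [1.0, 2.0]
    # Two hypothetical linear pieces for a single maxout unit.
    print(maxout_unit(x, [[1.0, 0.0], [0.0, -1.0]], [0.0, 0.5]))
    # The same input passed through the ReLU special case.
    print(relu_as_maxout(x, [-1.0, 0.0], 0.0))
```

In a trained network the weights of every piece are learned, so the shape of the activation is itself learned rather than fixed in advance, which is the sense in which maxout generalizes the rectified linear nonlinearity.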