We propose sparse banded acoustic models that improve recognition accuracy while reducing the computational cost of speech recognition systems. The sparse banded models are trained with weighted lasso regularization. In addition, we propose new feature orderings that reduce the bandwidth of the sparse banded models and thereby speed up computation. Experimental results on the Wall Street Journal data set show that sparse banded models significantly outperform diagonal and full covariance models, by 9.5% and 15.1% relative, respectively, while also running the fastest. The advantages of sparse banded models are further demonstrated on a Cantonese data set we collected.
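To make the core idea concrete, here is a minimal sketch of how a weighted lasso penalty can induce a banded sparsity pattern in a precision (inverse covariance) matrix. This is an illustration only, not the paper's training algorithm: the synthetic AR(1) data, the distance-based penalty weights `w`, and the value of `lam` are all hypothetical choices, and the shrinkage is a one-shot soft-threshold rather than a full optimization.

```python
import numpy as np

# Illustrative sketch: a distance-weighted soft-threshold on an estimated
# precision matrix mimics how a weighted lasso penalty drives entries far
# from the diagonal to zero, producing a banded (sparse) model.
rng = np.random.default_rng(0)
d, n = 6, 5000
i, j = np.indices((d, d))
Sigma = 0.6 ** np.abs(i - j)                 # AR(1) covariance: its true precision is tridiagonal
X = rng.multivariate_normal(np.zeros(d), Sigma, size=n)

S = np.cov(X, rowvar=False)                  # sample covariance
P = np.linalg.inv(S)                         # dense precision estimate

lam = 0.1                                    # hypothetical regularization strength
w = np.abs(i - j).astype(float)              # penalty weight grows with distance from the diagonal
P_sparse = np.sign(P) * np.maximum(np.abs(P) - lam * w, 0.0)
np.fill_diagonal(P_sparse, np.diag(P))       # leave the diagonal unpenalized

bandwidth = int(np.max(np.abs(i - j)[P_sparse != 0]))
print("effective bandwidth:", bandwidth)
```

Once the precision matrix is banded with bandwidth b, each Gaussian log-likelihood evaluation touches only O(d·b) entries instead of O(d²), which is the computational advantage the abstract refers to; smaller bandwidths (encouraged by better feature orderings) make this cheaper still.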