Application of Morphosyntactic and Class-Based Language Models in Automatic Speech Recognition of Polish

Aleksander Smywinski-Pohl, Bartosz Ziółko

doi:10.1142/s0218213016500068

Journal: International Journal on Artificial Intelligence Tools
Publication Date: Apr 1, 2016
Citations: 1

Affiliation: Jagiellonian University, AGH University of Krakow, Lukasiewicz Research Network - Krakow Institute of Technology

Abstract

In this paper we investigate the usefulness of morphosyntactic information and of word clustering in modeling Polish for automatic speech recognition. Because Polish is an inflectional language, we examine an N-gram model based on morphosyntactic features. We show how individual types of features influence the model and which types are best suited for building a language model for automatic speech recognition. We compare these results with those of a class-based model derived automatically from the training corpus. We show that our approach to clustering performs significantly better than the frequently used SRI LM clustering method; however, this difference is apparent only for smaller corpora.
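
For readers unfamiliar with class-based models, the following is a minimal sketch of the standard class-based bigram factorization, P(w_i | w_{i-1}) ≈ P(c_i | c_{i-1}) · P(w_i | c_i), where each word w belongs to one class c. It is not the authors' implementation: the abstract does not specify their model or clustering procedure, and the toy corpus, the NOUN/VERB labels, and the function names `train_class_bigram` and `class_bigram_prob` are assumptions made only for illustration. Estimates are plain maximum likelihood with no smoothing.

```python
from collections import Counter

def train_class_bigram(sentences, word2class):
    """Collect the counts needed for a class-based bigram model."""
    class_counts = Counter()    # occurrences of each class
    class_bigrams = Counter()   # occurrences of (previous class, class)
    history_counts = Counter()  # occurrences of a class as a bigram history
    word_counts = Counter()     # occurrences of each word
    for sentence in sentences:
        classes = [word2class[w] for w in sentence]
        word_counts.update(sentence)
        class_counts.update(classes)
        class_bigrams.update(zip(classes, classes[1:]))
        history_counts.update(classes[:-1])
    return class_counts, class_bigrams, history_counts, word_counts

def class_bigram_prob(prev_word, word, word2class, model):
    """Unsmoothed estimate of P(word | prev_word) via word classes."""
    class_counts, class_bigrams, history_counts, word_counts = model
    c_prev, c = word2class[prev_word], word2class[word]
    if history_counts[c_prev] == 0 or class_counts[c] == 0:
        return 0.0
    p_class = class_bigrams[(c_prev, c)] / history_counts[c_prev]  # P(c | c_prev)
    p_word = word_counts[word] / class_counts[c]                   # P(word | c)
    return p_class * p_word

# Hypothetical toy data: hand-assigned classes stand in for either
# morphosyntactic tags or automatically derived clusters.
word2class = {"kot": "NOUN", "pies": "NOUN", "śpi": "VERB", "je": "VERB"}
sentences = [["kot", "śpi"], ["pies", "śpi"], ["kot", "je"]]
model = train_class_bigram(sentences, word2class)

# "pies je" never occurs in the toy corpus, yet the class model
# still assigns it a non-zero probability.
print(class_bigram_prob("pies", "je", word2class, model))
```

The point of the sketch is the data-sparsity argument: in an inflectional language many word-form pairs are never observed, and sharing statistics across a class lets the model generalize to them.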
