Exploring CTC Based End-To-End Techniques for Myanmar Speech Recognition

Khin Me Me Chit, Laet Laet Lin

doi:10.1007/978-3-030-68154-8_87

Open Access

Publication Date: Jan 1, 2021

Citations: 2

Affiliation: University Of Information Technology

#Syllable Error Rate #Connectionist Temporal Classification

Abstract

In this work, we explore a Connectionist Temporal Classification (CTC) based end-to-end Automatic Speech Recognition (ASR) model for the Myanmar language. We present a series of experiments on the model topology in which convolutional layers are added and dropped, different depths of bidirectional long short-term memory (BLSTM) layers are used, and different label encoding methods are investigated. The experiments are carried out in low-resource scenarios using our recorded Myanmar speech corpus of nearly 26 hours. The best model achieves a character error rate (CER) of 4.72% and a syllable error rate (SER) of 12.38% on the test set.
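
The abstract reports results as CER and SER. As a minimal sketch of how such rates are commonly computed, the Python below takes the Levenshtein edit distance between a reference and a hypothesis token sequence and divides it by the reference length. The function names `edit_distance` and `error_rate` are hypothetical. The choice of token unit is also an assumption: the abstract does not state how characters are counted or how text is segmented into syllables, so the sketch simply accepts pre-tokenized sequences.

```python
from typing import Hashable, Sequence


def edit_distance(ref: Sequence[Hashable], hyp: Sequence[Hashable]) -> int:
    """Levenshtein distance (substitutions, insertions, deletions) between two token sequences."""
    # prev[j] holds the distance between the first i-1 reference tokens
    # and the first j hypothesis tokens.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from i reference tokens to an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def error_rate(ref: Sequence[Hashable], hyp: Sequence[Hashable]) -> float:
    """Edit distance normalized by reference length, as a percentage."""
    if len(ref) == 0:
        raise ValueError("reference must contain at least one token")
    return 100.0 * edit_distance(ref, hyp) / len(ref)


# Assumed usage: characters as tokens for a CER-style score,
# a pre-segmented syllable list for an SER-style score.
cer_like = error_rate(list("kitten"), list("sitting"))
ser_like = error_rate(["a", "b", "c", "d"], ["a", "x", "c", "d"])
```

Passing a list of characters gives a CER-style figure, and passing a list of syllables gives an SER-style figure. Syllable segmentation of Myanmar text is a separate step that this sketch does not attempt.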

Full Text