Improving the Robustness of Persian Spoken Isolated Digit Recognition based on LSTM

Mohammad Mehdi Naseri, Shima Tabibian

doi:10.1109/icspis51611.2020.9349539



Publication Date: Dec 23, 2020

Citations: 4

Affiliation: Shahid Beheshti University


Abstract

Speech-based interaction between humans and machines is one of the most significant advances of recent years. Spoken digits are one component of this interaction and are used in systems such as voice calling and bank transaction systems. A major problem in spoken digit recognition, as in all other speech recognition applications, is that the true detection rate of spoken digits degrades in noisy conditions. Although there is a large body of research on Persian spoken digit recognition, there is no significant work on noisy conditions. The main goal of this paper is therefore to choose a robust isolated digit recognition approach for the Persian language. The proposed method is based on Long Short-Term Memory (LSTM) and is compared with a Hidden Markov Model (HMM)-based method, which is the most common method in Persian spoken digit recognition. The experimental results show that the LSTM-based spoken digit recognizer is on average 18% more robust than the HMM-based approach across all tested additive noise conditions (five different noise types, each at four different signal-to-noise ratios (SNRs)).
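The additive noise conditions mentioned in the abstract can be illustrated with a minimal sketch of mixing a noise signal into a clean utterance at a chosen SNR. This is not the authors' code: the function names `mix_at_snr` and `measured_snr_db`, the synthetic sine-wave "utterance", the Gaussian noise, and the 5 dB target are all assumptions made for illustration, since the abstract does not name the noise types or SNR values used.

```python
import math
import random


def mean_power(signal):
    """Average power (mean of squared samples) of a sequence of floats."""
    return sum(v * v for v in signal) / len(signal)


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech, scaled so the mixture has the requested SNR in dB.

    Hypothetical helper: the noise is looped or trimmed to the utterance length.
    """
    noise = [noise[i % len(noise)] for i in range(len(speech))]
    # SNR_dB = 10*log10(P_s / (g^2 * P_n))  =>  g = sqrt(P_s / (P_n * 10^(SNR_dB/10)))
    gain = math.sqrt(mean_power(speech) / (mean_power(noise) * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]


def measured_snr_db(clean, noisy):
    """SNR in dB of a noisy signal relative to its clean reference."""
    residual = [y - x for x, y in zip(clean, noisy)]
    return 10 * math.log10(mean_power(clean) / mean_power(residual))


# Stand-ins for real recordings (assumed data, not from the paper).
random.seed(0)
speech = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(8000)]
noise = [random.gauss(0.0, 1.0) for _ in range(3000)]

noisy = mix_at_snr(speech, noise, snr_db=5.0)
achieved = measured_snr_db(speech, noisy)
```

Repeating this for each noise recording and each target SNR would produce a grid of test conditions like the one the abstract describes, on which two recognizers could then be compared.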
