Abstract

Speech-based interaction between humans and machines is one of the most significant advances of recent years. Spoken digits are one component of this interaction and are used in systems such as voice call systems and bank transaction systems. As in all other speech recognition applications, a major problem in spoken digit recognition is that the correct detection rate degrades in noisy conditions. Although there is a considerable body of research on Persian spoken digit recognition, no significant work addresses noisy conditions. The main goal of this paper is therefore to choose a robust isolated digit recognition approach for the Persian language. The proposed method is based on Long Short-Term Memory (LSTM) and is compared with a Hidden Markov Model (HMM)-based method, the most common approach in Persian spoken digit recognition. The experimental results show that the LSTM-based spoken digit recognizer is on average 18% more robust than the HMM-based approach across all tested additive noise conditions (five noise types, each at four Signal-to-Noise Ratios (SNRs)).
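To make the additive noise setup concrete, the following is a minimal sketch of how a clean utterance could be mixed with a noise recording at a chosen SNR. It is an illustration under stated assumptions, not the authors' procedure: the function names `mix_at_snr` and `measured_snr_db`, the synthetic signals, and the 5 dB target are all hypothetical, and the abstract does not state which noise types or SNR values were used.

```python
import math
import random


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech, scaled so the mixture has the requested SNR in dB.

    Hypothetical helper for illustration; signals are plain lists of floats.
    """
    if not speech:
        raise ValueError("speech must not be empty")
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]

    # Average power of each signal.
    p_speech = sum(s * s for s in speech) / len(speech)
    p_noise = sum(n * n for n in noise) / len(noise)
    if p_speech == 0 or p_noise == 0:
        raise ValueError("speech and noise must both have non-zero power")

    # Solve SNR_dB = 10 * log10(p_speech / (gain**2 * p_noise)) for gain.
    gain = math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]


def measured_snr_db(clean, noisy):
    """Recover the SNR of a mixture, given the clean signal it was built from."""
    residual = [y - x for x, y in zip(clean, noisy)]
    p_clean = sum(x * x for x in clean) / len(clean)
    p_residual = sum(r * r for r in residual) / len(residual)
    return 10 * math.log10(p_clean / p_residual)


# Synthetic stand-ins for a spoken digit and a noise recording (assumptions).
rng = random.Random(0)
speech = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(8000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(8000)]

noisy = mix_at_snr(speech, noise, snr_db=5.0)
```

Repeating the call for each noise recording and each target SNR would produce the grid of test conditions that the abstract describes (five noises by four SNRs).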
