Comparison of Speech Recognition Performance Between Kaldi and Google Cloud Speech API

Takashi Kimura, Shinji Hirooka, Akinori Ito, Takashi Nose, Yuya Chiba

doi:10.1007/978-3-030-03748-2_13

Abstract

In recent years, the number of systems with a speech interface has grown. Such interfaces include a spoken dialogue function, and high performance is required of the spoken dialogue system. A spoken dialogue system includes a speech recognition module. In this study, we focus on that module and aim to improve the spoken dialogue system by enhancing speech recognition performance. Among speech recognition systems, Kaldi is widely used in many kinds of research. On the other hand, several speech recognition services are also provided as Web APIs, such as IBM Watson Speech to Text, Microsoft Bing Speech API, and Google Cloud Speech API, which is known for its high performance. This paper compares the speech recognition performance of Kaldi and Google Cloud Speech API in terms of word error rate (WER) and real-time factor (RTF), and confirms the recognition performance of each system.
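The two metrics named in the abstract can be illustrated with a minimal sketch. It assumes the standard definitions: WER is the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words, and RTF is processing time divided by audio duration. The function names and the whitespace tokenization are assumptions made for illustration, not the authors' evaluation code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumes whitespace-separated words; other languages or scoring
    conventions may need a different tokenizer.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Processing time divided by audio duration; below 1.0 is faster than real time."""
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    return processing_seconds / audio_seconds


if __name__ == "__main__":
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
    print(real_time_factor(processing_seconds=5.0, audio_seconds=10.0))
```

For a Web API, the measured processing time also includes network latency, so an RTF obtained this way is not directly comparable to one measured for a locally run recognizer unless that overhead is accounted for.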
