Abstract

Automatic speech recognition (ASR) has become one of the fastest-growing engineering technologies, but its accuracy is often reduced by environmental noise. Many studies have attempted to improve ASR word accuracy in noisy backgrounds, yet few have tried to estimate how noise influences the word accuracy of ASR systems. This study investigated the effect of stationary noise on the Mandarin word accuracy of two ASR systems, Baidu Cloud and Microsoft Azure. The speech material was selected from an open Mandarin speech database. The interference noises comprised pink noise, white noise, random noise with a power spectrum equivalent to the average power spectrum of Mandarin, band-stop filtered noise, and several typical household appliance noises (hairdryer, range hood, blender, and water purifier). The signal-to-noise ratios (SNRs) ranged from −12 dB to 3 dB. The Azure ASR system was less affected by noise and tended to achieve higher word accuracy than the Baidu ASR system under the same noise condition. We propose an approach that associates the articulation index (AI) of the interference noise with the speech recognition accuracy of an ASR system. A three-parameter logistic model can be established for a specific ASR system, with each parameter linearly dependent on the AI of the interference noise. With this approach, the performance of ASR systems under different stationary noise conditions can be quantitatively assessed and compared. Because the parameters of a model apply only to one particular version of an ASR system, a separate model should be established for each version through this AI-associating approach.
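
To make the structure of the AI-associated model concrete, the following Python sketch shows one way a three-parameter logistic curve of word accuracy versus SNR could be written, with each parameter a linear function of the AI. The abstract does not give the functional form or any fitted values, so everything here is an assumption for illustration: the parameterization (upper asymptote, steepness, midpoint), the names `LinearInAI` and `AccuracyModel`, and all numeric coefficients are hypothetical and are not the authors' results.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class LinearInAI:
    """One logistic parameter modelled as intercept + slope * AI (assumed form)."""
    intercept: float
    slope: float

    def at(self, ai: float) -> float:
        return self.intercept + self.slope * ai


@dataclass(frozen=True)
class AccuracyModel:
    """Three-parameter logistic of word accuracy vs. SNR for one ASR version.

    The parameterization below is an assumption; the abstract only states
    that the model has three parameters, each linear in the AI.
    """
    upper: LinearInAI      # asymptotic word accuracy at high SNR
    steepness: LinearInAI  # growth rate of the curve, per dB
    midpoint: LinearInAI   # SNR (dB) at which accuracy is half of `upper`

    def word_accuracy(self, snr_db: float, ai: float) -> float:
        a = self.upper.at(ai)
        k = self.steepness.at(ai)
        x0 = self.midpoint.at(ai)
        return a / (1.0 + math.exp(-k * (snr_db - x0)))


# Hypothetical coefficients, chosen arbitrarily so the example runs.
# They are NOT fitted to Baidu, Azure, or any measured data.
demo = AccuracyModel(
    upper=LinearInAI(intercept=0.90, slope=0.05),
    steepness=LinearInAI(intercept=0.40, slope=0.20),
    midpoint=LinearInAI(intercept=-2.0, slope=-6.0),
)

# Evaluate over the SNR range used in the study (-12 dB to 3 dB).
for snr in range(-12, 4, 3):
    print(snr, round(demo.word_accuracy(snr, ai=0.5), 3))
```

Under this sketch, comparing two ASR systems (or two versions of one system) would mean fitting a separate `AccuracyModel` to each and evaluating both at the same SNR and AI values.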
