Abstract

An Automatic Speech Recognition (ASR) System is a software tool that converts a speech audio waveform into its corresponding text transcription. ASR systems are usually built using Artificial Intelligence techniques, particularly Machine Learning algorithms like Deep Learning, to address the multi-faceted complexity and variability of human speech. This allows these systems to learn from extensive speech datasets, adapt to several languages and accents, and continuously improve their performance over time, making them each time more versatile and effective in their purpose of transcribing spoken language to text. Much in the same way, we argue that the noises commonly present in the different environments also need to be explicitly dealt with, and, when possible, modeled within specific datasets with proper training. Our motivation comes from the observation that noise removal techniques (commonly called denoising), are not always fully (and generically) efficient. For instance, noise degeneration due to communication interference, which is almost always present in radio transmissions, has peculiarities that a simple mathematical formulation cannot model. This work presents a modeling technique composed of an augmented dataset-building approach and a profile identifier that can be used to build ASRs for noisy environments that perform similarly to those used in noise-free environments. As a case study, we developed a specific ASR for the interference noise in radio transmissions with its specific dataset, while comparing our results with other state-of-the-art work. As a result, we report a Character Error Rate value of 0.3163 for the developed ASR under several different noise conditions.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.