Abstract
Bone-conduction microphones (BCMs) capture speech signals based on the vibrations of the speaker's skull and exhibit better noise-resistance capabilities than normal air-conduction microphones (ACMs) when transmitting speech signals. Because BCMs only capture the low-frequency portion of speech signals, their frequency response is quite different from that of ACMs. When replacing an ACM with a BCM, we may obtain satisfactory results with respect to noise suppression, but the speech quality and intelligibility may be degraded due to the nature of the solid vibration. The mismatched characteristics of BCM and ACM can also impact the automatic speech recognition (ASR) performance, and it is infeasible to recreate a new ASR system using the voice data from BCMs. In this study, we propose a novel deep-denoising autoencoder (DDAE) approach to bridge BCM and ACM in order to improve speech quality and intelligibility, and the current ASR could be employed directly without recreating a new system. Experimental results first demonstrated that the DDAE approach can effectively improve speech quality and intelligibility based on standardized evaluation metrics. Moreover, our proposed system can significantly improve the ASR performance by a notable 48.28% relative character error rate (CER) reduction (from 14.50% to 7.50%) under quiet conditions. In an actual noisy environment (sound pressure from 61.7 dBA to 73.9 dBA), our proposed system with a BCM outperforms an ACM, yielding an 84.46% reduction in the relative CER (proposed system: 9.13% and ACM: 58.75%).
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.