Abstract

Smart glasses are often used in noisy public spaces or industrial settings. Voice commands and automatic speech recognition (ASR) are well suited as user interfaces for this form factor, but background noise and interfering speakers pose significant challenges. Conventional signal processing techniques are limited in performance, in hardware resources, or both. V-Speech is a novel solution that captures the voice signal with a vibration sensor located in the nasal pads of smart glasses. Although the signal-to-noise ratio (SNR) is much higher with vibration sensor capture, this capture method introduces a "nasal distortion" that must be corrected. The second part of our proposed solution therefore applies a voice transformation to the vibration signal, using a neural network to produce an output that mimics the characteristics of a conventional microphone. We evaluated V-Speech in noise-free and very noisy conditions with 30 volunteer speakers uttering 145 phrases each. We validated its performance with ASR engines, with voice quality assessed using the Perceptual Evaluation of Speech Quality (PESQ) metric, and with subjective listeners who judged intelligibility, naturalness, and overall quality. In extreme noise conditions, the results show a mean improvement of 50% in Word Error Rate (WER) and of 1.0 on a scale of 5.0 in PESQ, with speech regarded as intelligible and naturalness rated as fair to good. The output of V-Speech has low noise, sounds natural, and enables clear voice communication in challenging environments.
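
For readers unfamiliar with the WER metric cited above, the following is a minimal sketch of the standard definition: the word-level edit distance between a reference transcript and the ASR output, divided by the number of reference words. The function name `word_error_rate` and the example phrases are illustrative assumptions, not taken from the V-Speech evaluation, and the abstract does not state which scoring tool or text normalization was used.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count.

    Hypothetical helper for illustration only; lowercasing and whitespace
    splitting are assumed normalization choices.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of an extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Made-up phrases: one substitution ("on" -> "off") and one deletion ("the")
# against a four-word reference.
print(word_error_rate("turn on the light", "turn off light"))
```

Note that the abstract does not specify whether the reported 50% mean improvement is a relative reduction or an absolute difference in percentage points; either could be computed from per-utterance WER values such as those returned by this sketch.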
