Abstract

People with hearing or speaking disabilities are deprived of the benefits of conventional speech recognition technology because it is based on acoustic signals. Recent research has focused on silent speech recognition systems that are based on the motions of a speaker’s vocal tract and articulators. Most silent speech recognition systems use either contact sensors, which are very inconvenient for users, or optical systems, which are susceptible to environmental interference, so a contactless and robust solution is required. Toward this objective, this paper presents a series of signal processing algorithms for a contactless silent speech recognition system using an impulse radio ultra-wide band (IR-UWB) radar. The IR-UWB radar is used to remotely and wirelessly detect motions of the lips and jaw. To extract the necessary features of lip and jaw motions from the received radar signals, we propose a feature extraction algorithm. The proposed algorithm noticeably improved speech recognition performance compared to the existing algorithm in our word recognition test with five speakers. We also propose a speech activity detection algorithm that automatically selects speech segments from continuous input signals, so that speech recognition processing is performed only when speech segments are detected. Our testbed consists of commercial off-the-shelf radar products, and the proposed algorithms are readily applicable without designing specialized radar hardware for silent speech processing.
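The gating idea described above (run recognition only on detected speech segments) can be illustrated with a minimal sketch. This is not the paper's speech activity detection algorithm, which the abstract does not specify; it is a generic short-time energy threshold applied to a one-dimensional motion signal. The function names, the frame length, the threshold, and the `min_frames` parameter are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def frame_energies(signal: Sequence[float], frame_len: int) -> List[float]:
    """Mean squared amplitude of each non-overlapping frame (hypothetical helper)."""
    n_frames = len(signal) // frame_len
    return [
        sum(x * x for x in signal[i * frame_len:(i + 1) * frame_len]) / frame_len
        for i in range(n_frames)
    ]


def detect_segments(
    signal: Sequence[float],
    frame_len: int,
    threshold: float,
    min_frames: int = 2,
) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices of active segments.

    A segment is a run of at least `min_frames` consecutive frames whose
    energy exceeds `threshold`. This is an assumed, generic rule, not the
    detection rule used in the paper.
    """
    energies = frame_energies(signal, frame_len)
    segments: List[Tuple[int, int]] = []
    run_start = None
    # A trailing -inf sentinel closes a run that reaches the end of the signal.
    for i, energy in enumerate(energies + [float("-inf")]):
        if energy > threshold:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            if i - run_start >= min_frames:
                segments.append((run_start * frame_len, i * frame_len))
            run_start = None
    return segments


# Toy input: silence, a burst of motion, then silence again.
toy_signal = [0.0] * 8 + [1.0, -1.0] * 4 + [0.0] * 8
active = detect_segments(toy_signal, frame_len=4, threshold=0.5)
# Only the samples inside `active` would be passed on to a recognizer.
```

In a real system the threshold would have to be tuned to the radar's noise floor, and the minimum run length keeps brief, isolated motions from triggering recognition.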

Highlights

  • Automatic speech recognition (ASR) technology has been in use since the mid-20th century and has gradually been applied in diverse fields

  • We propose a combination of signal processing methods to implement a contactless silent speech recognition system based on impulse radio ultra-wide band (IR-UWB) radar

  • The proposed algorithms demonstrated a word accuracy of about 85% for 10 isolated words, which is comparable to previous silent speech recognition results obtained with contact sensors (e.g., electromagnetic articulography (EMA), EMG, and non-audible murmur (NAM))

  • ... contains all essential radar subsystems within a 5 mm × 5 mm package [43]

Introduction

Automatic speech recognition (ASR) technology has been in use since the mid-20th century and has gradually been applied in diverse fields. ASR technology was initially used to perform simple tasks in applications such as automatic typewriters, automatic call center services, and computer interfaces [1]. As its recognition performance has improved, the scope of ASR applications has expanded significantly. There is a risk that the user’s speech content is accessible to other people in the immediate vicinity. For these reasons, several researchers have focused on the novel technology of silent speech recognition.
