Abstract

Beyond the recognition of audible speech, there is growing interest in the recognition of silent speech, which has a range of novel applications. A major obstacle to the widespread adoption of silent-speech technology is the lack of methods for measuring speech movements that are at once convenient, non-invasive, portable, and robust. As an alternative to established methods, we therefore examined to what extent different phonemes can be discriminated on the basis of the electromagnetic transmission and reflection properties of the vocal tract. To this end, we attached two Vivaldi antennas to the cheek and below the chin of two subjects. While each subject produced 25 phonemes in multiple phonetic contexts, we measured the electromagnetic transmission spectra from one antenna to the other and the reflection spectra of each antenna (radar) in the frequency band from 2 to 12 GHz. Two classification methods (k-nearest neighbors and linear discriminant analysis) were trained to predict phoneme identity from the spectral data. With linear discriminant analysis, cross-validated phoneme recognition rates of 93% and 85% were achieved for the two subjects. Although these results are speaker- and session-dependent, they suggest that electromagnetic transmission and reflection measurements of the vocal tract have great potential for future silent-speech interfaces.
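
To illustrate the kind of evaluation described above, the following is a minimal sketch of cross-validated k-nearest-neighbor classification of spectrum-like feature vectors. It is not the authors' implementation: the data are synthetic (one random template per class plus Gaussian noise), only k-nearest neighbors is shown (linear discriminant analysis is omitted), and all names and parameter values (`make_synthetic_spectra`, `cross_validate`, the number of classes, bins, folds, and `k`) are assumptions chosen for illustration.

```python
import math
import random
from collections import Counter


def make_synthetic_spectra(n_classes=5, n_per_class=20, n_bins=32, noise=0.3, seed=0):
    """Generate toy 'spectra' (hypothetical data): one random template per class plus noise."""
    rng = random.Random(seed)
    samples, labels = [], []
    for c in range(n_classes):
        template = [rng.uniform(-1.0, 1.0) for _ in range(n_bins)]
        for _ in range(n_per_class):
            samples.append([t + rng.gauss(0.0, noise) for t in template])
            labels.append(c)
    return samples, labels


def knn_predict(train_x, train_y, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest training samples."""
    # Sort training samples by Euclidean distance to the query.
    dists = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))
    votes = Counter(y for _, y in dists[:k])
    return votes.most_common(1)[0][0]


def cross_validate(samples, labels, n_folds=5, k=3, seed=0):
    """Return the fraction of samples classified correctly while held out of training."""
    idx = list(range(len(samples)))
    random.Random(seed).shuffle(idx)
    # Interleaved slices of the shuffled indices form the folds.
    folds = [idx[i::n_folds] for i in range(n_folds)]
    correct = 0
    for fold in folds:
        held_out = set(fold)
        train_x = [samples[i] for i in idx if i not in held_out]
        train_y = [labels[i] for i in idx if i not in held_out]
        correct += sum(
            knn_predict(train_x, train_y, samples[i], k) == labels[i] for i in fold
        )
    return correct / len(samples)


# Toy run on synthetic data; the resulting rate says nothing about real measurements.
spectra, phoneme_ids = make_synthetic_spectra()
rate = cross_validate(spectra, phoneme_ids)
print(f"cross-validated recognition rate: {rate:.2f}")
```

In this sketch each sample is held out exactly once, so the returned value is a pooled recognition rate over all folds. With real measurements, each feature vector would presumably be built from the measured transmission and reflection spectra rather than from random templates.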
