Abstract

We propose a new scheme for speaker-dependent silent speech recognition systems (SSRSs) that uses both single-trial scalp-recorded electroencephalograms (EEGs) and speech signals measured while subjects overtly and covertly spoke “janken” and “season” words in Japanese. The scheme consists of two phases. The learning phase specifies a Kalman filter relating spectrograms of the speech signals to independent components (ICs) of the EEGs recorded during overt speech, where the equivalent current dipole source localization (ECDL) solutions of these ICs were located mainly in Broca’s area. For the “season” task, the speech signals were additionally transcribed into vowel and consonant sequences, and the relationship between these sequences and the spectrograms was learned by hidden Markov models (HMMs) with Gaussian mixture densities. The decoding phase predicts spectrograms for the silent “janken” and “season” utterances by applying the Kalman filter to the EEGs recorded during silent speech. For the silent “season” task, the predicted spectrograms were input to the HMMs, and the silently spoken word was identified as the one whose HMM yielded the maximal log-likelihood. Our preliminary results, obtained as training steps, are as follows: the silent “janken” words were correctly discriminated, and the silent “season” HMMs performed well, suggesting that this scheme might be extended to discriminating between all pairs of hiragana.
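To make the two-phase pipeline concrete, the following is a minimal sketch (in Python, using NumPy and hmmlearn) of one plausible reading of the scheme: a linear-Gaussian state-space (Kalman) decoder that treats spectrogram frames as the hidden state and EEG ICs as observations, followed by per-word GMM-HMM scoring by log-likelihood. The variable names, least-squares fitting of the model matrices, and model dimensions are illustrative assumptions, not the authors’ exact implementation.

```python
# Hypothetical sketch of the two-phase scheme.
# Learning: fit a linear-Gaussian state-space model linking spectrogram frames
# (hidden state) to EEG ICs (observations) during overt speech.
# Decoding: run the Kalman filter on silent-speech ICs to predict spectrograms,
# then score them against per-word GMM-HMMs and pick the maximal log-likelihood.
import numpy as np
from hmmlearn.hmm import GMMHMM

def fit_kalman_decoder(spec, ics, reg=1e-6):
    """Least-squares estimates of the state-space matrices.
    spec: (T, d_s) spectrogram frames (state); ics: (T, d_o) EEG ICs (observations)."""
    X0, X1 = spec[:-1], spec[1:]
    A = np.linalg.solve(X0.T @ X0 + reg * np.eye(X0.shape[1]), X0.T @ X1).T  # state transition
    W = np.cov((X1 - X0 @ A.T).T)                                            # process noise
    H = np.linalg.solve(spec.T @ spec + reg * np.eye(spec.shape[1]), spec.T @ ics).T  # observation matrix
    Q = np.cov((ics - spec @ H.T).T)                                         # observation noise
    return A, W, H, Q

def kalman_predict_spectrogram(ics, A, W, H, Q, x0):
    """Standard Kalman filter recursion: predict spectrogram frames from silent-speech ICs."""
    d_s = A.shape[0]
    x, P = x0.copy(), np.eye(d_s)
    frames = []
    for z in ics:
        x, P = A @ x, A @ P @ A.T + W                  # time update
        S = H @ P @ H.T + Q
        K = P @ H.T @ np.linalg.inv(S)                 # Kalman gain
        x = x + K @ (z - H @ x)                        # measurement update
        P = (np.eye(d_s) - K @ H) @ P
        frames.append(x)
    return np.vstack(frames)

def classify_by_hmm(pred_spec, word_hmms):
    """Pick the word whose GMM-HMM gives the highest log-likelihood for the predicted spectrogram."""
    scores = {word: model.score(pred_spec) for word, model in word_hmms.items()}
    return max(scores, key=scores.get), scores

# Hypothetical usage: one GMM-HMM per "season" word, trained on overt-speech spectrogram frames.
# word_hmms = {w: GMMHMM(n_components=5, n_mix=2).fit(frames) for w, frames in train_frames.items()}
```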

Highlights

  • The decipherment of human thought from brain activity, without recourse to speech or action, is one of the most attractive and challenging frontiers of modern science

  • We statistically examined the hypothesis underlying the Directions Into Velocities of Articulators (DIVA) model for our silent speech recognition systems (SSRSs)

  • Taking, for every task and every subject, the minimal p-value between each formant frequency and the independent components (ICs), significant correlations were found for both F1 (r = 0.62–0.88, p ≤ 6.00 × 10⁻¹⁵ for “rock”; r = 0.64–0.94, p ≤ 7.64 × 10⁻¹³ for “paper”; r = 0.54–0.95, p ≤ 3.27 × 10⁻¹³ for “scissors”) and F2 (r = 0.65–0.92, p ≤ 7.55 × 10⁻¹⁵ for “rock”; r = 0.60–0.98, p ≤ 2.83 × 10⁻¹¹ for “paper”; r = 0.68–0.92, p ≤ 7.88 × 10⁻¹⁵ for “scissors”) except for one subject (DK) (Table 1), confirming the hypothesis; a minimal sketch of this correlation test follows below
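As a concrete illustration of the statistic behind this highlight, the following is a minimal sketch, assuming time-aligned formant tracks and IC activation time courses, of how the Pearson correlation and its p-value could be computed per IC and the smallest p-value retained. The function and variable names are hypothetical, not taken from the study.

```python
# Hypothetical sketch: Pearson r and p-value between a formant track (F1 or F2)
# and each EEG IC time course, keeping the IC with the smallest p-value.
import numpy as np
from scipy.stats import pearsonr

def best_correlated_ic(formant, ics):
    """formant: (T,) formant-frequency track; ics: (n_ics, T) IC activations, time-aligned."""
    results = [pearsonr(formant, ic) for ic in ics]   # (r, p) for each IC
    best = int(np.argmin([p for _, p in results]))
    r, p = results[best]
    return best, r, p

# e.g. idx, r, p = best_correlated_ic(f1_track, ic_activations)
```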

Introduction

The decipherment of human thought from brain activity, without recourse to speech or action, is one of the most attractive and challenging frontiers of modern science. In addition to “physical” SSRSs [2,3,4,5], there are “electrical” ones, in which articulation is either inferred from actuator muscle signals or predicted from command signals obtained directly from the brain. The latter could serve as a speech prosthesis for individuals with severe communication impairments. We propose a new scheme for a speaker-dependent SSRS that uses single-trial scalp-recorded EEGs for silent vowel recognition and then generalize it to consonant recognition in Japanese. To exemplify this scheme, we carried out two experiments (Experiments I and II).
