Abstract

Speech imagery (SI) is a Brain-Computer Interface (BCI) paradigm based on EEG signal analysis in which the user imagines speaking a vowel, phoneme, syllable, or word without producing any sound or facial movement. This paradigm is well suited to developing interfaces for patients diagnosed with neurological disorders, since it helps them communicate with their surroundings. This paper presents a prototypical network named Proto-Speech to classify vowels, syllables, and words acquired with the SI paradigm. The embeddings of the prototypical network are produced with a 1D convolutional layer and two bidirectional Gated Recurrent Unit (GRU) layers. The meta-training strategy of Proto-Speech considers the eleven classes of the KaraOne dataset, and the meta-testing is configured with five binary classification tasks commonly used with KaraOne, plus an additional multi-classification scheme. A second publicly available dataset named ASU is also used in meta-testing to classify long words, short words, vowels, and short-long words, as well as in a multiclass setting. Both meta-training and meta-testing are implemented in a subject-independent strategy. Experimental results indicate that the best average accuracies obtained on the binary classification tasks vowel/consonant, non-nasal/nasal, non-bilabial/bilabial, non-iy/iy, and non-uw/uw with the KaraOne dataset are 99.89%, 99.89%, 99.88%, 99.91%, and 99.92%, respectively. For the multi-classification task, an average accuracy of 91.51% is obtained. The average classification accuracy for the multiclass evaluation on the ASU dataset is 93.70%. These results surpass state-of-the-art methods evaluated with subject-dependent and subject-independent strategies.
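
For illustration, the following is a minimal PyTorch sketch of the embedding network described above (one 1D convolutional layer followed by two bidirectional GRU layers) together with the standard prototypical-network classification step. All layer sizes, channel counts, and names are assumptions, since the abstract does not specify them; this is a sketch of the general technique, not the authors' implementation.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class ProtoSpeechEmbedding(nn.Module):
        """Embedding network: 1D conv followed by two bidirectional GRU layers.

        All hyperparameters (EEG channel count, conv channels, hidden size,
        embedding dimension) are assumed values for illustration only.
        """
        def __init__(self, in_channels=62, conv_channels=64,
                     hidden=128, embed_dim=64):
            super().__init__()
            self.conv = nn.Conv1d(in_channels, conv_channels,
                                  kernel_size=7, padding=3)
            # num_layers=2 with bidirectional=True gives two BiGRU layers.
            self.gru = nn.GRU(conv_channels, hidden, num_layers=2,
                              bidirectional=True, batch_first=True)
            self.proj = nn.Linear(2 * hidden, embed_dim)

        def forward(self, x):             # x: (batch, channels, time)
            h = F.relu(self.conv(x))      # (batch, conv_channels, time)
            h = h.transpose(1, 2)         # (batch, time, conv_channels)
            out, _ = self.gru(h)          # (batch, time, 2 * hidden)
            return self.proj(out[:, -1])  # last time step -> embedding

    def prototypical_logits(support, support_labels, queries, n_classes):
        """Class prototypes are the mean support embeddings; queries are
        scored by negative squared Euclidean distance to each prototype."""
        protos = torch.stack([support[support_labels == c].mean(dim=0)
                              for c in range(n_classes)])
        return -torch.cdist(queries, protos) ** 2

In a meta-testing episode, embeddings of a few labelled support trials define the class prototypes, and each query trial is assigned to its nearest prototype via a softmax over these distance-based logits.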
