Abstract

In this paper, we present the first evidence that perceived speech can be identified from listeners' brain signals measured via functional near-infrared spectroscopy (fNIRS), a non-invasive, portable, and wearable neuroimaging technique suitable for ecologically valid settings. In this study, participants listened to audio clips containing English stories while their prefrontal and parietal cortices were monitored with fNIRS. Machine learning was applied to train predictive models on fNIRS data from a subject pool and to predict which part of a story a new subject, not in the pool, had listened to, based on the brain's hemodynamic response as measured by fNIRS. fNIRS signals can vary considerably from subject to subject owing to differences in head size, head shape, and the spatial locations of functional brain regions. To overcome this difficulty, generalized canonical correlation analysis (GCCA) was adopted to extract latent variables shared among the listeners before applying principal component analysis (PCA) for dimension reduction and logistic regression for classification. An average accuracy of 74.7% was achieved for differentiating between two 50 s long story segments, and an average accuracy of 43.6% was achieved for differentiating among four 25 s long story segments. These results suggest the potential of an fNIRS-based approach for building a speech-decoding brain-computer interface toward a new type of neural prosthetic system.
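The abstract describes a three-stage decoding pipeline: GCCA to extract latent variables shared among listeners, PCA for dimension reduction, and logistic regression for classification. The following is a minimal sketch of such a pipeline, assuming Carroll's MAXVAR formulation of GCCA and NumPy/scikit-learn; the array shapes, regularization value, and labels are illustrative assumptions, not the authors' exact implementation.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression

def gcca_weights(views, n_components=5, reg=1e-3):
    """views: list of time-aligned, mean-centered (T, d_j) arrays, one per subject.
    Returns per-view weight matrices mapping each subject's fNIRS channels
    into a shared latent space (Carroll's MAXVAR formulation of GCCA)."""
    T = views[0].shape[0]
    M = np.zeros((T, T))
    cached = []
    for X in views:
        C = X.T @ X + reg * np.eye(X.shape[1])   # regularized channel covariance
        Cinv_Xt = np.linalg.solve(C, X.T)        # (d_j, T)
        cached.append(Cinv_Xt)
        M += X @ Cinv_Xt                         # accumulate per-view projection matrices
    _, eigvecs = np.linalg.eigh(M)
    G = eigvecs[:, -n_components:]               # shared latent time courses
    return [Cinv_Xt @ G for Cinv_Xt in cached]

# Toy usage: 6 training subjects, 500 time samples, 40 fNIRS channels,
# two story segments (the per-sample labels here are hypothetical).
rng = np.random.default_rng(0)
views = [rng.standard_normal((500, 40)) for _ in range(6)]
views = [X - X.mean(axis=0) for X in views]      # center each subject's data
labels = np.repeat([0, 1], 250)

weights = gcca_weights(views)
latent = np.hstack([X @ W for X, W in zip(views, weights)])

pca = PCA(n_components=10).fit(latent)
clf = LogisticRegression(max_iter=1000).fit(pca.transform(latent), labels)
print(clf.score(pca.transform(latent), labels))
```

In the paper's leave-one-subject-out setting, the GCCA weights would be estimated from the training pool and the held-out subject's data projected into the shared space before classification; the toy example above only classifies the training data to show how the stages connect.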

Highlights

  • The decoding of speech from brain signals has attracted the attention of researchers in recent years (Chakrabarti et al., 2015; AlSaleh et al., 2016; Herff and Schultz, 2016)

  • We applied machine learning to identify, among several story segments, the one listened to by a subject based on the brain's hemodynamic response measured with functional near-infrared spectroscopy (fNIRS)

  • An inter-subject correlation analysis revealed that the fNIRS time courses were significantly correlated at parietal areas, suggesting that the signal was most consistent at parietal optodes and that these optodes may have provided the most information for characterizing the story segments
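
The highlight above refers to an inter-subject correlation (ISC) analysis of the fNIRS time courses. Below is a minimal sketch of a leave-one-out ISC computation; the array shapes and the leave-one-out averaging scheme are assumptions for illustration, not necessarily the exact analysis used in the paper.

```python
import numpy as np

def inter_subject_correlation(data):
    """data: array of shape (n_subjects, n_timepoints, n_channels).
    For each subject and optode/channel, correlate that subject's time course
    with the average time course of the remaining subjects."""
    n_subj, _, n_chan = data.shape
    isc = np.zeros((n_subj, n_chan))
    for s in range(n_subj):
        others = data[np.arange(n_subj) != s].mean(axis=0)   # leave-one-out average
        for c in range(n_chan):
            isc[s, c] = np.corrcoef(data[s, :, c], others[:, c])[0, 1]
    return isc.mean(axis=0)                                  # mean ISC per channel

# Example with simulated data: 6 subjects, 500 samples, 40 channels
rng = np.random.default_rng(1)
print(inter_subject_correlation(rng.standard_normal((6, 500, 40))).shape)  # (40,)
```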


Introduction

The decoding of speech from brain signals has attracted the attention of researchers in recent years (Chakrabarti et al., 2015; AlSaleh et al., 2016; Herff and Schultz, 2016). A device that can directly translate brain signals into text describing a person's thoughts may help people with disabilities and verbal communication deficits and enable a new communication channel with the outside world. Such a brain-computer interfacing device may also allow healthy people to interact directly with a machine without the need to use muscles, potentially expanding the interaction bandwidth. Herff et al. decoded words from continuously spoken speech using intracranial electrocorticographic (ECoG) signals recorded from epileptic patients, based on the classification of phonemes (Herff et al., 2015). We refer to Chakrabarti et al. (2015) and Herff and Schultz (2016) for a more comprehensive review of the state of the art in the field.

