Abstract

Neural speech decoding-driven brain-computer interfaces (BCIs), or speech-BCIs, are a novel paradigm for exploring communication restoration for locked-in (fully paralyzed but aware) patients. Speech-BCIs aim to learn a direct mapping from neural signals to text or speech, which has the potential for a higher communication rate than current BCIs. Although recent progress has demonstrated the potential of speech-BCIs from either invasive or non-invasive neural signals, the majority of the systems developed so far still assume that the onset and offset of the speech utterances within the continuous neural recordings are known. This lack of real-time voice/speech activity detection (VAD) is an obstacle for future applications of neural speech decoding in which BCI users can hold a continuous conversation with other speakers. To address this issue, in this study we attempted to detect voice/speech activity automatically and directly from neural signals recorded using magnetoencephalography (MEG). First, we classified whole segments of pre-speech, speech, and post-speech in the neural signals using a support vector machine (SVM). Second, for continuous prediction, we used a long short-term memory recurrent neural network (LSTM-RNN) to decode the voice activity at each time point via its sequential pattern-learning mechanism. Experimental results demonstrated the possibility of real-time VAD directly from non-invasive neural signals with about 88% accuracy.
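The continuous-prediction stage described above can be illustrated with a minimal sketch: an LSTM cell that consumes one feature vector per timestep and emits a per-timestep speech probability. This is a toy, numpy-only implementation with randomly initialized weights and synthetic stand-in features; the paper's actual network architecture, MEG features, and trained parameters are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class LSTMCellVAD:
    """Minimal LSTM cell emitting a per-timestep speech probability (sketch)."""
    def __init__(self, n_in, n_hid):
        s = 0.1
        # one stacked weight matrix for the input, forget, output, and candidate gates
        self.W = rng.normal(0.0, s, (4 * n_hid, n_in + n_hid))
        self.b = np.zeros(4 * n_hid)
        self.w_out = rng.normal(0.0, s, n_hid)  # hidden state -> VAD logit
        self.n_hid = n_hid

    def forward(self, x_seq):
        h = np.zeros(self.n_hid)
        c = np.zeros(self.n_hid)
        probs = []
        for x in x_seq:                          # one feature vector per timestep
            z = self.W @ np.concatenate([x, h]) + self.b
            i, f, o, g = np.split(z, 4)
            i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
            c = f * c + i * np.tanh(g)           # cell-state update
            h = o * np.tanh(c)
            probs.append(sigmoid(self.w_out @ h))  # P(speech) at this timestep
        return np.array(probs)

# Toy run: 50 timesteps of 8-dimensional synthetic features standing in for MEG data.
cell = LSTMCellVAD(n_in=8, n_hid=16)
p = cell.forward(rng.normal(size=(50, 8)))
print(p.shape)  # one speech probability per timestep
```

In a real pipeline the weights would of course be learned from labeled recordings, and the per-timestep probability would be thresholded to produce the speech onset/offset decisions.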

Highlights

  • We investigated the automatic, real-time detection of voice activity from neural (MEG) signals (NeuroVAD) using LSTM-RNN

  • We classified the speech and silence intervals from the MEG signals via a support vector machine (SVM) classifier with 90% accuracy, which indicated that predicting the start and end time points of speech-motor activity is possible
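The segment-classification step can likewise be sketched as a three-class SVM over per-segment feature vectors. The features below are synthetic Gaussian stand-ins; the study's actual MEG channels, feature extraction, and reported 90% figure are not reproduced by this toy example.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-ins for per-segment MEG feature vectors, one cluster per class.
n_per_class, n_feat = 40, 12
X = np.vstack([
    rng.normal(loc=mu, scale=1.0, size=(n_per_class, n_feat))
    for mu in (-1.0, 0.0, 1.0)      # pre-speech, speech, post-speech clusters
])
y = np.repeat(["pre-speech", "speech", "post-speech"], n_per_class)

clf = SVC(kernel="rbf").fit(X, y)   # three-class segment classifier
acc = clf.score(X, y)               # accuracy on the toy training data
```

On real data one would extract features per segment (e.g. band power per sensor), use held-out trials for evaluation, and read the predicted class boundaries as the start and end of speech-motor activity.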

Introduction

Brain damage or late-stage amyotrophic lateral sclerosis (ALS) eventually leads to a stage of paralysis called locked-in syndrome, where the patients become completely paralyzed while being otherwise cognitively intact [1]. Accessing the remaining neural pathways through a brain-computer interface (BCI) offers these patients a means of communication. EEG-based BCIs are primarily based on screen-based letter-selection strategies using visual or attention correlates, resulting in a slow communication rate (a few words per minute) [4]. To address this issue, recent studies have explored direct mapping of neural signals to text (neural speech decoding) or to speech acoustics (neural speech synthesis), which drives a speech synthesizer. The speech decoding paradigm has the potential for faster, next-generation BCIs. Numerous studies have proven the feasibility of neural speech decoding [5,6,7,8,9]. Recent studies have shown the possibility of phoneme, syllable, and word classification with electroencephalography (EEG) or electrocorticography (ECoG).
