Abstract

This work aimed to map electroencephalography (EEG) signals recorded during speech production to intelligible speech. Experiments were designed to record EEG and spoken speech signals from normal participants. EEG features were processed with a Gaussian process regression method and used to estimate multiple temporal amplitude envelopes of the spoken speech signal. The estimated envelopes were then used to synthesize an intelligible speech signal with a temporal envelope-based vocoder model. Reconstruction performance was evaluated with the short-term objective intelligibility (STOI) index and the root mean square error (RMSE) between the reconstructed vocoded speech and the original spoken speech. Results showed a small RMSE between two sets of mel-frequency cepstral coefficients and an STOI score of up to 0.71. Both measures outperformed results from existing studies with similar tasks, indicating the potential for synthesizing intelligible spoken speech from EEG signals in brain-computer interface based speech communication.
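
As an illustration of the RMSE measure mentioned above, the following is a minimal sketch of how an RMSE between two sets of mel-frequency cepstral coefficients might be computed. It assumes both sets are already available as equally shaped frame-by-coefficient matrices. The function name `mfcc_rmse` and the toy values are hypothetical and are not taken from the paper; the MFCC extraction itself, and any normalization the authors may have applied, are not shown.

```python
import math


def mfcc_rmse(reference, estimate):
    """Root mean square error over all entries of two MFCC matrices.

    Both arguments are lists of frames, each frame a list of coefficients.
    This layout is an assumption made for illustration.
    """
    if len(reference) != len(estimate):
        raise ValueError("frame counts differ")
    total = 0.0
    count = 0
    for ref_frame, est_frame in zip(reference, estimate):
        if len(ref_frame) != len(est_frame):
            raise ValueError("coefficient counts differ")
        for r, e in zip(ref_frame, est_frame):
            total += (r - e) ** 2  # accumulate squared error
            count += 1
    if count == 0:
        raise ValueError("empty input")
    return math.sqrt(total / count)


# Toy values only: two frames with two coefficients each.
original = [[1.0, 2.0], [3.0, 4.0]]
reconstructed = [[1.0, 2.0], [3.0, 6.0]]
print(mfcc_rmse(original, reconstructed))
```

A lower value indicates that the coefficients of the reconstructed speech lie closer to those of the original speech; identical matrices give an RMSE of zero.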
