Abstract

Human-Machine Interaction (HMI) systems demand the use of multiple modalities for reliable interaction. Research on these systems began with audio signals for speech recognition and is now progressing towards the cooperation of other biosignals. This paper presents an Automatic Speech Recognition (ASR) system based on single and multiple modalities, namely audio and Electroencephalogram (EEG) signals, to explore speech recognition. It extracts speech information concealed in audio and in ten channels of imagined EEG (EEG-i) and vocalized EEG (EEG-v) signals. Three Wavelet Transform (WT) methods, Discrete Wavelet Transform (DWT), Wavelet Packet Decomposition (WPD), and a hybrid of DWT and WPD (DWPD), with four-level decomposition are used to transform the signals into WT coefficients. Six statistical parameters are then computed from the WT coefficients to generate 63 (2^6 - 1) feature vectors for each method. An exhaustive search over the 63 feature vectors is conducted to determine the parameter combination that attains the best accuracy with an Artificial Neural Network (ANN) classifier. Accuracy is then further improved by applying five-level decomposition to the WPD coefficients along with the best parameter combination. Results include the accuracy of both unimodal and multimodal ASR. The WPD method achieved the best accuracies of 74.48%, 56.29%, 42.02%, 77.97%, and 78.90% for multiclass classification of prompts + words based on audio, EEG-i, EEG-v, audio + EEG-i, and audio + EEG-v, respectively. This indicates that speech recognition from EEG signals is possible and that fusing audio with EEG enhances the recognition rate over either modality alone. The results also show that the proposed method outperforms other methods in the area.

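The pipeline described above (wavelet decomposition followed by per-band statistics) can be illustrated with a minimal sketch. The snippet below computes a four-level DWT of a one-dimensional signal with PyWavelets and derives six statistics per coefficient band; the choice of mother wavelet ('db4') and the specific six statistics (mean, standard deviation, variance, skewness, kurtosis, energy) are assumptions for illustration, not details taken from the paper.

```python
# Hedged sketch: 4-level DWT feature extraction for one signal frame.
# Assumptions (not from the paper): 'db4' wavelet and the six statistics below.
import numpy as np
import pywt
from scipy.stats import skew, kurtosis

def dwt_feature_vector(signal, wavelet="db4", level=4):
    # pywt.wavedec returns [cA_level, cD_level, ..., cD_1] coefficient bands
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    features = []
    for band in coeffs:
        features.extend([
            np.mean(band),      # mean
            np.std(band),       # standard deviation
            np.var(band),       # variance
            skew(band),         # skewness
            kurtosis(band),     # kurtosis
            np.sum(band ** 2),  # energy
        ])
    return np.asarray(features)

# Example: a synthetic 1-second frame sampled at 16 kHz
frame = np.random.randn(16000)
print(dwt_feature_vector(frame).shape)  # 5 bands x 6 statistics = (30,)
```

The exhaustive search in the paper then evaluates every non-empty subset of the six statistics (2^6 - 1 = 63 combinations) with the ANN classifier to find the best-performing feature combination.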