Abstract
Current tests of whether a person can understand speech require behavioral responses from that person, which in practice cannot always be obtained (e.g., from young children). There is therefore a need for objective measures of speech intelligibility. Recently, it has been shown that speech intelligibility can be measured by letting a person listen to natural speech, recording the electroencephalogram (EEG), and decoding the speech envelope from the EEG signal. The decoders used so far are linear, which is sub-optimal: the human brain is a complex non-linear system that a linear decoder cannot easily model. We therefore propose an approach based on deep learning, which can model complex non-linear relationships. Our approach uses dilated convolutions, as in WaveNet, to maximize the receptive field relative to the number of tunable parameters. Comparison with a model based on a state-of-the-art linear decoder and with a convolutional baseline model shows that our proposed model significantly improves on both (from 62.3% to 90.6% (p < 0.001) and from 78.8% to 90.6% (p < 0.001), respectively). The best results are achieved with a receptive field of 250 to 500 ms, which is longer than the optimal integration window for a linear decoder.
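To illustrate why dilation enlarges the receptive field without adding parameters, the following minimal sketch compares a stack of dilated 1-D convolutions with an undilated stack of the same depth. The kernel size, number of layers, channel count, dilation schedule, and 64 Hz sampling rate are assumptions chosen for illustration; the abstract does not specify them, and the helper names are hypothetical.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field, in samples, of stacked stride-1 1-D convolutions."""
    rf = 1
    for d in dilations:
        # Each layer extends the field by (kernel_size - 1) * dilation samples.
        rf += (kernel_size - 1) * d
    return rf


def weight_count(kernel_size, n_layers, channels):
    """Convolution weights (biases ignored); independent of the dilation."""
    return n_layers * kernel_size * channels * channels


# All values below are illustrative assumptions, not taken from the abstract.
kernel_size = 3
n_layers = 3
channels = 16
sample_rate_hz = 64

# WaveNet-style exponentially growing dilations: 1, 3, 9.
dilated = [kernel_size ** i for i in range(n_layers)]
plain = [1] * n_layers

rf_dilated = receptive_field(kernel_size, dilated)
rf_plain = receptive_field(kernel_size, plain)
params = weight_count(kernel_size, n_layers, channels)

print(f"weights in either stack: {params}")
print(f"dilated: {rf_dilated} samples = {1000 * rf_dilated / sample_rate_hz:.1f} ms")
print(f"plain:   {rf_plain} samples = {1000 * rf_plain / sample_rate_hz:.1f} ms")
```

Under these assumed settings, both stacks have the same number of weights, but the dilated stack covers 27 samples (about 422 ms at 64 Hz) against 7 samples (about 109 ms) for the undilated one.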