Abstract

Auditory stimulus reconstruction is a technique that predicts sound from neural activity. Decoding speech from the human brain with neural networks allows us to understand auditory information processing and to study hearing abilities effectively. However, current approaches produce low-quality speech reconstructions, which limits practical applications such as evaluating the hearing characteristics of individuals or preserving function during neurosurgical procedures. To tackle this issue, we used a self-attention (SA) module, a mechanism that has shown promising advances in deep neural network studies. We investigated how including neural activity over a longer time span affects reconstruction accuracy, and how SA modules learn. Our results showed that SA modules attending to the presence or absence of auditory stimuli exploited relationships between features over a longer period (over 1000 ms) and produced better reconstructions than a convolutional neural network (CNN) model using features over a shorter period (300 ms) and a multilayer perceptron (MLP) model that did not use such temporal relationships. These findings imply that the long-range, sparse temporal relationships in neural activity captured by SA modules improve reconstruction performance.
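
To make the mechanism concrete: a minimal sketch of a scaled dot-product self-attention module over a window of neural-activity features is shown below. This is an illustration of the general SA technique, not the authors' actual architecture; all class names, dimensions, and the PyTorch framing are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SelfAttention(nn.Module):
    """Minimal scaled dot-product self-attention over a time series.

    Input:  (batch, time, features), e.g. windows of neural activity.
    Output: same shape; each time step is re-expressed as a weighted
    sum over all time steps, so a single layer can relate features
    across the whole window (e.g. spans longer than 1000 ms), unlike
    a CNN whose receptive field is limited by its kernel size.
    """

    def __init__(self, dim: int):
        super().__init__()
        self.query = nn.Linear(dim, dim)
        self.key = nn.Linear(dim, dim)
        self.value = nn.Linear(dim, dim)
        self.scale = dim ** -0.5  # 1/sqrt(d), the standard scaling

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        q, k, v = self.query(x), self.key(x), self.value(x)
        # (batch, time, time): how strongly each time step attends to every other
        attn = F.softmax(q @ k.transpose(-2, -1) * self.scale, dim=-1)
        return attn @ v

# Example: one window of 100 time steps x 64 neural features (hypothetical sizes).
x = torch.randn(1, 100, 64)
out = SelfAttention(64)(x)
print(out.shape)  # torch.Size([1, 100, 64])
```

The attention matrix computed inside `forward` is what lets such a module pick out sparse, long-range relationships between time steps, which is the property the abstract credits for the improved reconstructions.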
