Sound Decoding from Auditory Nerve Activity

Marek Rudnicki 1,2*, Marcelo K. Zuffo 2 and Werner Hemmert 1

1 Technische Universität München, Electrical Engineering and Information Technology, Germany
2 University of São Paulo, Brazil

Abstract

In the inner ear, sounds are converted into discrete action potentials and sent to the central nervous system. This transformation is non-linear and results in massive information loss. Nevertheless, we can still hear and analyze sounds with high fidelity, because the crucial features of the sounds are preserved in the auditory nerve signals. Here we present a method that decodes sounds from a large population of simulated auditory nerve fibers (ANFs). We also use the procedure to reconstruct sounds from a model of an impaired cochlea; in this way we can mimic how hearing-impaired subjects perceive sounds.

The problem of reconstructing stimuli from neural activity is usually approached with relatively small numbers of spikes, and procedures typically rely on the optimization of a linear filter using the reverse-correlation technique (Bialek et al., 1991). Our approach leverages the responses of a large population of ANFs (close to the number present in the human ear) and non-linear reconstruction with an artificial neural network (ANN). We used the biophysical model of the auditory periphery by Zilany et al. (2009), which we adapted to replicate the human hearing range and thresholds.

The ANN was a multi-layer perceptron (MLP) with a single hidden layer. Its input was a 10 ms sliding window of multiple spike trains from 10 different characteristic frequencies; its output was a single sample of the reconstructed signal. We trained and tested this MLP with sounds below 2 kHz. The approach did not work for frequencies above 2 kHz, because the spike trains lack phase information (phase locking) at those frequencies.

We therefore developed a two-stage algorithm to reconstruct high-frequency signals: first, spike trains were converted into a spectrogram by MLPs; second, the spectrogram was transformed into an acoustic signal using an iterative method (Decorsiere et al., 2011). To convert spike trains into a spectrogram, we trained 51 MLPs. The input to each MLP was a 5 ms sliding window of multiple spike trains, and the output was one of 51 frequency channels of the spectrogram. The characteristic frequencies of the input fibers corresponded to the frequency of the generated output channel. The system was trained with pure tones and a few seconds of speech samples.

After training, we were able to generate sound files from the action potentials of a large population of nerve fibers. The reconstructed speech was clearly intelligible. In addition, we reconstructed sounds from an impaired cochlear model and demonstrated how perception is degraded by outer hair cell loss. Our reconstruction is a valuable tool for evaluating how well speech is encoded in different models of the auditory system, and it can also be used to illustrate acoustically the effects of hearing loss.
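For illustration, the following is a minimal sketch of the kind of direct low-frequency decoder described in the abstract: a single-hidden-layer MLP that maps a 10 ms sliding window of spike trains (binned at the audio sampling rate, 10 characteristic-frequency channels) to one waveform sample. All names, data shapes, the window alignment, and the use of scikit-learn are assumptions made for this sketch, not the authors' implementation.

```python
# Hypothetical sketch of the direct waveform decoder (not the authors' code).
import numpy as np
from sklearn.neural_network import MLPRegressor

FS = 16000                     # assumed audio sampling rate (Hz)
WIN = int(0.010 * FS)          # 10 ms sliding window, as described in the abstract
N_CF = 10                      # 10 characteristic-frequency input channels

def make_dataset(spikes, signal):
    """spikes: (n_samples, N_CF) binned spike counts; signal: (n_samples,) waveform.
    Each example is a flattened 10 ms spike window; the target is the waveform
    sample at the end of the window (this alignment is an assumption)."""
    X, y = [], []
    for t in range(WIN, len(signal)):
        X.append(spikes[t - WIN:t].ravel())   # flatten window across all fibers
        y.append(signal[t])
    return np.asarray(X), np.asarray(y)

# Toy random data stands in for simulated auditory-nerve responses and the stimulus.
rng = np.random.default_rng(0)
spikes = rng.poisson(0.1, size=(2000, N_CF)).astype(float)
signal = rng.standard_normal(2000) * 0.01

X, y = make_dataset(spikes, signal)
mlp = MLPRegressor(hidden_layer_sizes=(64,), max_iter=50)   # single hidden layer
mlp.fit(X, y)
reconstruction = mlp.predict(X)   # one decoded sample per window position
```

Sliding the window sample by sample yields one output value per time step, so concatenating the predictions gives the reconstructed low-frequency waveform.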

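The two-stage high-frequency pipeline can be sketched in a similarly hedged way: one MLP per spectrogram channel (51 in total), followed by an iterative inversion of the spectrogram back to a waveform. Griffin-Lim (via librosa) is used here only as a readily available stand-in for the Decorsiere et al. (2011) inversion method, and the channel-specific selection of input fibers as well as all parameter values are assumptions.

```python
# Hypothetical sketch of the two-stage reconstruction (spikes -> spectrogram -> audio).
import numpy as np
import librosa
from sklearn.neural_network import MLPRegressor

N_CHANNELS = 51          # 51 spectrogram frequency channels, as in the abstract
N_FFT = 100              # chosen so that 1 + N_FFT // 2 == 51 (assumption)
HOP = 80                 # 5 ms hop at an assumed 16 kHz sampling rate

def train_channel_mlps(spike_windows, target_spectrogram):
    """spike_windows: (n_frames, n_features), each row a flattened 5 ms spike window;
    target_spectrogram: (N_CHANNELS, n_frames) magnitude spectrogram of the stimulus.
    One MLP per output channel; in the abstract each channel is driven by fibers with
    matching characteristic frequencies (that input selection is omitted here)."""
    mlps = []
    for ch in range(N_CHANNELS):
        mlp = MLPRegressor(hidden_layer_sizes=(32,), max_iter=200)
        mlp.fit(spike_windows, target_spectrogram[ch])
        mlps.append(mlp)
    return mlps

def decode(mlps, spike_windows):
    # Stage 1: spike trains -> predicted magnitude spectrogram (51 x n_frames).
    spec = np.vstack([m.predict(spike_windows) for m in mlps])
    spec = np.maximum(spec, 0.0)          # magnitudes must be non-negative
    # Stage 2: iterative spectrogram inversion back to a waveform
    # (Griffin-Lim as a stand-in for the Decorsiere et al. 2011 method).
    return librosa.griffinlim(spec, n_iter=60, hop_length=HOP, n_fft=N_FFT)
```

Usage would follow the description in the abstract: train the 51 MLPs on pure tones and speech (train_channel_mlps), then call decode on spike windows from new stimuli, including those generated by the impaired cochlear model.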