Abstract

The retina, which captures and preprocesses almost all of the visual information reaching the brain, is a far more sophisticated neural system than commonly assumed. The central goal of neural coding is to find the relationship between a cause (e.g., a visual stimulus) and its effect (i.e., the neural response). Neural system identification aims to open this computational black box by learning the stimulus-response relationship. Traditional approaches to retinal system identification analyze neural responses to artificial stimuli using models composed of predefined components. Such model designs are constrained by prior knowledge, and artificial stimuli are far simpler than the stimuli the retina actually processes. Deep neural networks, currently the most successful predictive models, have demonstrated powerful computational abilities in neural coding. In this study, to fill the gap of an explainable model that reveals how populations of neurons work together to encode a larger field of dynamic natural scenes, we used a deep learning model to identify the computational elements of the retinal neural system that contribute to learning the dynamics of natural visual scenes. Specifically, the proposed model can separate intricate spatiotemporal patterns by leveraging the building blocks of the neural network model. Using physiological data recorded with sequential natural scene inputs, we verify that the recurrent connection plays a key role in encoding complex dynamic visual scenes while learning the biological computational underpinnings of the retinal circuit. Our model subsequently learns both the shapes and the locations of the spatiotemporal receptive fields of ganglion cells. Moreover, we propose a method for evaluating the importance of the kernels learned by the network model in encoding dynamic visual scenes, which allows us to disentangle the dynamic computational structure of the retina.
These results provide new insights into the encoding mechanisms of retinal neurons and can inform model design in machine learning for computational vision involving complex dynamic visual scenes.
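The encoding pipeline the abstract describes (spatial filtering by learned kernels, recurrent temporal dynamics, and a rectifying nonlinearity producing firing rates) can be illustrated with a minimal sketch. This is not the paper's model; all dimensions, names, and the leaky-recurrence form are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions, not taken from the paper
T, H, W = 50, 8, 8   # time steps; stimulus height and width
K = 4                # number of spatial kernels (stand-ins for receptive fields)

stimulus = rng.standard_normal((T, H, W))        # dynamic visual input
kernels = rng.standard_normal((K, H, W)) * 0.1   # learned spatial kernels

def encode(stimulus, kernels, tau=0.8):
    """Sketch of a convolutional-recurrent encoder: spatial filtering,
    leaky recurrent accumulation over time, then rectification."""
    T = stimulus.shape[0]
    K = kernels.shape[0]
    rates = np.zeros((T, K))
    state = np.zeros(K)
    for t in range(T):
        # Project the current frame onto each spatial kernel
        drive = np.tensordot(kernels, stimulus[t], axes=([1, 2], [0, 1]))
        # Recurrent (leaky) temporal dynamics: memory of past drive
        state = tau * state + (1 - tau) * drive
        # Rectifying nonlinearity yields non-negative firing rates
        rates[t] = np.maximum(state, 0.0)
    return rates

rates = encode(stimulus, kernels)
print(rates.shape)  # (50, 4): a rate for each kernel at each time step
```

Setting `tau=0` removes the temporal memory, which is the kind of ablation one might use to probe how much the recurrent connection contributes to encoding dynamic scenes.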
