Abstract

Sparse coding models of natural images and sounds have been able to predict several response properties of neurons in the visual and auditory systems. While the success of these models suggests that the structure they capture is universal across domains to some degree, it is not yet clear which aspects of this structure are universal and which vary across sensory modalities. To address this, we fit complete and highly overcomplete sparse coding models to natural images and spectrograms of speech and report on differences in the statistics learned by these models. We find several types of sparse features in natural images, which all appear in similar, approximately Laplace distributions, whereas the many types of sparse features in speech exhibit a broad range of sparse distributions, many of which are highly asymmetric. Moreover, individual sparse coding units tend to exhibit higher lifetime sparseness for overcomplete models trained on images compared to those trained on speech. Conversely, population sparseness tends to be greater for these networks trained on speech compared with sparse coding models of natural images. To illustrate the relevance of these findings to neural coding, we studied how they impact a biologically plausible sparse coding network's representations in each sensory modality. In particular, a sparse coding network with synaptically local plasticity rules learns different sparse features from speech data than are found by more conventional sparse coding algorithms, but the learned features are qualitatively the same for these models when trained on natural images.
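The abstract contrasts lifetime sparseness (how rarely a single unit is active across stimuli) with population sparseness (how few units are active for a single stimulus). The text here does not say which sparseness measure the authors used, so the sketch below assumes the Treves-Rolls/Vinje-Gallant index purely for illustration. The function names and the `(n_stimuli, n_units)` array layout are hypothetical choices, not taken from the paper.

```python
import numpy as np


def sparseness_index(activity, axis):
    """Treves-Rolls/Vinje-Gallant sparseness along one axis (assumed measure).

    Returns 0 when all entries along `axis` have equal magnitude and 1 when
    exactly one entry is nonzero.
    """
    a = np.abs(np.asarray(activity, dtype=float))
    n = a.shape[axis]
    mean_sq = np.mean(a ** 2, axis=axis)
    sq_mean = np.mean(a, axis=axis) ** 2
    # Silent units/stimuli (all zeros) get ratio 1, i.e. sparseness 0,
    # instead of a division-by-zero NaN. This is a convention, not a result.
    ratio = np.divide(sq_mean, mean_sq, out=np.ones_like(mean_sq),
                      where=mean_sq > 0)
    return (1.0 - ratio) / (1.0 - 1.0 / n)


def lifetime_sparseness(activity):
    """One value per unit: sparseness of its responses across stimuli."""
    return sparseness_index(activity, axis=0)


def population_sparseness(activity):
    """One value per stimulus: sparseness of the response across units."""
    return sparseness_index(activity, axis=1)


if __name__ == "__main__":
    # Synthetic Laplace-distributed coefficients stand in for the output of
    # a sparse coding model; they are not data from the paper.
    rng = np.random.default_rng(0)
    coeffs = rng.laplace(size=(1000, 64))  # (n_stimuli, n_units)
    print("mean lifetime sparseness:", lifetime_sparseness(coeffs).mean())
    print("mean population sparseness:", population_sparseness(coeffs).mean())
```

With a coefficient matrix from models trained on images and on speech, comparing the two averages per modality would be one way to probe the contrast described above, although other measures such as kurtosis could give different numbers.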

Highlights

  • An important goal of systems neuroscience is to discover and understand the principles that might govern sensory processing in the brain.

  • Natural visual scenes can be well-represented by sparse distributions (Field, 1987), and coding strategies optimized for sparseness find local, oriented, bandpass features that match the receptive fields of simple cells in primary visual cortex (V1) (Olshausen and Field, 1996; Bell and Sejnowski, 1997; Rehn and Sommer, 2007; Rozell et al, 2008; Zylberberg et al, 2011).

  • We further demonstrate that the differences we find between the sparse structure of speech and that of images have significant consequences for coding schemes used to process these types of data, and for neural models of vision and audition.

Summary

Introduction

An important goal of systems neuroscience is to discover and understand the principles that might govern sensory processing in the brain. Several principles have been proposed, such as reducing redundancy between neurons (Attneave, 1954; Barlow, 1961; Daugman, 1989; Atick and Redlich, 1992; Chechik et al, 2006), representing statistical dependencies between objects and events to guide action (Barlow, 2001), minimizing expended energy (Laughlin, 2001), maximizing entropy (Schneidman et al, 2006), and maximizing transmitted information (Laughlin, 1981; Bell and Sejnowski, 1995; DeWeese, 1996; Rieke et al, 1997; Hyvärinen and Hoyer, 2001; Karklin and Simoncelli, 2011). Each of these principles suggests that sensory systems should use the statistical structure of sensory data from the animal’s environment to efficiently represent and process that data. A sparse coding model of natural images exhibits many of the non-classical receptive field effects found in V1 neurons in addition to learning similar classical receptive fields (Zhu and Rozell, 2013).

