Deep generative learning of location-invariant visual word recognition

Maria Grazia Di Bono, Marco Zorzi

doi:10.3389/fpsyg.2013.00635

Abstract

It is widely believed that orthographic processing implies an approximate, flexible coding of letter position, as shown by relative-position and transposition priming effects in visual word recognition. These findings have inspired alternative proposals about the representation of letter position, ranging from noisy coding across the ordinal positions to relative position coding based on open bigrams. This debate can be cast within the broader problem of learning location-invariant representations of written words, that is, a coding scheme abstracting the identity and position of letters (and combinations of letters) from their eye-centered (i.e., retinal) locations. We asked whether location-invariance would emerge from deep unsupervised learning on letter strings and what type of intermediate coding would emerge in the resulting hierarchical generative model. We trained a deep network with three hidden layers on an artificial dataset of letter strings presented at five possible retinal locations. Though word-level information (i.e., word identity) was never provided to the network during training, linear decoding from the activity of the deepest hidden layer yielded near-perfect accuracy in location-invariant word recognition. Conversely, decoding from lower layers yielded a large number of transposition errors. Analyses of emergent internal representations showed that word selectivity and location invariance increased as a function of layer depth. Word-tuning and location-invariance were found at the level of single neurons, but there was no evidence for bigram coding. Finally, the distributed internal representation of words at the deepest layer showed higher similarity to the representation elicited by the two exterior letters than by other combinations of two contiguous letters, in agreement with the hypothesis that word edges have special status. These results reveal that the efficient coding of written words, which was the model's learning objective, is largely based on letter-level information.
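The location-invariance problem described in the abstract can be illustrated with a minimal sketch of a retinotopic input code. Only the five retinal locations come from the abstract; the 26-letter alphabet, the word length of four, the eight letter slots, and the function name `encode_retinotopic` are assumptions made for illustration and are not taken from the authors' dataset or implementation.

```python
import numpy as np

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"    # assumed alphabet
WORD_LEN = 4                               # assumed word length
N_LOCATIONS = 5                            # from the abstract
N_SLOTS = WORD_LEN + N_LOCATIONS - 1       # assumed number of retinal letter slots


def encode_retinotopic(word, location):
    """Return a flat one-hot vector with `word` placed at a retinal `location`.

    Hypothetical helper: each retinal slot holds a one-hot letter code,
    and the word occupies slots location .. location + WORD_LEN - 1.
    """
    if len(word) != WORD_LEN:
        raise ValueError(f"expected a {WORD_LEN}-letter string")
    if not 0 <= location < N_LOCATIONS:
        raise ValueError(f"location must be in 0..{N_LOCATIONS - 1}")
    grid = np.zeros((N_SLOTS, len(ALPHABET)))
    for offset, letter in enumerate(word.upper()):
        grid[location + offset, ALPHABET.index(letter)] = 1.0
    return grid.ravel()


# The same word at two adjacent locations activates disjoint input units.
cart_at_0 = encode_retinotopic("CART", 0)
cart_at_1 = encode_retinotopic("CART", 1)
shared_units = float(cart_at_0 @ cart_at_1)
print(shared_units)
```

Under these assumptions the two input vectors for the same word share no active units, which is why a location-invariant word representation has to be learned rather than read directly off the retinal code.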

Highlights

Visual word recognition and reading aloud is one of the cognitive domains where connectionist modeling has achieved its greatest success.
The issue of how location-invariance might be computed from the native retinotopic code has recently attracted much interest (Dehaene et al., 2005; Dandurand et al., 2010; Hannagan et al., 2011), because it is closely tied to a lively debate on the nature of orthographic coding and on the coding of letter position during visual word recognition (e.g., Whitney, 2001; Grainger and van Heuven, 2003; Davis and Bowers, 2006; Gomez et al., 2008; Davis, 2010; Grainger and Ziegler, 2011).
Models of orthographic coding (e.g., Grainger and van Heuven, 2003; Grainger and Whitney, 2004; Gomez et al., 2008; Davis, 2010) share the assumption that visual word recognition is performed through the processing of constituent letters, but they differ on how letter position information is coded and on whether the mapping between location-specific letter coding and location-invariant word representations requires the computation of an intermediate orthographic code, such as open bigrams.

Summary

Introduction

Visual word recognition and reading aloud is one of the cognitive domains where connectionist modeling has achieved its greatest success. Most models stipulate that the identity and position of individual letters are coded in a way that is abstracted from the retinal input, both in terms of shape and in terms of spatial location with respect to eye fixation. The latter assumption implies a location-invariant, word-centered representation, with letters aligned according to a fixed template (e.g., left-justified slot-based coding). The issue of how location-invariance might be computed from the native retinotopic (eye-centered) code has recently attracted much interest (Dehaene et al., 2005; Dandurand et al., 2010; Hannagan et al., 2011), because it is closely tied to a lively debate on the nature of orthographic coding and on the coding of letter position during visual word recognition (e.g., Whitney, 2001; Grainger and van Heuven, 2003; Davis and Bowers, 2006; Gomez et al., 2008; Davis, 2010; Grainger and Ziegler, 2011).
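To make the debate on letter-position coding more concrete, the sketch below contrasts a left-justified slot code with an open-bigram code on a transposed-letter pair. It is an illustration under stated assumptions, not the scheme of any specific cited model: the open-bigram code here is unconstrained (every ordered letter pair, with no limit on the gap between letters), similarity is measured as Jaccard overlap, and the function names are hypothetical.

```python
from itertools import combinations


def slot_code(word):
    """Left-justified slot coding: one (position, letter) unit per letter."""
    return {(position, letter) for position, letter in enumerate(word)}


def open_bigrams(word):
    """Unconstrained open bigrams: every ordered pair of letters in the word."""
    return {first + second for first, second in combinations(word, 2)}


def jaccard(units_a, units_b):
    """Shared units divided by all units active in either code."""
    return len(units_a & units_b) / len(units_a | units_b)


# Transposing the two inner letters of CART gives CRAT.
slot_similarity = jaccard(slot_code("CART"), slot_code("CRAT"))
bigram_similarity = jaccard(open_bigrams("CART"), open_bigrams("CRAT"))
print(slot_similarity, bigram_similarity)
```

For this pair the slot codes share two of six units (1/3), whereas the open-bigram codes share five of seven (5/7). A relative-position code therefore treats a transposed-letter string as much closer to its base word, which is the kind of flexibility that transposition priming effects are taken to reflect.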

Journal: Frontiers in Psychology
Publication Date: Jan 1, 2013
Citations: 26
License type: cc-by
