Abstract

Statistical regularities in the environment impact cognition across domains. In semantics, distributional approaches posit that similarity between words can be derived from regularities in the contexts in which they appear. Here, we study how regularities in written text impact readers' knowledge about orthography: Can similarity between characters be learned from the written environment? Adapting methods from distributional semantics, we model the contextual similarity among alphanumeric characters in a large text corpus. We find modest correlations between model-derived similarities and similarities derived from a behavioral experiment. Beyond this result, model-derived similarity from neural embedding models captures key aspects of orthographic knowledge, such as case, letter identity, and consonant–vowel status. We conclude that the text environment contains regularities that are relevant to readers and that statistical learning is a promising way for this information to be acquired. More broadly, our results imply that statistical regularities are relevant not only at the level of word semantics but also at the level of individual written characters.
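The core distributional idea can be sketched in a few lines: represent each character by the counts of characters that co-occur with it within a small window, then compare characters by cosine similarity. This is a hypothetical illustration of the general technique, not the paper's actual pipeline (which uses a large corpus and neural embedding models); the window size and toy corpus here are arbitrary assumptions.

```python
from collections import defaultdict
from math import sqrt

def char_vectors(text, window=2):
    """For each character, count the characters occurring within `window`
    positions of it (a simple co-occurrence vector, stored sparsely)."""
    vecs = defaultdict(lambda: defaultdict(int))
    for i, ch in enumerate(text):
        for j in range(max(0, i - window), min(len(text), i + window + 1)):
            if j != i:
                vecs[ch][text[j]] += 1
    return vecs

def cosine(u, v):
    """Cosine similarity between two sparse count vectors (dicts)."""
    dot = sum(u[k] * v.get(k, 0) for k in u)
    nu = sqrt(sum(x * x for x in u.values()))
    nv = sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

# Toy corpus for illustration only; real estimates require large text samples.
corpus = "the quick brown fox jumps over the lazy dog"
vecs = char_vectors(corpus)
print(cosine(vecs["o"], vecs["e"]))
```

On realistic amounts of text, characters that share contexts (e.g., vowels, or upper- and lowercase variants of the same letter) tend to receive more similar vectors than arbitrary character pairs, which is the kind of structure the abstract reports.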
