Abstract

Recent progress in machine-learning-based distributed semantic models (DSMs) offers new ways to simulate the apperceptive mass (AM; Kintsch, 1980) of reader groups or individual readers and to predict their performance in reading-related tasks. The AM integrates the mental lexicon with world knowledge acquired, for example, through reading books. Following pioneering work by Denhière and Lemaire (2004), we computed DSMs based on a representative corpus of German children's and youth literature (Jacobs et al., 2020) as null models of the part of the AM that represents distributional semantic input, for readers of different reading ages (grades 1–2, 3–4, and 5–6). After a series of DSM quality tests, we evaluated the performance of these models quantitatively in various tasks to simulate the different reader groups' hypothetical semantic and syntactic skills. In a final study, we compared the models' performance with that of adult and child readers in two rating tasks. Overall, the results show that performance improves with increasing reading age in practically all tasks. The approach taken in these studies reveals the limits of DSMs for simulating human AM and their potential for applications in scientific studies of literature, research in education, or developmental science.

Highlights

  • After a series of quality tests of the distributed semantic (vector space) models (DSMs), we evaluated the performance of these models quantitatively in various tasks to simulate the different reader groups’ hypothetical semantic and syntactic skills

  • For a hypothetical child who has read only one book, this singular reading education will, on top of the child’s biosociocultural development, have measurable consequences for the child’s thoughts, feelings, and behavior. Such consequences can be assessed by various tests, and the child’s performance in these tests can be predicted via quantitative narrative and advanced sentiment analysis of the only text it knows. This is what this paper is about: using distributed semantic models (DSMs) trained on representative book corpora as potent null models of an important part of human semantic memory, or as Kintsch (1980) preferred to call it, the apperceptive mass (AM) of readers. This term highlights the integration into semantic memory of world knowledge acquired, for example, through reading books

  • We computed DSMs based on a representative corpus of German children’s and youth literature as null models of the AM for readers of different reading ages

Summary

INTRODUCTION

Imagine a child who has read only one book, the Bible. On top of the child’s biosociocultural development, this hypothetical singular reading education will have measurable consequences for the child’s thoughts (e.g., concrete and abstract concepts), feelings (e.g., basic and mixed emotions), and behavior (e.g., communication). Such consequences can be assessed by various tests (e.g., active/passive vocabulary, semantic arithmetic, and analogical reasoning), and the child’s performance in these tests can be predicted via quantitative narrative and advanced sentiment analysis of the only text it knows. This is what this paper is about: using distributed semantic (vector space) models (DSMs) trained on representative book corpora as potent null models of an important part of human semantic memory, or as Kintsch (1980) preferred to call it, the apperceptive mass (AM) of readers. This term highlights the integration into semantic memory of world knowledge acquired, for example, through reading books. Having given an idea of what quantitative narrative and advanced sentiment analysis can be used for when analyzing hypothetical individual readers, we propose ways to examine the quality of more general DSMs, a necessary condition for using them as predictive models of reader group behavior.
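
To make the idea of testing a DSM on semantic arithmetic and analogical reasoning more concrete, the following minimal sketch shows how a vector space model can answer an analogy question of the form "a is to b as c is to ?". It is an illustration only and not the procedure used in the studies: the three-dimensional vectors, the English example words, and the names TOY_VECTORS, cosine, and solve_analogy are all assumptions made for this example. A trained DSM would supply vectors with many more dimensions, learned from the book corpus.

```python
import math

# Toy 3-dimensional word vectors. The values are invented for illustration
# and do not come from the corpus or the models described in the text.
TOY_VECTORS = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "child": [0.2, 0.5, 0.5],
}


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def solve_analogy(a, b, c, vectors):
    """Return the word d that best completes 'a is to b as c is to d'.

    The target vector is b - a + c (semantic arithmetic); the answer is
    the remaining vocabulary word whose vector is closest to the target
    by cosine similarity.
    """
    target = [
        vb - va + vc
        for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])
    ]
    # Exclude the three query words so they cannot be returned as the answer.
    candidates = {w: v for w, v in vectors.items() if w not in (a, b, c)}
    return max(candidates, key=lambda w: cosine(target, candidates[w]))


if __name__ == "__main__":
    print(solve_analogy("man", "king", "woman", TOY_VECTORS))
```

Under these toy values, the query "man is to king as woman is to ?" selects "queen". With a model trained on a corpus for a given reading age, the proportion of such items answered correctly could serve as one quantitative score of the simulated reader group's semantic skills.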

BOOK CORPORA AS READER GROUP MODELS
EMOTION CONCEPTS
MORPHOSYNTACTIC TESTS
Findings
DISCUSSION, LIMITATIONS, AND OUTLOOK