Humans monitor learning progress in curiosity-driven exploration

Alexandr Ten, Pierre-Yves Oudeyer, Pramod Kaushik, Jacqueline Gottlieb

doi:10.1038/s41467-021-26196-w

Abstract

Curiosity-driven learning is foundational to human cognition. By enabling humans to autonomously decide when and what to learn, curiosity has been argued to be crucial for self-organizing temporally extended learning curricula. However, the mechanisms driving people to set intrinsic goals, when they are free to explore multiple learning activities, are still poorly understood. Computational theories propose different heuristics, including competence measures (e.g., percent correct) and learning progress, that could be used as intrinsic utility functions to efficiently organize exploration. Such intrinsic utilities constitute computationally cheap but smart heuristics to prevent people from laboring in vain on unlearnable activities, while still motivating them to self-challenge on difficult learnable activities. Here, we provide empirical evidence for these ideas by means of a free-choice experimental paradigm and computational modeling. We show that while humans rely on competence information to avoid easy tasks, models that include a learning-progress component provide the best fit to task selection data. These results bridge the research in artificial and biological curiosity, reveal strategies that are used by humans but have not been considered in computational research, and introduce tools for probing how humans become intrinsically motivated to learn and acquire interests and skills on extended time scales.

Highlights

Curiosity-driven learning is foundational to human cognition
Given the limited time and resources available for investigation, the learner must carefully select which activity to engage with to enable discovery. Formal treatments of this “strategic student” problem prescribe how learners should allocate study time to maximize learning across a set of activities[12,13], but show that the optimal allocation is very sensitive to the shape of the expected learning trajectory, which is not available to learners in practice[12].
We found no correlation between the wPC and wLP coefficients in either group (Pearson correlation of normalized coefficients, IG group: r(186) = −0.077, p = 0.298; EG group: r(175) = 0.062, p = 0.399; the normalization procedure is described in Methods, Computational modeling).

Summary

Introduction

Curiosity-driven learning is foundational to human cognition. By enabling humans to autonomously decide when and what to learn, curiosity has been argued to be crucial for self-organizing temporally extended learning curricula. We show that while humans rely on competence information to avoid easy tasks, models that include a learning-progress component provide the best fit to task selection data. These results bridge the research in artificial and biological curiosity, reveal strategies that are used by humans but have not been considered in computational research, and introduce tools for probing how humans become intrinsically motivated to learn and acquire interests and skills on extended time scales. Studies have shown that humans and other animals seek to obtain information as a good in itself, and that this preference is encoded in neural systems of reward and motivation, suggesting that information is rewarding independently of material gains[3,4,5,6]. While these findings tap into the intrinsic motivation behind curiosity, they have yet to capture the full scope of curiosity-driven investigations[7]. An expanding literature proposes that people prefer intermediate difficulty[16] in a range of conditions, including curiosity about trivia questions[5,17], choices among sensorimotor activities[18], infant attention[19] and esthetic appreciation[20,21].
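
To make the idea of an intrinsic utility that combines competence and learning progress more concrete, the following is a minimal illustrative sketch, not the fitted model described in the paper's Methods. It assumes that percent correct (PC) is the fraction of correct responses in a recent window, that learning progress (LP) is the absolute change in PC between the older and newer halves of that window, and that activity choice follows a softmax over a weighted sum of the two. The window length, the half-split, the temperature parameter, the weight values, and all function names are hypothetical choices made for illustration only.

```python
import math


def percent_correct(outcomes):
    """Fraction of correct responses (1 = correct, 0 = incorrect) in a window."""
    return sum(outcomes) / len(outcomes)


def learning_progress(outcomes):
    """Absolute change in PC between the older and newer halves of the window.

    Assumption: the half-split is an illustrative choice, not the paper's.
    """
    half = len(outcomes) // 2
    return abs(percent_correct(outcomes[half:]) - percent_correct(outcomes[:half]))


def choice_probabilities(histories, w_pc, w_lp, temperature=1.0):
    """Softmax over utilities U = w_pc * PC + w_lp * LP, one per activity."""
    utilities = [
        w_pc * percent_correct(h) + w_lp * learning_progress(h) for h in histories
    ]
    scaled = [u / temperature for u in utilities]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical recent outcomes for three activities (oldest trial first).
histories = [
    [1, 1, 1, 1, 1, 1, 1, 1],  # already mastered: high PC, no progress
    [0, 0, 1, 0, 1, 1, 1, 1],  # improving: moderate PC, large progress
    [0, 1, 0, 1, 1, 0, 1, 0],  # unlearnable: chance-level PC, no progress
]

# Illustrative weights: a negative w_pc steers choices away from easy
# activities, and a positive w_lp favors activities where performance changes.
probs = choice_probabilities(histories, w_pc=-1.0, w_lp=2.0)
```

Under these assumed weights, the improving activity receives the highest choice probability, because it is the only one with non-zero learning progress, while the mastered activity is penalized most by the negative competence weight.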

Methods

Results

Conclusion