Examining children's decisions to explore or exploit the environment provides a window into their developing metacognition and reflection capacities. Reinforcement learning, characterized by the balance between exploring new options (exploration) and utilizing known ones (exploitation), is central to this discussion. Children initially exhibit broad and intensive exploration, which gradually shifts toward exploitation as they grow. We review major theories and empirical findings, highlighting two main exploration strategies: random and directed. The former involves stochastic choices without considering information or rewards, while the latter is driven by reducing uncertainty for information gain. Behavioral tasks such as n-armed bandit, horizon, and patch foraging tasks are used to study these strategies. Findings on the n-armed bandit and horizon tasks showed mixed results on whether random exploration decreases over time. Directed exploration consistently decreases with age, but its emergence depends on task difficulty. In patch-foraging tasks, adults tend to overexploit (staying too long in one patch) and children overexplore (leaving too early), whereas adolescents display the most optimal balance. The paper also addresses open questions regarding the mechanisms supporting early exploration and the application of these strategies in real-life contexts like persistence. Future research should further investigate the relation between cognitive control, such as executive function and metacognition, and explore-exploit strategies, and examine their practical implications for adaptive learning and decision-making in children.
Read full abstract