Solving the Credit Assignment Problem: The Interaction of Explicit and Implicit Learning with Internal and External State Information

Wai-Tat Fu, John R. Anderson

doi:10.1184/r1/6618194.v1

Publication Date: Apr 17, 2017

Citations: 3

Affiliation: Carnegie Mellon University

Wai-Tat Fu (wfu@cmu.edu)
Human Factors Division and Beckman Institute
University of Illinois at Urbana-Champaign
1 Airport Road, Savoy, IL 61874, USA

John R. Anderson (ja+@cmu.edu)
Department of Psychology
Carnegie Mellon University
5000 Forbes Avenue, Pittsburgh, PA 15213, USA

Abstract

In most problem-solving activities, feedback is received at the end of an action sequence. This creates a credit-assignment problem in which the learner must associate the feedback with earlier actions, and the interdependencies among actions require the learner either to remember past choices of actions (internal state information) or to rely on cues in the environment (external state information) to select the right actions. We investigated the nature of explicit and implicit learning processes in the credit-assignment problem using a probabilistic sequential choice task with and without external state information. We found that when explicit memory encoding was dominant, subjects were faster to select the better option in their first choices than in their last choices; when implicit reinforcement learning was dominant, subjects were faster to select the better option in their last choices than in their first choices. However, implicit reinforcement learning was successful only when distinct external state information was available. The results suggest the nature of learning in credit assignment: an explicit memory encoding process that keeps track of internal state information and a reinforcement-learning process that uses state information to propagate reinforcement backwards to previous choices. However, the implicit reinforcement learning process is effective only when the valences can be attributed to the appropriate states in the system – either internally generated states in the cognitive system or externally presented stimuli in the environment.

Introduction

Consider a person navigating in a large office building. The person has to decide when to turn left or right at various hallway intersections. The decisions in the sequence are interdependent – e.g., turning left at a particular hallway intersection will affect the decisions at the next intersections. The person may therefore need to keep track of previous actions to inform what actions to take in the future. In reality, memory of previous actions (internal state information) may not be necessary, as people can explicitly seek information in the environment (external state information) to know where they are located or which direction to go to reach a destination (Fu & Gray, 2006). Learning to navigate is therefore likely to involve both the retention of internal state information (memory) and the recognition of external state information (signs on the walls). Indeed, many have argued that real-world skills often involve the interplay between cognition (internal), perception, and action (external), and that understanding these interactive skills requires careful study of how internal information (memory) and external information (cues in the environment) are processed during learning (Ballard, 1997; Fu & Gray, 2000; 2004; Gray & Fu, 2004; Larkin, 1989; Gray, Sims, Fu, & Schoelles, in press).

The navigation problem above is an example of one of the most difficult situations in skill learning: the learner has to perform a sequence of actions but gets feedback on their success only at the end of the sequence (e.g., when the destination is reached). This creates a credit-assignment problem, in which the learner has to assign credit to the earlier actions that are responsible for the eventual success. When actions are interdependent, either memory of previous actions or recognition of the correct problem state in the external environment is required to assign credit to the appropriate actions. In this article, we present results from an experiment in which we study how people learn to solve the credit-assignment problem in a simple but challenging example of such a situation.

Our focus is on the recent proposal that humans exhibit two distinct learning processes: an explicit process (with awareness) that requires memory for actions and outcomes, and an implicit process (without awareness) that does not require such memory. We apply this proposal to the learning of action sequences with delayed feedback. We will first review research in some related areas that informed the design of our experiment.

Explicit and Implicit Learning

Probability Learning and Classification

There have been numerous studies on the learning of the probabilistic relationship between choices and their consequences. The simplest situation is the probability-learning experiment, in which subjects guess which of the alternatives will occur and then receive feedback on their guesses (e.g., Estes, 1964). One robust finding is that subjects often “probability match”; that is, they choose a particular alternative with the same probability with which it is reinforced (e.g., Friedman et al., 1964). This leads many to propose that probability matching is the result of an implicit