Abstract

Domain-general learning rules often enable decision makers to learn from outcome feedback which actions tend to achieve a desired goal. However, in novel and complex environments decision makers must explore how to learn, i.e., acquire procedural knowledge of how to elicit and evaluate outcome feedback that will enable them to navigate toward a desired goal despite the vastness of the set of possible policies. Using a dynamic business simulation, this study investigated: (1) whether and how frequently participants discovered an effective procedure to learn from outcome feedback that allowed them to navigate toward a policy that maximizes long-term business profit (and hence their monetary payoff from the experiment), and (2) whether high monetary incentives affected learning procedures and performance. We found that a number of participants discovered an effective learning procedure and succeeded in approximating the optimal policy. In line with the heuristic method, this learning procedure involved a simplification of the search space and the application of domain-general learning rules to this simplified space. Although the decision histories of about half of the participants feature the key aspect of the effective learning procedure—search among the different steady states of the dynamical system—implementation errors prevented many of the participants from realizing the full potential of the learning procedure. We found no evidence to suggest that high monetary incentives affect the effectiveness of learning. Overall, the study illustrates that a “prepared mind” can discover new, effective learning procedures, although their initial implementation may require substantial refinement.

