Risk preferences of learning algorithms

Andreas Haupt, Aroon Narayanan

doi:10.1016/j.geb.2024.09.013

Abstract

Many economic decision-makers today rely on learning algorithms for important decisions. This paper shows that a widely used learning algorithm, ε-Greedy, exhibits emergent risk aversion, favoring actions with lower payoff variance. When presented with actions of the same expected payoff, under a wide range of conditions, ε-Greedy chooses the lower-variance action with probability approaching one. This emergent preference can have wide-ranging consequences, from inequity to homogenization, and holds transiently even when the higher-variance action has a strictly higher expected payoff. We discuss two methods to restore risk neutrality. The first method reweights data as a function of how likely an action is to be chosen. The second method employs optimistic payoff estimates for actions that have not been taken often.
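
To make the setting concrete, the following is a minimal simulation sketch, not the authors' code or experimental setup. It assumes a two-armed bandit in which a safe arm always pays 0 and a risky arm pays +1 or -1 with equal probability, so both have the same expected payoff. The constant exploration rate, the sample-mean estimates, the random tie-breaking, and the function names `epsilon_greedy_share` and `average_safe_share` are all illustrative assumptions; the conditions studied in the paper may differ.

```python
import random


def epsilon_greedy_share(steps=2000, epsilon=0.1, seed=0):
    """Run epsilon-Greedy on two arms with equal expected payoff (zero).

    Arm 0 is safe (always pays 0); arm 1 is risky (pays +1 or -1 with
    equal probability). Returns the fraction of steps on which the safe
    arm was chosen. All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    totals = [0.0, 0.0]  # sum of observed payoffs per arm
    counts = [0, 0]      # number of pulls per arm
    safe_pulls = 0
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(2)
        else:
            # Exploit: pick the arm with the highest sample mean
            # (0 if never pulled), breaking ties at random.
            means = [totals[a] / counts[a] if counts[a] else 0.0
                     for a in range(2)]
            arm = max(range(2), key=lambda a: (means[a], rng.random()))
        payoff = 0.0 if arm == 0 else rng.choice([-1.0, 1.0])
        totals[arm] += payoff
        counts[arm] += 1
        safe_pulls += (arm == 0)
    return safe_pulls / steps


def average_safe_share(runs=200, **kwargs):
    """Average the safe-arm share over independent runs (seeds 0..runs-1)."""
    return sum(epsilon_greedy_share(seed=s, **kwargs)
               for s in range(runs)) / runs


if __name__ == "__main__":
    print(f"average share of safe-arm pulls: {average_safe_share():.3f}")
```

A risk-neutral learner would be indifferent between the two arms here, so an average share well above one half would be in line with the emergent risk aversion described above. The exact value depends on the seed, the horizon, and the exploration rate.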
