There are rising concerns that reinforcement learning algorithms might learn tacit collusion in oligopolistic pricing, and moreover that the resulting ‘black box’ strategies would be difficult to regulate. Here, I exploit a strong connection between evolutionary game theory and reinforcement learning to show when the latter’s rest points are Bayes–Nash equilibria, and also to derive a system of Pigouvian taxes guaranteed to implement an (unknown) socially optimal outcome of an oligopoly pricing game. Finally, I illustrate reinforcement learning of equilibrium play via simulation. The simulations provide evidence that reinforcement learning algorithms can collude in a very simple setting, but that introducing the optimal tax scheme induces a competitive outcome.