Abstract

Models of learning and experimentation based on two-armed Poisson bandits have addressed several important aspects of strategic and motivational learning, but they are not suitable for studying effects that accumulate over time. We propose a new class of models of strategic experimentation that are almost as tractable as exponential models but incorporate realistic features such as dependence of the expected rate of news arrival on the time elapsed since the start of an experiment. In these models, the experiment is stopped before news is realized whenever the rate of arrival of news reaches a critical level. This leads to longer experimentation times for experiments with possible breakthroughs than for equivalent experiments with failures. In experimentation models with multiple players, either no player stops before the first failure is observed, or all players stop simultaneously before the first failure. We also demonstrate a crowding-out effect in models with profitable breakthroughs.

