Animals learn both whether and when a reward will occur. Neural models of timing posit that animals learn the mean time until reward, perturbed by a fixed relative uncertainty. Nonetheless, animals can learn to perform actions for reward even in highly variable natural environments. Optimal inference in the presence of variable information requires probabilistic models, yet it is unclear whether animals can infer such models for reward timing. Here, we developed a behavioral paradigm in which optimal performance required knowledge of the distribution from which reward delays were drawn. We found that mice accurately adjusted their behavior to the standard deviation (SD) of the reward delay distribution. Importantly, mice flexibly adjusted the amount of prior information used for inference according to the moment-by-moment demands of the task. The ability to infer probabilistic models for timing may allow mice to adapt to complex and dynamic natural environments.