Abstract

Simulation has been widely used in static system design, but it is rarely used for online decision making because of the time required to execute simulations. We consider a system with stochastic binary outcomes that can be predicted via a logistic model depending on scenarios and decisions. The goal is to identify all feasible decisions conditional on any online scenario. We propose to learn offline the relationship among scenarios, decisions, and binary outcomes. An information gradient (IG) policy is developed to sequentially allocate the offline simulation budget. We show that the maximum likelihood estimator produced by the IG policy is consistent and asymptotically normal. Numerical results on synthetic data and a case study demonstrate the superior performance of the IG policy over benchmark policies. Moreover, we find that the IG policy tends to sample locations near the boundaries of the design space because of their higher Fisher information, and that the time complexity of the IG policy is linear in the number of design points and in the simulation budget.
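To make the idea concrete, the following is a minimal sketch, not the paper's actual IG policy, of how a sequential budget-allocation rule based on Fisher information might look for a logistic model. Here the greedy D-optimality criterion, the candidate grid, and the names `fisher_info` and `ig_allocate` are all illustrative assumptions; the paper's policy and proofs are not reproduced.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fisher_info(x, theta):
    """Fisher information contribution of design point x under a
    logistic model with parameter theta: p(1-p) * x x^T."""
    p = sigmoid(x @ theta)
    return p * (1.0 - p) * np.outer(x, x)

def ig_allocate(candidates, theta_hat, budget):
    """Illustrative greedy rule: at each step, allocate one simulation
    to the candidate whose Fisher information contribution most
    increases log det of the accumulated information matrix
    (a D-optimality-style criterion, assumed here for the sketch)."""
    d = candidates.shape[1]
    M = 1e-6 * np.eye(d)  # small regularizer so log det is defined
    counts = np.zeros(len(candidates), dtype=int)
    for _ in range(budget):
        gains = [np.linalg.slogdet(M + fisher_info(x, theta_hat))[1]
                 for x in candidates]
        j = int(np.argmax(gains))
        counts[j] += 1
        M += fisher_info(candidates[j], theta_hat)
    return counts

# Toy usage: 1-D design points (with an intercept column) on a grid.
grid = np.linspace(-2.0, 2.0, 9)
X = np.column_stack([np.ones_like(grid), grid])
counts = ig_allocate(X, theta_hat=np.array([0.0, 1.0]), budget=50)
```

Because the per-point contribution scales with the outer product x x^T, boundary points of the grid tend to offer larger information gains in this sketch, which is consistent with the boundary-sampling behavior reported in the abstract.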
