Abstract

There is evidence that humans can be more efficient than existing algorithms at searching for good solutions in high-dimensional, non-convex design or control spaces, potentially due to our prior knowledge and learning capability. This work attempts to quantify the search strategy of human beings in order to enhance a Bayesian optimization (BO) algorithm for an optimal design and control problem. We model the sequence of human solutions as if it were generated by BO, and propose to recover the algorithmic parameters of BO by maximizing the likelihood of the observed solution path. The method differs from inverse reinforcement learning (where an optimal control policy is learned from human demonstrations) in that the latter requires near-optimal solutions from humans, whereas we only require the existence of a good search strategy. The method is first verified through simulation studies and then applied to human solutions crowdsourced through a gamification of the problem under study [1]. We learn BO parameters from a player with a demonstrated good search strategy and show that applying the BO algorithm with these parameters to the game noticeably improves the convergence of the search compared with a default BO setting.
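The abstract does not spell out the likelihood model, but the core idea, recovering BO parameters by maximizing the likelihood of an observed solution path, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the searcher soft-maximizes a GP-UCB acquisition function over a candidate grid, and recovers the GP kernel lengthscale `ell` and the UCB exploration weight `beta` by maximum likelihood. All names (`path_log_likelihood`, `gp_posterior`, the toy data) are hypothetical.

```python
# Sketch: recover BO parameters (kernel lengthscale, UCB weight) from an
# observed solution path via maximum likelihood. Assumption: each next
# query is a softmax choice over GP-UCB acquisition values on a grid.
import numpy as np
from scipy.optimize import minimize


def rbf_kernel(A, B, ell):
    # Squared-exponential kernel with unit signal variance.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * ell ** 2))


def gp_posterior(X, y, Xs, ell, noise=1e-4):
    # Standard GP posterior mean/variance at test points Xs.
    K = rbf_kernel(X, X, ell) + noise * np.eye(len(X))
    Ks = rbf_kernel(X, Xs, ell)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mu = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = 1.0 - np.sum(v ** 2, axis=0)  # prior variance is 1 for this kernel
    return mu, np.maximum(var, 1e-12)


def path_log_likelihood(params, X_path, y_path, candidates):
    # Log-likelihood of the observed query sequence under a softmax
    # choice model over GP-UCB values (an assumed model, not the
    # paper's exact likelihood).
    ell, beta = np.exp(params)  # optimize in log space for positivity
    ll = 0.0
    for t in range(1, len(X_path)):
        mu, var = gp_posterior(X_path[:t], y_path[:t], candidates, ell)
        acq = mu + beta * np.sqrt(var)
        logits = acq - acq.max()
        # Candidate closest to the searcher's actual t-th query.
        j = np.argmin(np.linalg.norm(candidates - X_path[t], axis=1))
        ll += logits[j] - np.log(np.exp(logits).sum())
    return ll


# Toy data: a short 1-D "human" search path and a candidate grid.
candidates = np.linspace(0.0, 1.0, 50)[:, None]
X_path = np.array([[0.1], [0.9], [0.55], [0.6]])
y_path = np.sin(3.0 * X_path[:, 0])

res = minimize(
    lambda p: -path_log_likelihood(p, X_path, y_path, candidates),
    x0=np.log([0.2, 1.0]), method="Nelder-Mead",
)
ell_hat, beta_hat = np.exp(res.x)
print(f"recovered lengthscale={ell_hat:.3f}, UCB weight={beta_hat:.3f}")
```

The recovered parameters could then be plugged back into a standard BO loop to run the search with the human-informed setting instead of a default one, which is the use the abstract describes.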
