Abstract
Strömbom et al. proposed an algorithm by which a sheepdog can guide a flock of sheep to a destination. Known as the Herding Algorithm, it models the sheepdog's behavior as two modes: “driving”, which guides the flock to the destination, and “collecting”, which brings scattered sheep together into one flock. Using this model, Go et al. showed that an agent (sheepdog) could herd a flock of sheep with an inference model generated by reinforcement learning (RL). However, in that previous study, RL learned only how to move to the positions at which the agent performs “driving” and “collecting”, and it did so in a discretized environmental state and action space. In this study, we assume a continuous environmental state and action space. We confirm that even when the agent's herding behavior itself is the learning target, the proposed inference model generated by deep RL can herd the sheep.
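The two modes described above could be sketched as follows. This is a minimal illustration of a hand-coded heuristic, not the authors' implementation and not the RL model: the function name `herding_target`, the parameter `r_a`, and the cohesion threshold and stand-off distances are all assumptions, loosely following the commonly cited form of the heuristic.

```python
import math


def herding_target(sheep, goal, r_a=2.0):
    """Return (mode, point): the mode name and the point the dog should move to.

    sheep: list of (x, y) sheep positions (assumed non-empty)
    goal:  (x, y) destination for the flock
    r_a:   assumed sheep-to-sheep interaction distance (hypothetical value)
    """
    n = len(sheep)
    # Center of mass of the flock.
    cx = sum(x for x, _ in sheep) / n
    cy = sum(y for _, y in sheep) / n

    # Sheep farthest from the center of mass.
    far = max(sheep, key=lambda p: math.hypot(p[0] - cx, p[1] - cy))
    far_dist = math.hypot(far[0] - cx, far[1] - cy)

    if far_dist <= r_a * n ** (2 / 3):
        # Flock is cohesive (assumed threshold): stand behind the flock,
        # on the side opposite the goal, and push it forward.
        mode, anchor, ref, offset = "driving", (cx, cy), goal, r_a * math.sqrt(n)
    else:
        # Flock is scattered: stand behind the farthest sheep,
        # on the side opposite the center of mass, and push it inward.
        mode, anchor, ref, offset = "collecting", far, (cx, cy), r_a

    # Unit vector pointing from the reference point toward the anchor.
    dx, dy = anchor[0] - ref[0], anchor[1] - ref[1]
    norm = math.hypot(dx, dy) or 1.0  # guard against a zero-length vector
    return mode, (anchor[0] + offset * dx / norm, anchor[1] + offset * dy / norm)
```

Under these assumptions, a tight flock yields a “driving” point on the far side of the flock from the goal, while a flock with one distant straggler yields a “collecting” point just beyond that straggler. A learned policy of the kind the study describes would replace this fixed rule with actions chosen from the continuous state.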