Abstract

This paper considers general repeated security games with no prior knowledge (i.e., the game payoffs and the attacker’s behavior model are unknown) and limited observability. In addition to the traditional “regret” criterion, “reallocation times” is introduced as a second criterion that provides a more comprehensive evaluation of defense strategies. For such games, a novel Random-Walk Perturbations with Uniform Exploration (RWP-UE) algorithm is proposed, and upper bounds on its expected regret and expected reallocation times are derived. Theoretical analysis shows that the RWP-UE algorithm achieves regret of the same order as existing results while requiring fewer reallocations. Experiments are carried out against four types of attackers, and the results illustrate that the RWP-UE algorithm achieves superior performance.
