Abstract

Advanced persistent threat (APT) attackers apply multiple sophisticated methods to continuously and stealthily steal information from the targeted cloud storage systems and can even induce the storage system to apply a specific defense strategy and attack it accordingly. In this paper, the interactions between an APT attacker and a defender allocating their central processing units (CPUs) over multiple storage devices in a cloud storage system are formulated as a Colonel Blotto game. The Nash equilibria of the CPU allocation game are derived for both symmetric and asymmetric CPUs between the APT attacker and the defender to evaluate how the limited CPU resources, the data storage size and the number of storage devices impact the expected data protection level and the utility of the cloud storage system. A CPU allocation scheme based on “hotbooting” policy hill-climbing that exploits the experiences in similar scenarios to initialize the quality values to accelerate the learning speed is proposed for the defender to achieve the optimal APT defense performance in the dynamic game without being aware of the APT attack model and the data storage model. A hotbooting deep ${Q}$ -network-based CPU allocation scheme further improves the APT detection performance for the case with a large number of CPUs and storage devices. Simulation results show that our proposed reinforcement learning-based CPU allocation can improve both the data protection level and the utility of the cloud storage system compared with the ${Q}$ -learning-based CPU allocation against APTs.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call