Abstract. Penetration testing is a crucial technique for identifying system vulnerabilities in the rapidly evolving field of cybersecurity, but traditional penetration testing methods require large amounts of manual labor and professional knowledge, which makes them difficult to scale. This paper proposes a highly automated and efficient penetration testing approach based on the MASK-SALT-DQN algorithm, a reinforcement learning model that optimizes penetration path planning through valid action masking and sample enhancement. Valid action masking filters redundant and invalid actions out of the solution space, which reduces the complexity of the action space and improves convergence speed and efficiency. Sample enhancement increases the frequency of critical exploitation actions and positive rewards in the sparse-reward environment, which speeds up the agent's learning process. Experiments on the NASim attack simulation network show that the proposed model has a significant performance advantage over baseline methods in large networks, indicating that it can be applied to scalable and efficient automated penetration testing.
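To make the idea of valid action masking concrete, the following is a minimal sketch of greedy action selection restricted to valid actions. It is an illustration under stated assumptions, not the MASK-SALT-DQN implementation: the function name `masked_greedy_action`, the example Q-values, and the mask are all hypothetical, and the abstract does not specify how the mask is computed.

```python
import math


def masked_greedy_action(q_values, valid_mask):
    """Return the index of the highest-valued action among the valid ones.

    q_values   : list of floats, one estimated Q-value per action (hypothetical).
    valid_mask : list of bools, True where the action is currently valid.
    """
    if len(q_values) != len(valid_mask):
        raise ValueError("q_values and valid_mask must have the same length")
    if not any(valid_mask):
        raise ValueError("at least one action must be valid")

    # Replace the Q-value of every invalid action with -inf so that
    # the arg-max below can never select it.
    masked_q = [q if ok else -math.inf for q, ok in zip(q_values, valid_mask)]
    return max(range(len(masked_q)), key=masked_q.__getitem__)


# Hypothetical example: action 1 has the largest Q-value but is invalid
# (for instance, an exploit against a host that is not yet reachable).
q_values = [0.2, 1.5, -0.3, 0.9]
valid_mask = [True, False, True, True]
action = masked_greedy_action(q_values, valid_mask)
```

In this sketch the mask only affects action selection; whether the mask is also applied when computing the bootstrapped target during training is a design choice the abstract does not describe.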