Abstract

Website hacking is a frequent attack type used by malicious actors to obtain confidential information, compromise the integrity of web pages, or make websites unavailable. The tools used by attackers are becoming increasingly automated and sophisticated, and malicious machine learning agents seem to be the next development in this line. To provide ethical hackers with similar tools, and to understand the impact and the limitations of artificial agents, we present in this paper a model that formalizes web hacking tasks for reinforcement learning agents. Our model, named Agent Web Model, treats web hacking as a capture-the-flag style challenge and defines reinforcement learning problems at seven different levels of abstraction. We discuss the complexity of these problems in terms of the actions and states an agent has to deal with, and we show that such a model can represent most of the relevant web vulnerabilities. Aware that the driver of advances in reinforcement learning is the availability of standardized challenges, we provide an implementation for the first three abstraction layers, in the hope that the community will take up these challenges in order to develop intelligent web hacking agents.

Highlights

  • As the complexity of computer systems and networks has increased significantly over recent decades, the number of vulnerabilities inside a system has increased in a similar manner

  • Some attacks aim to break confidentiality by accessing sensitive or confidential information; others aim to compromise integrity, either to cause damage and annoyance or as a preparatory step before carrying out further action; still others target the availability of a service, for instance by overloading a web service with many requests in order to cause a denial of service (DoS)

  • We presented a model, named Agent Web Model, that defines web hacking at different levels of abstraction

Summary

Introduction

As the complexity of computer systems and networks has increased significantly over recent decades, the number of vulnerabilities inside a system has increased in a similar manner. Since it is inevitable that AI and ML will be applied in offensive security, developing a sound understanding of the main characteristics and limitations of such tools will help in preparing against such attacks. Autonomous web hacking agents will also be useful to human white hat hackers in carrying out legal penetration testing tasks, replacing the labor-intensive and expensive work of human experts. Aware that a strong and effective driver for the development of new and successful reinforcement learning agents is the availability of standardized challenges and benchmarks, we use our formalization to implement a series of challenges at different levels of abstraction and with increasing complexity. We make these challenges available following the standards of the field.
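To make the idea of a capture-the-flag challenge framed as a reinforcement learning problem more concrete, the following is a toy sketch only. It is not the Agent Web Model implementation: the class name `ToyLinkCTF`, the page names, the reward values (-1 per step, +100 for the flag), and the random agent are all assumptions chosen for illustration. The state is the page the agent is on, an action is following one of that page's links, and the episode ends when the page holding the flag is reached.

```python
import random
from typing import Dict, List, Tuple


class ToyLinkCTF:
    """Hypothetical toy CTF: pages form a directed link graph, one page holds the flag."""

    def __init__(self, links: Dict[str, List[str]], start: str, flag_page: str):
        self.links = links          # page -> pages reachable by following a link
        self.start = start
        self.flag_page = flag_page
        self.current = start

    def reset(self) -> str:
        """Start a new episode and return the initial state (the start page)."""
        self.current = self.start
        return self.current

    def actions(self) -> List[str]:
        """Links that can be followed from the current page."""
        return self.links[self.current]

    def step(self, target: str) -> Tuple[str, float, bool]:
        """Follow a link; return (new state, reward, episode finished)."""
        if target not in self.links[self.current]:
            raise ValueError(f"no link from {self.current!r} to {target!r}")
        self.current = target
        done = target == self.flag_page
        # Assumed reward scheme: small cost per step, large reward for the flag.
        reward = 100.0 if done else -1.0
        return self.current, reward, done


def run_random_episode(env: ToyLinkCTF, rng: random.Random,
                       max_steps: int = 50) -> Tuple[float, bool]:
    """Baseline agent that follows links uniformly at random."""
    env.reset()
    total = 0.0
    for _ in range(max_steps):
        _, reward, done = env.step(rng.choice(env.actions()))
        total += reward
        if done:
            return total, True
    return total, False


# A made-up four-page site; the flag sits on the "admin" page.
SITE = {
    "index": ["about", "login"],
    "about": ["index"],
    "login": ["index", "admin"],
    "admin": ["index"],
}

if __name__ == "__main__":
    env = ToyLinkCTF(SITE, start="index", flag_page="admin")
    print(run_random_episode(env, random.Random(0)))
```

A learning agent would replace the random policy in `run_random_episode`; the `reset`/`step` interface is kept deliberately close to the convention used by common reinforcement learning toolkits so that such a swap is straightforward.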

Web hacking
Capture the flag
Reinforcement learning
Related work
From web hacking to CTF
From CTF to a game
From a game to a RL problem
The Agent Web Model
Levels of abstraction
Level1: link layer
Level2: hidden link layer
Result
Level3: dynamic content layer
Level4: web method layer
Level5
Level6: server structure layer
Level7: server modification layer
Disclosure
Implementation of the Agent Web Model
Ethical considerations
Conclusions

