Abstract
Anomaly detection has been studied across many research areas and application domains. The main difficulty is separating outliers from normal samples when an input has unique features and previously unseen values. To address this, research has focused on Machine Learning and Deep Learning techniques. A similar problem arises on the Internet: deciding whether a web request contains malicious activity or is a normal request. Web Application Firewall (WAF) systems protect against malicious requests using a rule-based approach, and in recent years anomaly-based solutions have been integrated alongside rule-based systems. Still, such solutions provide security only up to a point: they can generate false-positive results, they can leave backend systems vulnerable, and rule-based protection can often be bypassed with simple tricks (e.g., encoding or obfuscation). This research focuses on WAF systems that employ single and stacked LSTM layers operating on character sequences of user-supplied data, and on identifying the hyper-parameter values that give optimal results. A semi-supervised approach is used: the model is trained with the PayloadAllTheThings dataset, which contains real attack payloads, together with only the normal payloads of the HTTP Dataset CSIC 2010. The success of the technique (whether user input is correctly identified as malicious or normal) is measured using F1 scores. The proposed model achieved high F1 scores and performed well in detecting and classifying attacks.
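As a rough illustration of two ideas mentioned in the abstract, the Python sketch below encodes request strings as fixed-length character-id sequences (the kind of input a character-level LSTM could consume) and computes an F1 score from binary labels. It is not the authors' implementation: the function names (`build_vocab`, `encode`, `f1_score`), the reserved ids, the `max_len` padding scheme, and the sample strings are all assumptions made for illustration.

```python
from typing import Dict, List

# Assumed reserved ids (hypothetical): 0 pads short inputs, 1 marks unseen characters.
PAD, UNK = 0, 1


def build_vocab(samples: List[str]) -> Dict[str, int]:
    """Map every character seen in the samples to an integer id starting at 2."""
    chars = sorted(set("".join(samples)))
    return {ch: i + 2 for i, ch in enumerate(chars)}


def encode(text: str, vocab: Dict[str, int], max_len: int) -> List[int]:
    """Truncate or right-pad the text to max_len character ids."""
    ids = [vocab.get(ch, UNK) for ch in text[:max_len]]
    return ids + [PAD] * (max_len - len(ids))


def f1_score(y_true: List[int], y_pred: List[int]) -> float:
    """F1 for the positive class (1 = malicious, 0 = normal)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # precision and recall are both zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Illustrative strings only; not taken from either dataset.
    samples = ["id=42&name=alice", "id=1' OR '1'='1"]
    vocab = build_vocab(samples)
    print(encode(samples[1], vocab, max_len=24))
    print(f1_score([1, 1, 0, 0], [1, 0, 1, 0]))
```

Working at the character level means an encoded or obfuscated payload still produces a sequence the model can score, and F1 balances missed attacks against false alarms in a single number.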