Abstract

The ever-increasing number of vulnerabilities and reported attacks on Web systems clearly illustrates the need for a better understanding of malicious cyber activities, which will allow better protection, detection, and service recovery in cyberspace. In this paper we use three supervised machine learning methods, Support Vector Machines (SVM) and the decision-tree-based methods J48 and PART, to classify attacker activities aimed at Web systems. The empirical analysis is based on four datasets, each spanning four to five months, collected by high-interaction honeypots. Malicious Web sessions are characterized by 43 features (i.e., session attributes) extracted from Web server logs. Our results show that supervised learning methods can efficiently distinguish attack sessions from vulnerability scan sessions, with a very high probability of detection and a very low probability of false alarm. Furthermore, we follow the principle of Occam's razor, that is, we seek the simplest possible model that can successfully classify malicious Web sessions. Our results show that attacks differ from vulnerability scans in only a small number of features. In particular, depending on the dataset, malicious activities can be classified using four to six features without significantly affecting the learners' performance compared to using all 43 features. The decision-tree-based methods J48 and PART perform better than SVM across all datasets.
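
The two performance measures named above, probability of detection and probability of false alarm, can be illustrated with a minimal sketch. It assumes that "attack" is treated as the positive class and "scan" as the negative class, which the abstract does not state explicitly. The function name `detection_and_false_alarm` and the labels are made up for illustration and are not taken from the paper's data.

```python
from typing import Sequence, Tuple


def detection_and_false_alarm(
    y_true: Sequence[str],
    y_pred: Sequence[str],
    positive: str = "attack",  # assumption: attacks are the positive class
) -> Tuple[float, float]:
    """Return (probability of detection, probability of false alarm)."""
    tp = fn = fp = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # attack correctly flagged as attack
            else:
                fn += 1  # attack missed (labeled as scan)
        else:
            if pred == positive:
                fp += 1  # scan wrongly flagged as attack
            else:
                tn += 1  # scan correctly labeled as scan
    p_detect = tp / (tp + fn) if (tp + fn) else 0.0
    p_false_alarm = fp / (fp + tn) if (fp + tn) else 0.0
    return p_detect, p_false_alarm


# Hypothetical session labels, for illustration only.
y_true = ["attack", "attack", "attack", "scan", "scan", "scan", "scan", "attack"]
y_pred = ["attack", "attack", "scan", "scan", "scan", "attack", "scan", "attack"]

p_detect, p_false_alarm = detection_and_false_alarm(y_true, y_pred)
print(p_detect, p_false_alarm)
```

Under these assumptions, a good classifier pushes the first value toward 1 and the second toward 0. Comparing both values for a model trained on all features against one trained on a small subset is one way to check that the reduced model has not lost much performance.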
