Abstract

Behavioural malware detection identifies malware by its behaviour rather than by analysing the code of the program (binary calls etc.). AVG provided us with a dataset of behavioural features with the goal of improving their linear binary classifier, which is required to keep the false positive rate (FPR) on a subset of processes below 0.05%. This paper proposes an efficient feature representation for training this classifier on large-scale datasets using Support Vector Machines (SVM). The representation is memory efficient, so large-scale datasets can be handled on a single machine, and we show experimentally that training times are better than with less memory-efficient feature representations. The resulting linear classifier achieves a better true positive rate (TPR) at every operating point. The binary feature handling methods developed in this work have been added to LIBOCAS, a publicly available open-source library.
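To make the notion of an operating point concrete, the following is a minimal sketch, not the paper's implementation. It assumes that each sample's binary features are stored as a list of active feature indices (an assumption; the abstract does not specify the representation), scores a sample with a linear classifier, and picks a decision threshold that keeps the FPR at or below a target. The function names `score_binary`, `threshold_for_fpr`, and `rates`, as well as all numbers, are hypothetical and chosen only for illustration.

```python
import numpy as np

def score_binary(w, b, active_idx):
    """Linear score w.x + b for a binary vector x given by its active indices.

    For binary features the dot product reduces to summing the weights
    at the positions where x is 1, so x never has to be stored densely.
    """
    idx = np.asarray(active_idx, dtype=np.intp)
    return float(np.asarray(w, dtype=float)[idx].sum() + b)

def threshold_for_fpr(benign_scores, max_fpr):
    """Return a threshold t such that the share of benign scores > t is <= max_fpr."""
    scores = np.sort(np.asarray(benign_scores, dtype=float))[::-1]  # descending
    n = len(scores)
    k = int(np.floor(max_fpr * n))  # number of benign samples allowed above t
    # Samples are flagged when score > t, so using the (k+1)-th largest
    # benign score as t flags at most k benign samples.
    return float(scores[k]) if k < n else float("-inf")

def rates(benign_scores, malware_scores, t):
    """Return (FPR, TPR) when samples with score > t are flagged as malware."""
    fpr = float(np.mean(np.asarray(benign_scores, dtype=float) > t))
    tpr = float(np.mean(np.asarray(malware_scores, dtype=float) > t))
    return fpr, tpr

if __name__ == "__main__":
    # Toy weights and scores, purely illustrative (not data from the paper).
    w, b = [0.5, -1.0, 2.0, 0.25], -0.5
    print(score_binary(w, b, [0, 2]))

    benign = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0]
    malware = [0.2, 0.6, 1.5, 3.0]
    t = threshold_for_fpr(benign, max_fpr=0.25)
    print(t, rates(benign, malware, t))
```

With a target as strict as the 0.05% mentioned above, the threshold is determined by only a handful of the highest-scoring benign samples, which is one reason a large benign set is needed to estimate such an operating point reliably.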
