Abstract

Web log analysis is an important part of intrusion detection, and various machine learning techniques have been applied to it. However, compared with the abundant research on machine learning itself, methods for extracting features from log data are still an open area of research. In this paper, we present an effective feature extraction approach that leverages Byte Pair Encoding (BPE) and Term Frequency-Inverse Document Frequency (TF-IDF). We applied this approach with various downstream machine learning algorithms and demonstrated its usefulness.
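To make the idea concrete, the following is a minimal sketch of how BPE tokens can feed a TF-IDF representation of log lines. It is not the paper's implementation: the abstract does not specify the tokenization unit, the number of merges, or the TF-IDF weighting, so the character-level start, the merge count of 20, the smoothed IDF formula, the sample log lines, and all function names (`learn_bpe`, `apply_merge`, `tokenize`, `tfidf`) are assumptions chosen for illustration.

```python
import math
from collections import Counter


def apply_merge(seq, pair):
    """Replace every adjacent occurrence of `pair` in `seq` with the joined symbol."""
    out, i = [], 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            out.append(seq[i] + seq[i + 1])
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out


def learn_bpe(lines, num_merges):
    """Learn up to `num_merges` BPE merges, starting from single characters (assumption)."""
    seqs = [list(line) for line in lines]
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for seq in seqs:
            pairs.update(zip(seq, seq[1:]))
        if not pairs:
            break
        best, count = pairs.most_common(1)[0]
        if count < 2:  # nothing repeats any more, so further merges add no value
            break
        merges.append(best)
        seqs = [apply_merge(seq, best) for seq in seqs]
    return merges


def tokenize(line, merges):
    """Split a log line into BPE tokens by replaying the learned merges in order."""
    seq = list(line)
    for pair in merges:
        seq = apply_merge(seq, pair)
    return seq


def tfidf(token_lists):
    """Return (vocab, rows) with length-normalised TF times a smoothed IDF (assumption)."""
    n = len(token_lists)
    df = Counter()
    for toks in token_lists:
        df.update(set(toks))
    vocab = sorted(df)
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1 for t in vocab}
    rows = []
    for toks in token_lists:
        tf = Counter(toks)
        total = len(toks) or 1
        rows.append([tf[t] / total * idf[t] for t in vocab])
    return vocab, rows


# Hypothetical sample log lines, invented for this sketch.
logs = [
    "GET /index.php?id=1",
    "GET /index.php?id=2",
    "GET /index.php?id=1' OR '1'='1",
]
merges = learn_bpe(logs, num_merges=20)
tokens = [tokenize(line, merges) for line in logs]
vocab, X = tfidf(tokens)
```

Each row of `X` is a feature vector for one log line and could be passed to a downstream classifier. Because BPE merges frequent substrings such as a common request prefix into single tokens, fragments that appear only in unusual requests tend to stay as rarer tokens and receive a higher IDF weight.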

