DATA MINING IMPLEMENTATION FOR DETECTION OF ANOMALIES IN NETWORK TRAFFIC PACKETS USING OUTLIER DETECTION APPROACH

Kurnia Setiawan, Arief Wibowo

doi:10.33387/jiko.v6i2.6092

Abstract

The large volume of network traffic packet records can be used to evaluate network quality and to analyze anomalies in the network, whether related to security or to performance. With the data available, however, anomalies in a computer network cannot be traced to specific traffic packets. Monitoring traffic packets manually takes considerable time and resources, which makes it difficult to pinpoint potential anomaly events. This study analyzes network traffic packet data to identify anomalous records with an outlier detection approach. The Isolation Forest algorithm was applied to the packet data and labeled a minority of 1,643 records (4.86%) as outliers and 32,098 records (95.13%) as inliers. The expert attributes, which contain expert information, were then checked and filtered. The outlier detection results were classified with five algorithms for comparison: Random Forest Classifier, Support Vector Machine, Decision Tree Classifier, K-Nearest Neighbor, and Bernoulli Naive Bayes. Random Forest achieved the highest accuracy, macro-average precision, and macro-average F1-score: 0.9962067330488383, 0.78, and 0.82, respectively. The classification model can classify samples with the labels "inliers", "outliers", "Error", and "warning outliers". Two labels have precision, recall, and F1-score values that are not particularly high: "error" (0.50, 1.00, and 0.67) and "warning outlier" (0.64, 0.70, and 0.67). The resulting classification model is used to develop a prototype that makes it easier to investigate potential network traffic packet anomalies in more detail.
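The outlier detection step described above can be illustrated with a minimal sketch. It assumes scikit-learn's `IsolationForest`, which the abstract does not name as the implementation used. The two features (packet length and inter-arrival time), the synthetic data, and the helper names `make_synthetic_packets` and `flag_outliers` are hypothetical and chosen only for illustration. The contamination rate of 0.05 is likewise an assumption, picked to be close to the reported outlier share.

```python
import numpy as np
from sklearn.ensemble import IsolationForest


def make_synthetic_packets(n_normal=980, n_odd=20, seed=0):
    """Build hypothetical packet records: [length in bytes, inter-arrival time in s]."""
    rng = np.random.default_rng(seed)
    # Bulk of the traffic: moderate lengths, short and regular gaps.
    normal = rng.normal(loc=[500.0, 0.01], scale=[100.0, 0.002], size=(n_normal, 2))
    # A few unusual records: very long packets with long gaps (placed last).
    odd = rng.uniform(low=[1300.0, 0.05], high=[1500.0, 0.2], size=(n_odd, 2))
    return np.vstack([normal, odd])


def flag_outliers(features, contamination=0.05, seed=0):
    """Return a boolean mask that is True where a record is flagged as an outlier."""
    model = IsolationForest(
        n_estimators=100,
        contamination=contamination,  # assumed share of outliers, not from the paper
        random_state=seed,
    )
    # fit_predict returns -1 for outliers and 1 for inliers.
    labels = model.fit_predict(features)
    return labels == -1


packets = make_synthetic_packets()
outlier_mask = flag_outliers(packets)
n_outliers = int(outlier_mask.sum())
print(f"outliers: {n_outliers} of {len(packets)} "
      f"({100.0 * n_outliers / len(packets):.2f}%)")
```

The resulting mask could then serve as the "inliers"/"outliers" label column that a downstream classifier is trained on, which is the role the outlier detection results play in the study.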
