Abstract

Accurate classification of network traffic at an early stage is very important for 5G network applications. In recent years, researchers have worked hard to propose effective machine learning models that classify Internet traffic applications at an early stage from only a few packets. Nevertheless, this essential problem still needs further study to determine both the effective number of packets and the effective machine learning (ML) model. In this paper, we address this problem using five Internet traffic datasets. We first extract the packet sizes of 20 packets and then carry out a mutual information analysis to measure the mutual information between each packet and the flow type. We then run 10 well-known machine learning algorithms using the crossover classification method. Two statistical tests, the Friedman test and the Wilcoxon pairwise test, are applied to the experimental results. We also apply the statistical tests to the classifiers to identify the effective ML classifier. Our experimental results show that 13–19 packets are the effective packet numbers for early stage network traffic classification of the 5G IM WeChat application. We also find that Random Forest is the effective ML classifier for early stage Internet traffic classification.
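The mutual information step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the 100-byte bin width used to discretize packet sizes, the zero padding for short flows, and the toy flows and labels are all assumptions made for illustration.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Estimate I(X; Y) in bits from paired discrete samples."""
    n = len(xs)
    px = Counter(xs)            # marginal counts of X
    py = Counter(ys)            # marginal counts of Y
    pxy = Counter(zip(xs, ys))  # joint counts of (X, Y)
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with counts
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi


def per_packet_mi(flows, labels, n_packets=20, bin_width=100):
    """Mutual information between each packet position and the flow label.

    flows  : list of packet-size lists, one list per flow
    labels : flow type of each flow
    Sizes are binned (assumed bin width) so that they become discrete;
    flows shorter than n_packets are padded with size 0 (an assumption).
    """
    scores = []
    for i in range(n_packets):
        binned = [(f[i] if i < len(f) else 0) // bin_width for f in flows]
        scores.append(mutual_information(binned, labels))
    return scores


if __name__ == "__main__":
    # Hypothetical toy data: two "video" flows and two "chat" flows.
    flows = [[60, 1500, 1500], [60, 1400, 1500], [60, 200, 100], [60, 150, 90]]
    labels = ["video", "video", "chat", "chat"]
    print(per_packet_mi(flows, labels, n_packets=3))
```

In this toy data the first packet has the same size in every flow, so it carries no information about the flow type, while the second and third packets separate the two classes completely.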

Highlights

  • In recent years, early stage Internet traffic classification has received a lot of attention in the area of network traffic classification. From the perspective of feature extraction, most researchers proposed machine learning models based on features extracted from the whole network flow [1,2,3]

  • We explain the mutual information analysis results for the HIT Trace I dataset, including its four subdatasets, and for the HIT Trace II dataset; analyze the results of the applied methods to validate the effectiveness of the packets; and lastly give the statistical test results for the effective machine learning (ML) classifier

  • The results show that all the applied classifiers improve continuously as the number of packets increases, except the Support Vector Machine (SMO) classifier, which gives random results for all numbers of packets, while the OneR classifier gives very poor identification results in the first 12 packets and its results increase continuously


Summary

Introduction

In recent years, early stage Internet traffic classification has received a lot of attention in the area of network traffic classification. From the perspective of feature extraction, most researchers proposed machine learning models based on features extracted from the whole network flow [1,2,3]. In 2005, Moore et al. [4] presented a feature extraction method that many researchers have since followed in their own work. They used the whole traffic flow and extracted 248 statistical features, such as the packet sizes and the maximum, minimum, and average statistical feature values. Machine learning classifiers can achieve very effective performance using these statistical features [5]. These feature extraction methods also showed high performance in anomaly detection [6]. However, because they rely on the whole flow, these feature extraction methods are not very effective in practice. It is very important to classify Internet traffic at an early stage, in view of security policies and quality of service (QoS).
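The contrast between whole-flow statistical features and early stage features can be sketched as follows. This is an illustration only, not the feature set of Moore et al. [4]: it computes just a handful of summary statistics, and the function names, the default of 20 packets, and the zero padding for short flows are assumptions.

```python
from statistics import mean


def whole_flow_features(packet_sizes):
    """Summary statistics that can only be computed once the flow has ended."""
    return {
        "min": min(packet_sizes),
        "max": max(packet_sizes),
        "mean": mean(packet_sizes),
        "count": len(packet_sizes),
    }


def early_stage_features(packet_sizes, n_packets=20, pad_value=0):
    """Sizes of the first n_packets packets, available while the flow is live.

    Flows shorter than n_packets are padded with pad_value (an assumption)
    so that every flow yields a vector of the same length.
    """
    head = list(packet_sizes[:n_packets])
    return head + [pad_value] * (n_packets - len(head))


if __name__ == "__main__":
    sizes = [60, 1500, 60]  # hypothetical packet sizes in bytes
    print(whole_flow_features(sizes))
    print(early_stage_features(sizes, n_packets=5))
```

The first function needs every packet of the flow before it can return, whereas the second can produce its vector as soon as the first few packets have arrived, which is what makes early stage classification usable for QoS and security decisions.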

