Abstract

Real-time Internet traffic flow classification is important for managing network resources in accordance with Quality of Service (QoS) requirements. The centralized control in Software Defined Networking (SDN) provides a platform for an Internet Service Provider (ISP) to perform specific actions on the classified flows through routing and scheduling. Although machine learning (ML) can be an alternative to Deep Packet Inspection (DPI) for classifying SDN traffic flows, several problems, such as classifier accuracy, computational complexity, multi-class imbalanced data, and concept drift, need to be addressed in order to obtain a reliable solution. Therefore, this work proposes a hybrid filter-wrapper feature selection (FS) algorithm, named Filter-Wrapper Feature Selection (FWFS). The algorithm selects robust features that represent minority classes and are resistant to concept drift, and it is computationally inexpensive because it discards irrelevant features before further processing with the wrapper function. In the performance evaluation, the feature selection process of FWFS takes 59.6 s and produces a classifier with an overall accuracy of 98.9%. This result is better than that of the state-of-the-art FS algorithm, Efficient Feature Optimization Approach (EFOA), which requires >400 s to select features that produce a classifier with 97.7% accuracy. In addition to the high overall accuracy, the classifier trained with features selected by FWFS has better F-measure values for each class, including the minority classes: >0.8 in MULTIMEDIA and INTERACTIVE, which comprise only 0.15% and 0.03%, respectively, of the total 377,526 instances in the dataset. Furthermore, the classifier is stable and reliable on new data, with 98.7% accuracy and an F-measure of more than 0.8 in every class.
The classifier model will be embedded in the SDN-ISP traffic classification solution which provides insights for resource allocations and traffic scheduling in the network.
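
The abstract reports per-class F-measure alongside overall accuracy because, on heavily imbalanced data, accuracy alone can hide poor performance on minority classes. The following is a minimal sketch of that effect, not the paper's evaluation code. It assumes that "F-measure" means the balanced F1 score, and the label counts and predictions are invented for illustration (only the class names P2P and INTERACTIVE come from the text above).

```python
from collections import Counter


def per_class_f_measure(y_true, y_pred):
    """Return {label: F1} computed from true positives, false positives, false negatives."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def overall_accuracy(y_true, y_pred):
    """Fraction of instances whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Invented toy data: 990 majority-class flows, 10 minority-class flows.
y_true = ["P2P"] * 990 + ["INTERACTIVE"] * 10
# Hypothetical classifier: all majority flows correct, 7 of 10 minority flows missed.
y_pred = ["P2P"] * 990 + ["INTERACTIVE"] * 3 + ["P2P"] * 7

print(overall_accuracy(y_true, y_pred))
print(per_class_f_measure(y_true, y_pred))
```

In this toy case the overall accuracy stays above 99% while the F-measure of the minority class falls below 0.5, which is why a per-class threshold such as >0.8 is a stronger claim than accuracy alone.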

Highlights

  • Quality of Service (QoS) management requires accurate Internet traffic classification in order to manage network resources effectively

  • Redundancy is not measured with correlation metrics but depends on the learning algorithm, i.e. a feature is selected if it carries additional information, reflected in increased accuracy when it is added to the selected feature subset

  • The overall accuracies of Filter-Wrapper Feature Selection (FWFS) and Fan and Liu [7] are comparable: 98.9% and 98.1%, respectively. A per-class comparison shows that the F-measure of FWFS is slightly lower in P2P, DATABASE, and MULTIMEDIA
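
The filter-then-wrapper idea described in the highlights (discard irrelevant features cheaply, then keep a feature only if adding it increases accuracy) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the FWFS algorithm itself: the filter score (class-mean separation divided by overall spread), the threshold of 0.5, the nearest-centroid classifier, and the toy data are all hypothetical choices made for this example.

```python
from statistics import mean, pstdev


def filter_stage(X, y, threshold):
    """Drop features with a low relevance score; return the rest ranked best-first.
    The score is an assumed stand-in for whatever filter metric is used."""
    labels = sorted(set(y))
    scores = {}
    for f in range(len(X[0])):
        column = [row[f] for row in X]
        spread = pstdev(column)
        class_means = [mean(v for v, lab in zip(column, y) if lab == c) for c in labels]
        scores[f] = (max(class_means) - min(class_means)) / spread if spread > 0 else 0.0
    kept = [f for f in scores if scores[f] > threshold]
    return sorted(kept, key=lambda f: scores[f], reverse=True)


def make_evaluator(X_train, y_train, X_val, y_val):
    """Return a function mapping a feature subset to validation accuracy
    of a nearest-centroid classifier (an assumed learner for this sketch)."""
    labels = sorted(set(y_train))

    def evaluate(features):
        centroids = {}
        for c in labels:
            rows = [x for x, lab in zip(X_train, y_train) if lab == c]
            centroids[c] = [mean(r[f] for r in rows) for f in features]
        correct = 0
        for x, lab in zip(X_val, y_val):
            point = [x[f] for f in features]
            pred = min(centroids, key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centroids[c])))
            correct += int(pred == lab)
        return correct / len(y_val)

    return evaluate


def wrapper_stage(candidates, evaluate):
    """Greedy forward pass: keep a feature only if it strictly improves accuracy."""
    selected, best = [], 0.0
    for f in candidates:
        acc = evaluate(selected + [f])
        if acc > best:
            selected.append(f)
            best = acc
    return selected, best


def make_rows(n):
    """Toy data: f0 separates A from B/C, f1 duplicates f0, f2 separates B from C,
    f3 follows the same pattern in every class (irrelevant)."""
    X, y = [], []
    for i in range(n):
        noise = i % 2
        X.append([0, 0, 10 * (i % 2), noise]); y.append("A")
        X.append([10, 10, 0, noise]); y.append("B")
        X.append([10, 10, 10, noise]); y.append("C")
    return X, y


X_train, y_train = make_rows(4)
X_val, y_val = make_rows(2)
candidates = filter_stage(X_train, y_train, threshold=0.5)
selected, accuracy = wrapper_stage(candidates, make_evaluator(X_train, y_train, X_val, y_val))
print(candidates, selected, accuracy)
```

On this toy data the filter removes the irrelevant feature 3, and the wrapper rejects feature 1 because it duplicates feature 0 and adds no accuracy, leaving features 0 and 2. The duplicate is excluded by the learning algorithm's accuracy rather than by a correlation test, which is the notion of redundancy the highlight describes.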


Summary

INTRODUCTION

Quality of Service (QoS) management requires accurate Internet traffic classification in order to manage network resources effectively. Although Internet traffic classification has been rigorously studied, two main issues still need to be solved, namely multi-class imbalanced data and concept drift. Both problems can be approached on the data level and on the algorithm level, which involve data pre-processing and algorithm design, respectively. The authors in [22]–[24] tackled this problem by selecting robust features, while [25]–[27] focused on designing improved algorithms to ensure the classifier’s performance over a long period of time. In this work, these problems are addressed on the data level by proposing a hybrid feature selection (FS) algorithm. The contributions of this work are as follows: 1) A hybrid filter-wrapper feature selection algorithm, named FWFS, is proposed, which selects robust features for network traffic classification.

RELATED WORKS
PERFORMANCE EVALUATIONS
Findings
CONCLUSION