Internet Traffic Classification with Federated Learning

Hyunsu Mun, Youngseok Lee

doi:10.3390/electronics10010027

Abstract

As Internet traffic classification is a typical problem for ISPs or mobile carriers, there have been many studies based on statistical packet header information, deep packet inspection, or machine learning. Due to recent advances in end-to-end encryption and dynamic port policies, machine learning or deep learning has become essential for improving the accuracy of packet classification. In addition, ISPs or mobile carriers must carefully handle privacy issues when collecting user packets for accounting or security. Federated learning, a recent development in distributed machine learning, collaboratively carries out machine learning jobs on the clients without uploading data to a central server. Although federated learning provides an on-device learning framework for user privacy protection, its feasibility and performance for Internet traffic classification have not been fully examined. In this paper, we propose a federated-learning traffic classification protocol (FLIC), which can achieve an accuracy comparable to centralized deep learning for Internet application identification without privacy leakage. FLIC can classify new applications on the fly when a participant joins the learning with a new application, which has not been done in previous works. We implemented a prototype of FLIC clients and a server with TensorFlow, in which the clients gather packets, perform the on-device training job, and exchange the training results with the FLIC server. In addition, we demonstrate that federated learning-based packet classification achieves an accuracy of 88% under non-independent and identically distributed (non-IID) traffic across clients. When a client joined the learning with a new application, which FLIC classifies dynamically, an accuracy of 92% was achieved.

Highlights

Internet traffic classification is a representative research topic that has been extensively studied.
As federated learning relies on stochastic gradient descent (SGD) for optimization, the objective of the federated-learning traffic classification protocol (FLIC) is given in Equation (1).
Even when the total amount of data distributed across all clients is the same, in the four-class non-independent and identically distributed (non-IID) experiment with data distributed to more clients (Figure 11b), FLIC classifies applications better with a small amount of data.
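
Equation (1) is not reproduced in this summary. As a hedged illustration only, the objective commonly used for SGD-based federated learning (the federated averaging formulation) can be written as below. The symbols are assumptions introduced here, not taken from the paper: K is the number of clients, P_k is the set of samples held by client k, n_k = |P_k|, n is the total number of samples, and \ell is the per-sample loss of a model with weights w.

```latex
% Assumed generic federated objective; the paper's Equation (1) may differ.
\min_{w} \; f(w) = \sum_{k=1}^{K} \frac{n_k}{n} \, F_k(w),
\qquad
F_k(w) = \frac{1}{n_k} \sum_{i \in P_k} \ell(x_i, y_i; w)
```

Under this form, each client minimizes its local loss F_k with SGD, and the global objective weights each client by its share of the data.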

Summary

Introduction

Internet traffic classification is a representative research topic that has been extensively studied. While machine learning is useful for Internet traffic classification under packet encryption, ISPs or mobile carriers need to collect application packets and their information from users for the training process. In federated learning, the central server computes only the aggregated average of the training results gathered from each device. This protects privacy because the user data resides on the device. Although federated learning is promising given privacy concerns, its feasibility for Internet traffic classification has not been fully understood under a realistic data model with unbalanced, non-independent and identically distributed (non-IID) characteristics. Considering the above requirements, we propose a federated-learning Internet traffic classification framework (FLIC) that can label packets into applications dynamically. Under non-IID traffic distribution and with dynamically increasing clients, FLIC achieved 88% and 92% accuracy, respectively.
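
The server-side aggregation described above can be sketched as follows. This is a minimal illustration of sample-weighted averaging of client model weights, assuming each client reports a flat list of weights and its local sample count. The function name `federated_average` and its parameters are hypothetical and are not taken from the FLIC prototype, which the paper states was built with TensorFlow.

```python
from typing import List, Sequence


def federated_average(client_weights: Sequence[Sequence[float]],
                      client_sizes: Sequence[int]) -> List[float]:
    """Return the sample-weighted mean of per-client weight vectors.

    Hypothetical helper for illustration; only model weights and sample
    counts reach the server, never the raw packets.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client weight vector")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of samples must be positive")

    dim = len(client_weights[0])
    averaged = [0.0] * dim
    for weights, size in zip(client_weights, client_sizes):
        if len(weights) != dim:
            raise ValueError("all clients must report the same model shape")
        # Each client contributes in proportion to its share of the data.
        for j, w in enumerate(weights):
            averaged[j] += w * size / total
    return averaged


if __name__ == "__main__":
    # Two clients with unbalanced data: 1 sample versus 3 samples.
    print(federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3]))
```

Weighting by sample count means a client with more local traffic pulls the global model further toward its own update, which matters when data is unbalanced across clients.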

Related Work

Architecture

Protocol

Training Model and Feature Vector

Federated Optimization

Dynamic Classification

Performance Evaluation

Datasets

AIM SCP

Accuracy by the Number of FLIC Participants

Accuracy under Non-IID Traffic Distribution

Accuracy by Clients Local Epochs

Dynamic Traffic Classification with Federated Learning

Findings

Conclusions


Journal: Electronics	Publication Date: Dec 28, 2020
Citations: 28	License type: CC BY 4.0


