Abstract

To improve mobile data transparency, various approaches have been proposed to inspect the network traffic generated by mobile devices and to detect, for example, exposure of personally identifiable information (PII) and ad requests. State-of-the-art approaches extract features from HTTP packets and train classifiers in a centralized way: users collect and label network packets on their mobile devices and upload the data to a central server, which then uses the data contributed by all users to train a packet classifier. However, training datasets built from network traffic collected on user devices may contain sensitive information that users do not want to upload. In this article, we propose a federated learning approach to mobile packet classification, which enables devices to collaboratively train a global model without uploading the training data collected on the devices. We apply our framework to two packet classification tasks (predicting PII exposure and predicting ad requests in individual packets), and we demonstrate its effectiveness in terms of classification performance and communication and computation cost, using three real-world datasets. Methodological challenges we address in the process include model and feature selection, as well as tuning the federated learning parameters specifically for our packet classification tasks. We also discuss privacy limitations and mitigation approaches.
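
As a rough illustration of the idea of training a global model without uploading device data, the sketch below runs weighted federated averaging over a tiny logistic-regression packet classifier. Everything specific in it is an assumption made for illustration and is not taken from the article: the choice of federated averaging as the aggregation rule, the logistic-regression model, the two-element feature vector (a bias term and a hypothetical "packet contains a PII keyword" flag), the function names, and the hyperparameters.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def local_update(weights, data, lr=0.1, epochs=5):
    """Run SGD on one device's private packets and return the new weights.

    Only the returned weights leave the device; `data` is never uploaded.
    """
    w = list(weights)
    for _ in range(epochs):
        for x, y in data:
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
            for i, xi in enumerate(x):
                w[i] -= lr * (p - y) * xi
    return w


def federated_average(client_weights, client_sizes):
    """Server step: average client weights, weighted by local dataset size."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


def train_federated(clients, dim, rounds=20):
    """Alternate local training on each device with server-side averaging."""
    global_w = [0.0] * dim
    for _ in range(rounds):
        updates = [local_update(global_w, data) for data in clients]
        sizes = [len(data) for data in clients]
        global_w = federated_average(updates, sizes)
    return global_w


def predict(weights, x):
    """Return 1 if the packet is classified as positive (e.g. exposes PII)."""
    z = sum(wi * xi for wi, xi in zip(weights, x))
    return 1 if sigmoid(z) >= 0.5 else 0


if __name__ == "__main__":
    # Hypothetical features: [bias, packet_contains_pii_keyword]
    device_a = [([1.0, 1.0], 1), ([1.0, 0.0], 0)]
    device_b = [([1.0, 1.0], 1), ([1.0, 0.0], 0),
                ([1.0, 1.0], 1), ([1.0, 0.0], 0)]
    model = train_federated([device_a, device_b], dim=2)
    print(predict(model, [1.0, 1.0]), predict(model, [1.0, 0.0]))
```

In this sketch the server only ever sees weight vectors and dataset sizes. As the abstract notes, that alone does not remove all privacy risk, since model updates can still leak information about the underlying packets.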
