Abstract

In response to critical requirements of the Internet, a wide range of privacy-preserving technologies is available, e.g. proxy sites, virtual private networks, and anonymity tools. Such mechanisms are challenged by traffic-classification efforts, which are crucial for network-management tasks and have recently become a key element in assessing the degree of privacy these tools provide, from both the attacker's and the designer's standpoint. Further, the new Internet era is characterized by the capillary distribution of smart devices leveraging high-capacity communication infrastructures: this results in a huge amount of heterogeneous network traffic, i.e. big data. Hence, herein we present BDeH, a novel hierarchical framework for traffic classification of anonymity tools. BDeH is enabled by the big-data paradigm and capitalizes on machine learning to operate on encrypted traffic. In detail, our proposal allows for seamless integration of the data parallelism provided by big-data technologies with the model parallelism enabled by hierarchical approaches. Results show that the resulting double parallelism has no negative impact on traffic-classification effectiveness at any granularity level and achieves non-negligible performance improvements over non-hierarchical architectures ($+4.5\%$ F-measure). It also gains significantly over pure data-parallel or pure model-parallel (resp. centralized) approaches, reducing both training completion time, by up to $78\%$ (resp. $90\%$), and cloud-deployment cost, by up to $31\%$ (resp. $10\%$).

