Abstract

Predicting Internet user demographics from traffic behavior analysis can provide effective clues for the decision making of network administrators. Nonetheless, most existing studies rely heavily on hand-crafted features, and they also suffer from shallow information mining and a limited range of prediction targets. This paper proposes Argus, a hierarchical neural network solution for predicting Internet user demographics through traffic analysis. Argus is composed of an autoencoder for embedding and a fully connected network for prediction. In the embedding layer, high-level features of the input data are learned, with a customized regularization method to enforce their discriminative power. In the classification layer, the embeddings are converted into label predictions for the sample. An integrated loss function is provided to Argus for end-to-end learning and architecture control. Argus exhibits promising performance in experiments on a real-world dataset, where most of its metrics outperform those achieved by common machine learning techniques on multiple prediction targets. Further experiments reveal that the integrated loss function improves Argus's performance, and the contribution of a specific loss component during training is validated. Empirical hyperparameter settings are given based on the experiments.
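The architecture described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the layer sizes, the `tanh` activation, the weights `alpha` and `beta`, and all function names are hypothetical, and a plain L2 penalty on the embedding stands in for the customized regularization method, which the abstract does not specify. The sketch only shows how an autoencoder embedding, a fully connected classifier, and a single integrated loss might fit together.

```python
import math

def linear(weights, bias, vec):
    """Affine map; weights is a list of rows, one row per output unit."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def forward(params, x):
    """Encode x, then decode it and classify it from the same embedding."""
    embedding = [math.tanh(h)
                 for h in linear(params["enc_w"], params["enc_b"], x)]
    reconstruction = linear(params["dec_w"], params["dec_b"], embedding)
    probs = softmax(linear(params["clf_w"], params["clf_b"], embedding))
    return embedding, reconstruction, probs

def integrated_loss(x, label, embedding, reconstruction, probs,
                    alpha=1.0, beta=0.01):
    """Assumed form: reconstruction MSE + alpha * cross-entropy + beta * L2.

    The L2 term is a placeholder for the paper's customized regularizer.
    """
    recon = sum((a - b) ** 2 for a, b in zip(x, reconstruction)) / len(x)
    cross_entropy = -math.log(probs[label])
    reg = sum(e * e for e in embedding)
    return recon + alpha * cross_entropy + beta * reg

# Toy parameters (hypothetical): 3 input features, 2-d embedding, 2 classes.
params = {
    "enc_w": [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]],
    "enc_b": [0.0, 0.0],
    "dec_w": [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]],
    "dec_b": [0.0, 0.0, 0.0],
    "clf_w": [[1.0, -1.0], [-1.0, 1.0]],
    "clf_b": [0.0, 0.0],
}
x = [1.0, 0.5, -0.5]
embedding, reconstruction, probs = forward(params, x)
loss = integrated_loss(x, 0, embedding, reconstruction, probs)
```

Because all three terms are computed from the same embedding, a gradient step on this loss would update the encoder for reconstruction and classification at once, which is one plausible reading of "end-to-end learning" here.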



Summary

Introduction

The Internet is unquestionably a colossal information bank today, holding a vast amount of human information that touches every aspect of life. The information about Internet users (abbreviated as "users" for the rest of this paper) is so copious that information miners such as Internet service providers (ISPs) must leverage advanced mining techniques to obtain refined user information for better quality of service (QoS). Given a dataset containing usage traces (such as website access logs and network traffic), proper data mining techniques can infer user demographics to a certain degree, which has motivated a wide range of research into better techniques for mining user behavioral information. Among these techniques, traffic behavior based prediction of Internet user demographics (TPID) leverages the analysis of captured network traffic data to build up mapping
