A heuristics for HTTP traffic identification in measuring user dissimilarity

Adeyemi R Ikuesan, Mazleena Salleh, Hein S Venter, Steven M Furnell, Shukor Abd Razak

doi:10.1007/s42454-020-00010-2

Abstract

The prevalence of HTTP web traffic on the Internet has long transcended the layer 7 classification, extending to layers such as layer 5 of the OSI model stack. This, coupled with the integration and diversity of other layers and application-layer protocols, has made the identification of user-initiated HTTP web traffic complex, thus increasing user anonymity on the Internet. This study shows that, despite the current complexity of Internet and HTTP traffic, browser complexity, dynamic web programming structures, the surge in network delay, and unstable user behavior in network interaction, user-initiated requests can be accurately determined. The study filters on the HTTP GET request method to develop a heuristic algorithm that identifies user-initiated requests. The algorithm was experimentally tested on a group of users to assess how reliably user-initiated requests can be identified. The results demonstrate that user-initiated HTTP requests can be reliably identified, with a recall of 0.94 and an F-measure of 0.969. Additionally, this study extends the paradigm of user identification based on the intrinsic characteristics of users exhibited in network traffic. These findings are relevant to user identification for insider investigation, e-commerce, and e-learning systems, as well as to network planning and management. Further, the findings are relevant to web usage mining, where a user-initiated action is the fundamental unit of measurement.
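
The abstract names GET filtering and an evaluation by recall and F-measure, but it does not spell out the algorithm. As an illustration only, the following Python sketch shows what a GET-based filter and its scoring against labeled ground truth might look like. The `Request` fields, the `text/html` check, and the one-second `min_gap` threshold are assumptions made for this example; they are not the authors' published rules.

```python
from dataclasses import dataclass


@dataclass
class Request:
    method: str           # HTTP request method, e.g. "GET" or "POST"
    timestamp: float      # seconds since the start of the capture
    content_type: str     # Content-Type of the response
    user_initiated: bool  # ground-truth label (hypothetical field)


def looks_user_initiated(req, last_kept_ts, min_gap=1.0):
    """Hypothetical rule: a GET for an HTML page that arrives at least
    min_gap seconds after the previously kept request."""
    if req.method != "GET":
        return False
    if not req.content_type.startswith("text/html"):
        return False
    return last_kept_ts is None or req.timestamp - last_kept_ts >= min_gap


def classify(requests):
    """Return (request, predicted_label) pairs in time order."""
    predictions, last_kept_ts = [], None
    for req in sorted(requests, key=lambda r: r.timestamp):
        keep = looks_user_initiated(req, last_kept_ts)
        if keep:
            last_kept_ts = req.timestamp
        predictions.append((req, keep))
    return predictions


def scores(predictions):
    """Precision, recall and balanced F-measure (F1) against ground truth."""
    tp = sum(1 for r, p in predictions if p and r.user_initiated)
    fp = sum(1 for r, p in predictions if p and not r.user_initiated)
    fn = sum(1 for r, p in predictions if not p and r.user_initiated)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def implied_precision(recall, f1):
    """Solve F1 = 2PR / (P + R) for P, assuming the F-measure is F1."""
    return f1 * recall / (2 * recall - f1)


# Toy trace (invented data): page loads, embedded objects, one fast click.
trace = [
    Request("GET", 0.0, "text/html", True),
    Request("GET", 0.1, "image/png", False),
    Request("GET", 0.2, "text/html", False),   # embedded frame
    Request("POST", 0.5, "application/json", False),
    Request("GET", 5.0, "text/html", True),
    Request("GET", 5.4, "text/html", True),    # fast click, missed by the rule
    Request("GET", 9.0, "text/html; charset=utf-8", True),
]
precision, recall, f1 = scores(classify(trace))
```

On this invented trace the rule misses the fast second click, which is the kind of error that lowers recall while leaving precision untouched. If the reported F-measure is the balanced F1, `implied_precision(0.94, 0.969)` gives a precision of roughly 0.9998; this is arithmetic on the two published figures, not a value stated in the abstract.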

Highlights

User identification transcends the branded tags of network and application layer identifiers of the TCP/IP stack.
Reliance on unique identifiers gives, at best, a profile of the system and a generic usage profile of the user, which makes unveiling anonymity even harder.
Research on user identification draws on behavioral inference (Fan et al 2014; Yang 2010; Adeyemi et al 2014), biometric dynamics (Ernsberger et al 2017; Ikuesan and Venter 2018; Ikuesan et al 2019), and network traffic analysis (Li et al 2013a; Melnikov and Schönwälder 2010a; Adeyemi et al 2016), all of which are methods adapted for user identification through pattern extraction.

Summary

Introduction

User identification transcends the branded tags of network and application layer identifiers of the TCP/IP stack. User profiling methods (Yang 2010; Adeyemi et al 2016) attempt to establish facts based on the assumed or predefined salient features of the subject or object under observation. Such behavior-knowledge-based identifiers are applicable to network traffic profiling (Li et al 2013a; Hlavacs et al 1999). This study attempts to develop a heuristic methodology that can cut through the estimation biases in HTTP traffic modeling, through empirical observation and statistical correlation of actual user-initiated requests. To the best of our knowledge, this is the first study to present user-initiated HTTP traffic extraction that can be directly applied to practical web usage profiling and user behavior modeling. This approach falls within the general field of intelligent technology and analytics, as well as the complexities of human and artificial systems, all within the broad domain of intelligent systems integration.

Related work

Theoretical background

Intelligent-heuristic methodology

Ground truth for data analysis

Methodology for user traffic pattern dissimilarity measurement

Result and discussion

Conclusion and future work

Compliance with ethical standards

Journal: Human-Intelligent Systems Integration
Publication Date: Jun 2, 2020
Citations: 6
License type: open-access

Similar Papers

Interval Set Clustering of Web Users with Rough K-Means
Pawan Lingras ... Chad West
Journal of Intelligent Information Systems | VOL. 23
01 Jul 2004

Performance Evaluation of the MapReduce-based Parallel Data Preprocessing Algorithm in Web Usage Mining with Robot Detection Approaches
Mitali Srivastava ... P K Mishra
IETE Technical Review | VOL. 39
28 Apr 2021

Techniques for Understanding User Usage Behavior on the Internet
Abhijit R Joshi ... Aparna Ranade-Halbe
International Journal of Computer Applications | VOL. 92
18 Apr 2014

Research on the Application of Web Mining in E-Learning System
Long Wang ... Liangdong Qu
01 Jan 2009
