Abstract
The prevalence of HTTP web traffic on the Internet has long extended beyond its layer 7 classification to other layers, such as layer 5 of the OSI model. This, coupled with the integration diversity of other layers and application-layer protocols, has made the identification of user-initiated HTTP web traffic complex, thus increasing user anonymity on the Internet. This study reveals that, despite the complex nature of current Internet and HTTP traffic, browser complexity, dynamic web programming structures, the surge in network delay, and unstable user behavior in network interaction, user-initiated requests can be accurately determined. The study filters on the HTTP GET request method to develop a heuristic algorithm that identifies user-initiated requests. The algorithm was tested experimentally on a group of users to ascertain how reliably user-initiated requests can be identified. The results demonstrate that user-initiated HTTP requests can be reliably identified, with a recall of 0.94 and an F-measure of 0.969. Additionally, this study extends the paradigm of user identification based on the intrinsic characteristics that users exhibit in network traffic. These findings are relevant to user identification for insider investigation, e-commerce, and e-learning systems, as well as to network planning and management. Further, the findings are relevant to web usage mining, where user-initiated action is the fundamental unit of measurement.
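As a reading aid for the reported figures, the sketch below shows how recall, precision, and the F-measure relate. It assumes the reported F-measure is the balanced F1 score (the harmonic mean of precision and recall), which the abstract does not state explicitly. The function names are illustrative, and the precision it derives is only an implication of the two rounded figures, not a value reported by the study.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """General F-beta score; beta = 1 gives the balanced F1 (assumed here)."""
    b2 = beta * beta
    denom = b2 * precision + recall
    if denom == 0:
        return 0.0
    return (1 + b2) * precision * recall / denom


def implied_precision(f1: float, recall: float) -> float:
    """Solve F1 = 2PR / (P + R) for P, giving P = F1 * R / (2R - F1)."""
    denom = 2 * recall - f1
    if denom <= 0:
        raise ValueError("no valid precision exists for these F1 and recall values")
    return f1 * recall / denom


# Figures taken from the abstract; both are rounded, so the result is approximate.
reported_recall = 0.94
reported_f1 = 0.969

precision = implied_precision(reported_f1, reported_recall)
print(f"implied precision: {precision:.4f}")
print(f"F1 at precision 1.0: {f_measure(1.0, reported_recall):.3f}")
```

Under the F1 assumption, the two reported values are consistent with a precision very close to 1, meaning that almost every request flagged as user-initiated was in fact user-initiated, while roughly 6% of the true user-initiated requests were missed.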
Highlights
User identification transcends the branded tags of network and application layer identifiers of the TCP/IP stack
Reliance on unique identifiers at best yields a profile of the system and a generic usage profile of the user, which further complicates the task of unveiling anonymity
Behavioral inference (Fan et al 2014; Yang 2010; Adeyemi et al 2014), biometric dynamics (Ernsberger et al 2017; Ikuesan and Venter 2018; Ikuesan et al 2019), and network traffic analysis (Li et al 2013a; Melnikov and Schönwälder 2010a; Adeyemi et al 2016) are methods that research has adapted for user identification through pattern extraction
Summary
User identification transcends the branded tags of network and application layer identifiers of the TCP/IP stack. User profiling methods (Yang 2010; Adeyemi et al 2016) attempt to establish facts based on the assumed or predefined salient features of the subject or object under observation. Such behavior-knowledge-based identifiers are applicable to network traffic profiling (Li et al 2013a; Hlavacs et al 1999). This study attempts to develop a heuristic methodology that can cut through the estimation biases in HTTP traffic modeling through empirical observation and statistical correlation of actual user-initiated requests. To the best of our understanding, this is the first study that presents user-initiated HTTP traffic extraction that can be directly applied to practical web usage profiling and user behavior modeling. This approach falls within the general knowledge of intelligent technology and analytics, as well as the complexities of human and artificial systems, all within the broad domain of intelligent systems integration.
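To make the idea of GET filtering concrete, the following is a minimal sketch of one possible heuristic. It is not the study's algorithm, whose rules are not given in this summary. The `Request` record, the `candidate_user_requests` function, and the one-second idle-gap threshold are all hypothetical: the sketch keeps only GET requests and treats a GET as a candidate user-initiated request when it follows a quiet period, on the assumption that embedded resources are fetched in a rapid burst after the page a user actually clicked.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Request:
    timestamp: float  # seconds since the start of the capture
    method: str       # HTTP request method, e.g. "GET" or "POST"
    url: str


# Hypothetical threshold: a GET arriving after at least this much silence
# is treated as a candidate user-initiated request.
IDLE_GAP_SECONDS = 1.0


def candidate_user_requests(
    requests: List[Request], idle_gap: float = IDLE_GAP_SECONDS
) -> List[Request]:
    """Keep GET requests that start a new burst of traffic (illustrative rule)."""
    gets = sorted(
        (r for r in requests if r.method.upper() == "GET"),
        key=lambda r: r.timestamp,
    )
    selected: List[Request] = []
    last_ts = None
    for r in gets:
        # The first GET, or a GET after a quiet period, starts a new burst.
        if last_ts is None or r.timestamp - last_ts >= idle_gap:
            selected.append(r)
        last_ts = r.timestamp
    return selected


# Made-up trace: two page visits, each followed by automatic fetches.
trace = [
    Request(0.00, "GET", "/index.html"),
    Request(0.10, "GET", "/style.css"),
    Request(0.20, "GET", "/logo.png"),
    Request(0.30, "POST", "/telemetry"),
    Request(5.00, "GET", "/news.html"),
    Request(5.05, "GET", "/news.js"),
]

for r in candidate_user_requests(trace):
    print(r.timestamp, r.url)
```

A single timing threshold is deliberately simplistic. The factors the abstract names, such as network delay, dynamic page structure, and unstable user behavior, are exactly what make a fixed gap unreliable, which is why the study validates its heuristic empirically against actual user-initiated requests.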