Abstract

In the Web of Things (WoT) environment, Web traffic logs contain valuable information about how people interact with smart devices and Web servers. Mining Web access logs has theoretical and practical significance for important applications such as network optimization and security management. The first critical step of this mining task is to model the relationships among HyperText Transfer Protocol (HTTP) requests for Web objects, so that the behavior of Web clients can be investigated. In this paper, we introduce the request dependency graph (RDG), a graph representation of the relationships among HTTP requests. Conceptually, a directed link from A to B in the graph means that the access to Web object B is caused by the access to A, i.e., B depends on A. We propose a methodology for building such a graph by mining the temporal and causal information among aggregated HTTP requests. To demonstrate the value and effectiveness of the proposed model, we design and implement an RDG-based algorithm for primary request identification, a critical task in Web usage mining. Evaluation results on a large-scale real-world Web access log show that the RDG is a useful tool for Web usage mining.
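To make the notion of a directed link from A to B concrete, the following is a minimal sketch of how such a graph might be assembled from a log. It is not the methodology of the paper: it assumes a simplified log of (client, timestamp, url) tuples sorted by time, uses plain temporal co-occurrence within a fixed window as a stand-in for the temporal and causal mining step, and the names `build_rdg`, `window`, and `min_support` are hypothetical.

```python
from collections import defaultdict


def build_rdg(requests, window=2.0, min_support=2):
    """Build a simplified request dependency graph.

    requests: iterable of (client, timestamp, url) tuples, sorted by timestamp.
    window: assumed maximum gap in seconds between a request and a dependent one.
    min_support: assumed minimum number of aggregated observations for an edge.
    Returns a dict mapping url A to the set of urls B with an edge A -> B.
    """
    pair_counts = defaultdict(int)   # (A, B) -> number of times B followed A
    recent = defaultdict(list)       # client -> [(timestamp, url)] inside the window

    for client, ts, url in requests:
        # Drop this client's requests that fall outside the time window.
        recent[client] = [(t, u) for t, u in recent[client] if ts - t <= window]
        counted = set()
        for _, prev in recent[client]:
            # Count each distinct predecessor once per request.
            if prev != url and prev not in counted:
                pair_counts[(prev, url)] += 1
                counted.add(prev)
        recent[client].append((ts, url))

    # Keep only edges observed often enough across the aggregated log.
    graph = defaultdict(set)
    for (a, b), count in pair_counts.items():
        if count >= min_support:
            graph[a].add(b)
    return dict(graph)


# Hypothetical log: two clients fetch a page and then its stylesheet.
log = [
    ("c1", 0.0, "/index.html"),
    ("c1", 0.3, "/style.css"),
    ("c2", 5.0, "/index.html"),
    ("c2", 5.2, "/style.css"),
    ("c1", 10.0, "/about.html"),
]
rdg = build_rdg(log)
```

In this toy log, only the pair (/index.html, /style.css) recurs across clients within the window, so it is the only edge that survives the support threshold. Aggregating over many clients is what lets such a heuristic separate recurring dependencies from requests that merely happen to be close in time.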
