Abstract

Sophisticated Web robots sport a wide variety of functionality and visiting characteristics, constituting a significant percentage of the requests serviced by a Web server. Unlike human clients that retrieve information off a site by navigating links and ignoring irrelevant information, Web robots may collect many different types of resources, and employ varying navigation strategies to find the knowledge on the site they desire. Thus, the resource request patterns of their visits are unpredictable and cannot be inferred based on our knowledge of human request patterns. In this paper, we perform an analysis on the types of resources requested by Web robots using recent Web logs from an academic Web server. We study the distribution of response sizes and response codes, the types of resources requested, and popularity of resources for requests from Web robots. Throughout, we contrast our findings against human resource request patterns. We find reasons to suggest that robots severely handicaps the ability of Web server caches to operate with high performance.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.