Abstract
Recent academic and industry reports confirm that web robots dominate the traffic seen by web servers across the Internet. Because web robots crawl in an unregulated fashion, they may threaten the privacy, function, performance, and security of web servers. There is therefore a growing need to identify robot visitors automatically, both offline and in real time, to assess their impact and to protect web servers from abusive bots. Yet contemporary detection approaches, which rely on syntactic log analysis, statistical variations between robot and human traffic, analytical learning techniques, or complex software modifications, may not be realistic to implement or remain effective as the behavior of robots evolves over time. Instead, this paper presents a novel detection approach that relies on the differences in the resource request patterns of web robots and humans. It rationalizes why differences in resource request patterns are expected to remain intrinsic to robots and humans despite the continuous evolution of their traffic. The performance of the approach, adoptable for both offline and real-time settings with a simple implementation, is demonstrated by playing back streams of actual web traffic with varying session lengths and proportions of robot requests.
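To illustrate the intuition behind detection via resource request patterns, the sketch below computes the distribution of resource types requested within a session. The resource categories, extension mapping, and session examples are hypothetical illustrations, not the paper's actual taxonomy or algorithm: a browser rendering a page typically also fetches embedded images and stylesheets, while a text-oriented crawler may request only HTML, yielding visibly different distributions.

```python
from collections import Counter

# Hypothetical extension-to-type mapping; the paper's actual resource
# taxonomy is not reproduced here.
RESOURCE_TYPES = {
    ".html": "web", ".htm": "web",
    ".jpg": "img", ".png": "img", ".gif": "img",
    ".css": "style", ".js": "script",
    ".pdf": "doc", ".txt": "doc",
}

def request_pattern(urls):
    """Return the normalized distribution of resource types in a session."""
    counts = Counter()
    for url in urls:
        ext = "." + url.rsplit(".", 1)[-1].lower() if "." in url else ""
        counts[RESOURCE_TYPES.get(ext, "other")] += 1
    total = sum(counts.values())
    return {rtype: n / total for rtype, n in counts.items()}

# A browser session pulls embedded images and styles alongside pages;
# a crawler fetching only HTML shows a very different distribution.
human_like = request_pattern(["/a.html", "/logo.png", "/site.css", "/b.html"])
robot_like = request_pattern(["/a.html", "/b.html", "/c.html"])
```

A detector built on this idea would compare a session's distribution against learned robot and human profiles; the abstract's claim is that such patterns stay discriminative even as individual robot behaviors evolve.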