Abstract
Web bots vary in sophistication depending on their purpose, ranging from simple automated scripts to advanced web bots that have a browser fingerprint, support the main browser functionalities, and exhibit humanlike behaviour. Advanced web bots are especially appealing to malicious web bot creators, because their browserlike fingerprint and humanlike behaviour reduce their detectability. This work proposes a web bot detection framework that comprises two detection modules: (i) a detection module that utilises web logs, and (ii) a detection module that leverages mouse movements. The framework combines the results of each module in a novel way to capture the different temporal characteristics of the web logs and the mouse movements, as well as the spatial characteristics of the mouse movements. We assess its effectiveness on web bots of two levels of evasiveness: (a) moderate web bots that have a browser fingerprint and (b) advanced web bots that have a browser fingerprint and also exhibit humanlike behaviour. We show that combining web logs with visitors' mouse movements is more effective and robust for detecting advanced web bots that try to evade detection than using either approach alone.
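The abstract states that the framework combines the outputs of the web-log module and the mouse-movement module, but the exact fusion rule is not detailed here. The sketch below is a minimal Python illustration under assumed names (SessionScores, fuse_scores) and an assumed weighted-average rule; it is not the paper's actual combination method.

```python
# Hypothetical sketch of a score-fusion step for a two-module bot detector.
# Names, weights, and thresholds are illustrative assumptions.

from dataclasses import dataclass


@dataclass
class SessionScores:
    web_log_score: float      # assumed P(bot) from the web-log detection module
    mouse_move_score: float   # assumed P(bot) from the mouse-movement detection module
    has_mouse_data: bool      # some sessions may contain no mouse activity at all


def fuse_scores(s: SessionScores,
                w_log: float = 0.5,
                w_mouse: float = 0.5,
                threshold: float = 0.5) -> bool:
    """Return True if the session is classified as a web bot (illustrative rule)."""
    if not s.has_mouse_data:
        # Fall back to the web-log module alone when no mouse movements exist.
        return s.web_log_score >= threshold
    combined = w_log * s.web_log_score + w_mouse * s.mouse_move_score
    return combined >= threshold


# Example usage
session = SessionScores(web_log_score=0.82, mouse_move_score=0.31, has_mouse_data=True)
print(fuse_scores(session))  # True, since the weighted score (0.565) crosses the threshold
```

A weighted average is only one plausible choice; the paper's contribution lies in how the temporal and spatial characteristics captured by each module are combined, which this sketch does not reproduce.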
Highlights
Web bots are an integral part of the web, since they allow the automation of several vital tasks, some of which would have otherwise been impossible to perform
The framework was tested in two phases: in the first phase, we evaluate its ability to detect advanced web bots as opposed to moderate ones (Section 7.1); in the second phase, we evaluate it in a more realistic scenario, where suspected moderate and advanced web bots cannot always be isolated before being passed to the detection models (Section 7.2)
This work proposes a framework for detecting web bots that present a browser fingerprint and exhibit humanlike behaviour
Summary
Web bots are an integral part of the web, since they allow the automation of several vital tasks, some of which would otherwise have been impossible to perform. They are responsible for numerous browsing automation processes, such as web indexing, website monitoring (validation of hyperlinks and HTML code), data extraction for commercial purposes, and fetching web content for feeds. Some of these tasks require web bots to visit web servers repeatedly and, in some cases, for prolonged periods of time. Early versions of web bots were simple scripts [14]. These bots were easy to detect by comparing their fingerprints with those of common browsers. The introduction of browsing automation tools, such as Selenium, enabled the effortless creation of more sophisticated web bots that support the majority of the features that common browsers offer.
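As a rough illustration of why simple scripted bots were easy to detect, the Python sketch below compares request headers against tokens typical of common browsers and of well-known automation clients. The token lists and rules are illustrative assumptions, not the detection method proposed in this work, and they would not catch Selenium-driven bots that present a genuine browser fingerprint.

```python
# Illustrative sketch: a simple script's HTTP fingerprint (e.g. its User-Agent,
# missing headers) usually does not match that of a common browser.
# Token lists and rules below are assumptions for illustration only.

KNOWN_BROWSER_TOKENS = ("chrome", "firefox", "safari", "edg", "opera")
KNOWN_BOT_TOKENS = ("python-requests", "curl", "wget", "scrapy", "bot", "spider")


def looks_like_simple_bot(headers: dict) -> bool:
    ua = headers.get("User-Agent", "").lower()
    if not ua:
        return True  # browsers always send a User-Agent
    if any(tok in ua for tok in KNOWN_BOT_TOKENS):
        return True  # self-identified automation client
    if not any(tok in ua for tok in KNOWN_BROWSER_TOKENS):
        return True  # does not claim to be any mainstream browser
    # Browsers routinely send Accept-Language; barebones scripts often omit it.
    return "Accept-Language" not in headers


print(looks_like_simple_bot({"User-Agent": "python-requests/2.31.0"}))  # True
print(looks_like_simple_bot({
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/120.0 Safari/537.36",
    "Accept-Language": "en-US,en;q=0.9",
}))  # False
```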