Abstract

Attacks against websites are increasing rapidly with the expansion of web services. The growing diversity of web services makes such attacks difficult to prevent because websites contain many known vulnerabilities. To address this problem, it is necessary to collect the latest attacks using decoy web honeypots and to implement countermeasures against malicious threats. Web honeypots collect not only malicious accesses by attackers but also benign accesses, such as those by web search crawlers. Thus, it is essential to develop a means of automatically identifying malicious accesses in collected data that mixes malicious and benign accesses. In this study, we focus on the detection of crawlers, whose accesses have been increasing rapidly. A related study proposed a crawler detection scheme in which crawlers are identified based on the features of well-known crawlers such as Google crawlers. However, the diversity of crawler accesses has been increasing rapidly, and adapting to that diversity is a challenging task. Therefore, we have adapted AntTree, a bio-inspired clustering scheme that has high scalability and adaptability, for crawler detection. Through evaluations using data collected in a real network, we show that AntTree can detect crawlers more precisely than a conventional scheme.
