Abstract

In this paper, building on our previous multi-pattern uniform resource locator (URL) binary-matching algorithm, HEM, we propose an improved multi-pattern matching algorithm called MH that is based on hash tables and binary tables. The MH algorithm can be applied in network security, data analysis, load balancing, cloud robotic communications, and other fields that require string matching from a fixed starting position. Our approach effectively addresses the performance problems of the classical multi-pattern matching algorithms. We explore ways to improve string matching performance under HTTP by combining a hash method with a binary method, which transforms the symbol-space matching problem into a numerical comparison and hashing problem in digital space. The MH approach matches quickly, requires little memory, and performs better than both the classical algorithms and HEM when matching fields in an HTTP stream, which makes it promising for real-world applications.

Highlights

  • Multiple-pattern string matching algorithms based on uniform resource locator (URL) rule sets are widely used in firewalls, network traffic analysis, data acquisition, web server load balancing, firewall blacklists, e-mail classification, spam detection, intrusion detection, URL-based content classification [1], and other fields

  • Robotics research has widened from an original focus on a single robot to controlling multiple robots simultaneously, including swarm robotics [2], network robotics [3], and cloud robotics [4,5,6,7]

  • MH combines a hash method with a binary search method to match the target data, an approach that improves matching speed while reducing memory overhead
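
The idea of combining hashing with binary search for matching from a fixed starting position can be illustrated with a minimal sketch. This is not the authors' MH implementation: the bucket key (leading bytes plus pattern length), the constant `PREFIX_LEN`, and the helper names `to_number`, `build_tables`, and `match_prefix` are all assumptions made for illustration. Each pattern is converted to an integer, so a symbol comparison becomes a numerical comparison, and a hash lookup narrows the search to one small sorted table.

```python
from bisect import bisect_left
from collections import defaultdict

PREFIX_LEN = 4  # hypothetical: number of leading bytes used in the hash key


def to_number(data: bytes) -> int:
    """Interpret a byte string as a big-endian integer."""
    return int.from_bytes(data, "big")


def build_tables(patterns):
    """Hash patterns into buckets keyed by (leading bytes, length).

    Each bucket is a sorted list of integers, ready for binary search.
    Returns the bucket dictionary and the sorted list of pattern lengths.
    """
    buckets = defaultdict(list)
    for pattern in patterns:
        raw = pattern.encode("ascii")
        buckets[(raw[:PREFIX_LEN], len(raw))].append(to_number(raw))
    tables = {key: sorted(set(values)) for key, values in buckets.items()}
    lengths = sorted({len(p.encode("ascii")) for p in patterns})
    return tables, lengths


def match_prefix(url, tables, lengths):
    """Return every pattern that matches the start of `url`."""
    raw = url.encode("ascii")
    hits = []
    for n in lengths:
        if n > len(raw):
            break  # remaining patterns are longer than the input
        candidate = raw[:n]
        bucket = tables.get((candidate[:PREFIX_LEN], n))
        if not bucket:
            continue  # hash lookup rules out this length immediately
        value = to_number(candidate)
        i = bisect_left(bucket, value)  # binary search on numeric values
        if i < len(bucket) and bucket[i] == value:
            hits.append(candidate.decode("ascii"))
    return hits


tables, lengths = build_tables(["/img/", "/img/logo", "/api/v1/", "/api"])
print(match_prefix("/img/logo.png", tables, lengths))
```

In this sketch the hash lookup discards most candidate lengths without any comparison, and the binary search only runs inside a bucket whose leading bytes and length already agree with the input.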


Summary

Introduction

Multiple-pattern string matching algorithms based on uniform resource locator (URL) rule sets are widely used in firewalls, network traffic analysis, data acquisition, web server load balancing, firewall blacklists, e-mail classification, spam detection, intrusion detection, URL-based content classification [1], and other fields. Robotics research has widened from an original focus on a single robot to controlling multiple robots simultaneously, including swarm robotics [2], network robotics [3], and cloud robotics [4,5,6,7]. These robotics areas have similar requirements for performing URL or string matching based on specific protocols. As the network data-flow rate has increased year over year, each of these areas requires an algorithm that can handle tens of thousands or even millions of rules while still achieving a processing capacity of 10 Gbps. To meet such demands, we conducted a series of explorations into multi-pattern string matching based on the characteristics of the HTTP protocol, leading to a multi-pattern hash-binary hybrid algorithm for URL matching.

Introduction to the HTTP protocol
Related work
Conclusions