Abstract

Text analytics has become increasingly important in the past few years because of substantial growth in research, business, and government needs. An efficient text analytics system is likely to require high-performance regular expression matching (REGEX), as REGEX operations dominate the overall execution time. Some approaches have exploited the parallelism of graphics processing units (GPUs) and field-programmable gate arrays (FPGAs) to boost REGEX performance. Nevertheless, those approaches still use finite-state automata to detect the given patterns, even though the automaton structure is inherently ill-suited to parallel processing. In this paper, we propose a completely different hardware architecture for REGEX that employs a bitmap index instead of a finite-state automaton. The internal logic gates and registers of the FPGA are used to construct the query processing units, and its embedded memory holds the bitmap index. Experimental results on an Intel Arria V FPGA show that our REGEX is fully operational at 100 MHz and can process a 64-character query against 64 KB of text data within 43.76 μs. The throughput achieved therefore reaches 11.98 Gbps.
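
To make the bitmap-index idea concrete, the following is a minimal software sketch, not the hardware design proposed in the paper. It assumes a per-character bitmap (bit i is set when position i of the text holds that character) and matches only a literal pattern by shifting and ANDing those bitmaps; the function names `build_bitmap_index`, `find_literal`, and `throughput_gbps` are hypothetical. The last helper simply reproduces the throughput arithmetic from the reported figures (64 KB of text processed in 43.76 μs).

```python
from typing import Dict, List


def build_bitmap_index(text: str) -> Dict[str, int]:
    """Map each character to an int whose bit i is set when text[i] is that character."""
    index: Dict[str, int] = {}
    for i, ch in enumerate(text):
        index[ch] = index.get(ch, 0) | (1 << i)
    return index


def find_literal(index: Dict[str, int], text_len: int, pattern: str) -> List[int]:
    """Return start positions of a literal pattern (illustrative assumption:
    no wildcards or repetition operators are handled here)."""
    if not pattern or len(pattern) > text_len:
        return []
    # Bit i of `candidates` means "a match may still start at position i".
    candidates = (1 << (text_len - len(pattern) + 1)) - 1
    for offset, ch in enumerate(pattern):
        # Shifting the bitmap right by `offset` aligns the character at
        # position i + offset with the candidate start position i.
        candidates &= index.get(ch, 0) >> offset
        if candidates == 0:
            break
    return [i for i in range(text_len) if (candidates >> i) & 1]


def throughput_gbps(num_bytes: int, seconds: float) -> float:
    """Throughput in gigabits per second for num_bytes processed in `seconds`."""
    return num_bytes * 8 / seconds / 1e9


text = "abracadabra"
index = build_bitmap_index(text)
positions = find_literal(index, len(text), "abra")
reported = round(throughput_gbps(64 * 1024, 43.76e-6), 2)
```

Each step of the loop is a bitwise AND over the whole text at once, with no state carried from one text position to the next, which is the property that makes an index-based approach easier to parallelize than stepping an automaton character by character.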
