Abstract

Membership determination of text strings is an important procedure for analyzing large volumes of textual data, especially when time is a crucial factor. The Bloom filter is a well-known approach to this problem because of its succinct structure and simple determination procedure. As membership determination with classification becomes increasingly desirable, parallel Bloom filters are often implemented to meet the additional classification requirement. Parallel Bloom filters, however, tend to produce additional false-positive errors, since membership determination must be performed on each of the parallel layers. We propose a scheme based on the cerebellar model articulation controller (CMAC), a neural network mapping, which requires only a single-layer calculation to obtain both membership and classification information simultaneously. A hash function specifically designed for text strings is also proposed. The proposed scheme can effectively reduce false-positive errors by converging the range of membership acceptance to the minimum for each class during the neural network mapping. Simulation results show that the proposed scheme committed significantly fewer errors than the benchmark, parallel Bloom filters, with limited and identical memory usage at different classification levels.
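The benchmark arrangement described above can be illustrated with a minimal sketch: one Bloom filter layer per class, with every layer checked for each query. The class names `BloomFilter` and `ParallelBloomFilters`, the salted SHA-256 hashing, and the size parameters are assumptions made for illustration; they are not the hash function or configuration used in the paper.

```python
import hashlib


class BloomFilter:
    """A plain Bloom filter over text strings (illustrative, not the paper's)."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for readability

    def _positions(self, text):
        # Assumption: salted SHA-256 stands in for the k hash functions.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{text}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, text):
        for pos in self._positions(text):
            self.bits[pos] = 1

    def __contains__(self, text):
        # True if every hashed position is set; may be a false positive.
        return all(self.bits[pos] for pos in self._positions(text))


class ParallelBloomFilters:
    """One Bloom filter layer per class; a query checks every layer."""

    def __init__(self, class_labels, num_bits, num_hashes):
        self.layers = {
            label: BloomFilter(num_bits, num_hashes) for label in class_labels
        }

    def add(self, text, label):
        self.layers[label].add(text)

    def query(self, text):
        # Each layer is an independent chance of a false positive, so the
        # error opportunities grow with the number of classes.
        return [label for label, layer in self.layers.items() if text in layer]
```

A query returns every class whose layer reports a match. A member is always reported under its own class, but it may additionally be reported under other classes, and a non-member may be reported under any of them.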

Highlights

  • We propose a scheme based on cerebellar model articulation controller (CMAC), a neural network mapping, which only requires a single-layer calculation to simultaneously obtain information of both the membership and classification

  • Text strings are widely used as identifiers in our daily lives, such as Internet access accounts and passwords, email addresses, car license plates and credit cards, and are also employed for coding parts, processes and products in manufacturing systems as well as service industries

  • To avoid multilayer membership checking for determining membership with classification and meet the performance criteria, we propose a scheme based on a neural network mapping known as the cerebellar model articulation controller (CMAC) [13], which can provide membership and classification information with a single-layer calculation


Summary

Introduction

Text strings are widely used as identifiers in our daily lives, such as Internet access accounts and passwords, email addresses, car license plates and credit cards, and are also employed for coding parts, processes and products in manufacturing systems as well as service industries.

A combinatorial Bloom filter was proposed that uses multiple sets of hash functions to code an input element into a binary array as the group identity of the element. This approach, utilizing only a single-bit array, is capable of achieving membership determination with classification by using a considerable number of hash functions.

CMAC was first developed for controlling robotic manipulators and was later recognized as a neural network paradigm due to its capabilities of learning and generalization. This particular type of neural network mapping comprises an associative array of real numbers, whereas the Bloom filter is composed of bit values in a similar layer.

Parallel Bloom filters versus proposed scheme

In the parallel Bloom filters, a positive result on a layer indicates that the query string is a member with whatever class the corresponding layer designates. Such a result could also be induced by non-member strings, in which case it is referred to as a false-positive error.

The programming, or learning, phase is performed in an off-line mode to obtain a suitable array vector, while the checking phase is an on-line operation that takes query strings and provides prompt responses.
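The two phases described above can be sketched with a generic CMAC-style mapping: an associative array of real-valued weights, an off-line learning loop, and an on-line check that needs only one pass over a single layer. This is a hypothetical illustration, not the scheme from the paper. The class `CmacStringClassifier`, the salted SHA-256 hashing, the numeric class targets, the least-mean-squares update, and the fixed `tolerance` are all assumptions.

```python
import hashlib


class CmacStringClassifier:
    """CMAC-style single-layer mapping for strings (hypothetical sketch)."""

    def __init__(self, num_cells, num_active, tolerance=0.25):
        self.num_cells = num_cells
        self.num_active = num_active
        self.tolerance = tolerance        # assumed acceptance range per class
        self.weights = [0.0] * num_cells  # associative array of real numbers

    def _cells(self, text):
        # Assumption: salted SHA-256 selects the activated cells.
        # Duplicates are dropped so each cell is counted once.
        cells = set()
        for seed in range(self.num_active):
            digest = hashlib.sha256(f"{seed}:{text}".encode("utf-8")).digest()
            cells.add(int.from_bytes(digest[:8], "big") % self.num_cells)
        return sorted(cells)

    def output(self, text):
        # Single-layer calculation: sum the weights of the activated cells.
        return sum(self.weights[c] for c in self._cells(text))

    def train(self, labelled, epochs=200, rate=0.5):
        # Off-line learning phase: spread each output error evenly over
        # the activated cells (standard LMS-style CMAC update).
        for _ in range(epochs):
            for text, target in labelled:
                cells = self._cells(text)
                error = target - sum(self.weights[c] for c in cells)
                for c in cells:
                    self.weights[c] += rate * error / len(cells)

    def check(self, text, targets):
        # On-line checking phase: one output value yields both membership
        # and class. Returns the matched class target, or None.
        y = self.output(text)
        for target in targets:
            if abs(y - target) <= self.tolerance:
                return target
        return None
```

In this sketch each class is encoded as a distinct nonzero target value (for example 1.0 and 2.0), so a query whose output falls outside every acceptance range is rejected as a non-member. Narrowing `tolerance` shrinks the acceptance range for each class, which is the lever for reducing false positives in this illustration.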

The parallel Bloom filters
The proposed scheme
Experimental results
Conclusion
