Abstract
Compressed pattern matching (CPM) is the task of locating all occurrences of a pattern (or a set of patterns) in compressed text. The pattern itself may or may not be compressed. CPM is particularly useful for handling large volumes of data, especially data transmitted over a network. Its applications include computational biology, where it helps find similar trends in DNA sequences, as well as network intrusion detection and big data analytics. Many existing solutions match the pattern directly against the uncompressed text; such solutions require a lot of space and time when handling big data. Researchers have proposed many efficient compression schemes, but very few techniques exist for pattern matching over compressed text. As data volumes continue to grow rapidly, CPM has become increasingly desirable. This paper presents a critical review of recent CPM techniques. The techniques covered are word-based Huffman codes, word-based tagged codes, and wavelet-tree-based indexing. We present a comparative analysis of these techniques and highlight their advantages and disadvantages.