Abstract

Let S be a string over a finite, ordered alphabet Σ. For any substring S′ of S, the set of distinct characters contained in S′ is called its fingerprint. The text fingerprinting indexing problem is to preprocess S into a data structure so that, given any subset C of Σ, the following queries can be answered efficiently: (1) determine whether C is the fingerprint of some substring of S; (2) find all maximal substrings of S whose fingerprint is C. The best known results answer these two queries in Θ(|Σ|) and Θ(|Σ| + K) time, respectively, where K is the number of maximal substrings. In this paper, we propose two improved algorithms for the text fingerprinting indexing problem. The first answers the two queries in O(|C| log n) and O(|C| log n + K) time, respectively, where n is the length of S. The second further reduces the query times to O(|C| log(|Σ|/|C|)) and O(|C| log(|Σ|/|C|) + K). Both results answer an open problem posed by Amir et al.
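
To make the two queries concrete, the following is a minimal brute-force sketch that scans S once per query, so each query costs O(n) time. It is not the indexed data structure proposed in the paper and does not achieve the stated bounds; it only illustrates the definitions. The function names `maximal_locations` and `is_fingerprint` are hypothetical, and a substring is taken to be maximal when it cannot be extended to the left or right without adding a character outside C.

```python
def maximal_locations(s, c):
    """Return inclusive (start, end) index pairs of the maximal substrings
    of s whose fingerprint (set of distinct characters) equals c.

    Brute-force illustration only: one linear scan of s per query.
    """
    c = set(c)
    result = []
    i, n = 0, len(s)
    while i < n:
        if s[i] not in c:
            i += 1
            continue
        # Extend to the right while characters stay inside c.
        j = i
        while j + 1 < n and s[j + 1] in c:
            j += 1
        # The run s[i..j] is maximal; keep it only if it uses every character of c.
        if set(s[i:j + 1]) == c:
            result.append((i, j))
        i = j + 1
    return result


def is_fingerprint(s, c):
    """Query (1): is c the fingerprint of some substring of s?"""
    return bool(maximal_locations(s, c))


if __name__ == "__main__":
    text = "abacabc"
    print(maximal_locations(text, {"a", "b"}))
    print(is_fingerprint(text, {"a", "d"}))
```

For the sample string `abacabc` and C = {a, b}, the maximal substrings are `aba` and `ab`, each bounded by a `c` or by the end of the string.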
