Abstract

This paper revisits the problem of indexing a text \(S[1..n]\) to support searching substrings in \(S\) that match a given pattern \(P[1..m]\) with at most \(k\) errors. A naive solution either has a worst-case matching time complexity of \(\Omega(m^k)\) or requires \(\Omega(n^k)\) space. Devising a solution with better performance has been a challenge until Cole et al. [5] showed an \(O(n \log^k n)\)-space index that can support \(k\)-error matching in \(O(m + occ + \log^k n \log\log n)\) time, where \(occ\) is the number of occurrences. Motivated by the indexing of DNA, we investigate in this paper the feasibility of devising a linear-size index that still has a time complexity linear in \(m\). In particular, we give an \(O(n)\)-space index that supports \(k\)-error matching in \(O(m + occ + (\log n)^{k(k+1)} \log\log n)\) worst-case time. Furthermore, the index can be compressed from \(O(n)\) words into \(O(n)\) bits with a slight increase in the time complexity.
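
To make the problem statement concrete, the following is a minimal sketch of k-error matching without any index. It is not the index described in this paper. It uses the classic dynamic-programming scan usually attributed to Sellers, which takes O(nm) time per query, and it assumes that an "error" is a single-character insertion, deletion, or substitution (edit distance). The function name `k_error_end_positions` is hypothetical and chosen only for illustration.

```python
def k_error_end_positions(text: str, pattern: str, k: int) -> list[int]:
    """Return the 1-based positions j such that some substring of `text`
    ending at j is within edit distance k of `pattern`.

    Illustrative baseline only: an O(n*m) scan with no index.
    """
    m = len(pattern)
    # col[i] = minimum edit distance between pattern[:i] and any substring
    # of the text ending at the current text position.
    col = list(range(m + 1))
    ends = []
    for j, ch in enumerate(text, start=1):
        prev_diag = col[0]  # value of col[i-1] in the previous column
        col[0] = 0          # an occurrence may start at any text position
        for i in range(1, m + 1):
            old = col[i]
            col[i] = min(
                prev_diag + (pattern[i - 1] != ch),  # match or substitution
                old + 1,         # text character ch is unmatched
                col[i - 1] + 1,  # pattern character i is unmatched
            )
            prev_diag = old
        if col[m] <= k:
            ends.append(j)
    return ends


if __name__ == "__main__":
    print(k_error_end_positions("abcdefg", "cde", 0))
    print(k_error_end_positions("abcdefg", "cde", 1))
```

The scan reads the whole text for every query, so its cost grows with n. An index of the kind discussed in the abstract aims to answer the same query in time that depends on m, occ, and a polylogarithmic term in n instead.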
