Abstract

This paper revisits the problem of indexing a text $S[1..n]$ for pattern matching with up to $k$ errors. A naive solution either has a worst-case matching time complexity of $\Omega(m^k)$ or requires $\Omega(n^k)$ space, where $m$ is the length of the pattern. Devising a solution with better performance has been a challenge until Cole et al. (2004) [5] showed an $O(n \log^k n)$-space index that can support $k$-error matching in $O(m + occ + \log^k n \log\log n)$ time, where $occ$ is the number of occurrences. Motivated by the indexing of long sequences like DNA, we have investigated the feasibility of devising a linear-size index that still has a time complexity linear in pattern length. This paper in particular presents an $O(n)$-space index that supports $k$-error matching in $O(m + occ + (\log n)^{k(k+1)} \log\log n)$ worst-case time. This index can be further compressed from $O(n)$ words into $O(n)$ bits with a slight increase in the time complexity.
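
To make the problem statement concrete, the following is a minimal sketch of k-error matching without any index, using the classical column-by-column dynamic program over the text. It is not the index described in this paper; it only illustrates what an occurrence with up to k errors is, under the assumption that an error means a single-character insertion, deletion, or substitution. The function name `k_error_end_positions` is hypothetical, and positions are 1-based to match the S[1..n] notation.

```python
def k_error_end_positions(text, pattern, k):
    """Return the 1-based positions j such that some substring of `text`
    ending at j is within edit distance k of `pattern`.

    Index-free baseline: O(n * m) time, O(m) extra space.
    """
    m = len(pattern)
    # col[i] = smallest edit distance between pattern[:i] and any
    # substring of the text that ends at the current text position.
    col = list(range(m + 1))
    ends = []
    for j, c in enumerate(text, start=1):
        prev_diag = col[0]   # entry (i-1, j-1) of the DP table
        col[0] = 0           # an occurrence may start at any text position
        for i in range(1, m + 1):
            old = col[i]     # entry (i, j-1), saved before overwriting
            cost = 0 if pattern[i - 1] == c else 1
            col[i] = min(prev_diag + cost,  # match or substitution
                         old + 1,           # extra character in the text
                         col[i - 1] + 1)    # pattern character missing from the text
            prev_diag = old
        if col[m] <= k:
            ends.append(j)
    return ends
```

For example, `k_error_end_positions("abcde", "bxd", 1)` reports position 4, because the substring `bcd` ending there differs from the pattern by one substitution. This scan takes time proportional to n for every query, which is the cost an index is meant to avoid.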
