Abstract
This paper proposes an approximate string matching method with k-mismatches under the generalized edit distance. Generalizing the edit distance allows more sophisticated string matching, but the execution time increases because each edit distance calculation becomes more complex. Under the generalized edit distance metric, the computational cost of finding which steps or edit distances exceed k mismatches is not significant. Therefore, we can reduce the execution time by identifying the steps that exceed k mismatches and skipping them. Diagonal step calculations using a pruning register skip the unnecessary distance calculations for those steps. The overhead of control statements and reordered memory accesses can be amortized by skipping multiple steps. Although the proposed skipping method incurs additional overhead, practical implementations of the scheme show that the execution time of string matching is reduced significantly when k is small.
Highlights
In the field of computer science, information retrieval is a fundamental problem
We report experimental results for different edit distance metrics
With generalized edit distance metrics that consider visual similarity in character shapes or keyboard character positions, the proposed skipping method can outperform both dynamic programming and the method using the reordered data structure when k is small
Summary
In the field of computer science, information retrieval is a fundamental problem. Notably, string matching is essential to digital information retrieval. Despite additional overhead in the diagonal step calculations and pruning register accesses, experiments show that the proposed skipping method can reduce the execution time of approximate string matching when k is small. One term of Eq. (4) is D(Xα, Yβ−1)k + insertion(yβ) when D(Xα, Yβ−1)k ≤ k. In Eq. (4), when the edit distance of a data-dependent previous step (D(Xα−1, Yβ−1)k, D(Xα−1, Yβ)k, or D(Xα, Yβ−1)k) is over k, there is no need to evaluate its operation for calculating D
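The skipping rule described for Eq. (4) can be illustrated with a minimal Python sketch. It is not the paper's implementation: it fills the table row by row instead of using diagonal steps with a pruning register, and the function name `bounded_edit_distance` and the cost callbacks `sub_cost`, `ins_cost`, and `del_cost` are hypothetical names chosen for illustration. It also assumes all edit costs are non-negative, so a step that is already over k can never lead to a result within k.

```python
INF = float("inf")


def bounded_edit_distance(x, y, k, sub_cost=None, ins_cost=None, del_cost=None):
    """Return the generalized edit distance of x and y if it is <= k, else None.

    Assumption: every cost function returns a non-negative number.
    """
    # Default to unit costs (plain Levenshtein distance) when no metric is given.
    sub = sub_cost or (lambda a, b: 0 if a == b else 1)
    ins = ins_cost or (lambda c: 1)
    dele = del_cost or (lambda c: 1)

    def prune(value):
        # Steps over k are stored as INF so later steps can skip them.
        return value if value <= k else INF

    # prev[b] holds D(X_{a-1}, Y_b); start with the row for the empty prefix of x.
    prev = [0.0] + [INF] * len(y)
    for b in range(1, len(y) + 1):
        if prev[b - 1] <= k:
            prev[b] = prune(prev[b - 1] + ins(y[b - 1]))

    for a in range(1, len(x) + 1):
        cur = [INF] * (len(y) + 1)
        if prev[0] <= k:
            cur[0] = prune(prev[0] + dele(x[a - 1]))
        for b in range(1, len(y) + 1):
            best = INF
            # Evaluate an operation only if its previous step is within k.
            if prev[b - 1] <= k:
                best = min(best, prev[b - 1] + sub(x[a - 1], y[b - 1]))
            if prev[b] <= k:
                best = min(best, prev[b] + dele(x[a - 1]))
            if cur[b - 1] <= k:
                best = min(best, cur[b - 1] + ins(y[b - 1]))
            cur[b] = prune(best)
        if all(v == INF for v in cur):
            return None  # every step in this row is over k, so stop early
        prev = cur

    return prev[-1] if prev[-1] <= k else None
```

A generalized metric can be supplied through the callbacks, for example a substitution cost of 0.5 for visually similar characters such as `0` and `O`. The smaller k is, the more steps are stored as over k and the more cost evaluations are skipped, which is where the source reports its gains.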