Abstract

Finding patterns in biological sequences has a significant impact on many real-world applications, such as biological sequence analysis, text indexing, stream data mining, and sensor networking. The problem of Pattern Matching with Wildcards and Length Constraints is to find all occurrences of a pattern P in a text T, where T can be a biological sequence, a text string, and so on. The user can specify a range for the number of wildcards allowed between every two consecutive letters of P, as well as length constraints on each occurrence of P. A further constraint is the one-off condition, under which every character in T can be used at most once when matching P. The on-line version of the problem is to report an occurrence of the given pattern that satisfies all constraints as soon as that occurrence appears in the portion of T read so far. An existing algorithm, SAIL, finds the optimal solution for the on-line version of this problem. However, SAIL handles only exact pattern matching. In this paper, we propose an efficient on-line algorithm for approximate pattern matching with wildcards and length constraints, a more general problem than exact matching. Our algorithm is based on dynamic programming, and we prove that it is correct.
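To make the problem statement concrete, the following is a minimal sketch of the exact-matching, on-line setting described above. It is neither SAIL nor the algorithm proposed in this paper: it is a naive greedy backtracking search, and it does not guarantee an optimal (maximum) set of occurrences. The function names and the parameters `min_gap`, `max_gap`, `min_len`, and `max_len` are hypothetical; the sketch also assumes that a single gap range applies to every pair of consecutive pattern letters and that the length constraint bounds the span of an occurrence in T.

```python
from typing import List, Optional, Set


def occurrence_ending_at(
    text: str,
    pattern: str,
    end: int,
    min_gap: int,
    max_gap: int,
    min_len: int,
    max_len: int,
    used: Set[int],
) -> Optional[List[int]]:
    """Return the text positions of one occurrence whose last letter is text[end].

    Only positions <= end are inspected, which mimics the on-line setting.
    Positions in `used` are skipped to enforce the one-off condition.
    """
    if end in used or text[end] != pattern[-1]:
        return None

    def backtrack(j: int, pos: int, chosen: List[int]) -> Optional[List[int]]:
        # pattern[j] is matched at text[pos]; `chosen` holds the positions
        # picked so far, from the last pattern letter backwards.
        if j == 0:
            span = end - pos + 1  # assumed meaning of the length constraint
            return chosen[::-1] if min_len <= span <= max_len else None
        # The previous letter lies min_gap..max_gap wildcards to the left.
        for gap in range(min_gap, max_gap + 1):
            prev = pos - gap - 1
            if prev < 0:
                break
            if prev in used or text[prev] != pattern[j - 1]:
                continue
            found = backtrack(j - 1, prev, chosen + [prev])
            if found is not None:
                return found
        return None

    return backtrack(len(pattern) - 1, end, [end])


def online_match(
    text: str,
    pattern: str,
    min_gap: int,
    max_gap: int,
    min_len: int,
    max_len: int,
) -> List[List[int]]:
    """Scan text left to right and report occurrences as soon as they end."""
    if not pattern:
        return []
    used: Set[int] = set()
    occurrences: List[List[int]] = []
    for end in range(len(text)):
        occ = occurrence_ending_at(
            text, pattern, end, min_gap, max_gap, min_len, max_len, used
        )
        if occ is not None:
            occurrences.append(occ)
            used.update(occ)  # one-off: these characters cannot be reused
    return occurrences


# Pattern "ab" with 0 to 1 wildcards between letters and span between 2 and 3.
print(online_match("axbyab", "ab", 0, 1, 2, 3))  # [[0, 2], [4, 5]]
```

Because the search commits to the first occurrence it finds at each position, it can miss occurrences that a better choice would have kept available. For example, on the text "aabb" with the same settings it reports only positions [1, 2], although [0, 2] and [1, 3] together would satisfy the one-off condition.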
