Abstract

Exact string matching is one of the critical issues in the field of computer science. This study proposed a hybrid string matching algorithm called E-AbdulRazzaq. The algorithm combines the best properties of two original algorithms: the AbdulRazzaq and Berry-Ravindran algorithms. The proposed algorithm performed efficiently in terms of the number of attempts and the number of character comparisons when compared with the original, recent, and standard algorithms. It was applied to several types of databases: DNA sequences, protein sequences, XML structures, pitch characters, English texts, and source code. In terms of the number of attempts, for both long and short patterns, the pitch database was the best match for E-AbdulRazzaq, while the DNA database was the worst. In terms of character comparisons, no data type stood out as the best or worst for the E-AbdulRazzaq algorithm. The E-AbdulRazzaq algorithm ranked first in most databases for both short and long patterns, in terms of the number of attempts and character comparisons.

Highlights

  • String matching is a searching operation carried out to check the optimal alignment by comparing two finite-length strings

  • The performance results of the E-AR algorithm and the original algorithms are compared in terms of the number of attempts and character comparisons when using short and long pattern lengths with different data types and sizes

  • The E-AR algorithm requires the fewest attempts on short patterns because it relies on the efficient functions of the AbdulRazzaq algorithm and of the Berry-Ravindran (BR) algorithm

Summary

Introduction

String matching is a search operation that checks for the optimal alignment by comparing two finite-length strings. The preprocessing phase of the AbdulRazzaq algorithm has five steps: the prime number function, the composite number function, the Boyer-Moore bad character (bmBc) step, the Quick-Search bad character (qsBc) step, and the hashing step. If a match occurs, the hashed characters of the pattern and the text window are compared. The Berry-Ravindran algorithm is a hybrid of the Zhu-Takaoka and Quick-Search algorithms and is characterized by left-to-right character comparisons [4]. It has two phases: preprocessing and searching. Whether an attempt ends in a match or a mismatch, the shift depends on the two text characters immediately to the right of the window (positions m+1 and m+2, where m is the pattern length), and the shift value is read from the brBc table built in the preprocessing phase.
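The Berry-Ravindran shift rule described above can be sketched roughly as follows. This is a minimal illustration of the textbook Berry-Ravindran algorithm only, not of the AbdulRazzaq or E-AbdulRazzaq algorithms, whose details are not given here. The function names (`br_shift_function`, `br_search`), the dictionary-based stand-in for the brBc table, and the NUL padding are assumptions made for this example; the counters mirror the two metrics used in the study (attempts and character comparisons).

```python
def br_shift_function(pattern):
    """Build a lookup equivalent to the brBc table (hypothetical helper).

    For the two text characters (a, b) just right of the window, the
    shift is the smallest of: 1 if a is the last pattern character,
    m - i if pattern[i:i+2] == a + b, m + 1 if b is the first pattern
    character, and m + 2 otherwise.
    """
    m = len(pattern)
    pair_shift = {}
    for i in range(m - 1):
        # Later (rightmost) occurrences overwrite earlier ones,
        # which keeps the smallest, and therefore safe, shift.
        pair_shift[(pattern[i], pattern[i + 1])] = m - i
    first, last = pattern[0], pattern[-1]

    def shift(a, b):
        if a == last:
            return 1
        if (a, b) in pair_shift:
            return pair_shift[(a, b)]
        if b == first:
            return m + 1
        return m + 2

    return shift


def br_search(text, pattern):
    """Return (match positions, attempts, character comparisons)."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return [], 0, 0
    shift = br_shift_function(pattern)
    # Assumption: the pattern contains no NUL characters, so two NULs
    # can pad the text for the look-ahead past its end.
    padded = text + "\0\0"
    matches, attempts, comparisons = [], 0, 0
    j = 0
    while j <= n - m:
        attempts += 1
        i = 0
        # Left-to-right comparison of the pattern with the window.
        while i < m:
            comparisons += 1
            if pattern[i] != text[j + i]:
                break
            i += 1
        if i == m:
            matches.append(j)
        # Same shift rule after a match or a mismatch.
        j += shift(padded[j + m], padded[j + m + 1])
    return matches, attempts, comparisons


if __name__ == "__main__":
    print(br_search("GCATCGCAGAGAGTATACAGTACG", "GCAGAGAG"))
```

Counting attempts and comparisons inside the search loop makes it straightforward to compare this baseline against other matchers on the same text and pattern.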
