Coarse grain load balance algorithm for detecting similar regions in DNA and proteins sequences

Manhal Elfadil Eltayeeb Elnour, Muhammad Shafie Abd Latif, Ismail Fauzi Isnin

doi:10.17485/ijst/2014/v7i5/49477

Abstract

The tremendous quantity and quality of data obtained from the conformations of DNA and protein sequences make their analysis time consuming, complex, expensive, and often impractical. Therefore, the feasible way to identify new sequences is to compare them with well-known sequences available in established genetic databanks. Comparing sequences may reveal functional, structural, and evolutionary analogies between them. The Needleman and Wunsch (NW) and Smith and Waterman (SW) algorithms pioneered the use of a dynamic programming matrix for comparing two sequences with a gap penalty function. However, for long sequences both methods incur high time and space complexity. FASTA and BLAST are heuristic methods based on hitting techniques for fast detection of similar regions; unfortunately, their speed comes at the cost of sensitivity. There remains a need for an efficient method that can detect similar regions in two sequences accurately and in reasonable time. In this paper, we extend an existing approach to develop an efficient parallel algorithm for pairwise local sequence alignment. Our method combines a load-balancing algorithm with a CPU scheduling technique to accelerate the computation of the data-dependency problem in sequence alignment. Using an x86-based PC with eight logical processors, we were able to apply the proposed algorithm to 4 MBP with a 33% speedup compared to the original SW algorithm.
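The data dependency mentioned in the abstract can be illustrated with a minimal Smith-Waterman scoring sketch. This is not the authors' load-balancing algorithm: the function name `smith_waterman_score`, the linear gap penalty, and the scoring values (match 2, mismatch -1, gap -2) are all assumptions chosen for illustration. Each cell of the matrix depends on its upper, left, and upper-left neighbours, so the sketch walks the matrix by anti-diagonals, where every cell on one anti-diagonal is independent of the others and could in principle be assigned to a different processor.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score of sequences a and b.

    Illustrative only: linear gap penalty, hypothetical scoring values.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # Row 0 and column 0 stay at zero (local alignment boundary).
    H = [[0] * cols for _ in range(rows)]
    best = 0

    # Anti-diagonal d contains every interior cell (i, j) with i + j == d.
    # Cells on the same anti-diagonal only read values from diagonals
    # d - 1 and d - 2, so they do not depend on each other.
    for d in range(2, rows + cols - 1):
        i_start = max(1, d - cols + 1)
        i_end = min(rows - 1, d - 1)
        for i in range(i_start, i_end + 1):
            j = d - i
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            diag = H[i - 1][j - 1] + substitution  # align a[i-1] with b[j-1]
            up = H[i - 1][j] + gap                 # gap in b
            left = H[i][j - 1] + gap               # gap in a
            H[i][j] = max(0, diag, up, left)
            best = max(best, H[i][j])
    return best


if __name__ == "__main__":
    print(smith_waterman_score("TTACGTTT", "GGACGGG"))
```

Because the anti-diagonals are short near the matrix corners and long in the middle, the amount of independent work per step varies, which is one reason a parallel version needs some form of load balancing across processors.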

Full Text