Abstract

A variant of the Longest Common Subsequence (LCS) problem is the LCS problem with multiple substring-exclusion constraints (M-STR-EC-LCS), which has great importance in many fields especially in bioinformatics. This problem consists to compute the LCS of two strings X and Y of length n and m respectively that excluded a set of d constraints P={P1,P2,…,Pd} of total length r. Recently, Wang et al. proposed a sequential solution based on the dynamic programming technique that requires O(nmr) execution time and space. To the best of our knowledge, there is no parallel solutions for this problem. This paper describes new efficient parallel algorithms on Coarse Grained Multicomputer model (CGM) to solve this problem. Firstly, we propose a multi-level Direct Acyclic Graph (DAG) that determines the correct evaluation order of sub-problems in order to avoid redundancy due to overlap. Secondly, we propose two CGM parallel algorithms based on our DAG. The first algorithm is based on a regular partitioning of the DAG and requires O(nmrp) execution time with O(p) communication rounds where p is the number of processors used. Its main drawback is high idleness time of processors because due to the dependencies between the nodes in the DAG, over time it has many idle processors. The second algorithm uses an irregular partitioning of the DAG that minimizes this idleness time by allowing the processors to stay active as long as possible. It requires O(nmrp) execution time with O(kp) communication rounds. k is a constant integer allowing to setup the irregular partitioning. The both algorithms require O(r|Σ|p) preprocessing time where |Σ| is the length of the alphabet. The experimental results performed show a good agreement with theoretical predictions.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call