A coarse-grained multicomputer parallel algorithm for the sequential substring constrained longest common subsequence problem

Vianney Kengne Tchendji, Hermann Bogning Tepiele, Mathias Akong Onabid, Jean Frédéric Myoupo, Jerry Lacmou Zeutouo

doi:10.1016/j.parco.2022.102927

Abstract

In this paper, we study the sequential substring constrained longest common subsequence (SSCLCS) problem, which is widely used in bioinformatics. Given two strings X and Y of respective lengths m and n over an alphabet Σ, and a constraint sequence C of ordered strings $(c_1, c_2, \ldots, c_l)$ with total length r, the SSCLCS problem is to find the longest common subsequence D of X and Y such that D contains $c_1, c_2, \ldots, c_l$ as substrings, in that order. To solve this problem, Tseng et al. proposed a dynamic-programming algorithm that runs in $O(mnr + (m+n)|\Sigma|)$ time. We build on this work to propose a parallel algorithm for the SSCLCS problem on the Coarse-Grained Multicomputer (CGM) model. We design a three-dimensional partitioning technique for the corresponding dependency graph that reduces the latency time of processors by ensuring that, at each step, the subproblems to be solved by the processors are small. It also minimizes the number of communications between processors. Our solution requires $O\left(\frac{mnr + (m+n)|\Sigma|}{p}\right)$ execution time with $O(p)$ communication rounds on p processors. The experimental results show that our solution achieves speedups of up to 59.7 on 64 processors, which is better than the CGM-based parallel techniques that have been used to solve similar problems.
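To make the problem statement concrete, the following is a minimal brute-force sketch in Python. It is not the dynamic-programming algorithm of Tseng et al., nor the CGM algorithm proposed here; it runs in exponential time and is only meant to illustrate the definition on tiny inputs. The function names (`is_subsequence`, `contains_in_order`, `brute_force_ssclcs`) are hypothetical, and the sketch assumes that the constraint strings must occur in D as non-overlapping substrings in the given order.

```python
from itertools import combinations


def is_subsequence(d, s):
    """Return True if d is a subsequence of s."""
    it = iter(s)
    # `ch in it` advances the iterator, so matches must occur left to right.
    return all(ch in it for ch in d)


def contains_in_order(d, constraints):
    """Return True if d contains every constraint string as a substring,
    in the given order and without overlap (assumption of this sketch).
    Taking the leftmost match each time is sufficient for this check."""
    pos = 0
    for c in constraints:
        idx = d.find(c, pos)
        if idx < 0:
            return False
        pos = idx + len(c)
    return True


def brute_force_ssclcs(x, y, constraints):
    """Enumerate subsequences of x from longest to shortest and return the
    first one that is also a subsequence of y and satisfies the constraints.
    Returns None when no feasible common subsequence exists."""
    for k in range(min(len(x), len(y)), -1, -1):
        for idxs in combinations(range(len(x)), k):
            d = "".join(x[i] for i in idxs)
            if is_subsequence(d, y) and contains_in_order(d, constraints):
                return d
    return None


if __name__ == "__main__":
    # Without constraints this is the plain LCS problem.
    print(brute_force_ssclcs("abcab", "acbab", []))
    # Requiring the substring "cb" forces a shorter answer.
    print(brute_force_ssclcs("abcab", "acbab", ["cb"]))
```

In this small example the constraint shortens the answer: the unconstrained longest common subsequence has length 4, while the longest one containing "cb" as a substring has length 3.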

Full Text