Abstract

Biological sequence comparison is often used as an auxiliary task in the analysis of genetic material. Pairwise comparison algorithms such as Smith‐Waterman evaluate two strings representing protein, DNA, or RNA sequences to obtain the optimal alignment between them. Many applications have been proposed to address the sequence comparison problem, prioritizing the use of graphics cards and proprietary languages such as CUDA. In this paper, we propose and evaluate MASA‐OpenCL, an OpenCL solution for comparing long DNA sequences that is based on the MASA sequence alignment framework and whose pruning capability is proportional to the similarity of the sequences compared. We compared MASA‐OpenCL with its CUDA counterpart (MASA‐CUDAlign) and, in most cases, MASA‐OpenCL achieved better performance. To better understand the behavior of MASA‐OpenCL, we performed a statistical analysis of 11 comparisons of sequences with high, medium, and low similarity on 4 GPUs. As a result, we obtained a multiple linear regression model that considers (a) the sizes of the sequences, (b) the similarity between them, (c) the computational power of the GPU, and (d) the GPU memory bandwidth. We used this model to predict the performance on two other GPUs, with low error rates.
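For readers unfamiliar with the pairwise comparison mentioned above, the following is a minimal sketch of the Smith‐Waterman local alignment score in Python. It is not the MASA‐OpenCL implementation: it assumes a simple linear gap penalty, runs sequentially on the CPU, and performs no pruning. The function name `smith_waterman_score` and the scoring values (`match`, `mismatch`, `gap`) are illustrative assumptions, not parameters taken from the paper.

```python
def smith_waterman_score(a, b, match=1, mismatch=-1, gap=-2):
    """Return the best local alignment score between strings a and b.

    Illustrative scoring only: linear gap penalty, hypothetical default values.
    Keeps two rows of the dynamic programming matrix, so memory is O(len(b)).
    """
    prev = [0] * (len(b) + 1)  # row i-1 of the DP matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)  # row i of the DP matrix
        for j in range(1, len(b) + 1):
            # Score for aligning a[i-1] with b[j-1]
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    # Identical sequences of length 4 score 4 with match=1
    print(smith_waterman_score("ACGT", "ACGT"))
```

Each cell depends only on its upper, left, and upper-left neighbors, so cells on the same anti-diagonal can be computed independently, which is the property that GPU implementations generally exploit.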
