Abstract

Pairwise statistical significance (PSS) has been recognized as a very useful method for homology detection. It helps estimate whether the output of a sequence alignment reflects an evolutionary link or has arisen merely by chance. However, pairwise statistical significance estimation (PSSE) poses a major challenge in terms of performance and scalability, since constructing the empirical score distribution during the estimation is both computationally intensive and data intensive. This paper presents Par-PSSE, a software library for estimating pairwise statistical significance in parallel, implemented in C++ using OpenMP, MPI, and hybrids of the two paradigms. Further, we apply the parallelization technique to estimate non-conservative PSS using standard, sequence-specific, and position-specific substitution matrices. These extensions have been found to be superior to standard pairwise statistical significance in terms of retrieval accuracy. By distributing the compute-intensive kernels of the pairwise statistical significance estimation across multiple computational units, we achieve a speedup of up to 621.73× over the corresponding sequential implementation when using 1024 cores.
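
To illustrate why building the empirical score distribution parallelizes well, the following is a minimal C++/OpenMP sketch and not the Par-PSSE implementation. It rests on several assumptions that the abstract does not state: the second sequence is the one shuffled, scores come from a simple match/mismatch Smith-Waterman scheme with a linear gap penalty rather than the substitution matrices named above, and significance is reported as a plain empirical fraction rather than through a fitted score distribution. The function names `local_alignment_score` and `empirical_p_value` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <string>
#include <vector>

// Smith-Waterman local alignment score with a linear gap penalty.
// The scoring values are illustrative assumptions, not those of the paper.
int local_alignment_score(const std::string& a, const std::string& b,
                          int match = 2, int mismatch = -1, int gap = -2) {
    std::vector<int> prev(b.size() + 1, 0), curr(b.size() + 1, 0);
    int best = 0;
    for (std::size_t i = 1; i <= a.size(); ++i) {
        curr[0] = 0;
        for (std::size_t j = 1; j <= b.size(); ++j) {
            const int diag = prev[j - 1] + (a[i - 1] == b[j - 1] ? match : mismatch);
            const int up = prev[j] + gap;
            const int left = curr[j - 1] + gap;
            curr[j] = std::max({0, diag, up, left});
            best = std::max(best, curr[j]);
        }
        std::swap(prev, curr);
    }
    return best;
}

// Empirical p-value: share of shuffled-subject alignments scoring at least
// as high as the real alignment. Every shuffle is independent, so the loop
// iterations can be spread across threads with a sum reduction.
double empirical_p_value(const std::string& query, const std::string& subject,
                         int num_shuffles, std::uint32_t seed = 42) {
    assert(num_shuffles > 0);
    const int observed = local_alignment_score(query, subject);
    long at_least = 0;
    #pragma omp parallel for reduction(+ : at_least) schedule(static)
    for (int k = 0; k < num_shuffles; ++k) {
        // One generator per iteration, seeded by the iteration index, so the
        // result does not depend on the number of threads.
        std::mt19937 rng(seed + static_cast<std::uint32_t>(k));
        std::string shuffled = subject;
        std::shuffle(shuffled.begin(), shuffled.end(), rng);
        if (local_alignment_score(query, shuffled) >= observed) {
            ++at_least;
        }
    }
    // Add-one correction keeps the estimate strictly above zero.
    return (at_least + 1.0) / (num_shuffles + 1.0);
}
```

Compiled with OpenMP enabled (for example `-fopenmp` on GCC or Clang), the shuffle loop runs across threads; without it the pragma is ignored and the same code runs sequentially. A distributed version along the lines the abstract describes would presumably split the shuffles or the sequence pairs across MPI ranks as well, but that layer is not shown here.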
